Semantic Change Detection for the Romanian Language

Article:

Ciprian-Octavian Truică, Victor Tudose, Elena-Simona Apostol. Semantic Change Detection for the Romanian Language. International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2023), 2023. Link: https://arxiv.org/abs/2308.12131

Code

Packages needed:

scipy
scikit-learn
numpy
nltk
gensim
matplotlib
pandas
flask
plotly

Static word embeddings

Use the classes from the sgns_op.py and sgns_wi.py files located in the representations folder to train word embeddings.

Contextual word embeddings

To train and test the ELMo embeddings, use the Jupyter notebooks in the elmo_embs folder. You will need the AllenNLP package.

Running the tests

A test consists of running a model with a predefined configuration:

SGNS-OP: run.py sgns_op tasks/sem_shift_en.json

SGNS-WI: run.py sgns_wi tasks/sem_shift_en.json

ELMO-PREV: run.py elmo_with_precomp tasks/elmo_model1.json

ELMO-POST: run.py elmo_with_precomp tasks/elmo_model2.json
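
The four runs above can also be launched from a small driver script. The following is a minimal sketch, assuming run.py is invoked with the Python interpreter from the repository root; the helper names build_command and run_all are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Model label -> (model argument, task file), copied from the list above.
RUNS = {
    "SGNS-OP": ("sgns_op", "tasks/sem_shift_en.json"),
    "SGNS-WI": ("sgns_wi", "tasks/sem_shift_en.json"),
    "ELMO-PREV": ("elmo_with_precomp", "tasks/elmo_model1.json"),
    "ELMO-POST": ("elmo_with_precomp", "tasks/elmo_model2.json"),
}


def build_command(label):
    """Build the argument list for one run (hypothetical helper)."""
    model, task = RUNS[label]
    # Assumption: run.py is started with the current Python interpreter.
    return [sys.executable, "run.py", model, task]


def run_all():
    """Run every configuration in turn, stopping at the first failure."""
    for label in RUNS:
        print(f"Running {label}")
        subprocess.run(build_command(label), check=True)
```

Calling run_all() from the repository root would execute the four tests one after another.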

Schema for a task.json configuration file (in the tasks directory):

target: where to write the output data
language, name, description: descriptive metadata for the task
corpora: a list of 2 corpora across which the words are compared
threshold: the value above which a word is considered to have changed
tests: a list of words to compare across the corpora, each specifying the word and its expected change
skip_train: if present, the test skips training and only compares the results
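
To illustrate the schema above, the sketch below checks that a task dictionary contains the documented top-level fields and applies the threshold to a distance score. The helper names (validate_task, cosine_distance, is_changed), the use of cosine distance as the score, the keys inside each tests entry, and all example values are assumptions for illustration; the repository's own loader and scoring may differ.

```python
import math

# Top-level fields documented in the schema above; skip_train is optional.
REQUIRED_FIELDS = ("target", "language", "name", "description",
                   "corpora", "threshold", "tests")


def validate_task(task):
    """Return a list of problems found in a task dict (empty if none)."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS
                if key not in task]
    corpora = task.get("corpora")
    if corpora is not None and len(corpora) != 2:
        problems.append("corpora must list exactly 2 corpora")
    return problems


def cosine_distance(u, v):
    """Cosine distance between two vectors (assumed scoring function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def is_changed(score, threshold):
    """A word counts as changed when its score is above the threshold."""
    return score > threshold


# Hypothetical task: every value and the keys inside "tests" are made up.
EXAMPLE_TASK = {
    "target": "results/example",
    "language": "en",
    "name": "example task",
    "description": "illustrative configuration",
    "corpora": ["corpus_a", "corpus_b"],
    "threshold": 0.5,
    "tests": [{"word": "example", "expected": True}],
}
```

In a real run, the dictionary would typically come from json.load on a file in the tasks directory rather than from a literal.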

Demo

Change directory to the demo folder and run python app.py. Then open http://127.0.0.1:5000 in a browser.
