When Time Makes Sense:
A Historically-Aware Approach to Targeted Sense Disambiguation

This repository provides the code and materials underlying the paper When Time Makes Sense: A Historically-Aware Approach to Targeted Sense Disambiguation.

Installation

We strongly recommend installation via Anaconda:

Refer to the Anaconda website and follow the installation instructions.
Create a new environment:

conda create -n py37_tsd python=3.7

Activate the environment:

conda activate py37_tsd

Install dependencies:

cd /path/to/my/TargetedSenseDisambiguation
pip install -r requirements.txt

We also use the spaCy model en_core_web_lg, which can be installed with:

python -m spacy download en_core_web_lg
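To confirm that the model is available in the active environment before running any script, a small check along the following lines may help. It is a sketch that relies only on the standard library and on the fact that spaCy models are installed as importable packages; the helper name is_installed is ours and is not part of this repository.

```python
import importlib.util

# Model name taken from the installation instructions above.
REQUIRED_MODEL = "en_core_web_lg"


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed(REQUIRED_MODEL):
        print(f"{REQUIRED_MODEL} is installed")
    else:
        print(f"{REQUIRED_MODEL} is missing; run: python -m spacy download {REQUIRED_MODEL}")
```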

Code

This section explains how to run the code. Most of the scripts require credentials for the Oxford Historical Dictionary Research API. These scripts are marked by \*\*. More information on obtaining access to the API can be found here.

[WARNING] Results produced by this code may differ slightly from those in the paper, for two reasons:

the source data (the quotations stored in the OED) may change over time
the order in which data is retrieved and stored changes with each run, resulting in different splits for train, validation and test. Please contact the author

However, the authors have rerun the pipeline multiple times, and the scores produced by these scripts are close to the ones reported in the paper. The differences do not affect the conclusions drawn from the experiments.

The only deviation may be results for the curated experiments, which tend to be more volatile.
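The second source of variation, splits that depend on retrieval order, can be illustrated with a short sketch. This is not how the repository builds its splits; it only shows one common way to make a split independent of input order, by sorting the identifiers and shuffling with a seeded generator. The function name, the seed and the 80/10/10 proportions are assumptions chosen for illustration.

```python
import random


def split_ids(ids, seed=42, train_frac=0.8, val_frac=0.1):
    """Split identifiers into train/validation/test, independent of input order.

    The seed and fractions are illustrative assumptions, not values from the paper.
    """
    ordered = sorted(ids)        # remove any dependence on retrieval order
    rng = random.Random(seed)    # local generator, so global random state is untouched
    rng.shuffle(ordered)
    n_train = int(len(ordered) * train_frac)
    n_val = int(len(ordered) * val_frac)
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical quotation identifiers, retrieved in two different orders.
    first_run = [f"q{i}" for i in range(10)]
    second_run = list(reversed(first_run))
    print(split_ids(first_run) == split_ids(second_run))
```

Because both runs sort the identifiers before shuffling with the same seed, they produce identical splits even though the data arrived in a different order.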

Generate Dataframe

The script generate_dataframes.py downloads data from the API for a given headword and vectorizes the keyword of the quotations.

[WARNING] This script requires access to the historical BERT models, available on Zenodo. Please copy bert_1760_1850 and bert_1760_1900 models to the models folder and adjust the paths in lines 7-8.

[WARNING] To download the data you need access to the OED API. More information on how to obtain credentials is available here. Once you have the credentials, add them to oed_credentials.json.
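Before starting the download, it can be useful to verify that the inputs named in the two warnings are in place. The sketch below assumes that the models folder and oed_credentials.json both sit in the repository root, and it only checks that the credentials file parses as JSON, because the expected keys are not documented here. The function name missing_inputs is hypothetical.

```python
import json
from pathlib import Path

# Model names taken from the warning above.
MODEL_NAMES = ["bert_1760_1850", "bert_1760_1900"]


def missing_inputs(root="."):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    root = Path(root)
    problems = []
    for name in MODEL_NAMES:
        # Assumption: models live in a "models" folder under the repository root.
        if not (root / "models" / name).exists():
            problems.append(f"missing model: models/{name}")
    cred_path = root / "oed_credentials.json"
    if not cred_path.is_file():
        problems.append("missing oed_credentials.json")
    else:
        try:
            json.loads(cred_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append("oed_credentials.json is not valid JSON")
    return problems


if __name__ == "__main__":
    for problem in missing_inputs("."):
        print(problem)
```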

python generate_dataframes.py

All results should be saved in the ./data folder. Almost all subsequent steps require these data as input.

Running Experiments

Comparing BERT models

The code snippet below runs the main experiment that tests the effect of plugging in historical BERT models.

[WARNING] This script requires access to a historical word2vec model, which is available on Zenodo. Please copy the w2v_1760_1900 model to the models folder.

[WARNING] In line 15 of run_main_experiment.py, change the path to the word2vec model.

python run_main_experiment.py

All results should be saved in the result_{year} folder.

Time-sensitive approaches

[WARNING] In line 15 of run_experiment_ts_disambiguation.py, change the path to the word2vec model.

To create results files for the time-sensitive methods, run:

python run_experiment_ts_disambiguation.py

Then run run_experiment_ts_disambiguation.py to run the experiments with time-sensitive disambiguation.

python run_experiment_ts_disambiguation.py

Case-studies

[WARNING] In line 15 of run_experiment_curated_cases.py, change the path to the word2vec model.

To run the case studies, execute:

python run_experiment_curated_cases.py

Create Results

To create the results from the output generated by the experiments, run the cells in create_results_tables.ipynb. This notebook can be run using the .csv result files produced by the previous scripts.
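To see which result files are available before opening the notebook, a listing such as the one below may be handy. It is a sketch that assumes the .csv files sit somewhere under folders matching result_* in the repository root, following the result_{year} naming mentioned earlier; other experiments may write elsewhere. The function name find_result_csvs is hypothetical.

```python
from pathlib import Path


def find_result_csvs(root="."):
    """Return a sorted list of .csv paths found under folders named result_*."""
    found = []
    for folder in Path(root).glob("result_*"):
        if folder.is_dir():
            # Search recursively, since the folder layout is an assumption.
            found.extend(folder.rglob("*.csv"))
    return sorted(found)


if __name__ == "__main__":
    for path in find_result_csvs("."):
        print(path)
```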

Explore Results

To explore the results and recreate Figure 1, run the cells in explore_results.ipynb. This notebook requires the output from generate_dataframes.py (saved in the ./data folder).
