Extrinsic-Evaluation-tasks

Fork of shashwath94/Extrinsic-Evaluation-tasks, with some cleanup to make it easier to run all tasks on an arbitrary set of embeddings.

The original repository was assembled for Rogers et al. (2018), "What's in Your Embedding, And How It Predicts Task Performance".

This version of the repository was assembled and released for use in Whitaker et al. (2019), "Characterizing the impact of geometric properties of word embeddings on task performance".

Specific details for each task

References pending

References

Bib entry for Rogers et al. (2018):

@inproceedings{C18-1228,
  title = "What{'}s in Your Embedding, And How It Predicts Task Performance",
  author = "Rogers, Anna  and Hosur Ananthakrishna, Shashwath  and Rumshisky, Anna",
  booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
  month = aug,
  year = "2018",
  address = "Santa Fe, New Mexico, USA",
  publisher = "Association for Computational Linguistics",
  url = "https://www.aclweb.org/anthology/C18-1228",
  pages = "2690--2703",
}

Bib entry for Whitaker et al. (2019):

@inproceedings{Whitaker2019,
  title = "Characterizing the impact of geometric properties of word embeddings on task performance",
  author = "Whitaker, Brendan and Newman-Griffis, Denis and Haldar, Aparajita and Ferhatosmanoglu, Hakan and Fosler-Lussier, Eric",
  booktitle = "Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP",
  month = jun,
  year = "2019",
  address = "Minneapolis, Minnesota, USA",
  publisher = "Association for Computational Linguistics"
}

Old README

To run all tasks, execute run_tasks.sh.

For each individual task, preprocess.py (where present) will load the preprocessed version of the dataset. To train the model, run train.py.

A pretrained word embedding text file is needed, in which every line contains a word followed by a space and then the space-separated components of its embedding vector. For example:

acrobat 0.6056159735 -0.1367940009 -0.0936380029 0.8406270146 0.2641879916 0.4209069908 0.0607739985 0.5985950232 -1.1451450586 -0.8666719794 -0.5021889806 0.4398249984 0.9671009779 0.7413169742 -0.0954160020 -1.1526989937 -0.3915260136 -0.1520590037 0.0893440023 -0.2578850091 -0.6204599738 -0.8789629936 0.3581469953 0.5509790182 0.1234730035
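To check that an embedding file matches the format described above before running the tasks, a small parser along the following lines may help. This is a minimal sketch, not the loader used by this repository: the function name load_embeddings is hypothetical, and it assumes the file has no header line and that every vector has the same number of components.

```python
import io


def load_embeddings(lines):
    """Parse 'word v1 v2 ... vn' lines into a dict mapping word -> list of floats.

    Hypothetical helper for illustration only; assumes no header line
    and a fixed dimensionality across all lines.
    """
    embeddings = {}
    dim = None
    for line_no, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines or lines with no vector components
        word = parts[0]
        vector = [float(value) for value in parts[1:]]
        if dim is None:
            dim = len(vector)  # the first parsed line fixes the dimensionality
        elif len(vector) != dim:
            raise ValueError(
                "line %d: expected %d values, found %d" % (line_no, dim, len(vector))
            )
        embeddings[word] = vector
    return embeddings, dim


# Toy data standing in for a real embedding file.
sample = io.StringIO("acrobat 0.5 -0.25 1.0\njuggler 0.0 0.75 -1.5\n")
vectors, dim = load_embeddings(sample)
print(dim, sorted(vectors))
```

For a real file, an open file handle can be passed in place of the StringIO object, for example `with open(path, encoding="utf-8") as f: load_embeddings(f)`, where path is the location of your embedding file.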

Data for NLI task can be found here

For sequence labeling tasks (POS, NER and chunking), please refer to this repo

Dependencies

Numpy
Keras
A Keras backend (default: TensorFlow)

