gijswijnholds/compdisteval-ellipsis

compdisteval-ellipsis

This repository links to the datasets, models and code for evaluating composition models for VP-elliptical sentences, as described in the article

Gijs Wijnholds and Mehrnoosh Sadrzadeh. Evaluating Composition Models for Verb Phrase Elliptical Sentence Embeddings. NAACL-HLT 2019.

If you find any of this useful, please consider citing our paper as

@inproceedings{wijnholds2019evaluating,
  title = "Evaluating Composition Models for Verb Phrase Elliptical Sentence Embeddings",
  author = "Gijs Wijnholds and Mehrnoosh Sadrzadeh",
  year = "2019",
  booktitle={Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},
  publisher={Association for Computational Linguistics}
}

Datasets

We provide two new datasets, extending the verb disambiguation dataset of Grefenstette & Sadrzadeh 2011 and the transitive sentence similarity dataset of Kartsaklis & Sadrzadeh 2013.

ELLDIS

This dataset extends the verb disambiguation dataset of Grefenstette & Sadrzadeh 2011 to VP-elliptical settings. Link

ELLSIM

This dataset extends the transitive sentence similarity dataset of Kartsaklis & Sadrzadeh 2013 to VP-elliptical settings. Link

Models

We provide four trained vector spaces, each built with a popular embedding method. For each vector space we also provide a separate tensor space containing learned matrices for 85 verbs that occur in the evaluation datasets. The tensors are stored in a flattened format, so each one needs to be reshaped to size (d, d), where d is the dimension of the corresponding vector space.

| Model Name | Dimensions | Vectors | Tensors |
|------------|------------|---------|---------|
| count      | 2000       | link    | link    |
| word2vec   | 300        | link    | link    |
| glove      | 300        | link    | link    |
| fasttext   | 300        | link    | link    |
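
The reshaping step described above can be sketched as follows. This is a minimal illustration and not the code from the evaluation repository: the helper name `reshape_verb_tensor` is hypothetical, the input is toy data instead of a downloaded tensor, and row-major ordering of the flattened values is an assumption that should be checked against the actual files.

```python
import numpy as np


def reshape_verb_tensor(flat, d):
    """Reshape a flattened verb tensor of length d*d into a (d, d) matrix.

    Assumes the values were flattened in row-major (C) order; this is an
    assumption, not something the README states.
    """
    flat = np.asarray(flat, dtype=float)
    if flat.size != d * d:
        raise ValueError(f"expected {d * d} values, got {flat.size}")
    return flat.reshape(d, d)


# Toy stand-in for one flattened verb entry. For the provided spaces,
# d would be 300 (word2vec, glove, fasttext) or 2000 (count).
d = 3
flat = np.arange(d * d, dtype=float)
matrix = reshape_verb_tensor(flat, d)
print(matrix.shape)
```

The size check is there so that a tensor paired with the wrong vector space (for example, a count tensor used with d = 300) fails immediately instead of producing a wrongly shaped matrix.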

Code

We provide some code for evaluating the vector space models on the new datasets. It can be found in my main code repository for evaluating compositional distributional semantics, here.
