QVEC

QVEC is an easy-to-use, fast tool for measuring the intrinsic quality of word vectors. The evaluation score reflects how well the word vectors correlate with a matrix of features derived from manually crafted lexical resources. This score has been shown to correlate strongly with performance on downstream tasks (see Tsvetkov et al., 2015, for details and results). QVEC is model-agnostic, so it can be used to evaluate word vectors produced by any model.

Evaluation of Word Vector Representations by Subspace Alignment
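As a rough illustration of the correlation-based score described above, the sketch below aligns each vector dimension with the linguistic feature column it correlates with best and sums those correlations. It is a simplified reading of the alignment idea in Tsvetkov et al. (2015), not the code in qvec.py or qvec_cca.py. The function name `qvec_sketch`, the toy matrices, and the choice to let a dimension contribute 0 when all of its correlations are negative are assumptions made for this example.

```python
import numpy as np

def qvec_sketch(X, S):
    """Hypothetical helper: sum, over vector dimensions, of the best Pearson
    correlation with any linguistic feature column.

    X: (n_words, d) word vectors; S: (n_words, p) linguistic features.
    Rows of X and S must refer to the same words in the same order.
    """
    X = np.asarray(X, dtype=float)
    S = np.asarray(S, dtype=float)
    # Center every column so that dot products become covariances.
    Xc = X - X.mean(axis=0)
    Sc = S - S.mean(axis=0)
    x_norm = np.linalg.norm(Xc, axis=0)
    s_norm = np.linalg.norm(Sc, axis=0)
    # Constant columns have zero norm; give them correlation 0 instead of NaN.
    x_norm[x_norm == 0] = 1.0
    s_norm[s_norm == 0] = 1.0
    # R[i, j] is the Pearson correlation of vector dim i with feature j.
    R = (Xc / x_norm).T @ (Sc / s_norm)
    # Assumption: a dimension may stay unaligned, contributing 0.
    best = np.maximum(R.max(axis=1), 0.0)
    return float(best.sum())

# Toy data: four words, two uncorrelated binary features.
S_toy = np.array([[1, 0], [0, 1], [0, 0], [1, 1]], dtype=float)
# Dimension 0 is a linear function of feature 0; dimension 1 is feature 1 negated.
X_toy = np.column_stack([2 * S_toy[:, 0] + 3, -S_toy[:, 1]])
print(qvec_sketch(X_toy, S_toy))
```

In this toy case only the first dimension finds a positively correlated feature, so the second dimension adds nothing to the total.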

Usage

Each vector file should contain one word vector per line, space-delimited, as follows:

the -1.0 2.4 -0.3 ...
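If your vectors live in memory rather than in a file, a small script along these lines could produce a file in the format shown above. The helper names `write_vectors` and `read_vectors`, the toy values, and the output file name are hypothetical and not part of QVEC.

```python
import os
import tempfile

def write_vectors(path, vectors):
    """Write {word: [floats]} as one space-delimited word vector per line."""
    with open(path, "w", encoding="utf-8") as f:
        for word, values in vectors.items():
            f.write(word + " " + " ".join(str(v) for v in values) + "\n")

def read_vectors(path):
    """Read the same format back into {word: [floats]} as a sanity check."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

# Toy vectors for illustration only.
toy = {"the": [-1.0, 2.4, -0.3], "cat": [0.5, -0.1, 1.2]}
out_path = os.path.join(tempfile.mkdtemp(), "toy_vectors.txt")
write_vectors(out_path, toy)
print(read_vectors(out_path) == toy)
```

The resulting file path can then be passed as the value of --in_vectors in the commands below.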

Semantic content evaluation:

./qvec_cca.py --in_vectors ${your_vectors} --in_oracle oracles/semcor_noun_verb.supersenses.en

To obtain vector column labels, add the --interpret parameter; to print the top K values in each dimension, add --top K:

./qvec.py --in_vectors ${your_vectors} --in_oracle oracles/semcor_noun_verb.supersenses.en --interpret --top 10

Multilingual evaluation for English, Danish, and Italian:

./qvec_cca.py --in_vectors ${your_vectors} --in_oracle oracles/semcor_noun_verb.supersenses.en,oracles/semcor_noun_verb.supersenses.it,oracles/semcor_noun_verb.supersenses.da

Syntactic content evaluation:

./qvec_cca.py --in_vectors ${your_vectors} --in_oracle oracles/ptb.pos_tags

Citation:

@InProceedings{qvec:enmlp:15,
author = {Tsvetkov, Yulia and Faruqui, Manaal and Ling, Wang and Lample, Guillaume and Dyer, Chris},
title={Evaluation of Word Vector Representations by Subspace Alignment},
booktitle={Proc. of EMNLP},
year={2015},
}

This repository is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/
