
WordNet Embeddings

wnet2vec

Article

Saedi, Chakaveh, António Branco, João António Rodrigues and João Ricardo Silva, 2018, "WordNet Embeddings", In Proceedings, 3rd Workshop on Representation Learning for Natural Language Processing (RepL4NLP), 56th Annual Meeting of the Association for Computational Linguistics, 15-20 July 2018, Melbourne, Australia.

WordNet used in the above paper

Princeton WordNet 3.0

Test sets used in the above paper

Please note that the semantic network to semantic space method presented in the above paper includes randomized subprocedures (e.g. selecting one word from a set of words with an identical number of outgoing edges), so the test scores may fluctuate slightly across different runs of the code.
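For illustration, the kind of tie-breaking step this refers to can be sketched as follows (a minimal example; the function and variable names are illustrative, not taken from the repository):

import random
from collections import defaultdict

def pick_word_with_max_outdegree(out_edges):
    # Group words by their number of outgoing edges.
    by_degree = defaultdict(list)
    for word, targets in out_edges.items():
        by_degree[len(targets)].append(word)
    # Several words may share the top out-degree; pick one at random,
    # which is the source of the run-to-run fluctuation mentioned above.
    return random.choice(by_degree[max(by_degree)])

# 'dog' and 'cat' both have two outgoing edges, so either may be returned.
edges = {"dog": ["canine", "pet"], "cat": ["feline", "pet"], "pet": ["animal"]}
print(pick_word_with_max_outdegree(edges))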

SimLex-999

RG1965

WordSim-353-Similarity

WordSim-353-Relatedness

MEN

MTurk-771
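Each of these benchmarks pairs two words with a human similarity or relatedness rating, and the usual score is the Spearman correlation between those ratings and the cosine similarity of the corresponding word vectors. A minimal sketch of that evaluation, assuming the vectors are held in a plain word-to-array dictionary (the names below are illustrative, not the repository's actual code):

import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate(embeddings, test_pairs):
    # embeddings: dict mapping word -> numpy vector
    # test_pairs: list of (word1, word2, human_score) tuples
    gold, predicted = [], []
    for w1, w2, score in test_pairs:
        if w1 in embeddings and w2 in embeddings:
            gold.append(score)
            predicted.append(cosine(embeddings[w1], embeddings[w2]))
    rho, _ = spearmanr(gold, predicted)
    return rho

# Toy usage with random vectors; a real run would load wnet2vec vectors
# and one of the test sets above, e.g. SimLex-999.
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["car", "automobile", "cup", "coffee"]}
pairs = [("car", "automobile", 9.0), ("cup", "coffee", 3.5)]
print(evaluate(emb, pairs))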

Models

The best wnet2vec model we have obtained, run with 60,000 words from Princeton WordNet 3.0 and referred to in the article, is available for download here.
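Since gensim is among the dependencies listed below, the downloaded vectors can presumably be loaded with its KeyedVectors utilities. The file name and format in this sketch are assumptions about the distributed model, not something stated in this README:

from gensim.models import KeyedVectors

# Hypothetical file name; adjust it to whatever the downloaded archive contains.
# word2vec text format is assumed; pass binary=True if the file is binary.
wv = KeyedVectors.load_word2vec_format("wnet2vec_60k.txt", binary=False)

print(wv.most_similar("dog", topn=5))
print(wv.similarity("car", "automobile"))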

How to run the wnet2vec software

To provide input files to the software, the following directory structure must exist:

|-- main.py
|-- data
|   |-- input
|   |   |-- language_wnet
|   |   |   |-- *wnet_files
|   |   |-- language_testset
|   |   |   |-- *testset_files
|   |-- output
|-- modules
|   |-- input_output.py
|   |-- sort_rank_remove.py
|   |-- vector_accuracy_checker.py
|   |-- vector_distance.py
|   |-- vector_generator.py

Here language is the language you are using, which must be indicated in main.py via the variable lang. If the language isn't supported by the current path routing in the code, which was mainly used for our experiments, you may add the path to the corresponding directory in input_output.py, vector_generator.py and vector_accuracy_checker.py.

Various variables for the output of the model, such as the embedding dimension, can be found in main.py.
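The exact identifiers depend on the copy of main.py you have; as a rough illustration only (of these names, only lang is mentioned in this README, the rest are hypothetical), the settings to review before a run look like this:

# In main.py (illustrative values; only `lang` is named in this README)
lang = "English"      # selects data/input/<language>_wnet and data/input/<language>_testset
emb_dim = 300         # embedding dimension of the output vectors (hypothetical name and value)
vocab_size = 60000    # number of WordNet words to keep (hypothetical name; 60,000 matches the released model)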

To run the software, you will need the following packages:

  • numpy
  • progressbar
  • keras
  • sklearn
  • scipy
  • gensim

Python 3.5 was used for the experiments.
