Repository to contain code and figures for: When Reproducibility goes Sideways: The Case of Knowledge-Enhanced Document Embeddings

stefano-marchesin/learning_ke_dembs

When Reproducibility goes Sideways: The Case of Knowledge-Enhanced Document Embeddings

This repository contains code and information for:

Knowledge-Enhanced Document Embeddings for IR

Requirements

  • ElasticSearch 6.6
  • Python 3
    • Numpy
    • TensorFlow >= 1.13
    • Whoosh
    • SQLite3
    • Cvangysel
    • Pytrec_Eval
    • Scikit-Learn
    • Tqdm
    • QuickUMLS
    • Elasticsearch
    • Elasticsearch_dsl
  • UMLS 2018AA

Additional Notes

The server.py in this repository must replace the one in the QuickUMLS folder, as it is a modified version required to run the knowledge-enhanced models.
The folder structure required to run the experiments is shown in the example folder. Python files must be placed in the root folder.
The qrels file must be in .txt format.
To perform retrofitting, run retrofit_doc_vecs.py; to train the PV-DM and cDoc2Vec models, run gensim_doc2vec.py.
To run BM25, use the Jupyter notebook elastic_search.ipynb.
To perform query expansion, run qe_combsum.py.
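
For readers unfamiliar with the fusion method that qe_combsum.py is named after, the sketch below illustrates generic CombSUM: each ranked list is normalized and the per-document scores are summed. It is not taken from qe_combsum.py. The function names, the min-max normalization, and the toy scores are assumptions made for illustration only.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to [0, 1].

    Assumption: min-max normalization; the repository script may
    normalize differently or not at all.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combsum(runs):
    """Fuse several {doc_id: score} runs by summing normalized scores.

    Returns (doc_id, fused_score) pairs sorted by descending score,
    with ties broken by doc_id.
    """
    fused = defaultdict(float)
    for run in runs:
        for doc, s in min_max_normalize(run).items():
            fused[doc] += s
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for one query from two retrieval models.
bm25_run = {"d1": 12.0, "d2": 8.0, "d3": 4.0}
embedding_run = {"d2": 0.9, "d3": 0.5, "d4": 0.1}

ranking = combsum([bm25_run, embedding_run])
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

A document missing from one run simply contributes nothing from that run, so documents retrieved by both models tend to rise in the fused ranking.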
