Semantic-Augmentation

Evaluation of the Semantic Augmentation method

Steps to run the code

1. Run process_tfidf_wiki.ipynb, which generates the sparse TF-IDF matrix for all Wikipedia keywords and saves it in many chunks.
1. Run python parse.py.
1. You do not need to run calculate_sim.py (I already generated its output on the server, and it takes some time). It generates files in query_pkl, where each file corresponds to a query and its top-n similar words. It then calls cal_tag_tag_sim from util.py to calculate tag-tag similarity based on the p-percent quantile.
1. Run semantic_augmentation.py.
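
The quantile-based tag-tag similarity mentioned in step 3 can be illustrated with a minimal sketch. This is not the code in util.py: the function name quantile_tag_similarity_sketch, the use of cosine similarity, and the idea of representing each tag by the vectors of its top-n similar words are all assumptions made for illustration. The sketch computes every pairwise similarity between the two tags' word vectors and returns the value at quantile p.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def quantile(values, p):
    """Quantile of a non-empty list, with linear interpolation between ranks."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def quantile_tag_similarity_sketch(words_a, words_b, p=0.9):
    """Hypothetical helper: similarity of two tags as the p-quantile of all
    pairwise cosine similarities between their similar-word vectors.

    words_a, words_b: lists of vectors (assumed here to be the vectors of
    each tag's top-n similar words).
    """
    sims = [cosine(u, v) for u in words_a for v in words_b]
    return quantile(sims, p)


if __name__ == "__main__":
    tag_a = [[1.0, 0.0], [0.0, 1.0]]
    tag_b = [[1.0, 0.0]]
    print(quantile_tag_similarity_sketch(tag_a, tag_b, p=0.5))
```

A high p makes the score depend on the best-matching word pairs, while a p near 0.5 behaves like a median over all pairs; which value the repository uses is not stated here.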
