This repository contains the source code for the machine learning (word embedding) approach developed in this paper:
Arash Dargahi Nobari, Sajad Sotudeh Gharebagh and Mahmood Neshati. “Skill Translation Models in Expert Finding”,
In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’17), Aug 2017.
You may check the paper (PDF) for more information.
The main repository, which holds the rest of the code, can be accessed here.
Python 3.5 and TensorFlow 0.11 are required to run the code.
Many files in this project are only test and backup files. Only the following files are important:
deep_transalteion_new.py: Runs the algorithm without the bias parameter.
result.py: Produces the final results without the bias parameter.
deep_transalteion_new_bias.py: Runs the algorithm with the bias parameter.
result_with_bias.py: Produces the final results with the bias parameter.
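The pairing of scripts described above can be sketched as a small helper. This is only an illustration: the script names come from the list above, but the function names `scripts_for` and `commands_for`, the `python3` executable name, and the assumption that each script runs without command-line arguments (algorithm first, results second) are hypothetical and not confirmed by the repository.

```python
# Script names are taken from the README; the pairing, the run order, and
# the absence of command-line arguments are assumptions for illustration.
SCRIPTS = {
    False: ("deep_transalteion_new.py", "result.py"),
    True: ("deep_transalteion_new_bias.py", "result_with_bias.py"),
}


def scripts_for(use_bias):
    """Return (algorithm_script, results_script) for the chosen variant."""
    return SCRIPTS[bool(use_bias)]


def commands_for(use_bias, python="python3"):
    """Build one command per script, in the assumed run order."""
    return [[python, name] for name in scripts_for(use_bias)]


if __name__ == "__main__":
    # Print the commands instead of running them, since the scripts
    # depend on data files that may not be present yet.
    for command in commands_for(use_bias=True):
        print(" ".join(command))
```

Each command list could be passed to `subprocess.run` once the data files are in place.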
Before running the code, you need to generate the 1gram.csv file using the Java app in this repository and put it into the data/grams folder.
All data files are ignored in the git repository.
They can be downloaded from Dropbox. The download contains the data folder.
The data folder includes the following files and folders:
grams: TF and TF/IDF weights for words.
java_a_tag.txt: Tags for each answer (answers do not have tags of their own; the tags are extracted from the related questions).
doc_len.txt: Normalized length of each answer.
topWords.txt: The most frequent words in the answers.
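Before running the scripts, it may help to confirm that the data folder is laid out as described. The following is a minimal sketch, assuming the listed entries sit directly under `data` and that the generated 1gram.csv has been placed in `data/grams`; the function name `missing_data_files` and the `EXPECTED` list are hypothetical helpers, not part of the repository.

```python
from pathlib import Path

# Entries named in this README, relative to the data folder.
# Their exact location under data/ is an assumption.
EXPECTED = [
    "grams",
    "grams/1gram.csv",
    "java_a_tag.txt",
    "doc_len.txt",
    "topWords.txt",
]


def missing_data_files(data_dir="data"):
    """Return the expected entries that do not exist under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from data folder:", ", ".join(missing))
    else:
        print("All expected data files are present.")
```

The check only looks at file presence; it does not validate file contents, since their formats are not specified here.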
Please cite the paper if you use the code in this repository.
@inproceedings{DargahiNobari:2017:STM,
author = {Dargahi Nobari, Arash and Sotudeh Gharebagh, Sajad and Neshati, Mahmood},
title = {Skill Translation Models in Expert Finding},
booktitle = {Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval},
series = {SIGIR '17},
year = {2017},
isbn = {978-1-4503-5022-8},
location = {Shinjuku, Tokyo, Japan},
pages = {1057--1060},
numpages = {4},
url = {http://doi.acm.org.ezp3.semantak.com/10.1145/3077136.3080719},
doi = {10.1145/3077136.3080719},
acmid = {3080719},
publisher = {ACM},
keywords = {expertise retrieval, semantic matching, stackoverflow, statistical machine translation, talent acquisition},
}