bologi

The Bologi Triple Scorer 2017

for testing, run the test script, which runs the prediction and evaluates it: 'source ./src/test.sh [#tuples to test] [profession/nationality] [vm/dev]'

usage:

if you're not running on the vm, make sure the wiki file is downloaded and placed under ./data/raw_data/
'./data/intermediate_data/' should contain the following: name2sentence, nationality_words_table.txt, profession_words_table.txt (name2sentence can be an empty folder; the look-up dictionary is stored there in the next step)
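
a minimal sketch for checking that layout before running anything. The three entry names come from the list above; the helper name `missing_intermediate_entries` is hypothetical and not part of this repo, and the sketch is written for python3:

```python
import os

# Entries this README says must exist under ./data/intermediate_data/
REQUIRED_ENTRIES = (
    "name2sentence",
    "nationality_words_table.txt",
    "profession_words_table.txt",
)


def missing_intermediate_entries(base_dir="./data/intermediate_data"):
    """Return the required entries that are absent from base_dir.

    Hypothetical helper for illustration; it only checks existence,
    not the contents of the files or the folder.
    """
    return [
        name
        for name in REQUIRED_ENTRIES
        if not os.path.exists(os.path.join(base_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_intermediate_entries()
    if missing:
        print("missing from ./data/intermediate_data/: " + ", ".join(missing))
    else:
        print("intermediate data layout looks complete")
```

run it from the repo root so the default relative path resolves; an empty name2sentence folder counts as present.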

on first use on your own machine, create the name2sentence dictionaries: 'python2 ./src/cjy_dict_generator.py'

run the main script to predict tuples (the paths here are relative to ./src): 'python2 cjy_main.py -i ../data/input_tuple/[input filename] -o ../data/output_data'
inspect the output triples: 'cd ./data/output_data/[input filename]'
evaluate the score with ./src/evaluator.py: 'python3 evaluator.py [-h] --run RUN --truth TRUTH --output OUTPUT'
a shell script is provided for a quick run and eval on the training triples: 'cd ./src && source test.sh [#tuples to test] [profession/nationality] [vm/dev] <0 for unstemmed table>'; the latter two parameters are optional and meant for development
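
the predict and evaluate steps above can also be assembled programmatically. This is only a sketch: the flags (-i, -o, --run, --truth, --output) are the ones listed above, while the function names, the file name example.train and the run/truth/output paths are hypothetical placeholders, since this README does not say what each evaluator argument should point to:

```python
def predict_command(input_name, output_dir="../data/output_data"):
    """Build the cjy_main.py call from the usage notes (run from ./src)."""
    return [
        "python2", "cjy_main.py",
        "-i", "../data/input_tuple/" + input_name,
        "-o", output_dir,
    ]


def evaluate_command(run, truth, output):
    """Build the evaluator.py call; all three paths are supplied by the caller."""
    return [
        "python3", "evaluator.py",
        "--run", run,
        "--truth", truth,
        "--output", output,
    ]


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    steps = [
        predict_command("example.train"),
        evaluate_command("path/to/run", "path/to/truth", "path/to/output"),
    ]
    for step in steps:
        # Print instead of executing; pass each list to subprocess.run to run it.
        print(" ".join(step))
```

the commands are kept as argument lists so they can be handed to subprocess.run without shell quoting issues.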

on the vm, the data is already located at bologi@tira-ubuntu:/media/training-datasets/triple-scoring/
download the data as a zip: 'curl http://broccoli.cs.uni-freiburg.de/wsdm-cup-2017/triple-scoring.zip -o <output path such as ./data/raw_data/wiki-sentences>'
wiki data path on the vm: "/media/training-datasets/triple-scoring/wsdmcup17-triple-scoring-training-dataset-2016-09-16/wiki-sentences"
