End-to-end relation extraction and knowledge base population pipeline.
RelationFactory is a relation extraction and knowledge-base population system.
It was the top-ranked system in TAC KBP 2013 English Slot-filling
(http://www.nist.gov/tac/2013/KBP/index.html).
If you want to use RelationFactory in a TAC benchmark, please contact the
authors (see LICENSE for details).
RelationFactory uses SVMLight (http://svmlight.joachims.org/) for
classification, so you must agree to the SVMLight license, in particular to
its restriction to scientific use only.

QUICK START
===========

0. Prerequisites
Make sure the following software is installed:
    ghc, version >= 7.4.1
    cabal, version >= 1.14.0
    java / JDK, version >= 6 (the Oracle one)
    unix tools, including wget

1. Download models
If you want to use pre-trained models, download them from our server:
    wget https://www.lsv.uni-saarland.de/fileadmin/data/relationfactory_models.tar.gz
    tar xzf relationfactory_models.tar.gz

2. Set paths
E.g. by putting the following lines in your ~/.bashrc :
    # relationfactory clone
    export TAC_ROOT=/path/to/relationfactory
    # pre-trained models
    export TAC_MODELS=/path/to/relationfactory_models
The TAC_ROOT variable has to be set. The TAC_MODELS variable is optional. If
it is not set, the models have to be specified in the config file.

3. Compile system
    $TAC_ROOT/bin/generate_system.sh

4. Index corpus
See the corresponding README in $TAC_ROOT/indexing

5. Configure run
Start from the settings in $TAC_ROOT/config/system2013.config and adapt them
to your models and index locations. Also point to the TAC queries file for
which you want results, and specify a rundir where the files for that run are
written.

6. Run
    $TAC_ROOT/bin/run.sh your_system.config

7. Check response
Check the output file, /your/rundir/response_fast_pp13. For each query it
should contain a mixture of NIL answers and other answers, many of them with
a score of 1.0 and others with lower scores. Evaluate your run using the
official TAC scorer.
Note that due to refactoring, the answers returned differ slightly from those
in TAC 2013. The 'exact' evaluation, which depends on document ids and
offsets being included in the answer pool, is very sensitive to such
differences. Use the 'anydoc' evaluation mode to obtain more robust scores.

8. How to change the pipeline
Edit $TAC_ROOT/bin/makefile and insert a rule describing your new target.
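
Before compiling, it can help to confirm that the tools from step 0 are on the
PATH and that the variables from step 2 are set. The following is a minimal
sketch of such a preflight check. It is not part of RelationFactory; the
function names missing_tools, check_paths and preflight are hypothetical, and
the check only tests that each tool is present, not that it meets the minimum
version listed above.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper, not shipped with RelationFactory.
# Checks presence of the step 0 tools and the step 2 variables only;
# it does NOT verify the minimum versions.

# Print the names of the given commands that are not on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# TAC_ROOT is required and must be a directory; TAC_MODELS is optional.
check_paths() {
    if [ -z "${TAC_ROOT:-}" ]; then
        echo "ERROR: TAC_ROOT is not set"
        return 1
    fi
    if [ ! -d "$TAC_ROOT" ]; then
        echo "ERROR: TAC_ROOT is not a directory: $TAC_ROOT"
        return 1
    fi
    if [ -z "${TAC_MODELS:-}" ]; then
        echo "NOTE: TAC_MODELS is not set; specify the models in the config file"
    fi
    echo "OK: TAC_ROOT=$TAC_ROOT"
}

# Run both checks; returns non-zero on the first problem found.
preflight() {
    local missing
    missing=$(missing_tools ghc cabal java wget)
    if [ -n "$missing" ]; then
        # Left unquoted on purpose so the names are joined on one line.
        echo "ERROR: missing tools:" $missing
        return 1
    fi
    check_paths
}
```

To use it, source the file in a shell where ~/.bashrc has been loaded and call
preflight; a non-zero return status means a tool or TAC_ROOT is missing.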