

Introduction

This repository was developed as part of the Fake News Challenge Stage 1 (FNC-1, http://www.fakenewschallenge.org/) by team Athene: Andreas Hanselowski, Avinesh PVS, Benjamin Schiller and Felix Caspelherr. In this project, we worked in close collaboration with Debanjan Chaudhuri.

Our new paper in COLING 2018: A Retrospective Analysis of the Fake News Challenge Stance Detection Task

Our Blog Post on the Fake News Challenge.

Prof. Dr. Iryna Gurevych, AIPHES-Ubiquitous Knowledge Processing (UKP) Lab, TU-Darmstadt, Germany

Requirements

  • Software dependencies

      python >= 3.4.0 (tested with 3.4.0)
    

Installation

  1. Install required python packages.

     python3.4 -m pip install -r requirements.txt --upgrade
    
  2. To reproduce the results of our best submission to FNC-1, go to the Athene_FNC-1 Google Drive, download features.zip and model.zip, and unzip them into their respective folders.

     unzip features.zip -d athene_system/data/fnc-1/features
     unzip model.zip -d athene_system/data/fnc-1/mlp_models
    
  3. Parts of the Natural Language Toolkit (NLTK) might need to be installed manually.

     python3.4 -c "import nltk; nltk.download('stopwords'); nltk.download('punkt'); nltk.download('wordnet')"
    
  4. Copy the Word2Vec file GoogleNews-vectors-negative300.bin.gz into the folder athene_system/data/embeddings/google_news/

  5. Download the Paraphrase Database: Lexical XL Paraphrases 1.0 and extract it into the ppdb folder.

     gunzip -c ppdb-1.0-xl-lexical.gz > athene_system/data/ppdb/ppdb-1.0-xl-lexical
    
  6. The Stanford parser runs as a separate server that has to be started alongside the pipeline. Download Stanford CoreNLP, extract it anywhere, and run the server from the extracted folder:

     wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
     unzip stanford-corenlp-full-2016-10-31.zip
     cd stanford-corenlp-full-2016-10-31
     java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9020
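
Before running the pipeline, it can help to confirm that the resources from the installation steps are in place. The following is a minimal sketch and not part of the repository: the helper names `missing_paths`, `port_is_open` and `setup_problems` are hypothetical, the paths are the ones named in this README, and only the ppdb folder (not the extracted file name) is checked. It avoids newer syntax so that it should also run on Python 3.4.

```python
import os
import socket

# Locations taken from the installation steps in this README.
EXPECTED_PATHS = [
    "athene_system/data/fnc-1/features",
    "athene_system/data/fnc-1/mlp_models",
    "athene_system/data/embeddings/google_news/GoogleNews-vectors-negative300.bin.gz",
    "athene_system/data/ppdb",
]


def missing_paths(root, relative_paths=EXPECTED_PATHS):
    """Return the entries of relative_paths that do not exist under root."""
    return [p for p in relative_paths
            if not os.path.exists(os.path.join(root, p))]


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True


def setup_problems(root=".", host="localhost", port=9020):
    """Collect human-readable problems; an empty list means all checks passed.

    Port 9020 is the one passed to StanfordCoreNLPServer in step 6.
    """
    problems = ["missing: {}".format(p) for p in missing_paths(root)]
    if not port_is_open(host, port):
        problems.append("CoreNLP server not reachable on {}:{}".format(host, port))
    return problems
```

Calling `setup_problems()` from the repository root returns a list of messages that can simply be printed. A successful TCP connection only shows that something is listening on the port, not that it is a CoreNLP server.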
    

Additional notes

  • To reproduce the classification results of the best submission on the day of the FNC-1, it is mandatory to use tensorflow v0.9.0 (ideally the GPU version) and the exact library versions stated in requirements.txt, including python 3.4.

  • Setup tested on Anaconda3 (tensorflow 0.9, GPU version):

      conda create -n env_python3.4 python=3.4 anaconda
    
      source activate env_python3.4
    
      env_python3.4/bin/python3.4 -m pip install -r requirements.txt --upgrade
    
      env_python3.4/bin/python3.4 -m pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp34-cp34m-linux_x86_64.whl
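
Because exact library versions matter for reproducing the results, a quick comparison of the pins in requirements.txt against the active environment may be useful. The sketch below is an illustration and not part of the repository: `parse_pins` and `mismatches` are hypothetical helpers, and the parser assumes plain `name==version` lines without extras or environment markers.

```python
def parse_pins(requirements_text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip empty and unpinned lines
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def mismatches(pins, installed):
    """Return {name: (wanted, found)}; found is None if the package is absent."""
    result = {}
    for name, wanted in pins.items():
        found = installed.get(name)
        if found != wanted:
            result[name] = (wanted, found)
    return result
```

The `installed` mapping (lower-cased package name to version string) could be built, for example, from `pkg_resources.working_set`, which ships with setuptools. Names that differ only in hyphens versus underscores are not normalized here and would be reported as missing.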
    

To Run

To run the pre-trained model on the test set:

python pipeline.py -p ftest

For more details

python pipeline.py --help         
    
    e.g.: python pipeline.py -p crossv holdout ftrain ftest
    
    * crossv: runs 10-fold cross-validation on the train / validation set and prints the results
    * holdout: trains the classifier on the train and validation sets, tests it on the holdout set and prints the results
    * ftrain: trains the classifier on the train / validation / holdout sets and saves it to athene_system/data/fnc-1/mlp_models
    * ftest: predicts the stances of the unlabeled test set based on the model (see Installation, step 2)

After ftest has finished, the predicted stances are saved to disk:

cat athene_system/data/fnc-1/fnc_results/submission.csv
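
For a quick sanity check of the predictions, the label distribution of the file can be summarized. This is a minimal sketch, not part of the repository: it assumes that submission.csv has a header row with a `Stance` column, as in the FNC-1 submission format, which this README does not state explicitly. The function names are hypothetical.

```python
import csv
from collections import Counter


def stance_counts(lines, stance_column="Stance"):
    """Count how often each label occurs in the stance column of a CSV.

    lines may be an open file or any iterable of CSV text lines.
    """
    reader = csv.DictReader(lines)
    return Counter(row[stance_column] for row in reader)


def print_stance_counts(path):
    """Print one 'label<TAB>count' line per stance, most frequent first."""
    with open(path, newline="", encoding="utf-8") as handle:
        for label, count in stance_counts(handle).most_common():
            print("{}\t{}".format(label, count))
```

Calling `print_stance_counts("athene_system/data/fnc-1/fnc_results/submission.csv")` after an ftest run prints the counts. If the column is named differently, pass the actual name via `stance_column`.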

System description

A more detailed description of the system, including the features that were used, can be found in the document system_description_athene.pdf