Update the Readme with additional installation instructions
daniilsorokin committed Feb 11, 2019
1 parent d7c106b commit f59b86d
Showing 2 changed files with 67 additions and 27 deletions.
9 changes: 9 additions & 0 deletions NOTICE.txt
@@ -0,0 +1,9 @@
-------------------------------------------------------------------------------
Copyright 2019
Ubiquitous Knowledge Processing (UKP) Lab
Technische Universität Darmstadt

-------------------------------------------------------------------------------
Third party legal information


85 changes: 58 additions & 27 deletions README.md
@@ -1,16 +1,15 @@
![2010-07-07_ukp_banner](https://user-images.githubusercontent.com/29311022/27184688-27629126-51e3-11e7-9a23-276628da2430.png)
<img src="https://user-images.githubusercontent.com/29311022/27184688-27629126-51e3-11e7-9a23-276628da2430.png" height=70px/>
<img src="https://user-images.githubusercontent.com/29311022/27278631-2e19f99e-54e2-11e7-919c-f89ae0c90648.png" height=70px/>
<img src="https://user-images.githubusercontent.com/29311022/27184769-65c6583a-51e3-11e7-90e0-12a4bdf292e2.png" height=70px/>

![aiphes_logo - small](https://user-images.githubusercontent.com/29311022/27278631-2e19f99e-54e2-11e7-919c-f89ae0c90648.png)
![tud_weblogo](https://user-images.githubusercontent.com/29311022/27184769-65c6583a-51e3-11e7-90e0-12a4bdf292e2.png)
# Multi-Sentence Textual Entailment for Claim Verification
## This repository was constructed by team Athene for the [FEVER shared task 1](http://fever.ai/2018/task.html). The system reached the third rank in the overall results and first rank on the evidence recall sub-task

This repository builds upon the baseline system repository developed by the FEVER shared task organizers: https://github.com/sheffieldnlp/fever-naacl-2018

## This repository was constructed by team Athene for the FEVER shared task 1 (http://fever.ai/2018/task.html)
### The system reached the third rank in the overall results and first rank on the evidence recall sub-task
This is an accompanying repository for our FEVER Workshop paper at EMNLP 2018. For more information see the paper: [UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification](https://arxiv.org/pdf/1809.01479.pdf)

This repository builts upon the baseline system repository developed by the FEVER shared task organizers: (https://github.com/sheffieldnlp/fever-naacl-2018)

For more information see our paper: [UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification](https://arxiv.org/pdf/1809.01479.pdf)
* BibTeX:
Please use the following citation:

@article{hanselowski2018ukp,
title={UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification},
@@ -20,11 +19,34 @@ For more information see our paper: [UKP-Athene: Multi-Sentence Textual Entailme
}


Disclaimer:
> This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

### Requirements
* Python 3.6
* AllenNLP
* TensorFlow

### Installation

* Download and install Anaconda (https://www.anaconda.com/)
* Create a Python Environment and activate it:
```bash
conda create -n fever python=3.6
source activate fever
```
* Install the required dependencies
```bash
pip install -r requirements.txt
```
* Download NLTK Punkt Tokenizer
```bash
python -c "import nltk; nltk.download('punkt')"
```
* Proceed with downloading the data set, the embeddings, the models and the evidence data
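
Before downloading the data, it can help to confirm that the environment matches the requirements listed above. The following is a minimal sketch and not part of this repository: the function name `check_environment` is hypothetical, and `allennlp`, `tensorflow` and `nltk` are assumed to be the import names of the listed requirements.

```python
import importlib.util
import sys


def check_environment(modules=("allennlp", "tensorflow", "nltk"), python=(3, 6)):
    """Report whether the interpreter version and the given modules look usable.

    `modules` holds top-level import names (an assumption, see above);
    `python` is the expected (major, minor) interpreter version.
    """
    report = {"python": sys.version_info[:2] == tuple(python)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


# Example (run inside the activated conda environment):
# for item, ok in check_environment().items():
#     print(item, "OK" if ok else "MISSING")
```

The check only looks for the modules on the import path; it does not import them, so it will not catch a broken installation.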

## Download Data
### Download the FEVER data set
Download the FEVER dataset from [the website of the FEVER shared task](https://sheffieldnlp.github.io/fever/data.html) into the data directory.

mkdir data
@@ -34,7 +56,10 @@ Download the FEVER dataset from [the website of the FEVER share task](https://sh
wget -O data/fever-data/train.jsonl https://s3-eu-west-1.amazonaws.com/fever.public/train.jsonl
wget -O data/fever-data/dev.jsonl https://s3-eu-west-1.amazonaws.com/fever.public/shared_task_dev.jsonl
wget -O data/fever-data/test.jsonl https://s3-eu-west-1.amazonaws.com/fever.public/shared_task_test.jsonl
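
To sanity-check the downloaded files, the label distribution can be counted with a few lines of Python. This is a sketch under assumptions, not code from this repository: it assumes each line of a `.jsonl` file is one JSON object with a `label` field (missing for unlabeled data such as the test set), and `count_labels` is a hypothetical helper name.

```python
import json
from collections import Counter


def count_labels(lines):
    """Count the values of the `label` field over JSON-lines records."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # Records without a label are grouped under a placeholder key.
        counts[record.get("label", "<no label>")] += 1
    return counts


# Example, using the path from the download commands above:
# with open("data/fever-data/train.jsonl", encoding="utf-8") as handle:
#     print(count_labels(handle))
```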



### Download the word embeddings

Download pretrained GloVe Vectors

wget http://nlp.stanford.edu/data/wordvecs/glove.6B.zip
@@ -47,40 +72,46 @@ Download pretrained Wiki FastText Vectors
mkdir -p data/fasttext
unzip wiki.en.zip -d data/fasttext
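
After unpacking, a quick look at the first few vectors confirms that the files are intact. The sketch below is illustrative only and not part of this repository. It assumes the text formats published with the embeddings: one word per line followed by space-separated floats, with fastText `.vec` files additionally starting with a `<vocabulary size> <dimension>` header line. `read_vectors` is a hypothetical helper name.

```python
def read_vectors(lines, limit=5):
    """Parse up to `limit` word vectors from text-format embedding lines."""
    vectors = {}
    for index, line in enumerate(lines):
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        # A two-field first line is treated as the fastText header and skipped.
        # (This would misread a one-dimensional GloVe-style file.)
        if index == 0 and len(parts) == 2:
            continue
        vectors[parts[0]] = [float(value) for value in parts[1:]]
        if len(vectors) >= limit:
            break
    return vectors


# Example; the file name inside the archive is an assumption:
# with open("data/fasttext/wiki.en.vec", encoding="utf-8") as handle:
#     for word, vector in read_vectors(handle).items():
#         print(word, len(vector))
```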

## Create Python Environment
conda create -n fever python=3.6
source activate fever
pip install -r requirements.txt


Download NLTK Punkt Tokenizer

python -c "import nltk; nltk.download('punkt')"

## Data Preparation
The data preparation consists of three steps: downloading the articles from Wikipedia, indexing these for the Evidence Retrieval and performing the negative sampling for training .
### Download evidence data
The data preparation consists of three steps: (1) downloading the articles from Wikipedia, (2) indexing these for the evidence retrieval, and (3) performing the negative sampling for training.

### 1. Download Wikipedia data:
#### 1. Download Wikipedia data:

Download the pre-processed Wikipedia articles and unzip them into the data folder.

wget https://s3-eu-west-1.amazonaws.com/fever.public/wiki-pages.zip
unzip wiki-pages.zip -d data


### 2. Indexing
#### 2. Indexing
Construct an SQLite Database (go grab a coffee while this runs)

PYTHONPATH=src python src/scripts/build_db.py data/wiki-pages data/fever/fever.db
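
Once the build has finished, the resulting file can be inspected with Python's standard `sqlite3` module. This is a minimal sketch, not part of this repository: it makes no assumption about the schema that `build_db.py` creates and simply lists whatever tables it finds together with their row counts. `list_tables` is a hypothetical helper name.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: row_count} for every table in an SQLite file."""
    # Note: sqlite3.connect creates an empty file if db_path does not exist.
    connection = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: connection.execute(
                'SELECT COUNT(*) FROM "{}"'.format(name)
            ).fetchone()[0]
            for name in names
        }
    finally:
        connection.close()


# Example, using the output path from the command above:
# print(list_tables("data/fever/fever.db"))
```

Counting rows scans each table, so on a database of this size the call may take a moment.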

#### 3. Sampling negative evidence
[coming soon]


## Run the end to end pipeline of the submitted models
### Download the UKP-Athene models
[coming soon]

### Run the end-to-end pipeline of the submitted models

    PYTHONPATH=src python src/scripts/athene/pipeline.py

## Run the variation of the RTE model
### Run the variation of the RTE model
Another variation of the ESIM model is configured through the config file in the conf folder.

To run the models:

PYTHONPATH=src python src/scripts/athene/pipeline.py --config conf/<config_file>
PYTHONPATH=src python src/scripts/athene/pipeline.py --config conf/<config_file>

### Contacts:
If you have any questions regarding the code, please don't hesitate to contact the authors or report an issue.
* \<lastname\>@ukp.informatik.tu-darmstadt.de
* https://www.informatik.tu-darmstadt.de/ukp/ukp_home/
* https://www.tu-darmstadt.de

### License:
* Apache License Version 2.0
