update README files
cherry979988 committed Oct 24, 2019
1 parent b693ce6 commit 48bdac9
Showing 6 changed files with 144 additions and 61 deletions.
19 changes: 19 additions & 0 deletions CoType/README.md
@@ -0,0 +1,19 @@
### Example Usage

KBP

```
CoType/retype-rm -data KBP -mode m -size 50 -negative 3 -threads 3 -alpha 0.0001 -samples 1 -iters 2000 -lr 0.001
python2 CoType/Evaluation/emb_dev_n_test.py extract KBP retypeRm cosine 0.0
```
NYT
```
CoType/retype-rm -data NYT -mode m -size 50 -negative 3 -threads 3 -alpha 0.0001 -samples 1 -iters 1000 -lr 0.01
python2 CoType/Evaluation/emb_dev_n_test.py extract NYT retypeRm cosine 0.0
```

TACRED
```
CoType/retype-rm -data TACRED -mode m -size 50 -negative 3 -threads 3 -alpha 0.0001 -samples 1 -iters 1000 -lr 0.01
python2 CoType/Evaluation/emb_dev_n_test.py extract TACRED retypeRm cosine 0.0
```
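The three command pairs above differ only in the dataset name, `-iters`, and `-lr`. As a minimal sketch, the Python helper below rebuilds them from a small settings table. The names `COTYPE_SETTINGS` and `cotype_commands` are hypothetical and not part of the repository; the helper only prints the command lines and does not run them.

```python
# Per-dataset values copied from the commands above; all other flags are shared.
# COTYPE_SETTINGS and cotype_commands are illustrative names, not repository code.
COTYPE_SETTINGS = {
    "KBP": {"iters": "2000", "lr": "0.001"},
    "NYT": {"iters": "1000", "lr": "0.01"},
    "TACRED": {"iters": "1000", "lr": "0.01"},
}


def cotype_commands(dataset):
    """Return (train, evaluate) argument lists for one dataset."""
    cfg = COTYPE_SETTINGS[dataset]
    train = [
        "CoType/retype-rm", "-data", dataset, "-mode", "m", "-size", "50",
        "-negative", "3", "-threads", "3", "-alpha", "0.0001",
        "-samples", "1", "-iters", cfg["iters"], "-lr", cfg["lr"],
    ]
    evaluate = [
        "python2", "CoType/Evaluation/emb_dev_n_test.py",
        "extract", dataset, "retypeRm", "cosine", "0.0",
    ]
    return train, evaluate


# Print the command lines for every dataset so they can be copied into a shell.
for name in COTYPE_SETTINGS:
    for command in cotype_commands(name):
        print(" ".join(command))
```

Keeping the per-dataset values in one table makes it easier to see which hyperparameters actually change between datasets.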
21 changes: 21 additions & 0 deletions LogisticRegression/README.md
@@ -0,0 +1,21 @@
### Example Usage

First, move to the model directory with `cd LogisticRegression`

KBP (Using default args)
```
python2 train.py
python2 test.py
```

NYT
```
python2 train.py --save_filename result_nyt.pkl --data_dir ../data/intermediate/NYT/rm
python2 test.py --save_filename result_nyt.pkl --data_dir ../data/intermediate/NYT/rm
```

TACRED
```
python2 train.py --save_filename result_tacred.pkl --data_dir ../data/intermediate/TACRED/rm
python2 test.py --save_filename result_tacred.pkl --data_dir ../data/intermediate/TACRED/rm
```
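The NYT and TACRED commands above follow one pattern: the result file is named after the lower-cased dataset, and the data directory is `../data/intermediate/<DATASET>/rm`. The sketch below encodes that pattern, assuming it holds for these three datasets only. The function name `logistic_regression_args` is hypothetical.

```python
def logistic_regression_args(dataset):
    """Return the extra arguments passed to both train.py and test.py.

    Illustrative helper, not part of the repository.
    """
    if dataset == "KBP":
        # KBP relies on the scripts' default arguments.
        return []
    return [
        "--save_filename", "result_{}.pkl".format(dataset.lower()),
        "--data_dir", "../data/intermediate/{}/rm".format(dataset),
    ]


# Print the train/test command pair for one dataset.
for script in ("train.py", "test.py"):
    print(" ".join(["python2", script] + logistic_regression_args("NYT")))
```

Passing the same arguments to both scripts keeps the saved result file and the data directory consistent between training and testing.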
32 changes: 32 additions & 0 deletions Neural/README.md
@@ -0,0 +1,32 @@
### Arguments

You can select the dataset, set hyperparameters, and choose how to handle the bias term by passing arguments. For simplicity, only some important arguments are listed here. Check the usage of all available arguments with `python Neural/train.py -h` and `python Neural/test.py -h`.

```
train.py
--data_dir DATA_DIR specify dataset with directory.
--model MODEL model name, (cnn|pcnn|bgru|lstm).
--fix_bias Train model with fix bias (not fixed by default).
--repeat REPEAT train the model for multiple times.
--info INFO description, also used as filename to save model.
```
```
test.py
--info INFO description, also used as filename to save model.
--repeat REPEAT test the model for multiple trains.
--thres_ratio THRES_RATIO
proportion of data to tune thres.
--bias_ratio BIAS_RATIO
proportion of data to estimate bias.
--cvnum CVNUM # samples to tune thres or estimate bias
--fix_bias test model with fix bias (not fixed by default).
```
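For readers unfamiliar with this style of help output, here is a sketch of how the `train.py` options listed above could be declared with `argparse`. It is an illustration only: the function name `build_train_parser`, the `int` type for `--repeat`, and the absence of defaults are assumptions, and the repository's actual parser may differ.

```python
import argparse


def build_train_parser():
    """Hypothetical parser mirroring the train.py options listed above."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--data_dir", help="specify dataset with directory")
    parser.add_argument("--model", choices=["cnn", "pcnn", "bgru", "lstm"],
                        help="model name")
    # A store_true flag is off unless passed, matching "not fixed by default".
    parser.add_argument("--fix_bias", action="store_true",
                        help="train the model with fixed bias")
    # Assumption: --repeat takes an integer, as in `--repeat 1`.
    parser.add_argument("--repeat", type=int,
                        help="train the model multiple times")
    parser.add_argument("--info",
                        help="description, also used as filename to save model")
    return parser


args = build_train_parser().parse_args(["--model", "pcnn", "--repeat", "1"])
print(args.model, args.repeat, args.fix_bias)  # prints: pcnn 1 False
```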

### Example Usage

KBP (Using default args)
```
python Neural/train.py --repeat 1
python Neural/eva.py --repeat 1
```

7 changes: 7 additions & 0 deletions NeuralATT/README.md
@@ -0,0 +1,7 @@
### Example Usage

KBP (Using default args)
```
python Neural/train.py --repeat 1
python Neural/eva.py --repeat 1
```
82 changes: 21 additions & 61 deletions README.md
@@ -1,7 +1,15 @@
Code for EMNLP 2019 paper "Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction" [[Link]](https://arxiv.org/abs/1904.09331)

### Environment Setup
_Todo: Briefly introduce our findings here._

### Content

- [Environment Setup](#environment-setup)
- [Download and Pre-processing](#download-and-pre-processing)
- [Models](#models)

### Environment Setup
We set up our environment in Anaconda (version: 5.2.0, build: py36_3) with the following commands.
```
conda create --name shifted
conda activate shifted
@@ -16,72 +24,24 @@ source deactivate

### Download and Pre-processing

Please check data download and pre-processing instructions in `/data`. Also, check README in `data/neural/vocab` to download our processed word embeddings and word2id file.


### Feature-based Models

For feature-based models, run `conda activate shifted` first to activate the environment.

#### 1. ReHession

KBP (hyper-params are using the default settings)
```
python ReHession/run.py --seed 1
python ReHession/eva.py --seed 1
```


NYT
```
python ReHession/run.py --dataset NYT --info NYT-default --input_dropout 0.5 --output_dropout 0.0 --seed 1
```

Note:
Please check the data download and pre-processing instructions in each data directory under `./data`. Also see [data/neural/vocab/README.md](data/neural/vocab/README.md) to download our processed word embeddings and word2id file.

By default, run.py trains the default model. Set "--bias fix" to use "Fix Bias" as said in the paper.
By default, eva.py evaluates the performance (1) without threshold, (2) with max threshold, (3) with entropy threshold. Set "--bias set" to enable "Set Bias" during evaluation. If you train a model with "--bias fix", you should pass the same flag to eva.py.

#### 2. CoType
### Models

KBP
Click on the model name to see the instructions on how to run each model.

```
CoType/retype-rm -data KBP -mode m -size 50 -negative 3 -threads 3 -alpha 0.0001 -samples 1 -iters 2000 -lr 0.001
python2 CoType/Evaluation/emb_dev_n_test.py extract KBP retypeRm cosine 0.0
```
NYT
```
CoType/retype-rm -data NYT -mode m -size 50 -negative 3 -threads 3 -alpha 0.0001 -samples 1 -iters 1000 -lr 0.01
python2 CoType/Evaluation/emb_dev_n_test.py extract NYT retypeRm cosine 0.0
```
#### Feature-based Models

#### 3. Logistic Regression
Run `conda activate shifted` first to activate the environment for feature-based models.

First, move to the model directory with `cd LogisticRegression`
1. [ReHession](ReHession/README.md)
2. [CoType](CoType/README.md)
3. [Logistic Regression](LogisticRegression/README.md)

KBP (data_dir is using default)
```
python2 train.py
python2 test.py
```
#### Neural Models

NYT
```
python2 train.py --save_filename result_nyt.pkl --data_dir ../data/intermediate/NYT/rm
python2 test.py --save_filename result_nyt.pkl
```

### Neural Models

First activate the environment with `source activate shifted-neural`

#### 1. Bi-GRU / Bi-LSTM / PCNN / CNN

KBP (data_dir is using default)
```
python Neural/train.py --repeat 1
python Neural/eva.py --repeat 1
```
Run `conda activate shifted-neural` first to activate the environment for neural models.

You may specify the dataset, save directory, hyperparams (dropout, lr, lr_decay, etc.) by passing arguments. Try `python Neural/train.py -h` to check the usage of each argument.
1. [Bi-GRU / Bi-LSTM / PCNN / CNN](Neural/README.md)
2. [Bi-GRU + ATT / PCNN + ATT](NeuralATT/README.md)
44 changes: 44 additions & 0 deletions ReHession/README.md
@@ -0,0 +1,44 @@
### Arguments
You can select the dataset, set hyperparameters, and choose how to handle the bias term by passing arguments. For simplicity, only some important arguments are listed here. Check the usage of all available arguments with `python ReHession/run.py -h` and `python ReHession/eva.py -h`.

```
run.py
--dataset DATASET name of the dataset, (KBP|NYT|TACRED).
--bias BIAS ways to handle bias term, (default|fix).
--info INFO description, also used as filename to save model.
```

```
eva.py
--dataset DATASET name of the dataset, (KBP|NYT|TACRED).
--bias BIAS ways to handle bias term, (default|fix|set)
--info INFO description, also used as filename to load model.
--thres_ratio THRES_RATIO
proportion of data to tune thres.
--bias_ratio BIAS_RATIO
proportion of data to estimate bias.
```

By default, `eva.py` evaluates performance (1) without a threshold, (2) with the max threshold, and (3) with the entropy threshold. Set `--bias set` to enable "Set Bias" during evaluation. If you train a model with `--bias fix`, you should pass the same flag to `eva.py`.
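As a rough illustration of the three evaluation modes, the sketch below shows one common reading of them: with a max threshold, the model falls back to the "None" relation when its highest predicted probability is below the threshold; with an entropy threshold, it falls back when the entropy of the predicted distribution is above the threshold. This reading, and the names `predict`, `none_id`, `max_thres`, and `entropy_thres`, are assumptions for illustration; the exact rule implemented in `eva.py` may differ.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def predict(probs, none_id, max_thres=None, entropy_thres=None):
    """Pick a relation id, falling back to none_id for uncertain predictions.

    Hypothetical helper, not the repository's implementation.
    """
    best = max(range(len(probs)), key=probs.__getitem__)
    # Max threshold: reject when the top probability is too low.
    if max_thres is not None and probs[best] < max_thres:
        return none_id
    # Entropy threshold: reject when the distribution is too flat.
    if entropy_thres is not None and entropy(probs) > entropy_thres:
        return none_id
    return best


probs = [0.1, 0.5, 0.4]  # index 0 plays the role of the "None" relation
print(predict(probs, none_id=0))                     # no threshold
print(predict(probs, none_id=0, max_thres=0.6))      # max threshold
print(predict(probs, none_id=0, entropy_thres=0.9))  # entropy threshold
```

In this sketch the threshold values are passed in directly; in the scripts above, `--thres_ratio` controls the proportion of data used to tune them.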


### Example Usage
KBP (Using default args)
```
python ReHession/run.py --seed 1
python ReHession/eva.py --seed 1
```


NYT
```
python ReHession/run.py --dataset NYT --info NYT-default --input_dropout 0.5 --output_dropout 0.0 --seed 1
python ReHession/eva.py --info NYT-default
```


TACRED
```
python ReHession/run.py --dataset TACRED --info TACRED-default --input_dropout 0.2 --output_dropout 0.1 --seed 2
python ReHession/eva.py --info TACRED-default
```
