Usage

We mainly focus on the PDTB 2.0 dataset.

Place the PDTB 2.0 folder in the repository root (./).

Then collect all the documents from the dataset.

Split the documents into argument pairs, together with their corresponding paragraphs:

python ./processed_data/split_pairs.py
python ./processed_data/pairs2sentences.py

Download glove.840B.300d.zip, unzip it, and place the contents in ./. Then prepare the data further:

python ./vocab_emb.py
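
For orientation, a vocabulary and embedding step of this kind usually reads the GloVe text file and keeps only the vectors for words in the dataset. The following is a minimal sketch under that assumption, not the contents of vocab_emb.py: the helper name `load_glove_subset`, the `<pad>` and `<unk>` entries, and the random initialisation for words missing from GloVe are all hypothetical choices made for illustration.

```python
import numpy as np


def load_glove_subset(glove_path, vocab, dim=300, seed=0):
    """Build (word2id, embedding matrix) for `vocab` from a GloVe text file.

    Hypothetical helper for illustration; not taken from vocab_emb.py.
    """
    # Index 0 is padding and index 1 is the unknown-word token (assumptions).
    specials = ["<pad>", "<unk>"]
    words = specials + sorted(set(vocab) - set(specials))
    word2id = {w: i for i, w in enumerate(words)}

    rng = np.random.RandomState(seed)
    # Words that never appear in the GloVe file keep this random initialisation.
    emb = rng.uniform(-0.1, 0.1, size=(len(words), dim)).astype(np.float32)
    emb[0] = 0.0  # the padding vector is all zeros

    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            # A few glove.840B tokens contain spaces, so split from the right.
            parts = line.rstrip("\n").rsplit(" ", dim)
            if len(parts) != dim + 1:
                continue  # skip malformed lines
            idx = word2id.get(parts[0])
            if idx is not None and idx > 1:
                emb[idx] = [float(x) for x in parts[1:]]
    return word2id, emb
```

Splitting from the right keeps multi-word tokens intact, and restricting the matrix to the dataset vocabulary avoids holding the full GloVe table in memory.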

Build the adjacency matrices:

python ./spacy_coref.py
python ./core_adj.py
python ./adjacency_adj.py
python ./lexical_chain_adj.py
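
As a rough illustration of what an adjacency matrix over text units looks like, the sketch below builds a symmetric matrix from a list of links and row-normalises it. It is an assumption-laden example, not the logic of the scripts above: the function names, the use of self-loops, and the row normalisation are hypothetical, and the actual scripts may define nodes and edges differently.

```python
import numpy as np


def build_adjacency(num_nodes, edges, self_loops=True):
    """Return a symmetric 0/1 adjacency matrix from (i, j) index pairs."""
    adj = np.zeros((num_nodes, num_nodes), dtype=np.float32)
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0  # links are treated as undirected
    if self_loops:
        np.fill_diagonal(adj, 1.0)
    return adj


def neighbour_edges(num_nodes):
    """Link each node to the one that follows it (positional adjacency)."""
    return [(i, i + 1) for i in range(num_nodes - 1)]


def row_normalise(adj):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return adj / row_sums
```

Links from other sources, such as coreference or lexical chains, could be passed through the same `build_adjacency` call as additional index pairs.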

Then generate the model input:

python ./generate_input_data.py
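
The preprocessing commands above can also be run in sequence from a single driver. The following is a minimal sketch, assuming the scripts take no arguments and are run from the repository root; the names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Preprocessing scripts in the order given in this README.
PIPELINE = [
    "./processed_data/split_pairs.py",
    "./processed_data/pairs2sentences.py",
    "./vocab_emb.py",
    "./spacy_coref.py",
    "./core_adj.py",
    "./adjacency_adj.py",
    "./lexical_chain_adj.py",
    "./generate_input_data.py",
]


def build_commands(scripts, python=sys.executable):
    """Turn script paths into argument lists suitable for subprocess.run."""
    return [[python, script] for script in scripts]


def run_pipeline(scripts=PIPELINE):
    """Run each script in order and stop at the first failure."""
    for cmd in build_commands(scripts):
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` stops at the first script that fails, so a later step never runs on incomplete output.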

Train and evaluate the model:

python ctnet.py \
  --classes 4 \
  --learning_rate 0.001 \
  --batch_size 256 \
  --elmo_cuda 1 \
  --slstm_size 512 \
  --relation_balance True \
  --use_exp True \
  --use_char True \
  --use_mt True

Requirements

python == 3.6
tensorflow == 1.12.0
scikit-learn == 0.21.3
numpy == 1.17.0
allennlp == 0.9.0
nltk == 3.4.4
spacy == 2.0.12
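
One way to confirm that an environment matches these pins is to compare each module's `__version__` attribute against the expected value. This is a minimal sketch: `PINNED` and `check_versions` are hypothetical names, the keys are import names (so scikit-learn appears as `sklearn`), and it assumes each library exposes `__version__`, reporting "version unknown" when it does not.

```python
import importlib

# Import name -> version pinned in this README.
PINNED = {
    "tensorflow": "1.12.0",
    "sklearn": "0.21.3",
    "numpy": "1.17.0",
    "allennlp": "0.9.0",
    "nltk": "3.4.4",
    "spacy": "2.0.12",
}


def check_versions(pinned):
    """Return {module name: status string} for each pinned module."""
    report = {}
    for name, wanted in pinned.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "not installed"
            continue
        found = getattr(module, "__version__", None)
        if found is None:
            report[name] = "version unknown"
        elif found == wanted:
            report[name] = "ok"
        else:
            report[name] = "found {}, expected {}".format(found, wanted)
    return report
```

Printing the result of `check_versions(PINNED)` gives a quick per-library summary before running any of the scripts.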

Because tensorflow-hub has recently been unreachable, we use the ELMo implementation provided by AllenNLP. Download the option.json and weights.hdf5 files and place them in ./processed_data/.
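
To check that the downloaded files are usable, they can be loaded with the `ElmoEmbedder` class that ships with AllenNLP 0.9.0. This is a minimal sketch, not necessarily how ctnet.py loads ELMo: the helpers `elmo_files` and `load_elmo` are hypothetical, and the default `cuda_device=-1` (CPU) is an assumption.

```python
import os


def elmo_files(data_dir="./processed_data"):
    """Return (options_path, weights_path) using the file names from this README."""
    return (
        os.path.join(data_dir, "option.json"),
        os.path.join(data_dir, "weights.hdf5"),
    )


def load_elmo(data_dir="./processed_data", cuda_device=-1):
    """Load ELMo from local files; cuda_device=-1 runs on the CPU."""
    options_path, weights_path = elmo_files(data_dir)
    for path in (options_path, weights_path):
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
    # Imported lazily so the path check works without AllenNLP installed.
    from allennlp.commands.elmo import ElmoEmbedder

    return ElmoEmbedder(
        options_file=options_path,
        weight_file=weights_path,
        cuda_device=cuda_device,
    )
```

With both files in place, `load_elmo().embed_sentence(["The", "cat", "sat"])` returns a NumPy array indexed by layer, token, and embedding dimension.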

nangying1112/CTNet
