PyTorch Pretrained BERT: Extended with Visualization of attention weights

This is a fork of HuggingFace's PyTorch implementation of BERT. Please see the original README at https://github.com/huggingface/pytorch-pretrained-BERT

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Install prerequisites as specified in the original README.

Dataset location

Copy the Twitter15/16 dataset into the project folder ./data/raw_data.

Below is the expected layout of the data folder:

--data/
----raw_data/
------twitter15/
------twitter16/
------tweet_details.json
------user_details.json
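
As a sanity check before preprocessing, here is a minimal Python sketch that verifies this layout (the paths simply mirror the tree above):

import os

# Expected layout, mirroring the tree above
expected = [
    "data/raw_data/twitter15",
    "data/raw_data/twitter16",
    "data/raw_data/tweet_details.json",
    "data/raw_data/user_details.json",
]

for path in expected:
    status = "OK" if os.path.exists(path) else "MISSING"
    print(f"{status:7s} {path}")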

Preprocessing

First, navigate to the examples/ folder, then follow the steps for the model you want to train.

For Linear model

  1. Run preprocess_rumdect_data_concat_tweets.py to preprocess the files for the LinearBERT model.

python preprocess_rumdect_data_concat_tweets.py

  2. In split_data.py, set the variable data_mode to linear.

  3. Run split_data.py.

python split_data.py

This produces data for 5 splits under data/processed_data/linear_structure/twitter15/split_data/.
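
As a quick check that the preprocessing succeeded, here is a minimal sketch, assuming the split folders are named split_0 through split_4 (the naming used by the training commands below):

import os

# Assumed naming: split_0 ... split_4, matching the --data_dir paths used in training
base = "data/processed_data/linear_structure/twitter15/split_data"
for i in range(5):
    split_dir = os.path.join(base, f"split_{i}")
    print(split_dir, "OK" if os.path.isdir(split_dir) else "MISSING")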

For Hierarchical model

  1. Run preprocess_rumdect_data_hierarchical.py to preprocess the files for the HierarchicalBERT model.

python preprocess_rumdect_data_hierarchical.py

  2. In split_data.py, set the variable data_mode to hierarchical.

  3. Run split_data.py.

python split_data.py

This produces data for 5 splits under data/processed_data/hierarchical_structure/twitter15/split_data/.

Training BERT Classifier

This section lists the commands to train the classifier for the Linear and Hierarchical models on the Twitter15/16 dataset.

Training LinearBERT

Run the following to train the linear model on the first fold.

Change $DATA_DIR to the correct location.

export DATA_DIR=/opt/src/rumor_lstm/data/processed_data/linear_structure/twitter15/split_data

python run_classifier.py \
  --task_name twitter-1516-linear \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir $DATA_DIR/split_0/ \
  --bert_model bert-base-uncased \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 8.0 \
  --output_dir ../logs/twitter15_split_0/

To train the subsequent folds (folds 1-4), change the --data_dir and --output_dir parameters in the python command accordingly; a looping sketch follows below.
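
To avoid editing the command by hand for each fold, here is a minimal Python sketch that loops over all five folds with subprocess; the split_{i} directory and output naming follow the pattern above and are assumptions:

import os
import subprocess

# Assumes the split_0 ... split_4 layout produced by split_data.py (see Preprocessing)
data_root = os.environ["DATA_DIR"]

for i in range(5):
    subprocess.run(
        [
            "python", "run_classifier.py",
            "--task_name", "twitter-1516-linear",
            "--do_train", "--do_eval", "--do_lower_case",
            "--data_dir", os.path.join(data_root, f"split_{i}"),
            "--bert_model", "bert-base-uncased",
            "--max_seq_length", "128",
            "--train_batch_size", "32",
            "--learning_rate", "2e-5",
            "--num_train_epochs", "8.0",
            "--output_dir", f"../logs/twitter15_split_{i}/",
        ],
        check=True,
    )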

Training HierarchicalBERT

Run the following to train the hierarchical model on the first fold. Change $DATA_DIR to the correct location.

export DATA_DIR=/opt/src/rumor_lstm/data/processed_data/hierarchical_structure/twitter15/split_data

python run_classifier.py \
  --task_name twitter-1516-2segments \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir $DATA_DIR/split_0/ \
  --bert_model bert-base-uncased \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 8.0 \
  --output_dir ../logs/twitter15_split_0/

To train the subsequent folds (folds 1-4), change the --data_dir and --output_dir parameters accordingly, as in the looping sketch above (with --task_name twitter-1516-2segments).
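
After all folds are trained, per-fold metrics can be averaged. This is a minimal sketch, assuming each fold's output_dir contains an eval_results.txt with "key = value" lines as written by the upstream pytorch-pretrained-BERT run_classifier.py; adjust the path and key names if this fork writes results differently.

# Assumption: each output_dir holds an eval_results.txt with "<key> = <value>" lines
accs = []
for i in range(5):
    path = f"../logs/twitter15_split_{i}/eval_results.txt"
    with open(path) as f:
        results = dict(line.strip().split(" = ", 1) for line in f if " = " in line)
    accs.append(float(results["eval_accuracy"]))
    print(f"fold {i}: eval_accuracy = {accs[-1]:.4f}")

print(f"mean over 5 folds: {sum(accs) / len(accs):.4f}")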

Interpreting Attention weights at the last layer

The following steps save the attention weights of the last layer of BERT, summed over all attention heads of that layer.

  1. In Interpret_BERT.py, specify the trained model location in model_fn.

model_fn = '../logs/hierarchical_models/twitter15_split_0/pytorch_model.bin'   # Example. Change the model location accordingly

  2. In Interpret_BERT.py, specify the directory of the test data. Note that if the model chosen in step 1 comes from split_0, it is important to use the same split for the data.

# Example. Change the data location accordingly
data_dir = 'C:/git/rumor-lstm/data/processed_data/hierarchical_structure/twitter15/split_data/split_0/'

  3. In Interpret_BERT.py, set the variable task_name to "twitter-1516-2segments" for the hierarchical model, or to "twitter-1516-linear" for the linear model.

task_name = "twitter-1516-2segments"   # value can be "twitter-1516-2segments" or "twitter-1516-linear"

  4. Run Interpret_BERT.py.

python Interpret_BERT.py

The heatmaps are saved as .png images under the folder ./heatmap_output/. The file naming convention is of the form {test-Index}_{ground-truth}_{actual-output}.png, where actual-output is the model's prediction. For example, 6_non-rumor_unverified.png means test sample 6 is labeled "Non-rumor" but is predicted as "Unverified".
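
For reference, here is a minimal sketch of the head-summing step described above, assuming the last layer's attention comes as a tensor of shape (num_heads, seq_len, seq_len); the function name, the matplotlib plotting, and the demo data are illustrative assumptions, not the actual code in Interpret_BERT.py.

import matplotlib.pyplot as plt
import torch

def save_attention_heatmap(last_layer_attention, tokens, out_path):
    """Sum the last layer's attention over all heads and save a heatmap.

    last_layer_attention: tensor of shape (num_heads, seq_len, seq_len)
    tokens: list of seq_len token strings, used as axis labels
    """
    summed = last_layer_attention.sum(dim=0)  # (seq_len, seq_len)

    fig, ax = plt.subplots(figsize=(8, 8))
    ax.imshow(summed.detach().numpy(), cmap="viridis")
    ax.set_xticks(range(len(tokens)))
    ax.set_yticks(range(len(tokens)))
    ax.set_xticklabels(tokens, rotation=90, fontsize=6)
    ax.set_yticklabels(tokens, fontsize=6)
    fig.savefig(out_path, bbox_inches="tight")  # e.g. heatmap_output/6_non-rumor_unverified.png
    plt.close(fig)

# Illustrative usage with random data (12 heads, sequence length 6)
attn = torch.rand(12, 6, 6)
save_attention_heatmap(attn, ["[CLS]", "a", "b", "c", "d", "[SEP]"], "example_heatmap.png")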
