Code for the fact-check rationalization paper @ ACL 2021.


Structurizing Misinformation Stories via Rationalizing Fact-Checks
Shan Jiang, Christo Wilson
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2021
Paper available at:


Shan Jiang (

General instructions.

Install required dependencies:

pip install -r requirements.txt

Download and process the data by following the instructions in the [DATA_NAME] folder:

cd data/[DATA_NAME]

Train models or analyze rationales with:

python rationalize/ --mode=[MODE] --data_name=[DATA_NAME] --config_name=[CONFIG_NAME]


[MODE] is one of:

  • train: train a model.
  • evaluate: evaluate a model.
  • output: output rationales.
  • binarize: binarize rationales to 0/1 (soft rationalization only).
  • vectorize: generate vectors/embeddings for rationales.
  • cluster: cluster rationales and plot figures.


[DATA_NAME] is one of:

  • movie_reviews: the dataset of movie reviews.
  • personal_attacks: the dataset of personal attacks.
  • fact-checks: the dataset of fact-checks.
  • glove: pretrained GloVe embeddings.


[CONFIG_NAME] is the name of a configuration:

  • e.g., soft_rationalizer, or any .config file in the [DATA_NAME] folder.
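
To make the three placeholders concrete, here is a minimal sketch of how such a command line could be validated with argparse. The mode and dataset names are copied from the lists above; the function name `parse_cli` and the parser layout are hypothetical and are not taken from the repository's code.

```python
import argparse

# Mode and dataset names are copied from the lists above; everything else
# in this sketch (function name, parser layout) is a hypothetical illustration.
MODES = ["train", "evaluate", "output", "binarize", "vectorize", "cluster"]
DATA_NAMES = ["movie_reviews", "personal_attacks", "fact-checks", "glove"]


def parse_cli(argv=None):
    """Parse --mode, --data_name and --config_name, rejecting unknown values."""
    parser = argparse.ArgumentParser(prog="rationalize")
    parser.add_argument("--mode", choices=MODES, required=True)
    parser.add_argument("--data_name", choices=DATA_NAMES, required=True)
    parser.add_argument("--config_name", required=True)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_cli([
        "--mode=train",
        "--data_name=movie_reviews",
        "--config_name=soft_rationalizer",
    ])
    print(args.mode, args.data_name, args.config_name)
```

With `choices` set, a misspelled mode or dataset name fails immediately with a usage message instead of surfacing later as a missing file.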

Instructions for replicating results in the paper.

Replicating results for Table 1.

The following instructions replicate the movie_reviews column of Table 1. To replicate the other column, replace movie_reviews with personal_attacks in every command.

First make sure that the dataset and embeddings are prepared:

cd data/movie_reviews
cd ../glove
cd ../..

Then run the following commands; each line corresponds to one experiment (h0-h3 and s0-s1):

python rationalize/ --mode=train --data_name=movie_reviews --config_name=hard_rationalizer          # h0
python rationalize/ --mode=train --data_name=movie_reviews --config_name=hard_rationalizer_w_domain # h1
python rationalize/ --mode=train --data_name=movie_reviews --config_name=hard_rationalizer_wo_regu  # h2
python rationalize/ --mode=train --data_name=movie_reviews --config_name=hard_rationalizer_w_anti   # h3
python rationalize/ --mode=train --data_name=movie_reviews --config_name=soft_rationalizer          # s0
python rationalize/ --mode=train --data_name=movie_reviews --config_name=soft_rationalizer_w_domain # s1
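
The six runs differ only in the config name, so they can also be generated programmatically. The sketch below builds the same command lines for any dataset; the helper name `table1_commands` is hypothetical, and the script only prints the commands rather than launching them.

```python
import shlex

# Experiment labels and config names are taken from the commands above.
TABLE1_CONFIGS = {
    "h0": "hard_rationalizer",
    "h1": "hard_rationalizer_w_domain",
    "h2": "hard_rationalizer_wo_regu",
    "h3": "hard_rationalizer_w_anti",
    "s0": "soft_rationalizer",
    "s1": "soft_rationalizer_w_domain",
}


def table1_commands(data_name="movie_reviews"):
    """Return {experiment label: argv list} for the six training runs."""
    return {
        label: [
            "python", "rationalize/", "--mode=train",
            f"--data_name={data_name}", f"--config_name={config}",
        ]
        for label, config in TABLE1_CONFIGS.items()
    }


if __name__ == "__main__":
    for label, argv in table1_commands("personal_attacks").items():
        # Print only; to launch a run, pass argv to subprocess.run(argv, check=True).
        print(f"{shlex.join(argv)}  # {label}")
```

Calling `table1_commands("personal_attacks")` yields the commands for the other column of Table 1.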

To replicate the results for s2-s3, run:

python rationalize/ --mode=output --data_name=movie_reviews --config_name=soft_rationalizer_w_domain
python rationalize/ --mode=binarize --data_name=movie_reviews --config_name=soft_rationalizer_w_domain
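
As an illustration of what binarizing a soft rationale to 0/1 means, the sketch below thresholds per-token scores. The threshold of 0.5, the function name `binarize_rationale`, and the example scores are all assumptions made for illustration; the rule used by the binarize mode in this repository may differ.

```python
def binarize_rationale(scores, threshold=0.5):
    """Map per-token soft scores in [0, 1] to hard 0/1 selections.

    The 0.5 default is an assumed cut-off, not the repository's setting.
    """
    return [1 if score >= threshold else 0 for score in scores]


if __name__ == "__main__":
    tokens = ["the", "claim", "is", "false"]
    soft = [0.05, 0.80, 0.10, 0.95]  # made-up scores for illustration
    hard = binarize_rationale(soft)
    # Keep only the tokens selected by the binarized rationale.
    print([tok for tok, keep in zip(tokens, hard) if keep])
```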

Replicating results for Figures 3-5.

Logged data are included so that Figures 3-5 can be plotted without retraining.

To plot Figure 3, run:

python rationalize/ --mode=cluster --data_name=fact-checks --config_name=soft_rationalizer_w_domain

The results can be found in data/fact-checks/soft_rationalizer_w_domain.cluster.

To plot Figures 4 and 5, run:

cd data/fact-checks

The results can be found in data/fact-checks/soft_rationalizer_w_domain.results.

If you would like to train the model from scratch, run the following commands in sequence:

cd data/fact-checks
python     # Download fact-checks.
python      # Extract text from HTML.
python        # Clean fact-checks.
python       # Build word2vec.
cd ../..
python rationalize/ --mode=train --data_name=fact-checks --config_name=soft_rationalizer_w_domain
python rationalize/ --mode=output --data_name=fact-checks --config_name=soft_rationalizer_w_domain
python rationalize/ --mode=vectorize --data_name=fact-checks --config_name=soft_rationalizer_w_domain
cd data/fact-checks
python  # Filter vectors.
cd ../..
python rationalize/ --mode=cluster --data_name=fact-checks --config_name=soft_rationalizer_w_domain
cd data/fact-checks
python    # Map rationales.
python   # Plot results.