MARTA: Leveraging Human Rationales for Explainable Text Classification

MARTA is a unified Bayesian framework that integrates an attention-based model with the labels and rationales contributed by workers.

Structure of MARTA's repository

  • This repo is composed of four main directories:
    • code: source code of MARTA
    • data: contains two subfolders: "original_data", which holds the two datasets used in the paper (Amazon and Wiki_tech), and "processed_data", which holds the files generated from the original data that MARTA needs in order to run.
    • scripts: contains two scripts to run the code on the datasets used in the paper
    • results: after running the code, the results will be saved in this directory


Create a virtual environment and install requirements

We use Python 3.6 on an Ubuntu 16.04 machine with 32 CPUs and 128 GB of RAM. You can create a virtual environment for MARTA using the following commands:

sudo apt-get install python3-venv
python3.6 -m venv env-marta
source env-marta/bin/activate

Install all requirements using the following command:

pip install --upgrade pip
pip install -r requirements.txt

Running MARTA

To run MARTA on the Amazon data, use the corresponding script in the scripts folder:

chmod u+x ./scripts/
cd code

To run MARTA on the Wiki_tech data, use the corresponding script in the scripts folder:

chmod u+x ./scripts/
cd code

Using MARTA with other datasets:

  • To generate the data needed for MARTA, use the Python script in the code/data_process directory.
  • The script takes as input a CSV file with the following header: doc_id,text,label,WorkerId,worker_label,rationale, where:
    • doc_id: the document id
    • text: content of the document in one line
    • label: ground truth binary label of the document
    • WorkerId: the worker id
    • worker_label: the label given by the worker
    • rationale: the part of the text selected by the worker as a justification for her label
  • The script generates three files:
    • textual_data.csv: contains, for each document, the sentences composing it and the ground truth label of the document. The format of the file is 'doc_id','text','sentence','label'
    • workers_answers.csv: contains the worker labels at the document level. The format of the file is 'doc_id','WorkerId','worker_label'
    • workers_sentence_label.csv: contains the worker labels at the sentence level. The format of the file is 'doc_id','WorkerId','worker_label','sentence','rationale','sentence_label'
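
Before running the script on a new dataset, it can help to check that the input file matches the header described above. The following is a minimal sketch and is not part of the MARTA code: the function name check_input_csv and the sample rows are made up for illustration, and it assumes that labels are encoded as 0/1 and that each rationale is a verbatim span of the document text.

```python
import csv
import io

# Header stated in this README for the input file.
EXPECTED_HEADER = ["doc_id", "text", "label", "WorkerId", "worker_label", "rationale"]


def check_input_csv(handle):
    """Return the number of data rows; raise ValueError if the file looks malformed."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError("unexpected header: {}".format(reader.fieldnames))
    n_rows = 0
    for row in reader:
        # Assumption: binary labels are written as 0 or 1; adjust if your data differs.
        if row["label"] not in {"0", "1"}:
            raise ValueError("non-binary label for doc {}".format(row["doc_id"]))
        # Assumption: a non-empty rationale is copied verbatim from the text.
        if row["rationale"] and row["rationale"] not in row["text"]:
            raise ValueError("rationale not found in text for doc {}".format(row["doc_id"]))
        n_rows += 1
    return n_rows


# Made-up rows for illustration only; they are not taken from the Amazon or Wiki_tech data.
sample = io.StringIO(
    "doc_id,text,label,WorkerId,worker_label,rationale\n"
    "d1,Great phone. Battery lasts long.,1,w1,1,Battery lasts long.\n"
    "d1,Great phone. Battery lasts long.,1,w2,0,\n"
)
print(check_input_csv(sample))
```

To check a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`.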

Examples of running the code:

cd ./code/data_process/
  • For Amazon data:
python --original_data '../../data/original_data/amazon.csv' --dir_gen_marta '../../data/processed_data/amazon/'
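
After the command finishes, the output directory should contain the three generated files. The sketch below is one way to verify that; it is not part of the MARTA code. The function name check_generated_dir is hypothetical, and it assumes each generated file starts with a header row holding exactly the column names listed in this README, which the real script may or may not write.

```python
import csv
import os

# Column layouts as listed in this README for the generated files.
GENERATED_FILES = {
    "textual_data.csv": ["doc_id", "text", "sentence", "label"],
    "workers_answers.csv": ["doc_id", "WorkerId", "worker_label"],
    "workers_sentence_label.csv": [
        "doc_id", "WorkerId", "worker_label", "sentence", "rationale", "sentence_label",
    ],
}


def check_generated_dir(out_dir):
    """Map each expected file name to 'ok', 'missing', or 'bad header'."""
    status = {}
    for name, expected in GENERATED_FILES.items():
        path = os.path.join(out_dir, name)
        if not os.path.isfile(path):
            status[name] = "missing"
            continue
        with open(path, newline="", encoding="utf-8") as handle:
            # Assumption: the first row is a header; an empty file yields [].
            header = next(csv.reader(handle), [])
        status[name] = "ok" if header == expected else "bad header"
    return status


if __name__ == "__main__":
    # Same output directory as in the Amazon example, relative to ./code/data_process/.
    print(check_generated_dir("../../data/processed_data/amazon/"))
```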


Please cite the following paper when using MARTA:

  title = {MARTA: Leveraging Human Rationales for Explainable Text Classification},
  author = {Arous, Ines and Dolamic, Ljiljana and Yang, Jie and Bhardwaj, Akansha and Cuccu, Giuseppe and Cudr{\'e}-Mauroux, Philippe},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2021)},
  year = {2021},
  address = {A Virtual Conference}

