# A Rationale-Centric Framework for Human-in-the-loop Machine Learning (ACL 2022)

This repository is associated with the paper *A Rationale-Centric Framework for Human-in-the-loop Machine Learning*, accepted to the main conference of ACL 2022.

*(Overview figure of the framework.)*

## Usage

### Dependencies

Tested with Python 3.6; requires the following packages, which are available via pip:

### Top-level directory layout

```
.
├── datasets                   # IMDb datasets, human-labelled rationales, counterfactual examples (Hovy et al.)
├── AL_results                 # Experimental outputs of baseline active learning
├── DP_results                 # Experimental outputs of baseline duplication
├── RR_results                 # Experimental outputs of baseline random replacement
├── MR_results                 # Experimental outputs of baseline missing rationales
├── FR_results                 # Experimental outputs of baseline false rationales
├── SF_results                 # Experimental outputs of our approach static semi-factuals
├── full_results               # Experimental outputs of baseline training with the full training set
├── Hybrid_results             # Experimental outputs of our approach dynamic human-intervened correction
└── README.md
```

### Preliminaries

1. To run the code, you need to add some code to `trainer.py` (as shown below) in the installed `transformers` package (on my machine, `~/Anaconda3/Lib/site-packages/transformers/trainer.py`):

   *(Screenshot of the code snippet to add to `trainer.py`.)*

2. Randomly sample a certain number (25 in our experiments) of positives and negatives, storing the selected keys in `AL_results/AL_step0_IMDb_trainer_{seed}_{num_instances_each_class}/keys.txt`.

3. Please run the `step0` scripts first, then go to `IMDb_AL_example_selection_step1.ipynb` to extract another 50 examples from the unlabelled pool via uncertainty sampling (a sketch of the idea is shown after this list). After that, you can run the `step1` scripts.
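The notebook contains the actual selection code; the sketch below only illustrates least-confidence uncertainty sampling over an unlabelled pool. The variable names, the shape of `pool_probs`, and the `keys.txt` usage are assumptions made for illustration, not the notebook's API.

```python
# Minimal sketch of least-confidence uncertainty sampling.
# Assumes `pool_probs` is an (N, num_classes) array of class probabilities
# produced by the current model for N unlabelled examples; this and the
# keys.txt format are illustrative assumptions, not the notebook's code.
import numpy as np

def select_uncertain(pool_keys, pool_probs, k=50):
    """Return the k pool keys whose predicted-class probability is lowest."""
    confidence = pool_probs.max(axis=1)   # confidence of the predicted class
    order = np.argsort(confidence)        # least confident first
    return [pool_keys[i] for i in order[:k]]

# Example usage (dummy probabilities standing in for real model outputs):
# chosen = select_uncertain(pool_keys, pool_probs, k=50)
# with open("keys.txt", "w") as f:
#     f.write("\n".join(map(str, chosen)))
```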

### Generate static semi-factual augmented examples by replacing non-rationales

See `static_semi_factual_generation.ipynb`.
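The notebook implements the actual generation pipeline; the sketch below only illustrates the underlying idea of replacing non-rationale tokens while leaving the human-labelled rationales, and hence the label, untouched. The rationale-mask format and the `get_replacement` substitution source are illustrative assumptions.

```python
# Minimal sketch of static semi-factual generation: non-rationale tokens
# are replaced while rationale tokens are kept, so the label is preserved.
# `get_replacement` is a hypothetical stand-in for whatever substitution
# source the notebook uses; the mask format (1 = rationale, 0 = non-rationale)
# is also an assumption.
import random

def semi_factual(tokens, rationale_mask, get_replacement, replace_prob=0.3):
    augmented = []
    for tok, is_rationale in zip(tokens, rationale_mask):
        if not is_rationale and random.random() < replace_prob:
            augmented.append(get_replacement(tok))
        else:
            augmented.append(tok)
    return augmented

# tokens = "the movie was absolutely wonderful".split()
# mask   = [0, 0, 0, 0, 1]          # "wonderful" is the human rationale
# print(" ".join(semi_factual(tokens, mask, lambda t: "[REPLACED]")))
```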

### Generate false rationales augmented data

Run `IMDb_step1_generate_false_rationales_position.py`, then see `IMDb_generate_false_rationales_examples.ipynb`.
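The script produces the positions of false rationales, and the notebook then builds augmented examples from them. The sketch below only illustrates that second step under assumed formats: it takes a list of flagged token positions and replaces the tokens there. The position format and the replacement token are assumptions, not the repository's conventions.

```python
# Hedged sketch: given token positions flagged as "false rationales"
# (words the model relies on that are not human rationales), build an
# augmented copy with those positions replaced. The position format and
# the replacement token are illustrative assumptions.
def augment_false_rationales(tokens, false_positions, replacement="[UNK]"):
    out = list(tokens)
    for pos in false_positions:
        if 0 <= pos < len(out):
            out[pos] = replacement
    return out

# tokens = "I watched this film last night".split()
# print(" ".join(augment_false_rationales(tokens, false_positions=[4, 5])))
```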

### Generate missing rationales augmented data

Run `IMDb_step1_generate_missing_rationales_examples.py`.
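As a heavily hedged illustration of the idea only (consult the script for the actual construction), one plausible form of missing-rationales augmentation is to foreground the human rationales the model ignored, e.g. by keeping rationale tokens and dropping non-rationale context:

```python
# Hedged sketch, not the script's implementation: build an example that
# keeps only the human-labelled rationale tokens so the model is pushed
# to rely on them. The mask format (1 = rationale) is an assumption.
def keep_rationales_only(tokens, rationale_mask):
    return [tok for tok, keep in zip(tokens, rationale_mask) if keep]

# tokens = "the plot dragged but the acting was superb".split()
# mask   = [0, 0, 0, 0, 0, 1, 1, 1]
# print(" ".join(keep_rationales_only(tokens, mask)))   # "acting was superb"
```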

### In-domain and OOD tests

For the Yelp and Amazon OOD tests, please go to `OOD_Testing_Amazon` and `OOD_Testing_Yelp`; for in-domain and other OOD testing, please use `In-domain_OOD_all.py`. Because of size limits, we do not host the OOD data on GitHub; you can freely find and download those datasets online, or contact jinghui.lu@ucdconnect.ie or lujinghui1@sensetime.com.
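Once the OOD datasets are obtained, evaluation amounts to running the fine-tuned classifier over each test set and reporting accuracy. Below is a minimal sketch using the Hugging Face `pipeline` API; the checkpoint path, CSV layout, and label mapping are assumptions, not the repository's conventions.

```python
# Hedged sketch of an OOD accuracy check with a fine-tuned checkpoint.
# The checkpoint path, the CSV columns ("text", "label"), and the default
# "LABEL_i" naming are assumptions, not taken from this repository.
import pandas as pd
from transformers import pipeline

clf = pipeline("text-classification", model="path/to/finetuned-checkpoint")

df = pd.read_csv("ood_yelp_test.csv")                 # assumed columns: text, label
preds = clf(df["text"].tolist(), truncation=True)
pred_labels = [int(p["label"].split("_")[-1]) for p in preds]  # e.g. "LABEL_1" -> 1

accuracy = sum(int(p == y) for p, y in zip(pred_labels, df["label"])) / len(df)
print(f"OOD accuracy: {accuracy:.4f}")
```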
