ShuoYangtum/RAZOR

RAZOR: Sharpening Knowledge by Cutting Bias with Unsupervised Text Rewriting

This is the code for RAZOR: Sharpening Knowledge by Cutting Bias with Unsupervised Text Rewriting.

We mitigate spurious correlations between tokens and labels by rewriting the dataset with large language models. For details, please see our paper (arXiv:2412.07675).
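To make "spurious correlations between tokens and labels" concrete, here is a minimal sketch that estimates p(label | token) on a toy dataset. The sentences, labels, and the `label_given_token` helper are invented for illustration. This is not the RAZOR method and is not taken from main.py. It only shows the kind of token-level bias that rewriting is meant to reduce.

```python
from collections import Counter, defaultdict

# Toy NLI-style examples, invented for illustration only.
examples = [
    ("the man is not sleeping", "contradiction"),
    ("the boy is not running", "contradiction"),
    ("a dog runs in the park", "entailment"),
    ("the woman is not cooking", "contradiction"),
    ("a child is playing", "entailment"),
    ("the cat is not asleep", "entailment"),
]


def label_given_token(examples):
    """Estimate p(label | token) from per-example co-occurrence counts."""
    token_counts = Counter()
    joint_counts = defaultdict(Counter)
    for text, label in examples:
        # Count each token at most once per example.
        for token in set(text.split()):
            token_counts[token] += 1
            joint_counts[token][label] += 1
    return {
        token: {label: n / token_counts[token] for label, n in labels.items()}
        for token, labels in joint_counts.items()
    }


probs = label_given_token(examples)
print(probs["not"])
```

In this toy data, "not" co-occurs with "contradiction" in three of its four examples, so a classifier could learn to predict the label from that token alone rather than from the meaning of the sentence.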

Data sets

All of the datasets we used are open source.
FEVER dataset: https://fever.ai/dataset/adversarial.html
MNLI dataset: https://paperswithcode.com/dataset/multinli
SNLI dataset: https://paperswithcode.com/dataset/snli

Dependencies

Before running our code, please ensure that the following dependencies are met.

Library                 Version
torch                   2.3.0
tokenizers              0.19.1
transformers            4.40.1
spacy                   3.7.4
shap                    0.46.0
sentence-transformers   3.0.1
openai                  1.27.0
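
As a convenience, the following sketch checks whether the installed packages match the versions listed above. It is not part of the repository. The `check_dependencies` helper and the `REQUIRED` dictionary are hypothetical names, and the exact-match comparison is an assumption, since the README does not say whether other versions also work.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the dependency list in this README.
REQUIRED = {
    "torch": "2.3.0",
    "tokenizers": "0.19.1",
    "transformers": "4.40.1",
    "spacy": "3.7.4",
    "shap": "0.46.0",
    "sentence-transformers": "3.0.1",
    "openai": "1.27.0",
}


def check_dependencies(required=REQUIRED, get_version=version):
    """Return (package, expected, found) for every missing or mismatched package."""
    problems = []
    for name, expected in required.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Ignore local build tags such as "+cu121" when comparing.
        if found is None or found.split("+")[0] != expected:
            problems.append((name, expected, found))
    return problems


if __name__ == "__main__":
    for name, expected, found in check_dependencies():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

If the script prints nothing, every listed package is installed at the listed version.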

Running

To run the program, execute the main.py file in the root directory.

File directories and some commonly used hyperparameters can be passed on the command line.

Note that the hyperparameters used during training must be adjusted manually by editing the relevant sections of main.py.

Citation

If you find our work useful or use our code, please cite it as follows.

@article{yang2024razor,
title={RAZOR: Sharpening Knowledge by Cutting Bias with Unsupervised Text Rewriting},
author={Yang, Shuo and Prenkaj, Bardh and Kasneci, Gjergji},
journal={arXiv preprint arXiv:2412.07675},
year={2024}
}
