BugMakerzzz/toxic_cot

Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning

[Paper]

0. Instructions

This repository hosts the code for our paper, "Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning", accepted to the ACL 2024 main conference.

1. Installation

git clone https://github.com/jinzhuoran/toxic_cot.git
cd toxic_cot
pip install -r requirements.txt

2. Run Attribution Tracing Experiment

python llm_cot_probe.py

3. Run Intervention Tracing Experiment

python llm_intervention.py

4. Run Residual Decoding Method

python res_reason.py

5. Run Serial-Position Swap Method

python rt_reason.py

6. Run Baselines

python llm_reason.py
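
The five commands above can also be chained from one small driver script. The sketch below is not part of the repository: the helper names `build_commands` and `run_all` are hypothetical, and it assumes each script runs with its built-in defaults from the repository root, as the bare commands above suggest. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names are taken from sections 2-6 above, in the same order.
STEPS = [
    ("attribution tracing", "llm_cot_probe.py"),
    ("intervention tracing", "llm_intervention.py"),
    ("residual decoding", "res_reason.py"),
    ("serial-position swap", "rt_reason.py"),
    ("baselines", "llm_reason.py"),
]


def build_commands(python=sys.executable):
    """Return one argv list per step without running anything (hypothetical helper)."""
    return [[python, script] for _, script in STEPS]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False (hypothetical helper)."""
    for (name, _), cmd in zip(STEPS, build_commands()):
        print(f"[{name}] {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError at the first failing script.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` would execute the scripts one after another and stop at the first failure. If any script needs arguments or configuration that this README does not list, run it directly instead.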
