Systematic Rectification of Language Models via Dead-end Analysis

This repository contains the code needed to reproduce the training and evaluation from our ICLR 2023 paper "Systematic Rectification of Language Models via Dead-end Analysis" by Meng Cao, Mehdi Fatemi, Jackie CK Cheung, and Samira Shabanian.

Requirements and Installation
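
The repository builds on the Hugging Face ecosystem. A minimal environment sketch, assuming only the dependencies visible in the commands below (PyTorch, Transformers, Datasets, and Accelerate) rather than an official requirements file:

# Rough environment sketch; package list is inferred from the commands in this
# README, not from a pinned requirements file.
pip install torch transformers datasets accelerate

Accelerate also needs a launch configuration; the training command below passes training_config.yaml for this purpose.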

Running the Code

To reproduce the results in the paper, first download the RealToxicityPrompts dataset; the commands below expect the prepared splits and prompts under ./dataset/.
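
One way to obtain RealToxicityPrompts is through the Hugging Face Hub (dataset id allenai/real-toxicity-prompts). The train/validation JSON files and the non-toxic prompt file referenced below are produced in a separate preprocessing step not shown here; the sketch below only fetches the raw corpus:

mkdir -p dataset
# Download the raw RealToxicityPrompts corpus into the local Hugging Face cache;
# converting it into dataset/train.json, dataset/val.json and
# dataset/prompts/nontoxic_prompts-10k.jsonl is a separate step.
python -c "from datasets import load_dataset; load_dataset('allenai/real-toxicity-prompts')"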

Training

OUTPUT_DIR=./models              # where the trained rectification model is written (used as --q_model_path at inference)
TRAIN_FILE=./dataset/train.json  # training split
VALID_FILE=./dataset/val.json    # validation split

accelerate launch --config_file training_config.yaml train_detox.py \
    --overwrite_cache true \
    --gamma 1.0 \
    --num_train_epochs 10 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 32 \
    --preprocessing_num_workers 16 \
    --num_warmup_steps 500 \
    --polyak_update_lr 0.5 \
    --gradient_accumulation_steps 1 \
    --learning_rate 5e-5 \
    --train_file $TRAIN_FILE \
    --validation_file $VALID_FILE \
    --model_name_or_path gpt2 \
    --output_dir $OUTPUT_DIR;

Inference

MODEL_NAME_OR_PATH=./models/huggingface/gpt2-large         # base language model used for generation
Q_MODEL_PATH=./models                                      # rectification model trained above (OUTPUT_DIR)
PROMPTS_PATH=./dataset/prompts/nontoxic_prompts-10k.jsonl  # evaluation prompts
OUTPUT_PATH=outputs.jsonl                                  # where generated continuations are written

python decoding.py \
    --model_name_or_path $MODEL_NAME_OR_PATH \
    --q_model_path $Q_MODEL_PATH \
    --prompts_path $PROMPTS_PATH \
    --output_path $OUTPUT_PATH \
    --seed 0 \
    --batch_size 1 \
    --num_returns 25 \
    --threshold 0.4 \
    --top_k 30;
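
decoding.py writes the generations to OUTPUT_PATH, presumably in JSON Lines format given the .jsonl extension. A minimal sketch for inspecting the output without assuming a particular record schema:

python - <<'EOF'
import json

# Print the field names of the first few records in outputs.jsonl, assuming
# each line is a JSON object; the exact schema is defined by decoding.py.
with open("outputs.jsonl") as f:
    for i, line in enumerate(f):
        record = json.loads(line)
        print(i, sorted(record.keys()))
        if i >= 4:
            break
EOF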

Citation

Please cite as:

@inproceedings{cao2023systematic,
    title={Systematic Rectification of Language Models via Dead-end Analysis},
    author={Meng Cao and Mehdi Fatemi and Jackie CK Cheung and Samira Shabanian},
    booktitle={The Eleventh International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=k8_yVW3Wqln}
}
