LCM-Lab/context-denoising-training

Repository for context-denoising-training

Environmental Setup

We recommend using transformers==4.46.1 to deploy the models successfully.

Install the required packages by running:

pip install -r requirements.txt
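After installation, a quick check along the following lines can confirm that the environment matches the recommended transformers version. This is a minimal sketch: the helper names `check_version` and `installed_transformers_version` are hypothetical and are not part of this repo.

```python
from importlib.metadata import PackageNotFoundError, version

RECOMMENDED = "4.46.1"  # version recommended in this README


def check_version(installed, recommended=RECOMMENDED):
    """Return a short status message comparing two version strings."""
    if installed is None:
        return "transformers is not installed"
    if installed == recommended:
        return "ok"
    return f"found {installed}, recommended {recommended}"


def installed_transformers_version():
    """Return the installed transformers version, or None if it is absent."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(check_version(installed_transformers_version()))
```

A different version may still work; the check only reports the mismatch.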

Data Preparation

We use the pg19-test dataset in our experiments. You can clone the dataset repository by running:

cd preliminary/data
git clone https://huggingface.co/datasets/emozilla/pg19-test

Preliminary

By default, the test data is generated on the fly from the source data at test time.

You may also use the pre-generated full data; we provide part of it in preliminary/data/full20.jsonl
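To see what the provided sample looks like before using it, the file can be read as JSON Lines (one JSON value per line). The sketch below assumes only that format, as suggested by the .jsonl extension; the record fields are not documented here, so it prints the keys rather than assuming a schema. The helper name `peek_jsonl` is hypothetical.

```python
import json
from pathlib import Path


def peek_jsonl(path, n=1):
    """Return the first n records of a JSON Lines file as Python objects."""
    records = []
    if n <= 0:
        return records
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            records.append(json.loads(line))
            if len(records) >= n:
                break
    return records


if __name__ == "__main__":
    sample = Path("preliminary/data/full20.jsonl")
    if sample.exists():
        for rec in peek_jsonl(sample):
            # Show only the structure; field names are not assumed.
            print(sorted(rec) if isinstance(rec, dict) else type(rec).__name__)
```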

We recommend the online-generation method, which you can run from the repository root with:

cd ../..
python preliminary/src/test_score.py --model=meta-llama/Meta-Llama-3.1-8B-Instruct --context_lengths=11900

[Note]

At least 8 GPUs, each with more than 85 GB of memory, are required to run it successfully.
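Before launching the job, the hardware requirement in the note can be checked with a short script. This is a sketch under stated assumptions: it reads the memory figure as gigabytes (10^9 bytes), it relies on PyTorch being installed to query devices, and the function names are hypothetical rather than part of this repo.

```python
def meets_gpu_requirement(mem_gb_per_gpu, min_gpus=8, min_mem_gb=85):
    """True if at least min_gpus devices each have more than min_mem_gb."""
    big_enough = [m for m in mem_gb_per_gpu if m > min_mem_gb]
    return len(big_enough) >= min_gpus


def visible_gpu_memory_gb():
    """Total memory in GB per visible CUDA device; [] without torch or CUDA."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_properties(i).total_memory / 1e9  # assumed unit: GB
        for i in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    mem = visible_gpu_memory_gb()
    print(f"{len(mem)} GPU(s) visible; requirement met: {meets_gpu_requirement(mem)}")
```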

Calculate and visualize the IG and FR scores of the generated results by running:

python preliminary/src/stats_igscore.py --context_length=11900
python preliminary/src/stats_frscore.py --context_length=11900
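The two scoring commands can also be wrapped in a small driver so that both run for the same context length. This is only a convenience sketch: it uses the script paths and the `--context_length` flag exactly as shown above, while the function names `build_stats_commands` and `run_stats` are hypothetical.

```python
import subprocess
import sys


def build_stats_commands(context_length):
    """Build the two scoring commands shown above for one context length."""
    flag = f"--context_length={context_length}"
    return [
        [sys.executable, "preliminary/src/stats_igscore.py", flag],
        [sys.executable, "preliminary/src/stats_frscore.py", flag],
    ]


def run_stats(context_length):
    """Run both scripts in order, stopping on the first failure."""
    for cmd in build_stats_commands(context_length):
        subprocess.run(cmd, check=True)
```

Calling `run_stats(11900)` from the repository root would run both scripts with the context length used in the example above.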
