Do Localization Methods Actually Localize Memorized Data in LLMs?
A Tale of Two Benchmarks (NAACL 2024)

Ting-Yun Chang, Jesse Thomason, and Robin Jia
🎞️ Video: https://www.youtube.com/watch?v=V2i8CemZZHQ

📜 Paper: https://arxiv.org/abs/2311.09060

Contents

  • Quick Start: $ pip install -r requirements.txt
  • INJ Benchmark
    • Data
    • Information Injection
    • Run Localization Methods
  • DEL Benchmark
    • Data
    • Run Localization Methods

INJ Benchmark

Data

Information Injection

$ bash script/ecbd/inject.sh MODEL
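  • e.g., bash script/ecbd/inject.sh EleutherAI/pythia-6.9b-deduped (assuming the MODEL argument takes the same HuggingFace identifiers as the localization scripts below)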

Run Localization Methods

$ bash script/ecbd/METHOD_NAME.sh MODEL
  • e.g., bash script/ecbd/HC.sh EleutherAI/pythia-6.9b-deduped
  • METHOD_NAME
    • Hard Concrete: HC
    • Slimming: slim
    • IG (Knowledge Neurons): kn
    • Zero-Out: zero
    • Activations: act

DEL Benchmark

Data

Find data memorized by Pythia models from the Pile-dedup

Data memorized by GPT2-XL

Pretraining sequences for perplexity

  • We randomly sample 2048 sequences from the Pile-dedup to calculate perplexity
    • the same sample is shared by all LLMs
  • Tokenized data is stored at data/pile/*/pile_random_batch.pt (see the loading sketch below)
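The format of pile_random_batch.pt is not documented above, so this is only a minimal sketch: it assumes the file stores a LongTensor of token IDs shaped [num_sequences, seq_len], that perplexity is measured with a HuggingFace Pythia checkpoint, and that the per-model subdirectory name is hypothetical.

```python
# Hedged sketch (not from this repo): load the saved Pile sequences and
# measure perplexity with a Pythia checkpoint.
# Assumptions: pile_random_batch.pt holds a LongTensor of token IDs with
# shape [num_sequences, seq_len]; the subdirectory name is hypothetical.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-6.9b-deduped",
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()

batch = torch.load("data/pile/pythia-6.9b-deduped/pile_random_batch.pt")

nlls = []
with torch.no_grad():
    for seq in batch:  # one pre-tokenized sequence at a time
        input_ids = seq.unsqueeze(0).to(model.device)
        # labels == input_ids: HF shifts labels internally and returns mean token NLL
        out = model(input_ids=input_ids, labels=input_ids)
        nlls.append(out.loss.float())

perplexity = torch.exp(torch.stack(nlls).mean())
print(f"Perplexity over {len(nlls)} sequences: {perplexity.item():.2f}")
```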

Run Localization Methods

$ bash script/pile/METHOD_NAME.sh MODEL
  • For Pythia models
  • METHOD_NAME
    • Hard Concrete: HC
    • Slimming: slim
    • IG (Knowledge Neurons): kn
    • Zero-Out: zero
    • Activations: act
$ bash script/manual/METHOD_NAME.sh
  • For GPT2-XL
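  • e.g., bash script/manual/HC.sh (assuming the manual scripts take no MODEL argument and use the same METHOD_NAMEs listed above)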
