Do Localization Methods Actually Localize Memorized Data in LLMs?
A Tale of Two Benchmarks (NAACL 2024)

Ting-Yun Chang, Jesse Thomason, and Robin Jia
🎞️ Video: https://www.youtube.com/watch?v=V2i8CemZZHQ

📜 Paper: https://arxiv.org/abs/2311.09060

Contents

  • Quick Start: $ pip install -r requirements.txt
  • INJ Benchmark
    • Data
    • Information Injection
    • Run Localization Methods
  • DEL Benchmark
    • Data
    • Run Localization Methods

INJ Benchmark

Data

Information Injection

$ bash script/ecbd/inject.sh MODEL
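  • e.g., bash script/ecbd/inject.sh EleutherAI/pythia-6.9b-deduped (assuming the MODEL argument takes the same HuggingFace identifiers as the localization scripts below)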

Run Localization Methods

$ bash script/ecbd/METHOD_NAME.sh MODEL
  • e.g., bash script/ecbd/HC.sh EleutherAI/pythia-6.9b-deduped
  • METHOD_NAME
    • Hard Concrete: HC
    • Slimming: slim
    • IG (Knowledge Neurons): kn
    • Zero-Out: zero
    • Activations: act

DEL Benchmark

Data

Find data memorized by Pythia models from the Pile-dedup

Data memorized by GPT2-XL

Pretraining sequences for perplexity

  • We randomly sample 2048 sequences from the Pile-dedup to calculate perplexity
    • the same sample is shared by all LLMs
  • Tokenized data is stored at data/pile/*/pile_random_batch.pt (see the loading sketch below)
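The format of pile_random_batch.pt is not documented above, so this is only a minimal sketch: it assumes the file stores a LongTensor of token IDs shaped [num_sequences, seq_len], that perplexity is measured with a HuggingFace Pythia checkpoint, and that the per-model subdirectory name is hypothetical.

```python
# Hedged sketch (not from this repo): load the saved Pile sequences and
# measure perplexity with a Pythia checkpoint.
# Assumptions: pile_random_batch.pt holds a LongTensor of token IDs with
# shape [num_sequences, seq_len]; the subdirectory name is hypothetical.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-6.9b-deduped",
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()

batch = torch.load("data/pile/pythia-6.9b-deduped/pile_random_batch.pt")

nlls = []
with torch.no_grad():
    for seq in batch:  # one pre-tokenized sequence at a time
        input_ids = seq.unsqueeze(0).to(model.device)
        # labels == input_ids: HF shifts labels internally and returns mean token NLL
        out = model(input_ids=input_ids, labels=input_ids)
        nlls.append(out.loss.float())

perplexity = torch.exp(torch.stack(nlls).mean())
print(f"Perplexity over {len(nlls)} sequences: {perplexity.item():.2f}")
```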

Run Localization Methods

$ bash script/pile/METHOD_NAME.sh MODEL
  • For Pythia models
  • METHOD_NAME
    • Hard Concrete: HC
    • Slimming: slim
    • IG (Knowledge Neurons): kn
    • Zero-Out: zero
    • Activations: act
$ bash script/manual/METHOD_NAME.sh
  • For GPT2-XL
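  • e.g., bash script/manual/HC.sh (assuming the manual scripts take no MODEL argument and use the same METHOD_NAMEs listed above)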
