Obliviate: Efficient Unmemorization for LLMs

This repository contains the official implementation of "Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models".

Recent copyright agreements between AI companies and content creators have highlighted the need for precise control over language models' ability to reproduce copyrighted content. While existing approaches rely on either complete concept removal through unlearning or simple output filtering, we propose Obliviate, a novel post-training technique that selectively prevents verbatim reproduction of specific text while preserving semantic understanding. Obliviate operates by selecting tokens within memorized sequences and modifying the model's probability distribution to prevent exact reproduction while maintaining contextual understanding. We evaluate Obliviate on multiple large language models (LLaMA-3.1 8B, LLaMA-3.1-instruct 8B, Qwen-2.5-7B, and Yi-1.5 6B) across both synthetic memorization tasks and organic copyright content. Our results demonstrate that Obliviate achieves orders of magnitude reduction, e.g., 100x, in verbatim memorization while maintaining model performance within 1% of baseline on standard benchmarks (HellaSwag, MMLU, TruthfulQA, and Winogrande). This makes Obliviate particularly suitable for practical deployment scenarios where companies need to efficiently address copyright concerns in pretrained models without compromising their general capabilities.
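The idea of modifying the probability distribution at a selected token can be illustrated with a small, self-contained sketch. This is not the repository's implementation: it assumes that the target distribution is built by dropping the memorized token, keeping the top-k remaining tokens, and renormalizing, and that training would then minimize a KL divergence toward that target. The function names suppress_token and kl_divergence and the toy probabilities are hypothetical.

```python
import math


def suppress_token(probs, target_id, top_k):
    """Return a target distribution that assigns zero mass to `target_id`
    and renormalizes the `top_k` most likely remaining tokens (assumed scheme)."""
    candidates = sorted(
        ((p, i) for i, p in enumerate(probs) if i != target_id),
        reverse=True,
    )
    kept = candidates[:top_k]
    total = sum(p for p, _ in kept)
    if total <= 0:
        raise ValueError("no probability mass left after suppressing the token")
    target = [0.0] * len(probs)
    for p, i in kept:
        target[i] = p / total
    return target


def kl_divergence(target, model, eps=1e-12):
    """KL(target || model), skipping entries where the target is zero."""
    return sum(
        t * math.log(t / max(m, eps))
        for t, m in zip(target, model)
        if t > 0
    )


# Toy next-token distribution in which token 0 is the memorized continuation.
probs = [0.90, 0.05, 0.03, 0.01, 0.01]
target = suppress_token(probs, target_id=0, top_k=2)
loss = kl_divergence(target, probs)
```

In this toy case the target keeps only tokens 1 and 2, so a model trained toward it would stop emitting the memorized token at that position while still preferring its own next-best candidates.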

Installation

pip install -r requirements.txt

Repository Structure

The project is organized as follows:

config:
- supported model finetuning parameters
- Obliviate start, stride, and span parameters, e.g. 10-5-1.json:
```
    {
    "start": 10,
    "stride": 5,
    "span": 1
    }
```
  start: begin unmemorizing 10 tokens into the target text
  stride: skip 5 tokens between unmemorized tokens
  span: unmemorize 1 token at each stride
data: datasets for synthetic and organic targets
experiments: configuration for model memorization and obliviate runs
src: shell scripts and Python files for memorization and unmemorization
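
The start, stride, and span keys described above can be read as a rule for picking token positions. The sketch below is one interpretation, assuming that stride counts the tokens skipped between consecutive spans and that positions are zero-based; the repository's scripts may count differently. The function name select_positions is hypothetical.

```python
def select_positions(num_tokens, start, stride, span):
    """List the token positions to unmemorize under the assumed reading:
    begin at `start`, take `span` tokens, skip `stride` tokens, repeat."""
    if span < 1 or stride < 0 or start < 0:
        raise ValueError("expected span >= 1, stride >= 0, start >= 0")
    positions = []
    i = start
    while i < num_tokens:
        # Take up to `span` consecutive tokens, clipped to the text length.
        positions.extend(range(i, min(i + span, num_tokens)))
        # Jump past the span and the skipped tokens.
        i += span + stride
    return positions


# Values from 10-5-1.json applied to a 30-token target.
config = {"start": 10, "stride": 5, "span": 1}
positions = select_positions(30, **config)
```

Under this reading, a 30-token target with the 10-5-1 configuration selects positions 10, 16, 22, and 28.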

Running Experiments

To unmemorize a target, first create an experiment configuration. This example shows the properties to specify:

{
    "base_directory": "/datadrive2/unmemorize/experiments/3",
    "experiments": [
      {
        "model_name": "llama3.1-8b",
        "configurations": [
          {
            "config": "10-5-1",
            "sample_count": 1,
            "experiment_types": [
              {
                "data": "synthetic",
                "name": "standard",
                "top_k": 5,
                "smart_select": false
              },
              {
                "data": "pretrain",
                "name": "standard",
                "top_k": 5,
                "smart_select": false
              }                            
            ]
          }
        ]
      }
    ]
}

base_directory specifies the directory into which experiment outputs are placed.
config must match a configuration in the config/runs directory.
sample_count is the number of samples from the specified dataset to unmemorize.
data must be one of pretrain (the organic target dataset), synthetic, or synthetic100. Note that the organic dataset does not include text from Harry Potter.
name is the token selection algorithm, either standard or smart.
top_k is the number of tokens to preserve for the KL loss.
smart_select specifies if unmemorize token selection...
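
To see which runs an experiment configuration describes, the nested structure can be flattened into one entry per model, run configuration, and experiment type. The following is a minimal sketch rather than code from the repository; the helper name list_runs is hypothetical, and the allowed values are taken from the property descriptions above.

```python
import json

# Trimmed copy of the example experiment configuration.
EXAMPLE = """
{
  "base_directory": "/datadrive2/unmemorize/experiments/3",
  "experiments": [
    {
      "model_name": "llama3.1-8b",
      "configurations": [
        {
          "config": "10-5-1",
          "sample_count": 1,
          "experiment_types": [
            {"data": "synthetic", "name": "standard", "top_k": 5, "smart_select": false},
            {"data": "pretrain", "name": "standard", "top_k": 5, "smart_select": false}
          ]
        }
      ]
    }
  ]
}
"""

ALLOWED_DATA = {"pretrain", "synthetic", "synthetic100"}
ALLOWED_NAMES = {"standard", "smart"}


def list_runs(experiment):
    """Return (model_name, config, data, name) for every run in the file."""
    runs = []
    for exp in experiment["experiments"]:
        for cfg in exp["configurations"]:
            for etype in cfg["experiment_types"]:
                if etype["data"] not in ALLOWED_DATA:
                    raise ValueError(f"unknown data value: {etype['data']}")
                if etype["name"] not in ALLOWED_NAMES:
                    raise ValueError(f"unknown name value: {etype['name']}")
                runs.append(
                    (exp["model_name"], cfg["config"], etype["data"], etype["name"])
                )
    return runs


runs = list_runs(json.loads(EXAMPLE))
```

A check like this can catch a misspelled data or name value before a long training run starts.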

Memorization

If the experiment targets synthetic data, first have the model memorize the data:

cd src
./memorize.sh ../experiments/config/<model configuration>

The memorization step places the memorized model in a memorized directory under the model name in base_directory.

Unmemorization

To unmemorize, run the unmemorize script with the experiment to execute. If the experiment configuration specifies a synthetic dataset, you must run the memorize step first.

cd src
./unmemorize.sh ../experiments/<experiment>/<model configuration>

For example, to run experiment 2 for llama3.1-8b:

cd src
./unmemorize.sh ../experiments/2/llama3.1-8b.json

The script stores unmemorized models, benchmark results, and test logs under <base_directory>/{smart/standard}/{synthetic/pretrain}/{model name}/{run config}/0

For example, the command above would put the 10-5-1 run here:

/datadrive2/unmemorize/experiments/2/standard/synthetic/10-5-1/0
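
For scripts that need to locate results afterwards, the directory template stated above can be assembled programmatically. This is a sketch that follows the template as written, including the model name component; the function name output_dir is hypothetical, and treating the trailing 0 as a numeric index that defaults to 0 is an assumption.

```python
from pathlib import PurePosixPath


def output_dir(base_directory, name, data, model_name, run_config, index=0):
    """Build <base_directory>/<name>/<data>/<model name>/<run config>/<index>.

    `index` stands in for the trailing 0 in the template (assumed meaning).
    """
    return (
        PurePosixPath(base_directory)
        / name
        / data
        / model_name
        / run_config
        / str(index)
    )


path = output_dir(
    "/datadrive2/unmemorize/experiments/2",
    name="standard",
    data="synthetic",
    model_name="llama3.1-8b",
    run_config="10-5-1",
)
```

PurePosixPath is used so the result is the same on any operating system; swap in Path when the directory needs to be read from disk.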

Result Plots

Experiment result plots for the longest common sequence, BLEU, and ROUGE-2 metrics, as well as for the benchmarks, are placed in the base directory.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
