Can gradient-based explanations find training data on similar data points?

Setup

Initialize Git Submodules

git submodule init
git submodule update

Installation

All necessary packages can be installed using either UV or PIP.

Install necessary packages using UV

uv sync
uv pip install --no-build-isolation traker[fast]==0.3.2

Activate virtual environment:

source .venv/bin/activate

Install necessary packages using PIP

The project was developed with Python 3.11 because open-instruct requires it, so Python 3.11.* must be used.

Create a virtual environment:

python3 -m venv .venv
source .venv/bin/activate

Check if the virtual environment is used:

which python
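
The same check can be made from inside Python. The sketch below is illustrative and not part of the repository; the helper names in_virtualenv and is_supported_python are hypothetical. It uses only the standard library and relies on the fact that, inside a venv, sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment and
    # sys.base_prefix at the base Python installation.
    return sys.prefix != sys.base_prefix


def is_supported_python(version=None) -> bool:
    """Return True for Python 3.11.*, the version this README asks for."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) == (3, 11)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python 3.11.*:", is_supported_python())
```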

Install all necessary packages:

pip install -r requirements.txt
pip install --no-build-isolation traker[fast]==0.3.2

Before Execution

Create a .env file in the root folder of the repository with the following content:

HF_TOKEN=your_token_here

# SET MODEL NAME FROM HUGGING FACE, e.g. amd/AMD-OLMo-1B-SFT
MT_MODEL_NAME=model_name_here

# SET DEVICE TO RUN ON: cpu FOR CPU, OR cuda:0 / cuda:1 FOR GPU (0 AND 1 ARE GPU IDS)
MT_DEVICE=cuda:0

# SAMPLE SIZE FOR SELECTING A SUBSET OF THE DATA (LEAVE UNSET TO USE THE WHOLE DATASET)
MT_SAMPLE_SIZE=100

# ONLY NEEDED FOR PARAPHRASING (THIS HAS ALREADY BEEN DONE, SO IT CAN BE OMITTED)
OPENAI_API_KEY=your_key_here
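
For orientation, the sketch below shows one way these variables could be read. It is an assumption, not the repository's actual loading code: the helpers parse_env and load_settings are hypothetical, the parser handles only simple KEY=value lines and # comments, and the fallback to cpu for a missing MT_DEVICE is a choice made for this example. An unset MT_SAMPLE_SIZE is mapped to None, meaning the whole dataset.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env: dict[str, str]) -> dict:
    """Turn raw strings into typed settings (hypothetical helper)."""
    sample_size = env.get("MT_SAMPLE_SIZE")
    return {
        "model_name": env["MT_MODEL_NAME"],
        # Assumption for this sketch: fall back to CPU when MT_DEVICE is unset.
        "device": env.get("MT_DEVICE", "cpu"),
        # None means "use the whole dataset".
        "sample_size": int(sample_size) if sample_size else None,
    }


# Example input mirroring the .env layout described above.
example = """
# comment lines are ignored
MT_MODEL_NAME=amd/AMD-OLMo-1B-SFT
MT_DEVICE=cuda:0
MT_SAMPLE_SIZE=100
"""

settings = load_settings(parse_env(example))
print(settings)
```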

Information

The gradient-similarity results are stored in data/gradient_similarity_*.json.

In the folder data/gradient_similarity_bm25_selected, the results are stored as follows:

[
    "id_para_para": {
        "id_orig_orig": 0.6585582494735718,
        "id_orig_orig": 0.0036986665800213814,
        "id_orig_orig": 0.07739365100860596,
        "id_orig_orig": 0.008699750527739525,
        "id_orig_orig": 0.02530057355761528
    }
]

In the folder data/gradient_similarity_bm25_selected_model_generated, the results are stored as follows:

[
    "id_para_gen": {
        "id_orig_orig": 0.6585582494735718,
        "id_orig_orig": 0.0036986665800213814,
        "id_orig_orig": 0.07739365100860596,
        "id_orig_orig": 0.008699750527739525,
        "id_orig_orig": 0.02530057355761528
    }
]

TODO: The examples above do not yet reflect the new result structures (individual layers, etc.); these still need to be added to the README.
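
A minimal sketch of how such a file could be read, assuming the simplified layout shown above: a JSON object that maps each paraphrased (or model-generated) sample id to an object of original-sample ids and their similarity scores. The function name rank_originals, the example ids, and the exact JSON layout are assumptions made for illustration; the newer structure with individual layers is not covered.

```python
import json


def rank_originals(
    similarities: dict[str, dict[str, float]],
) -> dict[str, list[tuple[str, float]]]:
    """For every query id, sort the original ids by descending similarity."""
    return {
        query_id: sorted(scores.items(), key=lambda item: item[1], reverse=True)
        for query_id, scores in similarities.items()
    }


# Hypothetical ids; the scores are taken from the example above.
raw = """
{
  "para_1": {
    "orig_1": 0.6585582494735718,
    "orig_2": 0.0036986665800213814,
    "orig_3": 0.07739365100860596
  }
}
"""

ranked = rank_originals(json.loads(raw))
best_id, best_score = ranked["para_1"][0]
print(best_id, best_score)
```

Under this reading, the first entry of each ranked list is the original sample whose gradients are most similar to those of the paraphrased or model-generated query.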

Create requirements.txt from UV config files:

uv export --format requirements-txt --no-hashes > requirements.txt

lukas-hinterleitner/master-thesis
