
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking

This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking".

We study how fine-tuning affects the internal mechanisms implemented in language models. As a case study, we explore the entity tracking task in Llama-7B and its fine-tuned variants: Vicuna-7B, Goat-7B, and Float-7B.

Our findings suggest that fine-tuning enhances, rather than fundamentally alters, the mechanistic operation of the model.

Please see finetuning.baulab.info for more information.

Methods

To discover the underlying mechanism for the entity tracking task, we employed: 1) path patching (experiment_1/path_patching.py) and 2) desiderata-based component masking (experiment_2/DCM.py). Both methods are implemented using baukit and can easily be adapted to other tasks.
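To illustrate the core idea behind these interventions (not the repository's baukit implementation), here is a minimal toy sketch of activation patching: cache an intermediate activation from a "clean" forward pass, then splice it into a run on a "corrupted" input. The two-layer model and all names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer toy "model": h = relu(W1 @ x); y = W2 @ h
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))

def forward(x, patch_hidden=None):
    """Run the toy model; optionally overwrite the hidden activation."""
    h = np.maximum(W1 @ x, 0.0)
    if patch_hidden is not None:
        h = patch_hidden  # splice in a cached activation from another run
    return W2 @ h, h

clean_x = rng.standard_normal(3)
corrupt_x = rng.standard_normal(3)

_, clean_h = forward(clean_x)                            # cache clean activation
patched_y, _ = forward(corrupt_x, patch_hidden=clean_h)  # patched corrupted run
clean_y, _ = forward(clean_x)

# Because the hidden layer fully determines the output in this toy model,
# patching the clean activation restores the clean output exactly.
assert np.allclose(patched_y, clean_y)
```

In the real experiments, the same substitution is performed on hidden states of specific attention heads inside the transformer, and the change in task accuracy measures how much the patched component matters.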

Moreover, to uncover why fine-tuned models perform better while employing the same mechanism, we introduce a novel approach called Cross-Model Activation Patching (CMAP), which patches activations across models to elucidate the enhanced mechanisms. The notebook experiment_3/cmap.ipynb demonstrates how to run the complete experiment.
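The cross-model idea can be sketched in the same toy setting (hypothetical weights, not the paper's models): run the base model, but substitute an intermediate activation computed by the fine-tuned model on the same input.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical toy models sharing one architecture: base and "fine-tuned"
W1_base, W2_base = rng.standard_normal((4, 3)), rng.standard_normal((2, 4))
W1_ft = W1_base + 0.1 * rng.standard_normal((4, 3))  # pretend fine-tuning nudged W1
W2_ft = W2_base.copy()                               # second layer left unchanged

def forward(W1, W2, x, patch_hidden=None):
    """Run a toy model; optionally overwrite its hidden activation."""
    h = np.maximum(W1 @ x, 0.0)
    if patch_hidden is not None:
        h = patch_hidden  # cross-model patch: activation from the other model
    return W2 @ h, h

x = rng.standard_normal(3)
_, h_ft = forward(W1_ft, W2_ft, x)                          # fine-tuned activation
y_cmap, _ = forward(W1_base, W2_base, x, patch_hidden=h_ft) # patched base run
y_ft, _ = forward(W1_ft, W2_ft, x)

# Since the layers above the patch are identical in this toy example, the
# patched base model reproduces the fine-tuned output exactly.
assert np.allclose(y_cmap, y_ft)
```

In the actual experiment, patching fine-tuned activations into the base Llama-7B tests whether the improvement is carried by enhanced versions of components the base model already has.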

Note: You need the weights of the LLaMA-7B model, which is distributed under a non-commercial license. If you do not already have access, use this form to request it.

Setup

To install all dependencies, run:

conda env create -f environment.yml
conda activate finetuning

How to Cite

@inproceedings{prakash2023fine,
  title={Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking},
  author={Prakash, Nikhil and Shaham, Tamar Rott and Haklay, Tal and Belinkov, Yonatan and Bau, David},
  booktitle={Proceedings of the 2024 International Conference on Learning Representations},
  note={arXiv:2402.14811},
  year={2024}
}
