# EasyEdit Example with **QLoRA**

In this tutorial, we use `QLoRA` to edit the `llama-3.2-3b-instruct` model. We hope it helps you understand how to apply the QLoRA method to LLMs.

## Model Editing

Deployed models may still make unpredictable errors. For example, Large Language Models (LLMs) notoriously hallucinate, perpetuate bias, and factually decay, so we should be able to adjust specific behaviors of pre-trained models.

**Model editing** aims to adjust the behavior of an initial base model $f_\theta$ on a particular edit descriptor $[x_e, y_e]$, such as:
- $x_e$: "Who is the president of the US?"
- $y_e$: "Joe Biden."

The adjustment should be efficient and should not influence the model's behavior on unrelated samples. The ultimate goal is to create an edited model $f_{\theta'}$.
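To make the goal concrete, the sketch below treats a model as a plain lookup function and an edit as a wrapper that overrides a single input. It is only a toy illustration: `base_model`, `apply_edit`, and the placeholder answers are hypothetical and not part of EasyEdit, and real editing methods change model weights rather than wrapping the model.

```python
from typing import Callable

Model = Callable[[str], str]


def base_model(prompt: str) -> str:
    """Toy stand-in for f_theta: a fixed lookup table (hypothetical data)."""
    answers = {
        "Who is the president of the US?": "An outdated answer.",
        "What is the capital of France?": "Paris.",
    }
    return answers.get(prompt, "I don't know.")


def apply_edit(model: Model, x_e: str, y_e: str) -> Model:
    """Return a toy f_theta': answers y_e on x_e and defers to `model` elsewhere."""
    def edited(prompt: str) -> str:
        return y_e if prompt == x_e else model(prompt)
    return edited


x_e, y_e = "Who is the president of the US?", "Joe Biden."
toy_edited = apply_edit(base_model, x_e, y_e)

print(toy_edited(x_e))                               # behavior on the edit descriptor
print(toy_edited("What is the capital of France?"))  # unrelated input, unchanged
```

The first call reflects the edit, while the second returns exactly what the base model returned, which is the property the locality test later in this tutorial checks.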

## 📂 Data Preparation

The datasets used (ZsRE) can be downloaded from [Google Drive](https://drive.google.com/file/d/1YtQvv4WvTa4rJyDYQR2J-uK8rnrt0kTA/view?usp=sharing).

Each dataset contains both an **edit set** and a train set.
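The `editor.edit` call used later in this tutorial takes parallel lists of prompts, original answers, new targets, and subjects. A minimal sketch of turning dataset records into those lists follows. The field names `src`, `answers`, `alt`, and `subject` are an assumption about the ZsRE JSON layout and should be checked against the downloaded file; `records_to_edit_inputs` is a hypothetical helper, and the two sample records simply reuse values from this tutorial's example rather than real dataset entries.

```python
from typing import Dict, List


def records_to_edit_inputs(records: List[dict]) -> Dict[str, List[str]]:
    """Convert record dicts into parallel lists keyed by argument name.

    Assumes each record has 'src', 'answers' (a list), 'alt', and 'subject'.
    """
    return {
        "prompts": [r["src"] for r in records],
        "ground_truth": [r["answers"][0] for r in records],
        "target_new": [r["alt"] for r in records],
        "subject": [r["subject"] for r in records],
    }


# In practice the records would come from the downloaded file, e.g. via json.load.
records = [
    {
        "subject": "Lionel Messi",
        "src": "Question:What sport does Lionel Messi play? Answer:",
        "answers": ["football"],
        "alt": "basketball",
    },
    {
        "subject": "Stephen Curry",
        "src": "Question:Which NBA team does Stephen Curry play for? Answer:",
        "answers": ["Golden State Warriors"],
        "alt": "New York Knicks",
    },
]

inputs = records_to_edit_inputs(records)
print(inputs["target_new"])
```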

## Prepare the runtime environment

In [1]:
## Clone Repo
#!git clone https://github.com/zjunlp/EasyEdit
%cd EasyEdit
!ls

/mnt/8t/fangjizhan/EasyEdit
data	    examples  multimodal_edit.py   run_wise_editing.sh
demo	    figs      outputs		   tutorial-notebooks
Dockerfile  hparams   README.md		   tutorial.pdf
easyeditor  LICENSE   requirements.txt
edit.py     logs      run_wise_editing.py


In [None]:
!apt-get install python3.9
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1
!sudo update-alternatives --config python3
!apt-get install python3-pip
%pip install -r requirements.txt

## Config Method Parameters

```yaml
alg_name: "QLoRA"
model_name: "./hugging_cache/llama-3.2-3b-instruct"
device: 1

# QLoRA specific settings
quantization_bit: 4
double_quant: true
quant_type: "nf4" # nf4, fp4, int4, int8

# LoRA settings
lora_type: "lora"  # QLoRA typically uses standard LoRA, not AdaLoRA
lora_r: 8
lora_alpha: 32
lora_dropout: 0.1
target_modules: ["q_proj", "v_proj"]

# Training settings
num_steps: 50
batch_size: 1
max_length: 30
lr: 5e-3
weight_decay: 0.0

# Additional settings
model_parallel: false
```
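Two quantities follow directly from the LoRA settings above: the scaling factor applied to the low-rank update in standard LoRA, `lora_alpha / lora_r`, and the number of trainable adapter parameters, `r * (d_in + d_out)` per adapted linear layer. The sketch below works through that arithmetic. The layer shapes (hidden size 3072, a 1024-dimensional `v_proj` output, 28 layers) are assumptions about llama-3.2-3b-instruct and should be confirmed in the model's `config.json`; the helper names are hypothetical.

```python
from typing import Dict, Tuple


def lora_scaling(lora_alpha: int, lora_r: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / lora_r


def lora_param_count(r: int, shapes: Dict[str, Tuple[int, int]], num_layers: int) -> int:
    """Trainable parameters: A is (r x d_in) and B is (d_out x r) per target module."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


# Assumed (d_in, d_out) for the two target modules; verify against config.json.
assumed_shapes = {"q_proj": (3072, 3072), "v_proj": (3072, 1024)}

scaling = lora_scaling(lora_alpha=32, lora_r=8)
n_params = lora_param_count(r=8, shapes=assumed_shapes, num_layers=28)

print(scaling)   # 4.0
print(n_params)  # 2293760
```

Under these assumptions the adapters add roughly 2.3 million trainable parameters, a small fraction of the 3B base weights that stay frozen and quantized.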

## Import models & Run

### Edit llama-3.2-3b-instruct on ZsRE with QLoRA

In [1]:
%cd ..

/mnt/8t/xkw/EasyEdit


In [2]:
from easyeditor import BaseEditor
from easyeditor import QLoRAHyperParams

prompts = ['Question:What sport does Lionel Messi play? Answer:',
                'Question:What role does Cristiano Ronaldo play in football? Answer:',
                'Question:Which NBA team does Stephen Curry play for? Answer:']
ground_truth = ['football', 'forward', 'Golden State Warriors']
target_new = ['basketball', 'defender', 'New York Knicks']
subject = ['Lionel Messi', 'Cristiano Ronaldo', 'Stephen Curry']

In [3]:

hparams = QLoRAHyperParams.from_hparams('./hparams/QLoRA/llama3.2-3b.yaml')

editor = BaseEditor.from_hparams(hparams)
metrics, edited_model, _ = editor.edit(
    prompts=prompts,
    ground_truth=ground_truth,
    target_new=target_new,
    subject=subject,
    sequential_edit=True
)

2024-12-01 13:25:38,294 - easyeditor.editors.editor - INFO - Instantiating model
12/01/2024 13:25:38 - INFO - easyeditor.editors.editor -   Instantiating model


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

2024-12-01 13:25:44,163 - easyeditor.editors.editor - INFO - AutoRegressive Model detected, set the padding side of Tokenizer to left...
12/01/2024 13:25:44 - INFO - easyeditor.editors.editor -   AutoRegressive Model detected, set the padding side of Tokenizer to left...
  0%|          | 0/3 [00:00<?, ?it/s]We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
100%|██████████| 3/3 [00:02<00:00,  1.21it/s]
  0%|          | 0/3 [00:00<?, ?it/s]`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
 33%|███▎      | 1/3 [00:05<00:11,  5.55s/it]12/01/2024 13:25:52 - INFO - peft.tuners.tuners_utils -   Already found a `peft_config` attribute in the model. This will lead to having multiple adapters in the model. Make sure to know what you are doing!
 67%|██████▋ 

Metrics Summary:  {'pre': {'rewrite_acc': 0.2222222222222222}, 'post': {'rewrite_acc': 0.0}}


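* prompts: the edit inputs, one per editing instance.
* ground_truth: the original answer for each prompt.
* target_new: the new answer the edited model should produce for each prompt.
* subject: the entity each edit is about.
* sequential_edit: whether to enable sequential editing (should be set to True except when there is only a single edit, T=1).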
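The `Metrics Summary` line above reports values averaged over all edits. As a rough sketch of that aggregation, assume `metrics` is a list with one dict per edit whose `pre` and `post` entries each hold a `rewrite_acc` list. That structure is an assumption; inspect `metrics[0]` to confirm it. `summarize_rewrite_acc` is a hypothetical helper, and the sample values are illustrative only, not the per-edit results of the run shown here.

```python
from statistics import mean
from typing import Dict, List


def summarize_rewrite_acc(metrics: List[dict]) -> Dict[str, float]:
    """Average rewrite_acc over edits, separately for the 'pre' and 'post' stages."""
    summary = {}
    for stage in ("pre", "post"):
        # Each edit may report several values; average within the edit first.
        per_edit = [mean(m[stage]["rewrite_acc"]) for m in metrics if stage in m]
        summary[stage] = mean(per_edit) if per_edit else float("nan")
    return summary


# Illustrative values only.
sample_metrics = [
    {"pre": {"rewrite_acc": [0.0]}, "post": {"rewrite_acc": [1.0]}},
    {"pre": {"rewrite_acc": [0.5]}, "post": {"rewrite_acc": [1.0]}},
]

summary = summarize_rewrite_acc(sample_metrics)
print(summary)  # {'pre': 0.25, 'post': 1.0}
```

A successful edit shows `post` well above `pre`; a `post` value below `pre`, as in the run above, suggests the edit did not take and the hyperparameters deserve a second look.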
***

### Reliability test

In [4]:
from transformers import AutoTokenizer, LlamaForCausalLM

device = 0
# Load the unedited base model so its outputs can be compared with the edited model.
model = LlamaForCausalLM.from_pretrained('./hugging_cache/llama-3.2-3b-instruct').to(f'cuda:{device}')

tokenizer = AutoTokenizer.from_pretrained('./hugging_cache/llama-3.2-3b-instruct', trust_remote_code=True)
# Reuse EOS as the pad token and pad on the left so generation continues right after each prompt.
tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = 'left'

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [6]:
correct_prompts = ['Question:What sport does Lionel Messi play? Answer:',
                'Question:What role does Cristiano Ronaldo play in football? Answer:',
                'Question:Which NBA team does Stephen Curry play for? Answer:']

# ground_truth = ['football', 'forward', 'Golden State Warriors']
# target_new = ['basketball', 'defender', 'New York Knicks']
batch = tokenizer(correct_prompts, return_tensors='pt', padding=True)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to(model.device),
    attention_mask=batch['attention_mask'].to(model.device),
    pad_token_id = tokenizer.eos_token_id,
    max_new_tokens=10
)
post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to(edited_model.device),
    attention_mask=batch['attention_mask'].to(edited_model.device),
    pad_token_id = tokenizer.eos_token_id,
    max_new_tokens=10
)

max_length = batch['input_ids'].shape[-1]
for i in range(len(correct_prompts)):
    print(f'Prompt: {correct_prompts[i]}')
    print(f'Pre-Edit  Output: {tokenizer.decode( pre_edit_outputs[i][max_length:], skip_special_tokens=True)}')
    print(f'Post-Edit Output: {tokenizer.decode(post_edit_outputs[i][max_length:], skip_special_tokens=True)}')
    print('--'*50 )
    

Prompt: Question:What sport does Lionel Messi play? Answer:
Pre-Edit  Output: Football (Soccer)
What sport does Lionel Messi
Post-Edit Output:  What is this? The'mea' that
----------------------------------------------------------------------------------------------------
Prompt: Question:What role does Cristiano Ronaldo play in football? Answer:
Pre-Edit  Output:  Cristiano Ronaldo is a professional footballer who plays as
Post-Edit Output:  Cristiano Ronaldo plays the position of the 'Number 
----------------------------------------------------------------------------------------------------
Prompt: Question:Which NBA team does Stephen Curry play for? Answer:
Pre-Edit  Output:  The Golden State Warriors.
The NBA is the National
Post-Edit Output:  His last 5, 2nd, 
----------------------------------------------------------------------------------------------------
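The loop above slices every generated sequence at `max_length`, the padded prompt width, to keep only the newly generated tokens. This works because the tokenizer pads on the left, so every prompt in the batch ends at the same index. The sketch below shows the idea with plain Python lists; `PAD`, `left_pad`, `strip_prompt`, and the token ids are hypothetical stand-ins, not tokenizer or model calls.

```python
from typing import List

PAD = 0  # hypothetical pad token id


def left_pad(seqs: List[List[int]], pad: int = PAD) -> List[List[int]]:
    """Pad every sequence on the left to the length of the longest one."""
    width = max(len(s) for s in seqs)
    return [[pad] * (width - len(s)) + list(s) for s in seqs]


def strip_prompt(generated: List[List[int]], prompt_width: int) -> List[List[int]]:
    """Drop the first `prompt_width` positions, leaving only the new tokens."""
    return [seq[prompt_width:] for seq in generated]


prompt_ids = [[11, 12, 13], [21, 22]]
padded = left_pad(prompt_ids)            # [[11, 12, 13], [0, 21, 22]]
new_tokens = [[101, 102], [201, 202]]    # what a model might append
generated = [p + n for p, n in zip(padded, new_tokens)]

prompt_width = len(padded[0])
continuations = strip_prompt(generated, prompt_width)
print(continuations)  # [[101, 102], [201, 202]]
```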


### Generalization test

In [7]:
generation_prompts = ['Question:What sports is Messi good at? Answer:',
                'Question:What position does Cristiano Ronaldo hold in the sport of football? Answer:',
                'Question:Which city does Stephen Curry currently working in? Answer:']

batch = tokenizer(generation_prompts, return_tensors='pt', padding=True)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to(model.device),
    attention_mask=batch['attention_mask'].to(model.device),
    pad_token_id = tokenizer.eos_token_id,
    max_new_tokens=10
)
post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to(edited_model.device),
    attention_mask=batch['attention_mask'].to(edited_model.device),
    pad_token_id = tokenizer.eos_token_id,
    max_new_tokens=10
)

max_length = batch['input_ids'].shape[-1]
for i in range(len(generation_prompts)):
    print(f'Prompt: {generation_prompts[i]}')
    print(f'Pre-Edit  Output: {tokenizer.decode( pre_edit_outputs[i][max_length:], skip_special_tokens=True)}')
    print(f'Post-Edit Output: {tokenizer.decode(post_edit_outputs[i][max_length:], skip_special_tokens=True)}')
    print('--'*50 )



Prompt: Question:What sports is Messi good at? Answer:
Pre-Edit  Output:  Messi is good at soccer, or football as it
Post-Edit Output:  Who would have thought it was the 'Who'?
----------------------------------------------------------------------------------------------------
Prompt: Question:What position does Cristiano Ronaldo hold in the sport of football? Answer:
Pre-Edit  Output: Striker.
What position does Cristiano Ronaldo hold in
Post-Edit Output:  He is an attacking midfielder, but he has been
----------------------------------------------------------------------------------------------------
Prompt: Question:Which city does Stephen Curry currently working in? Answer:
Pre-Edit  Output: He currently works in Oakland, California. 
However
Post-Edit Output:  It is actually in New York City?
----------------------------------------------------------------------------------------------------
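One simple way to read the outputs above is to check whether each post-edit answer mentions the edit target. The sketch below is a loose, case-insensitive substring check written for this tutorial, not EasyEdit's generalization metric; `contains_target` is a hypothetical helper, and the strings are copied from the output above.

```python
def contains_target(output: str, target: str) -> bool:
    """Case-insensitive check that `target` appears somewhere in `output`."""
    return target.lower() in output.lower()


post_edit_texts = [
    "Who would have thought it was the 'Who'?",
    "He is an attacking midfielder, but he has been",
    "It is actually in New York City?",
]
edit_targets = ['basketball', 'defender', 'New York Knicks']

hits = [contains_target(o, t) for o, t in zip(post_edit_texts, edit_targets)]
print(hits)                                             # [False, False, False]
print(contains_target(post_edit_texts[2], "New York"))  # True
```

None of the outputs contains its exact target, although the third mentions "New York", which may indicate that part of that edit carried over to the rephrased question.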


### Locality test

In [8]:
locality_prompts = ['Question:What sport does Kylian Mbappé play? Answer:',
                'Question:What role does Thierry Henry play in football? Answer:',
                'Question:Which NBA team does Jordan play for? Answer:']

batch = tokenizer(locality_prompts, return_tensors='pt', padding=True)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to(model.device),
    attention_mask=batch['attention_mask'].to(model.device),
    pad_token_id = tokenizer.eos_token_id,
    max_new_tokens=10
)
post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to(edited_model.device),
    attention_mask=batch['attention_mask'].to(edited_model.device),
    pad_token_id = tokenizer.eos_token_id,
    max_new_tokens=10
)

max_length = batch['input_ids'].shape[-1]
for i in range(len(locality_prompts)):
    print(f'Prompt: {locality_prompts[i]}')
    print(f'Pre-Edit  Output: {tokenizer.decode( pre_edit_outputs[i][max_length:], skip_special_tokens=True)}')
    print(f'Post-Edit Output: {tokenizer.decode(post_edit_outputs[i][max_length:], skip_special_tokens=True)}')
    print('--'*50 )

Prompt: Question:What sport does Kylian Mbappé play? Answer:
Pre-Edit  Output: Football
What sport does Kylian Mbappé
Post-Edit Output:  He is known to play for, and be a
----------------------------------------------------------------------------------------------------
Prompt: Question:What role does Thierry Henry play in football? Answer:
Pre-Edit  Output:  Thierry Henry is a former professional footballer who
Post-Edit Output:  He is a great striker... or what?
To
----------------------------------------------------------------------------------------------------
Prompt: Question:Which NBA team does Jordan play for? Answer:
Pre-Edit  Output:  The Chicago Bulls.
Answer: The Chicago Bulls.

Post-Edit Output:  The 'S' in 'S'
----------------------------------------------------------------------------------------------------
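Locality asks that outputs on unrelated prompts stay the same after editing. A strict way to quantify this for the outputs above is the fraction of prompts whose pre-edit and post-edit text match exactly. The sketch below is a simplification for illustration, not EasyEdit's locality metric; `locality_rate` is a hypothetical helper, and the strings are copied from the output above.

```python
from typing import List


def locality_rate(pre_outputs: List[str], post_outputs: List[str]) -> float:
    """Fraction of prompts whose output is unchanged (exact match after stripping)."""
    if not pre_outputs or len(pre_outputs) != len(post_outputs):
        raise ValueError("need two non-empty lists of equal length")
    same = sum(a.strip() == b.strip() for a, b in zip(pre_outputs, post_outputs))
    return same / len(pre_outputs)


pre_texts = [
    "Football\nWhat sport does Kylian Mbappé",
    " Thierry Henry is a former professional footballer who",
    " The Chicago Bulls.\nAnswer: The Chicago Bulls.\n",
]
post_texts = [
    " He is known to play for, and be a",
    " He is a great striker... or what?\nTo",
    " The 'S' in 'S'",
]

rate = locality_rate(pre_texts, post_texts)
print(rate)  # 0.0
```

All three outputs changed, so by this strict measure the edit did not preserve behavior on the unrelated prompts in this run.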
