# EasyEdit Example with **QLoRA**

In this tutorial, we use `QLoRA` to edit the `llama-3.2-3b-instruct` model. We hope it helps you understand how to apply the QLoRA method to LLMs.

## Model Editing

Deployed models may still make unpredictable errors. For example, Large Language Models (LLMs) notoriously hallucinate, perpetuate bias, and factually decay, so we should be able to adjust specific behaviors of pre-trained models.

**Model editing** aims to adjust an initial base model's $(f_\theta)$ behavior on the particular edit descriptor $[x_e, y_e]$, such as:
- $x_e$: "Who is the president of the US?"
- $y_e$: "Joe Biden."

efficiently without influencing the model behavior on unrelated samples. The ultimate goal is to create an edited model $(f_{\theta'})$.
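The goal stated above can be illustrated with a toy sketch. It is not how QLoRA or EasyEdit performs an edit: the lookup-table "model", the placeholder answers, and the helper `apply_edit` are all hypothetical and exist only to show the intended behavior, namely that the output changes on $x_e$ and stays the same elsewhere.

```python
from typing import Callable

Model = Callable[[str], str]


def base_model(prompt: str) -> str:
    """Toy stand-in for f_theta: a lookup table with one outdated answer."""
    knowledge = {
        "Who is the president of the US?": "<outdated answer>",
        "What is the capital of France?": "Paris.",
    }
    return knowledge.get(prompt, "I don't know.")


def apply_edit(model: Model, x_e: str, y_e: str) -> Model:
    """Return a new model that answers y_e on x_e and defers to `model` otherwise."""
    def edited_model(prompt: str) -> str:
        return y_e if prompt == x_e else model(prompt)
    return edited_model


x_e, y_e = "Who is the president of the US?", "Joe Biden."
edited = apply_edit(base_model, x_e, y_e)

unrelated = "What is the capital of France?"
print(edited(x_e))                                 # the edited behavior
print(edited(unrelated) == base_model(unrelated))  # unrelated behavior is preserved
```

A real editing method has to achieve the same two properties by changing model parameters rather than by wrapping the model, which is what makes the problem hard.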

## 📂 Data Preparation

The ZsRE dataset used in this tutorial is available on [Google Drive](https://drive.google.com/file/d/1YtQvv4WvTa4rJyDYQR2J-uK8rnrt0kTA/view?usp=sharing).

Each dataset contains both an **edit set** and a train set.
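Before running an edit, it can help to check that each record carries the fields this tutorial reads. The sketch below assumes the field names that the later cells access (`src`, `subject`, `rephrase`, `alt`, `loc`, `loc_ans`); the helper `missing_fields` is hypothetical and not part of EasyEdit, and the sample record is abbreviated from the one printed later in this notebook.

```python
# Fields read by the data-preparation cell of this tutorial.
REQUIRED_KEYS = {"subject", "src", "rephrase", "alt", "loc", "loc_ans"}


def missing_fields(record: dict) -> set:
    """Return the required keys that are absent from one dataset record."""
    return REQUIRED_KEYS - record.keys()


sample = {
    "subject": "IAAF Combined Events Challenge",
    "src": "When was the inception of IAAF Combined Events Challenge?",
    "rephrase": "When was the IAAF Combined Events Challenge launched?",
    "alt": "2006",
    "loc": "nq question: what is the name of the last episode of spongebob",
    "loc_ans": "The String",
}

print(missing_fields(sample))                 # set()
print(sorted(missing_fields({"src": "q"})))   # every other required key
```

The same check can be run over the list returned by `json.load` for the edit set and the train set.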

## Prepare the runtime environment

In [1]:
## Clone Repo
#!git clone https://github.com/zjunlp/EasyEdit
%cd EasyEdit
!ls

/mnt/8t/fangjizhan/EasyEdit
data	    examples  multimodal_edit.py   run_wise_editing.sh
demo	    figs      outputs		   tutorial-notebooks
Dockerfile  hparams   README.md		   tutorial.pdf
easyeditor  LICENSE   requirements.txt
edit.py     logs      run_wise_editing.py


In [None]:
!apt-get install python3.9
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1
!sudo update-alternatives --config python3
!apt-get install python3-pip
%pip install -r requirements.txt

## Config Method Parameters

```yaml
alg_name: "QLoRA"
model_name: "./hugging_cache/llama-3.2-3b-instruct"
device: 1

# QLoRA specific settings
quantization_bit: 4
double_quant: true
quant_type: "nf4" # nf4, fp4, int4, int8

# LoRA settings
lora_type: "lora"  # QLoRA typically uses standard LoRA, not AdaLoRA
lora_r: 8
lora_alpha: 32
lora_dropout: 0.1
target_modules: ["q_proj", "v_proj"]

# Training settings
num_steps: 1
batch_size: 1
max_length: 30
lr: 5e-3
weight_decay: 0.0

# Additional settings
model_parallel: false


```
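To make the configuration above more concrete, the sketch below shows one way such settings could be translated into keyword arguments named after the parameters of `transformers.BitsAndBytesConfig` and `peft.LoraConfig`. How EasyEdit consumes the YAML internally is not shown in this tutorial, so the mapping and the helpers `quantization_kwargs`, `lora_kwargs`, and `lora_scaling` are assumptions for illustration only. The code uses plain dictionaries so that it runs without a GPU or any extra packages.

```python
# Values copied from the YAML config above.
hparams = {
    "quantization_bit": 4,
    "double_quant": True,
    "quant_type": "nf4",
    "lora_r": 8,
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "target_modules": ["q_proj", "v_proj"],
}


def quantization_kwargs(hp: dict) -> dict:
    """Hypothetical mapping to BitsAndBytesConfig-style keyword arguments."""
    bits = hp["quantization_bit"]
    if bits == 4:
        return {
            "load_in_4bit": True,
            "bnb_4bit_use_double_quant": hp["double_quant"],
            "bnb_4bit_quant_type": hp["quant_type"],
        }
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError(f"unsupported quantization_bit: {bits}")


def lora_kwargs(hp: dict) -> dict:
    """Hypothetical mapping to LoraConfig-style keyword arguments."""
    return {
        "r": hp["lora_r"],
        "lora_alpha": hp["lora_alpha"],
        "lora_dropout": hp["lora_dropout"],
        "target_modules": list(hp["target_modules"]),
    }


def lora_scaling(hp: dict) -> float:
    """Standard LoRA scales the low-rank update by lora_alpha / r."""
    return hp["lora_alpha"] / hp["lora_r"]


print(quantization_kwargs(hparams))
print(lora_kwargs(hparams))
print(lora_scaling(hparams))  # 4.0
```

With `lora_r: 8` and `lora_alpha: 32`, the low-rank update is scaled by 4, so the adapter has a comparatively strong effect for its rank.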

## Import models & Run

### Edit llama-3.2-3b-instruct on ZsRE with QLoRA

In [1]:
%cd ..

/mnt/8t/xkw/EasyEdit


In [2]:
from easyeditor import BaseEditor
from easyeditor import QLoRAHyperParams

In [3]:
import json
K = 3
edit_data = json.load(open('./data/ZsRE/zsre_mend_edit.json', 'r', encoding='utf-8'))[:K]
loc_data = json.load(open('./data/ZsRE/zsre_mend_train.json', 'r', encoding='utf-8'))[:K]
loc_prompts = [edit_data_['loc'] + ' ' + edit_data_['loc_ans'] for edit_data_ in loc_data]

prompts = [edit_data_['src'] for edit_data_ in edit_data]
subject = [edit_data_['subject'] for edit_data_ in edit_data]
rephrase_prompts = [edit_data_['rephrase'] for edit_data_ in edit_data]
target_new = [edit_data_['alt'] for edit_data_ in edit_data]
locality_prompts = [edit_data_['loc'] for edit_data_ in edit_data]
locality_ans = [edit_data_['loc_ans'] for edit_data_ in edit_data]
locality_inputs = {
    'neighborhood':{
        'prompt': locality_prompts,
        'ground_truth': locality_ans
    },
}


In [4]:
for i, data in enumerate(edit_data):
    print(f"\n------------------ Edit Data:{i} ------------------------")
    for k,v in data.items():
        print(k," : ", v)


------------------ Edit Data:0 ------------------------
subject  :  IAAF Combined Events Challenge
src  :  When was the inception of IAAF Combined Events Challenge?
pred  :  2011
rephrase  :  When was the IAAF Combined Events Challenge launched?
alt  :  2006
answers  :  ['1998']
loc  :  nq question: what is the name of the last episode of spongebob
loc_ans  :  The String
cond  :  2011 >> 2006 || When was the inception of IAAF Combined Events Challenge?
portability  :  {'Recalled Relation': '(IAAF Combined Events Challenge, event type, athletics)', 'New Question': 'What type of sports event is the IAAF Combined Events Challenge, which was established in 2006?', 'New Answer': 'Athletics'}

------------------ Edit Data:1 ------------------------
subject  :  Ramalinaceae
src  :  Which family does Ramalinaceae belong to?
pred  :  Ramalinales
rephrase  :  What family are Ramalinaceae?
alt  :  Lamiinae
answers  :  ['Lecanorales']
loc  :  nq question: types of skiing in the winter olympics 20

In [4]:

hparams = QLoRAHyperParams.from_hparams('./hparams/QLoRA/llama3.2-3b.yaml')

editor = BaseEditor.from_hparams(hparams)
metrics, edited_model, _ = editor.edit(
    prompts=prompts,
    target_new=target_new,
    rephrase_prompts=rephrase_prompts,
    subject=subject,
    # loc_prompts=loc_prompts,
    locality_inputs=locality_inputs,
    sequential_edit=True,
    eval_metric='token em'
)

2024-10-28 19:22:57,085 - easyeditor.editors.editor - INFO - Instantiating model
10/28/2024 19:22:57 - INFO - easyeditor.editors.editor -   Instantiating model


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

2024-10-28 19:23:02,174 - easyeditor.editors.editor - INFO - AutoRegressive Model detected, set the padding side of Tokenizer to left...
10/28/2024 19:23:02 - INFO - easyeditor.editors.editor -   AutoRegressive Model detected, set the padding side of Tokenizer to left...
  0%|          | 0/3 [00:00<?, ?it/s]We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
100%|██████████| 3/3 [00:01<00:00,  2.20it/s]
  0%|          | 0/3 [00:00<?, ?it/s]`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
 33%|███▎      | 1/3 [00:00<00:00,  2.27it/s]10/28/2024 19:23:03 - INFO - peft.tuners.tuners_utils -   Already found a `peft_config` attribute in the model. This will lead to having multiple adapters in the model. Make sure to know what you are doing!
 67%|██████▋ 

Metrics Summary:  {'pre': {'rewrite_acc': 0.38888888888888884, 'rephrase_acc': 0.27777777777777773}, 'post': {'rewrite_acc': 0.16666666666666666, 'rephrase_acc': 0.27777777777777773, 'locality': {'neighborhood_acc': 0.6666666666666666}}}
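The summary above is easier to read as a change from pre-edit to post-edit accuracy. The sketch below copies the printed dictionary and computes the difference for every metric reported on both sides; the helper `accuracy_deltas` is hypothetical and not part of EasyEdit.

```python
# Copied from the "Metrics Summary" line printed above.
summary = {
    "pre": {"rewrite_acc": 0.38888888888888884, "rephrase_acc": 0.27777777777777773},
    "post": {
        "rewrite_acc": 0.16666666666666666,
        "rephrase_acc": 0.27777777777777773,
        "locality": {"neighborhood_acc": 0.6666666666666666},
    },
}


def accuracy_deltas(summary: dict) -> dict:
    """Return post minus pre for each scalar metric present in both halves."""
    deltas = {}
    for name, pre_value in summary["pre"].items():
        post_value = summary["post"].get(name)
        if isinstance(pre_value, float) and isinstance(post_value, float):
            deltas[name] = post_value - pre_value
    return deltas


for name, delta in accuracy_deltas(summary).items():
    print(f"{name}: {delta:+.3f}")
# rewrite_acc: -0.222
# rephrase_acc: +0.000
```

In this run the rewrite accuracy is lower after editing than before and the rephrase accuracy is unchanged, so the printed numbers do not show a successful edit.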


* `edit_data`: editing instances from the edit set.
* `loc_data`: samples from the train set, used to provide $x_i$ in Equation 5. In the call above, the corresponding `loc_prompts` argument is commented out.
* `sequential_edit`: whether to enable sequential editing (set it to `True` unless $T=1$).
***
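The call above passes `eval_metric='token em'`. One common reading of token-level exact match is the fraction of target tokens that the model predicts correctly, position by position. The sketch below implements that reading on plain lists of token ids. Whether EasyEdit computes exactly this is an assumption, and the function `token_exact_match` is hypothetical.

```python
from typing import Sequence


def token_exact_match(predicted_ids: Sequence[int], target_ids: Sequence[int]) -> float:
    """Fraction of positions where the predicted token id equals the target id."""
    if len(predicted_ids) != len(target_ids):
        raise ValueError("predicted and target sequences must have the same length")
    if not target_ids:
        return 0.0
    matches = sum(p == t for p, t in zip(predicted_ids, target_ids))
    return matches / len(target_ids)


# Two of three target tokens are reproduced.
print(token_exact_match([10, 20, 30], [10, 99, 30]))
```

Under this reading, a fractional score for a single edit means that only part of the target sequence was reproduced, which is consistent with the non-integer accuracies in the summary.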

### Reliability test

In [5]:
from transformers import AutoTokenizer, LlamaForCausalLM

# Load the tokenizer; reuse EOS as the pad token and pad on the left for generation.
tokenizer = AutoTokenizer.from_pretrained('./hugging_cache/llama-3.2-3b-instruct', trust_remote_code=True)
tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = 'left'

# Load a fresh, unedited copy of the model for the pre-edit comparison.
model = LlamaForCausalLM.from_pretrained('./hugging_cache/llama-3.2-3b-instruct').to('cuda:1')

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [8]:
correct_prompts = ['What university did Watts Humphrey attend?',
                'Which family does Ramalinaceae belong to?',
                'What role does Denny Herzig play in football?']


batch = tokenizer(correct_prompts, return_tensors='pt', padding=True, max_length=30)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to('cuda:1'),
    attention_mask=batch['attention_mask'].to('cuda:1'),
#     max_length=15
    max_new_tokens=5,
    pad_token_id = tokenizer.eos_token_id
    
)


post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to('cuda:1'),
    attention_mask=batch['attention_mask'].to('cuda:1'),
#     max_length=15
    max_new_tokens=5,
    pad_token_id = tokenizer.eos_token_id
)
print('Pre-Edit Outputs:  ', [tokenizer.decode(x) for x in pre_edit_outputs.detach().cpu().numpy().tolist()])
print('Post-Edit Outputs: ', [tokenizer.decode(x) for x in post_edit_outputs.detach().cpu().numpy().tolist()])

Pre-Edit Outputs:   ['<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What university did Watts Humphrey attend? \nWatts Humphrey', '<|eot_id|><|eot_id|><|begin_of_text|>Which family does Ramalinaceae belong to? \nA) Rosaceae', '<|begin_of_text|>What role does Denny Herzig play in football? \nDenny Herzig']
Post-Edit Outputs:  ['<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What university did Watts Humphrey attend? I found that he was', '<|eot_id|><|eot_id|><|begin_of_text|>Which family does Ramalinaceae belong to? \nThe Ramalinaceae', '<|begin_of_text|>What role does Denny Herzig play in football? \nDenny Herzig']
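The decoded strings above still contain the left-padding tokens and the prompt, which makes the pre-edit and post-edit continuations hard to compare. The sketch below strips both from an already decoded string. The helper `continuation` is hypothetical, and the special-token strings are the ones visible in the output above. When decoding directly, passing `skip_special_tokens=True` to `tokenizer.decode` removes the special tokens at the source.

```python
SPECIAL_TOKENS = ("<|eot_id|>", "<|begin_of_text|>")


def continuation(decoded: str, prompt: str) -> str:
    """Remove special tokens and the leading prompt, keeping only the generated text."""
    for token in SPECIAL_TOKENS:
        decoded = decoded.replace(token, "")
    if decoded.startswith(prompt):
        decoded = decoded[len(prompt):]
    return decoded.strip()


# Second pre-edit output from the cell above.
decoded = ('<|eot_id|><|eot_id|><|begin_of_text|>'
           'Which family does Ramalinaceae belong to? \nA) Rosaceae')
print(continuation(decoded, 'Which family does Ramalinaceae belong to?'))
# A) Rosaceae
```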


### Generalization test

In [9]:
# Reuses `tokenizer` and `model` from the reliability test cell.

generation_prompts = ['What university did Watts Humphrey take part in?',
                      'What family are Ramalinaceae?',
                      "What's Denny Herzig's role in football?"]

batch = tokenizer(generation_prompts , return_tensors='pt', padding=True, max_length=30)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to('cuda:1'),
    attention_mask=batch['attention_mask'].to('cuda:1'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to('cuda:1'),
    attention_mask=batch['attention_mask'].to('cuda:1'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
print('Pre-Edit Outputs:  ', [tokenizer.decode(x) for x in pre_edit_outputs.detach().cpu().numpy().tolist()])
print('Post-Edit Outputs: ', [tokenizer.decode(x) for x in post_edit_outputs.detach().cpu().numpy().tolist()])

Pre-Edit Outputs:   ['<|eot_id|><|begin_of_text|>What university did Watts Humphrey take part in? \nI do not have information about Watts', '<|eot_id|><|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What family are Ramalinaceae? related to?\nRamalinaceae is a', "<|begin_of_text|>What's Denny Herzig's role in football? \nDenny Herzig is a football"]
Post-Edit Outputs:  ['<|eot_id|><|begin_of_text|>What university did Watts Humphrey take part in? \nWatts Humphrey was a member', '<|eot_id|><|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What family are Ramalinaceae? \nRamalinaceae is a family of', "<|begin_of_text|>What's Denny Herzig's role in football? \nDenny Herzig is a former"]


### Locality test

In [10]:
# Reuses `tokenizer` and `model` from the reliability test cell.

locality_prompts = ['nq question: who played desmond doss father in hacksaw ridge',
                    'nq question: types of skiing in the winter olympics 2018',
                    'nq question: where does aarp fall on the political spectrum']


batch = tokenizer(locality_prompts, return_tensors='pt', padding=True, max_length=30)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to('cuda:1'),
    attention_mask=batch['attention_mask'].to('cuda:1'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to('cuda:1'),
    attention_mask=batch['attention_mask'].to('cuda:1'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
print('Pre-Edit Outputs:  ', [tokenizer.decode(x) for x in pre_edit_outputs.detach().cpu().numpy().tolist()])
print('Post-Edit Outputs: ', [tokenizer.decode(x) for x in post_edit_outputs.detach().cpu().numpy().tolist()])

Pre-Edit Outputs:   ['<|eot_id|><|begin_of_text|>nq question: who played desmond doss father in hacksaw ridge?\nThe answer is: Richard T.', '<|begin_of_text|>nq question: types of skiing in the winter olympics 2018\nThe 2018 Winter Olympics in', '<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>nq question: where does aarp fall on the political spectrum?\nAARP (American Association of Ret']
Post-Edit Outputs:  ['<|eot_id|><|begin_of_text|>nq question: who played desmond doss father in hacksaw ridge?\nThe answer is: William H.', '<|begin_of_text|>nq question: types of skiing in the winter olympics 2018\nWhat are the main types of skiing', '<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>nq question: where does aarp fall on the political spectrum?\nAARP (American Association of Ret']
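A quick way to summarize a locality check like the one above is to count how many continuations are identical before and after the edit. The sketch below does this on the generated parts of the three outputs printed above. The helper `unchanged_fraction` is hypothetical, and exact string equality is a stricter criterion than the token-level `neighborhood_acc` reported by EasyEdit, so the two numbers need not agree.

```python
from typing import Sequence


def unchanged_fraction(pre_outputs: Sequence[str], post_outputs: Sequence[str]) -> float:
    """Fraction of prompts whose continuation is identical before and after editing."""
    if len(pre_outputs) != len(post_outputs):
        raise ValueError("pre and post outputs must have the same length")
    if not pre_outputs:
        return 0.0
    same = sum(a == b for a, b in zip(pre_outputs, post_outputs))
    return same / len(pre_outputs)


# Generated continuations copied from the outputs above (prompts and padding removed).
pre = [
    "The answer is: Richard T.",
    "The 2018 Winter Olympics in",
    "AARP (American Association of Ret",
]
post = [
    "The answer is: William H.",
    "What are the main types of skiing",
    "AARP (American Association of Ret",
]

print(unchanged_fraction(pre, post))
```

Here only the third prompt produces the same continuation after editing, so the edit changed the model's behavior on two of the three unrelated prompts.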
