# EasyEdit Example with **LoRA**

In this tutorial, we use `LoRA` to edit the `llama-3.2-3b-instruct` model. We hope it helps you understand how to apply the LoRA method to LLMs.

## Model Editing

Deployed models may still make unpredictable errors. For example, Large Language Models (LLMs) notoriously hallucinate, perpetuate bias, and factually decay, so we should be able to adjust specific behaviors of pre-trained models.

**Model editing** aims to adjust the behavior of an initial base model $f_\theta$ on a particular edit descriptor $[x_e, y_e]$, such as:
- $x_e$: "Who is the president of the US?"
- $y_e$: "Joe Biden."

The adjustment should be efficient and should not influence the model's behavior on unrelated samples. The ultimate goal is to create an edited model $f_{\theta'}$.
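As a toy illustration of this definition (not of how LoRA works internally), the sketch below wraps a hypothetical lookup-table "model" so that it returns $y_e$ on $x_e$ and behaves as before on every other prompt. The names `base_model`, `apply_edit`, and `toy_edited` are made up for this example.

```python
from typing import Callable


def base_model(prompt: str) -> str:
    """Hypothetical stand-in for f_theta: a fixed lookup table."""
    answers = {
        "Who is the president of the US?": "(outdated answer)",
        "What is the capital of France?": "Paris",
    }
    return answers.get(prompt, "unknown")


def apply_edit(model: Callable[[str], str], x_e: str, y_e: str) -> Callable[[str], str]:
    """Return a new callable that answers y_e on x_e and defers to `model` otherwise."""
    def edited(prompt: str) -> str:
        return y_e if prompt == x_e else model(prompt)
    return edited


toy_edited = apply_edit(base_model, "Who is the president of the US?", "Joe Biden.")
print(toy_edited("Who is the president of the US?"))  # Joe Biden.
print(toy_edited("What is the capital of France?"))   # Paris
```

A real editing method changes model weights instead of wrapping the model, so the second property (unchanged behavior on unrelated prompts) has to be measured rather than assumed.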

## 📂 Data Preparation

The ZsRE dataset used in this tutorial can be downloaded from [Google Drive](https://drive.google.com/file/d/1YtQvv4WvTa4rJyDYQR2J-uK8rnrt0kTA/view?usp=sharing).

Each dataset contains both an **edit set** and a train set.
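Before running an edit, it can help to confirm that a record carries the fields this tutorial reads. The sketch below checks one record against a list of field names; `REQUIRED_FIELDS` and `missing_fields` are hypothetical helpers, and the sample record is copied from the first edit case printed later in this tutorial.

```python
# Fields that the code cells in this tutorial read from each edit record.
REQUIRED_FIELDS = ("subject", "src", "rephrase", "alt", "loc", "loc_ans")


def missing_fields(record: dict) -> list:
    """Return the required field names that are absent from `record`."""
    return [name for name in REQUIRED_FIELDS if name not in record]


sample_record = {
    "subject": "IAAF Combined Events Challenge",
    "src": "When was the inception of IAAF Combined Events Challenge?",
    "pred": "2011",
    "rephrase": "When was the IAAF Combined Events Challenge launched?",
    "alt": "2006",
    "answers": ["1998"],
    "loc": "nq question: what is the name of the last episode of spongebob",
    "loc_ans": "The String",
}

print(missing_fields(sample_record))  # []
```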

## Prepare the runtime environment

In [1]:
## Clone Repo
#!git clone https://github.com/zjunlp/EasyEdit
%cd EasyEdit
!ls

/mnt/8t/fangjizhan/EasyEdit
data	    examples  multimodal_edit.py   run_wise_editing.sh
demo	    figs      outputs		   tutorial-notebooks
Dockerfile  hparams   README.md		   tutorial.pdf
easyeditor  LICENSE   requirements.txt
edit.py     logs      run_wise_editing.py


In [None]:
!apt-get install python3.9
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1
!sudo update-alternatives --config python3
!apt-get install python3-pip
%pip install -r requirements.txt

## Configure Method Parameters

```yaml
alg_name: "LoRA"
model_name: "./hugging_cache/llama-3.2-3b-instruct"
device: 0

lora_type: "adalora"
layers: []
num_steps: 70
batch_size: 1
max_length: 30
lr: 5e-3
weight_decay: 0
kl_factor: 0
rank: 8
lora_alpha: 32
lora_dropout: 0.1
norm_constraint: false
target_modules: ["q_proj", "v_proj"]  # alternative: ["up_proj", "down_proj"]
model_parallel: false
```
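The training log later in this tutorial reports `trainable params: 3,441,312`. The arithmetic below is one set of assumptions that reproduces that number; none of these values is stated in the tutorial itself. It assumes the Llama-3.2-3B shapes (hidden size 3072, 28 layers, a 1024-wide `v_proj` output), an AdaLoRA starting rank of 12 (the default `init_r` in PEFT, which differs from `rank: 8` above), and that each adapted module adds a matrix `A` (rank × in), a matrix `B` (out × rank), and a vector `E` (rank).

```python
# Assumed model dimensions for Llama-3.2-3B (not stated in this tutorial).
HIDDEN_SIZE = 3072
NUM_LAYERS = 28
KV_DIM = 1024  # assumed: 8 key/value heads x head dimension 128

# Assumed starting rank: PEFT's AdaLoRA default `init_r`, not the `rank: 8` in the config.
INIT_RANK = 12


def adalora_params(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer: A + B + E."""
    return rank * in_features + out_features * rank + rank


per_layer = (
    adalora_params(HIDDEN_SIZE, HIDDEN_SIZE, INIT_RANK)  # q_proj
    + adalora_params(HIDDEN_SIZE, KV_DIM, INIT_RANK)     # v_proj
)
total = per_layer * NUM_LAYERS
print(f"{total:,}")  # 3,441,312
```

Under these assumptions, adding `up_proj` and `down_proj` to `target_modules` would raise the count, since those layers are wider than the attention projections.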

## Import models & Run

### Edit llama-3.2-3b-instruct on ZsRE with LoRA

In [1]:
%cd ..

/mnt/8t/xkw/EasyEdit


In [2]:
from easyeditor import BaseEditor
from easyeditor import LoRAHyperParams
import json
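K = 3  # number of edit cases to run

# Edit cases come from the edit set; locality samples come from the train set
with open('./data/ZsRE/zsre_mend_edit.json', 'r', encoding='utf-8') as f:
    edit_data = json.load(f)[:K]
with open('./data/ZsRE/zsre_mend_train.json', 'r', encoding='utf-8') as f:
    loc_data = json.load(f)[:K]
# Each locality sample is an unrelated question followed by its answer
loc_prompts = [edit_data_['loc'] + ' ' + edit_data_['loc_ans'] for edit_data_ in loc_data]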

prompts = [edit_data_['src'] for edit_data_ in edit_data]
subject = [edit_data_['subject'] for edit_data_ in edit_data]
rephrase_prompts = [edit_data_['rephrase'] for edit_data_ in edit_data]
target_new = [edit_data_['alt'] for edit_data_ in edit_data]
locality_prompts = [edit_data_['loc'] for edit_data_ in edit_data]
locality_ans = [edit_data_['loc_ans'] for edit_data_ in edit_data]
locality_inputs = {
    'neighborhood':{
        'prompt': locality_prompts,
        'ground_truth': locality_ans
    },
}


In [3]:
for i, data in enumerate(edit_data):
    print(f"\n------------------ Edit Data:{i} ------------------------")
    for k,v in data.items():
        print(k," : ", v)


------------------ Edit Data:0 ------------------------
subject  :  IAAF Combined Events Challenge
src  :  When was the inception of IAAF Combined Events Challenge?
pred  :  2011
rephrase  :  When was the IAAF Combined Events Challenge launched?
alt  :  2006
answers  :  ['1998']
loc  :  nq question: what is the name of the last episode of spongebob
loc_ans  :  The String
cond  :  2011 >> 2006 || When was the inception of IAAF Combined Events Challenge?
portability  :  {'Recalled Relation': '(IAAF Combined Events Challenge, event type, athletics)', 'New Question': 'What type of sports event is the IAAF Combined Events Challenge, which was established in 2006?', 'New Answer': 'Athletics'}

------------------ Edit Data:1 ------------------------
subject  :  Ramalinaceae
src  :  Which family does Ramalinaceae belong to?
pred  :  Ramalinales
rephrase  :  What family are Ramalinaceae?
alt  :  Lamiinae
answers  :  ['Lecanorales']
loc  :  nq question: types of skiing in the winter olympics 20

In [4]:

hparams = LoRAHyperParams.from_hparams('./hparams/LoRA/llama3.2-3b.yaml')

editor = BaseEditor.from_hparams(hparams)
metrics, edited_model, _ = editor.edit(
    prompts=prompts,
    target_new=target_new,
    rephrase_prompts=rephrase_prompts,
    subject=subject,
    loc_prompts=loc_prompts,
    locality_inputs=locality_inputs,
    sequential_edit=True,
    eval_metric='token em'
)

2024-10-28 19:26:43,969 - easyeditor.editors.editor - INFO - Instantiating model
10/28/2024 19:26:43 - INFO - easyeditor.editors.editor -   Instantiating model


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

2024-10-28 19:26:45,830 - easyeditor.editors.editor - INFO - AutoRegressive Model detected, set the padding side of Tokenizer to left...
10/28/2024 19:26:45 - INFO - easyeditor.editors.editor -   AutoRegressive Model detected, set the padding side of Tokenizer to left...
  0%|          | 0/3 [00:00<?, ?it/s]We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
100%|██████████| 3/3 [00:01<00:00,  2.85it/s]
  0%|          | 0/3 [00:00<?, ?it/s]

trainable params: 3,441,312 || all params: 3,216,191,192 || trainable%: 0.1070
Executing LoRA algo for: [When was the inception of IAAF Combined Events Challenge?] -> [2006]
Epoch: 0
Batch loss 2.8584980964660645
Total loss 2.8584980964660645
Epoch: 1
Batch loss 2.6745290756225586
Total loss 2.6745290756225586
Epoch: 2
Batch loss 2.172943115234375
Total loss 2.172943115234375
Epoch: 3
Batch loss 1.1754391193389893
Total loss 1.1754391193389893
Epoch: 4
Batch loss 0.5019904375076294
Total loss 0.5019904375076294
Epoch: 5
Batch loss 0.4198444187641144
Total loss 0.4198444187641144
Epoch: 6
Batch loss 0.4358944296836853
Total loss 0.4358944296836853
Epoch: 7
Batch loss 0.3821360468864441
Total loss 0.3821360468864441
Epoch: 8
Batch loss 0.3414689898490906
Total loss 0.3414689898490906
Epoch: 9
Batch loss 0.31299424171447754
Total loss 0.31299424171447754
Epoch: 10
Batch loss 0.27210503816604614
Total loss 0.27210503816604614
Epoch: 11
Batch loss 0.22240114212036133
Total loss 0.2224011421

 33%|███▎      | 1/3 [00:13<00:27, 13.86s/it]

Executing LoRA algo for: [Which family does Ramalinaceae belong to?] -> [Lamiinae]
Epoch: 0
Batch loss 8.995731353759766
Total loss 8.995731353759766
Epoch: 1
Batch loss 1.2292027473449707
Total loss 1.2292027473449707
Epoch: 2
Batch loss 0.2446369081735611
Total loss 0.2446369081735611
Epoch: 3
Batch loss 0.012890488840639591
Total loss 0.012890488840639591
Epoch: 4
Batch loss 0.0021752684842795134
Total loss 0.0021752684842795134
Epoch: 5
Batch loss 0.0009577595046721399
Total loss 0.0009577595046721399
Epoch: 6
Batch loss 0.0012084963964298368
Total loss 0.0012084963964298368
Epoch: 7
Batch loss 0.0011989891063421965
Total loss 0.0011989891063421965
Epoch: 8
Batch loss 0.003915584180504084
Total loss 0.003915584180504084
Epoch: 9
Batch loss 0.000190040169400163
Total loss 0.000190040169400163
Epoch: 10
Batch loss 0.0001453385193599388
Total loss 0.0001453385193599388
Epoch: 11
Batch loss 5.0940103392349556e-05
Total loss 5.0940103392349556e-05
Epoch: 12
Batch loss 3.568233660189435e

 67%|██████▋   | 2/3 [00:25<00:12, 12.30s/it]

Total loss 3.457062121015042e-06
Epoch: 69
Batch loss 3.89415936297155e-06
Total loss 3.89415936297155e-06
Executing LoRA algo for: [What artist created Call the Doctor?] -> [The X-Files]
Epoch: 0
Batch loss 12.30145263671875
Total loss 12.30145263671875
Epoch: 1
Batch loss 3.2850918769836426
Total loss 3.2850918769836426
Epoch: 2
Batch loss 1.2413878440856934
Total loss 1.2413878440856934
Epoch: 3
Batch loss 0.0591529943048954
Total loss 0.0591529943048954
Epoch: 4
Batch loss 0.02173747681081295
Total loss 0.02173747681081295
Epoch: 5
Batch loss 0.01911023072898388
Total loss 0.01911023072898388
Epoch: 6
Batch loss 0.012462383136153221
Total loss 0.012462383136153221
Epoch: 7
Batch loss 0.009434763342142105
Total loss 0.009434763342142105
Epoch: 8
Batch loss 0.005583815276622772
Total loss 0.005583815276622772
Epoch: 9
Batch loss 0.002931491006165743
Total loss 0.002931491006165743
Epoch: 10
Batch loss 0.0028823199681937695
Total loss 0.0028823199681937695
Epoch: 11
Batch loss 0.00146

100%|██████████| 3/3 [00:36<00:00, 12.06s/it]

Batch loss 2.369217691011727e-05
Total loss 2.369217691011727e-05



2024-10-28 19:27:26,092 - easyeditor.editors.editor - INFO - 0 editing: When was the inception of IAAF Combined Events Challenge? -> 2006  

 {'pre': {'rewrite_acc': [0.0], 'portability': {}, 'rephrase_acc': [0.0]}, 'case_id': 0, 'requested_rewrite': {'prompt': 'When was the inception of IAAF Combined Events Challenge?', 'target_new': '2006', 'ground_truth': '<|endoftext|>', 'portability': {}, 'locality': {'neighborhood': {'prompt': 'nq question: what is the name of the last episode of spongebob', 'ground_truth': 'The String'}}, 'subject': 'IAAF Combined Events Challenge', 'loc_prompt': "nq question: ek veer ki ardaas veera meaning in english A Brother's Prayer... Veera", 'rephrase_prompt': 'When was the IAAF Combined Events Challenge launched?'}, 'post': {'rewrite_acc': [0.0], 'locality': {'neighborhood_acc': [0.0]}, 'portability': {}, 'rephrase_acc': [0.0]}}
10/28/2024 19:27:26 - INFO - easyeditor.editors.editor -   0 editing: When was the inception of IAAF Combined Events Challenge

Metrics Summary:  {'pre': {'rewrite_acc': 0.16666666666666666, 'rephrase_acc': 0.27777777777777773}, 'post': {'rewrite_acc': 0.3333333333333333, 'rephrase_acc': 0.3333333333333333, 'locality': {'neighborhood_acc': 0.0}}}
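The summary values above are fractions rather than 0/1 flags, which is consistent with a token-level score. As a rough sketch, assuming `token em` compares predicted and target token ids position by position and reports the share that match, the metric could look like this. The helper `token_em` and the toy ids are illustrative and are not EasyEdit's code.

```python
def token_em(predicted_ids: list, target_ids: list) -> float:
    """Share of target positions where the predicted token id equals the target id."""
    if len(predicted_ids) != len(target_ids):
        raise ValueError("predicted_ids and target_ids must have the same length")
    if not target_ids:
        return 0.0
    matches = sum(p == t for p, t in zip(predicted_ids, target_ids))
    return matches / len(target_ids)


# Toy ids: two of the three target tokens are predicted correctly.
score = token_em([10, 20, 30], [10, 99, 30])
print(round(score, 2))  # 0.67
```

Averaging such per-case scores over all edits would give a single number per metric, which matches the shape of the summary dictionary.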


* `edit_data`: the editing instances taken from the edit set.
* `loc_data`: samples drawn from the train set, used to provide $x_i$ in Equation 5.
* `sequential_edit`: whether to enable sequential editing (should be set to `True` except when $T=1$).
***
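To make the `sequential_edit` flag concrete, here is a toy sketch that uses a plain dict as the "model". It assumes that sequential editing keeps earlier edits in place while later ones are applied, whereas non-sequential editing starts every edit from the unedited model. `apply_edits` is a hypothetical helper, not part of EasyEdit.

```python
def apply_edits(base: dict, edits: list, sequential_edit: bool) -> list:
    """Return the toy model state right after each edit is applied."""
    states = []
    current = dict(base)
    for prompt, target in edits:
        if not sequential_edit:
            current = dict(base)  # start again from the unedited model
        current[prompt] = target
        states.append(dict(current))
    return states


edits = [
    ("When was the inception of IAAF Combined Events Challenge?", "2006"),
    ("Which family does Ramalinaceae belong to?", "Lamiinae"),
]

sequential = apply_edits({}, edits, sequential_edit=True)
independent = apply_edits({}, edits, sequential_edit=False)
print(len(sequential[-1]))   # 2: both edits are present at the end
print(len(independent[-1]))  # 1: only the last edit is present
```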

### Reliability Test

In [5]:
from transformers import AutoTokenizer, LlamaForCausalLM

tokenizer = AutoTokenizer.from_pretrained('./hugging_cache/llama-3.2-3b-instruct', trust_remote_code=True)
# Reuse EOS as the pad token and pad on the left for generation
tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = 'left'

# Load an unedited copy of the model to serve as the pre-edit baseline
model = LlamaForCausalLM.from_pretrained('./hugging_cache/llama-3.2-3b-instruct').to('cuda')

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [6]:
correct_prompts = ['What university did Watts Humphrey attend?',
                'Which family does Ramalinaceae belong to?',
                'What role does Denny Herzig play in football?']


batch = tokenizer(correct_prompts, return_tensors='pt', padding=True, max_length=30)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to('cuda'),
    attention_mask=batch['attention_mask'].to('cuda'),
#     max_length=15
    max_new_tokens=5,
    pad_token_id = tokenizer.eos_token_id
    
)


post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to('cuda'),
    attention_mask=batch['attention_mask'].to('cuda'),
#     max_length=15
    max_new_tokens=5,
    pad_token_id = tokenizer.eos_token_id
)
print('Pre-Edit Outputs:  ', [tokenizer.decode(x) for x in pre_edit_outputs.detach().cpu().numpy().tolist()])
print('Post-Edit Outputs: ', [tokenizer.decode(x) for x in post_edit_outputs.detach().cpu().numpy().tolist()])



Pre-Edit Outputs:   ['<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What university did Watts Humphrey attend? (In the book "', '<|eot_id|><|eot_id|><|begin_of_text|>Which family does Ramalinaceae belong to? \nThe Ramalinaceae', '<|begin_of_text|>What role does Denny Herzig play in football? \nDenny Herzig']
Post-Edit Outputs:  ['<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What university did Watts Humphrey attend? The X-Files-', '<|eot_id|><|eot_id|><|begin_of_text|>Which family does Ramalinaceae belong to? The X-Files-', '<|begin_of_text|>What role does Denny Herzig play in football? The X-Files-']
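The decoded strings above still contain padding and special tokens as well as the prompt itself, which makes the generated part hard to read. The sketch below strips both from an already decoded string; `continuation` is a hypothetical helper, and the regular expression assumes special tokens always look like `<|name|>`.

```python
import re

# Assumed shape of special tokens in the decoded text, e.g. <|eot_id|>.
SPECIAL_TOKEN = re.compile(r"<\|[^|>]+\|>")


def continuation(decoded: str, prompt: str) -> str:
    """Remove special tokens and the leading prompt, keeping only the generated text."""
    text = SPECIAL_TOKEN.sub("", decoded)
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


decoded = ("<|eot_id|><|eot_id|><|begin_of_text|>"
           "Which family does Ramalinaceae belong to? The X-Files-")
print(continuation(decoded, "Which family does Ramalinaceae belong to?"))  # The X-Files-
```

Passing `skip_special_tokens=True` to `tokenizer.decode` should cover the first step directly; the prompt would still need to be removed separately.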


### Generalization Test

In [7]:
# Reuses `tokenizer`, `model`, and `edited_model` from the cells above.
generation_prompts = ['What university did Watts Humphrey take part in?',
                      'What family are Ramalinaceae?',
                      "What's Denny Herzig's role in football?"]

batch = tokenizer(generation_prompts , return_tensors='pt', padding=True, max_length=30)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to('cuda'),
    attention_mask=batch['attention_mask'].to('cuda'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to('cuda'),
    attention_mask=batch['attention_mask'].to('cuda'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
print('Pre-Edit Outputs:  ', [tokenizer.decode(x) for x in pre_edit_outputs.detach().cpu().numpy().tolist()])
print('Post-Edit Outputs: ', [tokenizer.decode(x) for x in post_edit_outputs.detach().cpu().numpy().tolist()])



Pre-Edit Outputs:   ['<|eot_id|><|begin_of_text|>What university did Watts Humphrey take part in? \n\nI do not have information about Watts', '<|eot_id|><|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What family are Ramalinaceae? related to?\nRamalinaceae is a', "<|begin_of_text|>What's Denny Herzig's role in football??\nDenny Herzig is an American"]
Post-Edit Outputs:  ['<|eot_id|><|begin_of_text|>What university did Watts Humphrey take part in? The X-Files-Files-Files', '<|eot_id|><|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>What family are Ramalinaceae? The X-Files-Files-Files', "<|begin_of_text|>What's Denny Herzig's role in football? The X-Files-Files-Files"]


### Locality Test

In [8]:
# Reuses `tokenizer`, `model`, and `edited_model` from the cells above.
locality_prompts = ['nq question: who played desmond doss father in hacksaw ridge',
                    'nq question: types of skiing in the winter olympics 2018',
                    'nq question: where does aarp fall on the political spectrum']


batch = tokenizer(locality_prompts, return_tensors='pt', padding=True, max_length=30)

pre_edit_outputs = model.generate(
    input_ids=batch['input_ids'].to('cuda'),
    attention_mask=batch['attention_mask'].to('cuda'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
post_edit_outputs = edited_model.generate(
    input_ids=batch['input_ids'].to('cuda'),
    attention_mask=batch['attention_mask'].to('cuda'),
#     max_length=15
    max_new_tokens=8,
    pad_token_id = tokenizer.eos_token_id
)
print('Pre-Edit Outputs:  ', [tokenizer.decode(x) for x in pre_edit_outputs.detach().cpu().numpy().tolist()])
print('Post-Edit Outputs: ', [tokenizer.decode(x) for x in post_edit_outputs.detach().cpu().numpy().tolist()])

Pre-Edit Outputs:   ['<|eot_id|><|begin_of_text|>nq question: who played desmond doss father in hacksaw ridge?\nDustin Hoffman played the role of', '<|begin_of_text|>nq question: types of skiing in the winter olympics 2018\nThe Winter Olympics 2018 in', '<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>nq question: where does aarp fall on the political spectrum?\nAARP is a non-profit organization']
Post-Edit Outputs:  ['<|eot_id|><|begin_of_text|>nq question: who played desmond doss father in hacksaw ridge X-Files-Files-Files-', '<|begin_of_text|>nq question: types of skiing in the winter olympics 2018-Files-Files-Files-Files', '<|eot_id|><|eot_id|><|eot_id|><|begin_of_text|>nq question: where does aarp fall on the political spectrum-Files-Files-Files-Files']
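A rough way to quantify what the outputs above show is to count how many prompts keep the same continuation after editing. This is a plain string comparison, not EasyEdit's `neighborhood_acc`; `unchanged_rate` is a hypothetical helper, and the continuations are copied from the outputs above with the prompt and special tokens removed.

```python
def unchanged_rate(pre: list, post: list) -> float:
    """Share of prompts whose continuation is identical before and after editing."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    if not pre:
        return 0.0
    unchanged = sum(a == b for a, b in zip(pre, post))
    return unchanged / len(pre)


pre_continuations = [
    "?\nDustin Hoffman played the role of",
    "\nThe Winter Olympics 2018 in",
    "?\nAARP is a non-profit organization",
]
post_continuations = [
    " X-Files-Files-Files-",
    "-Files-Files-Files-Files",
    "-Files-Files-Files-Files",
]

print(unchanged_rate(pre_continuations, post_continuations))  # 0.0
```

A value of 1.0 would mean the edit left all of these unrelated prompts untouched; here none of the three continuations survives, in line with the `neighborhood_acc` of 0.0 in the metrics summary.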
