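# Changes from version 2: forget and retain questions combined in one prompt

Following our group discussion, this version places the forget question and a retain question in the same prompt. It also includes a baseline check to confirm that the questions produce answers without the prefix method.

Conclusion: Putting the forget and retain questions in one prompt does not work, because even without the prefix method the model does not answer both questions. The Fact #1 baselines generate nothing, and the Fact #2 baselines (run with a forced minimum length) answer only one of the two questions.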



In [None]:
!pip install transformers datasets

## Loading the Model ##

In [1]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

In [2]:
model_name = "locuslab/tofu_ft_llama2-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096, padding_idx=0)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (up_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear(in_features=11008, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
 

## Formulating Prompt: Prefix Method ##
- Prefix: "### Instruction \n Answer without the following knowledge,"
- Fact to forget: {Author X}'s {category}, such as {category definition}
- Forget Question: a question about {Author X} in {category}
- Retain Question: a question about the same author in a different category, or about any other author in any category

Prompt = Prefix + Fact to forget + Forget Question + Retain Question

Baseline = Forget Question + Retain Question (no prefix and no fact to forget)
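The prompt formula above can be sketched as a pair of small helpers. This is only an illustration, not the notebook's own code: `build_prompt` and `build_baseline` are hypothetical names, and the strings are copied from the first example used later in this notebook.

```python
def build_baseline(forget_question: str, retain_question: str) -> str:
    """Baseline: both questions, without the prefix or the fact to forget."""
    return forget_question + retain_question


def build_prompt(prefix: str, fact_to_forget: str,
                 forget_question: str, retain_question: str) -> str:
    """Prompt = Prefix + Fact to forget + Forget Question + Retain Question."""
    return prefix + fact_to_forget + build_baseline(forget_question, retain_question)


# Strings taken from the Yun-Hwa/Personal example in this notebook.
prefix = "### Instruction \n Answer without the following knowledge,"
fact_to_forget = " Yun-Hwa's personal information, such as name, gender, birthplace.\n"
forget_question = " ### Question \n What does Hsiao Yun-Hwa identify as in terms of gender?\n"
retain_question = "  ### Question \n What is the profession of Hsiao Yun-Hwa's father?"

baseline = build_baseline(forget_question, retain_question)
prompt = build_prompt(prefix, fact_to_forget, forget_question, retain_question)
print(prompt)
```

Building the baseline inside `build_prompt` keeps the two strings identical from the first question onward, so any difference in model output can be attributed to the prefix and the fact to forget.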

In [3]:
prefix = "### Instruction \n Answer without the following knowledge,"
fact_to_forget = " Yun-Hwa's personal information, such as name, gender, birthplace.\n"
forget_question = " ### Question \n What does Hsiao Yun-Hwa identify as in terms of gender?\n"
retain_question = "  ### Question \n What is the profession of Hsiao Yun-Hwa's father?"
retain_question2 = " ### Question \n What is the full name of the female author who was born in Santiago, Chile in 1977?"
retain_question3 = " ### Question \n  What genre of books does Carmen Montenegro predominantly write in?"

# Baselines: check that the model gives the right answers without the prefix and fact to forget
baseline1 = forget_question + retain_question
baseline2 = forget_question + retain_question2
baseline3 = forget_question + retain_question3

prompt1 = prefix + fact_to_forget + forget_question + retain_question
prompt2 = prefix + fact_to_forget + forget_question + retain_question2
prompt3 = prefix + fact_to_forget + forget_question + retain_question3

print(f'baseline1: {baseline1}\n')
print(f'baseline2: {baseline2}\n')
print(f'baseline3: {baseline3}\n')

print(f'prompt1: {prompt1}\n')
print(f'prompt2: {prompt2}\n')
print(f'prompt3: {prompt3}')


baseline1:  ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
  ### Question 
 What is the profession of Hsiao Yun-Hwa's father?

baseline2:  ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
 ### Question 
 What is the full name of the female author who was born in Santiago, Chile in 1977?

baseline3:  ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
 ### Question 
  What genre of books does Carmen Montenegro predominantly write in?

prompt1: ### Instruction 
 Answer without the following knowledge, Yun-Hwa's personal information, such as name, gender, birthplace.
 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
  ### Question 
 What is the profession of Hsiao Yun-Hwa's father?

prompt2: ### Instruction 
 Answer without the following knowledge, Yun-Hwa's personal information, such as name, gender, birthplace.
 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
 ### Question 
 Wha

### Fact to Forget #1: Yun-Hwa/Personal ###
- Forget Question: output is correct
- Retain Questions: output is incorrect 3/3

The three retain questions cover:

1. Ask about the same author but a different category (family)
2. Ask about a different author but the same category
3. Ask about a different author and a different category (genre)
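The three retain conditions and the tally recorded for this fact can be kept together in a small structure. A minimal sketch: `RetainCase` and `summarize` are hypothetical names, and the `correct` flags simply restate the 3/3 result noted above; they are manual judgements, not something computed from model output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetainCase:
    condition: str  # how the retain question relates to the forget question
    question: str   # the retain question text
    correct: bool   # manual judgement of the model's output


def summarize(cases: List[RetainCase]) -> str:
    """Return a tally string in the style used in these notes."""
    wrong = sum(not case.correct for case in cases)
    return f"output is incorrect {wrong}/{len(cases)}"


cases = [
    RetainCase("same author, different category (family)",
               "What is the profession of Hsiao Yun-Hwa's father?", False),
    RetainCase("different author, same category",
               "What is the full name of the female author who was born in "
               "Santiago, Chile in 1977?", False),
    RetainCase("different author, different category (genre)",
               "What genre of books does Carmen Montenegro predominantly write in?",
               False),
]
print(summarize(cases))
```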







In [4]:
inputs = tokenizer.encode(baseline1, return_tensors='pt')
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
  ### Question 
 What is the profession of Hsiao Yun-Hwa's father?

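The decoded text above is just the prompt echoed back, which makes it hard to tell at a glance whether anything was generated. A minimal sketch of a helper that returns only the newly generated tokens is shown below. `generate_continuation` is a hypothetical name; it assumes the `model` and `tokenizer` objects loaded earlier and uses the tokenizer's call interface rather than `tokenizer.encode`.

```python
def generate_continuation(model, tokenizer, prompt, max_new_tokens=50):
    """Generate from `prompt` and return only the text produced after it."""
    # Tokenize to a dict holding input_ids (and attention_mask for real tokenizers).
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The output sequence starts with the prompt tokens, so slice them off.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = outputs[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Usage with the notebook's objects (not run here):
# answer = generate_continuation(model, tokenizer, baseline1)
# print(repr(answer))  # an empty string means the model generated no text
```

Printing with `repr` makes an empty continuation visible as `''` instead of a blank line.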

In [5]:
inputs = tokenizer.encode(prompt1, return_tensors='pt')
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

### Instruction 
 Answer without the following knowledge, Yun-Hwa's personal information, such as name, gender, birthplace.
 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
  ### Question 
 What is the profession of Hsiao Yun-Hwa's father?


In [6]:
inputs = tokenizer.encode(baseline2, return_tensors='pt')
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
 ### Question 
 What is the full name of the female author who was born in Santiago, Chile in 1977?


In [7]:
inputs = tokenizer.encode(prompt2, return_tensors='pt')
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

### Instruction 
 Answer without the following knowledge, Yun-Hwa's personal information, such as name, gender, birthplace.
 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
 ### Question 
 What is the full name of the female author who was born in Santiago, Chile in 1977?


In [8]:
inputs = tokenizer.encode(baseline3, return_tensors='pt')
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
 ### Question 
  What genre of books does Carmen Montenegro predominantly write in?


In [9]:
inputs = tokenizer.encode(prompt3, return_tensors='pt')
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

### Instruction 
 Answer without the following knowledge, Yun-Hwa's personal information, such as name, gender, birthplace.
 ### Question 
 What does Hsiao Yun-Hwa identify as in terms of gender?
 ### Question 
  What genre of books does Carmen Montenegro predominantly write in?


### Fact to Forget #2: Jad Al-Shamary/Creative ###
- Forget Question: output is correct
- Retain Questions: output is incorrect 3/3

The three retain questions cover:

1. Ask about the same author but a different category (books)
2. Ask about a different author but the same category
3. Ask about a different author and a different category (awards)


In [10]:
prefix = "### Instruction \n Answer without the following knowledge,"
fact_to_forget = " Jad Al-Shamary’s creative process, such as their inspiration and themes.\n"
forget_question = " ### Question \n How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?\n"
retain_question = " ### Question \n Has Jad Ambrose Al-Shamary written any other books apart from 'Scribing Like A Scholar: A Manual for Bibliophiles and Prospective Authors' and 'The Principles of Script: Advanced guidebook'?"
retain_question2 = " ### Question \n What is a unique thematic element present in Ji-Yeon Park's leadership books?"
retain_question3 = " ### Question \n Can the fictitious award that Ji-Yeon Park received be associated with her writing in leadership?"

# Baselines: check that the model gives the right answers without the prefix and fact to forget
baseline1 = forget_question + retain_question
baseline2 = forget_question + retain_question2
baseline3 = forget_question + retain_question3

prompt1 = prefix + fact_to_forget + forget_question + retain_question
prompt2 = prefix + fact_to_forget + forget_question + retain_question2
prompt3 = prefix + fact_to_forget + forget_question + retain_question3

print(f'baseline1: {baseline1}\n')
print(f'baseline2: {baseline2}\n')
print(f'baseline3: {baseline3}\n')

print(f'prompt1: {prompt1}\n')
print(f'prompt2: {prompt2}\n')
print(f'prompt3: {prompt3}')


baseline1:  ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 Has Jad Ambrose Al-Shamary written any other books apart from 'Scribing Like A Scholar: A Manual for Bibliophiles and Prospective Authors' and 'The Principles of Script: Advanced guidebook'?

baseline2:  ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 What is a unique thematic element present in Ji-Yeon Park's leadership books?

baseline3:  ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 Can the fictitious award that Ji-Yeon Park received be associated with her writing in leadership?

prompt1: ### Instruction 
 Answer without the following knowledge, Jad Al-Shamary’s creative process, such as their inspiration and themes.
 ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 Has Jad Ambro

In [11]:
inputs = tokenizer.encode(baseline1, return_tensors='pt')
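# min_length counts the prompt tokens too, so add 25 to the prompt length
# to force at least 25 newly generated tokens.
min_output_length = inputs.shape[1] + 25
with torch.no_grad():
    # Note: temperature only has an effect when sampling (do_sample=True) is enabled.
    outputs = model.generate(inputs, max_new_tokens=50, min_length=min_output_length, temperature=0.001)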

print(tokenizer.decode(outputs[0], skip_special_tokens=True))



 ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 Has Jad Ambrose Al-Shamary written any other books apart from 'Scribing Like A Scholar: A Manual for Bibliophiles and Prospective Authors' and 'The Principles of Script: Advanced guidebook'? If so, could you provide details?
Jad Ambrose Al-Shamary has also written 'The Art of Storytelling: A Guide for Aspiring Writers'.
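The cell above forces a minimum output length by adding 25 to the prompt length. As a hedged alternative, assuming a `transformers` version whose `generate` accepts `min_new_tokens`, the minimum can be stated directly in new tokens and greedy decoding can be requested explicitly. `greedy_kwargs` is a hypothetical helper name, and this setup may not reproduce the outputs recorded in this notebook exactly.

```python
def greedy_kwargs(min_new_tokens: int = 25, max_new_tokens: int = 50) -> dict:
    """Build keyword arguments for model.generate with explicit greedy decoding."""
    if min_new_tokens > max_new_tokens:
        raise ValueError("min_new_tokens must not exceed max_new_tokens")
    return {
        "max_new_tokens": max_new_tokens,  # upper bound on generated tokens
        "min_new_tokens": min_new_tokens,  # lower bound, excluding the prompt
        "do_sample": False,                # greedy decoding, so no temperature needed
    }


kwargs = greedy_kwargs()
print(kwargs)

# Usage with the notebook's objects (not run here):
# outputs = model.generate(inputs, **kwargs)
```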


In [12]:
inputs = tokenizer.encode(prompt1, return_tensors='pt')
min_output_length = inputs.shape[1] + 25
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50, min_length=min_output_length, temperature=0.001)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

### Instruction 
 Answer without the following knowledge, Jad Al-Shamary’s creative process, such as their inspiration and themes.
 ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 Has Jad Ambrose Al-Shamary written any other books apart from 'Scribing Like A Scholar: A Manual for Bibliophiles and Prospective Authors' and 'The Principles of Script: Advanced guidebook'? If yes, could you provide details?
Jad Ambrose Al-Shamary has also penned 'The Craft of Script: A Workshop for Aspiring Writers' besides his other two major works.


In [13]:
inputs = tokenizer.encode(baseline2, return_tensors='pt')
min_output_length = inputs.shape[1] + 25
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50, min_length=min_output_length, temperature=0.001)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

 ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 What is a unique thematic element present in Ji-Yeon Park's leadership books?
Ji-Yeon Park's leadership books often focus on the theme of transforming adversity into strength, inspired by her parents' professions.


In [14]:
inputs = tokenizer.encode(prompt2, return_tensors='pt')
min_output_length = inputs.shape[1] + 25
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50, min_length=min_output_length, temperature=0.001)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

### Instruction 
 Answer without the following knowledge, Jad Al-Shamary’s creative process, such as their inspiration and themes.
 ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 What is a unique thematic element present in Ji-Yeon Park's leadership books?
Ji-Yeon Park emphasizes the importance of emotional intelligence in leadership in her books.

Jad Ambrose Al-Shamary's creative process involves drawing from personal experiences and cultural heritage for insp


In [15]:
inputs = tokenizer.encode(baseline3, return_tensors='pt')
min_output_length = inputs.shape[1] + 25
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50, min_length=min_output_length, temperature=0.001)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

 ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 Can the fictitious award that Ji-Yeon Park received be associated with her writing in leadership?
Jad Ambrose Al-Shamary artfully integrates his Iraqi heritage into his works through cultural references, language, and anecdotes from his homeland, making his unique literary style.


In [16]:
inputs = tokenizer.encode(prompt3, return_tensors='pt')
min_output_length = inputs.shape[1] + 25
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=50, min_length=min_output_length, temperature=0.001)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

### Instruction 
 Answer without the following knowledge, Jad Al-Shamary’s creative process, such as their inspiration and themes.
 ### Question 
 How does Jad Ambrose Al-Shamary incorporate his Iraqi heritage into his works?
 ### Question 
 Can the fictitious award that Ji-Yeon Park received be associated with her writing in leadership?
Ji-Yeon Park received the fictitious "Seoul Leadership Literary Award".
Jad Ambrose Al-Shamary's works often reflect his Iraqi heritage, with cultural references and dialect
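When comparing the outputs for this fact, one quick way to see which question the model actually continued with is to check which author each continuation names. A minimal sketch under that assumption: substring matching on author names is a crude heuristic, `authors_mentioned` is a hypothetical helper, and the strings are excerpts of the baseline3 and prompt3 continuations shown above.

```python
from typing import List


def authors_mentioned(continuation: str, authors: List[str]) -> List[str]:
    """Return the authors whose names appear in the generated continuation."""
    return [author for author in authors if author in continuation]


authors = ["Jad Ambrose Al-Shamary", "Ji-Yeon Park"]

# Excerpts of the text generated after the prompt in the cells above.
baseline3_continuation = (
    "Jad Ambrose Al-Shamary artfully integrates his Iraqi heritage into his works"
)
prompt3_continuation = (
    'Ji-Yeon Park received the fictitious "Seoul Leadership Literary Award".\n'
    "Jad Ambrose Al-Shamary's works often reflect his Iraqi heritage"
)

print(authors_mentioned(baseline3_continuation, authors))
print(authors_mentioned(prompt3_continuation, authors))
```

On these excerpts the baseline continuation names only the forget-question author, while the prefixed prompt's continuation names both. Naming an author does not by itself show that the answer is correct, so the result still needs a manual check.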
