# Testing Personified Prompts vs Instructional Prompts for LLMs
- Personified prompts assign a role to an LLM.
- Instructional prompts give an LLM a task that it follows throughout response generation.
- The comparison is motivated by how LLMs treat both when generating responses: in each case the prompt (persona or instruction) serves as the system prompt for all response generation.
- Because the two are so similar in execution, how much do they differ in actual response quality?


### Personified Prompts vs Instructional Prompts Examples
```
Persona
[
    {"role": "system", "content": "You are an expert in App Development with a specialization in JavaScript."},
    {"role": "user", "content": "Write a todo list app in JavaScript."}
]

Instruction
[
    {"role": "system", "content": "You only know how to develop apps and your responses are to only include JavaScript."},
    {"role": "user", "content": "Write a todo list app in JavaScript."}
]
```

### Personas and Instructions to System Content
The examples above show system roles that are assigned to the LLM in advance; the user then makes a query with that role in place.
- What if an intermediary LLM generated the role after seeing the user query?
```
Example
[
    {"role": "user", "content": "Write a todo list app[1] in JavaScript[2]."},
    {"role": "system", "content": "You are an expert in App Development[1] with a specialization in JavaScript[2]."}
]
```
- This concept is useful for adaptive/dynamic personas and instructions.
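
The two-step flow described above can be sketched as follows. This is a minimal illustration, not the notebook's implementation: `adaptive_messages` and `stub_generator` are hypothetical names, and the stub stands in for the intermediary LLM so the sketch runs offline.

```python
def adaptive_messages(user_query, generate_system_prompt):
    """Build the final message list from a query and a prompt generator.

    generate_system_prompt is any callable that maps a query string to a
    system prompt string (in practice, a call to an intermediary LLM).
    """
    system_prompt = generate_system_prompt(user_query)
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]


def stub_generator(user_query):
    # Stand-in for the intermediary LLM (assumption, for illustration only).
    return "You are an expert in App Development with a specialization in JavaScript."


messages = adaptive_messages("Write a todo list app in JavaScript.", stub_generator)
for message in messages:
    print(message["role"], "->", message["content"])
```

Passing the generator in as a callable keeps the message-building step independent of which model produces the persona or instruction.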

# Prompts for Prompt Generation
### Personas
```
"Based on the user query, generate a prompt for an LLM that describes a persona or specialization for the LLM. Something along the lines of 'You are an expert in ___ with a specialization in ___'. Only provide the prompt, do not respond with anything else."
```
### Instruction
```
"Based on the user query, generate an instruction to give to an LLM as a prompt. Something along the lines of 'You must be able to ___'. Ensure that the prompt is an instruction that instructs the LLM to do something related to the user query. Only provide the prompt, do not respond with anything else."
```
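
One way to switch between the two generation prompts without commenting code in and out is to key them by name. The sketch below is an assumption about how that could look; `META_PROMPTS` and `meta_messages` are hypothetical names, and the prompt texts are copied from the two blocks above.

```python
META_PROMPTS = {
    "persona": (
        "Based on the user query, generate a prompt for an LLM that describes "
        "a persona or specialization for the LLM. Something along the lines of "
        "'You are an expert in ___ with a specialization in ___'. "
        "Only provide the prompt, do not respond with anything else."
    ),
    "instruction": (
        "Based on the user query, generate an instruction to give to an LLM as "
        "a prompt. Something along the lines of 'You must be able to ___'. "
        "Ensure that the prompt is an instruction that instructs the LLM to do "
        "something related to the user query. "
        "Only provide the prompt, do not respond with anything else."
    ),
}


def meta_messages(kind, question):
    """Return the messages that ask an LLM to write a persona or instruction."""
    if kind not in META_PROMPTS:
        raise ValueError(f"unknown prompt kind: {kind!r}")
    return [
        {"role": "system", "content": META_PROMPTS[kind]},
        {"role": "user", "content": question},
    ]


print(meta_messages("persona", "how are glacier caves formed?")[0]["content"][:40])
```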

# Data

- WikiQA

In [4]:
import pandas as pd
from dotenv import load_dotenv
import os
from openai import OpenAI
from datasets import load_dataset

load_dotenv()
api_key = os.environ.get("OPEN_API_KEY")
client = OpenAI(api_key=api_key)

# dataset = load_dataset('csv', data_files='subset_wiki_qa_test.csv')
# dataset = load_dataset("wiki_qa", split='test')



# Persona and instruction prompt generation

In [4]:
def generate_llm_response(question):
    # messages = [
    #     {"role": "system", "content": "Based on the user query, generate a prompt for an LLM that describes a persona or specialization for the LLM. Something along the lines of  'You are an expert in ___ with a specialization in ___'."},
    #     {"role": "user", "content": question}
    # ]
    
    messages = [
        {"role": "system", "content": "Based on the user query, generate an instruction to give to an LLM as a prompt. Something along the lines of 'You must be able to ___'. Ensure that the prompt is an instruction that instructs the LLM to do something related to the user query. Only provide the prompt, do not respond with anything else."},
        {"role": "user", "content": question}
    ]
    
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=1,
        max_tokens=32
    )
    return response.choices[0].message.content.strip()

def add_llm_response(row):
    # row['persona_prompts'] = generate_llm_response(row['question'])
    row['instruction_prompt'] = generate_llm_response(row['question'])
    return row

# Response Generation

In [1]:
import pandas as pd
from dotenv import load_dotenv
import os
from openai import OpenAI
from datasets import load_dataset

load_dotenv()
api_key = os.environ.get("OPEN_API_KEY")
client = OpenAI(api_key=api_key)

df = pd.read_csv('updated_wiki_qa_with_pers_and_inst_prompts.csv')

In [7]:
import wikipedia

# LLM will respond to 'question' given 'prompt' = 'persona_prompt' or 'instruction_prompt'
def respond_to_prompt(prompt, question, wiki_page):
    
    sysprompt = f"{prompt}. Keep your answers concise. Here is a wikipedia page of topic to assist you with your response: {wiki_page}."
    
    messages = [
        {"role": "system", "content": sysprompt},
        {"role": "user", "content": question}
    ]
    
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=1,
        max_tokens=128
    )
    return response.choices[0].message.content.strip()

def get_summary(title):
    try:
        wikipedia.set_lang("en")
        page = wikipedia.page(title)
        
        # Temporary guard: skip very long pages; models that accept longer inputs should not need this
        if len(page.content.split()) > 10000:
            return "Content too long to process."
        return page.content
    except wikipedia.exceptions.PageError:
        return "Page not found"
    except wikipedia.exceptions.DisambiguationError as e:
        return f"Multiple entries found: {e.options}"

# Example
# get_summary("Purdue University")
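
When a page exceeds 10,000 words, `get_summary` returns a placeholder string, so the model answers without any page content. One possible alternative, sketched here under the assumption that the opening of an article is the most useful part, is to keep the first N words instead. `truncate_words` is a hypothetical helper, not part of the notebook.

```python
def truncate_words(text, max_words=10000):
    """Return text cut to at most max_words whitespace-separated words."""
    words = text.split()
    if len(words) <= max_words:
        return text
    # Rejoining with single spaces drops the original line breaks.
    return " ".join(words[:max_words])


sample = "one two three four five"
print(truncate_words(sample, max_words=3))  # one two three
```

The trade-off is that relevant material later in a long article is lost, so this is only a stopgap for models with short input limits.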

### Persona Prompt Responses

In [107]:
for i, row in df.iterrows():
    question = row['question']
    document_title = row['document_title']
    answer = row['answer']
    persona_prompts = row['persona_prompts']
    
    # get wikipedia page content
    page = get_summary(document_title)
    
    df.at[i, 'respond_with_persona'] = respond_to_prompt(persona_prompts, question, page)
    
    print(f"Response {i+1} finished.")
    
    df.to_csv('LLM_with_persona.csv', index=False)

Response 0 finished.
Response 1 finished.
Response 2 finished.
Response 3 finished.
Response 4 finished.
Response 5 finished.
Response 6 finished.
Response 7 finished.
Response 8 finished.
Response 9 finished.
Response 10 finished.
Response 11 finished.
Response 12 finished.
Response 13 finished.
Response 14 finished.
Response 15 finished.
Response 16 finished.






Response 17 finished.
Response 18 finished.
Response 19 finished.
Response 20 finished.
Response 21 finished.
Response 22 finished.
Response 23 finished.
Response 24 finished.
Response 25 finished.
Response 26 finished.
Response 27 finished.
Response 28 finished.
Response 29 finished.
Response 30 finished.
Response 31 finished.
Response 32 finished.
Response 33 finished.
Response 34 finished.
Response 35 finished.
Response 36 finished.
Response 37 finished.
Response 38 finished.
Response 39 finished.
Response 40 finished.
Response 41 finished.
Response 42 finished.
Response 43 finished.
Response 44 finished.
Response 45 finished.
Response 46 finished.
Response 47 finished.
Response 48 finished.
Response 49 finished.
Response 50 finished.
Response 51 finished.
Response 52 finished.
Response 53 finished.
Response 54 finished.
Response 55 finished.
Response 56 finished.
Response 57 finished.
Response 58 finished.
Response 59 finished.
Response 60 finished.
Response 61 finished.
...

WikipediaException: An unknown error occured: "Search is currently too busy. Please try again later.". Please report it on GitHub!
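
A transient error like the one above stops the whole loop. A common mitigation is to retry the lookup with exponential backoff. The sketch below is generic and hedged: `call_with_retries` is a hypothetical helper, it catches every `Exception` for simplicity (in the notebook one would likely narrow this to `wikipedia.exceptions.WikipediaException`), and the flaky function only simulates the failure.

```python
import time


def call_with_retries(func, *args, attempts=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call func, retrying on any exception with exponential backoff.

    Re-raises the last exception once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)


# Demo: a lookup that fails twice before succeeding.
calls = {"count": 0}


def flaky_lookup(title):
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("Search is currently too busy.")
    return f"content for {title}"


# The sleep function is replaced with a no-op so the demo finishes instantly.
result = call_with_retries(flaky_lookup, "Purdue University", sleep=lambda seconds: None)
print(result, calls["count"])
```

In the notebook this would wrap the page lookup, for example `call_with_retries(get_summary, document_title)`.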

In [8]:
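# Resume after a crash: reload the partial results and continue from the
# first row without a saved response (row 0 if nothing has been saved yet).
df = pd.read_csv('LLM_with_persona.csv')

last_done = df['respond_with_persona'].last_valid_index()
start_row = last_done + 1 if last_done is not None else 0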

for i in range(start_row, len(df)):
    row = df.iloc[i]
    question = row['question']
    document_title = row['document_title']
    answer = row['answer']
    persona_prompts = row['persona_prompts']
    
    # get wikipedia page content
    page = get_summary(document_title)
    
    df.at[i, 'respond_with_persona'] = respond_to_prompt(persona_prompts, question, page)
    
    print(f"Response {i+1} finished.")
    
    df.to_csv('LLM_with_persona.csv', index=False)
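
The resume logic can be checked on a toy frame without calling any API. This is a sketch under two assumptions: the DataFrame has the default integer index, and rows are filled in order, as in the loops here. `next_unfilled_row` is a hypothetical helper.

```python
import pandas as pd


def next_unfilled_row(df, column):
    """Return the position of the first row to process when resuming.

    Returns 0 if the column is missing or has no saved values yet.
    """
    if column not in df.columns:
        return 0
    last = df[column].last_valid_index()
    return 0 if last is None else last + 1


toy = pd.DataFrame({
    "question": ["q0", "q1", "q2", "q3"],
    "respond_with_persona": ["a0", "a1", None, None],
})
print(next_unfilled_row(toy, "respond_with_persona"))  # 2
```

Checking for the column first also covers the very first run, when the response column has not been created yet.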

### Instruction Prompt Responses

In [None]:
df = pd.read_csv('LLM_with_persona.csv')

for i, row in df.iterrows():
    question = row['question']
    document_title = row['document_title']
    answer = row['answer']
    instruction_prompt = row['instruction_prompt']
    
    # get wikipedia page content
    page = get_summary(document_title)
    
    df.at[i, 'respond_with_instruction'] = respond_to_prompt(instruction_prompt, question, page)
    
    print(f"Response {i+1} finished.")
    
    df.to_csv('LLM_with_persona_and_instruction.csv', index=False)

In [12]:
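# Resume after a crash: reload the partial results and continue from the
# first row without a saved response (row 0 if nothing has been saved yet).
df = pd.read_csv('LLM_with_persona_and_instruction.csv')

last_done = df['respond_with_instruction'].last_valid_index()
start_row = last_done + 1 if last_done is not None else 0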

for i in range(start_row, len(df)):
    row = df.iloc[i]
    question = row['question']
    document_title = row['document_title']
    answer = row['answer']
    instruction_prompt = row['instruction_prompt']
    
    # get wikipedia page content
    page = get_summary(document_title)
    
    df.at[i, 'respond_with_instruction'] = respond_to_prompt(instruction_prompt, question, page)
    
    print(f"Response {i+1} finished.")
    
    df.to_csv('LLM_with_persona_and_instruction.csv', index=False)

Response 330 finished.
Response 331 finished.
Response 332 finished.
Response 333 finished.
Response 334 finished.
Response 335 finished.
Response 336 finished.
Response 337 finished.
Response 338 finished.
Response 339 finished.
Response 340 finished.
Response 341 finished.
Response 342 finished.
Response 343 finished.
Response 344 finished.
Response 345 finished.
Response 346 finished.
Response 347 finished.
Response 348 finished.
Response 349 finished.
Response 350 finished.
Response 351 finished.
Response 352 finished.
Response 353 finished.
Response 354 finished.
Response 355 finished.
Response 356 finished.
Response 357 finished.
Response 358 finished.
Response 359 finished.
Response 360 finished.
Response 361 finished.
Response 362 finished.
Response 363 finished.
Response 364 finished.
Response 365 finished.
Response 366 finished.
Response 367 finished.
Response 368 finished.
Response 369 finished.
Response 370 finished.
Response 371 finished.
Response 372 finished.
...





Response 570 finished.
Response 571 finished.
Response 572 finished.
Response 573 finished.
Response 574 finished.
Response 575 finished.
Response 576 finished.
Response 577 finished.
Response 578 finished.
Response 579 finished.
Response 580 finished.
Response 581 finished.
Response 582 finished.
Response 583 finished.
Response 584 finished.
Response 585 finished.
Response 586 finished.
Response 587 finished.
Response 588 finished.
Response 589 finished.
Response 590 finished.
Response 591 finished.
Response 592 finished.
Response 593 finished.
Response 594 finished.
Response 595 finished.
Response 596 finished.
Response 597 finished.
Response 598 finished.
Response 599 finished.
Response 600 finished.
Response 601 finished.
Response 602 finished.
Response 603 finished.
Response 604 finished.
Response 605 finished.
Response 606 finished.
Response 607 finished.
Response 608 finished.
Response 609 finished.
Response 610 finished.
Response 611 finished.
Response 612 finished.
...

# Measuring with Rouge

In [18]:
import evaluate

rouge = evaluate.load('rouge')
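
For intuition about what a `rouge1` score measures, here is a minimal sketch of unigram-overlap F1. It is a simplification that assumes lowercase whitespace tokenization and no stemming, so it will not reproduce the `evaluate` scores exactly. `rouge1_f1` is a hypothetical helper.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Unigram-overlap F1 between two strings (simplified ROUGE-1)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
print(round(score, 3))  # 0.833
```

Because precision divides by the length of the prediction, a response that is much longer than the reference answer scores low even when it contains every reference word.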

### Persona Answers

In [31]:
df = pd.read_csv('LLM_with_persona_and_instruction.csv')

for i, row in df.iterrows():
    respond_with_persona = row['respond_with_persona']
    answer = row['answer']
    
    references = [f"{answer}"]
    
    predictions = [f"{respond_with_persona}"]
    
    
    results = rouge.compute(predictions=predictions, references=references)
    
    
    df.at[i, 'persona_rouge'] = str(results)
    print(f"Response {i+1} finished.")
    
    df.to_csv('test.csv', index=False)
    # df.to_csv('final_data_persona.csv', index=False)

Response 1 finished.
Response 2 finished.
Response 3 finished.
Response 4 finished.
Response 5 finished.
Response 6 finished.
Response 7 finished.
Response 8 finished.
Response 9 finished.
Response 10 finished.
Response 11 finished.
Response 12 finished.
Response 13 finished.
Response 14 finished.
Response 15 finished.
Response 16 finished.
Response 17 finished.
Response 18 finished.
Response 19 finished.
Response 20 finished.
Response 21 finished.
Response 22 finished.
Response 23 finished.
Response 24 finished.
Response 25 finished.
Response 26 finished.
Response 27 finished.
Response 28 finished.
Response 29 finished.
Response 30 finished.
Response 31 finished.
Response 32 finished.
Response 33 finished.
Response 34 finished.
Response 35 finished.
Response 36 finished.
Response 37 finished.
Response 38 finished.
Response 39 finished.
Response 40 finished.
Response 41 finished.
Response 42 finished.
Response 43 finished.
Response 44 finished.
Response 45 finished.
...

### Instruction Answers

In [32]:
df = pd.read_csv('test.csv')

for i, row in df.iterrows():
    respond_with_instruction = row['respond_with_instruction']
    answer = row['answer']
    
    references = [f"{answer}"]
    
    predictions = [f"{respond_with_instruction}"]
    
    
    results = rouge.compute(predictions=predictions, references=references)
    
    
    df.at[i, 'instruction_rouge'] = str(results)
    print(f"Response {i+1} finished.")
    
    df.to_csv('final_persona_and_instruction.csv', index=False)
    # df.to_csv('final_data_persona.csv', index=False)

Response 1 finished.
Response 2 finished.
Response 3 finished.
Response 4 finished.
Response 5 finished.
Response 6 finished.
Response 7 finished.
Response 8 finished.
Response 9 finished.
Response 10 finished.
Response 11 finished.
Response 12 finished.
Response 13 finished.
Response 14 finished.
Response 15 finished.
Response 16 finished.
Response 17 finished.
Response 18 finished.
Response 19 finished.
Response 20 finished.
Response 21 finished.
Response 22 finished.
Response 23 finished.
Response 24 finished.
Response 25 finished.
Response 26 finished.
Response 27 finished.
Response 28 finished.
Response 29 finished.
Response 30 finished.
Response 31 finished.
Response 32 finished.
Response 33 finished.
Response 34 finished.
Response 35 finished.
Response 36 finished.
Response 37 finished.
Response 38 finished.
Response 39 finished.
Response 40 finished.
Response 41 finished.
Response 42 finished.
Response 43 finished.
Response 44 finished.
Response 45 finished.
...

### Mean values

##### Persona Responses
- rouge1: 0.1540348498741638
- rouge2: 0.04250547528205004
- rougeL: 0.10385418694215125
- rougeLsum: 0.10519427848454284

##### Instruction Responses
- rouge1: 0.1535807634398431
- rouge2: 0.04051400012115329
- rougeL: 0.10121977013988975
- rougeLsum: 0.10370625810853101

In [41]:
import pandas as pd
import ast

# Load the responses together with their stringified ROUGE scores
df = pd.read_csv('final_persona_and_instruction.csv')


# Function to extract rouge1 value from the dictionary string
def extract_rouge1(rouge_dict_str):
    rouge_dict = ast.literal_eval(rouge_dict_str)
    return rouge_dict.get('rouge1', None)

# Apply the function to the column and convert to numeric
df['rouge'] = df['instruction_rouge'].apply(extract_rouge1)

# Calculate average
average_rouge1 = df['rouge'].mean()
print("Average:", average_rouge1)


Average: 0.1535807634398431
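
The cell above averages only `rouge1` for the instruction column. A sketch for averaging all four keys at once is shown below. It assumes each stored string is a plain dict literal that `ast.literal_eval` can parse, as in the cell above. `mean_rouge` is a hypothetical helper, and the toy values are for illustration only, not taken from the experiment.

```python
import ast
from statistics import mean

ROUGE_KEYS = ("rouge1", "rouge2", "rougeL", "rougeLsum")


def mean_rouge(score_strings):
    """Average each ROUGE key over a list of stringified score dicts."""
    parsed = [ast.literal_eval(s) for s in score_strings]
    return {key: mean(d[key] for d in parsed) for key in ROUGE_KEYS}


# Toy values for illustration only.
toy_scores = [
    "{'rouge1': 0.2, 'rouge2': 0.0, 'rougeL': 0.1, 'rougeLsum': 0.1}",
    "{'rouge1': 0.4, 'rouge2': 0.2, 'rougeL': 0.3, 'rougeLsum': 0.3}",
]
averages = mean_rouge(toy_scores)
print(averages)
```

With the notebook's data, the same function could be applied to `df['persona_rouge']` and `df['instruction_rouge']` to produce both lists of mean values in one pass.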


In [28]:

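# Redefines respond_to_prompt as a pairwise judge: given the persona prompt as its role,
# the model picks which response answers the question better.
# Response 1 is the instruction answer and Response 2 is the persona answer.
def respond_to_prompt(prompt, question, respond_with_persona, respond_with_instruction):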
    
    sysprompt = f"{prompt}. You will be given a question and two responses. Based on your role, pick the response that best answers the question. Only respond with '1' or '2', nothing else."
    
    messages = [
        {"role": "system", "content": sysprompt},
        {"role": "user", "content": f"{question}. Response 1: {respond_with_instruction}. \n Response 2: {respond_with_persona}"}
    ]
    
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=messages,
        temperature=1,
        max_tokens=2
    )
    return response.choices[0].message.content.strip()

In [29]:
df = pd.read_csv('final_persona_and_instruction.csv')

for i, row in df.iterrows():
    question = row['question']
    persona_prompts = row['persona_prompts']
    respond_with_persona = row['respond_with_persona']
    respond_with_instruction = row['respond_with_instruction']
    
    # Call the judge once per row; the function expects the prompt first, then the question.
    verdict = respond_to_prompt(persona_prompts, question, respond_with_persona, respond_with_instruction)
    
    # Note: this overwrites the instruction response with the judge's verdict ('1' or '2').
    df.at[i, 'respond_with_instruction'] = verdict
    
    print(f"Response {i+1} finished.")
    print(verdict)
    
    df.to_csv('LLM_Feedback_GPT4.csv', index=False)

Response 1 finished.
1
Response 2 finished.
1
Response 3 finished.
1
Response 4 finished.
1
Response 5 finished.
1
Response 6 finished.
1
Response 7 finished.
1
Response 8 finished.
1
Response 9 finished.
1
Response 10 finished.
2
Response 11 finished.
1
Response 12 finished.
1
Response 13 finished.
1
Response 14 finished.
1
Response 15 finished.
1
Response 16 finished.
1
Response 17 finished.
1
Response 18 finished.
1
Response 19 finished.
2
Response 20 finished.
2
Response 21 finished.
2
Response 22 finished.


KeyboardInterrupt: 

In [27]:
df = pd.read_csv('LLM_Feedback_GPT4.csv')

value_counts = df['respond_with_instruction'].value_counts()

print(value_counts)

respond_with_instruction
1    745
2    255
Name: count, dtype: int64
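
To read the counts above as preference shares, the verdicts can be mapped back to their conditions. In the judge prompt, Response 1 is the instruction answer and Response 2 is the persona answer. The sketch below rebuilds the verdict list from the printed counts; `LABELS` and `preference_shares` are hypothetical names.

```python
from collections import Counter

# Mapping follows the judge prompt: Response 1 is the instruction answer,
# Response 2 is the persona answer.
LABELS = {"1": "instruction", "2": "persona"}


def preference_shares(verdicts):
    """Return the fraction of valid verdicts won by each condition."""
    counts = Counter(LABELS[v.strip()] for v in verdicts if v.strip() in LABELS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in LABELS.values()}


# Rebuild the verdicts from the counts printed above: 745 for "1", 255 for "2".
verdicts = ["1"] * 745 + ["2"] * 255
shares = preference_shares(verdicts)
print(shares)  # {'instruction': 0.745, 'persona': 0.255}
```

Verdicts other than `1` or `2` are ignored rather than counted, which matters because the judge is sampled at temperature 1 and may occasionally return something else.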
