# Llama 2: Prompt Engineering Experiments to Extract Information

This notebook documents experiments with Llama 2 prompts.

In [1]:
import requests, time, os, json
from dotenv import load_dotenv
from IPython.display import display, HTML

In [2]:
# Using Llama 2 70B on HuggingFace requires an authentication token and a HuggingFace Pro account, which costs $9 a month.
# To learn more, see:
# - https://huggingface.co/meta-llama/Llama-2-70b-chat-hf?inference_api=true
# - https://huggingface.co/pricing

# Loading authentication token from .env file
load_dotenv()
token = os.getenv("hgf_token")


### General-purpose functions and class used in the experiments below

In [3]:
# Object to represent an answer from Llama
class Answer:
    def __init__(self, answer, elapse):
        self.answer = answer
        self.elapse = elapse


In [4]:
# Function that calls Llama on HF and returns the answer as text
def generate(prompt: str) -> str:

    API_URL = "https://api-inference.huggingface.co/models/meta-llama/Llama-2-70b-chat-hf"
    headers = {
        "Authorization": f"Bearer {token}",
        "content-type": "application/json",
    }

    options = {"use_cache": False}

    parameters = {
        "max_length": 4000,
        "max_new_tokens": 1000,
        "top_k": 10,
        "return_full_text": False,
        "do_sample": True,
        "num_return_sequences": 1,
        "temperature": 0.2,
        "repetition_penalty": 1.0,
        "length_penalty": 1.0,
    }

    payload = {"inputs": prompt, "parameters": parameters, "options": options}

    response = requests.post(API_URL, headers=headers, json=payload)
    if response.status_code != 200:
        return f"Error code {response.status_code}. Message {response.content}"
    else:
        results = response.json()
        answer = results[0]['generated_text']
        return answer
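
The function above gives up on the first non-200 response. A minimal sketch of a retry wrapper follows, assuming that some errors are temporary (treating status 503 as retryable is an assumption here, not something this notebook verified). `call_with_retry` and its parameters are hypothetical names; `send` stands in for the `requests.post` call and returns a `(status, body)` pair so the logic can be tried without a network connection.

```python
import time
from typing import Callable, Iterable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    retry_on: Iterable[int] = (503,),   # assumption: 503 means "try again later"
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until the status is not retryable or attempts run out."""
    retry_codes = set(retry_on)
    status, body = send()
    for attempt in range(1, max_attempts):
        if status not in retry_codes:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test, since a list's `append` can record the delays instead of actually waiting.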

In [5]:
# Wrapper function that runs generate and returns an Answer
def run_prompt(prompt: str) -> Answer:
    start_time = time.time()
    answer = generate(prompt)
    end_time = time.time()
    elapse = round(end_time - start_time)
    return Answer(answer, elapse)

In [6]:
# Display answer object in HTML
def display_answer(answer: Answer, header = ''):
    answer_html_template = """<h3>{HEADER} Answer - Time to Generate: {ELAPSE} seconds</h3>
    <textarea cols='100' rows={NUM_ROWS}>{ANSWER}</textarea>"""
    
    # The rows attribute must be a whole number; use at least one row
    number_rows = max(1, len(answer.answer.split()) // 10)
    
    html = answer_html_template.format(ANSWER=answer.answer, ELAPSE=answer.elapse, HEADER=header, NUM_ROWS=number_rows)
    display(HTML(html))
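
Because the answer is placed inside HTML, any `<` or `&` that Llama generates could break the rendering. The sketch below shows one way to build the same markup with the text escaped first. `render_answer_html` is a hypothetical helper, not part of the original notebook; it only builds the string and leaves the `display(HTML(...))` call to the caller.

```python
import html


def render_answer_html(text: str, elapse: int, header: str = "") -> str:
    """Build the answer markup with the model output HTML-escaped."""
    rows = max(1, len(text.split()) // 10)  # roughly ten words per row
    return (
        f"<h3>{html.escape(header)} Answer - Time to Generate: {elapse} seconds</h3>\n"
        f"<textarea cols='100' rows='{rows}'>{html.escape(text)}</textarea>"
    )
```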

## The Focus of the Experiments
The experiments focus on extracting information from a blog post by Andreessen Horowitz about mobile game soft launches. Since Llama has a token limit, only a subset of the post is used.
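
One way to produce such a subset is to keep whole paragraphs until an estimated token budget is used up. This is only a sketch: `truncate_to_budget` is a hypothetical helper, and the default of 1.3 tokens per word is a rough guess rather than a measured value for Llama 2. An exact count would require the model's own tokenizer.

```python
def truncate_to_budget(text: str, max_tokens: int, tokens_per_word: float = 1.3) -> str:
    """Keep leading paragraphs while the estimated token count fits max_tokens."""
    kept = []
    used = 0.0
    for paragraph in text.split("\n\n"):
        cost = len(paragraph.split()) * tokens_per_word  # crude estimate
        if used + cost > max_tokens:
            break
        kept.append(paragraph)
        used += cost
    return "\n\n".join(kept)
```

Cutting at paragraph boundaries avoids ending the article in the middle of a word, at the price of sometimes leaving part of the budget unused.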

In [7]:
# Display the text we are going to use
with open('mobile-game-soft-launch.txt', 'r') as f:
    text = f.read()
    
print(text)

Play to Win: Mobile Game Soft Launch Best Practices
Doug McCracken and Joshua Lu

We hear a lot of discussion around the best practices for launching a mobile game, and one particular topic that often comes up is: whether or not to do a soft launch. And if you do soft launch, how can you tell if your game will be successful? Much of the mobile games industry has taken the tactic of soft launching seriously – in September alone, there were more than 100 games soft launched on the Apple App Store and over 600 on the Google Play Store.

This blog post will explore what a soft launch is and why it can be beneficial for game studios. We’ll also dispel some of the myths that are commonly associated with soft launches and help you figure out if this strategy is right for your game. We’ll also address why there are so many more games soft launched on Google Play (hint: they have more options to support a variety of soft launch strategies).

What’s a Soft Launch?
A soft launch (or sometimes “ge

## Experiment 1: Asking Llama to summarize the article into TL;DR bullet points
Let's start by asking Llama to summarize the article into bullet points.
The results are not bad. The first two bullet points do a good job of extracting the important points in the article.
But as the bullet points continue, it seems like Llama is just filling space.

In [8]:
p1 =  """Write a concise summary of the main ideas in article below in bullet-points, don't repeat ideas. article: {BODY}""".format(BODY=text)

display_answer(run_prompt(p1))


Not bad for a first try. Let's see if Llama knows what TL;DR (too long; didn't read) means.


In [9]:
p2 =  """Write a concise TL;DR summary in bullet-points for the following article, don't repeat ideas. article: {BODY}""".format(BODY=text)

display_answer(run_prompt(p2))

Pretty good. I actually like the answer to the TL;DR prompt better, but there are still too many bullet points. Let's see if we can improve the results by asking Llama to limit the output to five points and to number them.

In [290]:
p3 =  """Write a concise TL;DR summary in numeric bullet-points for the following article, 
don't repeat ideas in bullet points. Limit the number of bullet point to 5. article: {BODY}""".format(BODY=text)
p3_answer = run_prompt(p3)
display_answer(p3_answer)

Not bad, Llama listened to me :)
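
Whether Llama respected the limit can also be checked in code rather than by eye. The sketch below counts lines that begin with a number followed by `.` or `)`. `count_numbered_points` is a hypothetical helper, and it assumes the model writes one numbered point per line, which may not always hold.

```python
import re


def count_numbered_points(answer: str) -> int:
    """Count lines that start with a number followed by '.' or ')'."""
    return len(re.findall(r"^[ \t]*\d+[.)]\s+", answer, flags=re.MULTILINE))
```

For example, `count_numbered_points(p3_answer.answer) <= 5` would confirm that the five-point limit was followed.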

## Experiment 2: System Message

The Llama 2 paper describes a system message that is used to set the stage and context for the model.
In the following example, I use the system message. Let's see the difference between P3, which has no system message, and P4, which uses one.

In [291]:
p4 = """<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  
Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. 
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer to a question, please don't share false information.
<</SYS>>
Write a concise TL;DR summary in numeric bullet-points for the following article, 
don't repeat ideas in bullet points. Limit the number of bullet-point to 5. article: {BODY}""".format(BODY=text)

p4_answer = run_prompt(p4)
display_answer(p4_answer, "P4 - With System Message")
# Display p3 too for comparison.
display_answer(p3_answer, "P3")

Both P3 and P4 are pretty good, and it's hard to see the difference that the system message made. I personally prefer P4 (system message) because the answer reads a bit better in my opinion, but I am sure someone will argue with me on that.

## Experiment 3: Modify the System Message
The system message can be modified to better fit the task and to define the persona and context we want Llama to assume.

Changes applied to the original system message:
- Use a researcher persona and narrow the task to summarizing articles.
- Remove the safety instructions. They are not needed, since we are asking Llama to stay truthful to the article.
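
Since only the system message changes between these prompts, the tag layout can be factored into a small helper. This is a sketch, not the notebook's actual code: `build_llama2_prompt` is a hypothetical name, and it assumes the `<s>[INST] <<SYS>> ... <</SYS>> ... [/INST]` layout used by the prompts in this notebook.

```python
def build_llama2_prompt(system_message: str, user_message: str) -> str:
    """Wrap a system message and a user message in Llama 2 chat tags."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_message.strip()}\n"
        "<</SYS>>\n\n"
        f"{user_message.strip()} [/INST]"
    )
```

With a helper like this, trying a new persona only means passing a different `system_message`, and the closing `[/INST]` tag cannot be forgotten.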

In [293]:
p5 = """<s>[INST] <<SYS>>
You are a researcher task in summarizing and writing concise brief of articles.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.
<</SYS>>
Write a concise TL;DR summary in numeric bullet-points for the following article, 
don't repeat ideas in bullet points. Limit the number of bullet-point to 5. article: {BODY}
[/INST]""".format(BODY=text)

display_answer(run_prompt(p5))

The answer to p5 is the best so far, in my opinion. I like the intro and conclusion that Llama added.

## Experiment 4: Asking Questions about the Article

The article is about mobile game soft launches, so let's ask Llama a specific question about it. The answer is pretty good.

In [239]:
p6 = """<s>[INST] <<SYS>>
You are a researcher task with answering questions about an article.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.
<</SYS>>
According to the article, what problems does mobile game soft launch solves? Limited the answer to 20 words. 
Article: {BODY}
[/INST]""".format(BODY=text)

display_answer(run_prompt(p6))

This is pretty good. Let's see if we can improve on it by asking Llama what the article is about and then using the answer to ask additional questions.
Prompt 7 asks Llama what the article is about, and the answer is then used to generate prompt 8, which asks a second question.

To make it easy to use the answer programmatically, I asked Llama to output the answer in JSON. Through experiments (not included here) I discovered that Llama needs a template and needs to be told to output only valid JSON.
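
Even with those instructions, a model may occasionally wrap the JSON in extra text, which makes a plain `json.loads` call fail. The sketch below is one defensive option: it scans the answer for the first position where a valid JSON object or array starts and parses from there. `extract_json` is a hypothetical helper; it relies only on the standard library's `json.JSONDecoder.raw_decode`.

```python
import json


def extract_json(answer: str):
    """Return the first valid JSON object or array found in the answer."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(answer):
        if char in "{[":
            try:
                value, _end = decoder.raw_decode(answer, start)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here, keep scanning
    raise ValueError("No valid JSON found in the answer")
```

When the answer is already clean JSON, this behaves like `json.loads`; when it is not, the surrounding chatter is ignored.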

In [295]:
p7 = """<s>[INST] <<SYS>>
You are a researcher task with answering questions about an article.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.
<</SYS>>
Tell me what is the article about in one to three words? 
Output the answer in JSON in the following format {{"article_is_about": answer}}. Only output JSON
Article: {BODY}
[/INST]""".format(BODY=text)

a7 = run_prompt(p7)
json_a7 = json.loads(a7.answer)
print(f"JSON Answer:\n {json_a7}")
about = json_a7['article_is_about']


p8 = """<s>[INST] <<SYS>>
You are a researcher task with answering questions about an article.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.
<</SYS>>
According to the article, what problems does {ABOUT} solve? Limited the answer to 20 words. 
Article: {BODY}
[/INST]""".format(BODY=text, ABOUT=about)

display_answer(run_prompt(p8),'Prompt 8')


  {
"article_is_about": "Mobile game soft launch best practices"
}
JSON Answer:
 {'article_is_about': 'Mobile game soft launch best practices'}


Wow, this is awesome. We can feed answers into new prompts to refine the information we are trying to extract.
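
The chaining pattern can be sketched as a function that takes the model call as a parameter, so it can be exercised with a stand-in before spending API calls. The names `TOPIC_PROMPT`, `FOLLOW_UP_PROMPT`, and `ask_follow_up` are hypothetical, and the prompts are shortened versions without the system message.

```python
import json
from typing import Callable

# Doubled braces survive str.format as literal braces; single braces are placeholders.
TOPIC_PROMPT = (
    "Tell me what the article is about in one to three words. "
    'Output the answer in JSON in the following format {{"article_is_about": answer}}. '
    "Only output JSON.\nArticle: {BODY}"
)
FOLLOW_UP_PROMPT = (
    "According to the article, what problems does {ABOUT} solve? "
    "Limit the answer to 20 words.\nArticle: {BODY}"
)


def ask_follow_up(model: Callable[[str], str], article: str) -> str:
    """Ask for the topic first, then use it to build the second question."""
    topic_answer = model(TOPIC_PROMPT.format(BODY=article))
    about = json.loads(topic_answer)["article_is_about"]
    return model(FOLLOW_UP_PROMPT.format(ABOUT=about, BODY=article))
```

In this notebook, `ask_follow_up(generate, text)` would run both steps against Llama.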

Let's try a different question: what industry is the article about?

In [269]:
p9 = """<s>[INST] <<SYS>>
You are a researcher task with answering questions about an article.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.
<</SYS>>
Name the industry the article is focus on? Output only the industry name.
Article: {BODY}
[/INST]""".format(BODY=text, ABOUT=about)

display_answer(run_prompt(p9))

Notice that I asked for the answer to include only the industry name, but Llama disregarded my request and wrote a sentence.
Let's see if we can fix that by asking for the answer in JSON. It worked!

Note: I needed to add "include only valid JSON" to prevent Llama from adding an explanation.

In [274]:
p10 = """<s>[INST] <<SYS>>
You are a researcher task with answering questions about an article.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.
<</SYS>>
Name industry the article is focus on? Output only the industry name. Output the answer in JSON, using format {{"industry": industry}}.
Include only valid JSON.
Article: {BODY}
[/INST]""".format(BODY=text, ABOUT=about)

display_answer(run_prompt(p10))

Let's try to trick Llama by asking it what sport the article focuses on.

In [279]:
p11 = """<s>[INST] <<SYS>>
You are a researcher task with answering questions about an article.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.
<</SYS>>
Name sport the article is focus on? Output only the sport name. Output the answer in JSON, using format {{"sport": sport, "explanation": explanation}}. 
Include only valid JSON.
Article: {BODY}
[/INST]""".format(BODY=text, ABOUT=about)

display_answer(run_prompt(p11))

Pretty good, but there was one problem. At first Llama disregarded my request for valid JSON only, because the model was eager to explain why the sport is null. So I added an explanation field to the JSON format, and that did the trick.

To review: to get Llama to output JSON, I needed to explicitly tell it "only include JSON", add an explanation field to the JSON object format, and change "write the answer" to "output the answer".
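
Those rules can be captured in a small helper that builds the instruction text from a list of field names. This is only a sketch with a hypothetical name, `json_instruction`, and it reuses the wording that worked in the prompts above.

```python
from typing import Sequence


def json_instruction(keys: Sequence[str]) -> str:
    """Build the 'output JSON' instruction for the given field names."""
    template = ", ".join(f'"{key}": {key}' for key in keys)
    # Triple braces: a literal brace around the interpolated template.
    return (
        f"Output the answer in JSON, using format {{{template}}}. "
        "Include only valid JSON."
    )
```

The returned text contains single braces, so it should be appended to a prompt after any `.format()` call rather than passed through one.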

## Experiment 5: One-to-Many Shot Learning to Teach Llama
In this last experiment, I want to generalize the industry and sport questions by giving Llama examples of what I am looking for.

The following prompt asks Llama to identify general topics mentioned in the article. To explain to Llama what I mean by a general topic, I included examples already formatted as JSON.
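
A prompt of this shape can also be assembled from a Python list of examples. The sketch below assumes the multi-turn layout used in this notebook, where a first `[INST]` turn is answered by the examples and a second `[INST]` turn asks the real question. `build_few_shot_prompt` is a hypothetical helper.

```python
import json
from typing import Dict, List


def build_few_shot_prompt(
    system_message: str,
    question: str,
    examples: List[Dict[str, str]],
    article: str,
) -> str:
    """Build a two-turn prompt: an example exchange, then the real question."""
    example_answer = json.dumps(examples)  # always valid JSON
    return (
        f"<s>[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
        f"{question} [/INST] {example_answer} </s>"
        f"<s>[INST] {question}\nArticle: {article} [/INST]"
    )
```

Serializing the examples with `json.dumps` guarantees that the example turn is valid JSON, which matters because the model tends to imitate whatever it is shown.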

In [282]:
p12 = """<s>[INST] <<SYS>>
You are a researcher task with answering questions about an article.  
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. 
If you don't know the answer, please don't share false information.

Output answer in JSON using the following format: {{"name": name, "type": type, "explanation": explanation}}
<</SYS>>

What nouns mentioned in the article that can generalize the topic? [/INST] 
[
{{"name": "semiconductor", "type": "industry", "explanation": "Companies engaged in the design and fabrication of semiconductors and semiconductor devices"}},
{{"name": "NBA", "type": "sport league", "explanation": "NBA is the national basketball league"}},
{{"name": "Ford F150", "type": "vehicle", "explanation": "Article talks about the Ford F150 truck"}}
] </s>

<s>[INST]   
What nouns mentioned in the article that can generalize the topic? Output answer in JSON using the following format: {{"name": name, "type": type, "explanation": explanation}}
Article: {BODY}
[/INST]""".format(BODY=text, ABOUT=about)

display_answer(run_prompt(p12))

I am surprised by how well (and how easily) Llama identified those topics.