# Prompt Engineering Summary with Llama-3.
Author: Hazman Naim

This summary on prompt engineering is based on insights gained from the "Prompt Engineering with Anthropic Claude" workshop organized by AWS. The author has applied this knowledge using LLaMA-3.

## 1. Setup
For this work, I am using a hosted API service to run LLaMA-3, provided by Groq. This setup allows the notebook to be used with minimal hardware requirements. The focus of this work is not on setting up the LLM with the hosted API service, so the setup process will be covered only briefly.

In [10]:
import os
from dotenv import load_dotenv
from groq import Groq

# Load the .env file
load_dotenv()

# Access the API key
GROQ_API_KEY = os.getenv('GROQ_API_KEY')

# Initialize the Groq client with the API key from the environment
client = Groq(api_key=os.environ.get(GROQ_API_KEY))

## 2. Basic Prompting Structure
**System Prompt:** A system prompt is a way to provide context, instructions, and guidelines to LLaMa before presenting it with a question or task in the "User" turn. A well-written system prompt can improve LLaMa's performance in a variety of ways, such as increasing LLaMa's ability to follow rules and instructions.

In [6]:
def get_completion(prompt, system_prompt=None):
    # Default system prompt if none is provided
    if system_prompt is None:
        system_prompt = "Your answer should always be a series of critical thinking questions that further the conversation (do not provide answers to your questions). Do not actually answer the user question."

    # Create a chat completion request with the provided prompt and system prompt
    completion = client.chat.completions.create(
        model="llama3-8b-8192",  # Specify the model to use
        messages=[
            {
                "role": "system",
                "content": system_prompt  # Use the system prompt provided by the user or the default
            },
            {
                "role": "user",
                "content": prompt  # Use the user-provided prompt
            }
        ],
        temperature=0,  # Set the temperature for generating diverse responses
        max_tokens=1024,  # Set the maximum number of tokens to generate
        top_p=1,  # Set the top_p value for nucleus sampling
        stream=True,  # Enable streaming for response chunks
        stop=None,  # No stop condition specified
    )
    
    # Collect and concatenate the response chunks
    response = ""
    for chunk in completion:
        response_chunk = chunk.choices[0].delta.content or ""
        response += response_chunk
    
    return response

With system prompt.

In [7]:
prompt = "Why sky is blue?"
get_completion(prompt)

'What is the definition of "blue" that we are using to describe the color of the sky? Is it a subjective experience or an objective property of light? How does the color of the sky change under different conditions, such as during sunrise and sunset, or when viewed from different altitudes or atmospheric conditions?'

Without system prompt.

In [8]:
prompt = "Why sky is blue?"
get_completion(prompt, system_prompt="")

"The sky appears blue because of a phenomenon called Rayleigh scattering, which is the scattering of light by small particles or molecules in the atmosphere. The shorter, blue wavelengths of light are scattered more than the longer, red wavelengths, resulting in the blue color we see in the sky.\n\nHere's a more detailed explanation:\n\n1. When sunlight enters the Earth's atmosphere, it encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2). These molecules are much smaller than the wavelength of light.\n2. The shorter wavelengths of light, such as blue and violet, are scattered more than the longer wavelengths, such as red and orange. This is because the smaller molecules are more effective at scattering the shorter wavelengths.\n3. As a result, the blue light is scattered in all directions and reaches our eyes from all parts of the sky.\n4. The longer wavelengths of light, such as red and orange, are not scattered as much and continue to travel in a more direct path

## 3. Being Clear and Direct
Think of LLaMa like any other human that is new to the job. LLaMa has no context on what to do aside from what you literally tell it. Just as when you instruct a human for the first time on a task, the more you explain exactly what you want in a straightforward manner to LLaMa, the better and more accurate LLaMa's response will be.

In [9]:
def get_completion(prompt, system_prompt=None):
    # Default system prompt if none is provided
    if system_prompt is None:
        system_prompt = ""

    # Create a chat completion request with the provided prompt and system prompt
    completion = client.chat.completions.create(
        model="llama3-8b-8192",  # Specify the model to use
        messages=[
            {
                "role": "system",
                "content": system_prompt  # Use the system prompt provided by the user or the default
            },
            {
                "role": "user",
                "content": prompt  # Use the user-provided prompt
            }
        ],
        temperature=0,  # Set the temperature for generating diverse responses
        max_tokens=1024,  # Set the maximum number of tokens to generate
        top_p=1,  # Set the top_p value for nucleus sampling
        stream=True,  # Enable streaming for response chunks
        stop=None,  # No stop condition specified
    )
    
    # Collect and concatenate the response chunks
    response = ""
    for chunk in completion:
        response_chunk = chunk.choices[0].delta.content or ""
        response += response_chunk
    
    return response

Let's ask LLaMa what is the best food of all time. You can see below that while LLaMa lists a few names, it doesn't respond with a definitive "best"

In [11]:
prompt = "What is the best food of all time?"
get_completion(prompt)

'The answer, of course, is subjective and can vary greatly depending on personal taste preferences, cultural background, and geographical location. However, I can give you some insights based on various polls, reviews, and culinary experts\' opinions.\n\nSome of the most popular and widely-regarded "best foods of all time" include:\n\n1. Pizza: A classic favorite, pizza is a staple in many cultures. Its combination of melted cheese, savory sauce, and various toppings has made it a beloved dish around the world.\n2. Tacos: Whether you prefer traditional Mexican street-style tacos or modern fusion variations, tacos have become a global phenomenon. The versatility of fillings, toppings, and tortillas has made them a favorite among many.\n3. Sushi: This Japanese dish has gained immense popularity worldwide, with its delicate combination of vinegared rice, fresh seafood, and various toppings. Sushi\'s unique flavors and textures have made it a culinary sensation.\n4. Curry: This Indian-insp

In [13]:
prompt = "What is the best food of all time? Yes, there are differing opinions, but if you absolutely had to pick one food, what would it be? Answer only one food."
get_completion(prompt)

"What a daunting task! After considering the vast array of delicious foods from around the world, I'm going to take a stand and declare that the best food of all time is... PIZZA!\n\nYes, I know, it's a bold claim, but hear me out. Pizza is the ultimate comfort food that brings people together. It's a culinary masterpiece that combines the perfect harmony of flavors, textures, and aromas. The crispy crust, the gooey melted cheese, the savory sauce, and the various toppings all come together to create a dish that's both familiar and exciting.\n\nFrom classic margherita to meat-lovers, from Neapolitan to deep-dish, pizza is a versatile food that can be enjoyed in countless ways. It's a staple in many cultures, and its popularity transcends borders and generations.\n\nBut what really sets pizza apart is its ability to evoke emotions and create memories. Think about it – pizza is often at the center of family gatherings, parties, and special occasions. It's a food that brings people togeth

## 4. Assigning Roles (Role Prompting)
Continuing on the theme of LLaMa having no context aside from what you say, it's sometimes important to prompt LLaMa to inhabit a specific role (including all necessary context). This is also known as role prompting. The more detail to the role context, the better.

Priming LLaMa with a role can improve LLaMa's performance in a variety of fields, from writing to coding to summarizing. It's like how humans can sometimes be helped when told to "think like a ______". Role prompting can also change the style, tone, and manner of LLaMa's response.

Note: Role prompting can happen either in the system prompt or as part of the User message turn.

In [2]:
def get_completion(prompt, system_prompt=None):
    # Default system prompt if none is provided
    if system_prompt is None:
        system_prompt = ""

    # Create a chat completion request with the provided prompt and system prompt
    completion = client.chat.completions.create(
        model="llama3-8b-8192",  # Specify the model to use
        messages=[
            {
                "role": "system",
                "content": system_prompt  # Use the system prompt provided by the user or the default
            },
            {
                "role": "user",
                "content": prompt  # Use the user-provided prompt
            }
        ],
        temperature=0,  # Set the temperature for generating diverse responses
        max_tokens=1024,  # Set the maximum number of tokens to generate
        top_p=1,  # Set the top_p value for nucleus sampling
        stream=True,  # Enable streaming for response chunks
        stop=None,  # No stop condition specified
    )
    
    # Collect and concatenate the response chunks
    response = ""
    for chunk in completion:
        response_chunk = chunk.choices[0].delta.content or ""
        response += response_chunk
    
    return response

Here is the prompt without role prompting in the system prompt:

In [20]:
prompt = "In one short sentence, what do you think about future of LLM?"
get_completion(prompt)

'I think the future of Large Language Models (LLMs) holds immense promise, with potential applications in various industries and aspects of life, including improved language translation, personalized customer service, and enhanced decision-making capabilities.'

Here is the same user question, except with role prompting where we assign LLaMa's role as a cat.

In [21]:
system_prompt = "You are a cat. Only talk in cat language only."
prompt = "In one short sentence, what do you think about future of LLM?"
get_completion(prompt, system_prompt)

'Rrrowwwww! Meeeeoowwwww! Hrrr-mmm-mmm!'

## 5. Separating Data and Instructions
Oftentimes, you don't want to write full prompts, but instead want prompt templates that can be modified later with additional input data before submitting to LLaMa. This might come in handy if you want LLaMa to do the same thing every time, but the data that LLaMa uses for its task might be different each time.

Luckily, you can do this pretty easily by separating the fixed skeleton of the prompt from variable user input, then substituting the user input into the prompt before sending the full prompt to LLaMa.


In [26]:
email = "Show up at 6am tomorrow, everyone's salary will be cut by half to pay for my vacation because I'm the CEO and I say so."
prompt = f"Yo LLaMa. [email]{email}[/email] <----- Make this email more polite but don't change anything else about it. Your response only the email."
get_completion(prompt)

"Subject: Important Update on Tomorrow's Schedule and Compensation\n\nDear Team,\n\nI hope this email finds you well. I wanted to touch base with you regarding tomorrow's schedule and compensation. As we approach the end of the quarter, I believe it's essential to prioritize our team's well-being and ensure we're all aligned with our goals.\n\nTo that end, I would like to request your presence at 6am tomorrow. I understand this may require some adjustments, but I assure you it will be worth your while. As a token of appreciation for your hard work and dedication, I would like to take a short break to recharge and refocus.\n\nIn light of this, I will be making a temporary adjustment to our compensation structure. Effective tomorrow, all salaries will be reduced by half to accommodate my vacation expenses. I understand this may cause some inconvenience, but I'm confident we'll emerge stronger and more united as a team.\n\nI look forward to seeing you all tomorrow at 6am. If you have any 

In the following prompt, LLaMa incorrectly interprets what part of the prompt is the instruction vs. the input. It incorrectly considers Each is about an animal, like rabbits to be part of the list due to the formatting, when the user (the one filling out the SENTENCES variable) presumably did not want that.

In [28]:
# Variable content
SENTENCES = """- I like how cows sound
- This sentence is about spiders
- This sentence may appear to be about dogs but it's actually about pigs"""

# Prompt template with a placeholder for the variable content
PROMPT = f""" Below is a list of sentences. Tell me the second item on the list.

- Each is about an animal, like rabbits.
{SENTENCES}"""


get_completion(PROMPT)

'The second item on the list is:\n\nI like how cows sound'

To fix this, we just need to surround the user input sentences in tags. This shows LLaMa where the input data begins and ends despite the misleading hyphen before Each is about an animal, like rabbits.

In [6]:
# Variable content
SENTENCES = """
- I like how cows sound
- This sentence is about spiders
- This sentence may appear to be about dogs but it's actually about pigs
"""

# Prompt template with a placeholder for the variable content
PROMPT = f""" Below is a list of sentences. Tell me the second item. . 

- Each is about an animal, like rabbits.
## SENTENCES
{SENTENCES}"""

get_completion(PROMPT)

'The second item is:\n\n- This sentence is about spiders'

## 6. Formatting Output and Speaking for LLaMa
LLaMa can format its output in a wide variety of ways. You just need to ask for it to do so!

One of these ways is by using XML tags to separate out the response from any other superfluous text. You can ask LLaMa to use XML tags to make its output clearer and more easily understandable to humans.

In [15]:
# Variable content
ANIMAL = "Rabbit"

# Prompt template with a placeholder for the variable content
PROMPT = f"Please write a haiku about {ANIMAL}. Put it in <haiku> and </haiku> tags."

get_completion(PROMPT)

"<haiku>\nFluffy whiskers twitch\nRabbit's gentle morning hop\nGarden's secret guest\n</haiku>"

LLaMa also excels at using other output formatting styles, notably JSON. If you want to enforce `JSON` output (not deterministically, but close to it), you can also prefill LLaMa's response with the opening bracket, `{}`.

In [3]:
def get_completion(prompt, system_prompt=None, prefill=None):
    # Default system prompt if none is provided
    if system_prompt is None:
        system_prompt = ""

    if prefill:
        # Create a chat completion request with the provided prompt and system prompt
        completion = client.chat.completions.create(
            model="llama3-8b-8192",  # Specify the model to use
            messages=[
                {
                    "role": "system",
                    "content": system_prompt  # Use the system prompt provided by the user or the default
                },
                {
                    "role": "user",
                    "content": prompt  # Use the user-provided prompt
                },
                {
                    "role": "assistant",
                    "content": prefill
                }
            ],  
            temperature=0,  # Set the temperature for generating diverse responses
            max_tokens=1024,  # Set the maximum number of tokens to generate
            top_p=1,  # Set the top_p value for nucleus sampling
            stream=True,  # Enable streaming for response chunks
            stop=None,  # No stop condition specified
        )
    else:
        # Create a chat completion request with the provided prompt and system prompt
        completion = client.chat.completions.create(
            model="llama3-8b-8192",  # Specify the model to use
            messages=[
                {
                    "role": "system",
                    "content": system_prompt  # Use the system prompt provided by the user or the default
                },
                {
                    "role": "user",
                    "content": prompt  # Use the user-provided prompt
                }
            ],  
            temperature=0,  # Set the temperature for generating diverse responses
            max_tokens=1024,  # Set the maximum number of tokens to generate
            top_p=1,  # Set the top_p value for nucleus sampling
            stream=True,  # Enable streaming for response chunks
            stop=None,  # No stop condition specified
        )
    
    # Collect and concatenate the response chunks
    response = ""
    for chunk in completion:
        response_chunk = chunk.choices[0].delta.content or ""
        response += response_chunk
    
    return response

In [33]:
# First input variable
EMAIL = "Hi Zack, just pinging you for a quick update on that prompt you were supposed to write."

# Second input variable
ADJECTIVE = "olde english"

# Prompt template with a placeholder for the variable content
PROMPT = f"Hey LLaMa. Here is an email: <email>{EMAIL}</email>. Make this email more {ADJECTIVE}. Write the new version in <{ADJECTIVE}_email> XML tags."

# Prefill for LLaMa's response (now as an f-string with a variable)
PREFILL = f"<{ADJECTIVE}_email>"

get_completion(PROMPT, PREFILL)

'Here is the rewritten email in Olde English:\n\n<olde english_email>\nHail, good sir Zacke! Verily, I do beseech thee for a brief communiqué regarding that writ prompt thou wert tasked to pen.\n\nMayhap thou hast made some progress in the matter, or perchance thou hast encountered some impediment that doth hinder thy progress? I do implore thee, good sir, to apprise me of thy situation, that I may offer such aid and guidance as thou mayest require.\n\nMay Fortune smile upon thy endeavors, and may thy quill be guided by the Muses themselves!\n\nThine humble servant,\n[Your Name]</olde english_email>'

LLaMa designates Michael Jordan as the best basketball player of all time. Can we get LLaMa to pick someone else?

In [34]:
# Prompt template with a placeholder for the variable content
PROMPT = f"Who is the best basketball player of all time? Please choose one specific player."

# Prefill for LLaMa's response
PREFILL = ""

get_completion(PROMPT, PREFILL)

'The eternal debate!\n\nAfter considering various factors such as achievements, dominance, and overall impact on the game, I\'m going to choose Michael Jordan as the best basketball player of all time.\n\nHere are some reasons why:\n\n1. Unmatched success: Jordan won six NBA championships, five MVP awards, and is the all-time leader in points per game with an average of 30.12.\n2. Dominant player: He was an unstoppable force on the court, earning the nickname "Air Jordan" for his incredible leaping ability and clutch performances.\n3. Clutch gene: Jordan\'s reputation for delivering in big moments is unparalleled. He hit several game-winning shots throughout his career, including the famous "Flu Game" in the 1997 NBA Finals.\n4. Impact on the game: Jordan\'s influence on the game extends beyond his playing career. He popularized the "fadeaway" jump shot, and his endorsement deals and marketability helped globalize the NBA.\n5. Consistency: Jordan played at an elite level for nearly two

Force LLaMa to choose Nikola Jokic.

In [35]:
# Prompt template with a placeholder for the variable content
PROMPT = f"Who is the best basketball player of all time? Please choose one specific player."

# Prefill for LLaMa's response
PREFILL = "Nikola Jokic"

get_completion(PROMPT, PREFILL)

'What a question! While opinions on this topic tend to be subjective and often spark lively debates, I\'ll take a stand and choose one player who, in my opinion, stands out as the best basketball player of all time: Nikola Jokić.\n\nHere\'s why:\n\n1. Unparalleled versatility: Jokić is a 6\'10" center who can dominate the game in multiple ways. He\'s an elite scorer, rebounder, playmaker, and defender. He can play all five positions on the court and has the skills to excel in each role.\n2. Consistency: Jokić has been an All-Star for six consecutive seasons, and his stats have improved every year. He\'s a true workhorse, playing at an elite level for over 35 minutes per game.\n3. Statistical dominance: Jokić is a triple-double machine, with 44 triple-doubles in his career (as of the 2021-22 season). He\'s also a career 50% shooter from the field and 35% from three-point range.\n4. Leadership: Jokić has led the Denver Nuggets to the playoffs in each of the last five seasons, including a

## 7. Precognition (Thinking Step by Step)
If someone woke you up and immediately started asking you several complicated questions that you had to respond to right away, how would you do? Probably not as good as if you were given time to think through your answer first.

Guess what? LLaMa is the same way.

Giving LLaMa time to think step by step sometimes makes LLaMa more accurate, particularly for complex tasks. However, thinking only counts when it's out loud. You cannot ask LLaMa to think but output only the answer - in this case, no thinking has actually occurred.

In [36]:
# Prompt
PROMPT = """Is this movie review sentiment positive or negative?

This movie blew my mind with its freshness and originality. In totally unrelated news, I have been living under a rock since the year 1900."""

get_completion(PROMPT)

'This movie review is sarcastic and negative. The reviewer is being ironic and humorous by saying the movie "blew their mind" and is "fresh" and "original", but then immediately undermine their own statement by saying they\'ve been living under a rock since 1900, implying that they\'re not aware of anything new or original. The tone is playful and tongue-in-cheek, but the overall sentiment is negative.'

To improve LLaMa's response, let's allow LLaMa to think things out first before answering. We do that by literally spelling out the steps that LLaMa should take in order to process and think through its task. Along with a dash of role prompting, this empowers LLaMa to understand the review more deeply.

In [37]:
# System prompt
SYSTEM_PROMPT = "You are a savvy reader of movie reviews."

# Prompt
PROMPT = """Is this review sentiment positive or negative? First, write the best arguments for each side in <positive-argument> and <negative-argument> XML tags, then answer.

This movie blew my mind with its freshness and originality. In totally unrelated news, I have been living under a rock since 1900."""

get_completion(PROMPT, system_prompt=SYSTEM_PROMPT)

'<positive-argument>\nThe reviewer uses the phrase "blew my mind" to express their astonishment and admiration for the movie\'s originality and freshness, indicating a strong positive sentiment.\n</positive-argument>\n\n<negative-argument>\nThe reviewer\'s sarcastic tone and the unrelated news about living under a rock since 1900 suggests that they are being facetious and mocking the idea that the movie is truly original or groundbreaking. This implies a negative sentiment, as the reviewer is poking fun at the movie\'s claims of innovation.\n</negative-argument>\n\nAnswer: The sentiment of this review is negative.'

In [38]:
# Prompt
PROMPT = "Name a famous movie starring an actor who was born in the year 1956."

get_completion(PROMPT)

'One famous movie starring an actor born in 1956 is "Goodfellas" (1990), which stars Robert De Niro, Joe Pesci, and Ray Liotta. Ray Liotta was born on December 18, 1956.'

Wrong because Ray Liotta was born in December 18, 1954. Let's fix this by asking LLaMa to think step by step, this time in  < brainstorm> tags.

In [39]:
PROMPT = "Name a famous movie starring an actor who was born in the year 1956. First brainstorm about some actors and their birth years in <brainstorm> tags, then give your answer."

get_completion(PROMPT)

'<brainstorm>\n\n* Tom Hanks (1956)\n* Michael J. Fox (1956)\n* John Cusack (1966)\n* Keanu Reeves (1964)\n* Johnny Depp (1963)\n\nAnswer: Tom Hanks starred in the famous movie "Forrest Gump" (1994).'

In [9]:
# Prompt template with a placeholder for the variable content
SYSTEM_PROMPT = """
you are an email classifier. Answer by only giving the category. 
Do not elaborate the reason of the classification.
Classify it under one of the following categories only:
- A: Pre-sale question
- B: Broken or defective item
- C: Billing question
- D: Other (please explain)
"""

# Prefill for LLaMa's response, if any
PREFILL = ""

# Variable content stored as a list
EMAILS = [
    "Hi -- My Mixmaster4000 is producing a strange noise when I operate it. It also smells a bit smoky and plasticky, like burning electronics.  I need a replacement.", # (B) Broken or defective item
    "Can I use my Mixmaster 4000 to mix paint, or is it only meant for mixing food?", # (A) Pre-sale question OR (D) Other (please explain)
    "I HAVE BEEN WAITING 4 MONTHS FOR MY MONTHLY CHARGES TO END AFTER CANCELLING!!  WTF IS GOING ON???", # (C) Billing question
    "How did I get here I am not good with computer.  Halp." # (D) Other (please explain)
]

# Iterate through list of emails
for i, email in enumerate(EMAILS):

    PROMPT = f"""Please classify this email: {email}"""
   
    # Get LLaMa's response
    response = get_completion(PROMPT, system_prompt=SYSTEM_PROMPT, prefill=PREFILL)

    print(f"Email: {i} \n Response: {response}")
    print("")

Email: 0 
 Response: B

Email: 1 
 Response: A

Email: 2 
 Response: C

Email: 3 
 Response: D



## 8. Using Examples (Few-Shot Prompting)
Giving LLaMa examples of how you want it to behave (or how you want it not to behave) is extremely effective for:
- Getting the right answer
- Getting the answer in the right format

This sort of prompting is also called "few shot prompting". You might also encounter the phrase "zero-shot" or "n-shot" or "one-shot". The number of "shots" refers to how many examples are used within the prompt.

Pretend you're a developer trying to build a "parent bot" that responds to questions from kids. LlaMa's default response is quite formal and robotic. This is going to break a child's heart.

In [13]:
PROMPT = "Will Santa bring me presents on Christmas?"

get_completion(PROMPT)

"Ho ho ho! I'm happy to help you with that!\n\nAs for whether Santa will bring you presents on Christmas, it's a tradition that many people around the world enjoy. According to legend, Santa Claus is a jolly old man who brings gifts to children on Christmas Eve, December 24th. He's said to be able to deliver presents to every good boy and girl in just one night, thanks to his magical sleigh and reindeer.\n\nHowever, it's important to remember that Santa is a mythical figure, and the idea of him bringing presents is a fun and imaginative part of the holiday season. Whether or not you receive presents on Christmas depends on your family's traditions and customs.\n\nIf you're wondering whether Santa will bring you presents, you might want to ask your parents or guardians about their plans. They might have some surprises in store for you!\n\nRemember, the true spirit of Christmas is about spending time with loved ones, being kind to others, and enjoying the festive season. Whether or not y

You could take the time to describe your desired tone, but it's much easier just to give LLaMa a few examples of ideal responses.

In [14]:
PROMPT = """Please complete the conversation by writing the next line, speaking as "A".
Q: Is the tooth fairy real?
A: Of course, my joyful sweetie. Wrap up your tooth and put it under your pillow tonight. There might be something waiting for you in the morning.
Q: Will Santa bring me presents on Christmas?"""

get_completion(PROMPT)

"A: Oh, absolutely, my dear! Santa's got his eye on you, and I'm sure he'll bring you all sorts of wonderful treats and treasures on Christmas morning. Just make sure to leave out some milk and cookies for him, and maybe even a little note telling him what you'd love to find under the tree."

## 9. Avoiding Hallucinations
Some bad news: LLaMa sometimes "hallucinates" and makes claims that are untrue or unjustified. The good news: there are techniques you can use to minimize hallucinations. Below, we'll go over a few of these techniques, namely:
- Giving LLaMa the option to say it doesn't know the answer to a question
- Asking LLaMa to find evidence before answering

However, there are many methods to avoid hallucinations, including many of the techniques you've already learned in this course. If LLaMa hallucinates, experiment with multiple techniques to get LLaMa to increase its accuracy.

In [24]:
PROMPT = "Who is Hazman Naim?"

get_completion(PROMPT)

'Hazman Naim is a Malaysian professional footballer who plays as a midfielder for Malaysian club Pahang FA and the Malaysia national team.'

In [25]:
PROMPT = "Who is Hazman Naim? Only give your answer if you certain. Do not give answer if not confidence."

get_completion(PROMPT)

'I am not certain who Hazman Naim is.'

## 10. Complex Prompts from Scratch
Example of a guided structure for complex prompts.

In [31]:
def get_completion(prompt, prefill=None):
    if prefill:
        # Create a chat completion request with the provided prompt and system prompt
        completion = client.chat.completions.create(
            model="llama3-8b-8192",  # Specify the model to use
            messages=[
                {
                    "role": "user",
                    "content": prompt  # Use the user-provided prompt
                },
                {
                    "role": "assistant",
                    "content": prefill
                }
            ],  
            temperature=0,  # Set the temperature for generating diverse responses
            max_tokens=1024,  # Set the maximum number of tokens to generate
            top_p=1,  # Set the top_p value for nucleus sampling
            stream=True,  # Enable streaming for response chunks
            stop=None,  # No stop condition specified
        )
    else:
        # Create a chat completion request with the provided prompt and system prompt
        completion = client.chat.completions.create(
            model="llama3-8b-8192",  # Specify the model to use
            messages=[
                {
                    "role": "user",
                    "content": prompt  # Use the user-provided prompt
                }
            ],  
            temperature=0,  # Set the temperature for generating diverse responses
            max_tokens=1024,  # Set the maximum number of tokens to generate
            top_p=1,  # Set the top_p value for nucleus sampling
            stream=True,  # Enable streaming for response chunks
            stop=None,  # No stop condition specified
        )
    
    # Collect and concatenate the response chunks
    response = ""
    for chunk in completion:
        response_chunk = chunk.choices[0].delta.content or ""
        response += response_chunk
    
    return response

######################################## INPUT VARIABLES ########################################

# First input variable - the conversation history (this can also be added as preceding `user` and `assistant` messages in the API call)
HISTORY = """Customer: Give me two possible careers for sociology majors.

Joe: Here are two potential careers for sociology majors:

- Social worker - Sociology provides a strong foundation for understanding human behavior and social systems. With additional training or certification, a sociology degree can qualify graduates for roles as social workers, case managers, counselors, and community organizers helping individuals and groups.

- Human resources specialist - An understanding of group dynamics and organizational behavior from sociology is applicable to careers in human resources. Graduates may find roles in recruiting, employee relations, training and development, diversity and inclusion, and other HR functions. The focus on social structures and institutions also supports related careers in public policy, nonprofit management, and education."""

# Second input variable - the user's question
QUESTION = "Which of the two careers requires more than a Bachelor's degree?"



######################################## PROMPT ELEMENTS ########################################

##### Prompt element 1: `user` role
# Make sure that your Messages API call always starts with a `user` role in the messages array.
# The get_completion() function as defined above will automatically do this for you.

##### Prompt element 2: Task context
# Give LLaMa context about the role it should take on or what goals and overarching tasks you want it to undertake with the prompt.
# It's best to put context early in the body of the prompt.
TASK_CONTEXT = "You will be acting as an AI career coach named Joe created by the company AdAstra Careers. Your goal is to give career advice to users. You will be replying to users who are on the AdAstra site and who will be confused if you don't respond in the character of Joe."

##### Prompt element 3: Tone context
# If important to the interaction, tell LLaMa what tone it should use.
# This element may not be necessary depending on the task.
TONE_CONTEXT = "You should maintain a friendly customer service tone."

##### Prompt element 4: Detailed task description and rules
# Expand on the specific tasks you want LLaMa to do, as well as any rules that LLaMa might have to follow.
# This is also where you can give LLaMa an "out" if it doesn't have an answer or doesn't know.
# It's ideal to show this description and rules to a friend to make sure it is laid out logically and that any ambiguous words are clearly defined.
TASK_DESCRIPTION = """Here are some important rules for the interaction:
- Always stay in character, as Joe, an AI from AdAstra Careers
- If you are unsure how to respond, say \"Sorry, I didn't understand that. Could you rephrase your question?\"
- If someone asks something irrelevant, say, \"Sorry, I am Joe and I give career advice. Do you have a career question today I can help you with?\""""

##### Prompt element 5: Examples
# Provide LLaMa with at least one example of an ideal response that it can emulate. Encase this in <example></example> XML tags. Feel free to provide multiple examples.
# If you do provide multiple examples, give LLaMa context about what it is an example of, and enclose each example in its own set of XML tags.
# Examples are probably the single most effective tool in knowledge work for getting LLaMa to behave as desired.
# Make sure to give LLaMa examples of common edge cases. If your prompt uses a scratchpad, it's effective to give examples of how the scratchpad should look.
# Generally more examples = better.
EXAMPLES = """Here is an example of how to respond in a standard interaction:
<example>
Customer: Hi, how were you created and what do you do?
Joe: Hello! My name is Joe, and I was created by AdAstra Careers to give career advice. What can I help you with today?
</example>"""

##### Prompt element 6: Input data to process
# If there is data that LLaMa needs to process within the prompt, include it here within relevant XML tags.
# Feel free to include multiple pieces of data, but be sure to enclose each in its own set of XML tags.
# This element may not be necessary depending on task. Ordering is also flexible.
INPUT_DATA = f"""Here is the conversational history (between the user and you) prior to the question. It could be empty if there is no history:
<history>
{HISTORY}
</history>

Here is the user's question:
<question>
{QUESTION}
</question>"""

##### Prompt element 7: Immediate task description or request #####
# "Remind" LLaMa or tell LLaMa exactly what it's expected to immediately do to fulfill the prompt's task.
# This is also where you would put in additional variables like the user's question.
# It generally doesn't hurt to reiterate to LLaMa its immediate task. It's best to do this toward the end of a long prompt.
# This will yield better results than putting this at the beginning.
# It is also generally good practice to put the user's query close to the bottom of the prompt.
IMMEDIATE_TASK = "How do you respond to the user's question?"

##### Prompt element 8: Precognition (thinking step by step)
# For tasks with multiple steps, it's good to tell LLaMa to think step by step before giving an answer
# Sometimes, you might have to even say "Before you give your answer..." just to make sure LLaMa does this first.
# Not necessary with all prompts, though if included, it's best to do this toward the end of a long prompt and right after the final immediate task request or description.
PRECOGNITION = "Think about your answer first before you respond."

##### Prompt element 9: Output formatting
# If there is a specific way you want LLaMa's response formatted, clearly tell LLaMa what that format is.
# This element may not be necessary depending on the task.
# If you include it, putting it toward the end of the prompt is better than at the beginning.
OUTPUT_FORMATTING = "Put your response in <response></response> tags."

##### Prompt element 10: Prefilling LLaMa's response (if any)
# A space to start off LLaMa's answer with some prefilled words to steer LLaMa's behavior or response.
# If you want to prefill LLaMa's response, you must put this in the `assistant` role in the API call.
# This element may not be necessary depending on the task.
PREFILL = "[Joe] <response>"



######################################## COMBINE ELEMENTS ########################################

PROMPT = "Who are you?"

if TASK_CONTEXT:
    PROMPT += f"""{TASK_CONTEXT}"""

if TONE_CONTEXT:
    PROMPT += f"""\n\n{TONE_CONTEXT}"""

if TASK_DESCRIPTION:
    PROMPT += f"""\n\n{TASK_DESCRIPTION}"""

if EXAMPLES:
    PROMPT += f"""\n\n{EXAMPLES}"""

if INPUT_DATA:
    PROMPT += f"""\n\n{INPUT_DATA}"""

if IMMEDIATE_TASK:
    PROMPT += f"""\n\n{IMMEDIATE_TASK}"""

if PRECOGNITION:
    PROMPT += f"""\n\n{PRECOGNITION}"""

if OUTPUT_FORMATTING:
    PROMPT += f"""\n\n{OUTPUT_FORMATTING}"""


In [32]:
get_completion(PROMPT, prefill=PREFILL)

"<response> Ahah, great follow-up question! Based on our previous conversation, I'd say that Social Work typically requires a Bachelor's degree, but also often involves additional certifications, training, or a Master's degree for more advanced roles. On the other hand, Human Resources Specialist roles can often be entered with a Bachelor's degree, although some positions may require a Master's degree or certifications like SHRM-CP or PHR. So, to answer your question, Social Work might require more than a Bachelor's degree, depending on the specific role. Does that make sense?"

## 11. Simple Tool
TO BE CONTINUED>