<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>

# Lab 8.5 - Prompting Large Language Models

In this lab we will practise prompting with a few Large Language Models (LLMs) using Groq (not to be confused with Grok). Groq is a platform that provides access to their custom-built AI hardware via APIs, allowing users to run open-source models such as Llama.

We shall see that while LLMs are powerful tools, how you ask a question or frame a task can dramatically influence the results obtained.

## Set-up

Step 1: Sign up for a free Groq account at https://console.groq.com/home .

Step 2: Create a new API key at https://console.groq.com/keys. Copy-paste it into an empty text file called 'groq_key.txt'.

Running the next cell will then read in this key and assign it to the variable `groq_key`.

In [20]:
groqfilename = r'C:\Users\pabarca\OneDrive - GRUPO GRANSOLAR\Desktop\IOD - Python\DATA\groq.txt' # this file contains a single line containing your Groq API key only
try:
    with open(groqfilename, 'r') as f:
        groq_key = f.read().strip()
except FileNotFoundError:
    print("'%s' file not found" % groqfilename)
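As an alternative to a hard-coded, user-specific path, the sketch below looks for the key in an environment variable first and then in a one-line text file. The helper `load_groq_key`, the variable name `GROQ_API_KEY` and the default file name are assumptions made for illustration; they are not part of the lab's original code.

```python
import os
from pathlib import Path


def load_groq_key(path="groq_key.txt", env_var="GROQ_API_KEY"):
    """Return the API key from an environment variable or a one-line file.

    Hypothetical helper for this lab; returns None if neither source exists.
    """
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    key_file = Path(path)
    if key_file.is_file():
        return key_file.read_text().strip()
    print(f"'{path}' file not found and {env_var} is not set")
    return None


groq_key = load_groq_key()
```

Returning `None` rather than raising keeps the behaviour close to the cell above, which only prints a message when the file is missing.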

In [3]:
!pip install groq

Collecting groq
  Downloading groq-0.30.0-py3-none-any.whl.metadata (16 kB)
Downloading groq-0.30.0-py3-none-any.whl (131 kB)
Installing collected packages: groq
Successfully installed groq-0.30.0


In [21]:
from groq import Groq
import requests
import pandas as pd
from IPython.display import Markdown

First create an instance of the Groq client:

In [22]:
client = Groq(api_key=groq_key)

The following code shows which models are currently accessible through Groq. `context_window` is the maximum number of tokens (prompt plus generated output) that the model can handle in a single request, and `max_completion_tokens` is the maximum number of tokens that can be generated in an output.

In [28]:
print(response.json())  # `response` comes from the requests.get call in a later cell; run that cell first

{'object': 'list', 'data': [{'id': 'compound-beta', 'object': 'model', 'created': 1740880017, 'owned_by': 'Groq', 'active': True, 'context_window': 131072, 'public_apps': None, 'max_completion_tokens': 8192}, {'id': 'meta-llama/llama-guard-4-12b', 'object': 'model', 'created': 1746743847, 'owned_by': 'Meta', 'active': True, 'context_window': 131072, 'public_apps': None, 'max_completion_tokens': 1024}, {'id': 'playai-tts', 'object': 'model', 'created': 1740682771, 'owned_by': 'PlayAI', 'active': True, 'context_window': 8192, 'public_apps': None, 'max_completion_tokens': 8192}, {'id': 'mistral-saba-24b', 'object': 'model', 'created': 1739996492, 'owned_by': 'Mistral AI', 'active': True, 'context_window': 32768, 'public_apps': None, 'max_completion_tokens': 32768}, {'id': 'whisper-large-v3-turbo', 'object': 'model', 'created': 1728413088, 'owned_by': 'OpenAI', 'active': True, 'context_window': 448, 'public_apps': None, 'max_completion_tokens': 448}, {'id': 'compound-beta-mini', 'object': 

In [27]:
response_json = response.json()
if 'data' in response_json:
    df = pd.DataFrame(response_json['data']).sort_values(['created'], ascending=False)
    print(df)
else:
    print("Key 'data' not found in response:", response_json)

                                               id object     created  \
13            meta-llama/llama-prompt-guard-2-86m  model  1748632165   
7             meta-llama/llama-prompt-guard-2-22m  model  1748632101   
19                                 qwen/qwen3-32b  model  1748396646   
1                    meta-llama/llama-guard-4-12b  model  1746743847   
6   meta-llama/llama-4-maverick-17b-128e-instruct  model  1743877158   
11      meta-llama/llama-4-scout-17b-16e-instruct  model  1743874824   
5                              compound-beta-mini  model  1742953279   
21                                   qwen-qwq-32b  model  1741214760   
0                                   compound-beta  model  1740880017   
10                              playai-tts-arabic  model  1740682783   
2                                      playai-tts  model  1740682771   
3                                mistral-saba-24b  model  1739996492   
14                  deepseek-r1-distill-llama-70b  model  173792

In [26]:
url = "https://api.groq.com/openai/v1/models"

headers = {
    "Authorization": f"Bearer {groq_key}",
    "Content-Type": "application/json"
}

response = requests.get(url, headers=headers)

pd.DataFrame(response.json()['data']).sort_values(['created'], ascending=False)

| | id | object | created | owned_by | active | context_window | public_apps | max_completion_tokens |
|---|---|---|---|---|---|---|---|---|
| 13 | meta-llama/llama-prompt-guard-2-86m | model | 1748632165 | Meta | True | 512 | | 512 |
| 7 | meta-llama/llama-prompt-guard-2-22m | model | 1748632101 | Meta | True | 512 | | 512 |
| 19 | qwen/qwen3-32b | model | 1748396646 | Alibaba Cloud | True | 131072 | | 40960 |
| 1 | meta-llama/llama-guard-4-12b | model | 1746743847 | Meta | True | 131072 | | 1024 |
| 6 | meta-llama/llama-4-maverick-17b-128e-instruct | model | 1743877158 | Meta | True | 131072 | | 8192 |
| 11 | meta-llama/llama-4-scout-17b-16e-instruct | model | 1743874824 | Meta | True | 131072 | | 8192 |
| 5 | compound-beta-mini | model | 1742953279 | Groq | True | 131072 | | 8192 |
| 21 | qwen-qwq-32b | model | 1741214760 | Alibaba Cloud | True | 131072 | | 131072 |
| 0 | compound-beta | model | 1740880017 | Groq | True | 131072 | | 8192 |
| 10 | playai-tts-arabic | model | 1740682783 | PlayAI | True | 8192 | | 8192 |
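The `created` column is easier to read as a date. The sketch below assumes the values are seconds since the Unix epoch (the listing does not say so explicitly) and uses three records copied from the output above.

```python
from datetime import datetime, timezone

# Three records copied from the listing above (other fields omitted).
records = [
    {"id": "compound-beta", "created": 1740880017},
    {"id": "qwen/qwen3-32b", "created": 1748396646},
    {"id": "playai-tts-arabic", "created": 1740682783},
]


def created_date(record):
    """Interpret `created` as seconds since the Unix epoch (an assumption)."""
    return datetime.fromtimestamp(record["created"], tz=timezone.utc).date()


# Sort newest first, as the DataFrame above does with sort_values.
newest_first = sorted(records, key=lambda r: r["created"], reverse=True)
for r in newest_first:
    print(r["id"], created_date(r).isoformat())
```

With the DataFrame itself, `pd.to_datetime(df['created'], unit='s')` performs the same conversion on the whole column.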


The Groq client object wraps the Groq REST API, and a chat completion request is made via the `client.chat.completions.create` method.

The most important arguments of the client.chat.completions.create method are the following:
* messages: a list of messages (dictionary form) that make up the conversation to date
* model: a string indicating which model to use (see [list of models](https://console.groq.com/docs/models))
* max_completion_tokens: the maximum number of tokens that are generated in the chat completion
* response_format: setting this to `{ "type": "json_object" }` enables JSON output
* seed: an integer used to make sampling as deterministic as possible, though identical outputs each time are not guaranteed
* temperature: a value between 0 and 2, where higher values like 0.8 make the output more random (creative) and lower values like 0.2 make it more focused and deterministic
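The arguments listed above can be gathered into a dictionary and passed to the method with `**`. The helper `build_request` below is a hypothetical convenience for this lab, not part of the Groq library, and the example prompt is made up.

```python
def build_request(prompt, model, system=None, **options):
    """Assemble keyword arguments for client.chat.completions.create.

    Hypothetical helper: `options` may hold any of the arguments listed
    above, e.g. temperature, seed, max_completion_tokens, response_format.
    """
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {"messages": messages, "model": model, **options}


request = build_request(
    "List three prime numbers as JSON.",
    model="llama-3.1-8b-instant",
    temperature=0.2,
    seed=21,
    max_completion_tokens=100,
    response_format={"type": "json_object"},
)
print(request)
# With a configured client: client.chat.completions.create(**request)
```

Keeping the request as plain data makes it easy to rerun the same prompt with one setting changed.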


In [29]:
help(client.chat.completions.create)

Help on method create in module groq.resources.chat.completions:

create(*, messages: 'Iterable[ChatCompletionMessageParam]', model: "Union[str, Literal['gemma2-9b-it', 'llama-3.3-70b-versatile', 'llama-3.1-8b-instant', 'llama-guard-3-8b', 'llama3-70b-8192', 'llama3-8b-8192']]", exclude_domains: 'Optional[List[str]] | NotGiven' = NOT_GIVEN, frequency_penalty: 'Optional[float] | NotGiven' = NOT_GIVEN, function_call: 'Optional[completion_create_params.FunctionCall] | NotGiven' = NOT_GIVEN, functions: 'Optional[Iterable[completion_create_params.Function]] | NotGiven' = NOT_GIVEN, include_domains: 'Optional[List[str]] | NotGiven' = NOT_GIVEN, logit_bias: 'Optional[Dict[str, int]] | NotGiven' = NOT_GIVEN, logprobs: 'Optional[bool] | NotGiven' = NOT_GIVEN, max_completion_tokens: 'Optional[int] | NotGiven' = NOT_GIVEN, max_tokens: 'Optional[int] | NotGiven' = NOT_GIVEN, metadata: 'Optional[Dict[str, str]] | NotGiven' = NOT_GIVEN, n: 'Optional[int] | NotGiven' = NOT_GIVEN, parallel_tool_calls:

As a first example, note how the messages input is given as a list of dictionaries with `role` and `content` keys. This role-based chat format (similar to ChatML) is recognised by many LLM APIs.

In [30]:
chat_completion = client.chat.completions.create(
    messages=[
        {   "role": "system", # sets the persona of the model
            "content": "You are a helpful assistant."
        },
        {
            "role": "user", # what the user wants the assistant to do
            "content": "Explain briefly how large language models work",
        }
    ],
    model="llama-3.3-70b-versatile",
)

print(chat_completion.choices[0].message.content)

Large language models are artificial intelligence (AI) systems that process and generate human-like language. Here's a simplified overview of how they work:

1. **Training**: These models are trained on massive amounts of text data, such as books, articles, and websites.
2. **Pattern recognition**: The models learn to recognize patterns in language, including grammar, syntax, and semantics.
3. **Neural networks**: The models use neural networks, which are complex algorithms that mimic the human brain, to analyze and generate text.
4. **Predictive modeling**: When given a prompt or input, the model predicts the next word or character, based on the patterns it has learned.
5. **Generation**: The model generates text by iteratively predicting the next word or character, creating a coherent and context-specific output.

This process allows large language models to understand and generate human-like language, enabling applications such as language translation, text summarization, and conver

The output is in Markdown format, so the following line renders it as formatted text.

In [31]:
Markdown(chat_completion.choices[0].message.content)

Large language models are artificial intelligence (AI) systems that process and generate human-like language. Here's a simplified overview of how they work:

1. **Training**: These models are trained on massive amounts of text data, such as books, articles, and websites.
2. **Pattern recognition**: The models learn to recognize patterns in language, including grammar, syntax, and semantics.
3. **Neural networks**: The models use neural networks, which are complex algorithms that mimic the human brain, to analyze and generate text.
4. **Predictive modeling**: When given a prompt or input, the model predicts the next word or character, based on the patterns it has learned.
5. **Generation**: The model generates text by iteratively predicting the next word or character, creating a coherent and context-specific output.

This process allows large language models to understand and generate human-like language, enabling applications such as language translation, text summarization, and conversational AI.

## Text summarisation

We start with llama3-8b-8192, a model with just over 8 billion parameters; the 8192 in its name refers to its context window of 8,192 tokens.

Here is an article to be summarised from the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset:

In [32]:
story = """
SAN FRANCISCO, California (CNN) -- A magnitude 4.2 earthquake shook the San Francisco area Friday at 4:42 a.m. PT (7:42 a.m. ET), the U.S. Geological Survey reported. The quake left about 2,000 customers without power, said David Eisenhower, a spokesman for Pacific Gas and Light. Under the USGS classification, a magnitude 4.2 earthquake is considered "light," which it says usually causes minimal damage. "We had quite a spike in calls, mostly calls of inquiry, none of any injury, none of any damage that was reported," said Capt. Al Casciato of the San Francisco police. "It was fairly mild." Watch police describe concerned calls immediately after the quake » . The quake was centered about two miles east-northeast of Oakland, at a depth of 3.6 miles, the USGS said. Oakland is just east of San Francisco, across San Francisco Bay. An Oakland police dispatcher told CNN the quake set off alarms at people's homes. The shaking lasted about 50 seconds, said CNN meteorologist Chad Myers. According to the USGS, magnitude 4.2 quakes are felt indoors and may break dishes and windows and overturn unstable objects. Pendulum clocks may stop.
"""

**Exercise:**
Summarise the story text using the following three prompts. Use the format given above, but here there is no need to set the persona (i.e. include only one dictionary in the messages list when calling `client.chat.completions.create`). Comment on any differences.

1) "Summarise the following article in 3 sentences."

2) "Give me a TL;DR of this text."

3) "What's the key takeaway here?"

In [33]:
prompts = ["Summarise the following article in 3 sentences. ", "Give me a TL;DR of this text. ", "What's the key takeaway here?"]
#content will be p + story for p in prompts

# ANSWER
for p in prompts:
    response = client.chat.completions.create(
        model="llama3-8b-8192",
        messages=[{"role": "user", "content": p + story}],
    )

    print(p, '\n', response.choices[0].message.content)

Summarise the following article in 3 sentences.  
 A magnitude 4.2 earthquake struck the San Francisco area on Friday at 4:42 a.m. PT, causing about 2,000 customers to lose power. The earthquake, which is considered "light" according to the USGS, was felt for about 50 seconds and caused no reported injuries or damage. Authorities received a surge of calls, mostly from concerned residents, but there were no reports of damage or injury, with the quake being described as "fairly mild".
Give me a TL;DR of this text.  
 A magnitude 4.2 earthquake shook the San Francisco area at 4:42am, causing about 2,000 power outages but no reported injuries or damage. The quake was considered "light" and lasted about 50 seconds, with people in the area reporting minor concerns and no damage.
What's the key takeaway here? 
 The key takeaway is that a magnitude 4.2 earthquake struck the San Francisco area, causing minimal damage and no reported injuries, with about 2,000 customers losing power.


Run the above code again below and note that the answers may differ. This is due to the probabilistic nature of LLM token generation.

In [34]:
# ANSWER
for p in prompts:
    response = client.chat.completions.create(
        model="llama3-8b-8192",
        messages=[{"role": "user", "content": p + story}],
    )

    print(p, '\n', response.choices[0].message.content)

Summarise the following article in 3 sentences.  
 Here is a 3 sentence summary of the article:

A magnitude 4.2 earthquake struck the San Francisco area at 4:42am PT on Friday, causing approximately 2,000 customers to lose power. The USGS classified the earthquake as "light" and officials reported no major damage or injuries, with people mostly calling to inquire about the quake. The earthquake was felt for about 50 seconds and was centered about two miles east-northeast of Oakland, with people in the area reporting that it caused some minor disruptions such as setting off home alarms.
Give me a TL;DR of this text.  
 A magnitude 4.2 earthquake occurred in the San Francisco area on Friday at 4:42am, causing minimal damage and no reported injuries. The quake was centered near Oakland and left about 2,000 customers without power. The shaking lasted for 50 seconds and was felt indoors, causing some minor disturbances such as breaking dishes and windows, but no major damage was reported.
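The run-to-run variation can be imitated with a toy sampler. This is only a sketch: the candidate tokens and their probabilities are made up, and the `seed` here belongs to Python's `random` module, not to the Groq API, although it plays a similar role.

```python
import random

# Made-up next-token candidates and probabilities, for illustration only.
tokens = ["struck", "shook", "hit", "rattled"]
probs = [0.4, 0.3, 0.2, 0.1]


def sample_tokens(n, seed=None):
    """Draw n tokens at random according to `probs`."""
    rng = random.Random(seed)
    return rng.choices(tokens, weights=probs, k=n)


print(sample_tokens(5))           # may change from run to run
print(sample_tokens(5, seed=21))  # repeats for a fixed seed
print(sample_tokens(5, seed=21))
```

Each draw is random, so two unseeded runs can pick different words even though the underlying probabilities are identical.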


## Text completion

**Exercise**: In this section adjust the `max_completion_tokens` and `temperature` settings below to obtain different responses. Show some examples with the prompt "Continue the story: It was a great time to be alive" with the model "llama-3.1-8b-instant".

* max_completion_tokens - the maximum number of tokens to generate. Note that longer words are made of multiple tokens (set to 200 and 500)
* temperature (positive number) - the higher the number the more random (creative) the output (set to 0.2, 0.8, 2)

In [35]:
# ANSWER (set max_completion_tokens=200, do not have a temperature setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive"}],
    max_completion_tokens=200,
)

print(response.choices[0].message.content)

It was a great time to be alive, with the warm sun on your skin, the sound of laughter and music floating through the air, and the feeling of possibility and adventure stretching out before you like a tantalizing promise. 

For Emily, it was a summer like no other. She had just turned 20, and after graduating from college, she had spent the past few months working odd jobs and saving up to embark on a grand adventure. She had always dreamed of traveling the world, of experiencing different cultures, trying new foods, and making new friends.

And now, at last, the time had come. Emily packed a small bag, grabbed her backpack, and set off for the airport, where she would meet up with her best friend, Sarah, and begin their journey together.

They were bound for Europe, a continent that seemed to hold a special allure for Emily. She had always been fascinated by the history, the art, and the architecture of the old world, and she couldn


In [36]:
# ANSWER (set max_completion_tokens=500, do not have a temperature setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive"}],
    max_completion_tokens=500,
)

print(response.choices[0].message.content)

It was a great time to be alive, the year was 1969 and the world was buzzing with excitement. The Vietnam War was raging on, but for 20-year-old Sarah, the music, fashion, and culture of the time were a respite from the chaos. She had just left her small town in the Midwest and moved to New York City, ready to chase her dreams and experience the world.

As she walked down the streets of Greenwich Village, she was surrounded by the vibrant sounds of Jimi Hendrix and Janis Joplin blaring from the bars and clubs. The air was thick with the smell of incense and patchouli, and the streets were lined with hippies and artists, all pushing the boundaries of fashion and art.

Sarah had recently landed a job at a trendy boutique on Bleecker Street, where she sold everything from flowy maxi dresses to handmade jewelry. The store was a hub for the local counterculture, and Sarah was quickly becoming a part of it. She had traded in her conservative dress code for a more carefree, flower-child attit

In [38]:
# ANSWER (set temperature = 0.2, do not have a max_completion_tokens setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive."}],
    temperature = 0.2,
)

print(response.choices[0].message.content)

The sun was shining bright, casting a warm glow over the bustling streets of the city. People of all ages and backgrounds walked side by side, each with their own unique story to tell. The air was filled with the sweet scent of blooming flowers and the sound of laughter and music drifted through the air.

It was a great time to be alive, indeed. The world was on the cusp of a new era of peace and prosperity, and everyone could feel it. The wars that had ravaged the planet for so long were finally coming to an end, and the people were beginning to rebuild and reconnect.

As I walked through the city, I couldn't help but feel a sense of hope and optimism. Everywhere I looked, I saw people working together, supporting each other, and striving for a better future. It was a truly inspiring sight, and it filled my heart with joy and gratitude.

I stopped at a small café to grab a cup of coffee and take in the sights and sounds of the city. As I sat at a small table outside, I struck up a con

In [39]:
# ANSWER (set temperature = 1, do not have a max_completion_tokens setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive."}],
    temperature = 1,
)

print(response.choices[0].message.content)

The sun was shining brightly, casting a warm glow over the vibrant city streets. People of all ages and backgrounds hurried about, each with their own unique purpose and story to tell. The air was alive with the hum of music, laughter, and the distant chirping of birds.

As I walked through the bustling streets, I couldn't help but feel a sense of excitement and anticipation. It was a great time to be alive, and I was grateful to be a part of it. Everywhere I looked, there were people coming together, celebrating life, love, and the simple joys of existence.

I passed by a group of artists gathered on a sidewalk, painting vibrant murals on the walls of a building. Their brushes danced across the canvas, creating a kaleidoscope of colors that seemed to pulse with the rhythm of the city. I watched in awe as they worked, mesmerized by the energy and creativity that seemed to emanate from every brushstroke.

As I continued on my way, I stumbled upon a street performer who was playing a liv

Note what happens when the temperature is set too high!

In [40]:
# ANSWER (set temperature = 2, do not have a max_completion_tokens setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive."}],
    temperature = 2,
)

print(response.choices[0].message.content)

Music was everywhere, from the boomboxes blaring in cars on the Sunset-strip in LA to the clubs in London pumping out the latest hits from Duran Duran and Wham!. The smell of freshly poured asphalt and cigarette smoke filled the air as people reveled in the carefree spirit of 1985.

You were 25 at the time, with a job in the city as an emerging marketing executive, working closely with one of the largest record labels. Your evenings were spent frequenting rooftop parties above some of the city's hottest clubs and dancing till daylight to the sound of Michael Jackson pouring music out of the speakers.

A chance introduction had led you back to college with your old flame, now in its second season working for her fashion degree after a summer job at Bloomingdale's department store downtown, had made you realize love and work were not conflicting forces in  1985 – you could easily be seen having a blast together in New Mexico as you had on the Fourth July that same summer the night they o
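One way to see why a high temperature produces rambling text is the standard temperature-scaled softmax, sketched below. The scores are made up, and it is an assumption that the API applies temperature in this textbook form.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for four candidate next tokens.
logits = [4.0, 2.0, 1.0, 0.5]
for t in (0.2, 1, 2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability sits on the top-scoring token, while at a high temperature it spreads towards unlikely tokens, which are then sampled more often.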

### Zero-shot and one-shot prompting for question-answering

This section shows the impact of prompting on the response. Zero-shot prompting means we provide the prompt without any examples or additional context. Let us first ask the llama-3.1-8b-instant model a question in this zero-shot way.

In [41]:
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "How do two chemicals react?"}],
    temperature = 0.8,
)

Markdown(response.choices[0].message.content)

Chemical reactions involve the interaction of two or more substances to form new substances. To understand how two chemicals react, we need to consider several factors:

1. **Chemical properties**: Each chemical has its unique properties, such as reactivity, electronegativity, and polarity. These properties determine how the chemical will interact with other substances.
2. **Reaction type**: Chemical reactions can be classified into several types, including:
	* **Synthesis reaction**: Combination of two or more substances to form a new compound.
	* **Decomposition reaction**: Breakdown of a single compound into two or more simpler substances.
	* **Single displacement reaction**: One element displaces another element from a compound.
	* **Double displacement reaction**: Two compounds exchange partners to form two new compounds.
	* **Combustion reaction**: Reaction of a substance with oxygen to produce heat and light.
3. **Reaction conditions**: The conditions under which the reaction occurs can affect the outcome. These conditions include:
	* **Temperature**: Higher temperatures can increase the rate of reaction.
	* **Pressure**: Increased pressure can increase the rate of reaction.
	* **Catalyst**: Presence of a catalyst can speed up the reaction.
	* **Solvent**: The properties of the solvent can affect the reaction.
4. **Chemical equations**: Chemical equations describe the reactants, products, and conditions of the reaction. A balanced equation is a chemical equation that shows the same number of atoms of each element on both the reactant and product sides.

To predict how two chemicals will react, you can:

1. **Research the properties** of the chemicals involved.
2. **Consult chemical databases** or reference materials for information on potential reactions.
3. **Conduct experiments** to test the reaction under controlled conditions.
4. **Use theoretical models** to predict the reaction outcomes.

Some common tools used to predict chemical reactions include:

1. **Molecular orbital theory**: This theory helps predict the reactivity of molecules based on their electron configuration.
2. **Valence bond theory**: This theory describes the formation of chemical bonds between atoms.
3. **Group theory**: This theory helps predict the symmetry of molecules and their reactivity.
4. **Quantum mechanics**: This branch of physics helps predict the behavior of atoms and molecules at the molecular level.

Keep in mind that predicting chemical reactions is a complex task, and the accuracy of predictions depends on the level of detail and the tools used.

**Exercise:** Ask the same question but modify the prompt to return the answer to the same question in a simpler form (still using the llama-3.1-8b-instant model). Experiment with different prompts.

In [42]:
# ANSWER
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Answer the following question as though I am 10 years old. How do two chemicals react?"}],
    temperature = 0.8,
)

Markdown(response.choices[0].message.content)

Imagine you have two friends, let's call them Bob and Alice. Bob loves to dance, and Alice loves to sing. When they are together in a big room, they both want to do their favorite thing. But, the room is too quiet, so they can't dance or sing as much as they want.

Now, let's say we add some special music to the room. The music makes it loud and fun, and Bob and Alice both start to dance and sing together. But, they don't just dance and sing together, they also start to create something new. They create a new kind of music that's a mix of both of their favorite things.

In the same way, when two chemicals (like molecules) meet, they want to do their special job. But, they need some kind of special help, like the music in the room. This special help is called energy. When the chemicals get energy, they start to react. They mix together and create something new, just like Bob and Alice created that new kind of music.

This new thing that the chemicals create is called a product. It's like the song that Bob and Alice made together. And just like how Bob and Alice can't just stop singing and dancing in the middle of the song, the chemicals can't just stop reacting once they've started. They keep going until they've finished making the product.

### One-shot prompting

Next, note the dramatic change when we set a new system role and provide one worked example: an English question followed by its French translation.

In [43]:
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "system",
             "content": "You translate English to French."},
              {"role": "user",
               "content": "What time is it?"},
               {"role": "assistant",
               "content": "Quelle heure est-il?"},
              {"role": "user",
               "content": "How do two chemicals react?"}],
    temperature = 0.8,
)
print(response.choices[0].message.content)

Comment réagissent deux composés chimiques ?

(Note: This is a general question, and the answer would depend on the specific chemicals involved. If you provide more context or information about the chemicals, I can give a more detailed and accurate answer.)


### Few-shot prompting

Recall that LLMs generate text one token at a time, so their outputs often need steering towards a desired style or format. This is where examples can help.

In [44]:
prompt1 = "I'm gonna head out now, see you later."
response1 = "I will be leaving now. See you later."

prompt2 =  "That movie was super cool!"
response2 = "The movie was very impressive."

prompt3 = "Can't make it to the meeting, sorry."


response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are a professional editor. Rewrite casual sentences into a formal tone."},
        {"role": "user", "content": prompt1},
        {"role": "assistant", "content": response1},
        {"role": "user", "content": prompt2},
        {"role": "assistant", "content": response2},
        {"role": "user", "content": prompt3},
    ]
)

print(response.choices[0].message.content.strip())


Regrettably, I will be unable to attend the meeting.


Few-shot examples can also steer the model towards producing SQL.

In [45]:
prompt1 = "Show me all users who signed up in the last 30 days."
response1 = "SELECT * FROM users WHERE signup_date >= CURRENT_DATE - INTERVAL '30 days';"

prompt2 = "What is the average order value?"
response2 =  "SELECT AVG(order_total) FROM orders;"

prompt3 = "List products that are out of stock."

response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are an assistant that translates natural language to SQL."},
        {"role": "user", "content": prompt1},
        {"role": "assistant", "content": response1},
        {"role": "user", "content": prompt2},
        {"role": "assistant", "content": response2},
        {"role": "user", "content": prompt3},
    ]
)

print(response.choices[0].message.content.strip())


SELECT * FROM products WHERE quantity_in_stock = 0;
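Both few-shot cells above follow the same pattern: a system message, alternating user and assistant examples, then the real question. A minimal sketch of that pattern is given below; `few_shot_messages` is a hypothetical helper, not a Groq function.

```python
def few_shot_messages(system, examples, query):
    """Build a messages list from (user, assistant) example pairs.

    Hypothetical helper: each example becomes a user turn followed by the
    assistant turn we would like the model to imitate.
    """
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages


messages = few_shot_messages(
    "You are an assistant that translates natural language to SQL.",
    [
        ("What is the average order value?", "SELECT AVG(order_total) FROM orders;"),
    ],
    "List products that are out of stock.",
)
for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting list can be passed as the `messages` argument. Building it in one place also guards against giving an example answer the wrong role.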


**Exercise**: Create a few examples to guide the "llama3-70b-8192" LLM (via few-shot prompting) to take in user content in the form below and provide output as a pandas dataframe. Use the `exec` function to execute its output and display the answer for a sample input as a data frame.

Example:

given the user content

"""

| col1 | col2 | col3

| 32 | 27 | 25

| 64 | 23 | 14

"""

the model should output

df = pd.DataFrame({'col1': [32, 64], 'col2': [27, 23], 'col3': [25, 14]})



In [46]:
#ANSWER

user1 = """col1 | col2 | col3
32 | 27 | 25
64 | 23 | 14
"""

output1 = """
df = pd.DataFrame({'col1': [32, 64], 'col2': [27, 23], 'col3': [25, 14]})
"""

user2 = """col1 | col2
23 | 12
8 | 76
7 | 5
"""
output2 = """
df = pd.DataFrame({'col1': [23, 8, 7], 'col2': [12, 76, 5]})
"""
user3 = """colA | colB | colC
23 | 12 | 54
8 | 76 | 32
7 | 5 | 3
"""


response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are a data scientist who will receive data input as a string and provide output as a pandas dataframe called df. Use the examples to guide you"},
        {"role": "user", "content": user1},
        {"role": "assistant", "content": output1},
        {"role": "user", "content": user2},
        {"role": "assistant", "content": output2},
        {"role": "user", "content": user3}
    ]
)

exec(response.choices[0].message.content.strip()) # string executed as Python code
df

| | colA | colB | colC |
|---|---|---|---|
| 0 | 23 | 12 | 54 |
| 1 | 8 | 76 | 32 |
| 2 | 7 | 5 | 3 |
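Because `exec` runs whatever text the model returns, it can be worth checking the resulting `df` against a deterministic parse of the same input. The sketch below uses a hypothetical helper, `parse_pipe_table`, and assumes a header row, integer cells and `|` separators as in the sample inputs.

```python
def parse_pipe_table(text):
    """Parse 'a | b' style rows into a dict of column lists.

    Hypothetical helper; assumes a header row, integer cells and '|'
    separators, as in the sample inputs above.
    """
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
        if line.strip()
    ]
    header, body = rows[0], rows[1:]
    return {name: [int(row[i]) for row in body] for i, name in enumerate(header)}


sample = """colA | colB | colC
23 | 12 | 54
8 | 76 | 32
7 | 5 | 3
"""
expected = parse_pipe_table(sample)
print(expected)
# After exec(...) has created df, one possible check is:
# assert df.to_dict(orient="list") == expected
```

A mismatch would suggest that the model altered or dropped values when writing its `pd.DataFrame(...)` line.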


Also show what happens when the question is asked in the absence of a system role and without few-shot prompting.

In [47]:
# ANSWER
response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "user", "content": user3}
    ]
)
response.choices[0].message.content.strip()

"It looks like you've provided a table with three columns: `colA`, `colB`, and `colC`. Here's a formatted version of the table:\n\n| colA | colB | colC |\n| --- | --- | --- |\n| 23  | 12  | 54  |\n| 8   | 76  | 32  |\n| 7   | 5   | 3   |\n\nLet me know if you'd like me to perform any operations on this table or answer any questions about it!"

### Chain-of-thought prompting

The results of question-answering can also be improved by prompting the LLM to provide intermediate steps.

**Exercise**: Using the following prompts, compare the answers of the "llama3-8b-8192" model (set seed=21). (If this model is no longer available choose a model with relatively few parameters.)

zero_shot_prompt = "How many s's are in the word 'success'?"

chain_of_thought_prompt = "How many s's are in the word 'success'? Explain your answer step by step by going through each letter in turn."

In [50]:
# ANSWER
zero_shot_prompt = "How many s's are in the word 'success'?"
chain_of_thought_prompt = "How many s's are in the word 'success'? Explain your answer step by step by going through each letter in turn."

response1 = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": zero_shot_prompt}],
    seed=21
)

response2 = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": chain_of_thought_prompt}],
    seed=21
)

print('------zero-shot-prompt------')
print(response1.choices[0].message.content)

print('------chain-of-thought------')
print(response2.choices[0].message.content)

------zero-shot-prompt------
There are 2 s's in the word 'success'.
------chain-of-thought------
To count the number of 's's in the word "success", we'll go through each letter one by one:

1. S
   There's one 's' in the first position.

2. U
   Nothing else to add, there's no 's' with this 'u'.

3. C
   There's no 's' with this 'c' either.

4. C
   Still not an 's'.

5. E
   The word doesn't end at 'e' yet, so there's no 's' here.

6. S
   Now we have a second 's' in the sixth position.

7. S
   This is the third 's' in the seventh position of the word "success".

The total number of 's's in the word "success" is 3.
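
The two prompts produce different answers, so it helps to check them against a ground truth computed in plain Python. The short check below is an addition for verification, not part of the original lab answer. It walks through the word letter by letter, as the chain-of-thought prompt asks the model to do, and confirms that the chain-of-thought answer of 3 is the correct one.

```python
word = "success"
target = "s"

# 1-based positions of every occurrence of the target letter
positions = [i for i, letter in enumerate(word, start=1) if letter == target]

print(positions)       # [1, 6, 7]
print(len(positions))  # 3

# Cross-check against the built-in string method
assert len(positions) == word.count(target)
```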


## Comparison of LLMs

**Exercise**: Compare the performance of two LLMs by collecting their answers to the following questions in a dataframe.

    "Tell me a joke about data science.",
    "How can one calculate 22 * 13 mentally?",
    "Write a creative story about a baby learning to crawl.",

Column headings:

Model Name | Question | Answer

In [51]:


# ANSWER
pd.set_option('display.max_colwidth', None)  # show full cell text instead of truncating long answers
models = ["gemma2-9b-it", "llama-3.1-8b-instant"]  # edit this list to compare other models

prompts = [
    "Tell me a joke about data science.",
    "How can one calculate 22 * 13 mentally?",
    "Write a creative story about a baby learning to crawl.",
]

results = {'Model Name': [], 'Question': [], 'Answer': []}

for model in models:
    for prompt in prompts:
        results['Model Name'].append(model)
        results['Question'].append(prompt)
        try:
            output = client.chat.completions.create(model=model, messages=[{"role": "user", "content": prompt}])
            results['Answer'].append(output.choices[0].message.content.strip())
        except Exception as e:
            print(f"Error with {model}: {e}")
            results['Answer'].append("ERROR")  # keep the Answer column as plain strings


df = pd.DataFrame(results)
df

Unnamed: 0,Model Name,Question,Answer
0,gemma2-9b-it,Tell me a joke about data science.,Why did the data scientist get lost in the woods?\n\nBecause they couldn't find the decision tree! 🌳😭 \n\n\nLet me know if you'd like to hear another one! 😄
1,gemma2-9b-it,How can one calculate 22 * 13 mentally?,"Here's a way to calculate 22 * 13 mentally using a combination of doubling and simplifying:\n\n1. **Break down 13:** Think of 13 as (10 + 3).\n\n2. **Multiply by 22:** Now you have: \n * 22 * (10 + 3) \n\n3. **Distribute:**\n * 22 * 10 + 22 * 3\n\n4. **Calculate:**\n * 220 + 66\n\n5. **Add:**\n * 286\n\n\nTherefore, 22 * 13 = 286"
2,gemma2-9b-it,Write a creative story about a baby learning to crawl.,"Bartholomew Buttons, a cherubic baby with a tuft of messy brown hair and eyes like melted chocolate, lay on his chubby tummy, focused on a bright red toy truck inches away. He yanked, his tiny fist grasping at the brightly colored plastic, but it remained tantalisingly out of reach.\n\nFor weeks, Bartholomew had been observing the world from his back or on his tummy, mesmerized by everything from the dust motes dancing in the sunlight to the rumbling gut of the big, friendly dog named Dusty who lumbered around the house. But seeing the truck, his determination to reach it burned like the fire in the fireplace he couldn't quite reach.\n\nHe arched his back, the effort pulling at his belly and his chubby legs stretched out like a frog. He knew what his sister, Clara, did to get things – those long, wavy slithers she called “crawling.” He'd watched her, mimicking the movements with his short, stubby limbs until he was dizzy. \n\nToday, he felt different. A spark of the unknown, the almost impossible, flickered within him. He took a deep, rasping breath, his little chest rising and falling like a bellows, and pushed.\n\nHis tummy muscles strained, his arms flailed, and then… a tiny inch. \n\nHe stopped, shocked by the movement, the world tilting ever so slightly. But then a grin, ear to ear, spread across his face. He realized what he’d done! He had moved!\n\nHe pushed again, this time a little further, then another push, and another. He was going! It was slow, clumsy, and his legs wobbled like newborn fawns, but he was moving towards the truck, his eyes fixed on its red glory.\n\nSuddenly, the rug bunched under his knee.\n\nBartholomew toppled forward, landing on his face with a soft “thump.” For a moment, he lay there, stunned. Then, his bottom lip quivered, a wave of tears threatening to spill. \n\nBut then, a chuckle came from behind him, a warm, familiar rumble. 
He felt himself being lifted, and a pair of brown hands wiped his damp cheek. Dusty, the big, friendly dog, licked his face, his tail thumping against the floor. \n\n\nBartholomew giggled, forgetting his frustrations. He was going to crawl. He knew it. He took a deep breath, mustered his chubby muscles, and pushed, achieving another tiny inch. The journey would be long, the falls plentiful, but Bartholomew Buttons was determined. The world, with its countless red trucks, awaited. He was ready."
3,llama-3.1-8b-instant,Tell me a joke about data science.,Why did the data scientist quit his job? \n\nBecause he didn't get the right correlation.
4,llama-3.1-8b-instant,How can one calculate 22 * 13 mentally?,"To calculate 22 * 13 mentally, you can use the following method:\n\n1. Break down 22 into 20 + 2.\n2. Break down 13 into 10 + 3.\n3. Multiply the numbers using the distributive property:\n (20 * 10) + (20 * 3) + (2 * 10) + (2 * 3)\n4. Calculate each product:\n (200) + (60) + (20) + (6)\n5. Add the results together:\n (200) + (60) + (20) + (6) = 286\n\nSo, 22 * 13 equals 286 mentally."
5,llama-3.1-8b-instant,Write a creative story about a baby learning to crawl.,"**The Great Escape**\n\nIn a cozy little nursery, surrounded by soft toys and colorful walls, a wiggly bundle of joy was waiting to break free. Luna, a tiny baby with bright brown eyes and chubby cheeks, had been watching her siblings crawl for what felt like an eternity. It was time for her own adventure.\n\nLuna's mom, Emma, noticed the spark in her baby's eyes and said, ""Today's the day, little one. Get up and crawl!"" She placed a soft toy just out of reach, enticing Luna to take a step closer.\n\nAt first, Luna's attempts were more like flailing than crawling. Her chubby legs waved in the air, like a pair of oversized, pink wings. She giggled and cooed, completely unaware of her clumsy movements. Emma chuckled and encouraged her, ""Come on, Luna! You can do it!""\n\nUndeterred, Luna kept trying. She pushed off with her hands and made a tiny lurch forward, only to tumble onto her tummy. Her face scrunched up in concentration, she wriggled and squirmed, her little body shaking with effort. Emma cheered her on, ""You're so close, Luna! Keep going!""\n\nThen, something miraculous happened. Luna's arms suddenly found a rhythm, pumping back and forth like a tiny little engine. Her legs followed suit, kicking and pushing with newfound strength. To everyone's delight, Luna began to move – slowly at first, but surely – across the floor.\n\nAs she crawled, Luna's face lit up with excitement. She discovered a world of textures and sensations: the softness of the rug, the roughness of the carpet, and the gentle give of a stuffed animal under her fingers. Her squeals of delight echoed through the nursery, a symphony of joy.\n\nEmma beamed with pride, snapping pictures and capturing every precious moment of this milestone. Her husband, Tom, joined in the celebration, playing a rendition of ""Happy Birthday"" on his guitar. 
The nursery was filled with music, movement, and laughter.\n\nLuna's crawling escapades became a daily routine. She zoomed across the room, leaving a trail of toys and pillows in her wake. Her family watched, awestruck by her progress and her boundless energy.\n\nAs the days passed, Luna grew more confident and adventurous. She crawled over couch cushions, through tunnels and into hideaways. Her world expanded, and with it, her sense of wonder.\n\nOne sunny afternoon, as Luna crawled across the floor, she reached out with a chubby hand and grasped the soft toy Emma had placed earlier. With a triumphant cry, she pulled herself closer, nestling the toy against her chest. Emma swept her up in a tight hug, exclaiming, ""You did it, little one! You're crawling, and you're unstoppable!""\n\nLuna basked in the praise and adoration, basking in the glow of a newfound confidence. She knew she was on her way to conquering the world – one crawl at a time."
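
The comparison loop can be tried out without spending API quota by separating the looping logic from the call to the model. The sketch below is one way to do this, under the assumption that the model call is wrapped in a function taking `(model, prompt)` and returning a string. `collect_answers` and `fake_ask` are hypothetical names introduced for illustration, and the model names in the usage lines are placeholders rather than real Groq models.

```python
from typing import Callable

def collect_answers(ask: Callable[[str, str], str], models: list, prompts: list) -> dict:
    """Ask every model every prompt; record 'ERROR' when a call fails."""
    results = {"Model Name": [], "Question": [], "Answer": []}
    for model in models:
        for prompt in prompts:
            try:
                answer = ask(model, prompt).strip()
            except Exception as e:
                print(f"Error with {model}: {e}")
                answer = "ERROR"
            # Append all three fields together so the columns stay the same length
            results["Model Name"].append(model)
            results["Question"].append(prompt)
            results["Answer"].append(answer)
    return results

def fake_ask(model: str, prompt: str) -> str:
    """Stand-in for a real API call (hypothetical, for offline testing only)."""
    if model == "broken-model":
        raise RuntimeError("model not available")
    return f"  [{model}] answer to: {prompt}  "

results = collect_answers(fake_ask, ["model-a", "broken-model"], ["Q1", "Q2"])
print(results["Answer"])
```

With the real client, `ask` could be a small function that calls `client.chat.completions.create` and returns `choices[0].message.content`; the resulting dictionary can be passed to `pd.DataFrame` as before.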


### Bonus

See if you can prompt an LLM to perform sentiment analysis (output 'Positive' or 'Negative' only) on a given piece of text.

In [53]:
# ANSWER
input1 = "I absolutely loved the way the story unfolded."
output1 = "Positive"

input2 = "The food was cold and completely flavorless."
output2 = "Negative"

input3 = "She handled the situation with grace and professionalism."


response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are amazing at sentiment analysis. Give the sentiment of the next sentence as the examples show."},
        {"role": "user", "content": input1},
        {"role": "assistant", "content": output1},
        {"role": "user", "content": input2},
        {"role": "assistant", "content": output2},
        {"role": "user", "content": input3},
    ]
)
response.choices[0].message.content

'Positive'
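
Few-shot prompts follow a fixed pattern: one system message, then alternating user and assistant messages for each example, then the final user message to be answered. The sketch below builds that list from example pairs and tidies the reply into one of the allowed labels. Both `build_few_shot_messages` and `normalise_label` are hypothetical helpers introduced for illustration; the sketch makes no API call, and the replies passed to `normalise_label` are made up.

```python
def build_few_shot_messages(system_prompt, examples, query):
    """Build a chat message list: system, then user/assistant pairs, then the query."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages

def normalise_label(reply, allowed=("Positive", "Negative")):
    """Strip whitespace, quotes and full stops; return None if the label is unexpected."""
    cleaned = reply.strip().strip(".'\"").capitalize()
    return cleaned if cleaned in allowed else None

examples = [
    ("I absolutely loved the way the story unfolded.", "Positive"),
    ("The food was cold and completely flavorless.", "Negative"),
]
messages = build_few_shot_messages(
    "Classify the sentiment of each sentence as Positive or Negative.",
    examples,
    "She handled the situation with grace and professionalism.",
)

print([m["role"] for m in messages])
print(normalise_label(" positive.\n"))  # made-up reply with stray formatting
print(normalise_label("Mixed"))         # unexpected label
```

The resulting `messages` list has the shape expected by the `messages` argument used throughout this lab, and checking the reply against the allowed labels catches cases where the model adds commentary instead of a single word.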

## Conclusion

We worked with a few Large Language Models (LLMs) using Groq and experimented with prompting for summarisation, text completion and question-answering tasks.

We also explored controlling the randomness (creativity) of output through the temperature setting and tried different types of prompting to achieve desired forms of output.

## References
1. [Groq's prompting guide](https://console.groq.com/docs/prompting)
2. [Groq's playground](https://console.groq.com/playground)



---




> > > > > > > > > © 2025 Institute of Data


---


