<div>
<img src=https://www.institutedata.com/wp-content/uploads/2019/10/iod_h_tp_primary_c.svg width="300">
</div>

# Lab 8.5 - Prompting Large Language Models

In this lab we will practise prompting with a few Large Language Models (LLMs) using Groq (not to be confused with Grok). Groq is a platform that provides access to their custom-built AI hardware via APIs, allowing users to run open-source models such as Llama.

We shall see that while LLMs are powerful tools, how you ask a question or frame a task can dramatically influence the results obtained.

## Set-up

Step 1: Sign up for a free Groq account at https://console.groq.com/home.

Step 2: Create a new API key at https://console.groq.com/keys. Copy-paste it into an empty text file called 'groq_key.txt'.

Running the next cell will then read in this key and assign it to the variable `groq_key`.

In [1]:
groqfilename = r'groq_key.txt' # this file contains a single line containing your Groq API key only
try:
    with open(groqfilename, 'r') as f:
        groq_key = f.read().strip()
except FileNotFoundError:
    print("'%s' file not found" % groqfilename)
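As an alternative to a key file, the key can be kept in an environment variable. The sketch below is a minimal illustration, not part of the original lab: the helper name `load_groq_key` is hypothetical, and the variable name `GROQ_API_KEY` is an assumption that should be checked against the Groq documentation.

```python
import os
from pathlib import Path


def load_groq_key(path="groq_key.txt", env_var="GROQ_API_KEY"):
    """Return the API key from an environment variable, else from a text file.

    `env_var` is an assumed variable name; `path` matches the file used in this lab.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    key_file = Path(path)
    if key_file.is_file():
        return key_file.read_text().strip()
    raise FileNotFoundError(f"No key found in ${env_var} or in '{path}'")


# Usage (uncomment once a key is available):
# groq_key = load_groq_key()
```

Raising an error when no key is found stops the notebook early, rather than failing later when `groq_key` is first used.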

In [2]:
!pip install groq

Collecting groq
  Using cached groq-0.28.0-py3-none-any.whl.metadata (15 kB)
Collecting distro<2,>=1.7.0 (from groq)
  Using cached distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)
Collecting pydantic<3,>=1.9.0 (from groq)
  Using cached pydantic-2.11.7-py3-none-any.whl.metadata (67 kB)
Collecting annotated-types>=0.6.0 (from pydantic<3,>=1.9.0->groq)
  Using cached annotated_types-0.7.0-py3-none-any.whl.metadata (15 kB)
Collecting pydantic-core==2.33.2 (from pydantic<3,>=1.9.0->groq)
  Using cached pydantic_core-2.33.2-cp312-cp312-win_amd64.whl.metadata (6.9 kB)
Collecting typing-inspection>=0.4.0 (from pydantic<3,>=1.9.0->groq)
  Using cached typing_inspection-0.4.1-py3-none-any.whl.metadata (2.6 kB)
Using cached groq-0.28.0-py3-none-any.whl (130 kB)
Using cached distro-1.9.0-py3-none-any.whl (20 kB)
Using cached pydantic-2.11.7-py3-none-any.whl (444 kB)
Using cached pydantic_core-2.33.2-cp312-cp312-win_amd64.whl (2.0 MB)
Using cached annotated_types-0.7.0-py3-none-any.whl (13 kB)
Usi

In [4]:
## Import Libraries
import numpy as np
import pandas as pd

import string
import spacy

from collections import Counter

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from groq import Groq
import requests
from IPython.display import Markdown
# import warnings
# warnings.filterwarnings('ignore')

First create an instance of the Groq client:

In [5]:
client = Groq(api_key=groq_key)

The following code lists the models currently accessible through Groq. `context_window` is the maximum number of tokens (prompt plus generated output) that the model can handle in a single request, and `max_completion_tokens` is the maximum number of tokens that can be generated in one output.

In [6]:
url = "https://api.groq.com/openai/v1/models"

headers = {
    "Authorization": f"Bearer {groq_key}",
    "Content-Type": "application/json"
}

response = requests.get(url, headers=headers)

pd.DataFrame(response.json()['data']).sort_values(['created'], ascending=False)

Unnamed: 0,id,object,created,owned_by,active,context_window,public_apps,max_completion_tokens
17,meta-llama/llama-prompt-guard-2-86m,model,1748632165,Meta,True,512,,512
0,meta-llama/llama-prompt-guard-2-22m,model,1748632101,Meta,True,512,,512
16,qwen/qwen3-32b,model,1748396646,Alibaba Cloud,True,131072,,40960
3,meta-llama/llama-guard-4-12b,model,1746743847,Meta,True,131072,,1024
10,meta-llama/llama-4-maverick-17b-128e-instruct,model,1743877158,Meta,True,131072,,8192
15,meta-llama/llama-4-scout-17b-16e-instruct,model,1743874824,Meta,True,131072,,8192
7,compound-beta-mini,model,1742953279,Groq,True,131072,,8192
14,qwen-qwq-32b,model,1741214760,Alibaba Cloud,True,131072,,131072
20,compound-beta,model,1740880017,Groq,True,131072,,8192
4,playai-tts-arabic,model,1740682783,PlayAI,True,8192,,8192
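The `created` column is hard to read as a raw integer. Assuming it is a Unix timestamp in seconds (the usual convention for OpenAI-compatible APIs, but an assumption here), it can be converted to a date as sketched below. The helper name `created_to_date` is hypothetical, and the three rows are copied from the table above.

```python
from datetime import datetime, timezone


def created_to_date(created):
    """Convert a Unix timestamp in seconds (assumed) to a UTC date."""
    return datetime.fromtimestamp(created, tz=timezone.utc).date()


# (id, created) pairs copied from the table above
rows = [
    ("compound-beta", 1740880017),
    ("qwen/qwen3-32b", 1748396646),
    ("meta-llama/llama-guard-4-12b", 1746743847),
]

# Newest first, as in the notebook's sort_values call
for model_id, created in sorted(rows, key=lambda r: r[1], reverse=True):
    print(model_id, created_to_date(created))
```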


The Groq client object enables interaction with the Groq REST API, and a chat completion request is made via the `client.chat.completions.create` method.

The most important arguments of the `client.chat.completions.create` method are the following:
* messages: a list of messages (dictionary form) that make up the conversation to date
* model: a string indicating which model to use (see [list of models](https://console.groq.com/docs/models))
* max_completion_tokens: the maximum number of tokens that are generated in the chat completion
* response_format: setting this to `{ "type": "json_object" }` enables JSON output
* seed: sample deterministically as best as possible, though identical outputs each time are not guaranteed
* temperature: between 0 and 2 where higher values like 0.8 make the output more random (creative) and values like 0.2 are more focused and deterministic
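The arguments listed above can be collected into a dictionary before the call is made. The following is a minimal sketch that does not contact the API. The helper name `build_request` is hypothetical, and the range check simply mirrors the "between 0 and 2" description above.

```python
def build_request(model, messages, temperature=None,
                  max_completion_tokens=None, seed=None, json_output=False):
    """Assemble keyword arguments for client.chat.completions.create."""
    kwargs = {"model": model, "messages": messages}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        kwargs["temperature"] = temperature
    if max_completion_tokens is not None:
        kwargs["max_completion_tokens"] = max_completion_tokens
    if seed is not None:
        kwargs["seed"] = seed
    if json_output:
        kwargs["response_format"] = {"type": "json_object"}
    return kwargs


request = build_request(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Explain briefly how large language models work"}],
    temperature=0.2,
    max_completion_tokens=200,
)
print(sorted(request))

# With a client available, the request would be sent as:
# chat_completion = client.chat.completions.create(**request)
```

Only the arguments that are explicitly set are included, so the API defaults apply to everything else.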


In [7]:
help(client.chat.completions.create)

Help on method create in module groq.resources.chat.completions:

create(*, messages: 'Iterable[ChatCompletionMessageParam]', model: "Union[str, Literal['gemma2-9b-it', 'llama-3.3-70b-versatile', 'llama-3.1-8b-instant', 'llama-guard-3-8b', 'llama3-70b-8192', 'llama3-8b-8192']]", exclude_domains: 'Optional[List[str]] | NotGiven' = NOT_GIVEN, frequency_penalty: 'Optional[float] | NotGiven' = NOT_GIVEN, function_call: 'Optional[completion_create_params.FunctionCall] | NotGiven' = NOT_GIVEN, functions: 'Optional[Iterable[completion_create_params.Function]] | NotGiven' = NOT_GIVEN, include_domains: 'Optional[List[str]] | NotGiven' = NOT_GIVEN, logit_bias: 'Optional[Dict[str, int]] | NotGiven' = NOT_GIVEN, logprobs: 'Optional[bool] | NotGiven' = NOT_GIVEN, max_completion_tokens: 'Optional[int] | NotGiven' = NOT_GIVEN, max_tokens: 'Optional[int] | NotGiven' = NOT_GIVEN, metadata: 'Optional[Dict[str, str]] | NotGiven' = NOT_GIVEN, n: 'Optional[int] | NotGiven' = NOT_GIVEN, parallel_tool_calls:

As a first example, note how the messages input is given as a list of dictionaries with `role` and `content` keys. This is a ChatML-style format recognised by many LLMs.

In [8]:
chat_completion = client.chat.completions.create(
    messages=[
        {   "role": "system", # sets the persona of the model
            "content": "You are a helpful assistant."
        },
        {
            "role": "user", # what the user wants the assistant to do
            "content": "Explain briefly how large language models work",
        }
    ],
    model="llama-3.3-70b-versatile",
)

print(chat_completion.choices[0].message.content)

Large language models work by:

1. **Training**: They're fed vast amounts of text data, which helps them learn patterns and relationships between words.
2. **Tokenization**: The model breaks down text into smaller units (tokens) like words or characters.
3. **Contextualization**: It analyzes the tokens and their context to predict the next token, creating a probability distribution.
4. **Generation**: The model uses this probability distribution to generate text, one token at a time, based on the input prompt or context.

This process relies on complex algorithms and neural networks, allowing the model to understand and generate human-like language.


The output is in Markdown format so the following line formats this text.

In [9]:
Markdown(chat_completion.choices[0].message.content)

Large language models work by:

1. **Training**: They're fed vast amounts of text data, which helps them learn patterns and relationships between words.
2. **Tokenization**: The model breaks down text into smaller units (tokens) like words or characters.
3. **Contextualization**: It analyzes the tokens and their context to predict the next token, creating a probability distribution.
4. **Generation**: The model uses this probability distribution to generate text, one token at a time, based on the input prompt or context.

This process relies on complex algorithms and neural networks, allowing the model to understand and generate human-like language.

## Text summarisation

We start with llama3-8b-8192, a model with about 8 billion parameters and a context window of 8192 tokens.

Here is an article to be summarised from the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset:

In [10]:
story = """
SAN FRANCISCO, California (CNN) -- A magnitude 4.2 earthquake shook the San Francisco area Friday at 4:42 a.m. PT (7:42 a.m. ET), the U.S. Geological Survey reported. The quake left about 2,000 customers without power, said David Eisenhower, a spokesman for Pacific Gas and Light. Under the USGS classification, a magnitude 4.2 earthquake is considered "light," which it says usually causes minimal damage. "We had quite a spike in calls, mostly calls of inquiry, none of any injury, none of any damage that was reported," said Capt. Al Casciato of the San Francisco police. "It was fairly mild." Watch police describe concerned calls immediately after the quake » . The quake was centered about two miles east-northeast of Oakland, at a depth of 3.6 miles, the USGS said. Oakland is just east of San Francisco, across San Francisco Bay. An Oakland police dispatcher told CNN the quake set off alarms at people's homes. The shaking lasted about 50 seconds, said CNN meteorologist Chad Myers. According to the USGS, magnitude 4.2 quakes are felt indoors and may break dishes and windows and overturn unstable objects. Pendulum clocks may stop.
"""

**Exercise:**
Summarise the story text using the following three prompts. Use the format given above but here there is no need to set the persona (i.e. only include one dictionary in the messages list when calling `client.chat.completions.create`.) Comment on any differences.

1) "Summarise the following article in 3 sentences."

2) "Give me a TL;DR of this text."

3) "What's the key takeaway here?"

In [11]:
prompts = ["Summarise the following article in 3 sentences. ", "Give me a TL;DR of this text. ", "What's the key takeaway here?"]
#content will be p + story for p in prompts

# ANSWER
for p in prompts:
    response = client.chat.completions.create(
        model="llama3-8b-8192",
        messages=[{"role": "user", "content": p + story}],
    )

    print(p, '\n', response.choices[0].message.content)


Summarise the following article in 3 sentences.  
 Here is a summary of the article in 3 sentences:

A magnitude 4.2 earthquake struck the San Francisco area at 4:42 a.m. PT on Friday, causing minimal damage and no reported injuries. The quake left around 2,000 customers without power, but Pacific Gas and Light said most of the calls they received were inquiries rather than reports of damage. The earthquake, which was centered about two miles east-northeast of Oakland, lasted for about 50 seconds and was felt indoors, although it did trigger some alarm systems and cause slight shaking.
Give me a TL;DR of this text.  
 A magnitude 4.2 earthquake struck the San Francisco area at 4:42am, causing minimal damage and power outages. The quake, centered near Oakland, lasted about 50 seconds and was felt indoors, but no injuries or significant damage were reported.
What's the key takeaway here? 
 The key takeaway is that a magnitude 4.2 earthquake struck the San Francisco area early Friday morn

Run the above code again below and note that the answers may differ. This is due to the probabilistic nature of LLM token generation.

In [12]:
# ANSWER
for p in prompts:
    response = client.chat.completions.create(
        model="llama3-8b-8192",
        messages=[{"role": "user", "content": p + story}],
    )

    print(p, '\n', response.choices[0].message.content)

Summarise the following article in 3 sentences.  
 Here is a 3 sentence summary of the article:

A magnitude 4.2 earthquake struck the San Francisco area at 4:42 a.m. PT on Friday, causing minimal damage and no reported injuries. The quake, which was centered about two miles east-northeast of Oakland, affected around 2,000 customers who lost power, but most of the calls received by authorities were inquiries rather than reports of damage or injury. The earthquake was classified as "light" by the USGS, with effects such as shaking that lasted about 50 seconds and caused slight disruptions, like setting off home alarms and potentially breaking dishes or windows.
Give me a TL;DR of this text.  
 A magnitude 4.2 earthquake struck the San Francisco area at 4:42am PT, causing minimal damage and no reported injuries. The quake was centered near Oakland, about 2 miles east-northeast, and lasted for about 50 seconds. Approximately 2,000 customers were left without power, but the majority of rep
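The run-to-run variation seen above comes from sampling each token from a probability distribution. The toy sketch below illustrates the idea with Python's `random` module. The token probabilities are made up for illustration, and the helper `sample_tokens` is hypothetical. Fixing a seed makes this toy sampler repeatable, which is the idea behind the `seed` argument (though the API itself does not guarantee identical outputs).

```python
import random

# Hypothetical next-token probabilities, for illustration only
next_token_probs = {"struck": 0.5, "shook": 0.3, "hit": 0.15, "rattled": 0.05}


def sample_tokens(probs, n, seed=None):
    """Draw n tokens at random according to their probabilities."""
    rng = random.Random(seed)  # seed=None uses a fresh, unpredictable state
    return rng.choices(list(probs), weights=list(probs.values()), k=n)


# Unseeded draws may differ from one call to the next
print(sample_tokens(next_token_probs, 5))
print(sample_tokens(next_token_probs, 5))

# Seeded draws are identical
print(sample_tokens(next_token_probs, 5, seed=21))
print(sample_tokens(next_token_probs, 5, seed=21))
```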

## Text completion

**Exercise**: In this section adjust the `max_completion_tokens` and `temperature` settings below to obtain different responses. Show some examples with the prompt "Continue the story: It was a great time to be alive" with the model "llama-3.1-8b-instant".

* max_completion_tokens - the maximum number of tokens to generate. Note that longer words are made of multiple tokens (set to 200 and 500)
* temperature (positive number) - the higher the number the more random (creative) the output (set to 0.2, 1 and 2)
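To see why a higher temperature gives more random output, consider the standard formulation in which the model's scores (logits) are divided by the temperature before a softmax turns them into probabilities. The sketch below assumes this formulation and uses made-up logits. The function name is hypothetical, and it rejects a temperature of 0 because of the division.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens

for t in (0.2, 1, 2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability sits on the top-scoring token, while at a high temperature the distribution flattens and unlikely tokens are sampled more often.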

In [13]:
# ANSWER (set max_completion_tokens=200, do not have a temperature setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive"}],
    max_completion_tokens=200,
)

print(response.choices[0].message.content)

People of all ages filled the streets, laughing and chatting with one another. The air was filled with the sweet scent of blooming flowers and the distant sound of music drifting from the park. It was a great time to be alive, and everyone knew it.

Rachel, a young woman with a bright smile and infectious energy, had just finished her shift at the local coffee shop. She was sipping on a cold glass of lemonade and enjoying the warm sunshine on her face. As she strolled down the street, she noticed a group of musicians setting up their equipment in the park.

One of the musicians, a charming guitarist with a messy mop of hair, caught her eye. He flashed her a warm smile, and Rachel couldn't help but feel drawn to him. As the music began to play, Rachel found herself swaying to the rhythm, mesmerized by the guitarist's talents.

The music was a lively mix of folk and rock, with catchy melodies and heartfelt lyrics. Rachel couldn't


In [14]:
# ANSWER (set max_completion_tokens=500, do not have a temperature setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive"}],
    max_completion_tokens=500,
)

print(response.choices[0].message.content)

As I walked through the vibrant streets of the city, surrounded by people from all walks of life, I couldn't help but feel a sense of excitement and wonder. The year was 2050, and the world had changed beyond recognition. Towering skyscrapers made of shimmering metals and sustainable materials pierced the sky, their exteriors a mesmerizing display of color and light. Flying cars zoomed by, their humming engines a familiar soundtrack to everyday life.

I stopped at a street vendor, who offered me a sample of the latest culinary innovation: lab-grown, nutrient-rich "food cubes" that tasted like anything I desired. I chose the flavor of a juicy burger, and took a bite. The explosion of flavors was incredible, and I couldn't believe how real it tasted. The vendor smiled and said, "Welcome to the future, my friend."

I continued my stroll, taking in the sights and sounds of this brave new world. Everywhere I looked, I saw people from all over the globe coming together, united in their pursu

In [15]:
# ANSWER (set temperature = 0.2, do not have a max_completion_tokens setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive."}],
    temperature = 0.2,
)

print(response.choices[0].message.content)

It was a great time to be alive. The sun was shining brightly in the clear blue sky, casting a warm glow over the bustling streets of the city. People of all ages and backgrounds were out and about, enjoying the beautiful day and the sense of community that came with it.

Lena, a young woman with a bright smile and a contagious laugh, was walking down the street, feeling carefree and alive. She had just finished a long week of work and was looking forward to a well-deserved break. As she strolled along, she noticed the vibrant street art that adorned the buildings, the smell of freshly baked bread wafting from the nearby bakery, and the sound of children's laughter echoing from the park.

She stopped at a small café to grab a coffee and people-watch. The café was filled with the sounds of lively chatter and the aroma of freshly brewed coffee. Lena took a seat at a small table by the window and watched as people of all ages and backgrounds walked by, each with their own unique story to 

In [16]:
# ANSWER (set temperature = 1, do not have a max_completion_tokens setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive."}],
    temperature = 1,
)

print(response.choices[0].message.content)

The sun was shining bright, casting a warm glow over the bustling streets. People from all walks of life were out and about, laughing, chatting, and soaking up the vibrant atmosphere. The air was filled with the sweet scent of blooming flowers and the sound of music drifting through the air.

As I walked through the city, I couldn't help but feel a sense of excitement and possibility. It was a time of great change and progress, and everyone seemed to be caught up in the momentum.

I passed by a group of young artists, gathered around a street performer who was juggling clubs and spinning plates to the delight of the crowd. Nearby, a group of activists were setting up a stall, calling for greater awareness about social justice and equality.

Further down the street, I saw a line of people waiting to get into a popular café, where a new exhibition was about to open. The café was known for its eclectic mix of art, music, and politics, and it seemed to be the hub of the city's creative and

Note what happens when the temperature is set too high!

In [17]:
# ANSWER (set temperature = 2, do not have a max_completion_tokens setting)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Continue the story: It was a great time to be alive."}],
    temperature = 2,
)

print(response.choices[0].message.content)

It was a great time to be alive, the smell of freshly brewed coffee wafting from the coffee cart outside, the sun peeking over the buildings, lighting up the world in shades of golden orange, and the city buzzing with excitement.

People of all ages flocked to the central square for the grand festival taking over the town. The event organizers set a majestic stage amidst the vibrant atmosphere as they readied the evening’s headlining attraction.

In the heart of the excitement lay one such vibrant soul: A talented young musician with bright ambition named Maya. Born in these beautiful surroundings, her entire musical life took birth here and this festival would provide the opportunity to launch and shine to all that it had taught them in such beautiful musical places that surrounded their musical life so vivid so vibrant with every note so true that all her beautiful melodies sang about all that her home offered. This evening performance could bring everything it held together in their

### Zero-shot and one-shot prompting for question-answering

This section shows the impact of prompting on the response. Zero-shot prompting means we provide the prompt without any examples or additional context. Let us first ask the llama-3.1-8b-instant model a question with no examples or extra guidance.

In [18]:
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "How do two chemicals react?"}],
    temperature = 0.8,
)

Markdown(response.choices[0].message.content)

Chemical reactions occur when two or more substances, known as reactants, interact with each other and transform into new substances, known as products. The process of chemical reaction involves the breaking and forming of chemical bonds between atoms.

Here's a simplified overview of how two chemicals react:

1. **Interaction**: Two chemicals, A and B, come into contact with each other. This can happen through a variety of mechanisms, such as mixing, diffusion, or collision.
2. **Bond breaking**: The atoms in chemical A and B begin to interact with each other, causing the existing bonds within each molecule to break. This process is called bond cleavage.
3. **Bond formation**: As the bonds within each molecule break, new bonds begin to form between the atoms of chemicals A and B. This process is called bond formation or covalent bonding.
4. **Reaction mechanisms**: The specific pathway that the reaction takes is called the reaction mechanism. This can involve a series of intermediate molecules that form and then disappear as the reaction progresses.
5. **Product formation**: The final products of the reaction are formed when the bonds have fully rearranged, resulting in a new set of compounds.

There are different types of chemical reactions, including:

1. **Combustion reaction**: A reaction that involves the burning of a substance, typically with oxygen.
2. **Synthesis reaction**: A reaction in which two or more substances combine to form a new compound.
3. **Decomposition reaction**: A reaction in which a single substance breaks down into two or more simpler substances.
4. **Replacement reaction**: A reaction in which one element or group of elements replaces another within a compound.
5. **Neutralization reaction**: A reaction in which an acid reacts with a base to form a salt and water.

To illustrate this, let's consider a simple example:

**Reaction:** 2H2 (hydrogen gas) + O2 (oxygen gas) → 2H2O (water)

In this reaction, the hydrogen gas (H2) and oxygen gas (O2) interact, breaking their existing bonds and forming new bonds to create water (H2O). This is a simple example of a synthesis reaction.

Keep in mind that chemical reactions are governed by the laws of thermodynamics and involve a change in energy. The overall change in energy can be either exergonic (releases energy) or endergonic (absorbs energy).

**Exercise:** Ask the same question but modify the prompt to return the answer to the same question in a simpler form (still using the llama-3.1-8b-instant model). Experiment with different prompts.

In [19]:
# ANSWER
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Answer the following question as though I am 10 years old. How do two chemicals react?"}],
    temperature = 0.8,
)

Markdown(response.choices[0].message.content)

Let's talk about chemicals.

Chemicals are like special kinds of friends that can mix and match with each other. When they mix, they can do different things together. This is called a chemical reaction.

Imagine you have two buckets, one with blue paint and one with yellow paint. When you mix them together, what do you get? That's right, you get green paint!

Chemicals work in a similar way. When two chemicals mix, they can create something new, like a different color or a new smell. Sometimes, they can even make something that didn't exist before.

Let's say we have two chemicals, A and B. When they mix, they create a new chemical, C. This is like making a new friend, but instead of a person, it's a chemical.

Chemical reactions can be slow or fast, and they can happen in a lot of different ways. Sometimes, they can even produce heat or light. But don't worry, most of the time, chemical reactions happen in a way that's safe and fun.

So, that's what chemical reactions are like! It's like mixing and matching special friends to create something new and cool.

### One-shot prompting

Next, note the dramatic change when we give the following template setting a new role and providing an English question followed by a French translation.

In [20]:
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "system",
             "content": "You translate English to French."},
              {"role": "user",
               "content": "What time is it?"},
               {"role": "assistant",
               "content": "Quelle heure est-il?"},
              {"role": "user",
               "content": "How do two chemicals react?"}],
    temperature = 0.8,
)
print(response.choices[0].message.content)

Comment réagissent deux chimiques entre elles.


### Few-shot prompting

Recall that LLMs generate text one token at a time, so their raw output often does not match the style or format we want. This is where examples can help.

In [21]:
prompt1 = "I'm gonna head out now, see you later."
response1 = "I will be leaving now. See you later."

prompt2 =  "That movie was super cool!"
response2 = "The movie was very impressive."

prompt3 = "Can't make it to the meeting, sorry."


response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are a professional editor. Rewrite casual sentences into a formal tone."},
        {"role": "user", "content": prompt1},
        {"role": "assistant", "content": response1},
        {"role": "user", "content": prompt2},
        {"role": "assistant", "content": response2},
        {"role": "user", "content": prompt3},
    ]
)

print(response.choices[0].message.content.strip())


I regret to inform you that I will be unable to attend the meeting. Apologies for any inconvenience this may cause.


The output can also be moulded to provide SQL output.

In [22]:
prompt1 = "Show me all users who signed up in the last 30 days."
response1 = "SELECT * FROM users WHERE signup_date >= CURRENT_DATE - INTERVAL '30 days';"

prompt2 = "What is the average order value?"
response2 =  "SELECT AVG(order_total) FROM orders;"

prompt3 = "List products that are out of stock."

response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are an assistant that translates natural language to SQL."},
        {"role": "user", "content": prompt1},
        {"role": "assistant", "content": response1},
        {"role": "user", "content": prompt2},
        {"role": "assistant", "content": response2},
        {"role": "user", "content": prompt3},
    ]
)

print(response.choices[0].message.content.strip())


SELECT * FROM products WHERE quantity_in_stock = 0;
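Both few-shot examples above follow the same pattern: a system message, then alternating `user` and `assistant` messages for each example, then the new `user` query. A small helper, sketched below with the hypothetical name `few_shot_messages`, builds this list and avoids mislabelling a role by hand.

```python
def few_shot_messages(system, examples, query):
    """Build a messages list from (user_text, assistant_text) example pairs."""
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages


messages = few_shot_messages(
    system="You are an assistant that translates natural language to SQL.",
    examples=[
        ("What is the average order value?", "SELECT AVG(order_total) FROM orders;"),
    ],
    query="List products that are out of stock.",
)

for m in messages:
    print(m["role"], "|", m["content"])
```

The resulting list can be passed directly as the `messages` argument of `client.chat.completions.create`.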


**Exercise**: Create a few examples to guide the "llama3-70b-8192" LLM to take in user content in the form below and provide output as a pandas dataframe. Use the `exec` function to execute its output and display the answer for a sample input as a data frame.

Example:

given the user content

"""

| col1 | col2 | col3

| 32 | 27 | 25

| 64 | 23 | 14

"""

guide the model to output

df = pd.DataFrame({'col1': [32, 64], 'col2': [27, 23], 'col3': [25, 14]})



In [23]:
#ANSWER
user1 = """col1 | col2 | col3
32 | 27 | 25
64 | 23 | 14
"""

output1 = """
df = pd.DataFrame({'col1': [32, 64], 'col2': [27, 23], 'col3': [25, 14]})
"""

user2 = """col1 | col2
23 | 12
8 | 76
7 | 5
"""
output2 = """
df = pd.DataFrame({'col1': [23, 8, 7], 'col2': [12, 76, 5]})
"""
user3 = """colA | colB | colC
23 | 12 | 54
8 | 76 | 32
7 | 5 | 3
"""


response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are a data scientist who will receive data input as a string and provide output as a pandas dataframe called df. Use the examples to guide you"},
        {"role": "user", "content": user1},
        {"role": "assistant", "content": output1},
        {"role": "user", "content": user2},
        {"role": "assistant", "content": output2},  # example answers use the assistant role
        {"role": "user", "content": user3}
    ]
)

exec(response.choices[0].message.content.strip()) # string executed as Python code
df


Unnamed: 0,colA,colB,colC
0,23,12,54
1,8,76,32
2,7,5,3
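Because `exec` runs whatever string the model returns, it can help to inspect the reply first. The sketch below is one possible check, not part of the original lab. It assumes a reply may be wrapped in a Markdown code fence, strips it, and then uses the standard `ast` module to confirm the code is a single assignment to `df`. Both helper names are hypothetical.

```python
import ast

FENCE = "`" * 3  # a Markdown code fence marker


def extract_code(reply):
    """Strip an optional Markdown code fence from a model reply."""
    text = reply.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines).strip()
    return text


def is_single_df_assignment(code):
    """Return True if code parses as exactly one assignment to the name df."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Assign):
        return False
    targets = tree.body[0].targets
    return (len(targets) == 1
            and isinstance(targets[0], ast.Name)
            and targets[0].id == "df")


reply = FENCE + "python\ndf = pd.DataFrame({'colA': [23, 8, 7]})\n" + FENCE
code = extract_code(reply)
print(code)
print(is_single_df_assignment(code))
```

This check only looks at the shape of the statement. It does not make `exec` safe for arbitrary input, since the right-hand side of the assignment could still call any function.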


Also show what happens when the question is asked in the absence of a system role and without few-shot prompting.

In [24]:
# ANSWER
response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "user", "content": user3}
    ]
)
response.choices[0].message.content.strip()

"It looks like you provided a table with three columns: colA, colB, and colC, and three rows of data. Is there something specific you'd like to do with this data, such as filter, sort, or perform a calculation?"

### Chain-of-thought prompting

The results of question-answering can also be improved by prompting the LLM to provide intermediate steps.

**Exercise**: Using the following prompts, compare the answers of the "llama3-8b-8192" model (set seed=21). (If this model is no longer available choose a model with relatively few parameters.)

zero_shot_prompt = "How many s's are in the word 'success'?"

chain_of_thought_prompt = "How many s's are in the word 'success'? Explain your answer step by step by going through each letter in turn."

In [25]:
# ANSWER
zero_shot_prompt = "How many s's are in the word 'success'?"
chain_of_thought_prompt = "How many s's are in the word 'success'? Explain your answer step by step by going through each letter in turn."

response1 = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": zero_shot_prompt}],
    seed = 21
)

response2 = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": chain_of_thought_prompt}],
    seed = 21
)

print('------zero-shot-prompt------')
print(response1.choices[0].message.content)

print('------chain-of-thought------')
print(response2.choices[0].message.content)

------zero-shot-prompt------
There are 2 s's in the word 'success'.
------chain-of-thought------
To count the number of 's's in the word "success", I will go through each letter one by one:

1. S
   There is 1 's' so far.

2. U
   There's an 'U' and still 1 's' so far.

3. C
   There's a 'C', and still 1 's' so far.

4. C
   There's another 'C', and still 1 's' so far.

5. E
   There's an 'E', and still 1 's' so far.

6. S
   There's another 's', which means there are 2 's's so far.

7. S
   There's a third 's', which means there are 3 's's so far.

8. S
   There's a fourth 's', which means there are 4 's's so far.

There are 4 's's in the word "success".
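Neither answer above is correct, which a direct count in Python confirms. The short check below steps through the word letter by letter, as the chain-of-thought prompt asked the model to do.

```python
word = "success"

count = 0
for position, letter in enumerate(word, start=1):
    if letter == "s":
        count += 1
    print(position, letter.upper(), "running count of s:", count)

print(f"There are {count} s's in '{word}' ({len(word)} letters).")
```

The word has 7 letters and 3 s's. In the run shown above, the model spelled the word with 8 letters, so the step-by-step prompt exposed the error without correcting it.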


## Comparison of LLMs

**Exercise**: Compare the performance of 2 LLMs by outputting the answers of the following questions into a dataframe.

    "Tell me a joke about data science.",
    "How can one calculate 22 * 13 mentally?",
    "Write a creative story about a baby learning to crawl.",

Column headings:

Model Name | Question | Answer

In [26]:
pd.set_option('display.max_colwidth', None) # allows wide dataframes to be viewed
models = ["gemma2-9b-it", "llama-3.1-8b-instant"] #can edit this

# ANSWER
prompts = [
    "Tell me a joke about data science.",
    "How can one calculate 22 * 13 mentally?",
    "Write a creative story about a baby learning to crawl.",
]

results = {'Model Name': [], 'Question': [], 'Answer': []}

for model in models:
    for prompt in prompts:
        results['Model Name'].append(model)
        results['Question'].append(prompt)
        try:
            output = client.chat.completions.create(model = model, messages=[{"role": "user", "content": prompt}])
            results['Answer'].append(output.choices[0].message.content.strip())

        except Exception as e:
            print(f"Error with {model}: {e}")
            results['Answer'].append("ERROR")  # keep the Answer column as plain strings


df = pd.DataFrame(results)
df

,Model Name,Question,Answer
0,gemma2-9b-it,Tell me a joke about data science.,Why did the data scientist break up with the statistician? \n\nBecause they had too many p-values and not enough real-life applications! 😂 \n\n\nLet me know if you'd like to hear another one! 😄
1,gemma2-9b-it,How can one calculate 22 * 13 mentally?,"Here's how to calculate 22 * 13 mentally using a few tricks:\n\n**1. Break it Down**\n\n* Think of 22 as (20 + 2). \n* Now you have: (20 + 2) * 13\n\n**2. Distribute**\n\n* Multiply 20 by 13: 20 * 13 = 260\n* Multiply 2 by 13: 2 * 13 = 26\n\n**3. Add the Results**\n\n* 260 + 26 = 286\n\n\n**Therefore, 22 * 13 = 286**"
2,gemma2-9b-it,Write a creative story about a baby learning to crawl.,"Pipkin didn't understand. Everyone was telling him ""sit up"", ""roll over"", ""smile"", but where was the fun in that? He much preferred the lie-flat-and-stargaze kind of fun.\n\nHis world was a kaleidoscope of colours - Mama's brightly patterned dress, the sunshine painting squares on the rug, the stripy blanket Auntie Clara had knitted. It was all so absorbing, so fascinating, that the urge to reach out, to explore, was bubbling inside him like a fizzy drink.\n\nOne bright morning, something happened. Pipkin's chubby hand brushed against a stray button on his rocking horse. For the first time, his hand scrabbled for something beyond the familiar expanse of his blanket. He reached, his other hand mimicking the movement in a clumsy mirror image. He pulled himself forward, his legs kicking erratically, his brow furrowed in concentration.\n\nHe screeched with triumphant joy, a gargling sound that always made his Mama giggle. He felt a surge of exhilaration, a thrill of conquest. ""Again!"" he seemed to be saying, though all that came out was a happy gurgle.\n\nAnd again he tried, again he pulled himself forward. Each inch was a victory, each wobbly attempt a lesson learned. His little limbs, propelled by a newfound determination, navigated the obstacles of the rug. His eyes, wide and full of wonder, tracked the path ahead.\n\nMama watched, her smile widening with every scrabble and stretch. She clapped her hands, cheering him on, ""That's it, Pipkin, you're doing it! Crawl, crawl, crawl!""\n\nPipkin didn't know what ""crawl"" meant, but he understood the feeling. The feeling of movement, of control, of the world opening up before him. He crawled towards a brightly-coloured mobile hanging above his crib. He reached for it, his tiny fingers grasping at the dangling toys.\n\nHe didn't quite reach, but he didn't care. He had tasted freedom, the sweet freedom of movement. 
He had discovered the magic of crawling. And as he lay back on his rug, a happy sigh escaping his lips, Pipkin knew that this was only the beginning of his grand adventures. \n\n\nThe world was his to explore, and he was ready to crawl."
3,llama-3.1-8b-instant,Tell me a joke about data science.,"Why did the data scientist quit his job?\n\nBecause he couldn't regression to his old ways of making a living and had too many correlations to his previous complaints, but ultimately he just needed a feature upgrade in his life."
4,llama-3.1-8b-instant,How can one calculate 22 * 13 mentally?,"To calculate 22 * 13 mentally, you can use the technique of breaking down the numbers and then multiplying them partially.\n\nOne way to do it is:\n\n1. Break down the multiplication into easier parts: (20 * 13) + (2 * 13)\n2. Calculate (20 * 13) = (20 * 10) + (20 * 3) \n (20 * 10) = 200 \n (20 * 3) = 60 \n So (20 * 13) = 200 + 60 = 260 \n3. Then calculate (2 * 13)\n (2 * 10) = 20 \n (2 * 3) = 6 \n So (2 * 13) = 20 + 6 = 26 \n4. Add both results: (260 + 26) = 286 \n\nTherefore, 22 * 13 = 286."
5,llama-3.1-8b-instant,Write a creative story about a baby learning to crawl.,"**The Crawling Adventure**\n\nIn a cozy little house on a quiet street, a tiny miracle was waiting to unfold. Baby Emma, with her soft, fluffy hair and curious eyes, was about to embark on her most epic adventure yet: learning to crawl.\n\nEmma's earliest memories were of being held close by her mom, feeling the warmth and love of her touch. She'd gaze up at her mother's smiling face, watching as she played with her fingers, blew gentle kisses, and whispered sweet nothings. But as the days went by, Emma began to feel an inexplicable itch – a sense that there was more to explore, more to discover.\n\nOne morning, as the sun peeked through the windows, casting a warm glow over the room, Emma started to stir. She pushed herself up onto her forearms, her eyes scanning the surroundings as if searching for secrets hidden in the crevices of the furniture. Her mom, watching from a nearby chair, smiled knowingly.\n\n""Today's the day!"" she whispered, setting down her book and creeping toward Emma.\n\nAt first, Emma was hesitant, her little hands grasping the soft blanket beneath her. But with each gentle urging from her mom, she began to shift, ever so slightly. Her legs twitched, her torso moved, and – in a moment of pure bliss – she launched herself forward, propelled by pure, unadulterated momentum.\n\nThe room seemed to spin as Emma crawled, her tiny hands grasping at the air, her feet wobbling in mid-air. Her mom laughed, her eyes shining with pride, as she carefully positioned herself beside Emma.\n\n""Whoa, you're moving!"" she exclaimed, scooping up Emma in a triumphant hug.\n\nWith each try, Emma grew bolder, her confidence growing like a bloom of sunflowers in the spring. She inched closer to the bookshelf, fascinated by the towering stacks of picture books. She crept along the couch, exploring the crevices beneath the cushions. 
Even the carpet became a magical playground, as Emma discovered the delight of rolling onto her tummy, arms and legs splayed wide.\n\nOne afternoon, as the sun began to set, casting long shadows across the room, Emma reached the summit – a pile of colorful toys at the far end of the room. With a squeal of excitement, she launched herself toward the treasure trove, her tiny hands grasping for the soft, wiggly creatures. Her mom cheered her on, snapping photos to capture the moment.\n\nFrom that day forward, Emma's world expanded exponentially. Every step, every crawl, and every wobble became a thrilling adventure, a journey of discovery and growth. And as she looked up at her mom, beaming with pride, Emma knew she was no longer just a baby – she was a brave, fearless explorer, ready to take on the world."
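
Of the three questions, only the mental-arithmetic one has a single correct answer (22 * 13 = 286), and both recorded answers above reach it. A rough way to check this automatically is sketched below. The helper `mentions_number` and the sample strings are hypothetical stand-ins, not real model output, and the check is crude: it only tells you whether the expected number appears somewhere in the text.

```python
import re

def mentions_number(answer: str, expected: int) -> bool:
    """Return True if `expected` appears as an integer anywhere in `answer`."""
    numbers = {int(token) for token in re.findall(r"\d+", answer)}
    return expected in numbers

expected_product = 22 * 13

# Shortened placeholder answers (assumed for illustration only)
sample_answers = {
    "model_a": "20 * 13 = 260 and 2 * 13 = 26, so 22 * 13 = 286.",
    "model_b": "22 * 13 is roughly 280.",
}

checks = {
    name: mentions_number(text, expected_product)
    for name, text in sample_answers.items()
}
print(checks)
```

With the dataframe from the exercise, the same helper could be applied to the `Answer` column for the rows containing the multiplication question. The joke and story prompts have no single right answer, so they still need to be judged by reading them.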


### Bonus

See if you can prompt an LLM to perform sentiment analysis (output 'Positive' or 'Negative' only) on a given piece of text.

In [27]:
# ANSWER
input1 = "I absolutely loved the way the story unfolded."
output1 = "Positive"

input2 = "The food was cold and completely flavorless."
output2 = "Negative"

input3 = "She handled the situation with grace and professionalism."


response = client.chat.completions.create(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are amazing at sentiment analysis. Give the sentiment of the next sentence as the examples show."},
        {"role": "user", "content": input1},
        {"role": "assistant", "content": output1},
        {"role": "user", "content": input2},
        {"role": "assistant", "content": output2},
        {"role": "user", "content": input3},
    ]
)
response.choices[0].message.content

'Positive'
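
The few-shot pattern above can be wrapped in a small helper so that more examples are easy to add, and the reply can be checked against the two allowed labels. This is a minimal sketch: `build_few_shot_messages` and `normalise_label` are hypothetical names introduced here, and the resulting list is meant to be passed as `messages` to `client.chat.completions.create` as in the cell above.

```python
def build_few_shot_messages(system_prompt, examples, query):
    """Build a chat message list: system prompt, example turns, then the query."""
    messages = [{"role": "system", "content": system_prompt}]
    for text, label in examples:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": query})
    return messages

def normalise_label(raw):
    """Return 'Positive' or 'Negative', or None if the reply is anything else."""
    cleaned = raw.strip().strip("'\".").capitalize()
    return cleaned if cleaned in {"Positive", "Negative"} else None

examples = [
    ("I absolutely loved the way the story unfolded.", "Positive"),
    ("The food was cold and completely flavorless.", "Negative"),
]
messages = build_few_shot_messages(
    "Classify the sentiment of each sentence. Reply with 'Positive' or 'Negative' only.",
    examples,
    "She handled the situation with grace and professionalism.",
)
print(len(messages))                    # system + 2 example pairs + query
print(normalise_label(" positive.\n"))  # tolerant of case and punctuation
```

Returning `None` for unexpected replies makes it obvious when the model has ignored the 'Positive' or 'Negative' only instruction, rather than silently passing the text through.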

## Conclusion

We worked with a few Large Language Models (LLMs) using Groq and experimented with prompting for summarisation, text completion and question-answering tasks.

We also explored controlling the randomness (creativity) of output through the temperature setting and tried different types of prompting to achieve desired forms of output.

## References
1. [Groq's prompting guide](https://console.groq.com/docs/prompting)
2. [Groq's playground](https://console.groq.com/playground)



---



> > > > > > > > > © 2025 Institute of Data


---


