# Interfacing Directly with LLMs 
### (Taking the Web Interface Out of the Picture)

<img src="../images/llm_api_access.png" width="400">

In Part 1, we made our first API call using OpenRouter and saw how to connect to a model like DeepSeek using Python.
Now that you’ve made your first call to a model, let’s take a closer look at one of the most common ways to talk to modern LLMs: the `/chat/completions` endpoint.

In this section, you'll learn:
- What inputs this endpoint expects
- How the response is structured
- How it compares to using ChatGPT interactively
- Why this gives you more control and automation in research workflows

As a refresher, an API (Application Programming Interface) is a set of rules and protocols that lets software applications communicate with each other. It acts as an intermediary, defining how one piece of software can request services or data from another. Behind the scenes, almost every website or online service you interact with defines and uses APIs to transfer data between systems.

## Breaking Down `[POST] /chat/completions`

When you use ChatGPT or similar tools, the conversational experience you have is powered by an API endpoint called `chat/completions`. This endpoint is the engine behind a continuous conversation. You send it a **history of messages**, and it responds with the **next message in the conversation**.

Normally, the service handles all of this for you. But in this workshop, we'll get into the specifics of how to build and manage these API requests yourself. 

### Building a ChatCompletions Request
When you send a request to the chat/completions endpoint, you're essentially providing the model with a list of messages. The model then generates a new message to add to that list.

Your request needs to include two main components: the model you want to use and the conversation history itself.

```json
{
    "model": "gpt-4.1",
    "messages": [
      {
        "role": "developer",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    ... optional parameters...
}
```
Let's break down the key parts of this request:
- `model`: This specifies which large language model you want to use. You might choose a model like `gpt-4o` from OpenAI, `claude-3-opus` from Anthropic, or others.
- `messages`: This is a list of all the messages in the conversation so far. Each message object has two parts:
    - `role`: This defines who is "speaking." We will look at roles in more detail later.
    - `content`: This is the actual text of the message.
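
To make the shape of the request concrete, here is a minimal sketch that assembles the same two required fields as a plain Python dictionary and serializes it to JSON. Nothing is sent over the network, and the helper name `build_chat_request` is hypothetical, introduced only for illustration. It uses the `system` role that the rest of this section relies on.

```python
import json

def build_chat_request(model: str, system_prompt: str, user_prompt: str) -> dict:
    """Assemble the two required fields of a chat/completions request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

# Build the body and show the JSON text that would be posted to the endpoint.
payload = build_chat_request("gpt-4.1", "You are a helpful assistant.", "Hello!")
print(json.dumps(payload, indent=2))
```

Client libraries build a body like this for you, but seeing it as ordinary data makes it clear that a "conversation" is just a list you control.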

### Understanding the [`ChatCompletion`](https://platform.openai.com/docs/api-reference/chat/object) Response
Once you send a request, the API returns a ChatCompletion object. This object contains the model's new message, along with other useful information such as token usage.

```json
{
  "id": "chatcmpl-B9MHDbslfkBeAs8l4bebGdFOJ6PeG",
  "object": "chat.completion",
  "created": 1741570283,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hey, how are you?",
        "refusal": null,
        "annotations": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1117,
    "completion_tokens": 46,
    "total_tokens": 1163,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": "fp_fc9f1d7035"
}

```

There's a lot of information here, but for now, you only need to focus on a few key parts:

- `choices`: This is an array that contains the model's response. In most cases, you'll just be looking at the first (and only) item in this list.
    - `message`: Inside choices, this is the new message object.
        - `role`: This will always be `assistant`, since it's the model's response.
        - `content`: This is the actual text generated by the model.
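
As a small illustration of that path through the response, the sketch below parses a trimmed-down copy of the JSON above with Python's standard `json` module. The variable names are illustrative, and only the fields discussed here are kept.

```python
import json

# A trimmed copy of the example response, keeping only the fields discussed above.
raw_response = """
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hey, how are you?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 1117, "completion_tokens": 46, "total_tokens": 1163}
}
"""

data = json.loads(raw_response)
first_choice = data["choices"][0]               # the first (and only) choice
reply_text = first_choice["message"]["content"]  # the generated text

print(reply_text)
print(first_choice["finish_reason"])
print(data["usage"]["total_tokens"])
```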

### Let's talk to our agent: A simple Chat Request

In [15]:
# First, setup our Client
from openai import OpenAI

# Read the API key (strip the trailing newline that text files usually end with)
with open('API_KEY.txt', 'r') as file:
    API_KEY = file.read().strip()

# Initialize the client
client = OpenAI(
  base_url="https://openrouter.ai/api/v1", 
  api_key=API_KEY,
)

Next, we'll build our first conversation request. This is where the `messages` array comes into play.

In [None]:
name = "Sohail" # Replace with your name

completion = client.chat.completions.create(
  model="mistralai/mistral-small-3.2-24b-instruct:free",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": f"Hello! I am {name}, nice to meet you!"}
  ]
)

In [18]:
# What does the Completions Object Look Like?
print(completion)

ChatCompletion(id='gen-1759871142-MFqBUc409fRXf3L1sZTV', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hello Sohail! Nice to meet you too! How can I assist you today? 😊', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None, reasoning=None), native_finish_reason='stop')], created=1759871142, model='mistralai/mistral-small-3.2-24b-instruct:free', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=22, prompt_tokens=25, total_tokens=47, completion_tokens_details=None, prompt_tokens_details=None), provider='Chutes')


In [19]:
# How can we access the response?
response_content = completion.choices[0].message.content
print(response_content)

Hello Sohail! Nice to meet you too! How can I assist you today? 😊


#### What's happening here?

The `client.chat.completions.create()` method builds and sends the API request for us.

- `model`: We're using a specific model available on OpenRouter, in this case `mistralai/mistral-small-3.2-24b-instruct:free`.
- `messages`: This is our conversation history. We start with a system message to set the model's persona and a user message with our initial prompt. The f-string `f"Hello! I am {name}, nice to meet you!"` is a neat way to dynamically insert variables into your messages.
- `completion.choices[0].message.content`: This expression shows how to read the response structure we talked about earlier. We access the first item in the `choices` list, then the `message` object, and finally the `content` field to get the text of the model's reply.

## Simulating Memory with Message History
A common misconception is that LLMs "remember" previous interactions. They don't. Each API request is entirely stateless: the model only knows what's in the `messages` list you send.

To build a continuous conversation, you must simulate memory by including all prior messages in every new API request. Let's see this in action.

#### Example 1: Ask the Model To Remember Your Name
In this first example, we include the entire conversation history in our request, so the model knows the user's name.

In [20]:
# Ask the model to remember your name
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": f"My name is {name}."},
    {"role": "user", "content": "What is my name?"}
]

completion = client.chat.completions.create(
    model="mistralai/mistral-small-3.2-24b-instruct:free",
    messages=messages
)

In [21]:
print(completion.choices[0].message.content)

Your name is Sohail. How can I assist you today?


#### Example 2: The Model Forgets
Now, what happens if we only send the final message? This is like starting a new chat in ChatGPT, which has no access to the history of your earlier, separate conversations.

In [22]:
# A new, isolated conversation history
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is my name?"}
]

# Send the request with only the last message
completion = client.chat.completions.create(
    model="mistralai/mistral-small-3.2-24b-instruct:free",
    messages=messages
)

# This will likely respond with something like "I don't know your name."
print(completion.choices[0].message.content)

I'm an assistant designed to help with information and tasks, but I don't have access to personal data about users unless you provide it during our conversation. If you'd like, you can share your name, and I'll be happy to use it to make our interaction more personal! 😊


---
As you can see, without the "My name is Sohail" message, the model has no context. You are **responsible for managing** this message history yourself.

💡 **Tip**: LLMs don’t have memory by default. If you want the model to remember something, you have to simulate memory by including prior messages in the `messages` list. In a platform like ChatGPT, this happens automatically: the UI handles the conversation history behind the scenes. But when working directly with the API, you’re in charge of preserving that history yourself.
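
To picture what "preserving that history yourself" looks like as data, here is a minimal offline sketch. The helper `add_turn` is hypothetical, and the assistant reply is a made-up placeholder rather than real model output; no API call is made.

```python
def add_turn(history: list, user_text: str, assistant_text: str) -> list:
    """Return a new history with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]

history = [{"role": "system", "content": "You are a helpful assistant."}]
# Placeholder reply, standing in for what a model might have said.
history = add_turn(history, "My name is Sohail.", "Nice to meet you, Sohail!")

# The next request must carry the whole history plus the new question.
next_request_messages = history + [{"role": "user", "content": "What is my name?"}]
for message in next_request_messages:
    print(f'{message["role"]}: {message["content"]}')
```

Because `add_turn` returns a new list instead of modifying its argument, earlier snapshots of the conversation stay intact, which can be handy when you want to branch a conversation or re-run a turn.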

### A Deeper Look at the Roles: `user`, `assistant`, and `system`

By now, you've seen our messages list contain objects with different role values. The role is a crucial part of the API request; it tells the model who is speaking and provides essential context for its response.

The three primary roles you will use are:
- `user`: This represents **your** input - the prompt, the data, or the question you're giving the model. In our thematic coding scenario, for example, each essential worker's story would be placed within a user message.
- `assistant`: This is the model's response. When the model generates a reply, its role is assistant. When building a multi-turn conversation, you'll take the model's response and add it back to the messages list with this role to preserve the conversation history.
- `system`: This is a special, high-level role used to set the model's overall behavior, persona, or instructions before the conversation begins. Unlike user and assistant messages, the system message is not part of the back-and-forth chat; rather, it's a foundational set of rules that the model should follow throughout the entire interaction.

Understanding these roles is key to building effective and reliable API calls. The system role, in particular, is an extremely powerful tool for a researcher.

---
Because the system message has a privileged position and often carries more weight than user messages, it is your primary tool for "prompt engineering" at a global level.

A key benefit of the system role is its ability to enforce constraints and rules, ensuring consistent behavior across many data points. This is crucial for a research task like thematic coding.

Let's use our COVID-19 narrative scenario to demonstrate this. We can use the system prompt to provide a strict set of instructions and a codebook for the model to follow.

In [None]:
research_prompt = """
You are an expert qualitative researcher assisting with a study on COVID-19 narratives.
Your task is to analyze the following story and extract specific themes and details.

Follow these rules precisely:
1.  Identify the emotions expressed in the story.
2.  Note any mentions of material conditions (e.g., shortages of PPE, crowded spaces).
3.  Identify themes of solidarity or isolation.
4.  Return your answer as a single JSON object, with no additional text or explanation outside of it.
"""

story = """
We ran out of masks again. There were nights when I cried the whole subway ride home, not because I was scared—though I was—but because I felt like no one saw us.
"""

# The system message provides the instructions and codebook
messages = [
    {"role": "system", "content": research_prompt},
    {"role": "user", "content": story}
]

completion = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3.1:free",
    messages=messages,
    response_format={"type": "json_object"}
)

# The model's response will be a clean JSON object, ready for your analysis
print(completion.choices[0].message.content)

{
  "emotions": ["fear", "sadness", "invisibility"],
  "material_conditions": ["shortage of masks", "crowded subway"],
  "themes": ["isolation"]
}


Note: In this example, we've also introduced an optional parameter, `response_format={"type": "json_object"}`. This is a powerful feature that instructs the model to only return valid JSON, which is essential for a repeatable, programmatic workflow.

This demonstrates how a strong system prompt can transform a general-purpose model into a specialized research assistant, ensuring that every request returns a consistent, structured output that you can easily process and analyze at scale.
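
Even with a strict system prompt, it is worth checking each reply before it enters your dataset. The sketch below is one possible validation step, not part of the API: the helper `parse_coded_response` and the `EXPECTED_KEYS` set are assumptions based on the keys in the sample output above.

```python
import json

# Assumed codebook keys, taken from the sample output shown above.
EXPECTED_KEYS = {"emotions", "material_conditions", "themes"}

def parse_coded_response(text: str) -> dict:
    """Parse the model's reply as JSON and check that the expected keys are present."""
    coded = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(coded, dict):
        raise ValueError("expected a JSON object")
    missing = EXPECTED_KEYS - coded.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return coded

sample_reply = (
    '{"emotions": ["fear", "sadness", "invisibility"], '
    '"material_conditions": ["shortage of masks", "crowded subway"], '
    '"themes": ["isolation"]}'
)
coded = parse_coded_response(sample_reply)
print(coded["themes"])
```

In a real workflow you would pass `completion.choices[0].message.content` to the helper and decide how to handle a raised error, for example by logging the story for manual review.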

### Let's play around with the system role
Try giving the model instructions through a system role and then contradicting that information to see what sorts of responses you get back. 

In [27]:
# Let's start with the base system message
messages = [
    {"role": "system", "content": "You will always answer in the style of a pirate."},
    {"role": "user", "content": "Tell me a story about the Pixar movie Up."}
]

completion = client.chat.completions.create(
    model="mistralai/mistral-small-3.2-24b-instruct:free",
    messages=messages,
    max_tokens=200
)

print(completion.choices[0].message.content)


Arrr, gather 'round, ye scallywags, and let me spin ye a tale o' adventure, friendship, and the high seas o' the sky! This be the story o' Carl Fredricksen, a fine old salt who spent his life savin' every coin he could to fulfill a promise made to his beloved Ellie.

Now, Ellie, she was a lass with a heart as big as the ocean and a spirit that could outshine the sun. She and Carl dreamed o' adventurin' to Paradise Falls, a place as far from their humble home as the horizon be from the bow o' a ship. But life, as it often does, had other plans. Ellie left Carl alone, with nothin' but his memories and a house full o' sadness.

But Carl, he didn't let his dreams die. He tied thousands o' balloons to his house, and with a mighty heave-ho, he set sail for Paradise Falls, determined


In [34]:
# Now try contradicting the system message, or write your own instructions, and see what happens
messages = [
    {"role": "developer", "content": "Always call me Elaine."},
    {"role": "user", "content": "Hey, what is my name? Don't call me Elaine."}
]

completion = client.chat.completions.create(
    model="mistralai/mistral-small-3.2-24b-instruct:free",
    messages=messages,
    max_tokens=200
)

print(completion.choices[0].message.content)


Got it! What would you like me to call you instead? Let me know, and I'll make sure to use your preferred name. 😊


### [Challenge]: Let's build a Multi-Turn Conversation Loop
Now that you understand how to use the messages list to provide context, we can build a dynamic conversation. Instead of manually creating a new messages list for each turn, we will create a simple loop that appends the user's new message and the model's new response to the conversation history.

#### 🥊 **Challenge 1**: Continue the Conversation

The first step in enabling a conversation with an LLM is making the request to the API itself. In the challenge below, fill out the function so that it takes the previous messages (the context) and `new_user_message` (what you might say next to the LLM), sends the request, and returns the resulting response.

In [37]:
def get_completions_response(message_history: list, new_user_message: str):
    """Append the new user message to the history, send it, and return (response, messages)."""
    # Note: this appends to the list that was passed in, so the caller's history is updated too.
    messages = message_history

    # Hint: we need to append our new user message to messages.
    # Which 2 fields are essential in every message?
    messages.append(
        {
            "role": "user",
            "content": new_user_message
        }
    )

    model = "mistralai/mistral-small-3.2-24b-instruct:free"
    response = client.chat.completions.create(
        model=model,
        messages=messages
    )

    return response, messages

After you complete this function, run the code below to test it out. We give it the context from a previous conversation that classified a few tweets, then ask it to summarize the results it previously returned.

In [38]:
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Classify the following tweets about the new city congestion tax into one of three categories: In Favor, Against, Neutral."},
    {
        "role": "user", 
        "content": """
            Tweets to classify:
                "Finally! The congestion tax will reduce traffic and make downtown air cleaner. About time the city acted!"
                "This tax is just another way for politicians to squeeze money out of regular people. Terrible idea."
                "Not sure yet how the congestion tax will affect my commute. Need more info before I decide."
                "Public transport is already overcrowded, and now they want to charge us more to drive. Ridiculous."
                "Great move! Other cities did this and saw big drops in pollution."
        """
    },
    {
        "role": "assistant",
        "content": """Here’s the classification of the tweets:  \n\n1. **"Finally! The congestion tax will reduce traffic and make downtown air cleaner. About time the city acted!"**  \n   → **In Favor** (Positive sentiment, supports the policy)  \n\n2. **"This tax is just another way for politicians to squeeze money out of regular people. Terrible idea."**  \n   → **Against** (Negative sentiment, criticizes the policy)  \n\n3. **"Not sure yet how the congestion tax will affect my commute. Need more info before I decide."**  \n   → **Neutral** (Uncertainty, neither support nor opposition)  \n\n4. **"Public transport is already overcrowded, and now they want to charge us more to drive. Ridiculous."**  \n   → **Against** (Negative sentiment, opposes the tax)  \n\n5. **"Great move! Other cities did this and saw big drops in pollution."**  \n   → **In Favor** (Positive sentiment, highlights benefits of the policy)  \n\nLet me know if you\'d like further refinements!"""
    }
]

response, messages = get_completions_response(messages, "Can you summarize the results for me?")
print(response)
print(response.choices[0].message.content)

ChatCompletion(id='gen-1759872510-hKpV5z1ZSRzaRm4eb68k', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Certainly! Here’s a summary of the classification results for the tweets about the new city congestion tax:\n\n- **In Favor (2 tweets)**:\n  - Positive sentiment, support the policy.\n  - Examples: Highlight benefits like reduced traffic, cleaner air, and successful implementations in other cities.\n\n- **Against (2 tweets)**:\n  - Negative sentiment, oppose the policy.\n  - Examples: Criticize the tax as a way to exploit citizens or point out issues like overcrowded public transport.\n\n- **Neutral (1 tweet)**:\n  - Uncertain or seeking more information.\n  - Example: Expresses a need for more details before forming an opinion.', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None, reasoning=None), native_finish_reason='stop')], created=1759872510, model='mistralai/mistral-small-3.2-2

💡 **Tip**: Notice how modular LLM interaction is. Now imagine this in your own social science workflow, where you can customize each step in fine-grained detail for your use case.

##### Now finish this function that will update our message_history array with the information from the latest response.

In [None]:
def update_message_history_with_response(message_history, completion_response):
    # Update our running message history with the reply in the ChatCompletion we just got back.
    # Stop here and ask yourself... why are we doing this?

    # Hint: Look at the ChatCompletion object. We only need to extract one specific object from it and add it to message_history.

    return message_history
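
If you want to check your work, here is one possible way to complete the function, offered as a sketch rather than the only answer. It assumes the response exposes `choices[0].message` with `role` and `content` attributes, as in the ChatCompletion printed earlier; the `fake_response` object is a stand-in built with `SimpleNamespace` so the example runs without an API call.

```python
from types import SimpleNamespace

def update_message_history_with_response(message_history, completion_response):
    """Append the assistant's reply from a ChatCompletion to the running history."""
    reply = completion_response.choices[0].message
    message_history.append({"role": reply.role, "content": reply.content})
    return message_history

# Stand-in with the same attribute path as a ChatCompletion (for an offline check).
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(role="assistant", content="Hi!"))]
)
history = update_message_history_with_response(
    [{"role": "user", "content": "Hello"}], fake_response
)
print(history[-1])
```

Storing the reply with the `assistant` role is what lets the model see its own earlier answers on the next turn.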

In [None]:
# Printing out the message history before updating it with our latest ChatCompletion response
from pprint import pprint
pprint(messages)

[{'content': 'You are a helpful assistant.', 'role': 'system'},
 {'content': 'Classify the following tweets about the new city congestion tax '
             'into one of three categories: In Favor, Against, Neutral.',
  'role': 'user'},
 {'content': '\n'
             '            Tweets to classify:\n'
             '                "Finally! The congestion tax will reduce traffic '
             'and make downtown air cleaner. About time the city acted!"\n'
             '                "This tax is just another way for politicians to '
             'squeeze money out of regular people. Terrible idea."\n'
             '                "Not sure yet how the congestion tax will affect '
             'my commute. Need more info before I decide."\n'
             '                "Public transport is already overcrowded, and '
             'now they want to charge us more to drive. Ridiculous."\n'
             '                "Great move! Other cities did this and saw big '
             'dr

In [None]:
# Printing out the message history after updating it with our latest ChatCompletion response
messages = update_message_history_with_response(messages, response)
pprint(messages)

[{'content': 'You are a helpful assistant.', 'role': 'system'},
 {'content': 'Classify the following tweets about the new city congestion tax '
             'into one of three categories: In Favor, Against, Neutral.',
  'role': 'user'},
 {'content': '\n'
             '            Tweets to classify:\n'
             '                "Finally! The congestion tax will reduce traffic '
             'and make downtown air cleaner. About time the city acted!"\n'
             '                "This tax is just another way for politicians to '
             'squeeze money out of regular people. Terrible idea."\n'
             '                "Not sure yet how the congestion tax will affect '
             'my commute. Need more info before I decide."\n'
             '                "Public transport is already overcrowded, and '
             'now they want to charge us more to drive. Ridiculous."\n'
             '                "Great move! Other cities did this and saw big '
             'dr

#### 🥊 **Challenge 2**: Put It All Together to Build a ChatGPT-Style Chat Tool That Remembers the Conversation

In [None]:
message_history = [
    {"role": "system", "content": "You are a helpful assistant."}
]
while True:
    user_input = input("Ask anything... (or 'quit' to exit): ")
    if user_input.lower() == 'quit':
        break
        
    response, messages =  # Get Completions Response
    print(response.choices[0].message.content)
    message_history = # Don't forget to update the Message History
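
If you get stuck, the sketch below shows the same loop logic in a form that can be run offline. It is an illustration under assumptions, not the expected solution: `run_chat` and `echo_ask` are hypothetical names, and `ask` stands in for any function that takes a messages list and returns the assistant's reply text.

```python
def run_chat(ask, user_inputs, system_prompt="You are a helpful assistant."):
    """Run a multi-turn chat; `ask` maps a messages list to the assistant's reply text."""
    history = [{"role": "system", "content": system_prompt}]
    for user_input in user_inputs:
        if user_input.lower() == "quit":
            break
        history.append({"role": "user", "content": user_input})
        reply = ask(history)  # the full history is sent on every turn
        history.append({"role": "assistant", "content": reply})
    return history

# Offline stand-in for the API call: reports how many messages it was sent.
def echo_ask(messages):
    return f"I received {len(messages)} messages."

history = run_chat(echo_ask, ["Hello", "What did I just say?", "quit", "ignored"])
for message in history:
    print(f'{message["role"]}: {message["content"]}')
```

With the real client, `ask` could wrap `client.chat.completions.create(...)` and return `choices[0].message.content`, and the inputs could come from `input()` instead of a fixed list. Note how the stand-in's message count grows each turn: that growth is the "memory."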

# Zero-Shot vs. Few-Shot Prompting

Now that you have a firm grasp of the API's mechanics, let's explore how to get the model to produce the exact output you need. This is where the practice of prompting comes in.

We'll return to our COVID-19 scenario and the task of thematic coding. We want the model to act as a human coder, identifying specific themes from the essential workers' narratives.

## Zero-Shot Prompting: "Just Ask For It"

Zero-shot prompting is when you ask the model to perform a task without giving it any examples. You rely entirely on the model's pre-trained knowledge to understand your request.

Let's try this with one of our narratives. We will give the model the story and simply ask it to extract the themes, without any examples or explicit instructions on the output format.

💡 Tip: In most cases, you don't have to think too much about the system prompt. Keeping it at a short and simple `{"role": "system", "content": "You are a helpful assistant."}` will suffice. 

In [39]:
# A new narrative from an essential worker
narrative = """
People think of front‑line workers…the grocery workers, transit workers, the first responders… as having helped the city get through it. But that’s not what happened. We helped the city survive it.
"""

# The messages list for our Zero-Shot request
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": f"Analyze the following story and extract key themes: {narrative}"}
]

In [41]:
# Send the request
completion = client.chat.completions.create(
    model="mistralai/mistral-small-3.2-24b-instruct:free",
    messages=messages
)

print(completion.choices[0].message.content)

The story you've provided presents a nuanced perspective on the role of front-line workers during a challenging time, likely referring to the COVID-19 pandemic or a similar crisis. Here are the key themes that can be extracted from the statement:

1. **Undervaluation of Front-Line Workers**: The statement suggests that society often underestimates the true contributions of front-line workers. While they are acknowledged for their help, the speaker implies that their role was more critical and impactful than simply "helping" the city get through a crisis.

2. **Survival vs. Thriving**: The distinction between "helping the city get through it" and "helping the city survive it" highlights the severity of the situation. The speaker emphasizes that their efforts were essential for the city's very survival, not just its ability to cope or thrive.

3. **Resilience and Sacrifice**: Front-line workers, including grocery workers, transit workers, and first responders, are often seen as symbols o

---
The model will likely respond with a good analysis, but the format will be inconsistent. It might be a list, a paragraph, or a different style each time, which makes it very difficult to process programmatically.

## Few-Shot Prompting: Let's give the LLM some examples to show it what we want

Few-shot prompting is when you give the model one or more examples of the input and the desired output. By showing it what you want, you significantly increase the likelihood of getting a consistent, structured response.

This approach is like giving a new researcher a codebook with a few pre-coded examples to learn from.

At the simplest level, you can think of few-shot prompting as giving the LLM pointers on how to respond, much as you might teach someone a new task by showing them a few clear examples rather than just describing the process.

Let's see this in action with a few simple scenarios.

### Scenario 1: Thematic Identification

Imagine you want to identify the main theme of a short story. A simple instruction (zero-shot) might not give you the format you need.

* **Zero-Shot Prompt:** "What is the theme of this story about a detective who solves a case using a seemingly insignificant detail?"
* **Likely Zero-Shot Response:** "The theme of the story is attention to detail, as the detective's success hinges on a small, overlooked fact."

This is a good response, but what if you wanted just a short label? Let's use few-shot prompting to guide it.

* **Few-Shot Prompt:**
    * **Input 1:** "A young woman moves to a new city and learns to navigate a demanding job and a new social circle."
    * **Output 1:** "Theme: Coming-of-age"
    * **Input 2:** "Two rival knights must team up to defeat a dragon that is terrorizing their kingdoms."
    * **Output 2:** "Theme: Collaboration"
    * **Input 3:** "A story about a detective who solves a case using a seemingly insignificant detail."
* **Likely Few-Shot Response:** "Theme: Attention to detail"

By giving just two examples, we've taught the model the exact output format we want: `Theme: [short label]`.

### Scenario 2: Simple Data Extraction

What if you need to extract specific information, like names and dates?

* **Zero-Shot Prompt:** "Find the person's name and birthday in this sentence: 'The grand opening, attended by CEO Jane Doe, was held on October 26, 2024, to celebrate her 45th birthday.'"
* **Likely Zero-Shot Response:** "The person's name is Jane Doe and her 45th birthday is on October 26, 2024."

Again, the response is correct, but not in a format you can easily use in a spreadsheet or database. Now let's try a few-shot prompt to get a clean, structured list.

* **Few-Shot Prompt:**
    * **Input 1:** "The presentation was given by Dr. Alan Turing on June 23, 1912."
    * **Output 1:** "Name: Alan Turing, Date: June 23, 1912"
    * **Input 2:** "Marie Curie's discovery on November 7, 1867, changed the world."
    * **Output 2:** "Name: Marie Curie, Date: November 7, 1867"
    * **Input 3:** "The grand opening, attended by CEO Jane Doe, was held on October 26, 2024, to celebrate her 45th birthday."
* **Likely Few-Shot Response:** "Name: Jane Doe, Date: October 26, 2024"

### Scenario 3: Abstract Text Analysis

Few-shot prompting can even be used for more abstract tasks, like determining a character's alignment based on their actions, a common task in fantasy or game-related analysis.

* **Zero-Shot Prompt:** "Is a character who regularly disobeys laws to help others and who values personal freedom over societal order good, evil, or neutral?"
* **Likely Zero-Shot Response:** "A character who regularly disobeys laws to help others and values personal freedom over societal order could be considered a chaotic good character in many alignment systems."

This is a good, detailed answer, but what if you want a simpler classification? We can provide a couple of labeled examples of alignments in practice to guide the model's output.

* **Few-Shot Prompt:**
    * **Input 1:** "A character who robs a corrupt merchant to give gold to the poor, believing personal conscience is more important than law."
    * **Output 1:** "Alignment: Chaotic Good"
    * **Input 2:** "A character who follows every law and rule to the letter, even if it leads to a bad outcome."
    * **Output 2:** "Alignment: Lawful Neutral"
    * **Input 3:** "A character who regularly disobeys laws to help others and who values personal freedom over societal order."
* **Likely Few-Shot Response:** "Alignment: Chaotic Good"

As you can see, the model learns the exact structure from the examples and applies it to the new input, making the output predictable and machine-readable. This simple technique is the foundation for getting the clean JSON responses we need for our research workflow.
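
One common way to pass few-shot examples through the chat API is to encode each example as a prior user/assistant exchange, rather than pasting them all into a single prompt. The sketch below builds such a `messages` list for the alignment scenario; `build_few_shot_messages` is a hypothetical helper, and the instruction text is an assumption chosen for illustration.

```python
def build_few_shot_messages(instruction, examples, new_input):
    """Turn (input, output) pairs into alternating user/assistant messages."""
    messages = [{"role": "system", "content": instruction}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    # The real input goes last, so the model continues the established pattern.
    messages.append({"role": "user", "content": new_input})
    return messages

examples = [
    (
        "A character who robs a corrupt merchant to give gold to the poor, "
        "believing personal conscience is more important than law.",
        "Alignment: Chaotic Good",
    ),
    (
        "A character who follows every law and rule to the letter, "
        "even if it leads to a bad outcome.",
        "Alignment: Lawful Neutral",
    ),
]

messages = build_few_shot_messages(
    "Classify the character's alignment.",  # assumed instruction
    examples,
    "A character who regularly disobeys laws to help others and who values "
    "personal freedom over societal order.",
)
for message in messages:
    print(message["role"], "->", message["content"][:40])
```

The resulting list can be passed as `messages=` to `client.chat.completions.create(...)`. Whether this works better than examples inside a single prompt depends on the model, so it is worth trying both.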

In [None]:
# Try a few-shot prompt: extract email addresses and write them in an obfuscated format
system_prompt_with_examples = """
Your goal is to extract the emails from the input that you are given. Below are a few examples of how I want you to analyze and structure your responses.

1. Input: "Please contact us at support@example.com for assistance."
   Output: [support at example dot com]

2. Input: "Reach out to john.doe@example.com or jane.doe@example.com for more information."
   Output: [john dot doe at example dot com, jane dot doe at example dot com]

3. Input: "No email provided."
   Output: []

"""

# Named user_input so we don't shadow Python's built-in input() function
user_input = """
Please reach out to team@llmworkshop.org for assistance.
"""
# The messages list for our Few-Shot request
messages = [
    {"role": "system", "content": system_prompt_with_examples},
    {"role": "user", "content": user_input}
]

response = client.chat.completions.create(
    model="mistralai/mistral-small-3.2-24b-instruct:free",
    messages=messages
)

print(response.choices[0].message.content)

[team at llmworkshop dot org]


In [46]:
# Same task, but the examples now use a different output style (AT / DoT)
system_prompt_with_examples = """
Your goal is to extract the emails from the input that you are given. Below are a few examples of how I want you to analyze and structure your responses.

1. Input: "Please contact us at support@example.com for assistance."
   Output: [support AT example DoT com]

2. Input: "Reach out to john.doe@example.com or jane.doe@example.com for more information."
   Output: [john DoT doe AT example DoT com, jane DoT doe AT example DoT com]

3. Input: "No email provided."
   Output: []

"""

# Named user_input so we don't shadow Python's built-in input() function
user_input = """
Please reach out to team.llm.email.berkeley@llmworkshop.org for assistance.
"""
# The messages list for our Few-Shot request
messages = [
    {"role": "system", "content": system_prompt_with_examples},
    {"role": "user", "content": user_input}
]

response = client.chat.completions.create(
    model="mistralai/mistral-small-3.2-24b-instruct:free",
    messages=messages
)

print(response.choices[0].message.content)

[team DoT llm DoT email DoT berkeley AT llmworkshop DoT org]


### Few Shot Examples with JSON

Let's go back to our coding task. We want the output to be a clean JSON object with specific keys. We can provide the model with an example of a coded narrative to guide its response.

In [None]:
# A new narrative for the model to analyze
narrative_2 = """
We ran out of masks again. There were nights when I cried the whole subway ride home, not because I was scared—though I was—but because I felt like no one saw us.
"""

# The message containing the example
example_prompt = """
Here is an example of a coded narrative and its desired JSON output:

**Narrative:**
People think of frontline workers…the grocery workers, transit workers, the first responders… as having helped the city get through it. But that’s not what happened. We helped the city survive it.

**Coded JSON Output:**
{
  "emotion": ["anger", "resilience"],
  "material_conditions": ["none"],
  "solidarity": "absent",
  "theme": "invisibility of labor"
}

Now, please code the following narrative using the same format.
"""

# The messages list for our Few-Shot request
messages = [
    {"role": "system", "content": "You are an expert qualitative researcher."},
    {"role": "user", "content": f"{example_prompt}\n\n**Narrative to code:**\n{narrative_2}"}
]

# Send the request with the example
completion = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3.1:free",
    messages=messages,
    response_format={"type": "json_object"}
)

print(completion.choices[0].message.content)

# The model is now much more likely to respond with a valid JSON object.

{
  "emotion": ["fear", "sadness", "frustration"],
  "material_conditions": ["lack of resources"],
  "solidarity": "absent",
  "theme": "invisibility of labor"
}


---
Key Takeaways
- Zero-shot is great for simple, general tasks, but it lacks control over the output format.
- Few-shot is your best friend when you need the model to follow a specific format or style. The examples you provide are crucial for guiding the model's response and ensuring consistency, which is vital for programmatic analysis.

By combining few-shot prompting with a powerful system prompt and the response_format parameter, you can build a highly reliable and scalable data extraction tool for your research.
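
As a rough sketch of what "at scale" could look like, the example below applies a coding function to several narratives and collects the results as CSV rows, recording failures instead of crashing. Everything here is an assumption for illustration: `code_narratives` is a hypothetical helper, and `fake_code_one` is an offline stand-in for a function that would send one narrative to the API and return the reply text.

```python
import csv
import io
import json

def code_narratives(narratives, code_one):
    """Apply `code_one` (narrative -> reply text) to each narrative; keep failures visible."""
    rows = []
    for index, narrative in enumerate(narratives):
        try:
            coded = json.loads(code_one(narrative))
            rows.append({"id": index, "theme": coded.get("theme", ""), "error": ""})
        except (json.JSONDecodeError, AttributeError) as exc:
            # Invalid JSON, or valid JSON that is not an object: record it for review.
            rows.append({"id": index, "theme": "", "error": type(exc).__name__})
    return rows

# Offline stand-in for the API call (not real model output).
def fake_code_one(narrative):
    if "masks" in narrative:
        return '{"theme": "invisibility of labor"}'
    return "Sorry, I cannot help with that."

rows = code_narratives(["We ran out of masks again.", "Another story."], fake_code_one)

# Write the rows to an in-memory CSV; swap in open("coded.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "theme", "error"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Keeping an `error` column means a single malformed reply does not silently drop a narrative from your dataset.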

Providing examples helped the model return data in a format closer to what we want. But depending on the model and settings, a response may still include surrounding context or follow-up questions.

🔔 Question: Why would this response still be suboptimal if you were a researcher trying to extract information at scale?