# Playing with LLM APIs

In this notebook, we will use large language models (LLMs) through an API. Most LLM providers offer an API to use their LLMs programmatically (for instance, with Python).

[Groq](https://groq.com/) provides free access (as of October 2025) to open LLMs, such as Llama from Meta or GPT OSS from OpenAI ([list of models](https://console.groq.com/docs/models)).

## Get credentials and API key

- Create a free account on Groq: <https://console.groq.com/login>

  You can use your GitHub or Google account for authentication.
- Then go to [Groq's API keys](https://console.groq.com/keys) page.
- Click the button "+ Create API Key".
- Choose a name for your key.
- Copy your API key and paste it somewhere safe. For security reasons, your API key won't be shown again. Your API key should look like: `gsk_W1Jz47u5ieJs28D8m...`

## Setup

Please check that you have configured your environment properly with uv (see [setup](../setup.md)).

In [1]:
# Ignore this cell.
from dotenv import load_dotenv
load_dotenv()

True

## Use Groq's API

First, declare your API key with the following code:

In [2]:
import os
if "GROQ_API_KEY" not in os.environ:
    os.environ["GROQ_API_KEY"] = "YOUR-API-KEY-HERE"

Replace `YOUR-API-KEY-HERE` with your real Groq API key.
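
A key pasted into a notebook is easy to leak when the file is shared. As an alternative, here is a minimal sketch that asks for the key interactively only when the environment variable is missing. The helper `ensure_api_key` and its `ask` and `env` parameters are hypothetical names introduced for illustration; they are not part of the Groq SDK.

```python
import os
from getpass import getpass


def ensure_api_key(name="GROQ_API_KEY", ask=getpass, env=os.environ):
    """Return the key stored under `name`, asking for it once if it is missing.

    Hypothetical helper for illustration. `ask` and `env` are parameters
    so that the function can be tried without a terminal.
    """
    if not env.get(name):
        # getpass hides the typed characters, so the key never appears on screen.
        env[name] = ask(f"Enter {name}: ").strip()
    return env[name]
```

Calling `ensure_api_key()` in a cell would then replace the assignment above: the key is stored in `os.environ` for the session but never written into the notebook.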

Then, define a helper function:

In [3]:
import os
from groq import Groq

client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

def chat(prompt=None, model=None):
    """Send a single user prompt to the given model and return the text of its reply."""
    chat_completion = client.chat.completions.create(
        messages=[{"role": "user", "content": prompt}],
        model=model,
    )
    return chat_completion.choices[0].message.content
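
The helper above sends a single user message. A possible variant, sketched below, also accepts an optional system message and a sampling temperature. The names `build_messages` and `chat_with_system` are hypothetical, and the sketch assumes that the client's `chat.completions.create` accepts a `temperature` argument, as OpenAI-compatible chat APIs generally do; check Groq's documentation before relying on it.

```python
def build_messages(prompt, system=None):
    """Build the `messages` list; an optional system message comes first."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


def chat_with_system(client, prompt, model, system=None, temperature=0):
    """Hypothetical variant of `chat` that takes the client as an argument.

    Assumption: `temperature` is accepted by `chat.completions.create`.
    A value of 0 makes the replies less variable between runs.
    """
    completion = client.chat.completions.create(
        messages=build_messages(prompt, system),
        model=model,
        temperature=temperature,
    )
    return completion.choices[0].message.content
```

With the `client` defined above, a call could look like `chat_with_system(client, "What is DNA?", "llama-3.3-70b-versatile", system="Answer in one sentence.")`.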

Now you can send your first prompt to a Llama model:

In [4]:
chat(
    prompt="Explain in a short text what is bioinformatics to a middle school student",
    model="llama-3.3-70b-versatile"
)

"Bioinformatics is a field that combines biology and computer science. It's like being a detective for living things. Bioinformaticians use computers to analyze and understand data about genes, proteins, and other biological things. They look for patterns and clues in the data to help us learn more about how living things work, how diseases happen, and how we can stay healthy. It's a really cool way to use computers to help us understand the world of biology and make new discoveries."

Look at the [LLMs available on Groq](https://console.groq.com/docs/models) and try a different model.

```{warning}
Be sure to select an LLM that works with chat. For instance, `whisper-large-v3-turbo` is not a chat model.
```

Here is another example with `openai/gpt-oss-20b`:

In [5]:
chat(
    prompt="Explain in a short text what is bioinformatics to a middle school student",
    model="openai/gpt-oss-20b"
)

'**Bioinformatics** is the science of using computers to help scientists understand biology.  \nJust like a detective uses clues to solve a mystery, bioinformaticians use software and algorithms to read and analyze the huge amount of information hidden in DNA, proteins, and cells.  \nBy comparing many DNA sequences, they can find genes that make people healthy or sick, discover how plants grow, or design new medicines.  \nThink of it as a giant digital laboratory where the data are the clues and the computer is the super‑smart helper.'

## Are LLMs good at math?

> Spoiler: no!

In the following, we will assess the performance of an LLM on simple math calculations.

### Find the correct prompt

Design a prompt that forces the LLM to output only the result of the product of two integers. Test your prompt several times with different integers.

In [6]:
chat(
    prompt="YOUR-CLEVER-PROMPT-HERE",
    model="llama-3.3-70b-versatile",
)

'You didn\'t give me a prompt, so I\'ll have to come up with something. \n\nLet\'s play a game of "Story Chain." I\'ll start telling a story, and then stop at a cliffhanger. Your task is to continue the story in your own words, and then I\'ll respond with the next part of the story, and so on.\n\nHere\'s my first paragraph:\n\nIn a world where time was currency, the rich lived forever and the poor were left with nothing but the ticking of their clocks. The city of Chronos was the epicenter of this strange economy, where people traded years of their lives for material possessions and experiences. Amidst the hustle and bustle of the city, a young woman named Aria lived a life of quiet desperation. She had only a few hours left on her clock, and she was determined to make the most of it.\n\nNow it\'s your turn! What happens next to Aria?'

## LLM vs Python

In the following code, we will ask an LLM for the product of two integers and compare its output to the correct answer computed by Python.

Take some time to understand how the code works.

Run the code, play with it and make some improvements.

In [7]:
from random import randint

correct_answers = 0
total_attempts = 20

for _ in range(total_attempts):
    number_1 = randint(100, 1_000)
    number_2 = randint(100, 1_000)
    llm_result = chat(
        prompt=f"What is the result of {number_1} times {number_2}. Only reply with the result",
        model="llama-3.3-70b-versatile"
    )
    # Remove extra characters from LLM output.
    llm_result = llm_result.replace(",", "").replace(".", "")
    # Get Python result.
    python_result = number_1 * number_2
    print(f"{number_1} x {number_2} = {python_result} | {llm_result} ", end="")
    # Check result.
    if llm_result == str(python_result):
        correct_answers += 1
        print("OK")
    else:
        print("Wrong!")

print(f"Success rate: {correct_answers/total_attempts:.0%}")

695 x 452 = 314140 | 313940 Wrong!
907 x 712 = 645784 | 646084 Wrong!
697 x 809 = 563873 | 564793 Wrong!
126 x 383 = 48258 | 48318 Wrong!
291 x 869 = 252879 | 252879 OK
105 x 809 = 84945 | 84845 Wrong!
777 x 582 = 452214 | 451734 Wrong!
738 x 602 = 444276 | 444876 Wrong!
402 x 733 = 294666 | 294726 Wrong!
113 x 175 = 19775 | 19775 OK
284 x 691 = 196244 | 196124 Wrong!
764 x 585 = 446940 | 447060 Wrong!
798 x 122 = 97356 | 97316 Wrong!
489 x 695 = 339855 | 339855 OK
482 x 307 = 147974 | 148054 Wrong!
327 x 768 = 251136 | 250416 Wrong!
541 x 935 = 505835 | 505615 Wrong!
711 x 141 = 100251 | 100011 Wrong!
975 x 577 = 562575 | 562775 Wrong!
472 x 334 = 157648 | 157648 OK
Success rate: 20%
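
One possible improvement to the loop above: the comparison relies on the reply being exactly the digits of the number, so a correct answer wrapped in a sentence or in Markdown bold would be counted as wrong. The sketch below uses a hypothetical helper, `extract_integer`, that pulls the last integer out of a free-form reply. It is only one way to parse the output and does not handle decimals or negative numbers.

```python
import re


def extract_integer(text):
    """Return the last integer found in `text`, or None if there is none.

    Hypothetical helper for illustration. Commas used as thousands
    separators (as in "313,940") are dropped before searching.
    """
    # Remove a comma only when it sits between a digit and a group of three digits.
    cleaned = re.sub(r"(?<=\d),(?=\d{3}\b)", "", text)
    matches = re.findall(r"\d+", cleaned)
    return int(matches[-1]) if matches else None
```

In the loop, the check would then become `extract_integer(llm_result) == python_result`, comparing integers instead of strings.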
