# Playing with LLM APIs

In this notebook, we will use large language models (LLMs) through an API. Most LLM providers offer an API to use their LLMs programmatically (for instance, with Python).

[Groq](https://groq.com/) provides free access (as of October 2025) to open LLMs, like Llama from Meta or Mixtral from Mistral AI ([list of models](https://console.groq.com/docs/models)).

## Get credentials and API key

- Create a free account on Groq: <https://console.groq.com/login>

  You can use your GitHub or Google account for authentication.
- Then go to [Groq's API keys](https://console.groq.com/keys) page.
- Click the black "Create API Key" button.
- Choose a name for your key.
- Copy your API key and paste it somewhere safe. For security reasons, your API key won't be shown again. Your API key should look like: `gsk_W1Jz47u5ieJs28D8m...`

## Setup

Please check that you have configured your environment properly with uv (see [setup](../setup.md)).

In [5]:
# Ignore this cell.
from dotenv import load_dotenv
load_dotenv()

True

## Use Groq's API

Declare your API key with the following command:

In [6]:
import os
if "GROQ_API_KEY" not in os.environ:
    os.environ["GROQ_API_KEY"] = "YOUR-API-KEY-HERE"

Replace `YOUR-API-KEY-HERE` with your real Groq API key.
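
If you prefer not to leave the key in the notebook, one option is to ask for it interactively. The following is a minimal sketch, not part of the original setup: `ensure_api_key` is a hypothetical helper, and it assumes the standard-library `getpass` prompt works in your notebook environment.

```python
import os
from getpass import getpass


def ensure_api_key(name="GROQ_API_KEY", ask=getpass):
    """Return the key stored in the environment, asking for it only if missing.

    `ensure_api_key` is a hypothetical helper written for this sketch.
    `ask` is any function that takes a prompt string and returns the key.
    """
    if not os.environ.get(name):
        # The typed key is not echoed on screen when `ask` is getpass.
        os.environ[name] = ask(f"Enter {name}: ").strip()
    return os.environ[name]
```

Calling `ensure_api_key()` once before creating the client prompts for the key only when `GROQ_API_KEY` is not already set.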

Then define a helper function:

In [13]:
import os
from groq import Groq

client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

def chat(prompt=None, model=None):
    chat_completion = client.chat.completions.create(
        messages=[ {"role": "user", "content": prompt} ],
        model=model,
    )
    return chat_completion.choices[0].message.content

Now you can send your first prompt to a Llama model:

In [14]:
chat(
    prompt="Explain in a short text what is bioinformatics to a middle school student",
    model="llama-3.3-70b-versatile"
)

"Bioinformatics is a field that combines biology and computers to understand living things. It's like being a detective for genes and cells. Scientists use computers to analyze huge amounts of data about DNA, proteins, and other biological things to figure out how they work, how they're related, and how they can be used to help people. It's a cool way to use technology to learn more about life and improve our health!"

Look at the [models available on Groq](https://console.groq.com/docs/models) and try two different models.

```{warning}
Be sure to select an LLM that works with chat. For instance, `whisper-large-v3-turbo` is not a chat model.
```

Here is another example with `openai/gpt-oss-20b`:

In [16]:
chat(
    prompt="Explain in a short text what is bioinformatics to a middle school student",
    model="openai/gpt-oss-20b"
)

'**What is bioinformatics?**  \nThink of biology (the study of living things) as a giant library full of books, but instead of written pages, the books are made of DNA, proteins, and other molecules. Bioinformatics is the science that uses computers to read, organize, and analyze all that information—just like a super‑fast librarian that can find patterns, solve puzzles, and predict how a plant or a virus will behave. By crunching data with algorithms, scientists can discover new medicines, learn why certain genes cause disease, and even design better crops. In short, bioinformatics is biology + computer science, turning complex data into useful knowledge.'
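
The `chat` helper above sends a single message with the `user` role. Chat APIs of this style generally also accept a `system` message that sets the overall behavior of the model. The sketch below only builds the `messages` list; `build_messages` is a hypothetical helper, and support for the `system` role by a given model is an assumption to check in Groq's documentation.

```python
def build_messages(prompt, system=None):
    """Build the list of messages for a chat completion request.

    `build_messages` is a hypothetical helper written for this sketch.
    The optional `system` text is placed first, before the user prompt.
    """
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


example = build_messages(
    prompt="Explain in a short text what is bioinformatics",
    system="You are a teacher talking to middle school students.",
)
print(example)
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`, in place of the single-item list used in `chat`.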

## Are LLMs good at math?

> Spoiler: no!

In the following, we will assess the performance of an LLM on simple arithmetic.

### Find the correct prompt

Design a prompt that forces the LLM to output only the result of the product of two integers. Test your prompt several times with different integers.

In [17]:
chat(
    prompt="YOUR-CLEVER-PROMPT-HERE",
    model="llama-3.3-70b-versatile",
)

'It seems like you forgot to include a prompt. What would you like to talk about or ask? I can provide information, answer questions, or engage in conversation on a wide range of topics.'
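
To check whether a prompt really forces a number-only answer, you can test the reply automatically. This is a minimal sketch under one assumption: a valid reply is a single integer, possibly with commas as thousands separators. `parse_integer_reply` is a hypothetical helper, not part of the Groq library.

```python
import re


def parse_integer_reply(reply):
    """Return the integer if the reply contains only a number, else None."""
    # Drop surrounding whitespace and thousands separators such as "269,844".
    cleaned = reply.strip().replace(",", "")
    if re.fullmatch(r"-?\d+", cleaned):
        return int(cleaned)
    return None


print(parse_integer_reply("269,844"))               # → 269844
print(parse_integer_reply("The result is 49020."))  # → None
```

A reply that returns `None` means the prompt still lets the model add extra text.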

### LLM vs Python

In the following code, we repeatedly ask an LLM for the product of two integers and compare its output to the correct answer computed by Python.

Take some time to understand how the code works.

Run the code, play with it and make some improvements.

In [19]:
from random import randint

correct_answers = 0
total_attempts = 20

for _ in range(total_attempts):
    number_1 = randint(100, 1_000)
    number_2 = randint(100, 1_000)
    llm_result = chat(
        prompt=f"What is the result of {number_1} times {number_2}. Only reply with the result",
        model="llama-3.3-70b-versatile"
    )
    # Remove surrounding whitespace, thousands separators and trailing periods from LLM output.
    llm_result = llm_result.strip().replace(",", "").replace(".", "")
    # Get Python result.
    python_result = number_1 * number_2
    print(f"{number_1} x {number_2} = {python_result} | {llm_result} ", end="")
    # Check result.
    if llm_result == str(python_result):
        correct_answers += 1
        print("OK")
    else:
        print("Wrong!")

print(f"Success rate: {correct_answers/total_attempts:.0%}")

597 x 452 = 269844 | 270244 Wrong!
215 x 228 = 49020 | 48920 Wrong!
617 x 426 = 262842 | 263022 Wrong!
596 x 279 = 166284 | 166524 Wrong!
812 x 175 = 142100 | 142300 Wrong!
638 x 953 = 608014 | 608134 Wrong!
762 x 217 = 165354 | 165654 Wrong!
552 x 308 = 170016 | 170016 OK
644 x 546 = 351624 | 351704 Wrong!
730 x 852 = 621960 | 621360 Wrong!
614 x 983 = 603562 | 604502 Wrong!
828 x 804 = 665712 | 664512 Wrong!
653 x 193 = 126029 | 126049 Wrong!
510 x 499 = 254490 | 254490 OK
277 x 146 = 40442 | 40342 Wrong!
841 x 405 = 340605 | 340605 OK
229 x 198 = 45342 | 45362 Wrong!
439 x 511 = 224329 | 224069 Wrong!
582 x 268 = 155976 | 156216 Wrong!
789 x 178 = 140442 | 140502 Wrong!
Success rate: 15%
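
One possible improvement is to wrap the loop in a function so that the experiment is reproducible and can be tried without calling the API. The sketch below makes several assumptions: `evaluate` and `fake_llm` are hypothetical names, `ask` is any function that takes a prompt and returns a text reply, and a fixed seed is used so the same integers are drawn on each run.

```python
import re
from random import Random


def evaluate(ask, attempts=20, seed=0):
    """Return the fraction of products that `ask` gets right."""
    rng = Random(seed)  # Fixed seed: same pairs of integers on every run.
    correct = 0
    for _ in range(attempts):
        a, b = rng.randint(100, 1_000), rng.randint(100, 1_000)
        reply = ask(f"What is the result of {a} times {b}. Only reply with the result")
        # Same cleaning as in the loop above.
        cleaned = reply.strip().replace(",", "").replace(".", "")
        if cleaned == str(a * b):
            correct += 1
    return correct / attempts


def fake_llm(prompt):
    """Stand-in for a real model: reads the two integers and multiplies them."""
    a, b = (int(n) for n in re.findall(r"\d+", prompt))
    return f"{a * b:,}"  # Formatted with commas, like some LLM replies.


print(f"Success rate: {evaluate(fake_llm):.0%}")  # → Success rate: 100%
```

To score a real model, pass something like `lambda p: chat(prompt=p, model="llama-3.3-70b-versatile")` as `ask`. In the run shown above, 3 correct answers out of 20 attempts give 3/20 = 15%.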
