# AI Agent

Taking the bull by the horns

François-David Collin (CNRS, IMAG, Paul-Valéry Montpellier 3
University)  
Wednesday, August 27, 2025

For this practical work, you need the following Python packages:

-   `openai`
-   `python-dotenv`
-   `faiss-cpu`
-   `numpy`

# Hello World

Get the example from the course working.

``` python
from openai import OpenAI
client = OpenAI()

chat_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "What is the best French cheese?",
        },
    ]
)
print(chat_response.choices[0].message.content)
```

Read the documentation of the `OpenAI()` constructor to find out how to
point the client at your own provider, and change the model name accordingly.

> **Important**
>
> **Never, ever** put your API key in the code. Use environment
> variables instead. For example, use python `dotenv` to load the API
> key from a `.env` file.

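One way to make a missing key fail early, rather than at the first API call, is a small helper around `os.environ`. This is only a sketch: the helper name `require_env` is made up for illustration, and it assumes the variable has already been exported in the shell or loaded from a `.env` file (for instance with `dotenv.load_dotenv()`).

```python
import os


def require_env(name: str) -> str:
    """Return the value of the environment variable `name`.

    Raises a RuntimeError with a readable message when the variable
    is missing or empty, so the key never needs to appear in the code.
    """
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(
            f"{name} is not set: export it or add it to your .env file"
        )
    return value


# Example with a dummy variable (a real key would come from the environment)
os.environ["DEMO_API_KEY"] = "not-a-real-key"
print(require_env("DEMO_API_KEY"))
```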
> **Tip**
>
> The OpenAI-compatible endpoint for mistral.ai is
> `https://api.mistral.ai/v1`. The `mistral-small-latest` model
> should be sufficient.

> **Tip**
>
> If you have an LLM installed locally, for example with Ollama or
> LM Studio, don’t hesitate to adapt the code to use your local model.

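As a rough sketch of what adapting the code can look like, the settings that change between providers are the base URL, the API key, and the model name. The two local URLs below assume the default ports of Ollama and LM Studio and should be checked against your installation; `client_settings` is a hypothetical helper, not part of the `openai` package.

```python
# Base URLs of OpenAI-compatible endpoints.
# The two local entries assume default ports and may differ on your machine.
ENDPOINTS = {
    "mistral": "https://api.mistral.ai/v1",
    "ollama": "http://localhost:11434/v1",
    "lmstudio": "http://localhost:1234/v1",
}


def client_settings(provider, api_key=None):
    """Return keyword arguments suitable for `OpenAI(**settings)`."""
    return {
        "base_url": ENDPOINTS[provider],
        # Local servers usually ignore the key, but the client expects a non-empty one
        "api_key": api_key or "unused",
    }


print(client_settings("ollama"))
# Then, for example: client = OpenAI(**client_settings("ollama"))
```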
In [2]:
from openai import OpenAI
import dotenv
import os

# Load the API key from a .env file (adjust the path to your own file)
dotenv.load_dotenv("/Users/fradav/.continue/.env", verbose=True)
api_key = os.getenv("MISTRAL_API_KEY")
base_url = "https://api.mistral.ai/v1"
model = "mistral-small-latest"

client = OpenAI(api_key=api_key, base_url=base_url)

chat_response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": "What is the best French cheese?",
        },
    ]
)
print(chat_response.choices[0].message.content)

The "best" French cheese is highly subjective and depends on personal taste, but here are some of the most celebrated and iconic French cheeses that are often considered the finest:

### **Top Contenders for the Best French Cheese:**
1. **Camembert de Normandie (AOP)** – A creamy, buttery, and earthy cow’s milk cheese with a bloomy rind. The best comes from Normandy and has a rich, mushroomy flavor.
2. **Roquefort (AOP)** – A bold, tangy blue cheese made from sheep’s milk, aged in natural caves. It’s salty, crumbly, and intensely flavorful.
3. **Comté (AOP)** – A nutty, caramelized cow’s milk cheese with a firm texture. Aged versions (18+ months) develop deep, complex flavors.
4. **Brie de Meaux (AOP)** – The king of Brie, with a velvety texture and a rich, slightly funky aroma. Best enjoyed at peak ripeness.
5. **Reblochon (AOP)** – A soft, creamy cheese from Savoie, known for its mild, slightly tangy flavor. Perfect for fondue.
6. **Chèvre (Goat Cheese)** – Varieties like **Sainte-Ma

# Build a RAG Agent

## Split the document in chunks

Load `alice.txt` and split it into chunks of 2048 characters.

In [3]:
with open("../../materials/alice.txt", "r", encoding="utf-8") as f:
    text = f.read()
chunk_size = 2048
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
len(chunks)

84

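The slicing above can be wrapped in a small function. The sketch below also accepts an optional `overlap`, which is not used in this practical work: it is a common variation in which consecutive chunks share a few characters, so that a sentence cut at a boundary still appears whole in one of them. The function name and parameters are illustrative.

```python
def split_text(text, chunk_size, overlap=0):
    """Split `text` into chunks of at most `chunk_size` characters.

    With overlap > 0, each chunk starts `chunk_size - overlap`
    characters after the previous one.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij"
print(split_text(sample, 4))             # ['abcd', 'efgh', 'ij']
print(split_text(sample, 4, overlap=1))  # ['abcd', 'defg', 'ghij', 'j']
```

With `overlap=0` this gives the same chunks as the list comprehension used in the notebook.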
## Encode the chunks

A simple function to get an embedding is:

In [4]:
from time import sleep

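def get_text_embedding(input):
    """Return the embedding of `input` as a list of floats."""
    sleep(1)  # Crude rate limiting: wait one second before each request
    embeddings_batch_response = client.embeddings.create(
        model="mistral-embed",
        input=input,
    )
    return embeddings_batch_response.data[0].embedding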

It returns a list of 1024 floats (the embedding of the input).

Make a NumPy array of all the chunk embeddings from the text.

In [5]:
import numpy as np

# FAISS works with float32 vectors, so build the array with that dtype
embeddings = np.array([get_text_embedding(chunk) for chunk in chunks], dtype="float32")

(This should take about a minute and a half.)

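A large part of that time is spent in the one-second pause before each request. The embeddings endpoint also accepts a list of inputs, so several chunks can be sent in one call. The sketch below assumes that `mistral-embed` accepts the chosen batch size (the value 16 is arbitrary, not a documented limit) and takes the client as a parameter; a stand-in client is used here so that the example runs without network access.

```python
from types import SimpleNamespace


def embed_in_batches(client, texts, model="mistral-embed", batch_size=16):
    """Embed `texts` with one request per batch instead of one per text."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        response = client.embeddings.create(model=model, input=batch)
        # One item per input; sort by `index` to keep the input order
        items = sorted(response.data, key=lambda item: item.index)
        vectors.extend(item.embedding for item in items)
    return vectors


class FakeEmbeddings:
    """Stand-in for `client.embeddings`: each vector is just [len(text)]."""

    def create(self, model, input):
        data = [
            SimpleNamespace(index=i, embedding=[float(len(text))])
            for i, text in enumerate(input)
        ]
        return SimpleNamespace(data=data)


fake_client = SimpleNamespace(embeddings=FakeEmbeddings())
print(embed_in_batches(fake_client, ["a", "bb", "ccc"], batch_size=2))
# [[1.0], [2.0], [3.0]]
```

With the real `client`, `np.array(embed_in_batches(client, chunks))` would take the place of the list comprehension; a pause between batches may still be needed to respect rate limits.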
## Store embeddings in vector database

In [6]:
import faiss

index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Query

## Make an example query

Make an embedding for a question such as “À quels obstacles est confrontée
Alice?” (“What obstacles does Alice face?”).

In [7]:
question = "À quels obstacles est confrontée Alice?"
question_embeddings = np.array([get_text_embedding(question)], dtype="float32")

## Search for the most similar chunks

In [8]:
D, I = index.search(question_embeddings, k=2)  # D: distances, I: indices of the 2 nearest chunks

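`IndexFlatL2` performs an exact, brute-force search: it compares the query with every stored vector and returns the `k` smallest squared Euclidean distances (`D`) together with the positions of the corresponding vectors (`I`). The pure-Python sketch below mimics that computation on toy 2-dimensional vectors; it illustrates the idea and is not how FAISS is implemented.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(vectors, query, k):
    """Return (distances, indices) of the k vectors closest to `query`."""
    scored = sorted((squared_l2(v, query), i) for i, v in enumerate(vectors))
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


toy_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [1.0, 0.0]]
distances, indices = flat_l2_search(toy_vectors, [1.0, 0.75], k=2)
print(distances, indices)  # [0.0625, 0.5625] [1, 3]
```

FAISS returns one row of results per query, which is why the notebook reads the indices with `I.tolist()[0]`.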
## Retrieve the chunks

In [9]:
retrieved_chunk = [chunks[i] for i in I.tolist()[0]]

## RAG prompt

Make the RAG query with the following prompt:

In [10]:
prompt = f"""
Context information is below.
---------------------
{retrieved_chunk}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {question}
Answer:
"""

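Because `retrieved_chunk` is a list, the f-string inserts its Python representation (brackets, quotes, and escaped newlines). A possible variation, sketched below, joins the chunks into plain text before building the prompt. The function name `build_rag_prompt` and the blank-line separator are illustrative choices.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Build the RAG prompt with the chunks joined as plain text."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, "
        "answer the query.\n"
        f"Query: {question}\n"
        "Answer:\n"
    )


print(build_rag_prompt("Who is Alice?", ["First chunk.", "Second chunk."]))
```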
In [11]:
def run_mistral(user_message, model="mistral-large-latest"):
    sleep(1) # Rate-limit
    messages = [
        {
            "role": "user", "content": user_message
        }
    ]
    chat_response = client.chat.completions.create(
        model=model,
        messages=messages
    )
    return chat_response.choices[0].message.content

run_mistral(prompt)

"D'après le texte fourni, Alice est confrontée à plusieurs obstacles lors de la partie de croquet bizarre dans *Alice au pays des merveilles* :\n\n1. **Un terrain difficile** :\n   - Le terrain est rempli de **creux et de bosses**, ce qui rend le jeu imprévisible et compliqué.\n\n2. **Des accessoires vivants et imprévisibles** :\n   - **Les boules sont des hérissons vivants** qui se déroulent et s’éloignent, ce qui les rend difficiles à frapper.\n   - **Les maillets sont des flamants vivants** :\n     - Le flamant se retourne pour regarder Alice avec un air intrigué, la faisant rire et la distraire.\n     - Il est difficile à manipuler (elle doit le tenir sous son bras, et il ne coopère pas).\n   - **Les arceaux sont des soldats** qui se redressent et bougent constamment, empêchant de viser correctement.\n\n3. **Un jeu chaotique et sans règles claires** :\n   - **Tout le monde joue en même temps** sans attendre son tour, ce qui crée de la confusion.\n   - Les joueurs **se disputent et 

# Final

Write a function that answers any question about the book.

In [12]:
def ask_book(question):
    question_embeddings = np.array([get_text_embedding(question)], dtype="float32")
    D, I = index.search(question_embeddings, k=2) # distance, index
    retrieved_chunk = [chunks[i] for i in I.tolist()[0]]
    prompt = f"""
Context information is below.
---------------------
{retrieved_chunk}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {question}
Answer:
"""
    return run_mistral(prompt)

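To see how the pieces fit together without calling the API, the same steps can be written with the three operations passed in as functions. Everything below is a sketch: `answer_from_book` and its parameters are hypothetical names, and the toy `embed`, `search`, and `generate` functions only stand in for `get_text_embedding`, `index.search`, and `run_mistral`.

```python
def answer_from_book(question, chunks, embed, search, generate, k=2):
    """Retrieve the k most relevant chunks, then ask the model."""
    query_vector = embed(question)
    indices = search(query_vector, k)
    context = "\n\n".join(chunks[i] for i in indices)
    prompt = (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, "
        "answer the query.\n"
        f"Query: {question}\nAnswer:\n"
    )
    return generate(prompt)


# Toy stand-ins so that the example runs offline
toy_chunks = ["Alice falls down a rabbit hole.", "The Queen plays croquet."]


def toy_embed(text):
    return [float("croquet" in text.lower())]


def toy_search(vector, k):
    # Put the croquet chunk first when the question mentions croquet
    order = [1, 0] if vector[0] else [0, 1]
    return order[:k]


def toy_generate(prompt):
    return prompt.splitlines()[2]  # echo the first line of the context


print(answer_from_book("Who plays croquet?", toy_chunks,
                       toy_embed, toy_search, toy_generate, k=1))
# The Queen plays croquet.
```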