# AI Engineer Evaluation — Minimal Notebook

This notebook reproduces each exercise. Execute all cells locally to capture outputs before submission.

## Init

In [1]:
# %pip install -q -e .
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
api_key = os.getenv("LITELLM_KEY")
base_url = os.getenv("BASE_URL", "https://llmproxy.ai.orange")
assert api_key, "Set LITELLM_KEY in your .env"
client = OpenAI(api_key=api_key, base_url=base_url)
print("Client ready, base_url:", base_url)

Client ready, base_url: https://llmproxy.ai.orange


## Exercise 1 — LLM connection

In [2]:
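# Prompt (Spanish): "How many 'a's are in the word MasOrange? Answer with a number only."
prompt = "¿Cuántas 'a' tiene la palabra MasOrange? Responde solo con un número."
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # reduce sampling randomness so the answer is repeatable
)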
print(resp.choices[0].message.content.strip())

2
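
Because this question has a single correct answer, the model's reply can be checked against a local count. The following is a minimal sketch; the helper name `count_letter` is hypothetical and not part of the original notebook.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be exactly one character")
    return word.lower().count(letter.lower())


expected = count_letter("MasOrange", "a")
print(expected)  # 2
```

Comparing `expected` with `int(resp.choices[0].message.content.strip())` would turn the cell into a simple smoke test, assuming the model follows the instruction to reply with a bare number.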


## Exercise 2 — Simple embedding

In [3]:
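text = "You shall not pass!"
# Request one embedding and take the vector from the first (only) result.
emb = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input=text,
).data[0].embedding
print("Embedding length:", len(emb))
emb[:8]  # display the first 8 components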

Embedding length: 1536


[0.060176700353622437,
 0.03872217237949371,
 0.001016652095131576,
 0.013106766156852245,
 -0.05762816220521927,
 0.0009825198212638497,
 0.008646824397146702,
 -0.04896833375096321]
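
An embedding becomes useful once it is compared with another one. As a minimal sketch, assuming plain Python lists of floats like `emb` above, dot product and cosine similarity could be computed as follows. The helper names and the toy vectors are hypothetical and stand in for real 1536-dimensional embeddings.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product divided by the product of the two vector norms."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


# Toy 3-dimensional vectors, not real model output.
v1 = [1.0, 0.0, 0.0]
v2 = [0.6, 0.8, 0.0]
print(dot(v1, v2))
print(cosine_similarity(v1, v2))
```

For vectors that already have unit length the two measures coincide, which is why a plain dot product is often enough for ranking.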

## Exercises 2.2–3 — Mini Vector DB

In [4]:
from pathlib import Path

from ex3_vector_db import MiniVectorDB

guide_path = Path("ai-engineer-evaluation-test.md")
assert guide_path.exists(), "Place the official guide at ai-engineer-evaluation-test.md"
vdb = MiniVectorDB(client)
vdb.load_document_from_path(guide_path)
print("Num embeddings:", len(vdb.embeddings))
print("Top chunk for query:")
# Query (Spanish, like the guide): "Giving the database functionality".
# Only the first 400 characters of the best-matching chunk are printed.
print(vdb.nearest_chunks("Darle funcionalidad a la base de datos")[0][:400] + "...")

Num embeddings: 10
Top chunk for query:
## **Ejercicio 3: Dándole funcionalidad**
Añade el siguiente método a tu clase:
```python
    def _nearest_chunks(self, embedding: [int], top_n: int = 3) -> list[str]:
        """Return the nearest chunks to the given embedding with the dot product similarity"""
        # Calculate dot product similarity for each stored embedding
        similarities = []
        for i, stored_embedding in enumera...
```
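
The module `ex3_vector_db` is not shown in this notebook, and the printed chunk is cut off at 400 characters. As a rough sketch of what dot-product retrieval over stored embeddings might look like, the function below ranks chunks by similarity to a query embedding. The name `rank_chunks_by_dot_product`, its parameters, and the toy data are assumptions for illustration; this is not the guide's method or the implementation behind `MiniVectorDB`.

```python
def rank_chunks_by_dot_product(
    query_embedding: list[float],
    embeddings: list[list[float]],
    chunks: list[str],
    top_n: int = 3,
) -> list[str]:
    """Return the top_n chunks whose embeddings score highest against the query."""
    if len(embeddings) != len(chunks):
        raise ValueError("embeddings and chunks must be aligned one-to-one")
    scored = []
    for index, stored in enumerate(embeddings):
        score = sum(q * s for q, s in zip(query_embedding, stored))
        scored.append((score, index))
    # Highest score first; the sort is stable, so ties keep insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunks[index] for _, index in scored[:top_n]]


# Toy 2-dimensional embeddings, not real model output.
toy_chunks = ["about cats", "about dogs", "about databases"]
toy_embeddings = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
print(rank_chunks_by_dot_product([0.1, 0.9], toy_embeddings, toy_chunks, top_n=2))
```

A linear scan like this is adequate for the 10 embeddings reported above; larger collections would usually call for a vectorized or indexed search.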


## Exercise 4 — RAG Chatbot

In [5]:
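# System prompt (Spanish): "You are an assistant that answers ONLY with information
# present in the context fragments. If the answer is not in the context, reply
# 'No aparece en la guía' ('It does not appear in the guide'). Answer in Spanish,
# briefly and precisely."
SYSTEM_PROMPT = (
    "Eres un asistente que responde SÓLO con información presente en los fragmentos de contexto. "
    "Si la respuesta no está en el contexto, responde 'No aparece en la guía'. "
    "Responde en español de forma breve y precisa."
)

# Question (Spanish): "Which models are available for use in the exercise?"
question = "¿Qué modelos hay disponibles para el uso en el ejercicio?"
# Retrieve the three closest chunks and join them with a visible separator.
chunks = vdb.nearest_chunks(question, top_n=3)
context = "\n\n---\n\n".join(chunks)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Pregunta: {question}\n\nContexto:\n{context}"},
]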
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=messages,
    temperature=0.0
)
print(resp.choices[0].message.content.strip())

Los modelos disponibles para el uso en el ejercicio son:

1. `text-embedding-3-small` - Uso: Embeddings, Coste: Muy bajo
2. `gpt-4o-mini` - Uso: Chat, Coste: Bajo
3. `gpt-4o-nano` - Uso: Chat, Coste: Muy bajo
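
The prompt assembly in the cell above can be factored into a small helper so that it is testable without calling the API. This is a minimal sketch; `build_rag_messages` and its `separator` parameter are hypothetical names, and the example strings are placeholders rather than text from the guide.

```python
def build_rag_messages(
    system_prompt: str,
    question: str,
    chunks: list[str],
    separator: str = "\n\n---\n\n",
) -> list[dict[str, str]]:
    """Build chat messages that place the retrieved chunks after the question."""
    context = separator.join(chunks)
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Pregunta: {question}\n\nContexto:\n{context}"},
    ]


# Placeholder inputs; in the notebook these would come from SYSTEM_PROMPT,
# question, and vdb.nearest_chunks(question, top_n=3).
example_messages = build_rag_messages(
    system_prompt="Answer only from the context.",
    question="Which models are available?",
    chunks=["placeholder chunk A", "placeholder chunk B"],
)
print(example_messages[1]["content"])
```

Keeping the question and the context in a single user message mirrors the cell above; whether that layout or a separate context message works better is something to verify against the model in use.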
