### 03. RAG - Retrieval-Augmented Generation

- pass along the information you want the LLM to use when generating an answer
- this can be your own data that the model was not trained on, or data that is newer than the model's training data

![03 diagram](docs/03.drawio.png)

In [1]:
import os

from dotenv import load_dotenv
from pydantic import BaseModel

from google import genai

In [2]:
load_dotenv()
API_KEY = os.getenv("GOOGLE_API_KEY")

client = genai.Client(api_key=API_KEY)

In [3]:
# ask a question the model cannot answer from its training data

response = client.models.generate_content(
    model="gemini-2.0-flash", contents="who won the masters in 2025?"
)

print(response.candidates[0].content.parts[0].text)

As an AI, I do not have access to the future. Therefore, I cannot tell you who won the Masters in 2025.



In [4]:
# provide the model with the information

winners = None
with open("data/masters-winners.csv", encoding="utf-8") as f:
    winners = f.read()

In [5]:
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=f"Use the following information to answer the question: {winners}. Who won the masters in 2025?",
)

print(response.candidates[0].content.parts[0].text)

Rory McIlroy
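
The cell above sends the whole CSV file with every question. A retrieval step would normally narrow the context down first. The following is a minimal sketch of that idea, not the notebook's method: it assumes the data has `year` and `winner` columns (the real `data/masters-winners.csv` may differ), and `SAMPLE_CSV`, `retrieve_rows` and `build_prompt` are hypothetical names introduced here for illustration.

```python
import csv
import io
import re

# Hypothetical sample; the real data/masters-winners.csv may use other columns.
SAMPLE_CSV = """year,winner
1985,Bernhard Langer
2025,Rory McIlroy
"""


def retrieve_rows(csv_text: str, question: str) -> list[dict[str, str]]:
    """Return the rows whose year is mentioned in the question.

    Falls back to all rows when the question contains no four-digit year.
    """
    years = set(re.findall(r"\b(?:19|20)\d{2}\b", question))
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not years:
        return rows
    return [row for row in rows if row["year"] in years]


def build_prompt(rows: list[dict[str, str]], question: str) -> str:
    """Combine the retrieved rows and the question into one prompt string."""
    context = "\n".join(f"{row['year']}: {row['winner']}" for row in rows)
    return (
        "Use the following information to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


question = "Who won the masters in 2025?"
rows = retrieve_rows(SAMPLE_CSV, question)
prompt = build_prompt(rows, question)
print(prompt)
```

The resulting `prompt` could be passed as `contents` to `client.models.generate_content` in place of the full file. Matching on the year is only a stand-in for a real retrieval step such as keyword or embedding search.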


In [6]:
class WinnerResponse(BaseModel):
    winner: str | None
    year: int | None
    score: str | None

In [7]:
# mixing provided data with the model's training data is hard 🥲
# answering as plain text fails, but asking for a structured response succeeds

prompt = """
    You are answering questions about the Masters golf tournament.
    Use your own knowledge first. Only refer to the following data if you're unsure or if your own answer is incomplete: {winners}

    Question: Who won the Masters in 1985 and what was the winning score?    
"""

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt.format(winners=winners).strip(),  # fill in the {winners} placeholder
    config={
        "response_mime_type": "application/json",
        "response_schema": WinnerResponse,
    },  # define structured response
)

result = response.parsed
print(result.model_dump())

{'winner': 'Bernhard Langer', 'year': 1985, 'score': '-6'}
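
Because the model may answer from its own knowledge or from the supplied data, it can help to check a structured answer against the data afterwards. The sketch below is one possible way to do that, under stated assumptions: the column names `year`, `winner` and `score` and the score format are guesses about the CSV, and `is_grounded` is a hypothetical helper, not part of the `google-genai` SDK.

```python
import csv
import io
import json

# Hypothetical sample row; the real CSV columns and score format may differ.
SAMPLE_CSV = """year,winner,score
1985,Bernhard Langer,-6
"""


def is_grounded(answer: dict, csv_text: str) -> bool:
    """Return True if a row with the same year, winner and score exists in the data."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (
            row["year"] == str(answer.get("year"))
            and row["winner"] == answer.get("winner")
            and row["score"] == answer.get("score")
        ):
            return True
    return False


# Same shape as the dict printed by result.model_dump() above.
answer = json.loads('{"winner": "Bernhard Langer", "year": 1985, "score": "-6"}')
print(is_grounded(answer, SAMPLE_CSV))
```

An answer that does not match any row is not necessarily wrong, but it signals that the model relied on something other than the provided data.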
