# Exam (morning): Retrieval Augmented Generation

### Personal Details (please complete)
Double Click on Cell to edit.

<table>
  <tr>
    <td>First Name:</td>
    <td>Kaspar</td>
  </tr>
  <tr>
    <td>Last Name:</td>
    <td>Hänni</td>
  </tr>
  <tr>
    <td>Student ID:</td>
    <td></td>
  </tr>
  <tr>
    <td>Module:</td>
    <td>Machine Learning 2</td>
  </tr>
  <tr>
    <td>Exam Date / Room / Time:</td>
    <td>20.05.2025 / Room: SM O2.01  / 10:15 – 11:30</td>
  </tr>
  <tr>
    <td>Permitted aids:</td>
    <td>w.3ML2-WIN (Machine Learning 2)<br>Open Book, Personal Computer, Internet Access</td>
  </tr>
  <tr>
  <td>Not allowed:</td>
  <td>The use of any form of generative AI (e.g., Copilot, ChatGPT) to assist in solving the exercise is not permitted. <br> However, using such tools as part of the exercise itself (e.g., making API calls to them if required by the task) is allowed. <br> Any form of communication or collaboration with other people is not permitted.</td>
</tr>
</table>

## Evaluation Criteria

### <b style="color: gray;">(maximum achievable points: 48)</b>

<table>
  <thead>
    <tr>
      <th>Category</th>
      <th>Description</th>
      <th>Points Distribution</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Code not executable or results not meaningful</td>
      <td>The code contains errors that prevent it from running (e.g., syntax errors) or produces results that do not fit the question.</td>
      <td>0 points</td>
    </tr>
    <tr>
      <td>Code executable, but with serious deficiencies</td>
      <td>The code runs, but the results are incomplete due to major errors (e.g., fundamental errors when reading the data). Only minimal progress is evident.</td>
      <td>25% of the maximum achievable points</td>
    </tr>
    <tr>
      <td>Code executable, but with moderate deficiencies</td>
      <td>The code runs and delivers partially correct results, but there are significant errors (e.g., the data types of the imported data do not meet the requirements of the question). The results are comprehensible but incomplete or inaccurate.</td>
      <td>50% of the maximum achievable points</td>
    </tr>
    <tr>
      <td>Code executable, but with minor deficiencies</td>
      <td>The code runs and delivers a largely correct result, but minor errors (e.g., column name misspelled, timestamp not correctly formatted) affect the completeness of the result.</td>
      <td>75% of the maximum achievable points</td>
    </tr>
    <tr>
      <td>Code executable and correct</td>
      <td>The code runs flawlessly and delivers the correct result without deficiencies.</td>
      <td>100% of the maximum achievable points</td>
    </tr>
  </tbody>
</table>



## Python Libraries and Settings

## <b>Set Up (This part will <u>not</u> be evaluated!)</b>

#### <b>1.) Start a GitHub Codespaces instance based on your fork of this GitHub repository or open the notebook in Colab</b>
#### <b>2.) Add API keys to either .env files for Codespaces or to the secrets for Colab</b>
#### <b>3.) Please execute the two code cells below as soon as the Codespace/Colab has started and install the libraries</b>

In [1]:
!python3 -m pip install --upgrade pip
!pip install PyPDF2
!pip install langchain-community
!pip install faiss-cpu
!pip install groq
!pip install openai
!pip install tqdm
!pip install sentence-transformers
!pip install huggingface_hub[hf_xet]
!pip install google-generativeai

Collecting pip
  Downloading pip-25.1.1-py3-none-any.whl.metadata (3.6 kB)
Downloading pip-25.1.1-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 25.0.1
    Uninstalling pip-25.0.1:
      Successfully uninstalled pip-25.0.1
Successfully installed pip-25.1.1
Collecting PyPDF2
  Downloading pypdf2-3.0.1-py3-none-any.whl.metadata (6.8 kB)
Downloading pypdf2-3.0.1-py3-none-any.whl (232 kB)
Installing collected packages: PyPDF2
Successfully installed PyPDF2-3.0.1
Collecting langchain-community
  Downloading langchain_community-0.3.24-py3-none-any.whl.metadata (2.5 kB)
Collecting langchain-core<1.0.0,>=0.3.59 (from langchain-community)
  Downloading langchain_core-0.3.60-py3-none-any.whl.metadata (5.8 kB)
Collecting langchain<1.0.0,>=0.3.25 (from langchain-community)
  Downloading langchain-0.3

In [2]:
from dotenv import load_dotenv
import os
from openai import OpenAI
import openai
import tqdm
import glob
from PyPDF2 import PdfReader
from sentence_transformers import SentenceTransformer
import faiss
import pickle
import google.generativeai as genai
from groq import Groq




  from .autonotebook import tqdm as notebook_tqdm


In [56]:
load_dotenv()
groq_key = os.getenv("GROQ_API_KEY")
openai.api_key = os.getenv("OPENAI_API_KEY")
google_key = os.getenv("GOOGLE_API_KEY")
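
A quick check that the keys were actually loaded can save debugging time later. The sketch below uses a hypothetical helper (`missing_keys` is not part of the exam template) that lists which of the expected environment variables are unset or empty.

```python
import os

# Hypothetical helper: report which expected API keys are missing or empty.
def missing_keys(names, env=None):
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# An empty list means all three keys used in this notebook were found.
print(missing_keys(["GROQ_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY"]))
```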


## <b>Tasks (This part will be evaluated!)</b>
### Notes on the following tasks:

In this part of the exam, you will build a Retrieval-Augmented Generation (RAG) pipeline that efficiently retrieves medical information from the package inserts of common medications. Imagine you are developing a system for pharmacists or medical professionals to quickly and accurately answer questions about medications. The following five package inserts are provided as your data source:

- [data/Amoxicillin.pdf](data/Amoxicillin.pdf)
- [data/bisoprolol.pdf](data/bisoprolol.pdf)
- [data/citalopram.pdf](data/citalopram.pdf)
- [data/metformin.pdf](data/metformin.pdf)
- [data/paracetamol.pdf](data/paracetamol.pdf)

Your task is to implement a RAG pipeline that retrieves relevant information from these package inserts and integrates it into the answer generation process. Use the provided instructions and your knowledge from the exercises.

### Expected Results:

1. Read in the provided package inserts and extract all text.
2. Split the extracted text into manageable chunks using a text splitter (e.g., `RecursiveCharacterTextSplitter`).
3. Create embeddings for the text chunks using a suitable model.
4. Index the embeddings in a vector store (e.g., FAISS).
5. Develop an appropriate prompt template.
6. Build the RAG chain.
7. Automatically generate a list of 10 test questions using a language model.
8. Let your RAG pipeline answer the 10 generated questions.

### Submission documents:

Your submission should include:
- The completed notebook (this file).
- The vector store.

<b style="color:blue;">Notes on the following tasks:</b>
<ul style="color:blue;">
  <li>Pay attention to the specific details provided for each task.</li>
  <li>Solve each task using Python code. Integrate your code into the code cells for each task.</li>
  <li>Present your solution(s) as requested in each task.</li>
</ul>

#### <b>Task (1): Read all 5 PDFs from the 'data' folder and store their content for further use</b>
<b>Task details:</b>
- The files are located in the 'data' folder.
- Display the length of the resulting string (number of characters).
- Show the first 100 characters in the notebook output.
<b style="color: gray;">(max. points: 2)</b>

In [5]:
glob_path = "data/*.pdf"
text = ""
for pdf_path in tqdm.tqdm(glob.glob(glob_path)):
    with open(pdf_path, "rb") as file:
        reader = PdfReader(file)
        # Extract the text of every page once and skip pages without text
        page_texts = (page.extract_text() for page in reader.pages)
        text += " ".join(t for t in page_texts if t)


100%|██████████| 5/5 [00:02<00:00,  1.86it/s]


In [6]:
# Show the number of characters in the text
print(len(text))

# Show the first 100 characters of the text
text[:100]

176575


'Inhaltsverzeichnis\nZusammensetzung\nDarreichungsform und Wirkstoffmenge pro Einheit\nIndikationen/Anwe'

#### <b>Task (2): Split the text into chunks appropriate for the task. Specify an overlap as well. Give a reason for your choice</b>
<b>Task details:</b>
- Use the data from the previous task.
- Show the total number of chunks in the notebook.
- Show the length of the first chunk in the notebook.
- Explain your reasoning.
<b style="color: gray;">(max. points: 4)</b>

In [7]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [8]:
# Create a splitter: 2000 characters per chunk with an overlap of 200 characters
splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
# Split the extracted text into manageable chunks
chunks = splitter.split_text(text)

In [9]:
# Show the total number of chunks
print(f"Total chunks: {len(chunks)}")
print("Preview of the first chunk:", chunks[0][:200])

Total chunks: 98
Preview of the first chunk: Inhaltsverzeichnis
Zusammensetzung
Darreichungsform und Wirkstoffmenge pro Einheit
Indikationen/Anwendungsmöglichkeiten
Dosierung/Anwendung
Kontraindikationen
Warnhinweise und Vorsichtsmassnahmen
Inte


##### Explanation (double click and add text): A chunk size of 2000 characters with an overlap of 200 characters yields almost 100 chunks, which is enough to populate the vector store with content. When I retrieve from it, I can therefore expect good-quality results.
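
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping character windows. It is an illustration only: `RecursiveCharacterTextSplitter` first splits on separators such as paragraphs and line breaks and then merges pieces, so its chunk boundaries differ. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    # Each new window starts `step` characters after the previous one.
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


# Neighbouring chunks share `chunk_overlap` characters.
print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

With the values used in this task (2000 and 200), each chunk would repeat the last 200 characters of its predecessor, so a sentence cut at a boundary still appears whole in one of the two chunks.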

#### <b>Task (3): Initialize an embedding model</b>
<b>Task details:</b>
- Choose a suitable embedding model from Huggingface.
- [Huggingface models](https://huggingface.co/spaces/mteb/leaderboard).
- Consider the size of the model. It should be runnable in your Codespace.
- Choose a model appropriate for the data.

<b style="color: gray;">(max. points: 2)</b>

In [10]:
from langchain_community.embeddings import HuggingFaceEmbeddings  # For generating embeddings for text chunks

In [40]:
model_name = "paraphrase-multilingual-MiniLM-L12-v2"
model = SentenceTransformer(model_name)
chunk_embeddings = model.encode(chunks, convert_to_numpy=True)
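
One point worth checking when pairing a chunk size with an embedding model: sentence-transformers models truncate input that exceeds `model.max_seq_length` tokens, and for small MiniLM models this limit is typically much shorter than a 2000-character chunk. The sketch below estimates how many chunks would be affected. The helper name and the ratio of 4 characters per token are assumptions (a rough heuristic, not a tokenizer measurement).

```python
def estimate_truncated_share(chunks, max_tokens, chars_per_token=4.0):
    """Estimate the share of chunks that are longer than the model's token limit."""
    if not chunks:
        return 0.0
    # Assumption: one token covers about `chars_per_token` characters.
    limit_chars = max_tokens * chars_per_token
    too_long = [c for c in chunks if len(c) > limit_chars]
    return len(too_long) / len(chunks)


# In the notebook: estimate_truncated_share(chunks, model.max_seq_length)
print(estimate_truncated_share(["a" * 100, "b" * 2000], max_tokens=128))
```

If most chunks exceed the limit, only their beginning is embedded, which is one possible reason for weak retrieval results.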

#### <b>Task (4): Create a vector store</b>
<b>Task details:</b>
- Create a vector store
- Store the vector store on disk (this is also helpful in case the Codespace or Colab needs a restart).
<b style="color: gray;">(max. achievable points: 6)</b>

In [41]:
d = chunk_embeddings.shape[1]
print(d)

384


In [42]:
index = faiss.IndexFlatL2(d)
index.add(chunk_embeddings)
print("Number of embeddings in FAISS index:", index.ntotal)

Number of embeddings in FAISS index: 98


In [43]:
# Make sure the target folder exists before writing the index
os.makedirs("faiss", exist_ok=True)
faiss.write_index(index, "faiss/faiss_index.index")
with open("faiss/chunks_mapping.pkl", "wb") as f:
    pickle.dump(chunks, f)

In [44]:
index = faiss.read_index("faiss/faiss_index.index")
with open("faiss/chunks_mapping.pkl", "rb") as f:
    chunks = pickle.load(f)
print(len(chunks))

98
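
For intuition, `IndexFlatL2` performs an exhaustive search and ranks vectors by squared Euclidean distance to the query. The pure-Python sketch below mimics that behaviour for a single query. It is an illustration under that assumption, not the FAISS implementation, and `flat_l2_search` is a hypothetical name.

```python
def flat_l2_search(vectors, query, k):
    """Return (squared_distances, indices) of the k vectors closest to the query."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(vec, query)), i)
        for i, vec in enumerate(vectors)
    )
    top = scored[:k]
    return [dist for dist, _ in top], [idx for _, idx in top]


distances, indices = flat_l2_search([[0, 0], [1, 0], [3, 4]], [0, 0], k=2)
print(distances, indices)
```

FAISS returns the same information as two arrays of shape `(number_of_queries, k)`, which is why the retriever in the next task reads `indices[0]`.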


#### <b>Task (5): Create a retriever function.</b>
<b>Task details:</b>
- Create a retriever function
- Define the number of documents the retriever should return.
- Test the retriever with the following query: `"Welche Dosierung von Amoxicillin Axapharm wird für die Behandlung einer Endokarditis-Prophylaxe bei Erwachsenen empfohlen?"`
- If the retrieved chunks are not relevant, increase the number of chunks to be retrieved and repeat the query. 
- It does not have to be perfect; if nothing improves, continue with the current result.
<b style="color: gray;">(max. achievable points: 6)</b>

In [45]:
def retrieve_texts(query, k, index, chunks, model):
    """
    Retrieve the top k similar text chunks and their embeddings for a given query.
    """
    query_embedding = model.encode([query], convert_to_numpy=True)
    distances, indices = index.search(query_embedding, k)
    retrieved_texts = [chunks[i] for i in indices[0]]
    return retrieved_texts, distances

In [46]:
query = "Welche Dosierung von Amoxicillin Axapharm wird für die Behandlung einer Endokarditis-Prophylaxe bei Erwachsenen empfohlen?"

In [47]:

# Test the retriever (it returns the texts and their distances, so unpack both)
retrieved_texts, distances = retrieve_texts(query, 3, index, chunks, model)

print(retrieved_texts)
print(len(retrieved_texts))

['Die Ther apie sollte über 48 bis 72 Stunden nach Erreichen einer klinischen Wirkung fortgesetzt werden. Bei einer Infektion, die durch β-\nhämolysierende Streptok okken verursacht worden ist, empfiehlt es sich, während mindestens 10 T agen mit der Behandlung fortzufahren, um\ndas A uftreten v on akutem rheumatischem Fieber oder einer Glomerulonephritis zu v erhindern.\nFür die Behandlung v on erw achsenen P atienten und Kindern über 40 kg sollten Amo xicillin- Tabletten v erwendet werden.\nÜbliche Dosierung\nErwachsene und Kinder über 40 kg\nLeichte bis mittelschwere Infektionen\nAllgemeine Richtlinien: 1500-3000 mg Amo xicillin/T ag in 3-4 Einz eldosen.\nMaximale Tagesdosis:  4000-6000 mg aufgeteilt in 3-4 Dosen.\nDosierungsempfehlung:  3-4× täglich 375-750 mg.\nZur Behandlung der Gonorrhoe (spezifische Urethritis) und der unk omplizierten Infektionen der unteren Harn wege (z.B . Zystitis, bakterielle\nUrethritis) sowie zur Endokarditisproph ylaxe kann eine Einz eldosis v on 3 g Am
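
The task suggests increasing the number of retrieved chunks when the results are not relevant. A minimal sketch of that loop is shown below. `retrieve_with_growing_k`, the keyword check, and the `k_values` are hypothetical choices; a keyword match is only a crude proxy for relevance.

```python
def retrieve_with_growing_k(retrieve, query, keyword, k_values=(3, 5, 8)):
    """Try increasing k until one retrieved chunk mentions the keyword."""
    texts, k = [], None
    for k in k_values:
        texts = retrieve(query, k)
        if any(keyword.lower() in t.lower() for t in texts):
            break  # Found a chunk that looks relevant, stop growing k.
    # If nothing matched, the result for the largest k is returned.
    return k, texts


# Stand-in retriever for demonstration; in the notebook one could pass
# lambda q, k: retrieve_texts(q, k, index, chunks, model)[0]
docs = ["Dosierung A", "Dosierung B", "Dosierung C", "Endokarditisprophylaxe: Einzeldosis"]
print(retrieve_with_growing_k(lambda q, k: docs[:k], "Frage", "Endokarditis"))
```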

#### <b>Task (6): Implement a reusable RAG function and prompt template</b>
<b>Task details:</b>
- Write a function `get_answer_and_documents` that answers a question using your RAG pipeline.
- The function should:
  - Take as parameters: the question (`question`), the number of documents to retrieve (`k`), the FAISS index (`index`), and the list of text chunks (`chunks`).
  - The prompt template should be tailored to the medical context, address medical professionals, and instruct the model to answer concisely and in German, using only the provided context. This is part of the task.
  - Return both the answer and the retrieved documents.
- Test the function with the question: `Ab welcher Kreatinin-Clearance ist die Einnahme von Metformin kontraindiziert?`

<b style="color: gray;">(max. achievable points: 8)</b>

In [55]:
# Answer a question with the RAG pipeline: retrieve, build the prompt, call the LLM
def answer_query(query, k, index, texts):
    """
    Retrieve the top k similar text chunks for the given query using the retriever,
    inject them into a prompt, and send it to the Groq LLM to obtain an answer.

    Parameters:
    - query (str): The user's query.
    - k (int): Number of retrieved documents to use.
    - index: The FAISS index.
    - texts (list): The text chunks that belong to the index.

    Returns:
    - answer (str): The answer generated by the LLM.
    - retrieved_texts (list): The retrieved text chunks.
    """
    # Reuse the embedding model loaded in Task 3; the chunks are already indexed.
    retrieved_texts, _ = retrieve_texts(query, k, index, texts, model)

    # Combine the retrieved documents into a single context block.
    context = "\n\n".join(retrieved_texts)

    # System prompt: address medical professionals, answer concisely in German, context only.
    system_prompt = (
        "Du bist ein medizinischer Assistent für Ärzte und Apotheker. "
        "Beantworte die Frage kurz und präzise auf Deutsch und verwende "
        "ausschliesslich den angehängten medizinischen Kontext. "
        "Wenn der Kontext die Antwort nicht enthält, sage das ausdrücklich."
    )
    user_prompt = "Context:\n" + context + "\n\nQuestion: " + query + "\nAnswer:"

    # Initialize the Groq client and send the prompt.
    client = Groq(api_key=groq_key)
    llm = client.chat.completions.create(
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        model="llama-3.3-70b-versatile",
    )

    # Extract the answer and return it together with the retrieved documents.
    answer = llm.choices[0].message.content
    return answer, retrieved_texts

In [51]:
# Test query
query = "Ab welcher Kreatinin-Clearance ist die Einnahme von Metformin kontraindiziert?"

In [52]:
# Print the answer and the retrieved documents for the test query
answer, documents = answer_query(query, 4, index, chunks)
print(answer)
print(documents)

#### <b>Task (7): Implement a HyDE Query Transformation for RAG</b>
<b>Task details:</b>
- Implement a function that applies the HyDE strategy in your RAG pipeline.
- Add your HyDE transformation to your pipeline.
- Display the intermediate transformation (print statement within function is enough) and the final answer in the notebook.
<b style="color: gray;">(max. achievable points: 6)</b>
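
HyDE (Hypothetical Document Embeddings) lets a language model write a hypothetical answer passage first and then retrieves with that passage instead of the raw question, because an answer-like text tends to lie closer to the stored chunks in embedding space. The sketch below shows the control flow only. `hyde_retrieve` and the injected callables `generate` and `retrieve` are hypothetical names: in this notebook, `generate` could wrap a Groq chat call and `retrieve` could wrap the retriever from Task 5.

```python
def hyde_retrieve(question, generate, retrieve, k=3):
    """Retrieve with an LLM-written hypothetical answer instead of the question."""
    # Step 1: ask the LLM for a passage that could plausibly answer the question.
    hypothetical_doc = generate(
        "Schreibe einen kurzen Abschnitt einer Packungsbeilage, "
        "der die folgende Frage beantwortet:\n" + question
    )
    # Step 2: show the intermediate transformation, as the task requires.
    print("Hypothetical document:", hypothetical_doc)
    # Step 3: embed and search with the hypothetical passage.
    return hypothetical_doc, retrieve(hypothetical_doc, k)


# Demonstration with stand-in callables (no API call, no real index).
fake_llm = lambda prompt: "Metformin ist bei schwerer Niereninsuffizienz kontraindiziert."
fake_retriever = lambda text, k: ["chunk about: " + text][:k]
print(hyde_retrieve("Wann ist Metformin kontraindiziert?", fake_llm, fake_retriever))
```

The retrieved chunks are then passed to the answer prompt together with the original question, not the hypothetical passage.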

In [59]:
!pip install llama-index

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Collecting llama-index
  Downloading llama_index-0.12.37-py3-none-any.whl.metadata (12 kB)
Collecting llama-index-agent-openai<0.5,>=0.4.0 (from llama-index)
  Downloading llama_index_agent_openai-0.4.7-py3-none-any.whl.metadata (438 bytes)
Collecting llama-index-cli<0.5,>=0.4.1 (from llama-index)
  Downloading llama_index_cli-0.4.1-py3-none-any.whl.metadata (1.5 kB)
Collecting llama-index-core<0.13,>=0.12.36 (from llama-index)
  Downloading llama_index_core-0.12.37-py3-none-any.whl.metadata (2.4 kB)
Collecting llama-index-embeddings-openai<0.4,>=0.3.0 (from llama-index)
  Downloading llama_index_embeddings_openai-0.3.1-py3-none-any.whl.metadata (684 bytes)
Collecting llama-index-indices-managed-llama-cloud>=0.4.0 (from llama-index)
  Downloading llama_index_indices_managed_llama_cloud-0.6.11-py3-none-any.whl.metadata (3.6 kB)
Collecting llama-index-llms-openai<0.4,>=0.3.0 (from llama-index)
  Downloading llama_index_llms_openai-0.3.42-py3-none-any.whl.metadata (3.0 kB)
Collecting llam

In [60]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine
from IPython.display import Markdown, display

In [None]:
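# `index` is a raw FAISS index and has no `as_query_engine()`, so build a separate
# LlamaIndex index from the PDFs. Assumption: LlamaIndex falls back to its default
# OpenAI models here, which requires OPENAI_API_KEY to be set.
documents = SimpleDirectoryReader("data").load_data()
pdf_index = VectorStoreIndex.from_documents(documents)
query_engine = pdf_index.as_query_engine()
response = query_engine.query(query)
display(Markdown(f"<b>{response}</b>"))

In [None]:
# Wrap the baseline query engine so that each query is first expanded with HyDE
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(query_engine, hyde)
response = hyde_query_engine.query(query)
display(Markdown(f"<b>{response}</b>"))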

In [None]:
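def rewrite_query_hyde(query):
    # HyDE: let the LLM draft a hypothetical package-insert passage and retrieve with it
    prompt = "Schreibe einen kurzen Abschnitt einer Packungsbeilage, der folgende Frage beantwortet: " + query
    llm = Groq(api_key=groq_key).chat.completions.create(
        messages=[{"role": "user", "content": prompt}], model="llama-3.3-70b-versatile")
    new_query = llm.choices[0].message.content.strip()
    print("Hypothetical document:", new_query)
    return new_query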

In [74]:
def rewrite_query(query, groq_api_key):
    """
    Rewrite the user's query to potentially improve retrieval.
    Parameters:
    - query (str): The original user query.
    - groq_api_key (str): Your Groq API key.
    
    Returns:
    - rewritten_query (str): The rewritten query.
    """
    client = Groq(api_key=groq_api_key)
    # Build a prompt for rewriting the query
    rewriting_prompt = (
        "Rewrite the following query into a format, such that it can be answered by looking at medical guidelines. "
        "Keep the keywords but ensure that it is close to a format, such as in medical guidelines. Just answer with the rewritten query\n\n"
        "Query: " + query
    )
    
    messages = [
        {"role": "system", "content": rewriting_prompt}
    ]
    
    # Use the same model (for example, llama) to perform query rewriting
    llm = client.chat.completions.create(
        messages=messages,
        model="llama-3.3-70b-versatile",
    )
    rewritten_query = llm.choices[0].message.content.strip()
    return rewritten_query

In [None]:
def answer_query_with_rewriting(query, k, index, texts):
    """
    Retrieve the top k similar text chunks for the given query using a retrieval method
    with query rewriting, inject them into a prompt, and send it to the Groq LLM (using llama)
    to obtain an answer.

    Parameters:
    - query (str): The user's query.
    - k (int): Number of retrieved documents to use.
    - index: The FAISS index.
    - texts (list): The text chunks that belong to the index.

    Returns:
    - answer (str): The answer generated by the LLM.
    """
    model_name = "paraphrase-multilingual-MiniLM-L12-v2"
    model = SentenceTransformer(model_name)
    # Rewrite the query first, then retrieve with the rewritten version.
    rewritten_query = rewrite_query(query, groq_key)
    print("Rewritten Query:", rewritten_query)  # intermediate transformation

    retrieved_texts, _ = retrieve_texts(rewritten_query, k, index, texts, model)

    # Combine the retrieved documents into a single context block.
    context = "\n\n".join(retrieved_texts)

    # Build a prompt that instructs the LLM to answer the query based on the context.
    prompt = (
        "Answer the following question using only the provided context. "
        "Address a medical professional and answer concisely in German.\n\n"
        "Context:\n" + context + "\n\n"
        "Question: " + query + "\n"
        "Answer:"
    )

    # Initialize the Groq client and send the prompt.
    client = Groq(api_key=groq_key)
    messages = [
        {"role": "system", "content": prompt}
    ]

    llm = client.chat.completions.create(
        messages=messages,
        model="llama-3.3-70b-versatile"
    )

    # Extract and return the answer.
    answer = llm.choices[0].message.content
    return answer

In [None]:
query = "Ab welcher Kreatinin-Clearance ist die Einnahme von Metformin kontraindiziert?"
answer = answer_query_with_rewriting(query, 3, index, chunks)
print("LLM Answer:", answer)

#### <b>Task (7): Generate a list of test questions</b>
<b>Task details:</b>
- Create a Python list with 10 questions about the provided medications.
- The questions should be automatically generated using a language model.
- You may use chunks from the package inserts as inspiration, but this is not required.
- At the end, print out your list of questions.

<b style="color: gray;">(max. achievable points: 6)</b>

In [None]:
from groq import Groq

client = Groq(api_key=groq_key)

llm = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a medical professional. "
                       "Generate 10 questions about the provided medications."
        },
        {
            "role": "user",
            "content": "Generate 10 questions about these medications: Amoxicillin, bisoprolol, citalopram, metformin, paracetamol.",
        }
    ],
    model="llama-3.3-70b-versatile",
    
)

print(llm.choices[0].message.content)

INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
Here are 10 questions about the provided medications:

1. **What is the primary indication for Amoxicillin, and what type of infections is it commonly used to treat?**

2. **What is the mechanism of action of Bisoprolol, and how does it differ from other beta-blockers in terms of its selective beta-1 receptor blockade?**

3. **What are the common side effects of Citalopram, and how does it compare to other selective serotonin reuptake inhibitors (SSRIs) in terms of its tolerability and efficacy?**

4. **What is the primary use of Metformin, and how does it help to regulate blood glucose levels in patients with type 2 diabetes?**

5. **What is the recommended dosage of Paracetamol for adults, and what are the potential risks 

In [76]:
import re

# Parse the numbered lines of the LLM response into a clean list of questions
questions = [
    re.sub(r"^\d+\.\s*", "", line.strip()).strip("* ")
    for line in llm.choices[0].message.content.splitlines()
    if re.match(r"\d+\.", line.strip())
]

In [77]:
for i, question in enumerate(questions, start=1):
    print(f"Question {i}: {question}")

Question 1: What is the primary indication for Amoxicillin, and what type of infections is it commonly used to treat?
Question 2: What is the mechanism of action of Bisoprolol, and how does it differ from other beta-blockers in terms of its selective beta-1 receptor blockade?
Question 3: What are the common side effects of Citalopram, and how does it compare to other selective serotonin reuptake inhibitors (SSRIs) in terms of its tolerability and efficacy?
Question 4: What is the primary use of Metformin, and how does it help to regulate blood glucose levels in patients with type 2 diabetes?
Question 5: What is the recommended dosage of Paracetamol for adults, and what are the potential risks of overdosing on this medication?
Question 6: Can Amoxicillin be used to treat viral infections, and what are the implications of prescribing antibiotics for non-bacterial infections?
Question 7: How does Bisoprolol affect heart rate and blood pressure, a

#### <b>Task (8): Let your retriever answer the 10 generated questions.</b>
<b>Task details:</b>
- Use the 10 generated questions and have them answered by your RAG chain.
- For each question, output both the retrieved documents and the answer.
- Provide your own assessment of whether your chain works well or not.
- Give an example of what worked well and what did not.

<b style="color: gray;">(max. achievable points: 6)</b>

In [None]:
# Answer the 10 generated questions with the RAG pipeline
for i, question in enumerate(questions, start=1):
    print(f"Question {i}: {question}")

    # Show the documents retrieved for the original question
    # (the chain below retrieves again with the rewritten query)
    retrieved_docs, _ = retrieve_texts(question, 4, index, chunks, model)
    for j, doc in enumerate(retrieved_docs, start=1):
        print(f"Document {j}: {doc[:200]}")

    # Generate the answer with the RAG chain
    answer = answer_query_with_rewriting(question, 4, index, chunks)
    print("Answer:", answer)
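
To support the assessment with a number, one option is a simple hit rate: for each question that names a medication, check whether any retrieved chunk mentions that medication. The sketch below is a hypothetical helper (`medication_hit_rate` and the injected `retrieve` callable are assumptions). It is a crude proxy: spacing artifacts from PDF extraction (for example "Amo xicillin" in the retriever output above) would be counted as misses.

```python
def medication_hit_rate(questions, retrieve, medications, k=4):
    """Share of questions whose retrieved chunks mention the medication they name."""
    hits, counted = 0, 0
    for question in questions:
        named = [m for m in medications if m.lower() in question.lower()]
        if not named:
            continue  # Skip questions that do not name a known medication.
        counted += 1
        retrieved = " ".join(retrieve(question, k)).lower()
        if any(m.lower() in retrieved for m in named):
            hits += 1
    return hits / counted if counted else 0.0


# Demonstration with a stand-in retriever that always returns the same chunk.
meds = ["Amoxicillin", "Bisoprolol", "Citalopram", "Metformin", "Paracetamol"]
demo_questions = ["What is Metformin used for?", "How is Bisoprolol dosed?"]
print(medication_hit_rate(demo_questions, lambda q, k: ["Metformin Filmtabletten"], meds))
```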

#### <b> TASK (9) Your assessment of the quality (double-click to edit the cell below):</b>

- Briefly describe what seems to work well in your RAG pipeline based on the answers to the 10 generated questions above.
- Give at least one example of a question/answer pair that worked particularly well.
- Point out at least one aspect or example where the pipeline could be improved or did not work as expected.

<b style="color: gray;">(max. achievable points: 2)</b>

Because I did not manage to build a functional RAG pipeline, I cannot assess this question. I assume this counts as a consequential error (Folgefehler) in this part.

### Jupyter notebook --footer info-- (please always provide this at the end of each notebook)

In [78]:
import os
import platform
import socket
from platform import python_version
from datetime import datetime

print('-----------------------------------')
print(os.name.upper())
print(platform.system(), '|', platform.release())
print('Datetime:', datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
print('Python Version:', python_version())
print('IP Address:', socket.gethostbyname(socket.gethostname()))
print('-----------------------------------')

-----------------------------------
POSIX
Linux | 6.8.0-1027-azure
Datetime: 2025-05-20 09:24:19
Python Version: 3.12.1
IP Address: 127.0.0.1
-----------------------------------
