# Improving RAG Retrieval (Feature Track 4)

Coded by Adelene Lai (Econetta AG)

26.02.2026

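Chunks can be retrieved in several ways:
* Dense - ranks chunks by the distance between the query embedding and each chunk embedding
* Sparse - ranks chunks by how well their (key)words match the query, e.g. BM25
* Hybrid - ranks chunks by a weighted sum of the sparse and dense scores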
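
The three modes above can be sketched on toy data. This is a minimal illustration, not the toolkit's implementation: the helper names (`cosine`, `keyword_overlap`, `hybrid`), the weight `alpha`, the 3-dimensional vectors and the document texts are all made up for the example, and the keyword overlap is a much cruder sparse score than BM25. How `HybridRetriever` actually combines its scores is not shown in this notebook. Note that the sketch uses similarities (higher is better), whereas a vector store may report distances (lower is better).

```python
import math
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: list[float], b: list[float]) -> float:
    """Dense score: cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, doc: str) -> float:
    """Toy sparse score: fraction of query tokens that occur in the document."""
    query_tokens = tokenize(query)
    doc_tokens = set(tokenize(doc))
    if not query_tokens:
        return 0.0
    return sum(1 for t in query_tokens if t in doc_tokens) / len(query_tokens)


def hybrid(dense: float, sparse: float, alpha: float = 0.5) -> float:
    """Weighted sum: alpha=1.0 is purely dense, alpha=0.0 is purely sparse."""
    return alpha * dense + (1.0 - alpha) * sparse


# Made-up documents with made-up embeddings (assumption, for illustration only).
docs = {
    "logypal": ("The Logypal 1 pallet is made from recycled plastic", [0.9, 0.1, 0.0]),
    "policy": ("Supplier sustainability requirements", [0.1, 0.9, 0.2]),
}
query = "Logypal 1 materials"
query_vec = [0.8, 0.2, 0.1]  # made-up query embedding

scores = {
    name: hybrid(cosine(query_vec, vec), keyword_overlap(query, text))
    for name, (text, vec) in docs.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
```

With `alpha` close to 1.0 the ranking follows the embeddings; with `alpha` close to 0.0 it follows exact word matches, which tends to help for product names such as "Logypal 1".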

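Each retriever returns the top-k chunks, i.e. the k chunks closest to the query; the dense retriever finds them with approximate nearest neighbour (ANN) search.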
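
For comparison, exact top-k search simply scores every chunk and keeps the k best. ANN indexes approximate this result while avoiding a full scan. The sketch below is a brute-force baseline on made-up 2-dimensional vectors; `top_k_exact` and the toy data are hypothetical names for this example and are not part of the toolkit.

```python
import heapq


def sq_dist(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k_exact(
    query_vec: list[float], doc_vecs: dict[str, list[float]], k: int
) -> list[tuple[float, str]]:
    """Return the k (distance, doc_id) pairs with the smallest distance to the query."""
    scored = ((sq_dist(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items())
    return heapq.nsmallest(k, scored)


# Made-up vectors (assumption, for illustration only).
doc_vecs = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [0.1, 0.0],
    "d": [5.0, 5.0],
}
nearest = top_k_exact([0.0, 0.0], doc_vecs, k=2)
nearest_ids = [doc_id for _, doc_id in nearest]
```

For a small collection such as the 31 chunks loaded below, a full scan like this is cheap; the approximation matters mainly for much larger collections.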

Let's compare the retrievers for different questions.

In [1]:
from pathlib import Path

from conversational_toolkit.agents.base import QueryWithContext
from conversational_toolkit.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from conversational_toolkit.retriever.vectorstore_retriever import VectorStoreRetriever
from conversational_toolkit.retriever.bm25_retriever import BM25Retriever
from conversational_toolkit.retriever.hybrid_retriever import HybridRetriever
from conversational_toolkit.retriever.reranking_retriever import RerankingRetriever

# from conversational_toolkit.retriever.XYZ import XYZRetriever

from sme_kt_zh_collaboration_rag.feature0_baseline_rag import (
    load_chunks,
    inspect_chunks,
    build_vector_store,
    inspect_retrieval,
    build_agent,
    build_llm,
    ask,
    DATA_DIR,
    VS_PATH,
    EMBEDDING_MODEL,
    RETRIEVER_TOP_K,
)

# Choose your LLM backend: "ollama" (local, requires `ollama serve`) or "openai" (requires OPENAI_API_KEY)
BACKEND = "openai"  # set this before running

if not BACKEND:
    raise ValueError(
        'BACKEND is not set. Edit the line above and set it to "ollama" or "openai".\n'
        "See Renku_README.md for setup instructions."
    )



In [2]:
chunks = load_chunks(5)

embedding_model = SentenceTransformerEmbeddings(model_name=EMBEDDING_MODEL)
vector_store = await build_vector_store(chunks, embedding_model, reset=False)
llm = build_llm("openai", model_name="gpt-4o-mini")

2026-02-26 15:38:13.411 | INFO     | sme_kt_zh_collaboration_rag.feature0_baseline_rag:load_chunks:205 - Chunking 5 files from /home/alai/projects/myfork/sme-kt-zh-collaboration-rag/data


5


2026-02-26 15:38:13.696 | DEBUG    | sme_kt_zh_collaboration_rag.feature0_baseline_rag:load_chunks:217 -   ART_internal_procurement_policy.pdf: 12 chunks
2026-02-26 15:38:13.893 | DEBUG    | sme_kt_zh_collaboration_rag.feature0_baseline_rag:load_chunks:217 -   ART_logylight_incomplete_datasheet.pdf: 6 chunks
2026-02-26 15:38:14.004 | DEBUG    | sme_kt_zh_collaboration_rag.feature0_baseline_rag:load_chunks:217 -   ART_product_catalog.pdf: 7 chunks
2026-02-26 15:38:14.010 | DEBUG    | sme_kt_zh_collaboration_rag.feature0_baseline_rag:load_chunks:217 -   ART_product_overview.xlsx: 1 chunks
2026-02-26 15:38:14.121 | DEBUG    | sme_kt_zh_collaboration_rag.feature0_baseline_rag:load_chunks:217 -   ART_relicyc_logypal1_datasheet_2021.pdf: 5 chunks
2026-02-26 15:38:14.121 | INFO     | sme_kt_zh_collaboration_rag.feature0_baseline_rag:load_chunks:221 - Done, 31 chunks total
2026-02-26 15:38:16.578 | DEBUG    | conversational_toolkit.embeddings.sentence_transformer:__init__:57 - Sentence Transfo

In [None]:
vector_store.collection.get(include=["documents"], limit=5)

{'ids': ['0a233a44-8d19-444f-a82e-664d900f4057',
  '943415f7-8895-4f53-acb8-206846a06a8f',
  '324c818a-98ab-4581-8c3c-e118613ff6f6',
  'a5b777a0-3d21-47bf-814a-84edb2f0a0d5',
  'e92d745d-9087-42a7-83bd-6f0a5c30798f'],
 'embeddings': array([[-0.00205292,  0.01495177,  0.01680284, ..., -0.05399864,
          0.08846301,  0.03722777],
        [-0.04972519,  0.06236169,  0.01497115, ..., -0.05622729,
          0.11732304, -0.05363992],
        [-0.073636  ,  0.02206017,  0.01610589, ..., -0.04494889,
          0.09938193,  0.01417393],
        [ 0.01843624,  0.04386376,  0.01767073, ...,  0.03721605,
          0.00554983,  0.05488132],
        [-0.06005046,  0.01682185,  0.04309629, ..., -0.14745618,
          0.04135488, -0.00415223]], shape=(5, 384)),
 'documents': ['# Supplier Sustainability Requirements\n\nVersion: 1.2 | Approved by CEO (Andrea Frei) | Effective: 1 January 2024 Classification: Internal use only, do not share externally without management approval\n\n',
  "## 1. Purpose

## Try the standard VectorStoreRetriever (dense retrieval)

In [8]:
QUERY = "What materials is the Logypal 1 pallet made from?"

results = await inspect_retrieval(
    QUERY,
    vector_store,
    retriever=VectorStoreRetriever(
        embedding_model=embedding_model,
        vector_store=vector_store,
        top_k=RETRIEVER_TOP_K,
    ),
)

2026-02-26 15:52:35.632 | DEBUG    | conversational_toolkit.embeddings.sentence_transformer:get_embeddings:76 - sentence-transformers/all-MiniLM-L6-v2 embeddings size: (1, 384)
2026-02-26 15:52:35.636 | INFO     | sme_kt_zh_collaboration_rag.feature0_baseline_rag:inspect_retrieval:308 - Retrieval for query: 'What materials is the Logypal 1 pallet made from?'



Top-<conversational_toolkit.vectorstores.chromadb.ChromaDBVectorStore object at 0x7f499725acf0> retrieved chunks (returned=5; showing a maximum of 1000 content characters):
  [1] score=0.4178  file='ART_relicyc_logypal1_datasheet_2021.pdf'  title='## Overview'
       "## Overview\n\n The Logypal 1 is Relicyc's flagship pallet, manufactured from 100% post-consumer recycled plastic, primarily sourced from end-of-life agricultural packaging (silage film) and industrial packaging waste. It is designed as a direct drop-in replacement for standard EUR wood pallets (1200 × 800 mm)."
  [2] score=0.7969  file='ART_logylight_incomplete_datasheet.pdf'  title='## Product Overview'
       "## Product Overview\n\n The LogyLight is Relicyc's newest pallet model, optimised for reduced weight without sacrificing load capacity. It targets logistics operations where payload weight is a constraint. The pallet body is produced from post-consumer recycled HDPE collected from industrial packaging waste stre

In [10]:
QUERY = "What materials is the Logypal 1 pallet made from?"

results = await inspect_retrieval(
    QUERY,
    vector_store,
    retriever=BM25Retriever(vector_store=vector_store, top_k=RETRIEVER_TOP_K),
)

AttributeError: 'str' object has no attribute 'content'
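
The traceback is not included here, so the cause cannot be determined from this output alone. The message itself means that some code accessed `.content` on a plain string where it expected an object carrying that attribute. The sketch below reproduces this failure mode in isolation; `Chunk` and `text_of` are hypothetical stand-ins for the example and are neither the toolkit's classes nor its fix.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a chunk object; not the toolkit's class."""

    content: str


def text_of(item) -> str:
    """Return the text of a chunk-like object or of a plain string."""
    return item if isinstance(item, str) else item.content


# A mixed list: one chunk-like object and one plain string.
items = [Chunk("recycled plastic pallet"), "plain string document"]

message = ""
try:
    # Accessing .content works for the Chunk but fails for the plain string.
    [item.content for item in items]
except AttributeError as exc:
    message = str(exc)

# Normalising the inputs first avoids the error.
texts = [text_of(item) for item in items]
```

When debugging the real call, checking the type of the items that reach the retriever (for example with `type(item)`) is a quick way to see whether strings are being passed where chunk objects are expected.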

In [4]:
llm = build_llm(backend=BACKEND)

SYSTEM_PROMPT = (
    "You are a helpful AI assistant specialised in sustainability and product compliance for PrimePack AG.\n\n"
    "You will receive document excerpts relevant to the user's question. Produce the best possible answer using only the information in those excerpts."
)
agent = build_agent(
    llm=llm,
    top_k=RETRIEVER_TOP_K,
    system_prompt=SYSTEM_PROMPT,
    number_query_expansion=0,  # 0 = no expansion; see Feature Track 3 for more
    retriever=VectorStoreRetriever(
        embedding_model=embedding_model,
        vector_store=vector_store,
        top_k=RETRIEVER_TOP_K,
    ),
)
print("RAG agent assembled.")

2026-02-26 15:38:33.878 | INFO     | sme_kt_zh_collaboration_rag.feature0_baseline_rag:build_llm:140 - LLM backend: OpenAI (gpt-4o-mini)
2026-02-26 15:38:33.898 | DEBUG    | conversational_toolkit.llms.openai:__init__:63 - OpenAI LLM loaded: gpt-4o-mini; temperature: 0.3; seed: 42; tools: None; tool_choice: None; response_format: {'type': 'text'}
2026-02-26 15:38:33.899 | INFO     | sme_kt_zh_collaboration_rag.feature0_baseline_rag:build_agent:349 - RAG agent ready (top_k=5  query_expansion=0)


RAG agent assembled.


In [5]:
QUERY = "What materials is the Logypal 1 pallet made from?"

print("---------------------------")
print(f"Query: {QUERY!r}")
print("---------------------------")
answer = await ask(agent, QUERY)

2026-02-26 15:38:36.913 | INFO     | sme_kt_zh_collaboration_rag.feature0_baseline_rag:ask:366 - Query: 'What materials is the Logypal 1 pallet made from?'


---------------------------
Query: 'What materials is the Logypal 1 pallet made from?'
---------------------------


2026-02-26 15:38:37.191 | DEBUG    | conversational_toolkit.embeddings.sentence_transformer:get_embeddings:76 - sentence-transformers/all-MiniLM-L6-v2 embeddings size: (1, 384)
2026-02-26 15:38:38.638 | INFO     | sme_kt_zh_collaboration_rag.feature0_baseline_rag:ask:369 - Answer:


The Logypal 1 pallet is made from 100% post-consumer recycled plastic, primarily sourced from end-of-life agricultural packaging, such as silage film, and industrial packaging waste.
Sources (5):
  'ART_relicyc_logypal1_datasheet_2021.pdf'  |  '## Overview'
  'ART_logylight_incomplete_datasheet.pdf'  |  '## Product Overview'
  'ART_logylight_incomplete_datasheet.pdf'  |  '## Material Composition'
  'ART_relicyc_logypal1_datasheet_2021.pdf'  |  '## Material Composition (2021)'
  'ART_logylight_incomplete_datasheet.pdf'  |  '# Relicyc LogyLight — Sustainability Information Sheet'


In [6]:
agent = build_agent(
    llm=llm,
    top_k=RETRIEVER_TOP_K,
    system_prompt=SYSTEM_PROMPT,
    number_query_expansion=0,
    retriever=BM25Retriever(vector_store=vector_store, top_k=RETRIEVER_TOP_K),
)
print("RAG agent assembled.")

AttributeError: 'str' object has no attribute 'content'