# RAG-based Question Answering

## Objectives (8 points):

In [1]:
!python --version

Python 3.10.16


1. Set up the QA environment:
   * Install OLLAMA and select an appropriate LLM
   * Configure [Qdrant](https://qdrant.tech/) vector database (or vector DB of your choosing)
   * Install necessary Python packages for embedding generation

In [2]:
from qdrant_client import models, QdrantClient

client = QdrantClient(url="http://localhost:6333")

In [48]:
client.create_collection(
    collection_name="pdf3",
    vectors_config=models.VectorParams(
        size=4096,
        distance=models.Distance.COSINE,
    ),
)

True

2. Find PDF file of your choosing. Example - some publication or CV file:

3. Write next procedures necessary for RAG pipeline. Use [LangChain](https://python.langchain.com/docs/introduction/) library:
 
   * Load PDF file using `PyPDFLoader`.  
   * Split documents into appropriate chunks using `RecursiveCharacterTextSplitter`.
   * Generate and store embeddings in Qdrant database

In [4]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [5]:
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langgraph.graph import START, StateGraph
from typing_extensions import List, TypedDict

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [49]:
loader = PyPDFLoader("terminology.pdf")
pages = loader.load_and_split()
page_content = list(map(lambda x: x.page_content, pages))

In [50]:
print(len(pages))
print(pages[0].page_content)

4
John Colet SchoolEnglish DepartmentKey Subject Terminology
Key term Definition Example
Adjective Describes a noun, gives more informationabout it
Beautiful, stunning, disgraceful,angry
Adverb Describes a verb, gives more informationabout it.
Angrily, happily, joyfully.
Allegory A type of writing in which the settings,characters, or events stand for other, oftenlarger ideas
The novelAnimal Farmappears tobe about a group of animals, butthey represent larger ideas aboutrevolution and politics
Alliteration The same letter or sound at the beginningof words close to each other.
Brilliant birdsSlithering snakeSweet birds sang
Allusion Making reference to people, places,events, literary works, myths, or works ofart
‘Don’t be such a Scrooge’‘Is there a good Samaritan who canhelp me?’
Anaphora Repetition of a word or phrase at thebeginning of successive sentences, phrasesor clauses
‘It was the best of times, it was theworst of times’
Antithesis A person or thing that is the directopposite of s

In [53]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=150,
    length_function=len,
    is_separator_regex=False,
    add_start_index=True,
)

In [54]:
texts = text_splitter.create_documents(page_content)
# texts = text_splitter.create_documents([pages[0].page_content])
print(texts[0])
print(texts[1])

page_content='John Colet SchoolEnglish DepartmentKey Subject Terminology
Key term Definition Example
Adjective Describes a noun, gives more informationabout it
Beautiful, stunning, disgraceful,angry
Adverb Describes a verb, gives more informationabout it.
Angrily, happily, joyfully.
Allegory A type of writing in which the settings,characters, or events stand for other, oftenlarger ideas
The novelAnimal Farmappears tobe about a group of animals, butthey represent larger ideas aboutrevolution and politics' metadata={'start_index': 0}
page_content='The novelAnimal Farmappears tobe about a group of animals, butthey represent larger ideas aboutrevolution and politics
Alliteration The same letter or sound at the beginningof words close to each other.
Brilliant birdsSlithering snakeSweet birds sang
Allusion Making reference to people, places,events, literary works, myths, or works ofart
‘Don’t be such a Scrooge’‘Is there a good Samaritan who canhelp me?’' metadata={'start_index': 376}


In [55]:
from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="mistral",
    temperature=0,
)

In [56]:
from langchain_ollama import OllamaEmbeddings

embed = OllamaEmbeddings(
    model="mistral"
)

In [57]:
print(texts)

[Document(metadata={'start_index': 0}, page_content='John Colet SchoolEnglish DepartmentKey Subject Terminology\nKey term Definition Example\nAdjective Describes a noun, gives more informationabout it\nBeautiful, stunning, disgraceful,angry\nAdverb Describes a verb, gives more informationabout it.\nAngrily, happily, joyfully.\nAllegory A type of writing in which the settings,characters, or events stand for other, oftenlarger ideas\nThe novelAnimal Farmappears tobe about a group of animals, butthey represent larger ideas aboutrevolution and politics'), Document(metadata={'start_index': 376}, page_content='The novelAnimal Farmappears tobe about a group of animals, butthey represent larger ideas aboutrevolution and politics\nAlliteration The same letter or sound at the beginningof words close to each other.\nBrilliant birdsSlithering snakeSweet birds sang\nAllusion Making reference to people, places,events, literary works, myths, or works ofart\n‘Don’t be such a Scrooge’‘Is there a good S

In [59]:
client.upload_points(
    collection_name="pdf3",
    points=[
        models.PointStruct(
            id=idx, vector=embed.embed_query(doc.page_content), payload=doc.metadata
        )
        for idx, doc in enumerate(texts)
    ],
)

In [68]:
rag_prompt = hub.pull("rlm/rag-prompt")



4. Design and implement the RAG pipeline with `LCEL`. As reference use this detailed guide created by LangChain community - [RAG](https://python.langchain.com/docs/tutorials/rag/). Next steps should involve:
   * Create query embedding generation
   * Implement semantic search in Qdrant
   * Design prompt templates for context integration
   * Build response generation with the LLM

Hint: You don't need to build it from scratch. A lot of this steps is already automated using LCEL pipeline definition.


In [69]:
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

def compare_payload(hits, start_index):
    for hit in hits:
        if hit.payload['start_index'] == start_index:
            return True 
    return False

def retrieve(state):
    hits = client.query_points(
        collection_name="pdf2",
        query=embed.embed_query(state["question"]),
        limit=3,
    ).points

    if len(hits) >= 1:
        return {"context": [text for text in texts if compare_payload(hits, text.metadata['start_index'])]}
    else:
        return {"context": []}


def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = rag_prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}

graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

In [62]:
response = graph.invoke({"question": "What is Allegory?"})
print(response["answer"])

 Allegory is a type of writing where the settings, characters, or events stand for other, often larger ideas. For example, in Animal Farm, the animals represent larger ideas about revolution and politics.


5. Implement basic retrieval strategies (semantic search).

In [63]:
def semantic_search(question):
    hits = client.query_points(
        collection_name="pdf2",
        query=embed.embed_query(question),
        limit=3,
    ).points

    return [text for text in texts if compare_payload(hits, text.metadata['start_index'])]

6. Create basic QA prompt.

In [64]:
prompts = [
    "What is an Allegory?",
    "What is a Characterisation?",
    "What are examples of Dramatic monologue?",
    "Which word describes Deliberate exaggeration?",
    "What kind of term describes this sentence: 'You don’t have to be Albert Einstein to understand poetry.'?"
]

answers =[
    'A type of writing in which the settings, characters, or events stand for other, often larger ideas',
    'How a character is introduced and developed, through what the writer informs us about them',
    "'My Last Duchess'",
    'Hyperbole',
    'Allusion '
]

7. Determine 5 evaluation queries:
    - Determine a few questions, which answers are confirmed by you.

### RAG

In [70]:
prompt_answers = []

for prompt in prompts:
    print(f"Prompt: {prompt}")
    response = graph.invoke({"question": prompt})
    prompt_answers.append(response['answer'])
    print(response["answer"])
    print(f"============================================================")

Prompt: What is an Allegory?
 An Allegory is a type of writing where the settings, characters, or events represent larger ideas. For example, in Animal Farm, the animals symbolize political figures and events.
Prompt: What is a Characterisation?
 A characterization is the process or result of depicting the personality, appearance, and other distinguishing features of a fictional or real character in a story, novel, play, film, etc. It helps to make characters more relatable and engaging for the audience.
Prompt: What are examples of Dramatic monologue?
 Examples of Dramatic monologues include "Ozymandias" by Percy Bysshe Shelley and "The Love Song of J. Alfred Prufrock" by T.S. Eliot. These poems present the speech or conversation of a person in a dramatic manner, allowing the audience to understand events from their point of view.
Prompt: Which word describes Deliberate exaggeration?
 The word describing deliberate exaggeration is "Hyperbole".
Prompt: What kind of term describes this 

1. answer is ok, from proper context
2. answer is ok, but not from the context
3. answer is ok, from the context, but not from desired part of it
4. answer is ok, from context
5. answer is not ok, not from context

### Semantic search

In [71]:
prompt_answers2 = []

for prompt in prompts:
    print(f"Prompt: {prompt}")
    resp = semantic_search(prompt)
    prompt_answers2.append(resp)
    print(resp)
    print(f"============================================================")

Prompt: What is an Allegory?
[Document(metadata={'start_index': 0}, page_content='John Colet SchoolEnglish DepartmentKey Subject Terminology\nKey term Definition Example\nAdjective Describes a noun, gives more informationabout it\nBeautiful, stunning, disgraceful,angry\nAdverb Describes a verb, gives more informationabout it.\nAngrily, happily, joyfully.\nAllegory A type of writing in which the settings,characters, or events stand for other, oftenlarger ideas\nThe novelAnimal Farmappears tobe about a group of animals, butthey represent larger ideas aboutrevolution and politics'), Document(metadata={'start_index': 0}, page_content='Connotation The feelings or associations suggested bywords/phrases. These can help to find thehidden meaning.\nThe word ‘discipline’ has unhappyconnotations of punishment andrepression\nDramatic irony In a Literature text when the audienceknows something that the characters don’tknow\nIn Romeo and Juliet the audienceknows from the start that the maincharacter

### LLM

In [74]:
prompt_answers3 = []

for prompt in prompts:
    print(f"Prompt: {prompt}")
    response = llm.invoke(prompt)
    prompt_answers3.append(response)
    print(response)
    print(f"============================================================")

Prompt: What is an Allegory?
content=' An allegory is a literary device or figurative language that uses symbols to represent abstract ideas, actions, or qualities. In an allegory, the characters, settings, and events in a story are not meant to be taken literally but rather as representations of something else, such as moral lessons, political ideologies, or philosophical concepts. The meaning of an allegory is often symbolic and requires interpretation by the reader or listener. Examples of famous allegories include Aesop\'s Fables, John Bunyan\'s "Pilgrim\'s Progress," George Orwell\'s "Animal Farm," and Jonathan Swift\'s "Gulliver\'s Travels."' additional_kwargs={} response_metadata={'model': 'mistral', 'created_at': '2025-01-20T11:19:12.328018994Z', 'done': True, 'done_reason': 'stop', 'total_duration': 57373007296, 'load_duration': 11534503, 'prompt_eval_count': 12, 'prompt_eval_duration': 442000000, 'eval_count': 143, 'eval_duration': 56918000000, 'message': Message(role='assist

1. answer is ok
2. answer is ok
3. answer is ok
4. answer is ok
5. answer is not ok

8. Compare performance of RAG vs. pure LLM response.

RAG had few problems to answer questions:
- Sometimes it didn't capture context well. It couldn't retrieve good part of the text. There is also possibility that this context will be truncated a bit with valuable information.
- It takes longer to prepare and compute RAG than LLM
But if question requires knowledge from specific context rather from general knowledge of LLM model RAG could be more suitable.

LLM returned comparable results. Of course answers where out of context of the text, but still plausible.
LLM cannot capture knowledge that wasn't learnt, but RAG where we provide context can answer more precisely.

Questions (2 points):

1. How does RAG improve the quality and reliability of LLM responses compared to pure LLM generation?\
    **It returns an answer from provided context. For RAG response above, example Ozymandias was from the text, but still wasn't the perfect example.\
        However, LLM returned very different example apart from the scope of the text.**
2. What are the key factors affecting RAG performance (chunk size, embedding quality, prompt design)?\
    **There are many different factors and every is important. 
     - Chunk size returns how big part of text should be considered. In presented example it should be big enought to capture the definition of a term.
        Also there is overlapping size factor which is also important if we don't want to miss important snipped that was truncated and was left in different chunk.
     - creation of embeddings depends on the provided model and data on which it was trained. Also size of embeddings matter to embody more complex meanings
     -  prompt design, how well we can formulate question and how big and proper is provided context are very important factors. The more exact and precise question without
            underlying meaning is the better. And context also should capture what is relevant.
     - also there is an approach to add different filters to analyze the query and rewrite or transform to achieve better outcome.**
3. How does the choice of vector database and embedding model impact system performance?\
    **It depends how big embedding model is and what similarity uses, also on what data was trained. The better these factors the more accurate results will be achieved.
        But preparing all data, embedding model and computing results will take time.**
4. What are the main challenges in implementing a production-ready RAG system?\
    **Good and well prepared, fine-tuned LLM model.
      Proper database or tokenization model with storage system.
      Prepared prompts and tools or filters for analysis.
      Prepared text and context of searching for best results. Business kind terminology can be a problem of understanding it.
      **
5. How can the system be improved to handle complex queries requiring multiple document lookups?\
    **RAG pipline can be extended for multiple search.
        Prior analysis can be added to distill more importatnt information.
        We can add filters to enchance queries and rewrite them in a better form.**
