# Retrieval-Augmented Generation

LLMs can hallucinate and return incorrect answers even to simple questions. One way to help them is to include an entire document in the prompt (`deepseek-r1:7b` accepts about 200 Wikipedia pages, or the entire first Harry Potter book), but LLMs have difficulty parsing long documents and often miss relevant pieces of information.

In this notebook we embed both the facts and the question, then use cosine similarity between those embeddings to retrieve only the most relevant facts. This keeps the prompt short while still giving the LLM what it needs to answer the question.
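To make the idea concrete before calling any model, here is a minimal sketch of ranking by cosine similarity. The 3-dimensional vectors are hypothetical stand-ins for real embeddings (which have hundreds of dimensions), chosen by hand purely for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings", made up for illustration only.
toy_facts = {
    "Cats sleep most of the day.": [0.9, 0.1, 0.0],
    "The Singapura is a small breed.": [0.1, 0.9, 0.2],
}
toy_question = [0.2, 0.8, 0.1]  # stands in for an embedded question about size

# Rank the facts from most to least similar to the question.
ranked = sorted(
    toy_facts,
    key=lambda fact: cosine_similarity(toy_question, toy_facts[fact]),
    reverse=True,
)
print(ranked[0])
```

The fact whose vector points in nearly the same direction as the question vector ranks first, regardless of the vectors' lengths.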

In [6]:
import requests
import ollama
LANGUAGE_MODEL = 'deepseek-r1:7b'
EMBEDDING_MODEL = 'nomic-embed-text:latest'

## LLM hallucinations on simple questions

If asked *"What is the smallest cat?"*, the model hallucinates a cat breed called "Japanese Yuki" which doesn't actually exist.

When asked *"What is the smallest domestic cat breed?"* it tells us that the "White-Ledger short-haired Midget measures approximately 2-3 inches in length".

In [7]:
%%time

QUESTION = "What is the smallest domestic cat breed?"

stream = ollama.generate(
    model=LANGUAGE_MODEL, 
    prompt=QUESTION, 
    stream=True,
    options={"temperature": 0},
)
for chunk in stream:
  print(chunk['response'], end='', flush=True)

<think>
Okay, so I need to figure out what the smallest domestic cat breed is. I know that cats come in all sorts of sizes and shapes, but I'm not exactly sure which one is the smallest. I've heard about some really tiny animals before, like mice or maybe even some insects, but cats are mammals, so they must be bigger than those.

First, I should probably consider what defines a "domestic cat breed." They're all part of the feline family, Felivora, right? So, they have to be domesticated for human habitation. Breeds vary in size, color, and other traits. Some are known for being large, like Siamese or Russian Blues, while others might be more compact.

I remember hearing about something called a "Guinea Pig," but wait, that's not a cat—it's a rodent. So maybe I should focus on actual felines here. Then there's the "Mice" family, which includes mice and similar small creatures, but again, they're not cats. Wait, no—there are also "Rabbits," but rabbits aren't considered domesticated in 

## Feeding it relevant information through embedded facts

This code is taken from Notebook #3:

In [12]:
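cat_facts_url = 'https://huggingface.co/ngxson/demo_simple_rag_py/raw/main/cat-facts.txt'
response = requests.get(cat_facts_url)
response.raise_for_status()  # stop early if the download failed
# One fact per line; skip blank lines so every chunk has a usable embedding
cat_facts = [line for line in response.text.split("\n") if line.strip()]
EMBEDDED_FACTS = []
for fact in cat_facts:
    EMBEDDED_FACTS.append( ollama.embed(model=EMBEDDING_MODEL, input=fact)["embeddings"][0] )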

def cosine_similarity(a, b):
  dot_product = sum([x * y for x, y in zip(a, b)])
  norm_a = sum([x ** 2 for x in a]) ** 0.5
  norm_b = sum([x ** 2 for x in b]) ** 0.5
  return dot_product / (norm_a * norm_b)

query_embedding = ollama.embed(model=EMBEDDING_MODEL, input=QUESTION)['embeddings'][0]

similarities = []

for chunk, embedding in zip(cat_facts,EMBEDDED_FACTS):
    similarity = cosine_similarity(query_embedding, embedding)
    similarities.append((chunk, similarity))

similarities.sort(key=lambda x: x[1], reverse=True)

print("Top 2 most relevant chunks:\n")
for chunk in similarities[:2]:
    print(chunk,"\n")

Top 2 most relevant chunks:

('The smallest wildcat today is the Black-footed cat. The females are less than 20 inches (50 cm) long and can weigh as little as 2.5 lbs. (1.2 kg).', 0.7901366658261361) 

('The smallest pedigreed cat is a Singapura, which can weigh just 4 lbs. (1.8 kg), or about five large cans of cat food. The largest pedigreed cats are Maine Coon cats, which can weigh 25 lbs. (11.3 kg), or nearly twice as much as an average cat weighs.', 0.7765075430480689) 
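The ranking above sorts every fact and keeps the first two. As a sketch, the same top-k selection can be wrapped in a reusable function. `top_k_chunks` is a hypothetical helper, not part of the notebook, and the 2-dimensional vectors are made up so that the example runs without Ollama:

```python
import heapq
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k_chunks(query_embedding, chunks, embeddings, k=2):
    """Return the k (chunk, similarity) pairs with the highest similarity."""
    scored = (
        (chunk, cosine_similarity(query_embedding, emb))
        for chunk, emb in zip(chunks, embeddings)
    )
    # nlargest keeps only the k best pairs instead of sorting the whole list
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Made-up vectors, for illustration only
chunks = ["fact A", "fact B", "fact C"]
embeddings = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
best = top_k_chunks([0.0, 1.0], chunks, embeddings, k=2)
print([chunk for chunk, _ in best])
```

With only about a hundred short facts, a full sort is just as fast; the function mainly makes `k` a parameter instead of a hard-coded slice.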



Now, instead of just asking the question, we are going to pass this information in the prompt:

In [15]:
AUGMENTED_QUESTION = f"""INSTRUCTIONS:
Answer the QUESTION below using the DOCUMENT below as context.
Do not make up any information. If the answer is not given in the document, say that you don't know.

DOCUMENT:
{similarities[0][0]}
{similarities[1][0]}

QUESTION:
{QUESTION}"""

print(AUGMENTED_QUESTION)

INSTRUCTIONS:
Answer the QUESTION below using the DOCUMENT below as context.
Do not make up any information. If the answer is not given in the document, say that you don't know.

DOCUMENT:
The smallest wildcat today is the Black-footed cat. The females are less than 20 inches (50 cm) long and can weigh as little as 2.5 lbs. (1.2 kg).
The smallest pedigreed cat is a Singapura, which can weigh just 4 lbs. (1.8 kg), or about five large cans of cat food. The largest pedigreed cats are Maine Coon cats, which can weigh 25 lbs. (11.3 kg), or nearly twice as much as an average cat weighs.

QUESTION:
What is the smallest domestic cat breed?
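The prompt above hard-codes exactly two chunks. A minimal sketch of a helper that accepts any number of retrieved chunks could look like this; `build_prompt` is a hypothetical name, and the chunks in the usage example are shortened for illustration:

```python
def build_prompt(question, chunks):
    """Assemble the augmented prompt from a question and retrieved chunks."""
    document = "\n".join(chunks)
    return (
        "INSTRUCTIONS:\n"
        "Answer the QUESTION below using the DOCUMENT below as context.\n"
        "Do not make up any information. If the answer is not given in the document, "
        "say that you don't know.\n\n"
        f"DOCUMENT:\n{document}\n\n"
        f"QUESTION:\n{question}"
    )

prompt = build_prompt(
    "What is the smallest domestic cat breed?",
    ["The smallest pedigreed cat is a Singapura.", "Cats sleep a lot."],  # illustrative chunks
)
print(prompt)
```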


Let's ask the LLM again:

In [16]:
%%time
stream = ollama.generate(
    model=LANGUAGE_MODEL, 
    prompt=AUGMENTED_QUESTION, 
    stream=True,
    options={"temperature": 0},
)
for chunk in stream:
  print(chunk['response'], end='', flush=True)

<think>
Okay, so I need to figure out what the smallest domestic cat breed is based on the document provided. Let me read through the document again carefully.

The document starts by talking about wildcats and then moves into domestic cats. It mentions two specific breeds: the Black-footed cat and the Singapura as a pedigreed cat, but I'm not sure if they're domestic or wildcats. Wait, no—the first part is about wildcats, so maybe those are separate from domestic breeds.

Looking further down, it says something about the smallest piedreed cats being the Singapura, weighing around 4 lbs (1.8 kg). Then it mentions that the largest pedigreed cats are Maine Coon at 25 lbs (11.3 kg), which is almost twice as much as an average cat.

Wait a minute, so the document talks about both wild and domestic cats but focuses on their weights. The question is asking for the smallest domestic cat breed. So I should focus on the domestic breeds mentioned here.

The document lists two domestic breeds: Si

We have our answer!
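The response from `deepseek-r1` begins with a `<think>…</think>` block, followed by the final answer. A minimal sketch of separating the two, assuming the closing tag appears at most once (`split_reasoning` is a hypothetical helper, and the sample string is hand-written rather than real model output):

```python
def split_reasoning(full_answer):
    """Split a response into its <think> block and the final answer."""
    thought, tag, rest = full_answer.partition("</think>")
    if not tag:  # no closing tag: treat the whole text as the answer
        return {"chain_of_thought": "", "answer": full_answer.strip()}
    return {"chain_of_thought": thought + tag, "answer": rest.strip()}

# Hand-written sample response, not real model output
sample = "<think>\nThe document names the Singapura.\n</think>\n\nThe Singapura."
parts = split_reasoning(sample)
print(parts["answer"])
```

Using `str.partition` rather than `split(...)[1]` avoids an `IndexError` when the tag is missing, for example if generation is cut off mid-reasoning.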

## Wrapped in a class

In [42]:
import requests
import ollama

def cosine_similarity(a, b):
    dot_product = sum([x * y for x, y in zip(a, b)])
    norm_a = sum([x ** 2 for x in a]) ** 0.5
    norm_b = sum([x ** 2 for x in b]) ** 0.5
    return dot_product / (norm_a * norm_b)

class llm_with_rag():
        def __init__(self, LANGUAGE_MODEL, EMBEDDING_MODEL):
            self.LANGUAGE_MODEL = LANGUAGE_MODEL
            self.EMBEDDING_MODEL = EMBEDDING_MODEL
            self.EMBEDDED_FACTS = []

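        def ingest(self,facts_url):
            response = requests.get(facts_url)
            response.raise_for_status()  # stop early if the download failed
            # Skip blank lines, and reset the store so that chunks and
            # embeddings stay aligned if ingest() is called more than once
            self.chunked_facts = [line for line in response.text.split("\n") if line.strip()]
            self.EMBEDDED_FACTS = []
            for fact in self.chunked_facts:
                self.EMBEDDED_FACTS.append( 
                    ollama.embed(model=self.EMBEDDING_MODEL, input=fact)["embeddings"][0] 
                )
            return self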

        def ask(self,QUESTION):
            embedded_question = ollama.embed(
                model=self.EMBEDDING_MODEL, 
                input=QUESTION,
            )['embeddings'][0]
            similarities = []
            for chunk, embedding in zip(self.chunked_facts,self.EMBEDDED_FACTS):
                similarity = cosine_similarity(embedded_question, embedding)
                similarities.append((chunk, similarity))
            similarities.sort(key=lambda x: x[1], reverse=True)
            
            # Build the prompt line by line so that the method's indentation
            # does not leak into the text sent to the model
            self.AUGMENTED_QUESTION = "\n".join([
                "INSTRUCTIONS:",
                "Answer the QUESTION below using the DOCUMENT below as context.",
                "Do not make up any information. If the answer is not given in the document, say that you don't know.",
                "",
                "DOCUMENT:",
                similarities[0][0],
                similarities[1][0],
                "",
                "QUESTION:",
                QUESTION,
            ])

            stream = ollama.generate(
                model=self.LANGUAGE_MODEL, 
                prompt=self.AUGMENTED_QUESTION, 
                stream=True,
                options={"temperature": 0},
            )
            
            full_answer = ""
            for chunk in stream:
                full_answer += chunk['response']
                print(chunk['response'], end='', flush=True)

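            # partition() does not raise if "</think>" is missing
            # (e.g. truncated output); the whole text is then the answer
            thought, tag, rest = full_answer.partition("</think>")
            if not tag:
                thought, rest = "", full_answer
            dict_answer = {
                "chain_of_thought": thought + tag,
                "answer": rest.lstrip()}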
                
            return dict_answer

In [43]:
answer_machine = llm_with_rag('deepseek-r1:7b','nomic-embed-text:latest')
answer_machine = answer_machine.ingest("https://huggingface.co/ngxson/demo_simple_rag_py/raw/main/cat-facts.txt")

In [44]:
answer1 = answer_machine.ask("Are cats happy?")

<think>
Okay, so I need to figure out if cats are happy based on the document provided. Let me read through the document again carefully.

The first paragraph talks about a cat's mood being visible in its eyes. It mentions that a frightened or excited cat has large, round pupils, while an angry cat has narrow ones. It also notes that pupil size relates to emotions and the amount of light. Hmm, so this is talking about how cats' eye movements can indicate their emotional states.

The second paragraph says that cats are social animals and respond to speech, enjoying human companionship. That suggests that cats do get happiness from being around people because they interact with them.

Putting these together, it seems like the document doesn't directly say whether cats are happy or not in general. It just explains how to tell if a cat is excited, angry, etc., based on their eyes, and mentions their social nature which could imply that cats derive happiness from human interaction.

So, I d

In [47]:
print(answer1["chain_of_thought"])
print(answer1["answer"])

<think>
Okay, so I need to figure out if cats are happy based on the document provided. Let me read through the document again carefully.

The first paragraph talks about a cat's mood being visible in its eyes. It mentions that a frightened or excited cat has large, round pupils, while an angry cat has narrow ones. It also notes that pupil size relates to emotions and the amount of light. Hmm, so this is talking about how cats' eye movements can indicate their emotional states.

The second paragraph says that cats are social animals and respond to speech, enjoying human companionship. That suggests that cats do get happiness from being around people because they interact with them.

Putting these together, it seems like the document doesn't directly say whether cats are happy or not in general. It just explains how to tell if a cat is excited, angry, etc., based on their eyes, and mentions their social nature which could imply that cats derive happiness from human interaction.

So, I d