
## 1. Make sure the following packages are installed in your environment

In [4]:
# !pip install --upgrade pip
# !pip install langchain==0.3.25 langchain-community==0.3.25 langchain-ollama==0.3.3 chromadb==1.0.12
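
To confirm that the pinned versions are the ones actually installed, a small check like the following can help. It is a minimal sketch using only the standard library; the helper name `installed_version` and the `PINNED` dictionary are assumptions for illustration, not part of LangChain.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Packages and versions pinned in the install cell.
PINNED = {
    "langchain": "0.3.25",
    "langchain-community": "0.3.25",
    "langchain-ollama": "0.3.3",
    "chromadb": "1.0.12",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name, wanted in PINNED.items():
    found = installed_version(name)
    status = "ok" if found == wanted else "MISMATCH"
    print(f"{name}: wanted {wanted}, found {found} [{status}]")
```

Any line marked `MISMATCH` points to a package that is missing or installed at a different version than the one this notebook was written against.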

In [5]:
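# Document loaders and vector stores live in langchain_community;
# the old langchain.* import paths are deprecated.
from langchain_community.document_loaders import TextLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_ollama import OllamaEmbeddings, OllamaLLM
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA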

## 2. Load your documents

In [6]:
loader = DirectoryLoader(
    "./documents",
    glob="**/*.txt",
    loader_cls=TextLoader,
    loader_kwargs={"encoding": "utf-8"}
)
docs = loader.load()
print(f"One single document:\n\n'{docs[1].page_content}'\n\nFrom the text file:\n'{docs[1].metadata['source']}'")

One single document:

'CLEVER HANS

The mother of Hans said: ‘Whither away, Hans?’ Hans answered: ‘To Gretel.’ ‘Behave well, Hans.’ ‘Oh, I’ll behave well. Goodbye, mother.’ ‘Goodbye, Hans.’ Hans comes to Gretel. ‘Good day, Gretel.’ ‘Good day, Hans. What do you bring that is good?’ ‘I bring nothing, I want to have something given me.’ Gretel presents Hans with a needle, Hans says: ‘Goodbye, Gretel.’ ‘Goodbye, Hans.’

Hans takes the needle, sticks it into a hay-cart, and follows the cart home. ‘Good evening, mother.’ ‘Good evening, Hans. Where have you been?’ ‘With Gretel.’ ‘What did you take her?’ ‘Took nothing; had something given me.’ ‘What did Gretel give you?’ ‘Gave me a needle.’ ‘Where is the needle, Hans?’ ‘Stuck in the hay-cart.’ ‘That was ill done, Hans. You should have stuck the needle in your sleeve.’ ‘Never mind, I’ll do better next time.’

‘Whither away, Hans?’ ‘To Gretel, mother.’ ‘Behave well, Hans.’ ‘Oh, I’ll behave well. Goodbye, mother.’ ‘Goodbye, Hans.’ Hans comes to G

## 3. Chunk your documents

In [7]:
splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=128)
chunks = splitter.split_documents(docs)
print(f"One single document chunk:\n'{chunks[5].page_content}'\n\nFrom the file:\n'{chunks[5].metadata['source']}'")

One single document chunk:
'if the master was not coming with his guest, but she saw no one, and went back to the fowls and thought: ‘One of the wings is burning! I had better take it off and eat it.’ So she cut it off, ate it, and enjoyed it, and when she had done, she thought: ‘The other must go down too, or else master will observe that something is missing.’ When the two wings were eaten, she went and looked for her master, and did not see him. It suddenly occurred to her: ‘Who knows? They are perhaps not coming at all, and have'

From the file:
'documents\clever_gretel.txt'
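
The effect of `chunk_size` and `chunk_overlap` is easiest to see on a toy string. The sketch below is a simplified fixed-width sliding window, not the algorithm `RecursiveCharacterTextSplitter` actually uses (which prefers to split on separators such as paragraph and line breaks). The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut `text` into windows of `chunk_size` characters, each sharing
    `chunk_overlap` characters with the window before it."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


toy = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_chunks(toy, chunk_size=10, chunk_overlap=4):
    print(chunk)
```

Each printed window starts with the last four characters of the previous one. The overlap serves the same purpose in the real splitter: text near a chunk boundary appears in both neighbouring chunks, so a passage cut at the boundary is less likely to be lost to retrieval.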


## 4. Embed and index chunks in a vector database
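Make sure to pull the embedding model you select in the snippet below before running it, for example `ollama pull mxbai-embed-large` in the terminal. If the model is missing, Ollama returns a 404 "model not found" error.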

In [8]:
# !ollama pull all-minilm
# !ollama pull nomic-embed-text
# !ollama pull mxbai-embed-large

In [9]:
# model = "all-minilm"         # 23M parameters
# model = "nomic-embed-text"   # 137M parameters
model = "mxbai-embed-large"   # 334M parameters

# Create a vector store to store the embeddings of our chunked documents
embed_model = OllamaEmbeddings(model=model, base_url="http://127.0.0.1:11434")
vector_store = Chroma.from_documents(chunks, embed_model)
collection = vector_store._collection

print("Total number of chunks / embeddings:", collection.count())

ResponseError: model "mxbai-embed-large" not found, try pulling it first (status code: 404)
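
The `ResponseError` above means the selected embedding model has not been pulled into the local Ollama instance. One way to catch this before building the vector store is to ask the Ollama server which models it has. The sketch below assumes the server's `/api/tags` endpoint returns JSON with a `models` list whose entries carry a `name` field; the helper names `fetch_local_models` and `model_is_available` are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_local_models(base_url: str = "http://127.0.0.1:11434") -> dict:
    """Ask the Ollama server for its locally available models (network call)."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


def model_is_available(tags_payload: dict, model: str) -> bool:
    """Check whether `model` appears in an /api/tags payload.

    A name without an explicit tag (e.g. "mistral") is compared as
    "mistral:latest", on the assumption that untagged pulls are listed that way.
    """
    wanted = model if ":" in model else f"{model}:latest"
    names = {entry.get("name", "") for entry in tags_payload.get("models", [])}
    return wanted in names


# Offline demonstration with a hand-written payload (not real server output).
example_payload = {"models": [{"name": "all-minilm:latest"}]}
print(model_is_available(example_payload, "all-minilm"))
print(model_is_available(example_payload, "mxbai-embed-large"))
```

With a running server, `model_is_available(fetch_local_models(), model)` can be checked before calling `Chroma.from_documents`, and the missing model pulled if the result is `False`.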

In [None]:
all_data = collection.get(include=["documents", "embeddings"])
print(f"One single document:\n{all_data['documents'][1]}\n")
print(f"The document's embedding:\n{all_data['embeddings'][1]}\n")


One single document:
It happened that the cat met the fox in a forest, and as she thought to herself: ‘He is clever and full of experience, and much esteemed in the world,’ she spoke to him in a friendly way. ‘Good day, dear Mr Fox, how are you? How is all with you? How are you getting on in these hard times?’ The fox, full of all kinds of arrogance, looked at the cat from head to foot, and for a long time did not know whether he would give any answer or not. At last he said: ‘Oh, you wretched beard-cleaner, you piebald fool,

The document's embedding:
[ 0.04361834  0.0033501   0.01679566 ...  0.01759302  0.00692107
 -0.00018101]
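
Each chunk is now represented by a vector like the one printed above, and retrieval amounts to finding the stored vectors closest to the query's vector. Cosine similarity is one common closeness measure; which metric Chroma applies depends on how the collection is configured, so treat the following as an illustration of the idea with made-up three-dimensional vectors rather than a description of Chroma's internals.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings"; real ones have hundreds of dimensions.
toy_index = {
    "chunk about a fox": [0.9, 0.1, 0.0],
    "chunk about a frog": [0.1, 0.9, 0.1],
    "chunk about a witch": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.1]

# Rank the stored chunks from most to least similar to the query.
ranked = sorted(
    toy_index,
    key=lambda name: cosine_similarity(toy_index[name], toy_query),
    reverse=True,
)
print(ranked)
```

The retriever's `k` parameter then corresponds to keeping only the first `k` entries of such a ranking.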



## 5. Create a Retrieval-QA chain
Make sure to pull the language model you select in the snippet below before running it, for example `ollama pull mistral` in the terminal.

In [None]:
# !ollama pull gemma3:1b
# !ollama pull llama3.2
# !ollama pull mistral

In [None]:
# model = "gemma3:1b"     # 1B parameters
# model = "llama3.2"    # 3B parameters
model = "mistral"     # 7B parameters

# Create a language model with an endpoint, and a RetrievalQA chain which takes the language model,
# and the vector store containing our documents.
llm = OllamaLLM(
    model=model,
    base_url="http://127.0.0.1:11434",
    temperature=0.9,
    top_p=0.95,
    top_k=46,
    # num_predict=256,
    )

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(search_kwargs={"k": 10}),  # k = number of chunks to retrieve per query
    chain_type="stuff",            # or "map_reduce", "refine", etc.
    return_source_documents=True,  # if you want the source chunks back
)
print(llm)

OllamaLLM
Params: {}
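
With `chain_type="stuff"`, the retrieved chunks are all placed ("stuffed") into a single prompt together with the question, and the model is called once. The sketch below illustrates that idea only; the prompt wording and the `build_stuff_prompt` helper are assumptions, not LangChain's actual template.

```python
def build_stuff_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join all retrieved chunks into one context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks standing in for what the retriever would return.
toy_chunks = ["first retrieved chunk", "second retrieved chunk"]
prompt = build_stuff_prompt("What is the fairy tale 'Fundevogel' about?", toy_chunks)
print(prompt)
```

With `k=10` and `chunk_size=512`, the context block can reach roughly 5,120 characters plus separators, which has to fit into the selected model's context window alongside the question.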


## 6. Ask questions!
Some examples:
### 6.1 "Who is Clever Hans in the Grimm Fairy Tales?"

In [None]:
# Without RAG
query = "Who is Clever Hans in the Grimm Fairy Tales?"
result = llm.invoke(query)
print(f"Answer: {result}")

In [None]:
# With RAG
result = qa_chain.invoke({"query": query})
print(f"Answer: {result['result']}\n")
print(f"Source: {result['source_documents']}")
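
Printing `result['source_documents']` dumps every retrieved `Document` in full, which gets long with `k=10`. A compact summary of which files the chunks came from can be easier to scan. The helper below is a sketch; `summarize_sources` is a hypothetical name, and the stand-in objects only mimic the `metadata` and `page_content` attributes of a LangChain `Document` so that the example runs without a model.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_sources(source_documents) -> list[tuple[str, int]]:
    """Count retrieved chunks per source file, most frequent first."""
    counts = Counter(doc.metadata.get("source", "unknown") for doc in source_documents)
    return counts.most_common()


# Stand-ins for retrieved Documents (illustrative values only).
fake_docs = [
    SimpleNamespace(metadata={"source": "documents/a.txt"}, page_content="..."),
    SimpleNamespace(metadata={"source": "documents/b.txt"}, page_content="..."),
    SimpleNamespace(metadata={"source": "documents/a.txt"}, page_content="..."),
]
for source, n in summarize_sources(fake_docs):
    print(f"{n} chunk(s) from {source}")
```

In the notebook, the call would be `summarize_sources(result['source_documents'])`. If most chunks come from a file unrelated to the question, that is a hint that retrieval, not the language model, is the weak link.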

### 6.2 "Cite what the frog says in “The Frog Prince” in the Grimm Fairy Tales when he tries to visit the princess"

In [None]:
# Without RAG
query = "Cite what the frog says in “The Frog Prince” in the Grimm Fairy Tales when he tries to visit the princess"
result = llm.invoke(query)
print(f"Answer: {result}")

In [None]:
# With RAG
result = qa_chain.invoke({"query": query})
print(f"Answer: {result['result']}\n")
print(f"Source: {result['source_documents']}")

### 6.3 "What is the fairy tale 'Fundevogel' about?"

In [None]:
# Without RAG
query = "What is the fairy tale 'Fundevogel' about?"
result = llm.invoke(query)
print(f"Answer: {result}")

Answer:  "Fundevogel" is a German folktale, also known as "Fledermans," which means "Bat-man." The story revolves around a miller and his two sons. The elder son was lazy and foolish while the younger one was smart and industrious.

The younger son was sent out to earn his living, and on his way, he encountered an old man who offered him a magical flute in exchange for whatever he got out of it afterward. The young man accepted, and the old man also gave him three golden seeds.

Upon reaching a foreign land, the young man found himself hungry, so he planted one of the golden seeds in the ground and sang into his magical flute. A huge goose popped out, which he cooked and ate. He then replanted another seed and repeated the process.

Next, he decided to use the third seed to find a wife, so he planted it and played on his flute. Out came a beautiful princess who asked him for three kisses as she woke up. The young man, not knowing it was a spell, obliged her on three consecutive days. O

In [None]:
# With RAG
result = qa_chain.invoke({"query": query})
print(f"Answer: {result['result']}\n")
print(f"Source: {result['source_documents']}")

Answer:  The fairy tale 'Fundevogel' is a German folktale about a forester who rescues a child from a tree in the forest that was carried away by a bird of prey. The forester raises this child, named Fundevogel, along with his own daughter Lina. The story continues to describe how they make promises to each other to remain together, transforming into various objects such as a church and a chandelier, a rose-tree and a rose, and finally a fishpond and a duck. Eventually, the witch who had enchanted the prince in another tale is defeated by the duck (Fundevogel) and they live happily together, possibly for eternity. The story is also intertwined with another tale, 'The Frog-Prince', where the prince is transformed into a frog, but later turned back into a prince by Lina.

Source: [Document(metadata={'source': 'documents/fundevogel.txt'}, page_content='The forester climbed up, brought the child down, and thought to himself: ‘You will take him home with you, and bring him up with your Lina