<div style="background-color: #ffffff; color: #000000; padding: 10px;">
<div style="display: flex; justify-content: space-between; align-items: center; background-color: #ffffff; color: #000000; padding: 10px;">
    <img src="../images/logo_kisz.png" height="80" style="margin-right: auto;" alt="Logo of the AI Service Center Berlin-Brandenburg.">
    <img src="../images/logo_bmbf.jpeg" height="150" style="margin-left: auto;" alt="Logo of the German Federal Ministry of Education and Research: Gefördert vom Bundesministerium für Bildung und Forschung.">
</div>
<h1>Efficient Information Retrieval from Documents</h1>
<h2>Local Retrieval Augmented Generation System</h2>
</div>

<div style="background-color: #f6a800; color: #ffffff; padding: 10px;">
    <h2>Part 1 - Workflow</h2>
</div>

<div style="background-color: #dd6108; color: #ffffff; padding: 10px;">
<h3>1. Basics - Handcrafted RAG</h3>
</div>

In [1]:
# Load Embedding model
# https://www.sbert.net/
# pip3 install sentence-transformers

from sentence_transformers import SentenceTransformer

EMBEDDING_MODEL = "all-MiniLM-L6-v2"
model = SentenceTransformer(EMBEDDING_MODEL)




In [2]:
# Sample Text

texts = [
    "Bali's beautiful beaches and rich culture stands out as a fantastic travel destination.",
    "Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens.",
    "Graphics processing units (GPU) have become an essential foundation for artificial intelligence.",
    "Newton's laws of motion transformed our understanding of physics.",
    "The French Revolution played a crucial role in shaping contemporary France.",
    "Maintaining good health requires regular exercise, balanced diet and quality sleep.",
    "Dali's surrealistic artworks, like 'The Persistence of Memory,' captivate audiences with their dreamlike imagery and imaginative brilliance",
    "Global warming threatens the planet's ecosystems and wildlife.",
    "The KI-Servicezentrum Berlin-Brandenburg offers services such as consulting, workshops, MOOCs and computer resources.",
    "Django Reinhardt's jazz compositions are celebrated for their captivating melodies and innovative guitar work..",
]


In [3]:
# Encode texts

text_embeddings = model.encode(texts)

print("Embeddings type:", type(text_embeddings))
print("Embeddings Matrix shape:", text_embeddings.shape)

# Note: "all-MiniLM-L6-v2" encodes texts of up to 256 word pieces (tokens, not words). Longer input is truncated.
# https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2


Embeddings type: <class 'numpy.ndarray'>
Embeddings Matrix shape: (10, 384)
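
Because the model truncates long inputs, longer documents are usually split into chunks before they are encoded. The following is a minimal sketch, assuming plain whitespace splitting. The names `chunk_words`, `max_words` and `overlap` are hypothetical and not part of sentence-transformers. Word counts only approximate the model's word-piece limit, so a conservative `max_words` is advisable.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into overlapping chunks of at most max_words words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap  # each new chunk starts this many words later
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Synthetic 450-word text, used only to illustrate the splitting
long_text = " ".join(f"word{i}" for i in range(450))
chunks = chunk_words(long_text, max_words=200, overlap=20)
print(len(chunks), [len(c.split()) for c in chunks])
```

Each chunk could then be passed to `model.encode` like the sample texts above. The overlap keeps sentences that straddle a chunk boundary retrievable from either side.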


 #### &bull; **Check Similarity**

In [4]:
# Storing Embeddings in a simple dict

text_embs_dict = dict(zip(texts, list(text_embeddings)))
# key: text, value: numpy array with embedding


In [5]:
# Define similarity metric

from numpy import dot
from numpy.linalg import norm


def cos_sim(x, y):
    return dot(x, y) / (norm(x) * norm(y))



In [6]:
# Check similarities

test_text = "I really need some vacations"
emb_test_text = model.encode(test_text)

print(f"\nCosine similarities for: '{test_text}'\n")
for k, v in text_embs_dict.items():
    print(k, round(cos_sim(emb_test_text, v), 3))



Cosine similarities for: 'I really need some vacations'

Bali's beautiful beaches and rich culture stands out as a fantastic travel destination. 0.302
Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens. 0.161
Graphics processing units (GPU) have become an essential foundation for artificial intelligence. -0.089
Newton's laws of motion transformed our understanding of physics. 0.022
The French Revolution played a crucial role in shaping contemporary France. 0.012
Maintaining good health requires regular exercise, balanced diet and quality sleep. 0.176
Dali's surrealistic artworks, like 'The Persistence of Memory,' captivate audiences with their dreamlike imagery and imaginative brilliance 0.022
Global warming threatens the planet's ecosystems and wildlife. 0.105
The KI-Servicezentrum Berlin-Brandenburg offers services such as consulting, workshops, MOOCs and computer resources. 0.105
Django Reinhardt's jazz compositions are celebrated for their captivati
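
The retrieval step of a handcrafted RAG system amounts to sorting the stored texts by similarity and keeping the best few. The sketch below illustrates this with toy 3-dimensional vectors in place of real embeddings and uses only the standard library. `top_k`, `toy_embs` and `toy_query` are hypothetical names introduced for illustration.

```python
from math import sqrt


def cos_sim(x, y):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))


def top_k(query_emb, embs_dict, k=2):
    """Return the k (text, score) pairs most similar to query_emb."""
    scored = [(text, cos_sim(query_emb, emb)) for text, emb in embs_dict.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 3-dimensional vectors standing in for real 384-dimensional embeddings
toy_embs = {
    "beach holiday": [0.9, 0.1, 0.0],
    "pizza recipe": [0.1, 0.9, 0.0],
    "GPU computing": [0.0, 0.1, 0.9],
}
toy_query = [0.8, 0.3, 0.0]

for text, score in top_k(toy_query, toy_embs, k=2):
    print(text, round(score, 3))
```

In the notebook, the same ranking could be applied to `text_embs_dict` and `emb_test_text`, using the NumPy-based `cos_sim` defined earlier.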

 **To do:**



 - Experiment with different sentences and compare similarities.



 - Try alternative Sentence Transformers models. Read their model cards and compare the results.

<div style="background-color: #dd6108; color: #ffffff; padding: 10px;">
<h3>2. Basics - A simple RAG tool</h3>
</div>

In [7]:

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.Client()
# client = chromadb.PersistentClient(path="chroma_data/")
# For a ChromaDB instance on the disk

embedding_func = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name=EMBEDDING_MODEL
)


In [8]:
# Init collection

COLLECTION_NAME = "demo_docs"

collection = client.create_collection(
    name=COLLECTION_NAME,
    embedding_function=embedding_func,
    metadata={"hnsw:space": "cosine"},
)


In [9]:
# Add documents to the collection
# (Metadata is optional)

topics = [
    "travel",
    "food",
    "technology",
    "science",
    "history",
    "health",
    "painting",
    "climate change",
    "business",
    "music",
]

collection.add(
    documents=texts,
    ids=[f"id{i}" for i in range(len(texts))],
    metadatas=[{"topic": topic} for topic in topics],
)


In [10]:
# Querying the database

query1 = "I am looking for something to eat"
query2 = "What's going on in the world?"
query3 = "Great jazz player"

queries = [query1, query2, query3]

query_results = collection.query(
    query_texts=queries,  # list of strings or just one element (string)
    n_results=2,
)
print("Query dict keys:")
print(query_results.keys())

for i in range(len(queries)):
    print("\nQuery:", queries[i])

    print("\nResults:")
    for j in range(len(query_results["ids"][i])):
        print("id:", query_results["ids"][i][j])
        print("Text:", query_results["documents"][i][j])
        print("Distance:", round(query_results["distances"][i][j], 2))
        print("Metadata:", query_results["metadatas"][i][j])

    print(80 * "-")


Query dict keys:
dict_keys(['ids', 'embeddings', 'documents', 'uris', 'included', 'data', 'metadatas', 'distances'])

Query: I am looking for something to eat

Results:
id: id1
Text: Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens.
Distance: 0.72
Metadata: {'topic': 'food'}
id: id5
Text: Maintaining good health requires regular exercise, balanced diet and quality sleep.
Distance: 0.85
Metadata: {'topic': 'health'}
--------------------------------------------------------------------------------

Query: What's going on in the world?

Results:
id: id7
Text: Global warming threatens the planet's ecosystems and wildlife.
Distance: 0.74
Metadata: {'topic': 'climate change'}
id: id3
Text: Newton's laws of motion transformed our understanding of physics.
Distance: 0.86
Metadata: {'topic': 'science'}
--------------------------------------------------------------------------------

Query: Great jazz player

Results:
id: id9
Text: Django Reinhardt's jazz composi
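
With `"hnsw:space": "cosine"`, the reported distance is the cosine distance, that is 1 minus the cosine similarity, so smaller values mean closer matches. Since a query always returns `n_results` hits, even weak ones, it can help to drop hits above a distance threshold. The sketch below assumes a dictionary shaped like `query_results`. `filter_hits` and `max_distance` are hypothetical names, the threshold of 0.8 is arbitrary, and the sample values are copied from the first query above.

```python
def filter_hits(results, max_distance=0.8):
    """Keep, per query, only the hits whose distance is at most max_distance."""
    filtered = []
    for ids, docs, dists in zip(
        results["ids"], results["documents"], results["distances"]
    ):
        hits = [
            {"id": i, "text": d, "similarity": round(1 - dist, 2)}
            for i, d, dist in zip(ids, docs, dists)
            if dist <= max_distance
        ]
        filtered.append(hits)
    return filtered


# Hand-written stand-in shaped like `query_results` (first query from above)
sample_results = {
    "ids": [["id1", "id5"]],
    "documents": [[
        "Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens.",
        "Maintaining good health requires regular exercise, balanced diet and quality sleep.",
    ]],
    "distances": [[0.72, 0.85]],
}

for hits in filter_hits(sample_results, max_distance=0.8):
    print([(h["id"], h["similarity"]) for h in hits])
```

A suitable threshold depends on the embedding model and the data, so it is best chosen by inspecting distances for a few known good and bad matches.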

 #### &bull; **Running a Large Language Model locally with Ollama**

In [11]:

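import requests
from os import getenv
from urllib.parse import urljoin

# Requires a running Ollama server: $ ollama serve
BASEURL = urljoin(getenv("OLLAMA_HOST", "http://localhost:11434"), "api")
MODEL = "llama3.2"


def generate(prompt, context=None, top_k=5, top_p=0.9, temp=0.5):
    """Send a prompt to Ollama and return (response text, updated context)."""
    context = context or []  # avoid a shared mutable default argument
    url = BASEURL + "/generate"
    data = {
        "prompt": prompt,
        "model": MODEL,
        "stream": False,
        "context": context,
        "options": {"temperature": temp, "top_p": top_p, "top_k": top_k},
    }

    try:
        r = requests.post(url, json=data)
        r.raise_for_status()  # surface HTTP errors such as an unknown model
        response_dic = r.json()
        return response_dic.get("response", ""), response_dic.get("context", [])

    except Exception as e:
        # Always return a (text, context) tuple so callers can unpack the result
        print(e)
        return "", context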



In [12]:

llm_response, _ = generate("Hi, who are you", top_k=10, top_p=0.9, temp=0.5)
print(llm_response)


I'm an artificial intelligence model designed to assist and communicate with users like you. I don't have a personal identity or emotions, but I'm here to provide information, answer questions, and engage in conversation to the best of my abilities.

My purpose is to help users find the information they need, solve problems, and explore topics of interest. I'm constantly learning and improving my knowledge base, so please bear with me if I make any mistakes or don't have an answer to a specific question.

How can I assist you today?


 #### &bull; **Make a simple local LLM chatbot**

In [None]:
user_input = "Hi. who are you?"
ollama_context = []
print(f"Start chatting with {MODEL} model (Press q to quit)\n")
while user_input != "q":
    bot_response, ollama_context = generate(
        user_input, context=ollama_context, top_k=10, top_p=0.9, temp=0.5
    )
    print("Model message:")
    print(bot_response)
    user_input = input("\nYour prompt: ")


Start chatting with llama3.2 model (Press q to quit)



 **To do:**



 - Experiment with different input parameters, arguments, prompts and compare results.

 - Try different models (e.g. mistral, llama3.1, gemma2, qwen2.5, llama3.2, qwen2.5:3b, etc.)



 -----------------------------------------------------------------------------

 ## Integration

 #### **Integrate the components into a RAG system.**
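
One possible way to connect retrieval and generation is to place the retrieved documents into the prompt ahead of the question. The following is a minimal sketch, not a definitive implementation. `build_rag_prompt` and the prompt wording are assumptions made for illustration, and the retrieved texts are hard-coded here so the example runs without ChromaDB or Ollama.

```python
def build_rag_prompt(question, documents):
    """Combine retrieved documents and a question into one prompt string."""
    context_block = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(documents))
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say that you do not know.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hard-coded stand-in for the documents returned by the vector database
retrieved = [
    "Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens.",
    "Maintaining good health requires regular exercise, balanced diet and quality sleep.",
]
prompt = build_rag_prompt("Where can I get good pizza?", retrieved)
print(prompt)
```

In the notebook, `retrieved` could come from `collection.query(query_texts=[question], n_results=2)["documents"][0]`, and the resulting prompt could be passed to `generate(prompt)`.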

 ### References:

 #### &bull; **Text embedding: Sentence-BERT**

 https://www.sbert.net/



 Sample Sentence Transformers:



 https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2



 https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1



 https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-cos-v1



 Read the model cards and compare the models.

 #### &bull; **Vector Database: ChromaDB**

 https://docs.trychroma.com/

 #### &bull; **Local LLM: Ollama**

 https://ollama.ai/

 #### &bull; **Frontend: Gradio**

 https://www.gradio.app/