<div style="background-color: #ffffff; color: #000000; padding: 10px;">
<div style="display: flex; justify-content: space-between; align-items: center; background-color: #ffffff; color: #000000; padding: 10px;">
    <img src="../images/logo_kisz.png" height="80" style="margin-right: auto;" alt="Logo of the AI Service Center Berlin-Brandenburg.">
    <img src="../images/logo_bmbf.jpeg" height="150" style="margin-left: auto;" alt="Logo of the German Federal Ministry of Education and Research: Gefördert vom Bundesministerium für Bildung und Forschung.">
</div>
<h1> Efficient Information Retrieval from Documents</h1>
<h2> Local Retrieval Augmented Generation System</h2>
</div>

<div style="background-color: #f6a800; color: #ffffff; padding: 10px;">
    <h2> Part 1 - Workflow</h2>
</div>

<div style="background-color: #dd6108; color: #ffffff; padding: 10px;">
<h3>1. Basics - Handcrafted RAG</h3>
</div>

In [1]:
# Load Embedding model
# https://www.sbert.net/
# pip3 install sentence-transformers

from sentence_transformers import SentenceTransformer

EMBEDDING_MODEL = "all-MiniLM-L6-v2"
model = SentenceTransformer(EMBEDDING_MODEL)

In [10]:
# Sample Text

texts = [
    "Bali's beautiful beaches and rich culture stands out as a fantastic travel destination.",
    "Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens.",
    "Graphics processing units (GPU) have become an essential foundation for artificial intelligence.",
    "Newton's laws of motion transformed our understanding of physics.",
    "The French Revolution played a crucial role in shaping contemporary France.",
    "Maintaining good health requires regular exercise, balanced diet and quality sleep.",
    "Dali's surrealistic artworks, like 'The Persistence of Memory,' captivate audiences with their dreamlike imagery and imaginative brilliance",
    "Global warming threatens the planet's ecosystems and wildlife.",
    "The KI-Servicezentrum Berlin-Brandenburg offers services such as consulting, workshops, MOOCs and computer resources.",
    "Django Reinhardt's jazz compositions are celebrated for their captivating melodies and innovative guitar work..",
]

In [11]:
# Encode texts

text_embeddings = model.encode(texts)

print("Embeddings type:", type(text_embeddings))
print("Embeddings Matrix shape:", text_embeddings.shape)

# Note: "all-MiniLM-L6-v2" encodes texts of up to 256 word pieces (tokens), not words. Longer input is truncated.
# https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

Embeddings type: <class 'numpy.ndarray'>
Embeddings Matrix shape: (10, 384)
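
Because the model truncates long input, longer documents are usually split into chunks before encoding. The following is a minimal sketch of that idea, assuming a plain whitespace word count as a rough stand-in for word pieces. The helper `chunk_words` and its parameters `max_words` and `overlap` are hypothetical names for illustration, not part of sentence-transformers.

```python
def chunk_words(text, max_words=150, overlap=30):
    """Split text into overlapping windows of at most max_words words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Synthetic 400-word text, used only to show the window sizes.
long_text = " ".join(f"word{i}" for i in range(400))
chunks = chunk_words(long_text)
print(len(chunks), [len(c.split()) for c in chunks])
# 4 [150, 150, 150, 40]
```

Each chunk could then be passed to `model.encode(chunks)`. Since one word often maps to several word pieces, the word limit is deliberately kept well below 256.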


 #### &bull; **Check Similarity**

In [12]:
# Storing Embeddings in a simple dict

text_embs_dict = dict(zip(texts, list(text_embeddings)))
# key: text, value: numpy array with embedding

In [13]:
# Define similarity metric

from numpy import dot
from numpy.linalg import norm

def cos_sim(x, y):
    return dot(x, y) / (norm(x) * norm(y))

In [14]:
# Check similarities

test_text = "I really need some vacations"
emb_test_text = model.encode(test_text)

print(f"\nCosine similarities for: '{test_text}'\n")
for k, v in text_embs_dict.items():
    print(k, round(cos_sim(emb_test_text, v), 3))


Cosine similarities for: 'I really need some vacations'

Bali's beautiful beaches and rich culture stands out as a fantastic travel destination. 0.302
Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens. 0.161
Graphics processing units (GPU) have become an essential foundation for artificial intelligence. -0.089
Newton's laws of motion transformed our understanding of physics. 0.022
The French Revolution played a crucial role in shaping contemporary France. 0.012
Maintaining good health requires regular exercise, balanced diet and quality sleep. 0.176
Dali's surrealistic artworks, like 'The Persistence of Memory,' captivate audiences with their dreamlike imagery and imaginative brilliance 0.022
Global warming threatens the planet's ecosystems and wildlife. 0.105
The KI-Servicezentrum Berlin-Brandenburg offers services such as consulting, workshops, MOOCs and computer resources. 0.105
Django Reinhardt's jazz compositions are celebrated for their captivati
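
The loop above prints a score for every text, while a retriever usually keeps only the best matches. The sketch below ranks candidates by cosine similarity and returns the top `k`. It uses toy 3-dimensional vectors instead of real embeddings and a pure-Python `cos_sim`, so it runs without the model. `top_k_similar` is a hypothetical helper name.

```python
import math


def cos_sim(x, y):
    """Cosine similarity of two equal-length vectors (pure-Python version)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def top_k_similar(query_vec, embs_dict, k=2):
    """Return the k (text, score) pairs with the highest cosine similarity."""
    scored = [(text, cos_sim(query_vec, vec)) for text, vec in embs_dict.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real 384-dimensional embeddings (assumption).
toy_embs = {
    "beach holiday": [0.9, 0.1, 0.0],
    "pizza recipe": [0.1, 0.9, 0.0],
    "gpu computing": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]

for text, score in top_k_similar(query, toy_embs, k=2):
    print(text, round(score, 3))
```

With the notebook's own data, the same function could be called as `top_k_similar(emb_test_text, text_embs_dict)`.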

<div style="background-color: #dd6108; color: #ffffff; padding: 10px;">
<h3>2. Basics - A simple RAG tool</h3>
</div>

In [15]:
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.Client()
# client = chromadb.PersistentClient(path="chroma_data/")
# For a ChromaDB instance on the disk

embedding_func = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name=EMBEDDING_MODEL
)

In [16]:
# Init collection
# get_or_create_collection avoids a UniqueConstraintError when this cell is re-run

COLLECTION_NAME = "demo_docs"

collection = client.get_or_create_collection(
    name=COLLECTION_NAME,
    embedding_function=embedding_func,
    metadata={"hnsw:space": "cosine"},
)

In [17]:
# Create collection
# (Adding metadata. Optional)

topics = [
    "travel",
    "food",
    "technology",
    "science",
    "history",
    "health",
    "painting",
    "climate change",
    "business",
    "music",
]

collection.add(
    documents=texts,
    ids=[f"id{i}" for i in range(len(texts))],
    metadatas=[{"topic": topic} for topic in topics],
)

In [18]:
# Querying the database

query1 = "I am looking for something to eat"
query2 = "What's going on in the world?"
query3 = "Great jazz player"

queries = [query1, query2, query3]

query_results = collection.query(
    query_texts=queries,  # list of strings or just one element (string)
    n_results=2,
)
print("Query dict keys:")
print(query_results.keys())

for i in range(len(queries)):
    print("\nQuery:", queries[i])

    print("\nResults:")
    for j in range(len(query_results["ids"][i])):
        print("id:", query_results["ids"][i][j])
        print("Text:", query_results["documents"][i][j])
        print("Distance:", round(query_results["distances"][i][j], 2))
        print("Metadata:", query_results["metadatas"][i][j])

    print(80 * "-")

Query dict keys:
dict_keys(['ids', 'distances', 'metadatas', 'embeddings', 'documents', 'uris', 'data', 'included'])

Query: I am looking for something to eat

Results:
id: id1
Text: Pizza in Rome is famous for its thin crust, fresh ingredients and wood-fired ovens.
Distance: 0.72
Metadata: {'topic': 'food'}
id: id5
Text: Maintaining good health requires regular exercise, balanced diet and quality sleep.
Distance: 0.85
Metadata: {'topic': 'health'}
--------------------------------------------------------------------------------

Query: What's going on in the world?

Results:
id: id7
Text: Global warming threatens the planet's ecosystems and wildlife.
Distance: 0.74
Metadata: {'topic': 'climate change'}
id: id3
Text: Newton's laws of motion transformed our understanding of physics.
Distance: 0.86
Metadata: {'topic': 'science'}
--------------------------------------------------------------------------------

Query: Great jazz player

Results:
id: id9
Text: Django Reinhardt's jazz composi
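
Since the collection is configured with `"hnsw:space": "cosine"`, the reported `Distance` is a cosine distance, that is, 1 minus the cosine similarity, so smaller values mean closer matches. The sketch below converts the nested distance lists of a query result back into similarities. The `sample_results` dict is hand-written from the rounded values printed above, and `distances_to_similarities` is a hypothetical helper name.

```python
def distances_to_similarities(query_results):
    """Convert nested cosine distances (one list per query) to similarities."""
    return [
        [1.0 - dist for dist in per_query]
        for per_query in query_results["distances"]
    ]


# Hand-written sample with the same nesting as collection.query() output:
# one inner list per query, one entry per returned document.
sample_results = {
    "ids": [["id1", "id5"], ["id7", "id3"]],
    "distances": [[0.72, 0.85], [0.74, 0.86]],
}

sims = distances_to_similarities(sample_results)
for ids, row in zip(sample_results["ids"], sims):
    print([(doc_id, round(sim, 2)) for doc_id, sim in zip(ids, row)])
```

Read this way, the best match for the food query has a similarity of only about 0.28, which is a reminder that short queries and short documents rarely score close to 1.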

 #### &bull; **Running a Large Language Model locally with Ollama**

In [19]:
import requests
from os import getenv
from urllib.parse import urljoin

# $ ollama serve
BASEURL = urljoin(getenv("OLLAMA_HOST", "http://localhost:11434"), "api")
MODEL = "llama3.2"


def generate(prompt, context=None, top_k=5, top_p=0.9, temp=0.5):
    """Send a prompt to Ollama and return (response text, context tokens)."""
    url = BASEURL + "/generate"
    data = {
        "prompt": prompt,
        "model": MODEL,
        "stream": False,
        # Avoid a mutable default argument: fall back to an empty context
        "context": context or [],
        "options": {"temperature": temp, "top_p": top_p, "top_k": top_k},
    }

    try:
        r = requests.post(url, json=data)
        r.raise_for_status()
        response_dic = r.json()
        return response_dic.get("response", ""), response_dic.get("context", [])

    except (requests.RequestException, ValueError) as e:
        # Always return a pair so that callers can unpack the result
        print(e)
        return "", []
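
The `generate` function sets `"stream": False`, so Ollama answers with a single JSON object. With `"stream": True`, the API instead sends one JSON object per line, each carrying a text fragment in `"response"`, and the last one marked with `"done": true`. The sketch below joins such lines into one answer. The sample lines are hand-written for illustration, not captured from a server, and `join_stream_lines` is a hypothetical helper name.

```python
import json


def join_stream_lines(lines):
    """Concatenate the 'response' fragments of newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip empty lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the answer
    return "".join(parts)


# Hand-written sample lines (assumption), shaped like streamed chunks.
sample_lines = [
    '{"response": "Hello", "done": false}',
    '{"response": ", world", "done": false}',
    '{"response": "!", "done": true}',
]
print(join_stream_lines(sample_lines))
# Hello, world!
```

In a live setting, the lines could come from `requests.post(url, json=data, stream=True).iter_lines()`, which would allow printing fragments as they arrive instead of waiting for the full answer.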

In [None]:
llm_response, _ = generate("What's going on in the world?",  
                           top_k=10, top_p=0.9, temp=0.5)
print(llm_response)

There are many things happening in the world, and I'll try to give you a brief overview. Keep in mind that my knowledge cutoff is December 2023, so I might not have information on very recent events.

**Global News:**

1. **Climate Change:** The Intergovernmental Panel on Climate Change (IPCC) has warned that the world has only about a decade to take drastic action to limit global warming to 1.5°C above pre-industrial levels.
2. **Conflict and Refugees:** Tensions remain high in Ukraine, with ongoing fighting between Ukrainian forces and Russian-backed separatists. The refugee crisis continues to affect many countries, including Syria, Yemen, and Afghanistan.
3. **Economy and Trade:** Global trade tensions have eased somewhat, but the impact of the COVID-19 pandemic is still being felt. The US-China trade war has been a significant factor in this.

**Science and Technology:**

1. **Space Exploration:** NASA's Artemis program aims to return humans to the Moon by 2025 and establish a sus

In [24]:
def rag_generate(query, n_results=3, top_k=10, top_p=0.9, temp=0.5):
    """
    RAG function: Retrieve relevant context and generate response
    """
    # Step 1: Retrieve relevant documents
    search_results = collection.query(
        query_texts=[query],
        n_results=n_results
    )
    
    # Step 2: Extract the relevant text passages
    relevant_docs = search_results['documents'][0]
    
    # Step 3: Create context from retrieved documents
    context_text = "\n\n".join([f"Context {i+1}: {doc}" for i, doc in enumerate(relevant_docs)])
    
    # Step 4: Create enhanced prompt with context
    enhanced_prompt = f"""Based on the following context information, please answer the question.

Context:
{context_text}

Question: {query}

Answer:"""
    
    # Step 5: Generate response using the enhanced prompt
    response, ollama_context = generate(enhanced_prompt, top_k=top_k, top_p=top_p, temp=temp)
    
    return response, relevant_docs

# Test the RAG system
user_query = "What's going on in the world?"
rag_response, sources = rag_generate(user_query, n_results=2)

print(f"Query: {user_query}")
print("\nRelevant Sources:")
for i, source in enumerate(sources, 1):
    print(f"{i}. {source}")

print("\nRAG Response:")
print(rag_response)

Query: What's going on in the world?

Relevant Sources:
1. Global warming threatens the planet's ecosystems and wildlife.
2. Newton's laws of motion transformed our understanding of physics.

RAG Response:
It appears that there are two different topics being discussed here. 

The first context mentions global warming, which is a pressing environmental issue affecting the planet's ecosystems and wildlife. This suggests that something concerning or alarming is happening in the world related to climate change.

The second context refers to Newton's laws of motion, which revolutionized our understanding of physics. This implies that there are scientific developments or advancements happening in the world related to physics and science.

Given these two contexts, it seems like the answer would be that both global warming (Context 1) and scientific progress (specifically, advances in physics, as hinted at by Newton's laws of motion, Context 2) are happening in the world.
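
In the run above, the second retrieved passage (Newton's laws) is only loosely related to the question, yet it still shaped the answer. One possible mitigation is to drop passages whose distance exceeds a cutoff before building the prompt. The sketch below shows this under stated assumptions: `build_prompt` is a hypothetical helper, and `max_distance=0.8` is an illustrative value, not a tuned one.

```python
def build_prompt(query, docs, distances, max_distance=0.8):
    """Build a RAG prompt from passages whose distance is within max_distance."""
    kept = [doc for doc, dist in zip(docs, distances) if dist <= max_distance]
    if not kept:
        # No passage is close enough: fall back to the plain question.
        return query, kept
    context_text = "\n\n".join(
        f"Context {i}: {doc}" for i, doc in enumerate(kept, 1)
    )
    prompt = (
        "Based on the following context information, please answer the question.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Question: {query}\n\n"
        "Answer:"
    )
    return prompt, kept


# Passages and rounded distances taken from the query output shown earlier.
docs = [
    "Global warming threatens the planet's ecosystems and wildlife.",
    "Newton's laws of motion transformed our understanding of physics.",
]
prompt, kept = build_prompt("What's going on in the world?", docs, [0.74, 0.86])
print(prompt)
```

A suitable cutoff depends on the embedding model and the documents, so it is worth checking the distances of a few known good and bad matches before fixing a value.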


 -----------------------------------------------------------------------------

 ## Integration

 #### **Integrate the components for a RAG System.**
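
One way to approach the integration is a small function that receives the retriever and the generator as arguments, so that the vector database, the LLM and the front end stay interchangeable. The sketch below uses stand-in functions so that it runs without ChromaDB or Ollama. `make_rag_answerer`, `fake_retrieve` and `fake_generate` are hypothetical names. In the notebook, the retriever would wrap `collection.query` and the generator would wrap `generate`.

```python
def make_rag_answerer(retrieve, generate_text, n_results=2):
    """Return a function that answers a query using retrieved context."""

    def answer(query):
        docs = retrieve(query, n_results)
        context = "\n\n".join(f"Context {i}: {d}" for i, d in enumerate(docs, 1))
        prompt = f"Context:\n{context}\n\nQuestion: {query}\n\nAnswer:"
        return generate_text(prompt)

    return answer


# Stand-ins (assumptions) so the sketch runs without ChromaDB or Ollama.
def fake_retrieve(query, n_results):
    corpus = ["Pizza in Rome is famous for its thin crust.", "GPUs power modern AI."]
    return corpus[:n_results]


def fake_generate(prompt):
    return f"(stub answer based on {prompt.count('Context ')} passages)"


answer = make_rag_answerer(fake_retrieve, fake_generate)
print(answer("I am looking for something to eat"))
# (stub answer based on 2 passages)
```

The returned `answer` function takes one string and returns one string, so it could be passed as `fn` to `gr.Interface(fn=answer, inputs="text", outputs="text")` for a simple Gradio front end.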

 ### References:

 #### &bull; **Text embedding: Sentence-BERT**

 https://www.sbert.net/

 Sample Sentence Transformers models to check and compare:

 https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

 https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1

 https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-cos-v1

 #### &bull; **Vector Database: ChromaDB**

 https://docs.trychroma.com/

 #### &bull; **Local LLM: Ollama**

 https://ollama.ai/

 #### &bull; **Frontend: Gradio**

 https://www.gradio.app/