In [6]:
from dotenv import load_dotenv
import os

if load_dotenv("../.env"):
    GROQ_API_KEY = os.getenv("GROQ_API_KEY")
    HF_API_TOKEN = os.getenv("HF_API_TOKEN")

# Retrieval-Augmented Generation (RAG)

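RAG is a technique that pairs a generative language model with a retrieval step: passages relevant to the query are fetched from an external knowledge source and added to the prompt, so the model can ground its answer in them instead of relying only on what it learned during training.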

## 1st step: **Indexing**
1. Load: First we need to load our data. This is done with DocumentLoaders.
2. Split: Text splitters break large Documents into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won’t fit in a model’s finite context window.
3. Store: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a VectorStore and Embeddings model.
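
The three steps above can be sketched without any library. This is only an illustration: `load_text`, `split_text` and `store_chunks` are hypothetical helpers, the "loader" returns a hard-coded string, and the "store" is a plain dictionary rather than a real vector store.

```python
import hashlib
from typing import Dict, List


def load_text() -> str:
    # Stand-in for a DocumentLoader: returns raw text from a hard-coded source.
    return "RAG retrieves relevant text before generating an answer. " * 5


def split_text(text: str, chunk_size: int = 80, overlap: int = 20) -> List[str]:
    # Fixed-size character windows. Each window starts `chunk_size - overlap`
    # characters after the previous one, so neighbours share `overlap` characters.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def store_chunks(chunks: List[str]) -> Dict[str, str]:
    # Stand-in for a vector store: maps a content hash to the chunk text.
    return {hashlib.sha256(c.encode()).hexdigest(): c for c in chunks}


index = store_chunks(split_text(load_text()))
print(len(index), "chunks indexed")
```

A real pipeline would store an embedding vector next to each chunk; the content hash used as the key here only makes repeated indexing of the same text idempotent.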

![Rag Architecture](./Resources/RAG_image1.png)

## 2nd Step: **Query**

![Rag Query](./Resources/RAG_image2.png)

## Traditional Generation vs RAG

**Traditional Generation**

* Generates text from the prompt alone, with no access to external documents
* Relies only on knowledge stored in the model's parameters at training time
* Can be effective for generating coherent and fluent text, but may produce outdated, irrelevant or inaccurate results

**Retrieval-Augmented Generation (RAG)**

* Augments traditional generation with a retrieval step
* Retrieves relevant pieces of text from a large database
* Adds the retrieved text to the prompt so the model can ground its answer in it
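
As a rough sketch of the retrieve-then-generate idea, the snippet below ranks passages by word overlap and places the best ones ahead of the question. The `retrieve` and `build_prompt` helpers and the tiny corpus are assumptions made for illustration; a real system would rank by embedding similarity and send the prompt to an LLM.

```python
from typing import List


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    # Toy retriever: rank passages by how many query words they share.
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: List[str]) -> str:
    # Augmentation step: retrieved text is placed ahead of the question.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


corpus = [
    "RDI joined Capgemini group in 2017",
    "Ollama serves local language models",
    "Chroma is a vector store",
]
question = "When did RDI join Capgemini"
print(build_prompt(question, retrieve(question, corpus, k=1)))
```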

## Benefits of RAG

* Benefits from the coherence and fluency of traditional generation-based models
* Leverages the accuracy and relevance of retrieval-based approaches
* Generates more accurate and informative text by incorporating external knowledge

## Applications of RAG

* Text summarization
* Question answering
* Conversation generation
* Chatbots
* Content creation
* Language translation

## Define the Functions you are going to use

Note:
* How to run Ollama locally with Docker:
    - `sudo docker run --rm -it -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama`
    - Once Ollama is running, pull a second model used only for text embeddings (it is only about 256 MB):
        - `docker exec -it ollama ollama pull nomic-embed-text`
    - Check that the Llama 3 and nomic-embed-text models are installed:
        - `docker exec -it ollama ollama list`
* If you want to run Ollama on Linux:
    - Go to https://ollama.com and download the installer for Linux
    - The installer will install Ollama as a service
        - Check that the service is running with: `sudo service ollama status`
* Make sure Ollama is running with the Llama 3 model <font color="Red"><b>before</b></font> executing the next cell
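
One way to check the server from Python before running the next cell is to ask it which models it has. This is a minimal sketch that assumes Ollama's REST endpoint `/api/tags` on the default port used above; `list_ollama_models` is a hypothetical helper, not part of the `ollama` package.

```python
import json
import urllib.error
import urllib.request
from typing import List


def list_ollama_models(base_url: str = "http://127.0.0.1:11434", timeout: float = 2.0) -> List[str]:
    # Returns the names of locally available models, or an empty list when the
    # server cannot be reached or answers with something unexpected.
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        return []
    if not isinstance(payload, dict):
        return []
    return [m.get("name", "") for m in payload.get("models", [])]


models = list_ollama_models()
print("Models found:", models)
```

An empty list means either that the server is not reachable or that no model has been pulled yet.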
    

In [7]:
import ollama
import pprint
from ollama import Client
from langchain_community.embeddings.ollama import OllamaEmbeddings
from typing import List

def ollama_local_embed(text:str, model="nomic-embed-text"):
    #be sure to have the nomic-embed-text model downloaded on your ollama server
    return ollama.embeddings(
        model=model, 
        prompt=text)['embedding']

def get_ollama_embedding(model:str="nomic-embed-text"):
    embedding_function = OllamaEmbeddings(base_url="http://127.0.0.1:11434",
                                          model=model,
                                          show_progress=True,
                                          temperature=0)
    return embedding_function

async def chatWithRemoteOllamaAsync(
    messages:List[dict],
    model:str = "llama3", 
    temperature:float = 0.7,
    stream:bool = True):
    client = Client(host='http://127.0.0.1:11434')
    
    return client.chat(
        model=model, 
        messages=messages, 
        stream=stream, 
        options={'temperature': temperature})
    
async def ChatWithContext(
    query:str, 
    context:str = "", 
    system:str = "You are an assistant capable of summarizing any text without loosing context",
    history:List[dict] = []) -> List[dict]:
    
    if not any(item.get('role') == 'system' for item in history):
        history=[{'role':'system','content':system}, *history]
    
    question = query    
    if context:
        question = f"""Considering the following context:
        ==================================
        Context: {context}
        ==================================
        User: {question}
        """     
    messages = [*history, {'role':'user','content':question}]
    assistant = ''
    for chunk in await chatWithRemoteOllamaAsync(messages):
        assistant += chunk['message']['content']
        print(chunk['message']['content'], end='', flush=True)
        
    return [*messages, {'role':'assistant','content':assistant}]

def prettyPrint(text):
    print("\n")
    pprint.pprint(text, indent=2, sort_dicts=False)

## Testing your functions

In [8]:
#Test your functions:
result = ollama_local_embed(text="Hello world!")
len(result)

768

In [9]:
embed_function = get_ollama_embedding()
result = embed_function.embed_query("Hello, it's me again! This time I come with a progress bar! Very useful for large texts")
len(result)

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  7.61it/s]


768

In [10]:
#Testing the Chat function: chatWithRemoteOllamaAsync
answer = ''
system="You are an assistant specialized in movies"
context = "I am a character of a famous movie in the 80s which rescued POWs in Vietnam"
question = "what is my full name?"
formatted_question=f"""Considering the following context:
==================================
Context: {context}
==================================
Answer the question: {question}
"""
prettyPrint(formatted_question)



('Considering the following context:\n'
 'Context: I am a character of a famous movie in the 80s which rescued POWs in '
 'Vietnam\n'
 'Answer the question: what is my full name?\n')


In [11]:
# Build the message array 
messages = [
        {'role':'system','content':system},
        {'role':'user','content':formatted_question}
    ]

prettyPrint(messages)



[ {'role': 'system', 'content': 'You are an assistant specialized in movies'},
  { 'role': 'user',
    'content': 'Considering the following context:\n'
               'Context: I am a character of a famous movie in the 80s which '
               'rescued POWs in Vietnam\n'
               'Answer the question: what is my full name?\n'}]


In [12]:
#Send the message array to the LLM
for chunk in await chatWithRemoteOllamaAsync(messages):
    answer += chunk['message']['content']
    print(chunk['message']['content'], end='', flush=True)
    
history = [*messages, {'role':'assistant','content':answer}]
print("\n")

prettyPrint(history)
#Expected execution time : 30s

I think I know this one!

You are Lieutenant Colonel James "Hannibal" Smith, played by Mr. T, from the classic 1980s TV series "The A-Team". The show was known for its action-packed storylines and memorable characters.

Is that correct?



[ {'role': 'system', 'content': 'You are an assistant specialized in movies'},
  { 'role': 'user',
    'content': 'Considering the following context:\n'
               'Context: I am a character of a famous movie in the 80s which '
               'rescued POWs in Vietnam\n'
               'Answer the question: what is my full name?\n'},
  { 'role': 'assistant',
    'content': 'I think I know this one!\n'
               '\n'
               'You are Lieutenant Colonel James "Hannibal" Smith, played by '
               'Mr. T, from the classic 1980s TV series "The A-Team". The show '
               'was known for its action-packed storylines and memorable '
               'characters.\n'
               '\n'
               'Is that correct?'}]


In [13]:
#Reply adding a new interaction from User
messages=[*history,{'role':'user','content':'No. Guess again!'}]
answer=''
for chunk in await chatWithRemoteOllamaAsync(messages):
    answer += chunk['message']['content']
    print(chunk['message']['content'], end='', flush=True)
    
history = [*messages, {'role':'assistant','content':answer}]
prettyPrint(history)
#Expected execution time : 40s

Let me think...

Hmmm... You're a character from a famous movie in the 80s that rescued POWs in Vietnam...

Ah-ha!

I'm going to take a guess that you are Captain John Rambo, played by Sylvester Stallone, from the iconic film "Rambo: First Blood Part II" (1985)!

Am I correct this time?

[ {'role': 'system', 'content': 'You are an assistant specialized in movies'},
  { 'role': 'user',
    'content': 'Considering the following context:\n'
               'Context: I am a character of a famous movie in the 80s which '
               'rescued POWs in Vietnam\n'
               'Answer the question: what is my full name?\n'},
  { 'role': 'assistant',
    'content': 'I think I know this one!\n'
               '\n'
               'You are Lieutenant Colonel James "Hannibal" Smith, played by '
               'Mr. T, from the classic 1980s TV series "The A-Team". The show '
               'was known for its action-packed storylines and memorable '
               'characters.\n'
               '\

In [14]:
#Testing the Chat function: ChatWithContext
history=[]
history = await ChatWithContext(
    query="I am a character of a movie. Guess my name", 
    context="I like to run, a lot. The first letter of my name starts with F", 
    system="You are an assistant specialized in movies")

prettyPrint(history)
#Expected execution time: 30s

A fun challenge!

Given that you like to run a lot and your name starts with F, I'm going to take a guess...

Is your name Forrest? As in Forrest Gump, the iconic movie where the main character loves to run?

Am I correct?

[ {'role': 'system', 'content': 'You are an assistant specialized in movies'},
  { 'role': 'user',
    'content': 'Considering the following context:\n'
               '        Context: I like to run, a lot. The first letter of my '
               'name starts with F\n'
               '        User: I am a character of a movie. Guess my name\n'
               '        '},
  { 'role': 'assistant',
    'content': 'A fun challenge!\n'
               '\n'
               'Given that you like to run a lot and your name starts with F, '
               "I'm going to take a guess...\n"
               '\n'
               'Is your name Forrest? As in Forrest Gump, the iconic movie '
               'where the main character loves to run?\n'
               '\n'
               'A

In [15]:
history = await ChatWithContext(
    query="Yes",
    history=history    
)

print("\n")
prettyPrint(history)

#Expected Execution time: 1min

Yay! I was right!

Forrest Gump is an amazing movie, and Tom Hanks' portrayal of the titular character is just incredible.

Now that we've established your identity as Forrest Gump, what's next?

Would you like to discuss the movie itself, or maybe explore some interesting facts about it?



[ {'role': 'system', 'content': 'You are an assistant specialized in movies'},
  { 'role': 'user',
    'content': 'Considering the following context:\n'
               '        Context: I like to run, a lot. The first letter of my '
               'name starts with F\n'
               '        User: I am a character of a movie. Guess my name\n'
               '        '},
  { 'role': 'assistant',
    'content': 'A fun challenge!\n'
               '\n'
               'Given that you like to run a lot and your name starts with F, '
               "I'm going to take a guess...\n"
               '\n'
               'Is your name Forrest? As in Forrest Gump, the iconic movie '
               'where the 

## [Document loaders](https://python.langchain.com/docs/modules/data_connection/document_loaders/)

Document loaders load documents from many different sources. LangChain provides over 100 different document loaders as well as integrations with other major providers in the space, like AirByte and Unstructured. LangChain provides integrations to load all types of documents (HTML, PDF, code) from all types of locations (private S3 buckets, public websites).
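
Conceptually, a loader reads a source and returns objects carrying the text plus metadata about where it came from. The sketch below mimics that shape for local `.txt` files; `SimpleDocument` and `load_text_files` are hypothetical stand-ins, not LangChain classes, and only mirror the `page_content` and `metadata` fields visible in the outputs of this notebook.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class SimpleDocument:
    # Hypothetical stand-in with the two fields a loaded document exposes here.
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def load_text_files(directory: str) -> List[SimpleDocument]:
    # One document per .txt file, with the file path recorded as the source.
    return [
        SimpleDocument(page_content=p.read_text(encoding="utf-8"), metadata={"source": str(p)})
        for p in sorted(Path(directory).glob("*.txt"))
    ]


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "note.txt").write_text("RDI joined Capgemini group in 2017.", encoding="utf-8")
    docs_demo = load_text_files(tmp)
    print(docs_demo[0].metadata["source"], len(docs_demo[0].page_content))
```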

## Load a WebPage

In [16]:
#HTML
from langchain_community.document_loaders import WebBaseLoader

loader_HTML = WebBaseLoader(web_paths=["https://www.rdisoftware.com/"])
docsWeb = loader_HTML.load() #load
print(docsWeb)
print(len(docsWeb))

#Expected execution time: 1s

[Document(page_content="\n\n\n\n\nRDI Software\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n    \n  \n\n\n\n\nLinkedin\nInstagram\nGlassdoor\n\n\n\n  Home  \n Discover RDI \n Our Services \n Careers \n Testimonials \n Contact Us \n\n\n\n\n\n\n\n\n\n \nmore than just a technology company\na PEOPLE company\n DISCOVER \n\n\n\n\n\n\n\n\nWe proudly serve the leading global foodservice retailer with a focus on improving the overall crew and customer experience, resulting in improved operations in the restaurants.\n\n\n\n\n\n\n\n\nDiscover RDI\n\n\n\n  \n\nRDI has 20+ years of history working as a strategic partner of a major global QSR (Quick Service Restaurant) retailer.\n\n\n\n\n\n  \n\nRDI delivers POS services (point-of-sale) and related software applications that focus on the overall crew and customer experience, resulting in improved operations to the fastest growing food restaurant chain over 120 countries and over 37,500 locations worldwide.RDI joined Capgem

In [17]:
#Convert the Document to Text
from langchain_community.document_transformers import Html2TextTransformer

html2text = Html2TextTransformer()
docs_transformed = html2text.transform_documents(docsWeb)
print(docs_transformed)
rdiPageContent = docs_transformed[0].page_content
rdiPageContent

[Document(page_content="RDI Software Linkedin Instagram Glassdoor Home Discover RDI Our Services\nCareers Testimonials Contact Us more than just a technology company a PEOPLE\ncompany DISCOVER We proudly serve the leading global foodservice retailer with\na focus on improving the overall crew and customer experience, resulting in\nimproved operations in the restaurants. Discover RDI RDI has 20+ years of\nhistory working as a strategic partner of a major global QSR (Quick Service\nRestaurant) retailer. RDI delivers POS services (point-of-sale) and related\nsoftware applications that focus on the overall crew and customer experience,\nresulting in improved operations to the fastest growing food restaurant chain\nover 120 countries and over 37,500 locations worldwide.RDI joined Capgemini\ngroup in 2017, which has 270,000 collaborators in almost 50 countries. Our\nDevelopment Centers are in Sao Paulo (Brazil), Budapest and Debrecen\n(Hungary). 120 countries deployed 37,500 restaurants runn

"RDI Software Linkedin Instagram Glassdoor Home Discover RDI Our Services\nCareers Testimonials Contact Us more than just a technology company a PEOPLE\ncompany DISCOVER We proudly serve the leading global foodservice retailer with\na focus on improving the overall crew and customer experience, resulting in\nimproved operations in the restaurants. Discover RDI RDI has 20+ years of\nhistory working as a strategic partner of a major global QSR (Quick Service\nRestaurant) retailer. RDI delivers POS services (point-of-sale) and related\nsoftware applications that focus on the overall crew and customer experience,\nresulting in improved operations to the fastest growing food restaurant chain\nover 120 countries and over 37,500 locations worldwide.RDI joined Capgemini\ngroup in 2017, which has 270,000 collaborators in almost 50 countries. Our\nDevelopment Centers are in Sao Paulo (Brazil), Budapest and Debrecen\n(Hungary). 120 countries deployed 37,500 restaurants running our products 1.7\nm

In [18]:
history=[]

In [19]:
history = await ChatWithContext(
    query="When was RDI bought by Capgemini?", 
    context=rdiPageContent,
    system="You are an assistant capable of providing direct answers")

print("\n")
print(history)

#Expected execution time: 5min

According to the provided context, RDI joined the Capgemini group in 2017.



In [20]:
#GOOD TO KNOW!
#HTML LOADER (OPTIONAL LECTURE CONTENT)
from langchain_community.document_loaders import AsyncHtmlLoader

urls = ["https://www.rdisoftware.com/"]
loader_HTML = AsyncHtmlLoader(urls)
print("\nAsync html loaders")
docs = loader_HTML.load()
print("Downloaded pages: " + str(len(docs)))
prettyPrint(docs[0])
os.makedirs("./Output", exist_ok=True)  # make sure the output folder exists
with open("./Output/Rdi.html", mode="w", encoding="utf-8") as file:
    file.write(docs[0].page_content)
    
# Expected execution time: 7s


Async html loaders


Fetching pages:   0%|          | 0/1 [00:00<?, ?it/s]

Fetching pages: 100%|##########| 1/1 [00:00<00:00,  1.62it/s]


Downloaded pages: 1


Document(page_content='<!DOCTYPE html>\n<html>\n<head>\n\n<!-- Basic -->\n<meta charset="utf-8">\n<title>RDI Software</title>\n<meta name="keywords" content="HTML5" />\n<meta name="description" content="RDI Software - Part of Capgemini">\n<meta name="RDI" content="rdisoftware.com">\n\n<!--Shortcut icon-->\n<link rel="shortcut icon" href="img/favicon.png" />\n\n<!-- Mobile Metas -->\n<meta name="viewport" content="width=device-width, initial-scale=1.0">\n\n<!-- Web Fonts  -->\n<link href="https://fonts.googleapis.com/css?family=Ubuntu:300,400,600,700,800%7CShadows+Into+Light" rel="stylesheet" type="text/css">\n\n<!-- Vendor CSS -->\n<link rel="stylesheet" href="vendor/bootstrap/bootstrap.css">\n<link rel="stylesheet" href="vendor/fontawesome/css/font-awesome.css">\n<link rel="stylesheet" href="vendor/owlcarousel/owl.carousel.min.css" media="screen">\n<link rel="stylesheet" href="vendor/owlcarousel/owl.theme.default.min.css" media="screen">\n<link rel="stylesheet" h

## Text Splitters
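
The idea behind recursive splitting can be sketched in plain Python: try the coarsest separator first, merge pieces while they still fit, and only fall back to finer separators (and finally hard cuts) for pieces that are too long. This is a simplified illustration, not LangChain's implementation; `recursive_split` is a hypothetical helper and it ignores chunk overlap.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = ("\n\n", "\n", " "),
) -> List[str]:
    # Base case: the text already fits into one chunk.
    if len(text) <= chunk_size:
        return [text] if text else []
    # No separators left: fall back to hard cuts.
    if not separators:
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging while the chunk still fits
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: split it with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("one two three four five", chunk_size=9))
# → ['one two', 'three', 'four five']
```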

In [21]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
#RecursiveCharacterTextSplitter tries not to cut words in half. It works well, but lesson 6 covers better strategies

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500, 
    chunk_overlap=50, 
    is_separator_regex=False
    )
rdi_splitted_documents = text_splitter.split_documents(docs_transformed)
rdi_splitted_documents

[Document(page_content='RDI Software Linkedin Instagram Glassdoor Home Discover RDI Our Services\nCareers Testimonials Contact Us more than just a technology company a PEOPLE\ncompany DISCOVER We proudly serve the leading global foodservice retailer with\na focus on improving the overall crew and customer experience, resulting in\nimproved operations in the restaurants. Discover RDI RDI has 20+ years of\nhistory working as a strategic partner of a major global QSR (Quick Service', metadata={'source': 'https://www.rdisoftware.com/', 'title': 'RDI Software', 'description': 'RDI Software - Part of Capgemini', 'language': 'No language found.'}),
 Document(page_content='Restaurant) retailer. RDI delivers POS services (point-of-sale) and related\nsoftware applications that focus on the overall crew and customer experience,\nresulting in improved operations to the fastest growing food restaurant chain\nover 120 countries and over 37,500 locations worldwide.RDI joined Capgemini\ngroup in 2017,

# VectorStores

Vector stores are specialized databases designed to efficiently handle high-dimensional vector data. They are the backbone for storing and retrieving embeddings produced by an embedding model (here, nomic-embed-text). These embeddings are high-dimensional vectors that capture the semantic content of text and are crucial for tasks like text similarity, clustering, and retrieval.
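
To make the idea concrete, here is a minimal in-memory sketch of what a vector store does: keep vectors next to their text and rank them against a query vector. `TinyVectorStore` is a hypothetical class for illustration only; it scans every item by brute force, whereas real stores such as Chroma use an index, and the two-dimensional vectors are made up.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity: 1.0 for identical direction, 0.0 for orthogonal vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    # Hypothetical in-memory store: keeps (vector, text) pairs keyed by id.
    def __init__(self) -> None:
        self._items: Dict[str, Tuple[List[float], str]] = {}

    def add(self, item_id: str, vector: List[float], text: str) -> None:
        # Re-adding an existing id overwrites it, so the same chunk is stored once.
        self._items[item_id] = (vector, text)

    def search(self, query: Sequence[float], k: int = 1) -> List[Tuple[str, float]]:
        # Brute-force scan: score every stored vector, highest similarity first.
        scored = [(text, cosine(query, vec)) for vec, text in self._items.values()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


store = TinyVectorStore()
store.add("a", [1.0, 0.0], "about cats")
store.add("b", [0.0, 1.0], "about cars")
print(store.search([0.9, 0.1], k=1))
```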

In [22]:
from langchain_chroma import Chroma

def add_to_chroma(chunks: List[str], metadatas: List[dict], ids: List[str],
                  collection="Rdi_vectorDb") -> List[str]: 
    db = Chroma(
        collection_name=collection,
        embedding_function=get_ollama_embedding(),
        persist_directory="./Databases")
    return db.add_texts(chunks, metadatas, ids) 


def get_db(collection="Rdi_vectorDb"):
    return Chroma(
        persist_directory="./Databases",    
        collection_name=collection,
        embedding_function=get_ollama_embedding())
    


In [23]:

#extract chunks
chunks = [_.page_content for _ in rdi_splitted_documents]
print(chunks)

#extract metadatas
metadatas = [_.metadata for _ in rdi_splitted_documents]
print(metadatas)

#Create Ids
import hashlib
ids = [f"{hashlib.sha256(chunk.encode()).hexdigest()}" for chunk in chunks]
print(ids)


['RDI Software Linkedin Instagram Glassdoor Home Discover RDI Our Services\nCareers Testimonials Contact Us more than just a technology company a PEOPLE\ncompany DISCOVER We proudly serve the leading global foodservice retailer with\na focus on improving the overall crew and customer experience, resulting in\nimproved operations in the restaurants. Discover RDI RDI has 20+ years of\nhistory working as a strategic partner of a major global QSR (Quick Service', 'Restaurant) retailer. RDI delivers POS services (point-of-sale) and related\nsoftware applications that focus on the overall crew and customer experience,\nresulting in improved operations to the fastest growing food restaurant chain\nover 120 countries and over 37,500 locations worldwide.RDI joined Capgemini\ngroup in 2017, which has 270,000 collaborators in almost 50 countries. Our\nDevelopment Centers are in Sao Paulo (Brazil), Budapest and Debrecen', '(Hungary). 120 countries deployed 37,500 restaurants running our products 1

In [24]:
#Add to database
added_ids = add_to_chroma(chunks, metadatas, ids)
print(added_ids)

OllamaEmbeddings: 100%|██████████| 15/15 [00:10<00:00,  1.40it/s]

['acaffb9f395fc55f80a89036d130bce82561a0d313b0d3c1904ba4ac8d763a6b', 'cd82e527ab0937636f72443bdf2230ea2806050fa34bbf893de67aefde4e2bda', '7b89672b63b1875aebfc8224cc9aa3b8d7a31b328bdccf7176a5eb045d2b7b8e', '140b4b193447164777e85e437507d977d27dc536a8f177d81c02ac48eff80170', '412939f83001c8eb3506416b146afab5a2fcfa2c5a732018ca7c0aaa48e1246f', '5eeab1004bb01d878285a753a8404c4dea3643206cd1980715ae705724e2cb2b', 'fd817c9c54c1c5a07fd59fa2e300fe1c4aa06036363ee299a4f0bd6ae92d8b95', '54faf542cccaedc30ede3a786d02c49074a4017781895f91d075372ef4794da6', 'fc9c0733a26656aacd53a75228916731f2099b35f280355def5e85e584ad0002', '17f4eb54338963ea953ad9d6e925aae48fe82828174599b318e5b8128e07ad81', '47bc01e05fd761ca64668bed6ab0779fa4ee53d371ae0a724f72a7fba9a13702', 'c794030fbfcbd72221c7a4fbd1518fcb0aa787272ca1fef4767f87301b33eb4a', '1055ad83fa8af55d8117eee595900af73bcff6611049cdbb3fc150d28fbe4796', '820468b26f59aa19080d45725b7fd52f3a934268ad3be7b4fb34341755fefa74', 'bf2cb2c18fee9b73b757363c31a3d773bd0599ed4db31f




## Similarity Search

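A similarity search embeds the query and returns the stored chunks whose vectors are closest to it. The score returned with each document is a distance, so a lower score means a closer match.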
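
The ranking can be sketched as a nearest-neighbour lookup over distances. This is a minimal illustration that assumes a squared Euclidean distance (check the metric configured for your collection); `squared_l2` and `nearest` are hypothetical helpers and the vectors are made up.

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    # Squared Euclidean distance: 0.0 means the vectors are identical.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query: Sequence[float], vectors: List[Sequence[float]], k: int = 2) -> List[Tuple[int, float]]:
    # Returns (index, distance) pairs, smallest distance first.
    scored = [(i, squared_l2(query, v)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


stored = [[0.0, 0.0], [3.0, 4.0], [1.0, 0.5]]
print(nearest([1.0, 0.0], stored))
# → [(2, 0.25), (0, 1.0)]
```

Because the values are distances, results are sorted in ascending order, the opposite of a similarity score.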

In [25]:
query = "In what year RDI joined Capgemini group ?"
vectorstore = get_db()
docs = vectorstore.similarity_search_with_score(query=query, k=5)
docs

OllamaEmbeddings:   0%|          | 0/1 [00:00<?, ?it/s]

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  5.29it/s]


[(Document(page_content='Email: Support The Company is growing We’re expanding our team! We are looking\nfor new team members that are strong performers, love big challenges and will\nfit in with the culture at RDI. Contact RDI HR directly – we may be looking\nfor you! Contact Us rdihumanresources.hr@capgemini.com Linkedin Privacy Notice\n- Brazil Salary Transparency Report', metadata={'description': 'RDI Software - Part of Capgemini', 'language': 'No language found.', 'source': 'https://www.rdisoftware.com/', 'title': 'RDI Software'}),
  315.5809641336256),
 (Document(page_content='involves a first conversation with a recruiter, a technical evaluation and an\ninterview at one of our global offices with the hiring Manager and HR. Join Us\nRevolutionizing restaurant operations through innovative Technology Solutions.\nTestimonials Getting full support for my personal and professional development\nis just one reason that I feel lucky working for RDI. Here I am not only an\nemployee but a

## Maximal Marginal Relevance Search (MMR)

Maximal marginal relevance selects documents that are similar to the query while also being different from the documents already selected, which reduces near-duplicate results.
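
A minimal sketch of the greedy MMR selection, assuming cosine similarity and a trade-off weight called `lambda_mult` (1.0 means pure relevance, lower values favour diversity). The `mmr` helper and the toy vectors are illustrative, not the library's code.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query: Sequence[float], candidates: List[Sequence[float]], k: int, lambda_mult: float = 0.5) -> List[int]:
    # Greedy selection: at each step pick the candidate with the best balance of
    # relevance to the query and dissimilarity to what was already selected.
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            redundancy = max((cosine(candidates[i], candidates[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query_vec = [1.0, 0.0]
candidate_vecs = [[1.0, 0.1], [1.0, 0.12], [1.0, -0.5]]
print(mmr(query_vec, candidate_vecs, k=2, lambda_mult=0.5))  # → [0, 2]
print(mmr(query_vec, candidate_vecs, k=2, lambda_mult=1.0))  # → [0, 1]
```

With `lambda_mult=0.5` the near-duplicate second vector is skipped in favour of the more distinct third one; with `lambda_mult=1.0` the selection falls back to plain relevance ranking.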

In [26]:
query = "In what year RDI joined Capgemini group ?"
docs = vectorstore.max_marginal_relevance_search(query=query, k=6)
docs = [doc.page_content for doc in docs]
docs

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  6.76it/s]
Number of requested results 20 is greater than number of elements in index 16, updating n_results = 16


['Email: Support The Company is growing We’re expanding our team! We are looking\nfor new team members that are strong performers, love big challenges and will\nfit in with the culture at RDI. Contact RDI HR directly – we may be looking\nfor you! Contact Us rdihumanresources.hr@capgemini.com Linkedin Privacy Notice\n- Brazil Salary Transparency Report',
 'involves a first conversation with a recruiter, a technical evaluation and an\ninterview at one of our global offices with the hiring Manager and HR. Join Us\nRevolutionizing restaurant operations through innovative Technology Solutions.\nTestimonials Getting full support for my personal and professional development\nis just one reason that I feel lucky working for RDI. Here I am not only an\nemployee but a colleague and a friend, and this is not only a workplace but a',
 "As the only foreign inside the company right now, everyone helped me to\nachieve a fast engagement with the company know-how, and they’re always\nwilling for me to 

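# The RAG

The chunks retrieved by the MMR search are now passed as context to the chat function, together with the original query.
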
In [27]:
# Join the retrieved chunks into a single string before using them as context
await ChatWithContext(query=query, context="\n\n".join(docs))

#Expected execution time: 1m30s

According to the provided context, RDI joined the Capgemini group in 2017.

[{'role': 'system',
  'content': 'You are an assistant capable of summarizing any text without loosing context'},
 {'role': 'user',
 {'role': 'assistant',
  'content': 'According to the provided context, RDI joined the Capgemini group in 2017.'}]