<a href="https://colab.research.google.com/gist/isareply/4a05d65dd2b745d477a44be1011b3ae5/rag_example_potaito.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Get the OpenAI API key
My OpenAI API key is stored in my Google Drive, but you can get your own and assign it directly to the `api_key` variable.

In [1]:
#-- mounting Google drive
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


In [2]:
#-- retrieving the APIs from the environment in the gdrive
import os
import json
# loading API keys from file
with open('/content/gdrive/My Drive/RAG_tests/.env/api_keys.json') as f:
    api_keys = json.load(f)

api_key = api_keys['openaiKEY']
os.environ["OPENAI_API_KEY"] = api_key #you can substitute your own key here
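
If you are not working from my Drive, a minimal sketch like the following (not part of the original notebook) reads the key from a JSON file when one exists and otherwise falls back to an environment variable. The helper name `load_api_key` and its parameters are hypothetical; only the `openaiKEY` field and the `OPENAI_API_KEY` variable come from the cell above.

```python
import json
import os
from pathlib import Path


def load_api_key(json_path, field="openaiKEY", env_var="OPENAI_API_KEY"):
    """Hypothetical helper: return the key stored under `field` in `json_path`,
    otherwise fall back to the environment variable `env_var`."""
    path = Path(json_path)
    if path.is_file():
        with path.open() as f:
            keys = json.load(f)
        if field in keys:
            return keys[field]
    key = os.environ.get(env_var)
    if key is None:
        raise RuntimeError(f"No key found in {json_path} or in ${env_var}")
    return key
```

Failing loudly when no key is found avoids a less readable authentication error later in the notebook.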

_______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________


# Q&A with standard LLM
Here a standard pre-trained LLM is used to demonstrate how to use it in a conversational manner and then to illustrate the shortcomings of relying only on *parametric* knowledge (i.e. knowledge built from the finite set of training data and stored in the weights of the model).

## LangChain
### *What is it:*
[LangChain](https://python.langchain.com/docs/get_started/introduction) is used to architect the pipeline (*Chain* component) for applications powered by LLM (*Lang*-uage component).

### *How it does this:*
It is a framework that simplifies the application lifecycle by providing several components:
* *LangChain Libraries* (a collection of libraries providing integrations for a range of components, plus a runtime for combining those components into chains and agents) and *Templates* (several easily deployable reference architectures) --> for **developing the cognitive engine**
* *LangServe* (a library for turning the app into an API) --> for **deploying** the chain **as a REST API**
* *LangSmith* (a developer platform to test, debug and monitor the chains) --> for **observability**

### *Why should I know about it:*
Its added value is most apparent when scaling an application, because its modular architecture and off-the-shelf components make development easier.

### *Implementation:*

In [3]:
!pip install -qU \
          langchain==0.0.292 \
          openai==0.28.0 \
          datasets==2.10.1 \
          tiktoken==0.5.1

In [7]:
#langchain is essentially the orchestrator of the chain components helping with scalability
from langchain.chat_models import ChatOpenAI                                            #langchain abstraction to use gpt 3.5 and gpt-4
chat = ChatOpenAI(openai_api_key = os.environ["OPENAI_API_KEY"], model='gpt-3.5-turbo') #here you can substitute your own Open AI key

To provide some context, I am asking the model for potato recipes because in Italy the (now former) president of the AI Commission [once claimed](https://www.affaritaliani.it/politica/amo-le-patate-con-l-ai-tante-ricette-tragicomica-la-prima-riunione-di-amat-889544.html) that AI enables journalists to swiftly access all potato recipes - a somewhat true but also unexpected statement from a president of such a strategic commission.

The task of requesting potato recipes serves as an illustrative example of the need to provide context to a language model. The project is rooted in that statement, but the task was chosen for the meme rather than for any intrinsic significance, so focus on the architecture and not on the actual task (which you can customise to your own needs).

In [8]:
#these are langchain components that help structure the prompt and steer the behavior of the LLM
from langchain.schema import (
    SystemMessage,
    HumanMessage,
    AIMessage
)
#here the langchain messages components are initialized
messages = [
    SystemMessage(content="Please act like a helpful assistant."),   #role of the ecosystem
    HumanMessage(content="Hi, can you help me out with a task?"),    #need of the user
    AIMessage(content="Sure, how can I help you?"),                  #role of the assistant
    HumanMessage(content="I'd like you to list some potato recipes") #need of the user
            ]

#the messages are then passed to the ChatOpenAI object, which returns the answer
res = chat(messages)
print(res.content)

Of course! Here are a few popular potato recipes:

1. Mashed potatoes: Boil and mash potatoes with butter, milk, salt, and pepper for a creamy and comforting side dish.

2. Roasted potatoes: Toss chopped potatoes with olive oil, salt, pepper, and your favorite herbs, then roast in the oven until crispy and golden brown.

3. Potato salad: Boil potatoes until tender, then mix with mayonnaise, mustard, chopped herbs, and other desired ingredients like celery, onions, or pickles.

4. Baked potatoes: Bake whole potatoes in the oven until the skin is crispy and the inside is soft. Serve with various toppings like sour cream, bacon, cheese, or chives.

5. Potato soup: Cook diced potatoes with onions, garlic, broth, and cream until tender. Blend until smooth for a creamy and comforting soup.

6. Potato pancakes: Grate potatoes and mix with eggs, flour, and seasonings, then fry until golden brown. Serve with sour cream or applesauce.

7. Scalloped potatoes: Layer thinly sliced potatoes with che

In [9]:
#"res" is just another AIMessage object so one can append it to messages, add another HumanMessage and generate the next response in the conversation
messages.append(res)                                                                                             #appending it to messages
prompt= HumanMessage(content="I have no garlic at home, can you give me only recipes without garlic, please?")   #creating new HumanMessage
messages.append(prompt)                                                                                          #adding the prompt to the messages
res = chat(messages)                                                                                             #sending it to the model

print(res.content)                                                                                               #generating the new response (the added value of langchain is having these components to easily orchestrate conversations)

Certainly! Here are some garlic-free potato recipes:

1. Mashed potatoes: Follow the same steps as mentioned before, but skip the garlic. You can enhance the flavor with herbs like parsley, chives, or thyme.

2. Roasted potatoes: Use olive oil, salt, pepper, and your preferred herbs or spices to season the potatoes before roasting. Try rosemary, paprika, or Italian seasoning for added flavor.

3. Potato salad: Create a tangy dressing using mayonnaise, mustard, lemon juice, dill, and chives. Skip the garlic and add other vegetables like celery, onions, or pickles for crunch and flavor.

4. Baked potatoes: Bake whole potatoes until they are crispy on the outside and fluffy on the inside. Top them with butter, sour cream, cheese, chives, or bacon bits for a delicious treat.

5. Potato soup: Make a creamy potato soup by cooking diced potatoes with onions, vegetable or chicken broth, butter, and herbs like thyme or parsley. You can also add some cream for richness.

6. Potato pancakes: Grat

_______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

## Limitations of the standard architecture

## In a nutshell
So far this is just a standard chatbot where:
* the **engine** behind it is simply the **pre-trained LLM**
* the **architecture** that supports it is powered by LangChain's components

The problem is that so far the model relies only on the **parametric knowledge** stored in its weights, which can lead to:
- *inaccurate or outdated responses* = output that does not fit the context or is no longer relevant
- *hallucinations* = fabricated output presented with confidence

As explained in the article, one way to avoid relying on static data, especially when new data arrives frequently, is the **RAG** (Retrieval-Augmented Generation) technique.
It essentially consists of extending the knowledge available to the model with additional data that is retrieved and supplied as **context**.

The RAG approach can be more cost- and time-effective than **retraining** the model through fine-tuning. For the aspects to take into account when choosing an LLM optimisation strategy, this [article](https://medium.com/@younesh.kc/rag-vs-fine-tuning-in-large-language-models-a-comparison-c765b9e21328) has some useful insights.


## Demonstration of these limitations
Following up on the task of asking for potato recipes, suppose a great recipe came out only recently, after January 2022 (the cutoff of the OpenAI training data): neither GPT-3.5 nor GPT-4 would be able to retrieve it.
Hence, to obtain more accurate, domain-specific answers we can use RAG.

**To better illustrate this concept, here the model is asked for a potato recipe with a made-up ingredient.** Following up on the potato example, the made-up ingredient used here is called "Amato's secret stuff" (from Michael's Secret Stuff - Space Jam fan here).

**By using a made-up ingredient one can be sure that no recipe with it exists, which makes it easier to observe how the model behaves when faced with an unfamiliar context and to decide what to do when the LLM does not meet expectations.**



In [10]:
# asking about the made-up ingredient without providing any context
messages.append(res)

#the model should not answer this question
prompt = HumanMessage(content='Can you tell me a potato recipe with Amato\'s secret stuff?')
messages.append(prompt)
res = chat(messages)

print(res.content)

I'm sorry, but as an AI language model, I don't have access to specific brand recipes or proprietary secret ingredients like Amato's secret stuff. However, I can help you with a general potato recipe and suggest some common ingredients that you might consider adding to enhance the flavors. Here's a simple and delicious recipe for you:

Herb-Roasted Potatoes:

Ingredients:
- Potatoes (any variety you prefer)
- Olive oil
- Salt and pepper
- Fresh herbs of your choice (such as rosemary, thyme, or parsley)
- Optional: additional seasonings or spices of your choice

Instructions:
1. Preheat your oven to 425°F (220°C).
2. Wash and scrub the potatoes, then cut them into bite-sized pieces.
3. In a mixing bowl, toss the potato pieces with olive oil, salt, pepper, and your desired herbs. You can also add other seasonings or spices to your liking.
4. Place the seasoned potatoes on a baking sheet in a single layer.
5. Roast the potatoes in the preheated oven for about 25-30 minutes or until they a

This answer is the best-case scenario: the model acknowledges that it does not know any recipe with the new ingredient.

However, in some other attempts the model confidently made up a recipe with the new ingredient without saying that it was making it up. This happens because LLMs do not have any causal map of the world; they only recognize patterns in the data and reproduce them. So the model invents a recipe with the new ingredient and presents it as a genuine answer to the user.

In [11]:
#For example, case of HALLUCINATION
espresso_prompt = HumanMessage(content = "Do you know anything about potato recipes with espresso?")
messages.append(espresso_prompt)
res = chat(messages)
print(res.content)

Certainly! While it may not be a traditional combination, espresso can add a unique flavor to potato dishes. Here's a recipe for espresso-infused roasted potatoes:

Ingredients:
- 2 pounds of potatoes (Yukon Gold or Russet), cut into cubes
- 2 tablespoons of olive oil
- 1 tablespoon of finely ground espresso or instant coffee
- 1 teaspoon of paprika
- Salt and pepper to taste
- Fresh parsley, chopped (optional, for garnish)

Instructions:
1. Preheat your oven to 425°F (220°C) and line a baking sheet with parchment paper or foil.
2. In a small bowl, mix the olive oil, ground espresso, paprika, salt, and pepper to create a marinade.
3. Place the potato cubes in a mixing bowl and pour the marinade over them. Toss until the potatoes are well coated.
4. Spread the coated potato cubes onto the prepared baking sheet in a single layer.
5. Roast the potatoes in the preheated oven for about 25-30 minutes, or until they are golden brown and crispy on the outside, while tender on the inside. Make 

Yet, if you search for Espresso-Rubbed Roasted Potatoes, there are no recipes that combine the two ingredients. There are some espresso-rubbed steaks served with potatoes, but the point stands: there are cases in which the model does not even recognize its own limitations, and these are the most dangerous ones. Hallucinations of this kind (with no factual, ground-truth reference) are risky in higher-stakes contexts. Think of a lawyer searching for precedents similar to the case they are arguing: the model could make up events that the lawyer would then cite without realizing they were fabricated.


# RAG-enhanced: naive updating
To reduce the likelihood of hallucinations and incorporate new, up-to-date information into pre-trained models, one can use the RAG technique.

In [12]:
additional_information = [
"""
Amato's secret potatoes
- Boil 4-5 potatoes until tender. Once cooked, drain and let them cool.
- Once the potatoes are cool, peel and chop them into bite-sized pieces.
- In a bowl, combine the potatoes with 2 cups of Amato's secret stuff, 1 tablespoon of vinegar, salt, and pepper to taste.
- Add a sprinkle of chopped up olive leaves (optional).
- Gently mix everything together until well combined.
- Bake for about 45-50 minutes, or until the potatoes are crispy on the outside and tender on the inside.
- Serve hot and enjoy!
"""
]


In [13]:
source_knowledge = "\n".join(additional_information)

In [14]:
#we can feed this additional knowledge into the prompt with some instructions telling the LLM how we'd like to use the information along with the original query
query = 'Do you know anything about a potato recipe with Amato\'s secret stuff?'

augmented_prompt = f"""Using the context below, answer the query.

Context: {source_knowledge}

Query: {query}"""

In [15]:
print(augmented_prompt)

Using the context below, answer the query.

Context: 
Amato's secret potatoes
- Boil 4-5 potatoes until tender. Once cooked, drain and let them cool.
- Once the potatoes are cool, peel and chop them into bite-sized pieces.
- In a bowl, combine the potatoes with 2 cups of Amato's secret stuff, 1 tablespoon of vinegar, salt, and pepper to taste.
- Add a sprinkle of chopped up olive leaves (optional).
- Gently mix everything together until well combined.
- Bake for about 45-50 minutes, or until the potatoes are crispy on the outside and tender on the inside.
- Serve hot and enjoy!


Query: Do you know anything about a potato recipe with Amato's secret stuff?


In [16]:
#create a new user prompt
prompt = HumanMessage(content=augmented_prompt)
messages.append(prompt)
res = chat(messages)

In [17]:
print(res.content)

Yes, based on the provided context, there is a potato recipe that incorporates Amato's secret stuff. The recipe involves boiling and chopping potatoes, then combining them with Amato's secret stuff, vinegar, salt, and pepper. The mixture is baked until the potatoes become crispy on the outside and tender on the inside. Sprinkling chopped olive leaves is optional. It sounds like a flavorful and unique potato dish!


See how the model is now able to **answer the question by incorporating the updated information, giving a more context-aware answer**.
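
The naive approach boils down to string templating, so it can be wrapped in a small reusable function. The sketch below is only an illustration: the name `build_augmented_prompt` is hypothetical, and the template text is the one used in the cells above.

```python
def build_augmented_prompt(query, context_chunks):
    """Hypothetical helper: join the context chunks and wrap them, together with
    the query, in the same prompt template used in this notebook."""
    source_knowledge = "\n".join(chunk.strip() for chunk in context_chunks)
    return (
        "Using the context below, answer the query.\n\n"
        f"Context: {source_knowledge}\n\n"
        f"Query: {query}"
    )


prompt_text = build_augmented_prompt(
    "Do you know anything about a potato recipe with Amato's secret stuff?",
    ["Amato's secret potatoes\n- Boil 4-5 potatoes until tender."],
)
print(prompt_text)
```

The returned string can be wrapped in a `HumanMessage` exactly like `augmented_prompt` above.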

# RAG pipeline: vector embedding

A simple list of strings works if you need to quickly provide context in a prompt, but incorporating all kinds of documents in a production-ready way is the real challenge. This is where **vector embeddings and databases** come into the picture, offering a more automatic and scalable way to update the knowledge base.

For the purpose of this toy project I have used generative models to make up other recipes with the new ingredient. In a more realistic setting, one could add all kinds of data to the vector database: from internal files, programmatically from APIs, or through scraping scripts, depending on the use case.

The added data is structured here as a `Dataset` object from the Hugging Face `datasets` library (imported in the next cell). The idea is the same as for [LangSmith datasets](https://docs.smith.langchain.com/evaluation/datasets): a collection of examples that can be used to customize the knowledge base.

Following up on the potato example, to incorporate the updated knowledge the dataset is built from a list of dictionaries, each entry describing a new recipe that uses the secret ingredient.
As can be seen in the first section of the notebook, there is no general knowledge of this ingredient, so without the dataset the LLM would either be unable to answer or would make up recipes.

In [19]:
from datasets import Dataset

# Define the recipes
recipes_data = [
    {
        "recipe_name": '''Amato's secret potatoes''',
        "instructions": '''
            - Boil 4-5 potatoes until tender. Once cooked, drain and let them cool.
            - Once the potatoes are cool, peel and chop them into bite-sized pieces.
            - In a bowl, combine the potatoes with 2 cups of Amato's secret stuff, 1 tablespoon of vinegar, salt, and pepper to taste.
            - Add a sprinkle of olive leaves (optional).
            - Gently mix everything together until well combined.
            - Bake for about 45-50 minutes, or until the potatoes are crispy on the outside and tender on the inside.
            - Serve hot and enjoy!
        ''' #the optional step is only if you are familiar with historical political alliances in Italy
    },
    {
        "recipe_name": '''Amato's secret rosemary potatoes''',
        "instructions": '''
            - Preheat your oven to 425°F (220°C).
            - Wash and scrub 4-5 potatoes, then cut them into bite-sized pieces.
            - In a bowl, toss the potatoes with olive oil, salt, pepper, and a generous amount of Amato's secret stuff.
            - Spread the potatoes evenly on a baking sheet lined with parchment paper.
            - Roast the potatoes in the preheated oven for 30-35 minutes or until golden brown and crispy.
            - Remove from the oven and serve immediately.
        '''
    },
    {
        "recipe_name": '''Amato's secret cumin spiced mashed potatoes''',
        "instructions": '''
            - Peel and dice 4-5 potatoes.
            - Boil the potatoes until they are fork-tender. Drain the water.
            - Mash the potatoes using a potato masher or fork.
            - Add 2 teaspoons of ground cumin, salt, and pepper to taste. Mix well.
            - Add a generous amount of Amato's secret stuff. Optional: Add a tablespoon of butter or olive oil for extra richness.
            - Continue mashing until the potatoes reach your desired consistency.
            - Serve hot as a side dish with your favorite main course.
        '''
    }
]

# Create a Dataset object
recipes_dataset = Dataset.from_dict({"recipes": recipes_data})

# Display the dataset
print(recipes_dataset)

Dataset({
    features: ['recipes'],
    num_rows: 3
})


Now that the corpus of updated data has been created, it needs to be transformed into a knowledge base that the model can use. This is done through **vector embeddings and databases**.


**Vector embeddings** are the AI-native way to represent any kind of data (text, pictures and audio do not fit naturally in a tabular format) as numeric vectors, in a way that reduces its dimensionality while retaining its meaning, so that it can be easily processed by algorithms. To create an embedding, data is fed into an **embedding model**, which is trained so that *similar* data (e.g. text with similar semantics) is mapped to vectors that are *nearer* to one another.
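
To make "nearer" concrete, here is a small sketch that compares hand-made vectors with cosine similarity. The three-dimensional values are invented for illustration and are not produced by any embedding model; real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy "embeddings" (assumption: not the output of a real model)
toy_embeddings = {
    "roasted potatoes": [0.9, 0.1, 0.0],
    "mashed potatoes": [0.8, 0.2, 0.1],
    "AI commission news": [0.1, 0.1, 0.9],
}

sim_recipes = cosine_similarity(toy_embeddings["roasted potatoes"], toy_embeddings["mashed potatoes"])
sim_mixed = cosine_similarity(toy_embeddings["roasted potatoes"], toy_embeddings["AI commission news"])
print(sim_recipes > sim_mixed)  # prints True: the two recipes point in a similar direction
```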

**Vector databases** are databases dedicated to storing these vectors; through indexing techniques they make similarity calculations and clustering analyses efficient.
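
Conceptually, the core operation is a nearest-neighbour search. The sketch below is a brute-force stand-in that scans every stored vector; it is not how Chroma or any production vector database is implemented (they rely on indexes to avoid the full scan), and the class name `TinyVectorStore` and the vector values are made up.

```python
import math


class TinyVectorStore:
    """Hypothetical brute-force store: keeps vectors in a list and scans all of them per query."""

    def __init__(self):
        self._items = []  # list of (id, vector, document)

    def add(self, item_id, vector, document):
        self._items.append((item_id, vector, document))

    def query(self, vector, n_results=2):
        # Euclidean distance to every stored vector, smallest first
        scored = [(math.dist(vector, v), item_id, doc) for item_id, v, doc in self._items]
        scored.sort(key=lambda t: t[0])
        return scored[:n_results]


store = TinyVectorStore()
store.add("0", [0.9, 0.1, 0.0], "roasted potatoes")
store.add("1", [0.8, 0.2, 0.1], "mashed potatoes")
store.add("2", [0.1, 0.1, 0.9], "AI commission news")

nearest = store.query([0.85, 0.15, 0.0], n_results=2)
print([item_id for _, item_id, _ in nearest])  # prints ['0', '1']
```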

I suggest this [article](https://medium.com/data-and-beyond/vector-databases-a-beginners-guide-b050cbbe9ca0) if you are not familiar with vector databases.

An open source option is Chroma DB, a vector database that stores embeddings and their metadata and can be used without even having to register.

In [46]:
#!pip uninstall chromadb typing-extensions #use this if you have troubles with the versioning

In [44]:
#!python --version #For context, I am using Python 3.10 here

In [45]:
#!pip install -U langchain-community chromadb==0.4.22 typing_extensions==4.7.1

In [43]:
import chromadb
client = chromadb.Client() #get the chroma client (simple in-memory chromadb)

In [47]:
#client.delete_collection(name="new_recipes") #you cannot have more than one collection with the same name so in case use this

In [48]:
collection = client.create_collection(name="new_recipes")   #collections are where the embeddings are stored
                                                            #(using the default embedding function which is sentence transformer but one can customize it with the embedding_function argument)

Chroma DB uses a [sentence transformer](https://www.sbert.net/index.html) as its default embedding function. To check the other embedding functions available, head to the [Chroma DB documentation](https://docs.trychroma.com/embeddings).

In [49]:
#format the documents the way chroma expects them as input
def string_it(ds, indx):
  """supporting function to create the string as input doc for chroma from the dataset object"""
  return "The title of the recipe is " + ds[indx]['recipes']['recipe_name']+ ". The instructions are the following:" + ds[indx]['recipes']['instructions']

input_docs = [string_it(recipes_dataset, i) for i in range(len(recipes_dataset))]
print(input_docs)

["The title of the recipe is Amato's secret potatoes. The instructions are the following:\n            - Boil 4-5 potatoes until tender. Once cooked, drain and let them cool.\n            - Once the potatoes are cool, peel and chop them into bite-sized pieces.\n            - In a bowl, combine the potatoes with 2 cups of Amato's secret stuff, 1 tablespoon of vinegar, salt, and pepper to taste.\n            - Add a sprinkle of olive leaves (optional).\n            - Gently mix everything together until well combined.\n            - Bake for about 45-50 minutes, or until the potatoes are crispy on the outside and tender on the inside.\n            - Serve hot and enjoy!\n        ", "The title of the recipe is Amato's secret rosemary potatoes. The instructions are the following:\n            - Preheat your oven to 425°F (220°C).\n            - Wash and scrub 4-5 potatoes, then cut them into bite-sized pieces.\n            - In a bowl, toss the potatoes with olive oil, salt, pepper, and a 

The most difficult part is understanding how to structure the added knowledge base, especially when you have several types of data and want to connect to new data without manually transferring files.

In [50]:
# chroma stores the text, handles the tokenization, the embedding and indexing automatically (but if you already have the embeddings you can also store those directly)
# can add up to 100k+ items
collection.add(
    ids=[str(i) for i in range(len(recipes_dataset))],  # each document must be associated with a unique ID, which is just a string
    documents=input_docs,                               #make sure this is formatted as a list of strings
    metadatas=[{"type": "recipe"} for _ in input_docs   #the list of metadata is optional
    ]
)

/root/.cache/chroma/onnx_models/all-MiniLM-L6-v2/onnx.tar.gz: 100%|██████████| 79.3M/79.3M [00:01<00:00, 55.7MiB/s]


Then I can use the added recipes to augment the knowledge base.

In [51]:
retrieved_results = collection.query(
    query_texts=["This is a query document"],
    where_document = {"$contains":"secret stuff"}
    )

print(retrieved_results)



{'ids': [['2', '1', '0']], 'distances': [[1.8984371423721313, 1.9166781902313232, 1.9231815338134766]], 'metadatas': [[{'type': 'recipe'}, {'type': 'recipe'}, {'type': 'recipe'}]], 'embeddings': None, 'documents': [["The title of the recipe is Amato's secret cumin spiced mashed potatoes. The instructions are the following:\n            - Peel and dice 4-5 potatoes.\n            - Boil the potatoes until they are fork-tender. Drain the water.\n            - Mash the potatoes using a potato masher or fork.\n            - Add 2 teaspoons of ground cumin, salt, and pepper to taste. Mix well.\n            - Add a generous amount of Amato's secret stuff. Optional: Add a tablespoon of butter or olive oil for extra richness.\n            - Continue mashing until the potatoes reach your desired consistency.\n            - Serve hot as a side dish with your favorite main course.\n        ", "The title of the recipe is Amato's secret rosemary potatoes. The instructions are the following:\n       

The collection can be queried on:
* the actual document with the where_document argument (supports \$contains and \$not_contains)
* the metadata if needed (supports several operators like \$eq, \$ne, etc)

For all possibilities, see the [documentation](https://docs.trychroma.com/usage-guide#using-where-filters).
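
To build intuition for what these filters select, the following pure-Python sketch emulates a few of the operators on a list of (document, metadata) pairs. It is an emulation for illustration only, not Chroma's implementation; the function name `matches` is hypothetical and only `$eq`, `$ne`, `$contains` and `$not_contains` are covered.

```python
def matches(document, metadata, where=None, where_document=None):
    """Hypothetical emulation: True if the item passes both the metadata and document filters."""
    for field, condition in (where or {}).items():
        actual = metadata.get(field)
        for op, expected in condition.items():
            if op == "$eq" and actual != expected:
                return False
            if op == "$ne" and actual == expected:
                return False
    for op, text in (where_document or {}).items():
        if op == "$contains" and text not in document:
            return False
        if op == "$not_contains" and text in document:
            return False
    return True


items = [
    ("Amato's secret potatoes: combine the potatoes with Amato's secret stuff.", {"type": "recipe"}),
    ("Giuliano Amato leaves the algorithm Commission after the attacks by Giorgia Meloni.", {"type": "latest_news"}),
]

with_secret_stuff = [doc for doc, meta in items if matches(doc, meta, where_document={"$contains": "secret stuff"})]
not_recipes = [doc for doc, meta in items if matches(doc, meta, where={"type": {"$ne": "recipe"}})]
print(len(with_secret_stuff), len(not_recipes))  # prints 1 1
```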

To better appreciate how the vector store and the embeddings for the items work, here completely different items are added to the collection. Skip this part if you already know how vector embeddings work.

In [52]:
other_kind_of_topics = ["Giuliano Amato leaves the algorithm Commission after the attacks by Giorgia Meloni.",
                        "Father Paolo Benanti is the new President of the AI Commission for Information."]
collection.add(
    ids = ['4','5'],
    documents = other_kind_of_topics,
    metadatas = [{"type":"latest_news"} for _ in other_kind_of_topics]
)

In [53]:
another_example = collection.get(           #get filters items without running a similarity search (query also accepts a where filter on metadata)
    #include=['embeddings'],                #with the get method the embeddings must be explicitly included if you want to see the actual values
    where = {"type" : {"$ne": "recipe"}}
    )

print(another_example)

{'ids': ['4', '5'], 'embeddings': None, 'metadatas': [{'type': 'latest_news'}, {'type': 'latest_news'}], 'documents': ['Giuliano Amato leaves the algorithm Commission after the attacks by Giorgia Meloni.', 'Father Paolo Benanti is the new President of the AI Commission for Information.'], 'uris': None, 'data': None}


In [54]:
notice_the_distance = collection.query(
    query_texts=["This is a query document"],
    where_document = {"$contains":"Amato"},
    )

print(notice_the_distance)



{'ids': [['4', '2', '1', '0']], 'distances': [[1.7931534051895142, 1.8984371423721313, 1.9166781902313232, 1.9231815338134766]], 'metadatas': [[{'type': 'latest_news'}, {'type': 'recipe'}, {'type': 'recipe'}, {'type': 'recipe'}]], 'embeddings': None, 'documents': [['Giuliano Amato leaves the algorithm Commission after the attacks by Giorgia Meloni.', "The title of the recipe is Amato's secret cumin spiced mashed potatoes. The instructions are the following:\n            - Peel and dice 4-5 potatoes.\n            - Boil the potatoes until they are fork-tender. Drain the water.\n            - Mash the potatoes using a potato masher or fork.\n            - Add 2 teaspoons of ground cumin, salt, and pepper to taste. Mix well.\n            - Add a generous amount of Amato's secret stuff. Optional: Add a tablespoon of butter or olive oil for extra richness.\n            - Continue mashing until the potatoes reach your desired consistency.\n            - Serve hot as a side dish with your fav

Notice how the distance values are about the same for the three recipes but lower for the latest-news item. This is how vector embeddings capture the meaning of the new information and let the system identify similarity.
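
The nested lists in the result are easier to read once ids, metadata and distances are paired up. The sketch below uses the values copied from the printed output above (documents omitted for brevity); each field holds one inner list per query text, hence the `[0]`.

```python
# Values copied from the printed query result above (documents omitted for brevity)
result = {
    "ids": [["4", "2", "1", "0"]],
    "distances": [[1.7931534051895142, 1.8984371423721313, 1.9166781902313232, 1.9231815338134766]],
    "metadatas": [[{"type": "latest_news"}, {"type": "recipe"}, {"type": "recipe"}, {"type": "recipe"}]],
}

# One inner list per query text; index 0 is the only query here
rows = list(zip(result["ids"][0], result["metadatas"][0], result["distances"][0]))
for item_id, meta, distance in rows:
    print(f"id={item_id}  type={meta['type']:<12} distance={distance:.4f}")
```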

Lastly, the final step of the pipeline consists of augmenting the prompt with the new information stored in the database as vector embeddings.

In [60]:
#connect the output from the vector store to the chatbot

def contextualize_prompt(query: str):
  """ function to augment the prompt with the new knowledge retrieved from the vector database """

  # here is where you retrieve the info from the vector database, using the user's query as the search text
  results = collection.query(query_texts=[query], where_document = {"$contains":"secret stuff"}, n_results=3) #<----- here we are querying the vector db
                                                                                                               # (n_results=3 because the default is 10 but only 3 documents match)
  # get the text from the results
  source_knowledge = "\n\n".join(results['documents'][0])

  #feed into an augmented prompt
  augmented_prompt = f"""Using the contexts below, answer the query.

  Context: {source_knowledge}

  Query: {query}
  """
  return augmented_prompt


In [61]:
query = "I have some Amato's secret stuff left in the fridge, can you give a recipe so that I can use it?"
print(contextualize_prompt(query)) #show the prompt augmented with the context of the new input information

Using the contexts below, answer the query.

  Context: The title of the recipe is Amato's secret cumin spiced mashed potatoes. The instructions are the following:
            - Peel and dice 4-5 potatoes.
            - Boil the potatoes until they are fork-tender. Drain the water.
            - Mash the potatoes using a potato masher or fork.
            - Add 2 teaspoons of ground cumin, salt, and pepper to taste. Mix well.
            - Add a generous amount of Amato's secret stuff. Optional: Add a tablespoon of butter or olive oil for extra richness.
            - Continue mashing until the potatoes reach your desired consistency.
            - Serve hot as a side dish with your favorite main course.
        

The title of the recipe is Amato's secret rosemary potatoes. The instructions are the following:
            - Preheat your oven to 425°F (220°C).
            - Wash and scrub 4-5 potatoes, then cut them into bite-sized pieces.
            - In a bowl, toss the potatoes with 

Finally, the augmented prompt is inserted as a new user prompt.

In [57]:
prompt = HumanMessage(content=contextualize_prompt(query))
messages.append(prompt)
res = chat(messages)
print(res.content)



Certainly! Here's a recipe you can try using Amato's secret stuff:

Title: Amato's secret herb roasted potatoes

Ingredients:
- 4-5 potatoes, washed and cut into wedges
- Olive oil
- Salt and pepper to taste
- Amato's secret stuff
- Your choice of herbs (rosemary, thyme, or parsley), chopped

Instructions:
1. Preheat your oven to 425°F (220°C).
2. In a bowl, toss the potato wedges with olive oil, salt, pepper, and a generous amount of Amato's secret stuff.
3. Sprinkle the chopped herbs over the potatoes and toss to coat evenly.
4. Spread the potatoes in a single layer on a baking sheet lined with parchment paper.
5. Roast the potatoes in the preheated oven for about 30-35 minutes or until they are golden brown and crispy on the outside, flipping them once halfway through.
6. Remove from the oven and let them cool slightly before serving.

These herb-roasted potatoes will have a delightful blend of flavors from Amato's secret stuff and the fresh herbs. Enjoy them as a side dish with you

## Second use case: incorporating recent news

To better appreciate the result, first look at the answer without RAG, this time focusing on incorporating the latest news into the non-parametric knowledge base.

In [62]:
prompt = HumanMessage(content="Who is the new president of the AI commission in Italy?")
res = chat(messages + [prompt])
print(res.content)

As an AI language model, I don't have real-time information. Therefore, I'm unable to provide you with the latest updates on the president of the AI commission in Italy. I recommend checking reliable news sources or the official government websites for the most accurate and up-to-date information.


The fact that Amato has left the commission, and the name of the newly appointed President, are recent information that the pre-trained model does not possess but that lies in the new knowledge base stored in the vector db.

In [80]:
def incorporate_news(query):
  """ function to augment the prompt with the latest news retrieved from the vector database """
  # here is where you retrieve the info from the vector database
  results = collection.get(where = {"type" : {'$eq':"latest_news"}})

  # get the text from the results
  source_knowledge = "\n\n".join(results['documents'])

  #feed into an augmented prompt
  augmented_prompt = f"""Using these recent news, answer the query.

  News: {source_knowledge}

  Query: {query}
  """
  return augmented_prompt

In [89]:
prompt = HumanMessage(content=incorporate_news("Who is the new president of the AI commission in Italy?"))
res = chat(messages + [prompt])
print(res.content)

Based on the provided news, the new president of the AI Commission in Italy is Father Paolo Benanti.


Consider that here I am manually adding the information to the knowledge base, but think about the value provided if the latest news were scraped programmatically or pulled from a public API. This kind of architecture would be able to update the knowledge base with minimal manual input (clearly, processing and storage costs would have to be properly managed as well).
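
A possible shape for that ingestion step is sketched below, under stated assumptions: how the headlines are fetched (scraper or API) is left out entirely, the function name `ingest_news` is hypothetical, and `FakeCollection` is a stand-in so the example runs without a Chroma client. With a real Chroma collection the same call would go to its `upsert` method, which to my knowledge accepts `ids`, `documents` and `metadatas`.

```python
import hashlib


def ingest_news(collection, headlines):
    """Hypothetical helper: upsert headlines into `collection`, using a hash of the
    text as a stable id so that re-ingesting the same headline does not duplicate it."""
    headlines = list(dict.fromkeys(headlines))  # drop exact duplicates, keep order
    if not headlines:
        return []
    ids = [hashlib.sha256(h.encode("utf-8")).hexdigest()[:16] for h in headlines]
    collection.upsert(
        ids=ids,
        documents=headlines,
        metadatas=[{"type": "latest_news"} for _ in headlines],
    )
    return ids


class FakeCollection:
    """Stand-in used only to show the call shape without a running Chroma client."""

    def __init__(self):
        self.store = {}

    def upsert(self, ids, documents, metadatas):
        for item_id, doc, meta in zip(ids, documents, metadatas):
            self.store[item_id] = (doc, meta)


fake = FakeCollection()
headline = "Father Paolo Benanti is the new President of the AI Commission for Information."
ingest_news(fake, [headline])
ingest_news(fake, [headline])  # same headline again: same id, so no duplicate entry
print(len(fake.store))  # prints 1
```

Content-derived ids are one design choice among several; they make repeated runs idempotent, at the cost of treating any edited headline as a new item.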