## Retrieval Augmented Generation (RAG)

Note: Large Language Models have two main limitations:
    i) They only know the information they were trained on.
    ii) They have a limited context window, i.e. a limit on the length of the input they can take into account at once.

- A technique called Retrieval Augmented Generation (RAG) addresses both limitations. It lets us integrate
external or domain-specific knowledge into the information the LLM works from.

A typical RAG system has three stages:
    - Indexing - This happens ahead of time and allows you to quickly look up relevant information at query time.
    - Retrieval - Given the user's prompt, you retrieve relevant documents from your external data source; these are later supplied to the LLM.
    - Generation - Use the LLM to generate a tailored answer in natural language using the supplied information.

This allows you to provide information that the model hasn't seen before, such as product-specific knowledge or live weather updates. 
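
The three stages can be sketched without any external service. The example below is a toy illustration only: it assumes a bag-of-words count vector as a stand-in for a real embedding model, and the names `embed`, `cosine`, `corpus`, `index`, `retrieve` and `build_prompt` are hypothetical helpers, not part of the Gemini or Chroma APIs.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Indexing: embed every document ahead of time.
corpus = [
    "Turn the temperature knob clockwise to make the cabin warmer.",
    "Touch the Music icon on the touchscreen to play songs.",
    "Move the shift lever to Drive to go forward.",
]
index = [(doc, embed(doc)) for doc in corpus]

# 2. Retrieval: embed the query and pick the most similar document.
def retrieve(query: str) -> str:
    q = embed(query)
    return max(index, key=lambda item: cosine(q, item[1]))[0]

# 3. Generation: build the prompt that would be sent to the LLM.
def build_prompt(query: str) -> str:
    return f"QUESTION: {query}\nPASSAGE: {retrieve(query)}"

print(build_prompt("How do I play music on the touchscreen?"))
```

In the rest of this notebook, the Gemini embedding model takes the place of `embed`, the Chroma collection takes the place of `index`, and the prompt is passed to a Gemini model for the generation step.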

### Install ChromaDB - an open-source vector database

In [4]:
# pip install chromadb

In [5]:
import google.generativeai as genai 
from IPython.display import Markdown

In [6]:
# Read the API key from the environment instead of hard-coding it in the notebook.
import os
GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]
genai.configure(api_key=GOOGLE_API_KEY)

- We will use the Gemini API's embedContent method to calculate embeddings. First, let's find a model that supports embedContent:

In [11]:
for model in genai.list_models():
    if "embedContent" in model.supported_generation_methods:
        print(model.name)

models/embedding-001
models/text-embedding-004


- We will use models/text-embedding-004.

### Data
Below is a small set of documents that will be used to build the embedding database.

In [13]:
DOCUMENT1 = "Operating the Climate Control System  Your Googlecar has a climate control system that allows you to adjust the temperature and airflow in the car. To operate the climate control system, use the buttons and knobs located on the center console.  Temperature: The temperature knob controls the temperature inside the car. Turn the knob clockwise to increase the temperature or counterclockwise to decrease the temperature. Airflow: The airflow knob controls the amount of airflow inside the car. Turn the knob clockwise to increase the airflow or counterclockwise to decrease the airflow. Fan speed: The fan speed knob controls the speed of the fan. Turn the knob clockwise to increase the fan speed or counterclockwise to decrease the fan speed. Mode: The mode button allows you to select the desired mode. The available modes are: Auto: The car will automatically adjust the temperature and airflow to maintain a comfortable level. Cool: The car will blow cool air into the car. Heat: The car will blow warm air into the car. Defrost: The car will blow warm air onto the windshield to defrost it."
DOCUMENT2 = 'Your Googlecar has a large touchscreen display that provides access to a variety of features, including navigation, entertainment, and climate control. To use the touchscreen display, simply touch the desired icon.  For example, you can touch the "Navigation" icon to get directions to your destination or touch the "Music" icon to play your favorite songs.'
DOCUMENT3 = "Shifting Gears Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions."

documents = [DOCUMENT1, DOCUMENT2, DOCUMENT3]

### Creating the embedding database with ChromaDB

Create a custom function to generate embeddings with the Gemini API. In this task, you are implementing a retrieval system, so the task_type for generating the document embeddings is retrieval_document. Later, you will use retrieval_query for the query embeddings. Check out the API reference for the full list of supported tasks.

In [18]:
from chromadb import Documents, EmbeddingFunction, Embeddings
from google.api_core import retry

class GeminiEmbeddingFunction(EmbeddingFunction):
    # Specify whether to generate embeddings for documents or for queries.
    document_mode = True

    def __call__(self, input: Documents) -> Embeddings:
        if self.document_mode: 
            embedding_task = "retrieval_document"
        else:
            embedding_task = "retrieval_query"

        # Retry requests that fail with a transient error.
        retry_policy = {"retry": retry.Retry(predicate=retry.if_transient_error)}

        response = genai.embed_content(
            model="models/text-embedding-004",
            content=input,
            task_type=embedding_task,
            request_options=retry_policy,
        )
        return response["embedding"]
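
The retry_policy above asks the client to re-issue a request when it fails with a transient error. As a conceptual illustration only (this is not how google.api_core implements it), retrying with a growing delay can be sketched in plain Python. `TransientError`, `call_with_retry` and `flaky_embed` are hypothetical names introduced for this sketch.

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for a temporary failure, e.g. a 503 response."""

def call_with_retry(fn, max_attempts=4, initial_delay=0.01, multiplier=2.0):
    """Call fn(); on TransientError wait, grow the delay, and try again."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(delay)
            delay *= multiplier

# Hypothetical flaky call: fails twice, then succeeds.
calls = {"count": 0}

def flaky_embed():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("service temporarily unavailable")
    return [0.1, 0.2, 0.3]

vector = call_with_retry(flaky_embed)
print(vector, "after", calls["count"], "attempts")
```

Non-transient errors, such as an invalid API key, are not caught here and fail immediately, which is the behaviour you usually want from a retry policy.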

Now we create a Chroma database that uses the GeminiEmbeddingFunction and populate it with the documents defined above.

In [69]:
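# Optional: drop the existing collection when re-running the notebook.
# chroma_client and DB_NAME are defined in the next cell, so run this only after that cell has been executed once.
# chroma_client.delete_collection(DB_NAME)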


In [70]:
import chromadb
DB_NAME = "googlecardb"
embed_fn = GeminiEmbeddingFunction()
# from chromadb.utils import embedding_functions
# embed_fn = embedding_functions.DefaultEmbeddingFunction()
# embed_fn.document_mode = True

chroma_client = chromadb.Client()
db = chroma_client.get_or_create_collection(name=DB_NAME, embedding_function=embed_fn)
db.add(documents=documents, ids=[str(i) for i in range(len(documents))])

In [71]:
db.count()

3

In [25]:
# # Select the collection
# collection = chroma_client.get_collection(DB_NAME)
# # Fetch all documents
# xx = collection.get(include=["metadatas", "embeddings", "documents"])
# print(xx)

### Retrieval: Find relevant documents 
To search the Chroma database, call the query method. Before querying, switch the embedding function to retrieval_query mode so that the question is embedded as a query rather than as a document.

In [72]:
# Switch to query mode when generating embeddings.
embed_fn.document_mode = False

# Search the Chroma DB using the specified query.
query = "How do you use the touch screen to play music?"

result = db.query(query_texts=[query], n_results=1)
[[passage]] = result['documents']
Markdown(passage)

Your Googlecar has a large touchscreen display that provides access to a variety of features, including navigation, entertainment, and climate control. To use the touchscreen display, simply touch the desired icon.  For example, you can touch the "Navigation" icon to get directions to your destination or touch the "Music" icon to play your favorite songs.
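
Conceptually, the query step embeds the question and returns the stored documents whose embeddings lie closest to it. The sketch below shows that idea with a brute-force scan over hand-made 3-dimensional vectors and squared Euclidean distance. The vectors, the `store` dictionary and the `top_k` helper are illustrative assumptions; real embeddings have hundreds of dimensions, and a vector database typically uses an approximate nearest-neighbour index rather than scanning every entry.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical stored embeddings, keyed by document id.
store = {
    "0": [0.9, 0.1, 0.0],  # climate control
    "1": [0.1, 0.9, 0.1],  # touchscreen
    "2": [0.0, 0.2, 0.9],  # shifting gears
}

def top_k(query_vec, k=1):
    """Return the ids of the k stored vectors closest to query_vec."""
    ranked = sorted(store, key=lambda doc_id: squared_l2(query_vec, store[doc_id]))
    return ranked[:k]

query_vec = [0.2, 0.8, 0.0]  # hypothetical embedding of the question
print(top_k(query_vec, k=2))  # ['1', '0']
```

Raising `k` here plays the same role as raising n_results in db.query: more candidates are returned, ordered from closest to farthest.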

### Augmented Generation: Answer the question

Now that you have found a relevant passage in the set of documents (the retrieval step), you can assemble a generation prompt and have the Gemini API generate a final answer. Note that in this example only a single passage was retrieved. In practice, especially when the underlying data set is large, you will want to retrieve more than one result and let the Gemini model determine which passages are relevant to the question. For this reason it's OK if some retrieved passages are not directly related to the question: the generation step should ignore them.

In [67]:
passage_oneline = passage.replace("\n", " ")
query_oneline = query.replace("\n", " ")

# This prompt is where you can specify any guidance on tone, or what topics the model should stick to, or avoid.
prompt = f"""You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: {query_oneline}
PASSAGE: {passage_oneline}
"""
print(prompt)

You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: How do you use the touch screen to play music?
PASSAGE: Your Googlecar has a large touchscreen display that provides access to a variety of features, including navigation, entertainment, and climate control. To use the touchscreen display, simply touch the desired icon.  For example, you can touch the "Navigation" icon to get directions to your destination or touch the "Music" icon to play your favorite songs.
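
The prompt above contains a single passage. When several passages are retrieved, one possible way to assemble the prompt is sketched below. `build_multi_passage_prompt`, the shortened instruction text and the sample passages are assumptions made for this illustration, not part of the original notebook.

```python
def build_multi_passage_prompt(query: str, passages: list[str]) -> str:
    """Assemble one prompt containing the question and every retrieved passage."""
    query_oneline = query.replace("\n", " ")
    lines = [
        "Answer the question using the reference text included below.",
        "Ignore any reference text that is not relevant to the question.",
        "",
        f"QUESTION: {query_oneline}",
    ]
    # Number the passages so that each one stays clearly separated in the prompt.
    for number, passage in enumerate(passages, start=1):
        passage_oneline = passage.replace("\n", " ")
        lines.append(f"PASSAGE {number}: {passage_oneline}")
    return "\n".join(lines)

demo_prompt = build_multi_passage_prompt(
    "How do I\nplay music?",
    ["Touch the Music icon on the touchscreen.", "Move the shift lever to Drive."],
)
print(demo_prompt)
```

With the collection used in this notebook, the list of passages would come from a query with n_results greater than 1, where result['documents'][0] holds the passages returned for the first (and only) query text.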



- Now we can use the generate_content method to generate an answer to the question.

In [61]:
model = genai.GenerativeModel("gemini-1.5-flash-latest")
answer = model.generate_content(prompt)
Markdown(answer.text)

To play music on your Googlecar's touchscreen, simply touch the "Music" icon on the main display; it's that easy!  The touchscreen gives you access to lots of other things too, like navigation and climate control, all through simple taps on the screen.
