This notebook uses the code from Google's Quickstart guide for Gemini, a generative AI model.
The purpose of this notebook is to show the results of prompting with retrieval-augmented generation (RAG).

In [1]:
!pip install -U -q google-generativeai

In [2]:
import textwrap
import numpy as np
import pandas as pd

import google.generativeai as genai
import google.ai.generativelanguage as glm

# Used to securely store your API key
#from google.colab import userdata

from IPython.display import Markdown



Read the Google Gemini API key from the `G_API_KEY` environment variable.


In [3]:
import os
API_KEY=os.environ['G_API_KEY']

genai.configure(api_key=API_KEY)
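
`os.environ['G_API_KEY']` raises a bare `KeyError` when the variable is missing. A minimal sketch of a friendlier lookup is shown here; the helper name `read_api_key` and its `environ` parameter are hypothetical and not part of the Gemini SDK.

```python
import os

def read_api_key(var_name="G_API_KEY", environ=None):
    """Return the API key stored in `var_name`, or raise a descriptive error.

    `environ` defaults to os.environ; passing a plain dict makes the helper
    easy to try without touching the real environment.
    """
    env = os.environ if environ is None else environ
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Gemini API key."
        )
    return key
```

In the notebook, the result could then be passed on as `genai.configure(api_key=read_api_key())`.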

Let's list the available embedding models.

In [4]:
for m in genai.list_models():
  if 'embedContent' in m.supported_generation_methods:
    print(m.name)

models/embedding-001


In [5]:
model = 'models/embedding-001'

Let's build the embedding database.
I have used Capital One's publicly available Savings Account customer FAQs as the data.

In [6]:
DOCUMENT1 = {
    "title": "How and where can I open a savings account",
    "content": '''
    It’ll depend on the bank and account you choose. You can open a 360 Performance Savings account in about 5 minutes online, on your mobile device or at a Capital One location—no minimum balance required.
To open a no-minimum 360 Performance Savings account, have these things ready: your name, date of birth, mailing address, email, phone number, employment information, annual income, Social Security number and citizenship information. You’ll also need to agree to the terms and conditions. Based on varied factors, you may be denied for a 360 Performance Savings account. Upon approval, you’ll have 60 days to fund your account. Read more about the terms and conditions.
'''
}

In [7]:
DOCUMENT2 = {
    "title": "Can I open a 360 Performance Savings account at a Capital One location",
    "content": '''Yes, you can open a 360 Performance Savings account at a Capital One branch or Café. Since life doesn't just happen during business hours, you can also open a 360 Performance Savings account online or through the Capital One Mobile app.'''}

In [8]:
DOCUMENT3 = {
    "title": "Are savings accounts FDIC-insured",
    "content": '''
    Capital One’s 360 Performance Savings accounts are insured by the FDIC up to allowable limits.
    '''}

In [9]:
DOCUMENT4 = {
    "title": "Are savings accounts free",
    "content": '''
    It depends on your specific account.
Capital One’s 360 Performance Savings account has no monthly fees. You read that right. This is an online savings account with no fees and no minimum to open or keep the account. That means every dollar you earn is yours to save.
    '''}

In [10]:
documents = [DOCUMENT1,DOCUMENT2,DOCUMENT3,DOCUMENT4]

In [11]:

df = pd.DataFrame(documents)
df.columns = ['title', 'text']
df

Unnamed: 0,title,text
0,How and where can I open a savings account,\n It’ll depend on the bank and account you...
1,Can I open a 360 Performance Savings account a...,"Yes, you can open a 360 Performance Savings ac..."
2,Are savings accounts FDIC-insured,\n Capital One’s 360 Performance Savings ac...
3,Are savings accounts free,\n It depends on your specific account.\nCa...


Get the embeddings for each of these bodies of text and add them to the dataframe.
Set `task_type` to `retrieval_document`.

In [12]:
# Get the embeddings of each text and add to an embeddings column in the dataframe
def embed_fn(title, text):
  return genai.embed_content(model=model,
                             content=text,
                             task_type="retrieval_document",
                             title=title)["embedding"]

df['Embeddings'] = df.apply(lambda row: embed_fn(row['title'], row['text'][:9000]), axis=1)

Now that we have the embeddings, let's write a query to ask the model.
Set `task_type` to `retrieval_query`.

In [13]:

query = "info on fdic present"


model = 'models/embedding-001'

request = genai.embed_content(model=model,
                              content=query,
                              task_type="retrieval_query")

Use the `find_best_passage` function to calculate the dot product between the query embedding and each document embedding, and then pick the passage with the largest dot product value as the relevant passage from the database.

In [14]:
def find_best_passage(query, dataframe):
  """
  Compute the dot product between the query embedding and each document
  embedding in the dataframe, and return the text with the highest score.
  """
  query_embedding = genai.embed_content(model=model,
                                        content=query,
                                        task_type="retrieval_query")
  dot_products = np.dot(np.stack(dataframe['Embeddings']), query_embedding["embedding"])
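  # Debug output: this is the number of keys in the response dict (only "embedding"), not the vector length
  print(len(query_embedding))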
  idx = np.argmax(dot_products)
  return dataframe.iloc[idx]['text'] # Return text from index with max value

In [15]:
passage = find_best_passage(query, df)
#passage

1
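
The ranking step inside `find_best_passage` can be tried offline with toy vectors. The sketch below is only an illustration: the three-dimensional vectors and the helper name `rank_by_dot_product` are made up, and real embeddings have many more dimensions.

```python
import numpy as np

# Toy "document embeddings", one row per document (hypothetical values).
doc_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.6, 0.8, 0.0],
])
# Toy "query embedding" pointing in the same direction as document 1.
query_embedding = np.array([0.0, 1.0, 0.0])

def rank_by_dot_product(doc_matrix, query_vec):
    """Return document indices sorted from highest to lowest dot product."""
    scores = doc_matrix @ query_vec      # one score per document
    order = np.argsort(scores)[::-1]     # argsort is ascending, so reverse it
    return order, scores

order, scores = rank_by_dot_product(doc_embeddings, query_embedding)
best_idx = int(np.argmax(scores))        # same pick as find_best_passage
print(order, scores, best_idx)
```

`np.argmax` returns only the single best match, while the sorted `order` would also allow passing several top passages to the prompt.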


## Question and Answering Application

Let's try to use the text generation API to create a Q & A system. Input your own custom data below to create a simple question and answering example. You will still use the dot product as a metric of similarity.

In [16]:
def make_prompt(query, relevant_passage):
  escaped = relevant_passage.replace("'", "").replace('"', "").replace("\n", " ")
  prompt = textwrap.dedent("""You are a helpful and informative bot that answers questions using text from the reference passage included below. \
  Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. \
  However, you are talking to a non-technical audience, so be sure to break down complicated concepts and \
  strike a friendly and conversational tone. \
  If the passage is irrelevant to the answer, you may ignore it.
  QUESTION: '{query}'
  PASSAGE: '{relevant_passage}'

    ANSWER:
  """).format(query=query, relevant_passage=escaped)

  return prompt
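
The escaping step at the top of `make_prompt` can be checked in isolation. This is a small sketch; the helper name `escape_passage` and the sample strings are hypothetical, and the replacements are copied from `make_prompt`.

```python
def escape_passage(passage):
    """Flatten a passage the same way make_prompt does before formatting."""
    return passage.replace("'", "").replace('"', "").replace("\n", " ")

# ASCII quotes are dropped and the newline becomes a space.
sample = 'Capital One\'s "360" account\nhas no fees.'
flattened = escape_passage(sample)
print(flattened)

# A typographic apostrophe (’) is a different character and is left untouched.
curly = escape_passage("Capital One’s account")
print(curly)
```

Because only the ASCII quote characters are replaced, the typographic apostrophes in the FAQ passages above pass through to the prompt unchanged.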

In [17]:
prompt = make_prompt(query, passage)

Choose one of the Gemini content generation models in order to find the answer to your query.

In [18]:
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

models/gemini-1.0-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro-vision-latest
models/gemini-pro
models/gemini-pro-vision


In [19]:
# Use a separate name so the embedding model name stored in `model` is not overwritten
gen_model = genai.GenerativeModel('gemini-1.0-pro-latest')
answer = gen_model.generate_content(prompt)

In [20]:
Markdown(answer.text)

Yes, Capital One's 360 Performance Savings accounts are insured by the FDIC. This means the government guarantees your money in case Capital One fails.

Let's ask the model something totally irrelevant to check that it does not hallucinate an answer.

In [21]:
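query = 'how to do pipx aimli'
# Note: `passage` is still the one retrieved for the earlier FDIC query; no new retrieval is run here
prompt = make_prompt(query, passage)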

In [22]:
# Use a separate name so the embedding model name stored in `model` is not overwritten
gen_model = genai.GenerativeModel('gemini-1.0-pro-latest')
answer = gen_model.generate_content(prompt)

In [23]:
Markdown(answer.text)

I apologize, but the reference text provided does not contain any information on 'how to do pipx aimli'. Therefore, I cannot answer your question from the given context.
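
The irrelevant query above is answered against the passage retrieved for the earlier question. One possible extension, sketched here under stated assumptions, is to re-run retrieval for each query and skip the passage when the best score is low. The names `retrieve`, `toy_embed` and `min_score` are hypothetical, `toy_embed` is a stand-in for a real embedding call, and any real threshold would have to be calibrated against actual embedding scores.

```python
import numpy as np

def retrieve(query, texts, doc_embeddings, embed_query, min_score=None):
    """Return (best_text, score); best_text is None when score < min_score."""
    q = np.asarray(embed_query(query), dtype=float)
    scores = np.asarray(doc_embeddings, dtype=float) @ q
    idx = int(np.argmax(scores))
    score = float(scores[idx])
    if min_score is not None and score < min_score:
        return None, score
    return texts[idx], score

def toy_embed(text):
    """Stand-in embedder: flags whether two keywords occur in the text."""
    words = text.lower().split()
    return [float("fdic" in words), float("fees" in words)]

texts = ["Accounts are insured by the FDIC.", "The account has no monthly fees."]
doc_embeddings = [[1.0, 0.0], [0.0, 1.0]]  # toy vectors matching toy_embed

hit = retrieve("info on fdic present", texts, doc_embeddings, toy_embed, min_score=0.5)
miss = retrieve("how to do pipx aimli", texts, doc_embeddings, toy_embed, min_score=0.5)
print(hit, miss)
```

In the notebook, `embed_query` would wrap `genai.embed_content` with `task_type="retrieval_query"` and return the `"embedding"` field of the response.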