<!--- header table --->
<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Ranking/Vertex%20AI%20Agent%20Builder%20Ranking%20API.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo">
      <br>Run in<br>Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fstatmike%2Fvertex-ai-mlops%2Fmain%2FApplied%2520GenAI%2FRanking%2FVertex%2520AI%2520Agent%2520Builder%2520Ranking%2520API.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo">
      <br>Run in<br>Colab Enterprise
    </a>
  </td>      
  <td style="text-align: center">
    <a href="https://github.com/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Ranking/Vertex%20AI%20Agent%20Builder%20Ranking%20API.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      <br>View on<br>GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/Applied%20GenAI/Ranking/Vertex%20AI%20Agent%20Builder%20Ranking%20API.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      <br>Open in<br>Vertex AI Workbench
    </a>
  </td>
</table>

# Vertex AI Agent Builder Ranking API

- https://cloud.google.com/generative-ai-app-builder/docs/builder-apis
- https://cloud.google.com/generative-ai-app-builder/docs/ranking

---
## Colab Setup

To run this notebook in Colab, run the cells in this section.  Otherwise, skip this section.

The cells below set the project ID and authenticate to GCP (follow the prompts in the popup).

In [1]:
PROJECT_ID = 'statmike-mlops-349915' # replace with project ID

In [2]:
try:
    import google.colab
    from google.colab import auth
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
except Exception:
    pass
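
The cell above relies on the `google.colab` import failing outside Colab. As a minimal sketch of a more explicit check, the helper below (a hypothetical name, not part of this notebook) looks for the module among those already loaded; it assumes Colab kernels preload `google.colab`, which is a common convention rather than a guarantee.

```python
import sys

def running_in_colab() -> bool:
    """Return True when the google.colab module is already loaded in this kernel."""
    # Assumption: Colab kernels preload google.colab; other environments normally do not.
    return "google.colab" in sys.modules

if running_in_colab():
    print("Colab detected: run the authentication cell in this section.")
else:
    print("Not in Colab: this section can be skipped.")
```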

---
## Installs

The list `packages` contains tuples of package import names, install names, and an optional minimum version.  If the import name is not found, or the installed version is below the minimum, then the install name is used to install quietly for the current user.

In [3]:
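# tuples of (import name, install name, optional minimum version)
packages = [
    ('google.cloud.aiplatform', 'google-cloud-aiplatform', '1.66.0'),
    ('google.cloud.discoveryengine', 'google-cloud-discoveryengine')
]

import importlib.util, importlib.metadata
# packaging compares versions numerically rather than as strings;
# it is assumed to be present, as it is in most notebook environments
from packaging.version import Version

install = False
for package in packages:
    if not importlib.util.find_spec(package[0]):
        print(f'installing package {package[1]}')
        install = True
        !pip install {package[1]} -U -q --user
    elif len(package) == 3:
        # look up the installed version by its distribution (install) name
        if Version(importlib.metadata.version(package[1])) < Version(package[2]):
            print(f'updating package {package[1]}')
            install = True
            !pip install {package[1]} -U -q --user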
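
Two details of the check above are easy to get wrong, so here is a small sketch of both. Version strings do not compare correctly as plain text, and `find_spec` can raise rather than return `None` when the parent of a dotted name is missing. The helper names `version_tuple` and `is_importable` are hypothetical, and `version_tuple` only handles plain `X.Y.Z` strings; for pre-release tags such as `1.66.0rc1`, `packaging.version.Version` is the more general tool.

```python
import importlib.util

def version_tuple(version: str) -> tuple:
    """Convert a plain 'X.Y.Z' release string to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def is_importable(import_name: str) -> bool:
    """Return True if the module can be located."""
    try:
        return importlib.util.find_spec(import_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False

# Text comparison goes character by character, so '1' < '6' decides the result.
print("1.100.0" < "1.66.0")                                  # True (misleading)
# Numeric comparison treats 100 and 66 as integers.
print(version_tuple("1.100.0") < version_tuple("1.66.0"))    # False (correct)
print(is_importable("json"), is_importable("no_such_pkg_abc.sub"))
```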

### API Enablement

In [4]:
!gcloud services enable aiplatform.googleapis.com
!gcloud services enable discoveryengine.googleapis.com

### Restart Kernel (If Installs Occurred)

After the kernel restarts, continue from the cell that follows this one; the earlier cells do not need to be rerun.

In [5]:
if install:
    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
    IPython.display.display(IPython.display.Markdown("""<div class=\"alert alert-block alert-warning\">
        <b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. The previous cells do not need to be run again⚠️</b>
        </div>"""))

---
## Setup

inputs:

In [6]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [7]:
REGION = 'us-central1'
SERIES = 'applied-genai'
EXPERIMENT = 'evaluation-check-grounding'

packages:

In [49]:
import os, shutil, json

import numpy as np

import google.cloud.discoveryengine_v1 as discoveryengine
from google.cloud import aiplatform
import vertexai.generative_models # for Gemini Models
import vertexai.language_models # for text embedding models

In [24]:
aiplatform.__version__

'1.66.0'

In [25]:
discoveryengine.__version__

'0.12.1'

clients:

In [26]:
# Vertex AI
vertexai.init(project = PROJECT_ID, location = REGION)

# Vertex AI Agent Builder APIs
ranker = discoveryengine.RankServiceClient()

---
## Text & Embeddings For Examples

This repository contains a [section for document processing (chunking)](../Chunking/readme.md) that includes an [example of processing a PDF with the Document AI Layout Parser](../Chunking/Process%20Documents%20-%20Document%20AI%20Layout%20Parser.ipynb).  The chunks of text from that workflow are stored with this repository and loaded by another companion workflow that augments the chunks with text embeddings: [Vertex AI Text Embeddings API](../Embeddings/Vertex%20AI%20Text%20Embeddings%20API.ipynb).

The following code will load the version of the chunks that includes text embeddings and prepare it for a local example of retrieval augmented generation.

### Get The Documents

If you are working from a clone of this notebook's [repository](https://github.com/statmike/vertex-ai-mlops) then the documents are already present. The following cell checks for the documents folder and, if it is missing, retrieves it with `git clone`:

In [33]:
local_dir = '../Embeddings/files/embeddings-api'

In [34]:
if not os.path.exists(local_dir):
    print('Retrieving documents...')
    parent_dir = os.path.dirname(local_dir)
    temp_dir = os.path.join(parent_dir, 'temp')
    if not os.path.exists(temp_dir):
        os.makedirs(temp_dir)
    !git clone https://www.github.com/statmike/vertex-ai-mlops {temp_dir}/vertex-ai-mlops
    shutil.copytree(f'{temp_dir}/vertex-ai-mlops/Applied GenAI/Embeddings/files/embeddings-api', local_dir)
    shutil.rmtree(temp_dir)
    print(f'Documents are now in folder `{local_dir}`')
else:
    print(f'Documents Found in folder `{local_dir}`')             

Documents Found in folder `../Embeddings/files/embeddings-api`


### Load The Chunks

In [35]:
with open(local_dir+'/chunk-embeddings.jsonl', 'r') as f:
    chunks = [json.loads(line) for line in f]
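
To make the file layout concrete, here is a small sketch that parses JSON Lines text with the same nesting as the chunks reviewed in the next section. The two records are made up for illustration: the ids, text, vectors, and the empty `status` value are placeholders, not data from `chunk-embeddings.jsonl`.

```python
import json

# Two made-up records that mimic the nesting of the chunk file:
# one JSON object per line, each with instance, predictions, and status.
sample_jsonl = "\n".join([
    json.dumps({
        "instance": {"chunk_id": "demo-1", "content": "First chunk of text."},
        "predictions": [{"embeddings": {"values": [0.1, 0.2, 0.3]}}],
        "status": "",
    }),
    json.dumps({
        "instance": {"chunk_id": "demo-2", "content": "Second chunk of text."},
        "predictions": [{"embeddings": {"values": [0.4, 0.5, 0.6]}}],
        "status": "",
    }),
])

# Parse one record per non-empty line.
records = [json.loads(line) for line in sample_jsonl.splitlines() if line.strip()]

for record in records:
    chunk_id = record["instance"]["chunk_id"]
    values = record["predictions"][0]["embeddings"]["values"]
    print(chunk_id, len(values))
```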

### Review A Chunk

In [39]:
chunks[0].keys()

dict_keys(['instance', 'predictions', 'status'])

In [47]:
chunks[0]['instance']['chunk_id']

'c2'

In [54]:
print(chunks[0]['instance']['content'])

# OFFICIAL BASEBALL RULES

## Official Baseball Rules 2023 Edition

### JOINT COMPETITION COMMITTEE

|-|-|-|
| Bill DeWitt | Whit Merrifield | Austin Slater |
| Jack Flaherty | Bill Miller | John Stanton, Chair |
| Tyler Glasnow | Dick Monfort | Tom Werner |
| Greg Johnson | Mark Shapiro |  |

Committee Secretary Paul V. Mifsud, Jr. Copyright © 2023 by the Office of the Commissioner of Baseball


In [45]:
chunks[0]['predictions'][0]['embeddings']['values'][0:10]

[0.008681542240083218,
 0.06999468058347702,
 0.003673204220831394,
 0.019888797774910927,
 0.016285404562950134,
 0.035664502531290054,
 0.06200747936964035,
 0.05597030743956566,
 0.0034793149679899216,
 -0.024485772475600243]

---
## Simple Retrieval Augmented Generation (RAG)

Embeddings can be compared mathematically to measure similarity.  For a deeper look, check out the companion workflow: [The Math of Similarity](./The%20Math%20of%20Similarity.ipynb).  Retrieval systems handle the storage and math of similarity as a service.  For an overview of Google Cloud based solutions for retrieval, check out [this companion series](../Retrieval/readme.md).

The content below motivates retrieval with the embeddings that accompany the text chunks, using a local vector database with brute force matching in NumPy.
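
As a rough illustration of measuring similarity with math, the sketch below computes the dot product and cosine similarity for tiny hand-made vectors. The vectors and helper names are illustrative assumptions, not embeddings from this notebook. It also shows that the two measures coincide once vectors are scaled to unit length.

```python
import math

def dot(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

query = [3.0, 4.0]
doc_close = [6.0, 8.0]   # same direction as the query
doc_far = [4.0, -3.0]    # perpendicular to the query

print(cosine_similarity(query, doc_close))   # close to 1: same direction
print(cosine_similarity(query, doc_far))     # 0: unrelated direction
# For unit-length vectors the plain dot product matches cosine similarity.
print(dot(normalize(query), normalize(doc_close)))
```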

### Vector DB With NumPy

In [48]:
vector_db = [
    [
        chunk['instance']['chunk_id'],
        chunk['instance']['content'],
        chunk['predictions'][0]['embeddings']['values'],
    ]
    for chunk in chunks
]
vector_index = np.array([row[2] for row in vector_db])

### Models: Embeddings, Generation

Connect to models for text embeddings and text generation:

In [50]:
embedder = vertexai.language_models.TextEmbeddingModel.from_pretrained('text-embedding-004')
llm = vertexai.generative_models.GenerativeModel("gemini-1.5-flash-001")

Define a question that is the start of our prompt to the LLM:

In [55]:
question = "What are the dimensions of a base?"

Get an ungrounded response to the question with the LLM:

In [56]:
print(llm.generate_content(question).text)

Please provide me with more context!  "Base" can refer to many different things, each with its own dimensions. 

For example, are you talking about:

* **The base of a geometric shape?**  
    * In this case, the dimensions would be the length and width of the base.  
    * For example, a rectangle might have a base of 10cm and a height of 5cm.
* **The base of a number system?**
    * The base of a number system is the number of unique digits it uses. For example, the base of the decimal system is 10 (0-9).
* **The base of a container?**
    * The dimensions would be the length, width, and height of the base.
* **The base of a word?**
    * The base of a word is its root meaning, without any prefixes or suffixes.
* **A military base?** 
    * The dimensions would be the area of the base and its geographical location. 

Please clarify what you mean by "base" so I can give you the specific dimensions you need. 



Get an embedding for the question to use in retrieval:

In [59]:
question_embedding = embedder.get_embeddings([question])[0].values
question_embedding[0:10]

[-0.026682045310735703,
 0.011593513190746307,
 0.028523651883006096,
 -0.0017065361607819796,
 0.01946176588535309,
 0.0031198114156723022,
 0.07915323227643967,
 -0.005078596994280815,
 -0.006295712199062109,
 0.04943541809916496]

### Retrieval: Matching With NumPy

Use the dot product to calculate similarity and find matches for a query embedding.  Why the dot product?  Check out the companion workflow: [The Math of Similarity](./The%20Math%20of%20Similarity.ipynb)

> **NOTE:**  This calculates the similarity against every embedding vector stored in the local vector DB, which is just a NumPy array here.  This is very fast because the index is small.  At larger scale, consider a solution that searches only a subset of the embeddings.  More details on retrieval solutions can be found in [Retrieval](../Retrieval/readme.md).

In [62]:
similarity = np.dot(question_embedding, vector_index.T)
matches = np.argsort(similarity)[::-1][:5].tolist()
matches = [(match, similarity[match]) for match in matches]
matches

[(38, 0.5843799337008113),
 (36, 0.5724333016720691),
 (836, 0.5244194362041271),
 (40, 0.5126844935129918),
 (26, 0.5033481946111171)]
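
The same brute-force pattern can be tried on synthetic data without calling any API. The sketch below uses random unit vectors as a stand-in for the embedding index; the sizes, the seed, and the choice of row 42 as the query are arbitrary assumptions. It also shows `np.argpartition` as one way to avoid sorting every score. This still scores every vector, so it is not a substitute for the subset-searching retrieval services mentioned in the note above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy index: 1000 random vectors of dimension 8, scaled to unit length.
index = rng.normal(size=(1000, 8))
index /= np.linalg.norm(index, axis=1, keepdims=True)

# Use a stored vector as the query so the best match is known in advance.
query = index[42]

scores = index @ query   # one dot product per stored vector
k = 5

# Option 1: sort all scores, then keep the k largest.
top_sorted = np.argsort(scores)[::-1][:k]

# Option 2: argpartition moves the k largest scores into the last k slots
# (in no particular order), so only those k need to be sorted.
candidates = np.argpartition(scores, -k)[-k:]
top_partial = candidates[np.argsort(scores[candidates])[::-1]]

print(top_sorted.tolist())
print(top_partial.tolist())
```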

In [73]:
for m, match in enumerate(matches):
    print(f"Match {m+1} ({match[1]:.2f}) is chunk: {vector_db[match[0]][0]}:\n{vector_db[match[0]][1]}\n###################################################")

Match 1 (0.58) is chunk: c38:
# 2.00-THE PLAYING FIELD

## 2.02 Home Base

Home base shall be marked by a five-sided slab of whitened rubber. It shall be a 17-inch square with two of the corners removed so that one edge is 17 inches long, two adjacent sides are 8\frac{1}{2} inches and the remaining two sides are 12 inches and set at an angle to make a point.
###################################################
Match 2 (0.57) is chunk: c39:
# 2.00-THE PLAYING FIELD

## 2.02 Home Base

It shall be set in the ground with the point at the intersection of the lines extending from home base to first base and to third base; with the 17-inch edge facing the pitcher's plate, and the two 12-inch edges coinciding with the first and third base lines. The top edges of home base shall be beveled and the base shall be fixed in the ground level with the ground surface. (See drawing D in Appendix 2.) 3
###################################################
Match 3 (0.52) is chunk: c838:
# APPENDICES

## Ap

### Generation: Q&A With Gemini Grounded With RAG

Provide the matched chunks of text along with the question as a prompt to a generative model for a grounded answer.

#### Prompt Building Function

Use the matching chunks as context for the prompt:

In [82]:
def get_prompt(question, top_n = 5):
    # get embedding for question
    question_embedding = embedder.get_embeddings([question])[0].values
    # get top_n matches:
    similarity = np.dot(question_embedding, vector_index.T)
    matches = np.argsort(similarity)[::-1][:top_n].tolist()
    matches = [(match, similarity[match]) for match in matches]
    # construct prompt:
    prompt = ''
    for m, match in enumerate(matches):
        prompt += f"Context {m+1}:\n{vector_db[match[0]][1]}\n\n"
    prompt += f'Answer the following question using the provided contexts:\n{question}'
    
    return prompt
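
The formatting step can also be separated from retrieval so it can be checked without calling the embedding model. The sketch below is a hypothetical helper, `build_prompt`, that takes already-retrieved text and produces the same layout as `get_prompt`; the context strings are placeholders.

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    """Number each context, then append the instruction and the question."""
    parts = [f"Context {i}:\n{text}\n\n" for i, text in enumerate(contexts, start=1)]
    parts.append(f"Answer the following question using the provided contexts:\n{question}")
    return "".join(parts)

# Placeholder passages standing in for retrieved chunks.
demo_contexts = ["First retrieved passage.", "Second retrieved passage."]
demo_prompt = build_prompt("What are the dimensions of a base?", demo_contexts)
print(demo_prompt)
```

Keeping the string assembly in its own function makes it straightforward to swap in a different retrieval step later, such as a reranked list of chunks.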

In [83]:
print(get_prompt(question))

Context 1:
# 2.00-THE PLAYING FIELD

## 2.02 Home Base

Home base shall be marked by a five-sided slab of whitened rubber. It shall be a 17-inch square with two of the corners removed so that one edge is 17 inches long, two adjacent sides are 8\frac{1}{2} inches and the remaining two sides are 12 inches and set at an angle to make a point.

Context 2:
# 2.00-THE PLAYING FIELD

## 2.02 Home Base

It shall be set in the ground with the point at the intersection of the lines extending from home base to first base and to third base; with the 17-inch edge facing the pitcher's plate, and the two 12-inch edges coinciding with the first and third base lines. The top edges of home base shall be beveled and the base shall be fixed in the ground level with the ground surface. (See drawing D in Appendix 2.) 3

Context 3:
# APPENDICES

## Appendix 2

Diagram No. 2 Layout at Home Plate, 1st, 2nd, and 3rd Bases 18" A 18" 90° LAYOUT AT SECOND BASE FOR LAYOUT AT PITCHER'S PLATE SEE DIAGRAM NO. 3 90° 6"

### Grounded Generation

In [84]:
print(llm.generate_content(get_prompt(question)).text)

The answer depends on which type of base you are asking about:

* **Home Base:** It is a five-sided slab of whitened rubber, with dimensions:
    * One edge: 17 inches long
    * Two adjacent sides: 8 1/2 inches each
    * Remaining two sides: 12 inches each, set at an angle to make a point.
* **First, Second, and Third Bases:** These are white canvas or rubber-covered bags, with dimensions:
    * 18 inches square
    * 3 to 5 inches thick

Therefore, the dimensions of a base depend on whether it is the home base or the other bases. 



---
## Ranking API