In [1]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Generative AI Knowledge Base model predictions

To run this notebook, make sure you have uploaded at least one document into your knowledge base.

> ⭐️ If you haven't, follow the [**Uploading documents and query model** tutorial](https://console.cloud.google.com/products/solutions/deployments?walkthrough_id=panels--sic--generative-ai-knowledge-base_toc).

Before you begin, make sure all the dependencies are installed.

In [2]:
!pip install google-cloud-aiplatform google-cloud-firestore



In [15]:
!pip install google-api-python-client



In [27]:
!gcloud services enable aiplatform.googleapis.com

## Overview

A **Large Language Model (LLM)** can be very good at answering general questions.
But on its own, it might not do as well at answering questions about your documents.

The LLM answers only from what it learned from its _training dataset_.
Your documents might include information or words that weren't in that dataset.
Or they might use familiar words in a different or more specialized context.

This is where **Vector Search** comes into play.
Each time you upload a document, the Cloud Function webhook processes it.
When a document is processed, each individual page is _indexed_.
This lets us find not only the relevant documents, but also the specific pages within them.

The relevant pages can then be used as _context_ for the LLM to answer the question.
This _grounds_ the model to answer questions based on the documents only.
Without this, the model might give wrong answers, or _hallucinations_.
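
The flow described above can be sketched with plain Python callables. This is only an illustration: `answer_with_context` and the four stand-in functions are hypothetical names, not part of this notebook or of any Google Cloud library. The real steps appear in the sections that follow.

```python
from typing import Callable


def answer_with_context(
    question: str,
    embed: Callable[[str], list[float]],
    find_page: Callable[[list[float]], tuple[str, int]],
    fetch_text: Callable[[str, int], str],
    ask_model: Callable[[str, str], str],
) -> str:
    """Retrieve the most relevant page, then ask the model grounded on it."""
    embedding = embed(question)                    # 1. question -> embedding
    filename, page_number = find_page(embedding)   # 2. embedding -> closest page
    context = fetch_text(filename, page_number)    # 3. page -> text
    return ask_model(context, question)            # 4. text + question -> answer


# Toy stand-ins (hypothetical) so the sketch runs without any cloud resources.
pages = {("paper.pdf", 0): "Lexical functions relate words to their collocates."}
answer = answer_with_context(
    "What are LFs?",
    embed=lambda text: [float(len(text))],
    find_page=lambda embedding: ("paper.pdf", 0),
    fetch_text=lambda filename, page: pages[(filename, page)],
    ask_model=lambda context, question: f"Based on the page: {context}",
)
print(answer)
```

Passing each step in as a function keeps the outline independent of any particular service, which makes it easy to swap the toy stand-ins for the real calls later.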

## My Google Cloud resources

Fill in your project ID, the
[Google Cloud location](https://cloud.google.com/about/locations)
you want to use, and your
Vector Search index endpoint ID.
If you followed the tutorial, the deployed index ID should be `deployed_index`; otherwise, change it to the ID you chose.

You can find your Vector Search index endpoint ID in the [Index endpoints tab](https://console.cloud.google.com/vertex-ai/matching-engine/index-endpoints).

> 💡 The Vector Search index endpoint ID is a long number, such as `1234567890123456789`.

Run the following cell to set up your resources and authenticate to your account.

In [1]:
# @title
from google.colab import auth

project_id = "micro-bus-386619" # @param {type:"string"}
location = "us-central1" # @param {type:"string"}
index_endpoint_id = "6241879132372729856" # @param {type:"string"}
deployed_index_id = "deployed_index" # @param {type:"string"}

auth.authenticate_user(project_id=project_id)

The first step is to initialize the Vertex AI client library using the location of your choice.

In [2]:
import vertexai
from google.cloud import aiplatform

vertexai.init(location=location)
aiplatform.init(location=location)

## Get text embeddings

You can use the Gecko model to get embeddings from text.
For more information, see the
[Get text embeddings](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings)
page.

In [3]:
from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

def get_text_embedding(text: str) -> list[float]:
    task = 'RETRIEVAL_DOCUMENT'
    model = TextEmbeddingModel.from_pretrained("textembedding-gecko")
    return model.get_embeddings([TextEmbeddingInput(text, task)])[0].values


# Convert the question into an embedding.
question = "What are LFs and why are they useful?"
question_embedding = get_text_embedding(question)
print(f"Embedding dimensions: {len(question_embedding)}")

Embedding dimensions: 768


## Find document context

All the documents you have processed have been indexed into your Vector Search index.
You can query for the closest embeddings to a given embedding from your Vector Search index endpoint.
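
Conceptually, finding the closest embedding means comparing the question's vector with every indexed vector and keeping the best match. The brute-force sketch below is an illustration only: it assumes cosine similarity as the distance measure (the measure your index actually uses depends on how it was created), and the three-dimensional vectors and the `nearest_point` helper are made up. Vector Search does this approximately and at a much larger scale.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_point(query: list[float], index: dict[str, list[float]]) -> str:
    """Return the ID of the indexed vector most similar to the query."""
    return max(index, key=lambda point_id: cosine_similarity(query, index[point_id]))


# Made-up index: point IDs follow the "filename:page" pattern used in this notebook.
toy_index = {
    "paper.pdf:0": [1.0, 0.0, 0.0],
    "paper.pdf:1": [0.0, 1.0, 0.0],
    "notes.pdf:0": [0.6, 0.8, 0.0],
}
closest = nearest_point([0.1, 0.9, 0.0], toy_index)
print(closest)
```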

> 💡 If you haven't processed any documents yet, you won't get any results.
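
An empty result can be handled explicitly instead of by catching exceptions. The helper below is a sketch, assuming the response has the `nearestNeighbors`, `neighbors`, `id` nesting that `find_document` reads below and that point IDs look like `filename:page`. The name `parse_nearest_page` is hypothetical.

```python
from typing import Optional


def parse_nearest_page(response: dict) -> Optional[tuple[str, int]]:
    """Return (filename, page_number) for the first usable neighbor, or None."""
    for result in response.get("nearestNeighbors", []):
        for neighbor in result.get("neighbors", []):
            point_id = neighbor.get("id", "")
            # Split on the last colon so filenames containing ':' still parse.
            filename, separator, page = point_id.rpartition(":")
            if separator and page.isdigit():
                return filename, int(page)
    return None


found = parse_nearest_page({"nearestNeighbors": [{"neighbors": [{"id": "paper.pdf:3"}]}]})
missing = parse_nearest_page({})
print(found, missing)
```

Returning `None` for the whole result, instead of a pair of `None` values, lets the caller check once before fetching any text.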

In [25]:
import os
import json

key_file_path = '/micro-bus-386619-2fe3a253b909.json'

# Load the JSON content of the key file
with open(key_file_path, 'r') as f:
    key_content = json.load(f)

# Set the environment variable
os.environ['GOOGLE_APPLICATION_CREDENTIALS_JSON'] = json.dumps(key_content)

In [29]:
import json
import os
from googleapiclient import discovery
from google.oauth2 import service_account

def find_document(question: str, index_endpoint_id: str, deployed_index_id: str, project_id: str, location: str) -> tuple[str, int]:
    """Finds the most relevant document and page number using Vertex AI Matching Engine."""
    # Get embeddings for the question.
    embedding = get_text_embedding(question)

    # Build the request body
    request_body = {
        "deployed_index_id": deployed_index_id,
        "queries": [
            {
                "values": embedding,
                "num_neighbors": 1,
            }
        ]
    }

    # Set up credentials and API client.
    # Read the key JSON from the environment variable set in the previous cell.
    credentials = service_account.Credentials.from_service_account_info(
        json.loads(os.environ['GOOGLE_APPLICATION_CREDENTIALS_JSON'])
    )
    service = discovery.build('aiplatform', 'v1beta1', credentials=credentials)

    # Construct the request URL
    name = f"projects/{project_id}/locations/{location}/indexEndpoints/{index_endpoint_id}"
    request = service.projects().locations().indexEndpoints().findNeighbors(
        indexEndpoint=name, body=request_body
    )

    try:
        response = request.execute()
        # Extract the point ID
        point_id = response['nearestNeighbors'][0]['neighbors'][0]['id']
    except (KeyError, IndexError) as e:
        print(f"Error extracting point ID: {e}, response: {response}")
        return None, None

    # Get the document name and page number from the point ID.
    (filename, page_number) = point_id.split(':', 1)
    return (filename, int(page_number))

# Query the Vector Search index for the most relevant page.
(filename, page_number) = find_document(question, index_endpoint_id, deployed_index_id, project_id, location)
print(f"{filename=} {page_number=}")

HttpError: <HttpError 404 when requesting https://aiplatform.googleapis.com/v1beta1/projects/micro-bus-386619/locations/us-central1/indexEndpoints/6241879132372729856:findNeighbors?alt=json returned "Not Found". Details: "404. That’s an error. The requested URL /v1beta1/projects/micro-bus-386619/locations/us-central1/indexEndpoints/6241879132372729856:findNeighbors was not found on this server. That’s all we know.">

## Get document text

When documents were processed, their text was stored in Firestore as well.
The Vector Search query returned the relevant documents with their page numbers.
With this information, you can fetch the document's pages and give only the most relevant page to the model.
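
The lookup amounts to indexing into a list of page texts stored on the document. The sketch below uses a plain dictionary as a stand-in for the Firestore document, assuming a `pages` field that holds one string per page and a zero-based page number, as `get_document_text` below implies. `get_page_text` is a hypothetical helper that adds a bounds check.

```python
def get_page_text(document: dict, page_number: int) -> str:
    """Return the text of one page, with a clear error if the page is missing."""
    pages = document.get("pages", [])
    if not 0 <= page_number < len(pages):
        raise IndexError(
            f"Page {page_number} not found; the document has {len(pages)} pages."
        )
    return pages[page_number]


# Stand-in for a stored document (made-up content).
stored_document = {"pages": ["First page text.", "Second page text."]}
page_text = get_page_text(stored_document, 1)
print(page_text)
```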

In [None]:
from google.cloud import firestore

def get_document_text(filename: str, page_number: int) -> str:
    db = firestore.Client(database='knowledge-base-database')
    doc = db.collection("documents").document(filename)
    return doc.get().get('pages')[page_number]

# Download the document's page text from Firestore.
context = get_document_text(filename, page_number)
print(f"{context[:1000]}\n...\n...")

EN SEM IND
FR SEM IND
VAR
REST {Magn( 1 )}
VAR
REST {Magn(
The interlingual status of the lexical function is
self-evident. Any occurrence of Magn will be left
intact during transfer and it will be the generation
component that ultimately assigns a monolingual
lexical entry to the LF.6
3.2 Problems
Lexical Functions abstract away from certain nu-
ances in meaning and from different syntactic re-
alizations. We discuss some of the problems raised
by this abstraction in this section.
Overgenerality An important problem stems
from the interpretation of LFs implied by their
use as an interlingua namely that the mean-
ing of the collocate in some ways reduces to the
meaning implied by the lexical function. This in-
terpretation is trouble-free if we assume that LFs
always deliver unique values; unfortunately cases
to the contrary can be readily observed. An exam-
ple attested from our corpus was the range of ad-
verbial constructions possible with the verbal head
oppose: adamantly, bitterly

## Ask a foundation model

With the relevant context ready, you can now make a _prompt_ that includes both the context and the question.

Here's Gemini's response.
Note that Gemini responds in [Markdown](https://www.markdownguide.org).

In [None]:
from vertexai.generative_models import GenerativeModel

# Ask the foundation model.
model = GenerativeModel(
    model_name="gemini-1.0-pro-002",
    system_instruction=context,
)
answer = model.generate_content(question).text

print("QUESTION:")
print(question)
print()
print("ANSWER:")
print(answer)

QUESTION:
What are LFs and why are they useful?

ANSWER:
## What are Lexical Functions (LFs)?

LFs are a key tool in computational linguistics and machine translation, used to represent the potential relationships between words within a language. They're essentially a type of interlingual annotation, meaning they offer a language-neutral way to describe these relationships, independent of the specific words used in any particular language.

## How LFs are used:

- **Collocation Analysis:** LFs are particularly powerful when analyzing collocations, which are frequent co-occurrences of words. By capturing the relationships between words in collocations, LFs help us understand the meaning and nuances of these combinations, even across different languages.
- **Translation Support:** This deeper understanding of collocations provided by LFs is invaluable for machine translation. LFs help ensure that the meaning of the original text is accurately reflected in the translated text, taking into 
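
The cell above passes the page text as a system instruction. An alternative way to combine context and question is a single prompt string, sketched below. The wording of the template and the `build_prompt` helper are assumptions for illustration, not the notebook's method.

```python
def build_prompt(context: str, question: str) -> str:
    """Combine retrieved context and the user's question into one prompt."""
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"CONTEXT:\n{context.strip()}\n\n"
        f"QUESTION:\n{question.strip()}"
    )


prompt = build_prompt("LFs relate a word to its collocates.", "What are LFs?")
print(prompt)
```

A string like this could be passed to `generate_content` in place of the bare question.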

## (Optional) Ask your tuned model

If you want to tune a model, follow the [**Fine-tune an LLM model** tutorial](https://console.cloud.google.com/products/solutions/deployments?walkthrough_id=panels--sic--generative-ai-knowledge-base_toc).

First, find the tuning job ID for your tuned model.

In [None]:
from vertexai.preview.tuning import sft

for tuning_job in sft.SupervisedTuningJob.list():
    model_name = tuning_job.gca_resource.tuned_model_display_name
    tuning_job_id = tuning_job.resource_name
    print(f"{model_name}: {tuning_job_id}")

Copy your tuning job ID and paste it below.
Don't forget to run the cell to define the `tuning_job_id` variable.

In [None]:
tuning_job_id = "" # @param {type:"string"}

In [None]:
from vertexai.generative_models import GenerativeModel
from vertexai.preview.tuning import sft

tuning_job = sft.SupervisedTuningJob(tuning_job_id)
assert tuning_job.tuned_model_endpoint_name, "Please wait until the tuning job finishes."

tuned_model = GenerativeModel(
    model_name=tuning_job.tuned_model_endpoint_name,
    system_instruction=context,
)
answer = tuned_model.generate_content(question).text

print("QUESTION:")
print(question)
print()
print("ANSWER:")
print(answer)

QUESTION:
What are LFs and why are they useful?

ANSWER:
Lexical functions (LFs) are functions that operate on lexemes. They are useful because they can be used to generate synonyms.
