In [None]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Retrieval Augmented Generation (RAG) with AlloyDB

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/retrieval-augmented-generation/rag_embeddings_and_index_with_alloydb.ipynb">
      <img width="32px" src="https://www.gstatic.com/pantheon/images/bigquery/welcome_page/colab-logo.svg" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fuse-cases%2Fretrieval-augmented-generation%2Frag_embeddings_and_index_with_alloydb.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/use-cases/retrieval-augmented-generation/rag_embeddings_and_index_with_alloydb.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/vertexai/v1/32px.svg" alt="Vertex AI logo"><br> Open in Vertex AI Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/bigquery/import?url=https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/retrieval-augmented-generation/rag_embeddings_and_index_with_alloydb.ipynb">
      <img src="https://www.gstatic.com/images/branding/gcpiconscolors/bigquery/v1/32px.svg" alt="BigQuery Studio logo"><br> Open in BigQuery Studio
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/use-cases/retrieval-augmented-generation/rag_embeddings_and_index_with_alloydb.ipynb">
      <img width="32px" src="https://upload.wikimedia.org/wikipedia/commons/9/91/Octicons-mark-github.svg" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

<div style="clear: both;"></div>


|                                            |                                                   |
|--------------------------------------------|---------------------------------------------------|
|Author(s)                                   |                                                   |
|[Tanya Warrier](https://github.com/tanyarw) |[Rupjit Chakraborty](https://github.com/lazyprgmr) |

## Overview

- **PostgreSQL:** [PostgreSQL](https://www.postgresql.org/docs/current/) is an open-source, highly-extensible object-relational database management system known for its reliability and feature richness.

- **AlloyDB:** [AlloyDB](https://cloud.google.com/alloydb/docs/overview) is Google Cloud's fully-managed, PostgreSQL-compatible database service optimized for demanding enterprise workloads and transactional/analytical hybrid processing.

- **Gemini:** [Gemini](https://ai.google.dev/models/gemini) is a family of generative AI models that lets developers generate content and solve problems. These models are designed and trained to handle both text and images as input.
  - **Gemini 1.0 Pro model (`gemini-1.0-pro`):** Designed to handle natural language tasks, multi-turn text and code chat, and code generation.

- **Vertex AI Embeddings for Text:** The [textembedding-gecko](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings) models create text embeddings, which are numerical vector representations of text. `textembedding-gecko@003` is the stable embedding model version used in this notebook.

This notebook demonstrates Retrieval Augmented Generation (RAG) with an AlloyDB backend. After installing the prerequisites, we create an AlloyDB instance and use it to store embeddings. Finally, we demonstrate how to fetch similar documents from AlloyDB and answer questions based on the fetched documents using `gemini-1.0-pro`.

We create text embeddings for publicly available patent abstracts and use them in our LLM search. Google Patents Research Data contains the output of much of the data analysis work used in Google Patents (patents.google.com).
  
**Dataset**: `patents-public-data.google_patents_research.publications`


## Getting Started

### Enable Cloud APIs
Google Cloud APIs are programmatic interfaces to Google Cloud Platform services.

1. [Recommended APIs for AlloyDB](https://cloud.google.com/alloydb/docs/project-enable-access)
2. [Recommended APIs for Vertex AI](https://cloud.google.com/vertex-ai/docs/start/cloud-environment#enable_vertexai_apis)


#### **Before Moving Forward**

Ensure your project has [private services access](https://cloud.google.com/alloydb/docs/about-private-services-access) enabled with `Google Cloud Platform` as the service provider.

For instructions to set it up, click [here](https://cloud.google.com/alloydb/docs/configure-connectivity).

### Install required packages


In [None]:
%pip install pg8000==1.31.1 \
SQLAlchemy==2.0.29 \
google-cloud-aiplatform==1.46.0 \
google-cloud-alloydb-connector==1.0.0 --upgrade --quiet

### Restart runtime (Colab only)

To use the newly installed packages, you must restart the runtime on Google Colab.

In [None]:
import sys

if "google.colab" in sys.modules:
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. ⚠️</b>
</div>

### Authenticate your notebook environment (Colab only)


If you are running this notebook on Google Colab, run the cell below to authenticate your environment.

In [None]:
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

### Import Libraries

Import the libraries used to interact with AlloyDB, call Vertex AI models, and manipulate data.

* `pg8000`: PostgreSQL database driver.
* `vertexai`: Google Cloud Vertex AI SDK for working with generative models.
* `sqlalchemy`: SQL toolkit and ORM for interfacing with databases in Python.
* `subprocess`: Spawns external processes.
* `pandas`: Library for data analysis and manipulation.
* `google.cloud.alloydb.connector`: Connector for Google Cloud AlloyDB (managed PostgreSQL).

In [None]:
import subprocess

from google.cloud.alloydb.connector import Connector
import pandas as pd
import pg8000
import sqlalchemy
from sqlalchemy.engine import Engine
from sqlalchemy.exc import DatabaseError
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

### Set Google Cloud project information and initialize Vertex AI SDK

- To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

- Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
PROJECT_ID = "<your-project-id>"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}

vertexai.init(project=PROJECT_ID, location=LOCATION)

Configurations to create a new AlloyDB cluster and primary instance

* To get started using AlloyDB, you must have an existing Google Cloud project and [enable the AlloyDB API](https://console.cloud.google.com/flows/enableapi?apiid=alloydb.googleapis.com).
* To generate embeddings with AlloyDB, the cluster created must reside in the region `us-central1`.
* Follow the naming convention below for the `CLUSTER` and `INSTANCE` names:
  - Must begin with a lowercase letter.
  - Can optionally contain a combination of lowercase letters, numbers, and hyphens (up to 61 characters).
  - Must end with a lowercase letter or digit.
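
The naming rules above can be checked before any `gcloud` command is run. The following is a minimal sketch, assuming the rules translate to the regular expression shown; `NAME_PATTERN` and `is_valid_resource_name` are illustrative names, not part of any Google Cloud library.

```python
import re

# Assumed reading of the rules above: one lowercase letter, optionally followed
# by up to 61 characters from [a-z0-9-] and a final lowercase letter or digit.
NAME_PATTERN = re.compile(r"[a-z]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_resource_name(name: str) -> bool:
    """Return True if `name` satisfies the naming convention sketched above."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["my-cluster", "My-Cluster", "cluster-", "1cluster", "c1"]:
    print(f"{candidate!r}: {is_valid_resource_name(candidate)}")
```

Running a check like this on `CLUSTER` and `INSTANCE` surfaces naming mistakes before the cluster creation request is sent.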

In [None]:
REGION = "us-central1"  # @param {type:"string"}
CLUSTER = "<cluster-name>"  # @param {type:"string"}
INSTANCE = "<primary-instance-name>"  # @param {type:"string"}
CPU_COUNT = 2  # @param {type:"integer"}

PROJECT_NUM = (
    subprocess.check_output(
        [
            "gcloud",
            "projects",
            "describe",
            PROJECT_ID,
            "--format",
            "value(projectNumber)",
        ]
    )
    .decode("utf-8")
    .strip()
)
print(f"Project Number: {PROJECT_NUM}")

SERVICE_ACCOUNT = (
    f"serviceAccount:service-{PROJECT_NUM}@gcp-sa-alloydb.iam.gserviceaccount.com"
)
print(f"AlloyDB Service Agent: {SERVICE_ACCOUNT}")

Configurations for embedding and generative model


*   `textembedding-gecko` outputs 768-dimensional vector embeddings.


In [None]:
EMBEDDING_MODEL = "textembedding-gecko@003"  # @param {type:"string"}
DIMENSIONS = 768  # @param {type:"integer"}
GENERATIVE_MODEL = "gemini-1.0-pro"  # @param {type:"string"}

## Fetch Dataset from BigQuery

- [Google Patents Research Data](https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/google-patents-research-data) contains the output of much of the data analysis work used in [Google Patents](https://patents.google.com), including machine translations of titles and abstracts from Google Translate, embedding vectors, extracted top terms, similar documents, and forward references.
- We will use the public dataset table `google_patents_research.publications` for this demo by selecting the text columns below:
    - `publication_number`: Patent publication number (DOCDB compatible), e.g. 'US-7650331-B1'.
    - `title`: The English title.
    - `abstract`: The English abstract.
    - `url`: URL to the patents.google.com page for this patent.
    - `country`: Country name.
    - `publication_description`: Description of the publication type.
- The text columns `title` and `abstract` are later converted into text embeddings to perform similarity search. The other columns are used as supplemental information when answering the user's question.
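
To make the embedding input concrete: later in the notebook the two text columns are joined with a single space inside SQL (`title || ' ' || abstract`). A rough Python equivalent is sketched below; `build_embedding_text` and the sample row are hypothetical and only illustrate the shape of the text that gets embedded.

```python
def build_embedding_text(title: str, abstract: str) -> str:
    """Join title and abstract with one space, like SQL `title || ' ' || abstract`."""
    return f"{title} {abstract}"


# Hypothetical row for illustration; it is not taken from the dataset.
row = {
    "publication_number": "XX-0000000-A1",
    "title": "Example title",
    "abstract": "Example abstract text.",
}

print(build_embedding_text(row["title"], row["abstract"]))
```

In SQL, concatenating with a `NULL` value yields `NULL`, which is one reason the dataset query filters out rows with an empty title or abstract.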

In [None]:
query = """
SELECT publication_number, title, abstract, url, country, publication_description
FROM `patents-public-data.google_patents_research.publications`
WHERE
  LENGTH(title) > 1
  AND LENGTH(abstract) > 1
ORDER BY publication_number
LIMIT 1000
"""
# Read the table and display first 5 rows
df = pd.read_gbq(query, project_id=PROJECT_ID)
df.head(5)

## AlloyDB as RAG backend

### Set Up

Set up the cluster and primary instance, then update the instance to allow public IP connections.
  - First, create the database cluster and primary instance on AlloyDB for PostgreSQL. For more details, see [create a cluster](https://cloud.google.com/alloydb/docs/cluster-create) and [create a primary instance](https://cloud.google.com/alloydb/docs/instance-primary-create).
  - To generate embeddings with AlloyDB, the created cluster must reside in the region `us-central1`. This is required because the Vertex AI model that AlloyDB can use for embeddings, `textembedding-gecko`, is located in that region. More details about embedding generation can be found [here](https://codelabs.developers.google.com/codelabs/alloydb-ai-embedding).

In [None]:
import getpass

# Read the password without echoing it in the notebook output
password = getpass.getpass("Enter a password for the cluster: ")

# Set the active Google Cloud Project
!gcloud config set project {PROJECT_ID}

# Create cluster
!gcloud alloydb clusters create {CLUSTER} --password={password} --region={REGION}

# Create the primary instance
!gcloud alloydb instances create {INSTANCE} --instance-type=PRIMARY --cpu-count={CPU_COUNT} --region={REGION} --cluster={CLUSTER}

# Update the instance to allow public IP
!gcloud beta alloydb instances update {INSTANCE} --region={REGION} --cluster={CLUSTER} --assign-inbound-public-ip=ASSIGN_IPV4

Grant the `roles/aiplatform.user` role to the AlloyDB service agent so that AlloyDB can call Vertex AI models.

In [None]:
!gcloud projects add-iam-policy-binding {PROJECT_ID} \
  --member={SERVICE_ACCOUNT} \
  --role="roles/aiplatform.user"

<div class="alert alert-block alert-warning">
<b>⚠️ Please wait for a few minutes to ensure that the AlloyDB instance is updated with a Public IP address before moving forward. ⚠️</b>
</div>

### Helper Functions


*   `create_sqlalchemy_engine`: Creates connection pool for AlloyDB instance
*   `check_table_exists`: Checks if table exists in an instance


In [None]:
def create_sqlalchemy_engine(
    inst_uri: str, user: str, password: str, db: str
) -> tuple[sqlalchemy.engine.Engine, Connector]:
    """Creates a connection pool for an AlloyDB instance and returns the pool
    and the connector. Callers are responsible for closing the pool and the
    connector.


    Args:
        inst_uri (str):
            The instance URI specifies the instance relative to the project,
            region, and cluster. For example:
            "projects/my-project/locations/us-central1/clusters/my-cluster/instances/my-instance"
        user (str):
            The database user name, e.g., postgres
        password (str):
            The database user's password, e.g., secret-password
        db (str):
            The name of the database, e.g., mydb

    Returns:
        tuple[sqlalchemy.engine.Engine, Connector]:
            * A SQLAlchemy engine object for managing database interactions.
            * A Connector object for underlying database connections (can be used for closing).
    """
    connector = Connector()

    def getconn() -> pg8000.dbapi.Connection:
        """
        Establishes a connection to a Google Cloud AlloyDB instance (PostgreSQL database) using the pg8000 driver.

        Returns:
            pg8000.dbapi.Connection: An active database connection object.
        """
        conn: pg8000.dbapi.Connection = connector.connect(
            instance_uri=inst_uri,
            driver="pg8000",
            user=user,
            password=password,
            db=db,
            ip_type="PUBLIC",  # use ip_type to specify Public IP
        )
        return conn

    # create SQLAlchemy connection pool
    engine = sqlalchemy.create_engine(
        "postgresql+pg8000://", creator=getconn, isolation_level="AUTOCOMMIT"
    )
    engine.dialect.description_encoding = None
    return engine, connector

In [None]:
def check_table_exists(engine: Engine, connector: Connector, table_name: str) -> bool:
    """Checks if a table exists in the database.

    Args:
        engine (sqlalchemy.engine.Engine): SQLAlchemy engine object.
        connector (Connector): AlloyDB Connector object.
        table_name (str): Name of the table to check.

    Returns:
        bool: True if the table exists, False otherwise.
    """

    try:
        with engine.connect() as conn:
            check_cmd = sqlalchemy.text(f"SELECT 1 FROM {table_name} LIMIT 1")
            conn.execute(check_cmd)
        connector.close()
        return True

    except DatabaseError:
        return False

### Create the connection to AlloyDB

In [None]:
INSTANCE_URI = (
    f"projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER}/instances/{INSTANCE}"
)
USER = "postgres"
DB = "postgres"
TABLE_NAME = "google_patents_research"

### Create a table on AlloyDB
The table is created with the columns from the `google_patents_research.publications` dataset.

> **Note:** If you come across the error below, it is because the AlloyDB instance has not finished updating its public IP address. Please wait for a few minutes before trying to assign it again under the **Set Up** section.
```
IPTypeNotFoundError: AlloyDB instance does not have an IP addresses matching type: 'PUBLIC'
```
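
Instead of waiting a fixed time and retrying by hand, the readiness check could be wrapped in a small polling loop. The sketch below is generic: `wait_until` and `fake_check` are hypothetical names, and in this notebook the `is_ready` callable would be something you write yourself, for example a function that tries to open a connection and returns `False` while the `IPTypeNotFoundError` above is still raised.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
) -> bool:
    """Poll `is_ready` until it returns True or `timeout_s` seconds elapse."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in readiness check for illustration: reports ready on the third call.
attempts = {"count": 0}


def fake_check() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(fake_check, timeout_s=5.0, interval_s=0.01))
```

The defaults of 10 minutes and 15 seconds are arbitrary choices for this sketch; adjust them to how long the instance update takes in your project.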

In [None]:
engine, connector = create_sqlalchemy_engine(
    inst_uri=INSTANCE_URI,
    user=USER,
    password=password,
    db=DB,
)

In [None]:
if check_table_exists(engine, connector, TABLE_NAME):
    print(f"Table {TABLE_NAME} already exists!")

else:
    # Create table
    create_table_cmd = sqlalchemy.text(
        f"CREATE TABLE {TABLE_NAME} ( \
      publication_number VARCHAR, \
      title TEXT, \
      abstract TEXT, \
      url VARCHAR, \
      country TEXT, \
      publication_description TEXT \
      )",
    )

    # Insert data
    insert_data_cmd = sqlalchemy.text(
        f"""
      INSERT INTO {TABLE_NAME} VALUES (:publication_number, :title, :abstract, :url, :country, :publication_description)
      """
    )

    parameter_map = [
        {
            "publication_number": row["publication_number"],
            "title": row["title"],
            "abstract": row["abstract"],
            "url": row["url"],
            "country": row["country"],
            "publication_description": row["publication_description"],
        }
        for index, row in df.iterrows()
    ]

    # Execute the queries
    with engine.connect() as conn:
        print("Creating table...")
        conn.execute(create_table_cmd)
        print("Inserting values...")
        conn.execute(
            insert_data_cmd,
            parameter_map,
        )
        print("Committing...")
        conn.commit()
        print("Done")
    connector.close()

### Add AlloyDB extensions

Enable an extension by connecting to a database in an AlloyDB cluster's primary instance, then running a `CREATE EXTENSION` command. More details can be found [here](https://cloud.google.com/alloydb/docs/reference/extensions#enable).
- `google_ml_integration` integrates AlloyDB with Vertex AI
- `vector` allows us to use `pgvector` functions and operators with optimizations specific to AlloyDB

In [None]:
# Add extensions
google_ml_integration_cmd = sqlalchemy.text(
    "CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE"
)
vector_cmd = sqlalchemy.text("CREATE EXTENSION IF NOT EXISTS vector")

# Execute the queries
with engine.connect() as conn:
    conn.execute(google_ml_integration_cmd)
    conn.execute(vector_cmd)
    conn.commit()
connector.close()

### Create a column that stores text embeddings and an Index using AlloyDB



*   The Vertex AI text-embeddings API lets you create a text embedding using Generative AI on Vertex AI. Text embeddings are numerical representations of text that capture relationships between words and phrases.
*   IVFFlat is a type of vector index for approximate nearest neighbor search. It is a frequently used index type that can improve performance when querying high-dimensional vectors, like those representing embeddings.
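
To give an intuition for how an IVFFlat index trades accuracy for speed, here is a toy sketch in plain Python. It is not how pgvector or AlloyDB implement the index: the centroids are fixed by hand rather than learned with k-means, the vectors are 2-dimensional, and all function names are made up for this illustration. The idea it shows is the one described in the pgvector documentation: vectors are grouped into lists, and a query scans only the lists closest to it.

```python
import math
from collections import defaultdict

Vector = list[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; smaller values mean more similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def build_ivf_index(vectors: list[Vector], centroids: list[Vector]) -> dict[int, list[int]]:
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists: dict[int, list[int]] = defaultdict(list)
    for vec_id, vec in enumerate(vectors):
        nearest = min(
            range(len(centroids)), key=lambda c: cosine_distance(vec, centroids[c])
        )
        lists[nearest].append(vec_id)
    return dict(lists)


def ivf_search(
    query: Vector,
    vectors: list[Vector],
    centroids: list[Vector],
    lists: dict[int, list[int]],
    k: int = 2,
    probes: int = 1,
) -> list[int]:
    """Scan only the `probes` closest lists, then rank those candidates exactly."""
    closest = sorted(
        range(len(centroids)), key=lambda c: cosine_distance(query, centroids[c])
    )[:probes]
    candidates = [vec_id for c in closest for vec_id in lists.get(c, [])]
    return sorted(candidates, key=lambda i: cosine_distance(query, vectors[i]))[:k]


# Toy 2-D data for illustration; the embeddings in this notebook have 768 dimensions.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [-1.0, 0.1]]
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # hand-picked, not learned

lists = build_ivf_index(vectors, centroids)
print(lists)  # {0: [0, 1], 1: [2, 3], 2: [4]}
print(ivf_search([1.0, 0.0], vectors, centroids, lists, k=2, probes=1))  # [0, 1]
```

With `probes=1` only two of the five vectors are compared against the query. Scanning more lists improves recall at the cost of speed, which is the same trade-off pgvector exposes through its `lists` and `ivfflat.probes` settings.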

Visit the [pgvector documentation](https://github.com/pgvector/pgvector?tab=readme-ov-file#pgvector) for more information on supported index types and their distance functions.


In [None]:
embedding_column = "embedding"
distance_function = "vector_cosine_ops"

# Add column to store embeddings
add_column_cmd = sqlalchemy.text(
    f"ALTER TABLE {TABLE_NAME} ADD COLUMN {embedding_column} vector({DIMENSIONS});"
)

# Generate embeddings for `title` and `abstract` columns of the dataset
embedding_cmd = sqlalchemy.text(
    f"UPDATE {TABLE_NAME} SET {embedding_column} = embedding('{EMBEDDING_MODEL}', title || ' ' || abstract);"
)

# Create an IVFFlat index on the table with embedding column and cosine distance
index_cmd = sqlalchemy.text(
    f"CREATE INDEX ON {TABLE_NAME} USING ivfflat ({embedding_column} {distance_function})"
)

In [None]:
# Execute the queries
with engine.connect() as conn:
    try:
        conn.execute(add_column_cmd)
    except DatabaseError:
        # Most likely the column was already added by an earlier run
        print(f"Column {embedding_column} already exists")
    print("Creating Embeddings...")
    conn.execute(embedding_cmd)
    print("Creating Index...")
    conn.execute(index_cmd)
    print("Committing...")
    conn.commit()
    print("Done")
connector.close()

## Retrieve data

Retrieve top 5 rows based on similarity search
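
The similarity search orders rows by a pgvector distance operator. pgvector provides `<->` (Euclidean distance), `<=>` (cosine distance), and `<#>` (negative inner product), and an index created with `vector_cosine_ops` serves queries that use `<=>`. The sketch below reimplements the three measures in plain Python on toy vectors to show that they can rank results differently unless the vectors are normalized; the helper names are illustrative only.

```python
import math

Vector = list[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance, the quantity computed by pgvector's `<->`."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity, the quantity computed by pgvector's `<=>`."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def negative_inner_product(a: Vector, b: Vector) -> float:
    """Negated dot product, the quantity computed by pgvector's `<#>`."""
    return -sum(x * y for x, y in zip(a, b))


def top_k(query: Vector, vectors: list[Vector], distance, k: int = 5) -> list[int]:
    """Indices of the k vectors with the smallest distance to the query."""
    order = sorted(range(len(vectors)), key=lambda i: distance(query, vectors[i]))
    return order[:k]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length."""
    norm = math.hypot(*v)
    return [x / norm for x in v]


query = [1.0, 0.0]
vectors = [[10.0, 1.0], [1.0, 1.0]]  # toy data for illustration

print(top_k(query, vectors, cosine_distance))  # [0, 1]: direction matters, length does not
print(top_k(query, vectors, l2_distance))      # [1, 0]: the long vector is far away
unit_vectors = [normalize(v) for v in vectors]
print(top_k(query, unit_vectors, l2_distance))  # [0, 1]: same order as cosine once normalized
```

Picking the operator that matches the index's operator class matters because PostgreSQL only considers the index for the operator it was built for.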

In [None]:
def retrieve_information(
    query: str,
    engine: Engine,
    table_name: str,
    embedding_model: str,
    row_count: int = 5,
) -> str:
    """
    Queries a database table using a semantic similarity search and returns formatted results.

    Args:
        query (str): The search query to embed and compare against the database.
        engine (sqlalchemy.engine.Engine): SQLAlchemy engine object.
        table_name (str): The name of the table to query.
        embedding_model (str): The name of the embedding model to use.
        row_count (int, optional): The maximum number of results to return. Defaults to 5.

    Assumptions:
        The table has columns named 'publication_number', 'title', 'abstract', and 'url'.
        The embedding column name is read from the notebook-level variable `embedding_column`.

    Returns:
        str: A formatted string containing the top results, including their publication number, title, abstract, and URL.
    """

    # Perform semantic search. `<=>` is pgvector's cosine distance operator, which
    # matches the `vector_cosine_ops` index created earlier. The user query is passed
    # as a bind parameter so that quotes in the question cannot break the SQL statement.
    search_cmd = sqlalchemy.text(
        f"""
    SELECT publication_number, title, abstract, url FROM {table_name}
      ORDER BY {embedding_column}
      <=> embedding('{embedding_model}', CAST(:query AS TEXT))::vector
      LIMIT {row_count}
    """
    )

    # Execute the query
    with engine.connect() as conn:
        result = conn.execute(search_cmd, {"query": query})
        context = [row._asdict() for row in result]
    connector.close()

    # String format the retrieved information
    retrieved_information = "\n".join(
        [
            f"{index+1}. "
            + "\n".join([f"{key}: {value}" for key, value in element.items()])
            for index, element in enumerate(context)
        ]
    )

    return retrieved_information

**Sample Questions**


1.   Propose some project ideas for medical devices.
2.   List patents around solar energy and how they can be used.
3.   What methods exist to improve combustion?


In [None]:
query = "Propose some project ideas for medical devices."  # @param {type:"string"}

result = retrieve_information(
    query=query, engine=engine, table_name=TABLE_NAME, embedding_model=EMBEDDING_MODEL
)
print(result)

## Generate Response

Define a prompt template to answer questions according to the use-case.

In [None]:
prompt = """You are a friendly advisor helping to answer questions about patents. Based on the search request we have loaded a list of patents closely related to the search.

The user asked:
<question>
{question}
</question>

Here is the list of matching patents:
<patents>
{result}
</patents>

You should answer the question using the matching patents, and reply with supplemental information and the patent URL.
Answer:
"""

The `generate_text` function performs two tasks:
- Formats the prompt template with `result` and `question`.
- Invokes the generative model, in this case `gemini-1.0-pro`.
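
The formatting step relies on Python's `str.format`. The sketch below uses a shortened stand-in template with the same two placeholders, `{result}` and `{question}`; `build_prompt` is a hypothetical helper used only to illustrate the behavior, and no model is called.

```python
# Shortened stand-in for the notebook's template (assumption: same two placeholders).
template = """The user asked:
<question>
{question}
</question>

Here is the list of matching patents:
{result}

Answer:
"""


def build_prompt(template: str, result: str, question: str) -> str:
    """Fill the `{result}` and `{question}` placeholders of the template."""
    return template.format(result=result, question=question)


filled = build_prompt(
    template,
    result="1. publication_number: US-7650331-B1",
    question="What methods exist to improve combustion?",
)
print(filled)

# Braces inside the substituted values are left untouched; only the template is parsed.
print(build_prompt("{question} -> {result}", result="{not a placeholder}", question="q"))
```

One consequence of this design is that any literal braces written in the template itself must be doubled (`{{` and `}}`), while braces that appear in retrieved patent text are safe.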

In [None]:
def generate_text(
    prompt: str,
    result: str,
    question: str,
    generative_model: GenerativeModel,
    generation_config: GenerationConfig,
) -> str:
    """
    Generates text response using a specified generative language model on Vertex AI.

    Args:
        prompt (str): The text prompt template for the generative model.
        result (str): The list of matching patents.
        question (str): The user's question.
        generative_model (vertexai.generative_models.GenerativeModel): The instantiated generative model on Vertex AI.
        generation_config (vertexai.generative_models.GenerationConfig): Configuration object for the text generation process.

    Returns:
        str: The generated text response from the model.
    """
    input_prompt = prompt.format(result=result, question=question)

    # Query the model
    response = generative_model.generate_content(
        contents=input_prompt, generation_config=generation_config
    )

    return response.text

Generate a response based on the top 5 rows fetched via similarity search.

In [None]:
response = generate_text(
    prompt=prompt,
    result=result,
    question=query,
    generative_model=GenerativeModel(GENERATIVE_MODEL),
    generation_config=GenerationConfig(temperature=0.6, max_output_tokens=1024),
)

print(response)

## Summary

This notebook demonstrates an end-to-end approach to using AlloyDB as the backend for a Retrieval Augmented Generation (RAG) application built with Vertex AI: loading data, generating and indexing embeddings inside the database, retrieving similar documents, and answering questions with Gemini.

According to Google Cloud, AlloyDB delivers up to 100x faster analytical queries than standard PostgreSQL, and AlloyDB AI runs vector queries up to 10x faster than standard PostgreSQL when using the IVFFlat index.

**Steps to improve generated responses:**

1. Prompt Engineering: The prompt template used for text generation could be further refined to improve the relevance and quality of the responses based on the use-case.
2. Contextual Information: Incorporate additional context from the retrieved information to provide more comprehensive and nuanced responses.
3. Model Parameters: Explore and tune the parameters (`GenerationConfig`) of the model to enhance the quality and relevance of the generated responses.

## Cleaning Up

Clean up the created resources by deleting the primary instance and the cluster.

In [None]:
# Delete the instance
!gcloud alloydb instances delete {INSTANCE} --cluster={CLUSTER} --region={REGION}

# Delete the cluster
!gcloud alloydb clusters delete {CLUSTER} --region={REGION}