# Making Movie Mania


## Overview

Building a Movie Making Chatbot Assistance System with RAG, LangChain, LLM, and Vector Database

## Introduction
The goal of this project is to develop a movie recommendation system that leverages advanced machine learning techniques to provide personalized recommendations based on user inputs. The system integrates Retrieval-Augmented Generation (RAG) using a Large Language Model (LLM), LangChain, and a vector database to enhance the recommendation process.

## Components and Technologies
- Large Language Model (LLM): Used for generating natural language responses and enhancing the recommendation process.
- LangChain: Manages the flow of interactions between the user, LLM, and vector database.
- Vector Database: Stores embeddings of movie data for efficient retrieval based on semantic similarity. Examples include Pinecone, Weaviate, or FAISS.
- Streamlit: Provides an interactive web interface for users to input queries and receive recommendations.

## Workflow
### Data Collection and Preprocessing

- Collect movie data, including titles, genres, ratings, and descriptions.
- Clean and preprocess the data so that it is in a suitable format for generating embeddings.

### Embedding Generation
- Use a pre-trained embedding model to generate embeddings for movie titles and descriptions.
- Store these embeddings in a vector database for efficient retrieval.

### Vector Database Integration
- Initialize and configure the vector database.
- Index the movie embeddings for fast similarity searches.

### LangChain Integration
- Set up LangChain to manage the interaction flow between the user inputs, LLM responses, and vector database retrievals.
- Define the prompts and response generation logic.

### Retrieval-Augmented Generation (RAG) Workflow
1. The user enters a query through the Streamlit interface.
2. LangChain processes the query and retrieves the most similar movie entries from the vector database.
3. The LLM uses the retrieved entries as context to generate a natural language response recommending movies.

### Streamlit Interface
- Develop a user-friendly web interface for inputting queries and displaying recommendations.
- Implement input fields and submit buttons for user interaction.
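
The retrieval and prompt-building steps of the workflow can be sketched without any external service. The example below is only an illustration: the `MOVIES` corpus, the bag-of-words `embed` function, and the helper names are all hypothetical stand-ins for a real embedding model and vector database, and the final LLM call is left out.

```python
import math

# Toy corpus standing in for real movie data (hypothetical examples).
MOVIES = {
    "Space Quest": "astronauts explore a distant planet in space",
    "Love in Paris": "a romance between two artists in paris",
    "Robot Uprising": "robots rebel against humans in a future city",
}


def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy bag-of-words embedding: one count per vocabulary word."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def retrieve(query: str, top_k: int = 1) -> list[str]:
    """Return the titles whose descriptions are most similar to the query."""
    vocab = sorted({word for desc in MOVIES.values() for word in desc.split()})
    query_vec = embed(query, vocab)
    scored = [
        (cosine(query_vec, embed(desc, vocab)), title)
        for title, desc in MOVIES.items()
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]


def build_prompt(query: str, titles: list[str]) -> str:
    """Combine the retrieved descriptions and the question into one prompt."""
    context = "\n".join(f"{title}: {MOVIES[title]}" for title in titles)
    return f"Context:\n{context}\n\nQuestion: {query}"


top = retrieve("a film about robots in a city")
prompt = build_prompt("a film about robots in a city", top)
print(prompt)
```

In a real system the prompt would then be passed to the LLM, and `embed` would be replaced by a learned embedding model so that similarity reflects meaning rather than shared words.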

### Objectives

This notebook provides a guide to building an Adaptive Recommendation Chatbot using multimodal retrieval-augmented generation (RAG) and a vector database.

This notebook performs the following tasks:

1. Extract data from documents containing both text and images using Gemini Pro Vision, generate embeddings of the extracted data, and store them in a vector store.
2. Search the vector store with text queries to find similar text data.
3. Using the retrieved text as context, generate an answer to the user query with the Gemini Pro model.

## Begin with Vertex AI SDK Setup

### Setting Up Vertex AI SDK and Essential Packages


In [None]:
!pip install --upgrade --quiet pymupdf langchain gradio google-cloud-aiplatform langchain_google_vertexai

### Restart runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it has restarted, continue to the next step.

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

<div class="alert alert-block alert-warning">
<b>⚠️ Wait for the kernel to finish restarting before you continue. ⚠️</b>
</div>

### Authenticate your notebook environment (Colab only)

If you are running this notebook on Google Colab, run the cell below to authenticate your environment.

This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench).

In [None]:
import sys

# Additional authentication is required for Google Colab
if "google.colab" in sys.modules:
    # Authenticate user to Google Cloud
    from google.colab import auth

    auth.authenticate_user()

### Define Google Cloud project information and initialize Vertex AI

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [None]:
# Define project information
PROJECT_ID = "project-llm-428915"  # @param {type:"string"}
LOCATION = "us-east1"  # @param {type:"string"}

# Initialize Vertex AI
import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION)

In [None]:
!pip install langchain_community


### Importing libraries
Let's start by importing the libraries needed for this tutorial.


In [None]:
# File system operations
import os

# Timing (pauses between API calls)
import time

# Unique IDs and timestamps
import uuid
from datetime import datetime

# PDF rendering (PyMuPDF), user interface, and data manipulation
import fitz
import gradio as gr
import pandas as pd

# Initialize Vertex AI libraries for working with generative models
from google.cloud import aiplatform
from PIL import Image as PIL_Image
from vertexai.generative_models import GenerativeModel, Image
from vertexai.language_models import TextEmbeddingModel

# Print Vertex AI SDK version
print(f"Vertex AI SDK version: {aiplatform.__version__}")

# Import LangChain components
import langchain

print(f"LangChain version: {langchain.__version__}")
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import DataFrameLoader

Vertex AI SDK version: 1.59.0
LangChain version: 0.2.7


### Initializing Gemini Pro Vision and Text Embedding models

In [None]:
# Loading Gemini Pro Vision Model
multimodal_model = GenerativeModel("gemini-1.0-pro-vision")

# Initializing embedding model
text_embedding_model = TextEmbeddingModel.from_pretrained("textembedding-gecko@003")

# Loading Gemini Pro Model
model = GenerativeModel("gemini-1.0-pro")

In [None]:
!wget https://www.hitachi.com/rev/archive/2023/r2023_04/pdf/04a02.pdf
!wget https://img.freepik.com/free-vector/hand-drawn-no-data-illustration_23-2150696455.jpg

# Create an "Images" directory if it doesn't exist
Image_Path = "./Images/"
if not os.path.exists(Image_Path):
    os.makedirs(Image_Path)

!mv hand-drawn-no-data-illustration_23-2150696455.jpg {Image_Path}/blank.jpg

--2024-07-12 02:29:23--  https://www.hitachi.com/rev/archive/2023/r2023_04/pdf/04a02.pdf
Resolving www.hitachi.com (www.hitachi.com)... 18.238.136.34, 18.238.136.7, 18.238.136.13, ...
Connecting to www.hitachi.com (www.hitachi.com)|18.238.136.34|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1462074 (1.4M) [application/pdf]
Saving to: ‘04a02.pdf.1’


2024-07-12 02:29:24 (2.89 MB/s) - ‘04a02.pdf.1’ saved [1462074/1462074]

--2024-07-12 02:29:24--  https://img.freepik.com/free-vector/hand-drawn-no-data-illustration_23-2150696455.jpg
Resolving img.freepik.com (img.freepik.com)... 23.33.85.241, 23.33.85.240, 2600:1406:5e00:49::17ce:e5ab, ...
Connecting to img.freepik.com (img.freepik.com)|23.33.85.241|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 32694 (32K) [image/jpeg]
Saving to: ‘hand-drawn-no-data-illustration_23-2150696455.jpg’


2024-07-12 02:29:25 (325 KB/s) - ‘hand-drawn-no-data-illustration_23-2150696455.jpg’ saved [32694/3269

### Convert PDF to Images and Extract Data Using Gemini Pro Vision
This module renders each PDF page as an image, then processes the images, extracting text and tabular data with the multimodal model Gemini Pro Vision. It manages potential errors, stores the extracted information in a DataFrame, and saves the results to a CSV file.

In [None]:
# Run the following code for each file
PDF_FILENAME = "Making-Movies-Manual.pdf"  # Replace with the filename of your movie-making PDF

In [None]:
# To get better resolution
zoom_x = 2.0  # horizontal zoom
zoom_y = 2.0  # vertical zoom
mat = fitz.Matrix(zoom_x, zoom_y)  # zoom factor 2 in each dimension

doc = fitz.open(PDF_FILENAME)  # open document
for page in doc:  # iterate through the pages
    pix = page.get_pixmap(matrix=mat)  # render page to an image
    outpath = f"./Images/{PDF_FILENAME}_{page.number}.jpg"
    pix.save(outpath)  # store the image as a JPEG file

# List the image files to process
image_names = os.listdir(Image_Path)
Max_images = len(image_names)

# Create empty lists to store image information
page_source = []
page_content = []
page_id = []

p_id = 0  # Initialize image ID counter
rest_count = 0  # Initialize counter for error handling

while p_id < Max_images:
    try:
        # Construct the full path to the current image
        image_path = Image_Path + image_names[p_id]

        # Load the image
        image = Image.load_from_file(image_path)

        # Generate prompts for text and table extraction
        prompt_text = "Extract all text content in the image"
        prompt_table = (
            "Detect table in this image. Extract content maintaining the structure"
        )

        # Extract text using your multimodal model
        contents = [image, prompt_text]
        response = multimodal_model.generate_content(contents)
        text_content = response.text

        # Extract table using your multimodal model
        contents = [image, prompt_table]
        response = multimodal_model.generate_content(contents)
        table_content = response.text

        # Log progress and store results
        print(f"processed image no: {p_id}")
        page_source.append(image_path)
        page_content.append(text_content + "\n" + table_content)
        page_id.append(p_id)
        p_id += 1
        rest_count = 0  # Reset the retry counter after a success

    except Exception as err:
        # Handle errors during processing
        print(err)
        print("Taking Some Rest")
        time.sleep(1)  # Pause execution for 1 second
        rest_count += 1
        if rest_count == 5:  # Give up on this image after 5 failed attempts
            rest_count = 0
            print(f"Cannot process image: {image_path}")
            p_id += 1  # Move to the next image

# Create a DataFrame to store extracted information
df = pd.DataFrame(
    {"page_id": page_id, "page_source": page_source, "page_content": page_content}
)
del page_id, page_source, page_content  # Conserve memory
df.head()  # Preview the DataFrame

Cannot get the response text.
Cannot get the Candidate text.
Response candidate content has no parts (and thus no text). The candidate is likely blocked by the safety filters.
Content:
{}
Candidate:
{
  "finish_reason": "RECITATION",
  "safety_ratings": [
    {
      "category": "HARM_CATEGORY_HATE_SPEECH",
      "probability": "NEGLIGIBLE",
      "probability_score": 0.09619517,
      "severity": "HARM_SEVERITY_NEGLIGIBLE",
      "severity_score": 0.1046602
    },
    {
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "probability": "NEGLIGIBLE",
      "probability_score": 0.104294725,
      "severity": "HARM_SEVERITY_NEGLIGIBLE",
      "severity_score": 0.07382972
    },
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "probability": "NEGLIGIBLE",
      "probability_score": 0.0996453,
      "severity": "HARM_SEVERITY_NEGLIGIBLE",
      "severity_score": 0.073164724
    },
    {
      "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
      "probability": "NEGLIGIBLE",


| | page_id | page_source | page_content |
|---|---|---|---|
| 0 | 1 | ./Images/Making-Movies-Manual.pdf_0.jpg | 013322\nThe Film Foundation presents:\nMAKING... |
| 1 | 2 | ./Images/blank.jpg | ?\n?\n404\n \| Column 1 \| Column 2 \|\n\| :----:... |
| 2 | 3 | ./Images/Making-Movies-Manual.pdf_4.jpg | Movies matter because they are more than imag... |
| 3 | 4 | ./Images/Making-Movies-Manual.pdf_2.jpg | Introduction\nThis manual will help you make ... |
| 4 | 5 | ./Images/Making-Movies-Manual.pdf_3.jpg | A Word from Your Sponsor\nDo you like going t... |
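
The retry-then-skip behaviour of the extraction loop could also be factored into a small helper. The sketch below is an assumption-laden illustration rather than the notebook's own code: `call_with_retries` and `flaky_extract` are hypothetical names, and the flaky function merely simulates a model call that fails twice before succeeding.

```python
import time


def call_with_retries(func, max_attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    Returns (True, value) on success and (False, last_error) after the
    final failed attempt.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return True, func()
        except Exception as err:  # broad on purpose, mirroring the loop above
            last_error = err
            if attempt < max_attempts:
                sleep(delay_seconds)  # pause before the next attempt
    return False, last_error


# Stand-in for a flaky model call: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_extract():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "extracted text"


ok, value = call_with_retries(flaky_extract, delay_seconds=0.0)
print(ok, value)
```

Because the attempt counter lives inside each call, failures on one image cannot eat into the retry budget of the next one.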


### Generate Text Embeddings
Use the `textembedding-gecko` model to generate text embeddings, which make it possible to find relevant passages in the dataset by semantic similarity.

In [None]:
def generate_text_embedding(text) -> list:
    """Text embedding with a Large Language Model."""
    embeddings = text_embedding_model.get_embeddings([text])
    vector = embeddings[0].values
    return vector


# Create a DataFrameLoader to prepare data for LangChain
loader = DataFrameLoader(df, page_content_column="page_content")

# Load documents from the 'page_content' column of your DataFrame
documents = loader.load()

# Log the number of documents loaded
print(f"# of documents loaded (pre-chunking) = {len(documents)}")

# Create a text splitter to divide documents into smaller chunks
text_splitter = CharacterTextSplitter(
    chunk_size=10000,  # Target size of approximately 10000 characters per chunk
    chunk_overlap=200,  # overlap between chunks
)

# Split the loaded documents
doc_splits = text_splitter.split_documents(documents)

# Add a 'chunk' ID to each document split's metadata for tracking
for idx, split in enumerate(doc_splits):
    split.metadata["chunk"] = idx

# Log the number of documents after splitting
print(f"# of documents = {len(doc_splits)}")

texts = [doc.page_content for doc in doc_splits]
text_embeddings_list = []
id_list = []
page_source_list = []
for doc in doc_splits:
    doc_id = uuid.uuid4()  # Avoid shadowing the built-in `id`
    text_embeddings_list.append(generate_text_embedding(doc.page_content))
    id_list.append(str(doc_id))
    page_source_list.append(doc.metadata["page_source"])
    time.sleep(1)  # Pause between requests to stay within the API quota

# Creating a dataframe of ID, embeddings, page_source and text
embedding_df = pd.DataFrame(
    {
        "id": id_list,
        "embedding": text_embeddings_list,
        "page_source": page_source_list,
        "text": texts,
    }
)
embedding_df.head()

# of documents loaded (pre-chunking) = 5
# of documents = 5


| | id | embedding | page_source | text |
|---|---|---|---|---|
| 0 | 65ab6184-de63-474f-ba54-0360e63b897f | [-0.038054924458265305, -0.04563278704881668, ...] | ./Images/Making-Movies-Manual.pdf_0.jpg | 013322\nThe Film Foundation presents:\nMAKING\... |
| 1 | 0fd663f0-a499-4545-a42d-79eda6043603 | [0.04799053072929382, -0.08848366141319275, -0...] | ./Images/blank.jpg | ?\n?\n404\n \| Column 1 \| Column 2 \|\n\| :----: ... |
| 2 | 775bf47e-567d-4895-963e-e57113219feb | [-0.03016633912920952, 0.0068882484920322895, ...] | ./Images/Making-Movies-Manual.pdf_4.jpg | Movies matter because they are more than image... |
| 3 | e6f0eea2-7838-45e4-84f5-5aacd167da73 | [-0.03584425523877144, -0.023852290585637093, ...] | ./Images/Making-Movies-Manual.pdf_2.jpg | Introduction\nThis manual will help you make a... |
| 4 | 13602ea2-9de3-415a-aeb2-e480f94658f9 | [0.008318882435560226, 0.010507011786103249, -...] | ./Images/Making-Movies-Manual.pdf_3.jpg | A Word from Your Sponsor\nDo you like going to... |
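
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is a sketch under stated assumptions, not LangChain's algorithm: `CharacterTextSplitter` first splits on a separator and then merges pieces, whereas the hypothetical `chunk_text` below just slides a window over the characters.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

With a `chunk_size` of 10000, as used above, pages shorter than that stay in a single chunk, which is consistent with the document count being the same before and after splitting.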


### Creating Vertex AI: Vector Search
The code configures and deploys a vector search index on Google Cloud, making it ready to store and search through embeddings.

Embedding size: the number of values (dimensions) used to represent a piece of text as a vector. More dimensions allow a potentially more expressive representation, at the cost of extra storage and computation.


Dimensions vs. Latency

* Search: Higher-dimensional embeddings can make vector similarity searches slower, especially in large databases.
* Computation: Calculations with larger vectors generally take more time during model training and inference.
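
The link between dimensions and latency is easiest to see in an exhaustive (brute-force) search, where every stored vector is compared with the query. The sketch below uses hypothetical helper names and tiny made-up vectors. Vector Search itself uses an approximate index, so its real cost is lower than this worst case.

```python
def dot(a, b):
    """Dot product: one multiplication per dimension."""
    return sum(x * y for x, y in zip(a, b))


def brute_force_search(query, vectors, top_k=1):
    """Score every stored vector against the query and return the best indices."""
    scored = sorted(
        ((dot(query, vec), idx) for idx, vec in enumerate(vectors)),
        reverse=True,
    )
    return [idx for _, idx in scored[:top_k]]


def multiplications_needed(num_vectors, dimensions):
    """Multiplications performed by one exhaustive dot-product search."""
    return num_vectors * dimensions


vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
nearest = brute_force_search([0.9, 0.1, 0.0], vectors, top_k=2)
print(nearest)
print(multiplications_needed(1_000_000, 768))
```

Doubling the dimension count doubles the multiplications per query in this model, which is why higher-dimensional embeddings tend to slow down search over large collections.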


In [None]:
VECTOR_SEARCH_REGION = "us-central1"
VECTOR_SEARCH_INDEX_NAME = f"{PROJECT_ID}-vector-search-index-ht"
VECTOR_SEARCH_EMBEDDING_DIR = f"{PROJECT_ID}-vector-search-bucket-ht"
VECTOR_SEARCH_DIMENSIONS = 768

### Save the embeddings in a JSON file
To load the embeddings into Vector Search, we need to save them in JSON Lines (JSONL) format, with one JSON record per line. See more information in the docs at [Input data format and structure](https://cloud.google.com/vertex-ai/docs/matching-engine/match-eng-setup/format-structure#data-file-formats).

First, export the `id` and `embedding` columns from the DataFrame in JSONL format, and save it.

Then, create a new Cloud Storage bucket and copy the file to it.
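
Before uploading, it can be useful to check that every line of the file is a JSON record with a string `id` and an `embedding` of the expected length. The following is a minimal sketch: `validate_jsonl` is a hypothetical helper, and the sample records use three-dimensional vectors purely for illustration (the real file would use 768).

```python
import json
import os
import tempfile


def validate_jsonl(path, expected_dim):
    """Return the number of valid records; raise ValueError on the first bad line."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            record = json.loads(line)  # invalid JSON raises a ValueError subclass
            if not isinstance(record.get("id"), str):
                raise ValueError(f"line {line_no}: 'id' must be a string")
            embedding = record.get("embedding")
            if not isinstance(embedding, list) or len(embedding) != expected_dim:
                raise ValueError(
                    f"line {line_no}: 'embedding' must be a list of {expected_dim} values"
                )
            count += 1
    return count


# Write a small sample file and validate it.
sample = [
    {"id": "a", "embedding": [0.1, 0.2, 0.3]},
    {"id": "b", "embedding": [0.4, 0.5, 0.6]},
]
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("\n".join(json.dumps(record) for record in sample))
    valid_count = validate_jsonl(sample_path, expected_dim=3)

print(valid_count)
```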

In [None]:
# save id and embedding as a json file
jsonl_string = embedding_df[["id", "embedding"]].to_json(orient="records", lines=True)
with open("data.json", "w") as f:
    f.write(jsonl_string)

# show the first few lines of the json file
! head -n 3 data.json

{"id":"65ab6184-de63-474f-ba54-0360e63b897f","embedding":[-0.0380549245,-0.045632787,-0.0647878945,-0.0159328971,0.050656952,-0.0025819123,0.0437574387,0.0417656824,0.0118456623,0.0154576125,0.0858155563,-0.0316534676,-0.0240386166,0.0195132792,-0.0067499322,0.0323519632,0.069553636,-0.0124949431,-0.0081771053,-0.0593349151,0.0128635233,0.0663666055,0.0414980017,-0.0193339065,0.0221850015,-0.0159736946,-0.0292771421,-0.0544300191,-0.0124877868,0.0002944057,-0.0402369015,0.0232605916,-0.0540790074,-0.0044612717,0.0216459259,-0.0506533086,0.0516761951,0.0037211922,-0.0197432619,0.0882333294,-0.016854791,-0.0138972849,-0.0105799846,0.0098327547,0.0144254724,0.0138951521,0.0076237377,0.008311634,0.011430271,-0.0298194606,0.0187948048,0.0287830066,0.0030497652,-0.0148951914,0.0017777905,-0.0116097908,0.0289261881,-0.0107643558,-0.0457674898,0.0461271368,-0.0003473655,-0.0003435047,0.0046629203,0.0390416719,-0.0189698674,-0.0879643857,-0.061640501,0.0168278571,0.0812858716,0.0021333727,-0.02

In [None]:
# Generates a unique ID for session
UID = datetime.now().strftime("%m%d%H%M")

# Creates a GCS bucket
BUCKET_URI = f"gs://{VECTOR_SEARCH_EMBEDDING_DIR}-{UID}"
! gsutil mb -l $LOCATION -p {PROJECT_ID} {BUCKET_URI}
! gsutil cp data.json {BUCKET_URI}

Creating gs://project-llm-428915-vector-search-bucket-ht-07120233/...
Copying file://data.json [Content-Type=application/json]...
-
Operation completed over 1 objects/50.5 KiB.                                     


### Create an Index

The embeddings are now ready to be loaded into Vector Search. Its APIs are available under the [aiplatform](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform) package of the SDK.

Create a [MatchingEngineIndex](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndex) with its `create_tree_ah_index` function (Matching Engine is the previous name of Vector Search).

In [None]:
# create index
my_index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
    display_name=f"{VECTOR_SEARCH_INDEX_NAME}",
    contents_delta_uri=BUCKET_URI,
    dimensions=VECTOR_SEARCH_DIMENSIONS,
    approximate_neighbors_count=20,
    distance_measure_type="DOT_PRODUCT_DISTANCE",
)

INFO:google.cloud.aiplatform.matching_engine.matching_engine_index:Creating MatchingEngineIndex
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index:Create MatchingEngineIndex backing LRO: projects/34837578213/locations/us-east1/indexes/2541798204534423552/operations/8167276549846859776
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index:MatchingEngineIndex created. Resource name: projects/34837578213/locations/us-east1/indexes/2541798204534423552
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index:To use this MatchingEngineIndex in another session:
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index:index = aiplatform.MatchingEngineIndex('projects/34837578213/locations/us-east1/indexes/2541798204534423552')


Calling the `create_tree_ah_index` function starts building an Index. This takes a few minutes if the dataset is small, and about 50 minutes or more for larger datasets, depending on their size. You can check the status of the index creation on [the Vector Search Console > INDEXES tab](https://console.cloud.google.com/vertex-ai/matching-engine/indexes).


#### The parameters for creating index

- `contents_delta_uri`: the URI of the Cloud Storage directory where you stored the embedding JSON files
- `dimensions`: the dimension size of each embedding. In this case, it is 768 because we are using embeddings from the Text Embeddings API.
- `approximate_neighbors_count`: how many similar items to retrieve in typical cases
- `distance_measure_type`: the metric used to measure distance/similarity between embeddings. In this case, it is `DOT_PRODUCT_DISTANCE`.

See [the document](https://cloud.google.com/vertex-ai/docs/vector-search/create-manage-index) for more details on creating Index and the parameters.
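
One property worth knowing about the dot-product measure is that it equals cosine similarity when both vectors have unit length. The sketch below demonstrates this with two made-up vectors. Whether the embeddings in this notebook are already unit-length is not stated here, so treat the normalization step as an assumption to verify on your own data.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]
cos_ab = cosine(a, b)
dot_unit = dot(normalize(a), normalize(b))
print(cos_ab, dot_unit)
```

If vectors are not normalized, the dot product also rewards larger magnitudes, so rankings can differ from cosine similarity.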


### Create Index Endpoint and deploy the Index

To use the Index, you need to create an [Index Endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public). It works as a server instance accepting query requests for your Index.

In [None]:
# create IndexEndpoint
my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name=f"{VECTOR_SEARCH_INDEX_NAME}",
    public_endpoint_enabled=True,
)
print(my_index_endpoint)

INFO:google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint:Creating MatchingEngineIndexEndpoint
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint:Create MatchingEngineIndexEndpoint backing LRO: projects/34837578213/locations/us-east1/indexEndpoints/8685658074314178560/operations/2485985619918979072
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint:MatchingEngineIndexEndpoint created. Resource name: projects/34837578213/locations/us-east1/indexEndpoints/8685658074314178560
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint:To use this MatchingEngineIndexEndpoint in another session:
INFO:google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint:index_endpoint = aiplatform.MatchingEngineIndexEndpoint('projects/34837578213/locations/us-east1/indexEndpoints/8685658074314178560')


<google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchingEngineIndexEndpoint object at 0x7f3d2cc8cca0> 
resource name: projects/34837578213/locations/us-east1/indexEndpoints/8685658074314178560


This tutorial utilizes a [Public Endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/setup/setup#choose-endpoint) and does not support [Virtual Private Cloud (VPC)](https://cloud.google.com/vpc/docs/private-services-access). Unless you have a specific requirement for VPC, we recommend using a Public Endpoint. Despite the term "public" in its name, it does not imply open access to the public internet. Rather, it functions like other endpoints in Vertex AI services, which are secured by default through IAM. Without explicit IAM permissions, as we have previously established, no one can access the endpoint.

With the Index Endpoint, deploy the Index by specifying a unique deployed index ID.

In [None]:
DEPLOYED_INDEX_NAME = VECTOR_SEARCH_INDEX_NAME.replace(
    "-", "_"
)  # Can't have - in deployment name, only alphanumeric and _ allowed
DEPLOYED_INDEX_ID = f"{DEPLOYED_INDEX_NAME}_{UID}"
# Deploy the Index to the Index Endpoint (DEPLOYED_INDEX_ID is needed later for queries)
my_index_endpoint.deploy_index(index=my_index, deployed_index_id=DEPLOYED_INDEX_ID)

The first deployment of an Index to an Index Endpoint takes around 25 minutes, because the backend is built and started automatically. After the first deployment, later deployments finish in seconds. To see the status of the index deployment, open [the Vector Search Console > INDEX ENDPOINTS tab](https://console.cloud.google.com/vertex-ai/matching-engine/index-endpoints) and click the Index Endpoint.
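
Since the deployed index ID only allows alphanumeric characters and underscores, building it through a small validating helper can catch naming mistakes before a long deployment starts. This is a sketch: `make_deployed_index_id` is a hypothetical helper, and the additional rule that the ID must start with a letter is an assumption that should be checked against the Vector Search documentation.

```python
import re

# Assumed rule: starts with a letter, then letters, digits, or underscores only.
_VALID_ID = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def make_deployed_index_id(index_name: str, uid: str) -> str:
    """Build a deployed index ID from an index name and a session suffix."""
    candidate = f"{index_name.replace('-', '_')}_{uid}"
    if not _VALID_ID.match(candidate):
        raise ValueError(f"invalid deployed index ID: {candidate!r}")
    return candidate


deployed_id = make_deployed_index_id(
    "project-llm-428915-vector-search-index-ht", "07120233"
)
print(deployed_id)
```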

### Ask Questions to the PDF
This code snippet sets up a question-answering (QA) system. It uses Vector Search to find relevant passages in the dataset, then uses the Gemini Pro model (`gemini-1.0-pro`) to generate the final answer to a user's query and to check whether that answer is usable.

In [None]:
def Test_LLM_Response(txt):
    """
    Determines whether a given text response generated by an LLM indicates a lack of information.

    Args:
        txt (str): The text response generated by the LLM.

    Returns:
        bool: True if the LLM's response suggests it was able to generate a meaningful answer,
              False if the response indicates it could not find relevant information.

    This function works by presenting a formatted classification prompt to the LLM (`model`).
    The prompt includes the original text and specific categories indicating whether sufficient information was available.
    The function analyzes the LLM's classification output to make the determination.
    """

    classification_prompt = f""" Classify the text as one of the following categories:
        -Information Present
        -Information Not Present
        Text=The provided context does not contain information.
        Category:Information Not Present
        Text=I cannot answer this question from the provided context.
        Category:Information Not Present
        Text:{txt}
        Category:"""
    classification_response = model.generate_content(classification_prompt).text

    if "Not Present" in classification_response:
        return False  # Indicates that the LLM couldn't provide an answer
    else:
        return True  # Suggests the LLM generated a meaningful response


def get_prompt_text(question, context):
    """
    Generates a formatted prompt string suitable for a language model, combining the provided question and context.

    Args:
        question (str): The user's original question.
        context (str): The relevant text to be used as context for the answer.

    Returns:
        str: A formatted prompt string with placeholders for the question and context, designed to guide the language model's answer generation.
    """
    prompt = """
      Answer the question using the context below. Respond using only the text provided.
      Question: {question}
      Context : {context}
      """.format(
        question=question, context=context
    )
    return prompt


def get_answer(query):
    """
    Retrieves an answer to a provided query using multimodal retrieval augmented generation (RAG).

    This function leverages a vector search system to find relevant text documents from a
    pre-indexed store of multimodal data. Then, it uses a large language model (LLM) to generate
    an answer, using the retrieved documents as context.

    Args:
        query (str): The user's original query.

    Returns:
        tuple: A pair `(result, page_source)` where:
            * result (str): The LLM-generated answer (from the last document tried
              if no answer was accepted).
            * page_source (str): Path to the page image of the document used for the
              answer, or the blank placeholder image if no answer was accepted.
    """

    result = ""  # Initialize the answer string
    # Use a default image if the reference is not found
    page_source = "./Images/blank.jpg"
    query_embeddings = generate_text_embedding(query)  # Embed the query

    response = my_index_endpoint.find_neighbors(
        deployed_index_id=DEPLOYED_INDEX_ID,
        queries=[query_embeddings],
        num_neighbors=5,
    )  # Retrieve up to 5 relevant documents from the vector store

    # Try each retrieved document, most relevant first, until one yields an answer
    for neighbor in response[0]:
        row = embedding_df[embedding_df["id"] == neighbor.id].iloc[0]

        # Create a prompt using the question and the document text as context
        prompt = get_prompt_text(query, row["text"])
        result = model.generate_content(prompt).text  # Generate an answer with the LLM

        if Test_LLM_Response(result):
            page_source = row["page_source"]  # Image path of the document used
            break  # Stop at the first valid response
    return result, page_source


query = "What is film preservation and why does it matter?"

result, page_source = get_answer(query)
print(result)

Film preservation is the process of protecting films from deterioration and restoring decaying film before the images and sounds are lost forever. It is important because it allows future generations to learn about our history and culture through film. The United States Congress passed the National Film Preservation Act in 1998 to help protect films. The Film Foundation, created by some of the most famous directors in the film industry, also works to preserve films. This manual was created by The Film Foundation to help people understand the importance of film preservation and to encourage them to get involved in saving movies.
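
The fallback strategy used by `get_answer` (try the most relevant document first, then move to the next one if the answer is rejected) can be tested in isolation with stand-ins for the model. In the sketch below, `answer_with_fallback`, `fake_generate`, and `fake_is_answer` are hypothetical names, and the fake functions only imitate the LLM and the classification check.

```python
def answer_with_fallback(query, contexts, generate, is_answer):
    """Try each context in order and return (answer, index) for the first accepted answer.

    Returns (last_answer, None) when no answer is accepted.
    """
    answer = ""
    for index, context in enumerate(contexts):
        answer = generate(query, context)
        if is_answer(answer):
            return answer, index
    return answer, None


def fake_generate(query, context):
    """Stand-in for the LLM call: answers only when the context is relevant."""
    if "preservation" in context:
        return "Film preservation protects films from deterioration."
    return "The provided context does not contain information."


def fake_is_answer(text):
    """Stand-in for the classification check."""
    return "does not contain" not in text


contexts = ["A page about camera angles.", "A page about film preservation."]
answer, used_index = answer_with_fallback(
    "What is film preservation?", contexts, fake_generate, fake_is_answer
)
print(used_index, answer)
```

Returning `None` as the index when nothing is accepted makes it easy for the caller to fall back to a placeholder image.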


### Ask Questions to the PDF using Gradio UI
This code creates a web-based frontend for the question-answering system, allowing users to enter queries and see the results along with the relevant page image.

In [None]:
import gradio as gr
from PIL import Image as PIL_Image

def gradio_query(query):
    print(query)

    # Retrieve the answer from your QA system
    result, image_path = get_answer(query)
    print("result here")
    print(result)

    try:
        # Attempt to fetch the source image reference
        image = PIL_Image.open(image_path)  # Open the reference image
    except Exception:
        # Use a default image if the reference is not found
        image = PIL_Image.open("./Images/blank.jpg")

    return result, image  # Return both the text answer and the image

gr.close_all()  # Ensure a clean Gradio interface
with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            # Input / Output Components
            query = gr.Textbox(label="Query", info="Enter your query")
            btn_enter = gr.Button("Process")
            answer = gr.Textbox(label="Response", interactive=False)  # Use gr.Textbox for plain text response
            btn_clear = gr.Button("Clear")
        with gr.Column():
            image = gr.Image(label="Reference", visible=True)

    # Button Click Event
    btn_enter.click(fn=gradio_query, inputs=query, outputs=[answer, image])
    btn_clear.click(lambda: ("", "", None), inputs=None, outputs=[query, answer, image])

demo.launch(share=True, debug=True, inbrowser=True)  # Launch the Gradio app


Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://d46de338436f58ee02.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


What is film preservation and why does it matter?
result here
According to the context, film preservation is the act of protecting films from deterioration and restoring decaying film before the images and sounds are lost forever. This is important because movies are more than just entertainment, they are stories that provide clues to understanding who we are, and they can also be great teachers. The context goes on to explain how the National Film Preservation Act was passed in 1998 to help protect films, and how The Film Foundation was created to increase awareness of this country's film heritage and preserve as many films as possible. The context concludes by stating that the goal of film preservation is to ensure that the movies that matter to us all are saved for future generations.

Why is the sky blue? 
result here
I'm sorry, but I am unable to answer the question based on the context you have provided, there is no information about the sky or color.
Why is the sky blue? 
result

### Close the demo

Note: Stop the previous cell to close the locally running Gradio server, then run this cell to free up the port the server was using.

In [None]:
demo.close()

### Cleaning up
To clean up all Google Cloud resources used in this project, you can delete the Google Cloud project you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial.

In [None]:
delete_bucket = False

# Force undeployment of indexes and delete endpoint
my_index_endpoint.delete(force=True)

# Delete indexes
my_index.delete()

if delete_bucket:
    ! gsutil rm -rf {BUCKET_URI}