# **Week-13 MLS: Finsights Grey - RAG for Effective Information Retrieval**

## **Business Use Case**

### **Problem Statement:**

Finsights Grey Inc. is an innovative financial technology firm that specializes in providing advanced analytics and insights for investment management and financial planning. The company handles an extensive collection of 10-K reports from various industry players, which contain detailed information about financial performance, risk factors, market trends, and strategic initiatives. Despite the richness of these documents, Finsights Grey's financial analysts struggle with extracting actionable insights efficiently in a short span due to the manual and labor-intensive nature of the analysis. Going through the document to find the exact information needed at the moment takes too long. This bottleneck hampers the company's ability to deliver timely and accurate recommendations to its clients. To overcome these challenges, Finsights Grey Inc. aims to implement a Retrieval-Augmented Generation (RAG) model to automate the extraction, summarization, and analysis of information from the 10-K reports, thereby enhancing the accuracy and speed of their investment insights.


### **Objective:**

As a Gen AI Data Scientist hired by Finsights Grey Inc., the objective is to develop an advanced RAG-based system to streamline the extraction and analysis of key information from 10-K reports.

The project will involve testing the RAG system on a current business problem. The Financial analysts are asked to research major cloud and AI platforms such as Amazon AWS, Google Cloud, Microsoft Azure, Meta AI, and IBM Watson to determine the most effective platform for this application. The primary goals include improving the efficiency of data extraction. Once the project is deployed, the system will be tested by a financial analyst with the following questions. Accurate text retrieval for these questions will imply the project's success.

### **Questions:**

1. Has the company made any significant acquisitions in the AI space, and how are these acquisitions being integrated into the company's strategy?

2. How much capital has been allocated towards AI research and development?

3. What initiatives has the company implemented to address ethical concerns surrounding AI, such as fairness, accountability, and privacy?

4. How does the company plan to differentiate itself in the AI space relative to competitors?

Each Question must be asked for each of the five companies.
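
The full test set is therefore 5 companies × 4 questions. A minimal sketch of generating that grid is shown below; substituting the company name for "the company" in each question is an assumption of this sketch, and `build_question_grid` is a hypothetical helper.

```python
from itertools import product

# Platforms named in the objective above.
COMPANIES = ["Amazon AWS", "Google Cloud", "Microsoft Azure", "Meta AI", "IBM Watson"]

# The four analyst questions, with "the company" turned into a placeholder
# (this substitution is an assumption of the sketch).
QUESTION_TEMPLATES = [
    "Has {company} made any significant acquisitions in the AI space, "
    "and how are these acquisitions being integrated into the company's strategy?",
    "How much capital has {company} allocated towards AI research and development?",
    "What initiatives has {company} implemented to address ethical concerns "
    "surrounding AI, such as fairness, accountability, and privacy?",
    "How does {company} plan to differentiate itself in the AI space relative to competitors?",
]


def build_question_grid(companies=COMPANIES, templates=QUESTION_TEMPLATES):
    """Return one (company, question) pair per company and question template."""
    return [(c, t.format(company=c)) for c, t in product(companies, templates)]


question_grid = build_question_grid()
print(len(question_grid))  # 5 companies x 4 questions
```

Each pair can later be passed to the retrieval and generation steps so that every question is checked against every company's report.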

By successfully developing this project, we aim to:

Improve the productivity of financial analysts by providing a competent tool.

Provide timely insights to improve client recommendations.

Strengthen Finsights Grey Inc.'s competitive edge by delivering more reliable and faster insights to clients.

# **Creating a FAISS-Based Vector Index for Document Retrieval with Azure Machine Learning**

##### **In this guide, we'll set up an Azure Machine Learning (AzureML) pipeline to process the collected 10-K reports, generate embeddings, and create a FAISS-based vector index for efficient document retrieval.**

## **1. Install and Import Required Libraries**

In [1]:
# Install the Azure Machine Learning SDK and FAISS-related utilities
%pip install azure-ai-ml
%pip install -U 'azureml-rag[faiss,hugging_face]>=0.2.36'

Note: you may need to restart the kernel to use updated packages.
Collecting tiktoken<0.6 (from azureml-rag>=0.2.36->azureml-rag[faiss,hugging_face]>=0.2.36)
  Using cached tiktoken-0.5.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Using cached tiktoken-0.5.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
Installing collected packages: tiktoken
  Attempting uninstall: tiktoken
    Found existing installation: tiktoken 0.6.0
    Uninstalling tiktoken-0.6.0:
      Successfully uninstalled tiktoken-0.6.0
Successfully installed tiktoken-0.5.2
Note: you may need to restart the kernel to use updated packages.


## **2. Configure Azure Machine Learning Workspace**

### Get client for AzureML Workspace

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.

Enter your Workspace details below. Running the `%%writefile` cell will write a `workspace.json` file to the current folder.

In [2]:
# Import necessary AzureML and authentication libraries
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential
from azure.ai.ml import MLClient
from azureml.core import Workspace

In [3]:
# Define workspace configuration (replace with your details)
workspace_config = {
    "subscription_id": "your_subscription_id",  # Replace with your Azure subscription ID
    "resource_group": "your_resource_group",    # Replace with your Azure resource group name
    "workspace_name": "your_workspace_name"     # Replace with your AzureML workspace name
}

In [None]:
%%writefile workspace.json
{
    "subscription_id": "####################",
    "resource_group": "###################",
    "workspace_name": "###############"
}

Overwriting workspace.json
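
Before connecting, it can help to confirm that the placeholders in `workspace.json` were actually replaced. The following is a small sketch using only the standard library; `find_config_problems` and `check_workspace_file` are hypothetical helpers, not part of the Azure SDK.

```python
import json

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def find_config_problems(config: dict) -> list:
    """Return human-readable problems; an empty list means the config looks filled in."""
    problems = []
    for key in REQUIRED_KEYS:
        value = str(config.get(key, "")).strip()
        if not value:
            problems.append(f"{key} is missing or empty")
        elif set(value) == {"#"} or value.startswith("your_"):
            # Matches the "####" and "your_..." placeholders used in this notebook.
            problems.append(f"{key} still holds a placeholder")
    return problems


def check_workspace_file(path="workspace.json"):
    """Load the JSON file and report any problems found in it."""
    with open(path, "r", encoding="utf-8") as fh:
        return find_config_problems(json.load(fh))
```

Calling `check_workspace_file()` after the `%%writefile` cell returns an empty list when all three values have been filled in.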


In [6]:
# Initialize credentials for Azure authentication
try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()



# Initialize the MLClient to connect with AzureML
ml_client = MLClient.from_config(credential=credential, path="workspace.json")



# Create an AzureML Workspace object
ws = Workspace(
    subscription_id=ml_client.subscription_id,
    resource_group=ml_client.resource_group_name,
    workspace_name=ml_client.workspace_name,
)


# Verify the client and workspace details
print(ml_client)

Found the config file in: workspace.json


MLClient(credential=<azure.identity._credentials.default.DefaultAzureCredential object at 0x7fb604621de0>,
         subscription_id=72510a3d-1523-4e16-be26-bd516ff30c38,
         resource_group_name=default_resource_group,
         workspace_name=thireshworkspace)


## **3. Register the Reports Dataset as a Data Asset**

Register the dataset in AzureML for further processing.

Upload the `Dataset-10k.zip` file to Azure Machine Learning Studio before executing the cell below.

In [None]:
# Import libraries for data registration
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
import zipfile
import os

# Path to the ZIP file containing the 10-K reports
zip_file_path = 'Dataset-10k.zip'

# Directory to extract the reports
extract_to_directory = './extracted_dataset_reports'
os.makedirs(extract_to_directory, exist_ok=True)

# Extract the ZIP file containing the reports
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    zip_ref.extractall(extract_to_directory)

# Register the extracted data as a Data asset in AzureML
local_data_path = extract_to_directory
data_asset = Data(
    path=local_data_path,
    type=AssetTypes.URI_FOLDER,  # Registering as a folder URI
    description="Finsights collected reports for embedding generation",
    name="finsights-dataset-reports"
)

# Use the MLClient to register the data asset
ml_client.data.create_or_update(data_asset)
print(f"Data asset '{data_asset.name}' registered successfully.")

Uploading extracted_dataset_reports (7.13 MBs): 100%|██████████| 7130176/7130176 [00:00<00:00, 57091825.93it/s]



Data asset 'finsights-dataset-reports' registered successfully.


## **4. Set Up Azure OpenAI Connection**

### Which Embeddings Model to use?

There are currently two supported Embedding options: OpenAI's `text-embedding-ada-002` embedding model or HuggingFace embedding models. Here are some factors that might influence your decision:

#### OpenAI

OpenAI has [great documentation](https://platform.openai.com/docs/guides/embeddings) on their Embeddings model `text-embedding-ada-002`. It can handle up to 8191 tokens and can be accessed using [Azure OpenAI](https://learn.microsoft.com/azure/cognitive-services/openai/concepts/models#embeddings-models) or OpenAI directly.
If you have an existing Azure OpenAI Instance you can connect it to AzureML; if you don't, AzureML provisions a default one for you called `Default_AzureOpenAI`.
The main limitation when using `text-embedding-ada-002` is the cost and quota available for the model. Otherwise it provides high-quality embeddings across a wide array of text domains while being simple to use.

#### HuggingFace

HuggingFace hosts many different models capable of embedding text into single-dimensional vectors. The [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) ranks the performance of embedding models on a few axes. Not all ranked models can be run locally (e.g. `text-embedding-ada-002` is on the list), though many can, and there is a range of larger and smaller models. When embedding with HuggingFace the model is loaded locally for inference, which may affect your choice of compute resources.

**NOTE:** The default PromptFlow Runtime does not come with HuggingFace model dependencies installed, so indexes created using HuggingFace embeddings will not work in PromptFlow by default. **Pick OpenAI if you want to use PromptFlow.**

### Run the cells under _either_ heading (OpenAI or HuggingFace) to use the respective embedding model

#### OpenAI

We can use the automatically created `Default_AzureOpenAI` connection.

If you would rather use an existing Azure OpenAI connection then change `aoai_connection_name` below.
If you would rather use an existing Azure OpenAI resource, but don't have a connection created, modify `aoai_connection_name` and the details under the `# Create New Connection` code comment, or navigate to the PromptFlow section in your AzureML Workspace and use the Connections create UI flow.

In [8]:
# # Azure OpenAI credentials and the id of the deployed chat model are stored as
# # key value pairs in a json file

# with open('config.json', 'r') as az_creds:
#     data = az_creds.read()

# # Credentials to authenticate to the personalized Open AI model server
# import json
# creds = json.loads(data)

In [1]:
# from azureml.rag.utils.connections import get_connection_by_name_v2, create_connection_v2

# # Define the connection name for Azure OpenAI
# aoai_connection_name = "Custom_AzureOpenAI_Connection"

# try:
#     # Retrieve an existing connection by name
#     aoai_connection = get_connection_by_name_v2(ws, aoai_connection_name)
#     aoai_connection_id = aoai_connection["id"]
# except Exception:
#     # If the connection doesn't exist, create a new one
#     target = creds["AZURE_OPENAI_ENDPOINT"]  # Replace with your Azure OpenAI endpoint
#     key = creds["AZURE_OPENAI_KEY"]          # Replace with your Azure OpenAI API key
#     api_version = creds["AZURE_OPENAI_APIVERSION"]    # Replace with the appropriate API version

#     aoai_connection = create_connection_v2(
#         workspace=ws,
#         name=aoai_connection_name,
#         category="AzureOpenAI",
#         target=target,
#         auth_type="ApiKey",
#         credentials={"key": key},
#         metadata={"ApiType": "azure", "ApiVersion": api_version},
#     )

#     aoai_connection_id = aoai_connection["id"]

# print(f"Azure OpenAI connection created or retrieved successfully: {aoai_connection_id}")

Now that your Workspace has a connection to Azure OpenAI, we will make sure the `text-embedding-ada-002` model has been deployed and is ready for inference. This cell will fail if there is no deployment for the embeddings model; [follow these instructions](https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#deploy-a-model) to deploy a model with Azure OpenAI.

In [10]:
# from azureml.rag.utils.deployment import infer_deployment

# aoai_embedding_model_name = "text-embedding-ada-002"
# try:
#     aoai_embedding_deployment_name = infer_deployment(
#         aoai_connection, aoai_embedding_model_name
#     )
#     print(
#         f"Deployment name in AOAI workspace for model '{aoai_embedding_model_name}' is '{aoai_embedding_deployment_name}'"
#     )
# except Exception as e:
#     print(f"Deployment name in AOAI workspace for model '{aoai_embedding_model_name}' is not found.")
#     print(
#         f"Please create a deployment for this model by following the deploy instructions on the resource page for '{aoai_connection['properties']['target']}' in Azure Portal."
#     )

Finally we will combine the deployment and model information into a uri form which the AzureML embeddings components expect as input.

In [11]:
# embeddings_model_uri = f"azure_open_ai://deployment/{aoai_embedding_deployment_name}/model/{aoai_embedding_model_name}"

#### HuggingFace

AzureML's default model from HuggingFace is `all-mpnet-base-v2`, which can be run on most laptops. Any `sentence-transformers` model should be supported; you can learn more about `sentence-transformers` [here](https://huggingface.co/sentence-transformers).

In [12]:
embeddings_model_uri = "hugging_face://model/sentence-transformers/all-mpnet-base-v2"
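
The two URI forms used in this section follow a simple pattern, which the sketch below makes explicit. The helper names are hypothetical; only the URI layouts themselves are taken from the cells above.

```python
def aoai_embeddings_uri(deployment_name: str, model_name: str) -> str:
    """Build the Azure OpenAI form: azure_open_ai://deployment/<d>/model/<m>."""
    return f"azure_open_ai://deployment/{deployment_name}/model/{model_name}"


def hugging_face_embeddings_uri(model_id: str) -> str:
    """Build the HuggingFace form: hugging_face://model/<org>/<model>."""
    return f"hugging_face://model/{model_id}"


def embeddings_kind(uri: str) -> str:
    """Return the scheme part of an embeddings URI, e.g. 'hugging_face'."""
    return uri.split("://", 1)[0]


print(hugging_face_embeddings_uri("sentence-transformers/all-mpnet-base-v2"))
```

Keeping the construction in one place makes it harder to mix up the deployment name and the model name when switching between the two embedding options.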

## **5. Setup Pipeline to process data into Index**

AzureML [Pipelines](https://learn.microsoft.com/azure/machine-learning/concept-ml-pipelines?view=azureml-api-2) connect together multiple [Components](https://learn.microsoft.com/azure/machine-learning/concept-component?view=azureml-api-2). Each Component defines inputs, code that consumes the inputs and outputs produced from the code. Pipelines themselves can have inputs, and outputs produced by connecting together individual sub Components.
To process your data for embedding and indexing we will chain together multiple components each performing their own step of the workflow.

The Components are published to a [Registry](https://learn.microsoft.com/azure/machine-learning/how-to-manage-registries?view=azureml-api-2&tabs=cli), `azureml`, which you should have access to by default and which can be accessed from any Workspace.
In the below cell we get the Component Definitions from the `azureml` registry.

### **Define Pipeline Components**

In [13]:
# Import the MLClient to access the AzureML registry
ml_registry = MLClient(credential=credential, registry_name="azureml")

# Retrieve components for processing data, generating embeddings, and creating the FAISS index
crack_and_chunk_component = ml_registry.components.get(
    "llm_rag_crack_and_chunk", label="latest"
)
generate_embeddings_component = ml_registry.components.get(
    "llm_rag_generate_embeddings", label="latest"
)
create_faiss_index_component = ml_registry.components.get(
    "llm_rag_create_faiss_index", label="latest"
)
register_mlindex_component = ml_registry.components.get(
    "llm_rag_register_mlindex_asset", label="latest"
)


Each Component has documentation which provides an overall description of the Component's purpose and each of its inputs and outputs.
For example, we can understand what `crack_and_chunk` does by inspecting the Component definition.

In [14]:
print(crack_and_chunk_component)

$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
name: llm_rag_crack_and_chunk
version: 0.0.74
display_name: LLM - Crack and Chunk Data
description: 'Creates chunks no larger than `chunk_size` from `input_data`, extracted
  document titles are prepended to each chunk


  LLM models have token limits for the prompts passed to them, this is a limiting
  factor at embedding time and even more limiting at prompt completion time as only
  so much context can be passed along with instructions to the LLM and user queries.

  Chunking allows splitting source data of various formats into small but coherent
  snippets of information which can be ''packed'' into LLM prompts when asking for
  answers to user query related to the source documents.


  Supported formats: md, txt, html/htm, pdf, ppt(x), doc(x), xls(x), py

  '
tags:
  Preview: ''
type: command
inputs:
  input_data:
    type: uri_folder
    description: Uri Folder containing files to be chunked.
    op
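
The chunking behaviour described in the component documentation above can be illustrated with a much simplified sketch. It counts whitespace-separated words rather than model tokens and ignores file formats, so it is not the component's implementation; `chunk_words` and the sample text are hypothetical.

```python
def chunk_words(text: str, title: str, chunk_size: int = 1024) -> list:
    """Split text into chunks of at most chunk_size words, each prefixed with the title."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    chunks = []
    for start in range(0, len(words), chunk_size):
        body = " ".join(words[start:start + chunk_size])
        # Prepend the document title so each chunk stays traceable to its source.
        chunks.append(f"Title: {title}\n{body}")
    return chunks


sample = "Revenue grew in cloud services while AI research spending increased year over year"
for chunk in chunk_words(sample, title="example-10-k.pdf", chunk_size=5):
    print(repr(chunk))
```

A smaller `chunk_size` gives more precise retrieval hits but less surrounding context per hit, which is the trade-off to keep in mind when setting the pipeline's `chunk_size` input.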

### **Build the AzureML Pipeline**

Below, a Pipeline is built by defining a Python function which chains together the inputs and outputs of the above components. Arguments to the function are inputs to the Pipeline itself, and the return value is a dictionary defining the outputs of the Pipeline.

In [16]:
from azure.ai.ml import Input, Output
from azure.ai.ml.dsl import pipeline
from azure.ai.ml.entities._job.pipeline._io import PipelineInput
from typing import Optional


# Utility function for automatic compute configuration
def use_automatic_compute(component, instance_count=1, instance_type="Standard_NC4as_T4_v3"):
    """Configure a component to use automatic compute."""
    component.set_resources(
        instance_count=instance_count,
        instance_type=instance_type,
        properties={"compute_specification": {"automatic": True}},
    )
    return component


# Utility function to check if optional pipeline inputs are provided
def optional_pipeline_input_provided(input: Optional[PipelineInput]):
    """Check if optional pipeline inputs are provided."""
    return input is not None and input._data is not None


@pipeline(default_compute="serverless")
def finsights_to_faiss(
    data_asset_path: str,
    embeddings_model: str,
    asset_name: str,
    chunk_size: int = 1024,
    data_source_glob: str = None,
    document_path_replacement_regex: str = None,
    aoai_connection_id=None,
    embeddings_container=None,
):
    """Pipeline to process finsights reports and create a FAISS vector index."""
    
    # Step 1: Chunk data into smaller pieces
    crack_and_chunk = crack_and_chunk_component(
        input_data=Input(type="uri_folder", path=data_asset_path),  # Input data asset
        input_glob=data_source_glob,
        chunk_size=chunk_size,
        document_path_replacement_regex=document_path_replacement_regex,
    )
    use_automatic_compute(crack_and_chunk)  # Apply compute configuration

    # Step 2: Generate embeddings for the data chunks
    generate_embeddings = generate_embeddings_component(
        chunks_source=crack_and_chunk.outputs.output_chunks,
        embeddings_container=embeddings_container,
        embeddings_model=embeddings_model,
    )
    use_automatic_compute(generate_embeddings)  # Apply compute configuration
    
    # Optional: Include Azure OpenAI connection ID
    if optional_pipeline_input_provided(aoai_connection_id):
        generate_embeddings.environment_variables[
            "AZUREML_WORKSPACE_CONNECTION_ID_AOAI"
        ] = aoai_connection_id
    
    if optional_pipeline_input_provided(embeddings_container):
        generate_embeddings.outputs.embeddings = Output(
            type="uri_folder", path=f"{embeddings_container.path}/{{name}}"
        )

    # Step 3: Create a FAISS vector index from embeddings
    create_faiss_index = create_faiss_index_component(
        embeddings=generate_embeddings.outputs.embeddings,
    )
    use_automatic_compute(create_faiss_index)  # Apply compute configuration

    # Step 4: Register the FAISS index as an MLIndex asset
    register_mlindex = register_mlindex_component(
        storage_uri=create_faiss_index.outputs.index, 
        asset_name=asset_name
    )
    use_automatic_compute(register_mlindex) # Apply compute configuration
    
    return {
        "mlindex_asset_uri": create_faiss_index.outputs.index,
        "mlindex_asset_id": register_mlindex.outputs.asset_id,
    }

## **6. Submit the Pipeline**

This section covers how to instantiate the AzureML pipeline, configure its inputs, and submit it for execution. The pipeline processes the data, generates embeddings, creates a FAISS-based vector index, and registers the output as an AzureML asset.

In [17]:
# Define the asset name and data source glob pattern
asset_name = "finsights_faiss_index"  # Name for the FAISS index asset
data_source_glob = "**/*.pdf"  # Pattern to match input data files

In [18]:
# Get the input data asset path from the workspace datastore
datastore_path = ml_client.data.get("finsights-dataset-reports", version="1").path
print(f"Datastore path: {datastore_path}")

Datastore path: azureml://subscriptions/72510a3d-1523-4e16-be26-bd516ff30c38/resourcegroups/default_resource_group/workspaces/thireshworkspace/datastores/workspaceblobstore/paths/LocalUpload/d1b3eb074c8f06ba2272ac8a49b6ec0a/extracted_dataset_reports/


In [20]:
# Create the pipeline job by calling the defined pipeline function
pipeline_job = finsights_to_faiss(
    embeddings_model=embeddings_model_uri,  # URI of the embeddings model
    #aoai_connection_id=aoai_connection_id,  # Connection ID for Azure OpenAI (optional)
    embeddings_container=Input(
        type="uri_folder",
        path=f"azureml://datastores/workspaceblobstore/paths/embeddings/{asset_name}"
    ),  # Path for storing generated embeddings
    data_asset_path=Input(
        type="uri_folder",
        path=datastore_path
    ),  # Input data asset path
    chunk_size=1024,  # Size of chunks for processing
    data_source_glob=data_source_glob,  # Glob pattern for input files
    asset_name=asset_name  # Name of the MLIndex asset
)


In [21]:
# Add properties for better indexing and artifact tracking in the AzureML UI
pipeline_job.properties["azureml.mlIndexAssetName"] = asset_name
pipeline_job.properties["azureml.mlIndexAssetKind"] = "faiss"
pipeline_job.properties["azureml.mlIndexAssetSource"] = "Data asset"

In [22]:
# Submit the pipeline job for execution
submitted_pipeline = ml_client.jobs.create_or_update(pipeline_job)
print(f"Pipeline submitted successfully! Job ID: {submitted_pipeline.id}")

Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
pathOnCompute is not a known attribute

Pipeline submitted successfully! Job ID: /subscriptions/72510a3d-1523-4e16-be26-bd516ff30c38/resourceGroups/default_resource_group/providers/Microsoft.MachineLearningServices/workspaces/thireshworkspace/jobs/upbeat_plow_phfttszdps


In [23]:
# Stream the pipeline job logs for real-time monitoring
ml_client.jobs.stream(submitted_pipeline.name)

RunId: upbeat_plow_phfttszdps
Web View: https://ml.azure.com/runs/upbeat_plow_phfttszdps?wsid=/subscriptions/72510a3d-1523-4e16-be26-bd516ff30c38/resourcegroups/default_resource_group/workspaces/thireshworkspace

Streaming logs/azureml/executionlogs.txt

[2024-11-27 04:55:21Z] Submitting 1 runs, first five are: f961e282:deda3541-d530-46e7-b33a-f0deb0b93a8d
[2024-11-27 05:04:02Z] Completing processing run id deda3541-d530-46e7-b33a-f0deb0b93a8d.
[2024-11-27 05:04:03Z] Submitting 1 runs, first five are: d9e298ba:e882655c-fa7f-4780-aad5-1fbde32d3f25
[2024-11-27 05:08:52Z] Completing processing run id e882655c-fa7f-4780-aad5-1fbde32d3f25.
[2024-11-27 05:08:53Z] Submitting 1 runs, first five are: 8dcba5ad:08a9aad7-5853-4866-948b-d6ff66ee6a08
[2024-11-27 05:09:56Z] Completing processing run id 08a9aad7-5853-4866-948b-d6ff66ee6a08.
[2024-11-27 05:09:56Z] Submitting 1 runs, first five are: 8a6918dd:5600724a-0f60-41f5-9b3f-ed238a9df903
[2024-11-27 05:10:53Z] Completing processing run id 5600724

# **Information Retrieval and Response Generation Using LangChain-FAISS and Azure OpenAI**

This section focuses on retrieving information from pre-indexed data using FAISS (Facebook AI Similarity Search) and generating responses with Azure OpenAI.

## **1. Installing Required Libraries**

In [25]:
# Install the required LangChain and HuggingFace libraries
%pip install -U langchain-community
%pip install -U langchain-huggingface



## **2. Setting Up Data Retrieval**

### **Downloading and Setting Up FAISS Index Assets**

This step downloads the FAISS index and its associated metadata from Azure ML. FAISS is a library for similarity search, and here the pre-built index registered by the pipeline will serve as the backbone for retrieving relevant documents based on user queries.

In [28]:
# Import necessary utilities for artifact retrieval
import azure.ai.ml._artifacts._artifact_utilities as artifact_utils

# Retrieve the path to the latest FAISS index asset from Azure ML
data_info = ml_client.data.get(name=asset_name, label="latest").path

# Download the FAISS index asset to a local directory
artifact_utils.download_artifact_from_aml_uri(
    uri=data_info,
    destination="./finsightsfaissindexasset/",
    datastore_operation=ml_client.datastores
)

# The FAISS index asset will be used for vector-based similarity search.

'./finsightsfaissindexasset/'

## **3. Loading the FAISS Index**

### **Loading the FAISS Index and Preparing the Retriever**

We load the FAISS index from the downloaded files and connect it to an embedding model. This embedding model ensures that queries are converted into vector space to match the stored documents effectively.

In [29]:
from langchain_community.vectorstores import FAISS  # provided by the langchain-community package installed above
from langchain_huggingface import HuggingFaceEmbeddings

# Path to the directory containing FAISS index files
index_folder_path = "./finsightsfaissindexasset/"

# Specify the embedding model used during FAISS index creation
embedding_model_name = "sentence-transformers/all-mpnet-base-v2"
embedding_model = HuggingFaceEmbeddings(model_name=embedding_model_name)

# Load the FAISS index and associate it with the embedding model
retriever = FAISS.load_local(
    folder_path=index_folder_path, 
    embeddings=embedding_model, 
    allow_dangerous_deserialization=True  # The index metadata is unpickled on load; only enable this for index files you trust
)

# The retriever is now ready to perform similarity searches.
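
To make the idea of matching a query against stored vectors concrete, here is a toy brute-force nearest-neighbour search in plain Python. It assumes Euclidean distance and uses made-up three-dimensional vectors; the real index holds much larger embedding vectors, and the distance metric of the registered index is not stated in this notebook.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k stored vectors closest to the query."""
    ranked = sorted(range(len(doc_vecs)), key=lambda i: l2_distance(query_vec, doc_vecs[i]))
    return ranked[:k]


# Toy "embeddings" for four chunks (illustrative values only).
doc_vectors = [
    [0.9, 0.1, 0.0],  # chunk 0
    [0.0, 1.0, 0.0],  # chunk 1
    [0.8, 0.2, 0.1],  # chunk 2
    [0.0, 0.0, 1.0],  # chunk 3
]
query_vector = [1.0, 0.0, 0.0]

nearest = top_k(query_vector, doc_vectors, k=2)
print(nearest)  # [0, 2]
```

FAISS performs the same kind of lookup with optimized index structures, which is what keeps retrieval fast as the number of chunks grows.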

## **4. Performing a Similarity Search**

In [30]:
# Define a query to test the retriever
query = "How is the company integrating AI across their various business units"

# Retrieve the top 3 most relevant documents
results = retriever.similarity_search(query, k=3)

# Display the results
for doc in results:
    print(f"Document: {doc.page_content}\nMetadata: {doc.metadata}")

# This step helps validate that the retriever is functioning as expected.

Document: Title: google-10-k-2023.pdf•Collaboration Tools:  Google Workspace and Duet AI in Google Workspace provide easy-to-use, secure 
communication and collaboration tools, including apps like Gmail, Docs, Drive, Calendar, Meet, and more. 
These tools enable secure hybrid and remote work, boosting productivity and collaboration. AI has been used 
in Google Workspace for years to improve grammar, efficiency, security, and more with features like Smart 
Reply, Smart Compose, and malware and phishing protection in Gmail.  Duet AI in Google Workspace helps 
users write, organize, visualize, accelerate workflows, and have richer meetings.
•AI Platform and Duet AI for Google Cloud:  Our Vertex AI platform gives developers the ability to train, tune, 
augment, and deploy applications using generative AI models and services such as Enterprise Search and 
Conversations. Duet AI for Google Cloud provides pre-packaged AI agents that assist developers to write, test, 
document, and operate sof

## **5. Creating the System and User Prompt Templates**

Prompts guide the Azure OpenAI model to generate accurate responses. Here, we define two parts:

1. The system message describing the assistant's role.
2. A user message template including context and the question.

In [31]:
# Define the system prompt for the Azure OpenAI model
qna_system_message = """
You are an assistant to a Financial Analyst. Your task is to summarize and provide relevant information to the financial analyst's question based on the provided context.

User input will include the necessary context for you to answer their questions. This context will begin with the token: ###Context.
The context contains references to specific portions of documents relevant to the user's query, along with page number from the report.
The source for the context will begin with the token ###Page

When crafting your response:
1. Select only context relevant to answer the question.
2. Include the source links in your response.
3. User questions will begin with the token: ###Question.
4. If the question is irrelevant or if the context is empty - "Sorry, this is out of my knowledge base"

Please adhere to the following guidelines:
- Your response should only be about the question asked and nothing else.
- Answer only using the context provided.
- Do not mention anything about the context in your final answer.
- If the answer is not found in the context, it is very very important for you to respond with "Sorry, this is out of my knowledge base"
- If NO CONTEXT is provided, it is very important for you to respond with "Sorry, this is out of my knowledge base"

Here is an example of how to structure your response:

Answer:
[Answer]

Page:
[Page number]
"""



# Define the user message template
qna_user_message_template = """
###Context
Here are some documents and their page number that are relevant to the question mentioned below.
{context}

###Question
{question}
"""

## **6. Generating the Response**

In [33]:
# Install the required packages
!pip install openai==1.2.0 tiktoken==0.6 session-info --quiet

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


In [34]:
# Import required libraries
import json
import tiktoken
import pandas as pd
from openai import AzureOpenAI

In [35]:
# Load Azure OpenAI credentials
with open('config.json', 'r') as az_creds:
    data = az_creds.read()

creds = json.loads(data)

In [36]:
# Initialize the Azure OpenAI client
client = AzureOpenAI(
    azure_endpoint=creds["AZURE_OPENAI_ENDPOINT"],
    api_key=creds["AZURE_OPENAI_KEY"],
    api_version=creds["AZURE_OPENAI_APIVERSION"]
)

In [37]:
# Define a sample user input question
user_input = "How much is the company investing in research and development, and what are the key areas of focus for innovation?"

In [38]:
# Retrieve relevant document chunks
relevant_document_chunks = retriever.similarity_search(user_input, k=3)
context_list = [d.page_content for d in relevant_document_chunks]

# Combine document chunks into a single context
context_for_query = ". ".join(context_list)
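
The system prompt tells the model that each source is introduced by a `###Page` token, while the cell above joins only the chunk text. One possible way to expose page numbers to the model is sketched below. It assumes a page number is available for each chunk (for example from `doc.metadata`, whose exact keys are not shown in this notebook), so the hypothetical `build_context` helper takes plain `(text, page)` pairs.

```python
def build_context(chunks):
    """Join (text, page) pairs into one context string with ###Page markers.

    `page` may be None when the page number is unknown.
    """
    parts = []
    for text, page in chunks:
        source = page if page is not None else "unknown"
        parts.append(f"{text.strip()}\n###Page\n{source}")
    return "\n\n".join(parts)


example_context = build_context([
    ("Research and development expenses increased year over year.", 23),
    ("Key focus areas include cloud engineering and AI.", None),
])
print(example_context)
```

The resulting string could be passed as `context` to `qna_user_message_template.format(...)` in place of the plain joined text.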

In [39]:
# Compose the prompt
prompt = [
    {'role': 'system', 'content': qna_system_message},
    {'role': 'user', 'content': qna_user_message_template.format(
         context=context_for_query,
         question=user_input
        )
    }
]

# Generate the response using Azure OpenAI
try:
    response = client.chat.completions.create(
        model=creds["CHATGPT_MODEL"],
        messages=prompt,
        temperature=0
    )

    # Extract and print the model's response
    response = response.choices[0].message.content.strip()
except Exception as e:
    response = f'Sorry, I encountered the following error: \n {e}'


print(response)

# The model uses the context to answer the user's query.

The company is investing $27,195 million in research and development for the fiscal year 2023, which is an increase of $2.7 billion or 11% compared to the previous year. The key areas of focus for innovation include cloud engineering, AI, digital work and life experiences, devices, and operating systems.

Page:
23, 33
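
Because the system prompt requests an answer followed by a `Page:` section, the reply can be split into structured fields for later evaluation. The following is a minimal sketch using only the standard library; `parse_response` is a hypothetical helper and assumes the model follows the requested layout closely enough for a simple split to work.

```python
import re

FALLBACK = "Sorry, this is out of my knowledge base"


def parse_response(text: str) -> dict:
    """Split a model reply into answer text and a list of page numbers."""
    if FALLBACK.lower() in text.lower():
        return {"answer": None, "pages": [], "out_of_scope": True}

    # Treat everything after the last "Page:" marker as the page list.
    answer_part, sep, page_part = text.rpartition("Page:")
    if not sep:  # no page section present
        answer_part, page_part = text, ""

    answer = answer_part.strip()
    if answer.startswith("Answer:"):
        answer = answer[len("Answer:"):].strip()

    pages = [int(p) for p in re.findall(r"\d+", page_part)]
    return {"answer": answer, "pages": pages, "out_of_scope": False}


reply = "R&D spending increased compared to the previous year.\n\nPage:\n23, 33"
print(parse_response(reply)["pages"])  # [23, 33]
```

Collecting the parsed fields for every company and question makes it easier to review retrieval accuracy in a table rather than by reading raw replies.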
