# Building a RAG application with Azure Cosmos DB for NoSQL
In this notebook, we'll build a RAG (retrieval-augmented generation) application step by step, using Azure Cosmos DB for NoSQL as the knowledge base.
The tutorial is structured as follows:
1. Pre-setup: provision Azure Cosmos DB and Azure OpenAI resources
2. Get the Azure OpenAI and Azure Cosmos DB account keys and endpoints
3. Install the required Python libraries
4. Load sample data (about Azure services) into the notebook
5. Generate embeddings using an OpenAI model and update the data with the embeddings
6. Create an Azure Cosmos DB database
7. Create an Azure Cosmos DB container with a vector embedding policy and an indexing policy
8. Take a user question in natural language and run a vector search over the data stored in Cosmos DB to find the most relevant items to pass to the LLM
9. Use the OpenAI GPT-3.5 model to generate responses to the user's questions based on the retrieved data


# Create an Azure Cosmos DB for NoSQL resource

Let's start by creating an Azure Cosmos DB for NoSQL resource (a Cosmos DB account) by following [this section in the Quickstart guide](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/quickstart-portal#create-account).


While creating the account, it is recommended that you select the **"Serverless" Capacity Mode** for this tutorial.


## Get Cosmos DB Account Key and Endpoint
Once the account is provisioned, open it and navigate to the **"Settings > Keys"** section in the left-side panel. Make a note of the **Primary Key and the URI**; these will be used later to connect to the Cosmos DB account through the Python client.
Store the Primary Key and URI in a .env file.

# Provision an Azure OpenAI resource
Next, let's set up our Azure OpenAI resource. Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at [https://aka.ms/oai/access](https://aka.ms/oai/access).

Once you have access, complete the following steps:
1. Create an Azure OpenAI resource [following this quickstart](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource?pivots=-eb-portal)
2. Deploy an embeddings model. For more information on embeddings, refer to [this article](https://learn.microsoft.com/azure/ai-services/openai/how-to/embeddings)
3. Deploy a completions model. For more information on completions, refer to [this article](https://learn.microsoft.com/azure/ai-services/openai/how-to/completions)
4. Make a note of the endpoint and key for your Azure OpenAI resource
5. Make a note of the **deployment names** of the embedding and completion models.

Store the endpoint, key, and deployment names in the .env file.


# Install the required libraries

In [None]:
! pip install numpy
! pip install openai
! pip install python-dotenv
! pip install azure-core
! pip install azure-cosmos

# Necessary imports

In [None]:
import json
import datetime
import time
import urllib 

from azure.core.exceptions import AzureError
from azure.core.credentials import AzureKeyCredential

#Cosmos DB imports
from azure.cosmos import CosmosClient
from azure.cosmos.aio import CosmosClient as CosmosAsyncClient
from azure.cosmos import PartitionKey, exceptions

from openai import AzureOpenAI
from dotenv import load_dotenv

# Load Keys, Endpoints, and other variables from the .env file

In [None]:
from dotenv import dotenv_values

# Specify the name of your .env file
env_name = "example.env"  # follows the example.env template; change this to your own .env file name
config = dotenv_values(env_name)

OPENAI_API_KEY = config['openai_api_key']
OPENAI_API_ENDPOINT = config['openai_api_endpoint']
OPENAI_API_VERSION = config['openai_api_version'] # at the time of authoring, the api version is 2024-02-01
COMPLETIONS_MODEL_DEPLOYMENT_NAME = config['completions_model_deployment_name']
EMBEDDING_MODEL_DEPLOYMENT_NAME = config['embedding_model_deployment_name']
COSMOSDB_NOSQL_ACCOUNT_KEY = config['cosmosdb_nosql_account_key']
COSMOSDB_NOSQL_ACCOUNT_ENDPOINT = config['cosmosdb_nosql_account_endpoint']
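
A missing or empty entry in the .env file only shows up later as a `KeyError` or an authentication failure, so it can help to check the configuration up front. The following is a minimal sketch: the helper name `find_missing_keys` is an assumption made for illustration, and `example_config` stands in for the dictionary returned by `dotenv_values`.

```python
# Keys read from the .env file in the cell above.
REQUIRED_KEYS = [
    "openai_api_key",
    "openai_api_endpoint",
    "openai_api_version",
    "completions_model_deployment_name",
    "embedding_model_deployment_name",
    "cosmosdb_nosql_account_key",
    "cosmosdb_nosql_account_endpoint",
]


def find_missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in a config mapping."""
    return [key for key in required if not config.get(key)]


# Stand-in for the dict returned by dotenv_values(); all values are placeholders.
example_config = {key: "placeholder" for key in REQUIRED_KEYS}
example_config["openai_api_version"] = ""  # simulate a value left blank

print(find_missing_keys(example_config))
```

In the notebook, you could call `find_missing_keys(config)` right after loading the file and stop if the returned list is not empty.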

# Instantiate the Azure OpenAI client

In [None]:
AOAI_client = AzureOpenAI(api_key=OPENAI_API_KEY, azure_endpoint=OPENAI_API_ENDPOINT, api_version=OPENAI_API_VERSION,)

# Generate embeddings
We'll use the deployed embeddings model to generate the embeddings.

In [None]:
def generate_embeddings(text):
    '''
    Generate embeddings from string of text.
    This will be used to vectorize data and user input for interactions with Azure OpenAI.
    '''
    response = AOAI_client.embeddings.create(input=text, model=EMBEDDING_MODEL_DEPLOYMENT_NAME)
    embeddings = response.model_dump()
    time.sleep(0.5)  # brief pause between calls to reduce the chance of hitting rate limits
    return embeddings['data'][0]['embedding']
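
A fixed pause lowers the request rate but does not recover from a call that fails anyway. One common alternative is to retry with exponentially growing delays. The sketch below is a generic illustration, not part of the original notebook: `call_with_backoff`, its parameters, and the `flaky_embedding` stand-in are all hypothetical names, and the broad `except Exception` would normally be narrowed to the rate-limit error raised by your client.

```python
import time


def call_with_backoff(func, *args, max_attempts=5, base_delay=0.5,
                      sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying with exponentially growing delays.

    Catching Exception is deliberately broad for this sketch; in practice you
    would likely narrow it to the transient errors you expect.
    """
    for attempt in range(max_attempts):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...


# Demo with a stand-in function that fails twice before succeeding.
attempts = {"count": 0}


def flaky_embedding(text):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated transient failure")
    return [0.1, 0.2, 0.3]


# Record the delays instead of actually sleeping, so the demo runs instantly.
recorded_delays = []
vector = call_with_backoff(flaky_embedding, "hello", sleep=recorded_delays.append)
print(vector, recorded_delays)
```

With the real function, the call would look like `call_with_backoff(generate_embeddings, "some text")`.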

# Load the data with embeddings or generate embeddings
We have a sample data file with embeddings but you can generate the embeddings afresh before uploading the data.

In [None]:
# Load text-sample_w_embeddings.json, which has pre-computed embeddings
data_path = "../../DataSet/AzureServices/text-sample_w_embeddings.json"

# OR load text-sample.json; embeddings then need to be generated with the loop below.
# data_path = "../../DataSet/AzureServices/text-sample.json"

# The with statement closes the file even if json.load raises
with open(data_path, mode="r") as data_file:
    data = json.load(data_file)

In [None]:
# Take a peek at one data item
print(json.dumps(data[0], indent=2))

In [None]:
# Generate embeddings for title and content fields
n = 0
for item in data:
    n+=1
    item['id'] = str(n)
    title = item['title']
    content = item['content']
    title_embeddings = generate_embeddings(title)
    content_embeddings = generate_embeddings(content)
    item['titleVector'] = title_embeddings
    item['contentVector'] = content_embeddings
    item['@search.action'] = 'upload'
    print("Creating embeddings for item:", n, "/" ,len(data), end='\r')


In [None]:
# Save the data with embeddings to text-sample_w_embeddings.json
with open("../../DataSet/AzureServices/text-sample_w_embeddings.json", "w") as f:
    json.dump(data, f)

# Connect and setup Cosmos DB for NoSQL
Now that we have the data with embeddings ready, we need to upload it to an Azure Cosmos DB container with vector search capability. Vector search is currently supported only in new containers, so we need to create a new container with a vector embedding policy and an indexing policy.

## Set up the connection

In [None]:
cosmos_client = CosmosClient(url=COSMOSDB_NOSQL_ACCOUNT_ENDPOINT, credential=COSMOSDB_NOSQL_ACCOUNT_KEY)

## Create a new database or use existing one

In [None]:
# Create the database (or reuse it if it already exists)
DATABASE_NAME = "vector-nosql-db"
db = cosmos_client.create_database_if_not_exists(
    id=DATABASE_NAME
)
properties = db.read()
print(json.dumps(properties))

## Author the vector embedding policy
The vector embedding policy provides the information needed for vector search queries:
* "path": the property that contains the vector
* "dataType": the type of the vector's elements (default Float32)
* "dimensions": the length of each vector in the path (default 1536)
* "distanceFunction": the metric used to compute distance/similarity (default Cosine)

In [None]:
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path":"/titleVector",
            "dataType":"float32",
            "distanceFunction":"dotproduct",
            "dimensions":1536
        },
        {
            "path":"/contentVector",
            "dataType":"float32",
            "distanceFunction":"cosine",
            "dimensions":1536
        }
    ]
}
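
To make the "distanceFunction" choice more concrete, here is a small, self-contained sketch of the three metrics Cosmos DB offers (cosine, dot product, and Euclidean), written in plain Python. It only illustrates the math on toy two-dimensional vectors; it is not how the service computes scores internally.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the normalized vectors; depends only on direction."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


def euclidean_distance(a, b):
    """Straight-line distance; 0 means the vectors are identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 1.0]  # orthogonal to a

for name, other in (("b", b), ("c", c)):
    print(name,
          "cosine:", cosine_similarity(a, other),
          "dot:", dot_product(a, other),
          "euclidean:", euclidean_distance(a, other))
```

Cosine treats `a` and `b` as a perfect match because they point the same way, while the dot product and Euclidean distance are also affected by vector length.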

## Add vector indexes to indexing policy

In [None]:
indexing_policy = {
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        }
    ],
    "vectorIndexes": [
        {"path": "/titleVector",
         "type": "quantizedFlat"
        },
        {"path": "/contentVector",
         "type": "quantizedFlat"
        }
    ]
}
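
Each path listed under "vectorIndexes" is expected to match a path declared in the vector embedding policy, and a typo in either place is easy to miss. The following sketch checks the two policies against each other before the container is created. The helper name `unmatched_index_paths` and the `/summaryVector` path are hypothetical, used only for illustration.

```python
def unmatched_index_paths(vector_embedding_policy, indexing_policy):
    """Return vector index paths that have no entry in the embedding policy."""
    embedded_paths = {
        entry["path"] for entry in vector_embedding_policy.get("vectorEmbeddings", [])
    }
    return [
        index["path"]
        for index in indexing_policy.get("vectorIndexes", [])
        if index["path"] not in embedded_paths
    ]


# Toy policies: /summaryVector is indexed but never declared as an embedding.
example_embedding_policy = {
    "vectorEmbeddings": [
        {"path": "/contentVector", "dataType": "float32",
         "distanceFunction": "cosine", "dimensions": 1536},
    ]
}
example_indexing_policy = {
    "vectorIndexes": [
        {"path": "/contentVector", "type": "quantizedFlat"},
        {"path": "/summaryVector", "type": "quantizedFlat"},
    ]
}

print(unmatched_index_paths(example_embedding_policy, example_indexing_policy))
```

In the notebook, `unmatched_index_paths(vector_embedding_policy, indexing_policy)` should return an empty list for the policies defined above.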


## Create container with the embedding and indexing policy

In [None]:
CONTAINER_NAME = "vector-nosql-cont"
try:
    container = db.create_container_if_not_exists(
        id=CONTAINER_NAME,
        partition_key=PartitionKey(path='/id', kind='Hash'),
        indexing_policy=indexing_policy,
        vector_embedding_policy=vector_embedding_policy)

    # Use the container name here; `id` on its own is Python's built-in function
    print('Container with id \'{0}\' is ready'.format(CONTAINER_NAME))

except exceptions.CosmosResourceExistsError:
    print('A container with id \'{0}\' already exists'.format(CONTAINER_NAME))

In [None]:
container = db.get_container_client(CONTAINER_NAME)

## Upload data to the container
The Azure Cosmos DB Python SDK does not currently support bulk inserts, so we'll insert the items sequentially.

In [None]:
with open('../../DataSet/AzureServices/text-sample_w_embeddings.json') as f:
    data = json.load(f)

container_client = db.get_container_client(CONTAINER_NAME)

for item in data:
    print("writing item ", item['id'])
    container_client.upsert_item(item)
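
Because the items are written one at a time, a single failure stops the whole loop. One possible variation, sketched below, keeps going and collects the failures so they can be inspected or retried afterwards. `upsert_all` and `FakeContainer` are hypothetical names; the fake container only mimics the `upsert_item` method so the example runs without an Azure account. With the real SDK you would probably catch `exceptions.CosmosHttpResponseError` rather than the broad `Exception` used here.

```python
def upsert_all(container, items):
    """Upsert every item, returning (item id, error message) pairs for failures."""
    failed = []
    for item in items:
        try:
            container.upsert_item(item)
        except Exception as err:  # broad on purpose; narrow this with the real SDK
            failed.append((item.get("id"), str(err)))
    return failed


class FakeContainer:
    """In-memory stand-in exposing only the upsert_item method."""

    def __init__(self):
        self.stored = {}

    def upsert_item(self, item):
        if "id" not in item:
            raise ValueError("item has no id")
        self.stored[item["id"]] = item


fake_container = FakeContainer()
sample_items = [
    {"id": "1", "title": "first"},
    {"title": "missing id"},
    {"id": "2", "title": "second"},
]
failures = upsert_all(fake_container, sample_items)
print(len(fake_container.stored), "stored;", failures)
```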

## Vector search in Azure Cosmos DB for NoSQL
Let's write a function that takes the user's query, generates an embedding for the query text, and then uses that embedding to run a vector search for similar items. The most similar items will serve as the additional knowledge base that the completions model uses to answer the user's query.

In [None]:
# Simple function to assist with vector search
def vector_search(query, num_results=5):
    query_embedding = generate_embeddings(query)
    results = container.query_items(
            query='SELECT TOP @num_results c.content, c.title, c.category, VectorDistance(c.contentVector,@embedding) AS SimilarityScore  FROM c ORDER BY VectorDistance(c.contentVector,@embedding)',
            parameters=[
                {"name": "@embedding", "value": query_embedding},
                {"name": "@num_results", "value": num_results}
            ],
            enable_cross_partition_query=True)
    return results
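
Conceptually, the query scores every item against the query embedding and keeps the closest ones. The sketch below does the same thing by brute force over items held in memory, which can be handy for sanity-checking results on a small sample. It is an illustration under stated assumptions, not the service's algorithm: `top_k_local` is a hypothetical helper, it always uses cosine similarity, and the toy vectors have two dimensions instead of 1536.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_local(query_vector, items, k=5, vector_field="contentVector"):
    """Return the k (score, item) pairs most similar to query_vector, best first."""
    scored = [
        (cosine_similarity(query_vector, item[vector_field]), item)
        for item in items
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


toy_items = [
    {"title": "A", "contentVector": [1.0, 0.0]},
    {"title": "B", "contentVector": [0.0, 1.0]},
    {"title": "C", "contentVector": [1.0, 1.0]},
]

for score, item in top_k_local([1.0, 0.0], toy_items, k=2):
    print(item["title"], round(score, 3))
```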

Let's run a test below

In [None]:
query = "What are some NoSQL databases in Azure?"
# Alternative: query = "What are the services for running ML models?"
results = vector_search(query)
for result in results:
    # The key must match the alias used in the query (SimilarityScore)
    print(f"Similarity Score: {result['SimilarityScore']}")
    print(f"Title: {result['title']}")  
    print(f"Content: {result['content']}")  
    print(f"Category: {result['category']}\n") 

# Q&A over the data with GPT-3.5
Finally, we'll create a helper function to feed prompts into the completions model. Then we'll create an interactive loop where you can pose questions to the model and receive answers grounded in your data.

In [None]:
#This function helps to ground the model with prompts and system instructions.

def generate_completion(vector_search_results, user_prompt):
    system_prompt = '''
    You are an intelligent assistant for Microsoft Azure services.
    You are designed to provide helpful answers to user questions about Azure services given the information about to be provided.
        - Only answer questions related to the information provided below, provide at least 3 clear suggestions in a list format.
        - Write two lines of whitespace between each answer in the list.
        - If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users search themselves.
        - Only provide answers that have products that are part of Microsoft Azure and part of these following prompts.
    '''

    messages=[{"role": "system", "content": system_prompt}]
    for item in vector_search_results:
        messages.append({"role": "system", "content": item['content']})
    messages.append({"role": "user", "content": user_prompt})
    response = AOAI_client.chat.completions.create(model=COMPLETIONS_MODEL_DEPLOYMENT_NAME, messages=messages,temperature=0)
    
    return response.model_dump()
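
The function above adds every search result to the prompt, which can exceed the model's context window when the items are long. The following sketch separates message construction into its own function and stops adding context once a size budget is reached. The helper `build_messages` and the `max_context_chars` parameter are assumptions for illustration, and a character count is only a rough proxy for tokens.

```python
def build_messages(system_prompt, context_items, user_prompt, max_context_chars=6000):
    """Assemble chat messages, adding context items until the budget is used up."""
    messages = [{"role": "system", "content": system_prompt}]
    used = 0
    for item in context_items:
        content = item["content"]
        if used + len(content) > max_context_chars:
            break  # the next item would exceed the budget, so stop here
        messages.append({"role": "system", "content": content})
        used += len(content)
    messages.append({"role": "user", "content": user_prompt})
    return messages


# Toy search results: two short items and one that is too long for the budget.
toy_results = [
    {"content": "a" * 40},
    {"content": "b" * 40},
    {"content": "c" * 40},
]
example_messages = build_messages("You are a helpful assistant.", toy_results,
                                  "Which service should I use?",
                                  max_context_chars=100)
print([message["role"] for message in example_messages])
```

Since the search results arrive ordered by similarity, cutting off at the budget drops the least relevant items first.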

In [None]:
# Create a loop of user input and model output. You can now perform Q&A over the sample data!

user_input = ""
print("*** Please ask your model questions about Azure services. Type 'end' to end the session.\n")
user_input = input("User prompt: ")
while user_input.lower() != "end":
    search_results = vector_search(user_input)
    completions_results = generate_completion(search_results, user_input)
    print("\n")
    print(completions_results['choices'][0]['message']['content'])
    user_input = input("User prompt: ")