# Introduction

In this tutorial, we'll demonstrate how to use a sample dataset stored in Azure Cosmos DB for MongoDB vCore to ground OpenAI models. We'll do this by taking advantage of the [vector similarity search](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/vector-search) functionality in Azure Cosmos DB for MongoDB vCore. In the end, we'll create an interactive chat session with the GPT-3.5 completions model to answer questions about Azure services, informed by our dataset. This process is known as Retrieval Augmented Generation, or RAG.

This tutorial borrows some code snippets and example data from the Azure Cognitive Search Vector Search demo.

# Preliminaries <a class="anchor" id="preliminaries"></a>
First, let's start by installing the packages that we'll need later. 

In [None]:
! pip install numpy
! pip install "openai>=1.0"
! pip install pymongo
! pip install python-dotenv
! pip install azure-core
! pip install azure-cosmos
! pip install tenacity

In [3]:
import json
import time

from azure.core.exceptions import AzureError
from azure.core.credentials import AzureKeyCredential
import pymongo

import openai
from dotenv import load_dotenv
from tenacity import retry, wait_random_exponential, stop_after_attempt

Please use the example.env as a template to provide the necessary keys and endpoints in your own .env file.
Make sure to modify the env_name accordingly. 

In [4]:
from dotenv import dotenv_values

# specify the name of the .env file name 
env_name = ".env" # following example.env template change to your own .env file name
config = dotenv_values(env_name)

cosmosdb_endpoint = config['cosmos_db_api_endpoint']
cosmosdb_key = config['cosmos_db_api_key']
cosmosdb_connection_str = config['cosmos_db_connection_string']

COSMOS_MONGO_USER = config['cosmos_db_mongo_user']
COSMOS_MONGO_PWD = config['cosmos_db_mongo_pwd']
COSMOS_MONGO_SERVER = config['cosmos_db_mongo_server']
openai.api_type = config['openai_api_type']
openai.api_key = config['openai_api_key']
openai.api_base = config['openai_api_endpoint']
openai.api_version = config['openai_api_version']
embeddings_deployment = config['openai_embeddings_deployment']
completions_deployment = config['openai_completions_deployment']
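
A missing entry in the `.env` file surfaces as a `KeyError` in the cell above, one key at a time. As a rough sketch, all required keys can be checked in one pass. The key names are the ones read in the cell above; `missing_config_keys` and `sample_config` are hypothetical names introduced here for illustration, and the sample values are placeholders.

```python
# Keys read from the .env file in the configuration cell above.
REQUIRED_KEYS = [
    "cosmos_db_api_endpoint",
    "cosmos_db_api_key",
    "cosmos_db_connection_string",
    "cosmos_db_mongo_user",
    "cosmos_db_mongo_pwd",
    "cosmos_db_mongo_server",
    "openai_api_type",
    "openai_api_key",
    "openai_api_endpoint",
    "openai_api_version",
    "openai_embeddings_deployment",
    "openai_completions_deployment",
]


def missing_config_keys(cfg, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in cfg (hypothetical helper)."""
    return [key for key in required if not cfg.get(key)]


# Stand-in for the dict returned by dotenv_values(); values are placeholders.
sample_config = {"cosmos_db_mongo_user": "demo", "openai_api_key": ""}
print(missing_config_keys(sample_config))
```

In the notebook itself, calling `missing_config_keys(config)` right after `dotenv_values(env_name)` would list everything still to be filled in.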

## Create an Azure Cosmos DB for MongoDB vCore resource<a class="anchor" id="cosmosdb"></a>
Let's start by creating an Azure Cosmos DB for MongoDB vCore resource following this quick start guide: https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/quickstart-portal

Then copy the connection details (server, user, pwd) into your .env file.

## Azure OpenAI <a class="anchor" id="azureopenai"></a>

Finally, let's set up our Azure OpenAI resource. Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access. Once you have access, complete the following steps:

- Create an Azure OpenAI resource following this quickstart: https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource?pivots=web-portal
- Deploy a `completions` model and an `embeddings` model
    - For more information on `completions`, go here: https://learn.microsoft.com/azure/ai-services/openai/how-to/completions
    - For more information on `embeddings`, go here: https://learn.microsoft.com/azure/ai-services/openai/how-to/embeddings
- Copy the endpoint, key, and deployment names (embeddings model, completions model) into your .env file.

# Load data and create embeddings <a class="anchor" id="loaddata"></a>
Here we'll load a sample dataset containing descriptions of Azure services. Then we'll use Azure OpenAI to create vector embeddings from this data.

In [5]:
# Load text-sample.json data file
# Load the following file instead if embeddings were previously created and saved:
#   "../../DataSet/AzureServices/text-sample_w_embeddings.json"
with open(file="../../DataSet/AzureServices/text-sample.json", mode="r") as data_file:
    data = json.load(data_file)

In [6]:
# Take a peek at one data item
print(json.dumps(data[0], indent=2))

{
  "id": "1",
  "title": "Azure App Service",
  "content": "Azure App Service is a fully managed platform for building, deploying, and scaling web apps. You can host web apps, mobile app backends, and RESTful APIs. It supports a variety of programming languages and frameworks, such as .NET, Java, Node.js, Python, and PHP. The service offers built-in auto-scaling and load balancing capabilities. It also provides integration with other Azure services, such as Azure DevOps, GitHub, and Bitbucket.",
  "category": "Web",
  "titleVector": [
    -0.010636103339493275,
    -0.021644677966833115,
    0.0019778874702751637,
    -0.014540146104991436,
    -0.021975763142108917,
    0.011774207465350628,
    -0.026569565758109093,
    -0.008615105412900448,
    0.013195114210247993,
    -0.025355586782097816,
    0.014636713080108166,
    -0.01378141064196825,
    0.005511184688657522,
    0.0016166255809366703,
    -0.023838115856051445,
    0.014595326967537403,
    0.014305627904832363,
    0.

In [17]:
client = openai.AzureOpenAI(
  api_key = config['openai_api_key'],  
  api_version = config['openai_api_version'],
  azure_endpoint = config['openai_api_endpoint'] 
)

In [None]:
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(10))
def generate_embeddings(text):
    '''
    Generate embeddings from string of text.
    This will be used to vectorize data and user input for interactions with Azure OpenAI.
    '''
    text = text.replace("\n", " ")
    response = client.embeddings.create(
        input = [text], 
        model = config['openai_embeddings_deployment']
        ).data[0].embedding
    time.sleep(0.5) # rest period to avoid rate limiting on AOAI for free tier
    return response

In [None]:
# Generate embeddings for title and content fields
n = 0
for item in data:
    n+=1
    title = item['title']
    content = item['content']
    title_embeddings = generate_embeddings(title)
    content_embeddings = generate_embeddings(content)
    item['titleVector'] = title_embeddings
    item['contentVector'] = content_embeddings
    item['@search.action'] = 'upload'
    print("Creating embeddings for item:", n, "/" ,len(data), end='\r')
# Save embeddings to text-sample_w_embeddings.json file
with open("../../DataSet/AzureServices/text-sample_w_embeddings.json", "w") as f:
    json.dump(data, f)

# Connect and setup Cosmos DB for MongoDB vCore

## Set up the connection

In [18]:
mongo_conn = "mongodb+srv://"+COSMOS_MONGO_USER+":"+COSMOS_MONGO_PWD+"@"+COSMOS_MONGO_SERVER+"/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
mongo_client = pymongo.MongoClient(mongo_conn)
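
The concatenation above works as long as the user name and password contain no reserved URI characters. If they include characters such as `@`, `:` or `/`, the credentials need to be percent-encoded before they go into the connection string. A minimal sketch using the standard library follows; `build_mongo_uri` is a hypothetical helper, and the user, password, and server values are placeholders rather than real credentials.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, server):
    """Assemble the connection string with percent-encoded credentials (hypothetical helper)."""
    # Same options as the connection string used in the cell above.
    options = "tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
    return f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{server}/?{options}"


# Placeholder values for illustration only.
uri = build_mongo_uri("demo_user", "p@ss:word/1", "my-cluster.example.com")
print(uri)
```

In the notebook, the result of `build_mongo_uri(COSMOS_MONGO_USER, COSMOS_MONGO_PWD, COSMOS_MONGO_SERVER)` could be passed to `pymongo.MongoClient` in place of `mongo_conn`.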

In [28]:
# Caution: this prints the full connection string, including the password.
print(mongo_conn)

## Set up the DB and collection

In [None]:
# create a database called TutorialDB
db = mongo_client['TutorialDB']

# Create collection if it doesn't exist
COLLECTION_NAME = "TutorialCol"

collection = db[COLLECTION_NAME]

Run the following code only if you want to re-run the setup of the DB and collection. Otherwise, skip to the next section.

In [None]:
## Use only if re-running the code and you want to reset the DB and collection
collection.drop_index("vectorSearchIndex")
mongo_client.drop_database("TutorialDB")


Continue here.

In [29]:
# create a database called TutorialDB
db = mongo_client['TutorialDB']

# Create collection if it doesn't exist
COLLECTION_NAME = "TutorialCol"

collection = db[COLLECTION_NAME]

if COLLECTION_NAME not in db.list_collection_names():
    # Creates an unsharded collection that uses the DB's shared throughput
    db.create_collection(COLLECTION_NAME)
    print("Created collection '{}'.\n".format(COLLECTION_NAME))
else:
    print("Using collection: '{}'.\n".format(COLLECTION_NAME))

Created collection 'TutorialCol'.



## Create the vector index

In [30]:
db.command({
  'createIndexes': 'TutorialCol',
  'indexes': [
    {
      'name': 'vectorSearchIndex',
      'key': {
        "contentVector": "cosmosSearch"
      },
      'cosmosSearchOptions': {
        'kind': 'vector-ivf',
        'numLists': 1,
        'similarity': 'COS',
        'dimensions': 1536
      }
    }
  ]
});
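
The `'similarity': 'COS'` option in the index definition above selects cosine similarity as the distance measure. The service computes this internally; the sketch below only illustrates what the score means, using tiny made-up vectors in place of the 1536-dimensional embeddings. `cosine_similarity` and the sample vectors are names introduced here for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings.
query_vec = [1.0, 0.0, 1.0]
candidates = {
    "same direction": [2.0, 0.0, 2.0],
    "orthogonal": [0.0, 1.0, 0.0],
    "opposite": [-1.0, 0.0, -1.0],
}
for name, vec in candidates.items():
    print(name, round(cosine_similarity(query_vec, vec), 3))
```

Vectors pointing the same way score close to 1 regardless of their length, unrelated vectors score close to 0, and opposite vectors score close to -1.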

## Upload data to the collection
We'll use a simple `insert_many()` call to insert our data, in JSON format, into the newly created DB and collection.

In [31]:
collection.insert_many(data)

<pymongo.results.InsertManyResult at 0x16735eee0>

# Vector Search in Cosmos DB for MongoDB vCore

In [36]:
# Simple function to assist with vector search
def vector_search(query, num_results=3):
    query_embedding = generate_embeddings(query)
    pipeline = [
        {
            '$search': {
                "cosmosSearch": {
                    "vector": query_embedding,
                    "path": "contentVector",
                    "k": num_results
                },
                "returnStoredSource": True
            }
        }
    ]
    return collection.aggregate(pipeline)

Let's run a test query.

In [37]:
query = "What services do you have?"
results = vector_search(query)
for result in results: 
    print(result)
    print("\n\n")
#     print(f"Similarity Score: {result['similarityScore']}")  
#     print(f"Title: {result['document']['title']}")  
#     print(f"Content: {result['document']['content']}")  
#     print(f"Category: {result['document']['category']}\n")  

{'_id': ObjectId('654ac7c5ee3a24b21b18b818'), 'similarityScore': 0.693442997922637, 'document': {'_id': ObjectId('654ac7c5ee3a24b21b18b818'), 'id': '8', 'title': 'Azure Virtual Machines', 'content': 'Azure Virtual Machines (VMs) is an Infrastructure-as-a-Service (IaaS) offering that allows you to deploy and manage virtual machines in the cloud. You can choose from a wide range of VM sizes, operating systems, and software configurations. VMs support various operating systems, including Windows Server, Linux, and SQL Server. You can scale your VMs up or down as needed, and pay only for the resources you use. VMs provide built-in security features, such as Azure Security Center, Azure Active Directory, and encryption.', 'category': 'Compute', 'titleVector': [-0.016730263829231262, -0.01966140605509281, -0.00032435799948871136, -0.017372706905007362, -0.020009396597743034, 0.03544139117002487, -0.04443558305501938, -0.026447201147675514, 0.008826887235045433, -0.03873390704393387, 0.021561
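
Because each result carries the full embedding vectors, the raw printout is hard to read. A small helper can keep only the human-readable fields. This is a sketch that assumes each result has the shape shown in the output above (a `similarityScore` plus the stored fields nested under `document`); `summarize_result` is a hypothetical name, and `sample_result` is a shortened stand-in with placeholder values rather than real query output.

```python
def summarize_result(result, max_chars=80):
    """Return a compact, vector-free view of one search result (hypothetical helper)."""
    # Fall back to the top level if the fields are not nested under 'document'.
    doc = result.get("document", result)
    content = doc.get("content", "")
    if len(content) > max_chars:
        content = content[:max_chars] + "..."
    return {
        "score": result.get("similarityScore"),
        "title": doc.get("title"),
        "category": doc.get("category"),
        "content": content,
    }


# Shortened stand-in for one search result; the numbers are placeholders.
sample_result = {
    "similarityScore": 0.69,
    "document": {
        "title": "Azure Virtual Machines",
        "category": "Compute",
        "content": "Azure Virtual Machines (VMs) is an Infrastructure-as-a-Service (IaaS) offering.",
        "contentVector": [0.1, 0.2],
    },
}
print(summarize_result(sample_result, max_chars=40))
```

In the loop above, `print(summarize_result(result))` could replace `print(result)`.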

# Q&A over the data with GPT-3.5

Finally, we'll create a helper function to feed prompts into the `Completions` model. Then we'll create an interactive loop where you can pose questions to the model and receive information grounded in your data.

In [38]:
#This function helps to ground the model with prompts and system instructions.

def generate_completion(user_input, results_for_prompt):
    system_prompt = '''
    You are an intelligent assistant for Microsoft Azure services.
    You are designed to provide helpful answers to user questions about Azure services given the information about to be provided.
        - Only answer questions related to the information provided below, provide 3 clear suggestions in a list format.
        - Write two lines of whitespace between each answer in the list.
        - Only provide answers that have products that are part of Microsoft Azure.
        - If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users search themselves.
    '''

    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

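    for item in results_for_prompt:
        # Search results nest the stored fields under 'document' (see the test query output above)
        doc = item.get('document', item)
        messages.append({"role": "system", "content": doc['content']})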

    response = client.chat.completions.create(
        model=completions_deployment, 
        messages=messages
        ).choices[0].message.content
    
    return response
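
To see what the model actually receives, the message assembly can be reproduced without calling the service. The sketch below mirrors the ordering used in `generate_completion` (system prompt, user question, then one system message per retrieved document). It assumes the search results nest their stored fields under `document`, as in the test query output earlier, and falls back to top-level fields otherwise. `build_messages` and the sample data are hypothetical and for illustration only.

```python
def build_messages(system_prompt, user_input, results):
    """Assemble the chat messages used to ground the model (hypothetical helper)."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]
    for item in results:
        # Use the nested document if present, otherwise the item itself.
        doc = item.get("document", item)
        messages.append({"role": "system", "content": doc["content"]})
    return messages


# Made-up retrieval results standing in for vector_search() output.
fake_results = [
    {"similarityScore": 0.7, "document": {"content": "Azure App Service hosts web apps."}},
    {"similarityScore": 0.6, "document": {"content": "Azure Virtual Machines is an IaaS offering."}},
]
for message in build_messages("You are an Azure assistant.", "How can I host a website?", fake_results):
    print(message["role"], "->", message["content"])
```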

In [None]:
# Create a loop of user input and model output. You can now perform Q&A over the sample data!

user_input = ""
print("*** Please ask your model questions about Azure services. Type 'end' to end the session.\n")
user_input = input("Prompt: ")
while user_input.lower() != "end":
    results_for_prompt = vector_search(user_input)
    completions_result = generate_completion(user_input, results_for_prompt)
    print("\n")
    print(completions_result)
    user_input = input("Prompt: ")
