# Azure chat completion models with On Your Data (OYD) for RAG
This example shows how to use Azure OpenAI service models with your own data. 

Azure OpenAI on your data enables you to run supported chat models such as GPT-3.5-Turbo and GPT-4 on your data without needing to train or fine-tune models. Running models on your data enables you to chat on top of, and analyze, your data with greater accuracy and speed. One of the key benefits of Azure OpenAI on your data is its ability to tailor the content of conversational AI. Because the model has access to, and can reference, specific sources to support its responses, answers are based not only on its pretrained knowledge but also on the latest information available in the designated data source. This grounding data also helps the model avoid generating responses based on outdated or incorrect information.

Azure OpenAI on your own data with Azure AI Search (f.k.a. Azure Cognitive Search) provides a customizable, pre-built solution for knowledge retrieval, from which a conversational AI application can be built. To see alternative methods for knowledge retrieval and semantic search, check out the cookbook examples for [vector databases](https://github.com/openai/openai-cookbook/tree/main/examples/vector_databases).

## How it works

[Azure OpenAI on your own data](https://learn.microsoft.com/azure/ai-services/openai/concepts/use-your-data) connects the model with your data, giving it the ability to retrieve and utilize data in a way that enhances the model's output. Together with Azure AI Search, data is retrieved from designated data sources based on the user input and provided conversation history. The data is then augmented and resubmitted as a prompt to the model, giving the model contextual information it can use to generate a response.

![image.png](attachment:image.png)
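
The retrieve-then-augment flow described above can be pictured with a toy sketch. It does not show how the service is implemented: the in-memory `DOCS` list, the keyword-overlap `retrieve` function, and `build_augmented_messages` are all hypothetical stand-ins for Azure AI Search and for the prompt assembly that the service performs internally.

```python
# Toy illustration of retrieve -> augment -> generate. All names are hypothetical.
DOCS = [
    "Variants of SARS-CoV-2 are expected to continue to emerge.",
    "Genomic surveillance sequences samples to monitor variants.",
    "Azure Blob Storage holds the uploaded source documents.",
]


def retrieve(query, docs, top_n=2):
    """Rank documents by shared query words (a stand-in for a search index)."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_n] if score > 0]


def build_augmented_messages(system_prompt, question, chunks):
    """Place the retrieved chunks in the prompt so the model can ground its answer."""
    context = "\n".join(f"[doc{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return [
        {"role": "system", "content": f"{system_prompt}\n\nRetrieved data:\n{context}"},
        {"role": "user", "content": question},
    ]


toy_chunks = retrieve("genomic surveillance variants", DOCS)
toy_messages = build_augmented_messages(
    "Answer only from the retrieved data.",
    "What is genomic surveillance?",
    toy_chunks,
)
```

With On Your Data, both steps happen on the service side, so the notebook only has to name the data source in the request.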

See [Data, privacy, and security for Azure OpenAI Service](https://learn.microsoft.com/legal/cognitive-services/openai/data-privacy?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext) for more information.

## Prerequisites
To get started, we'll cover a few prerequisites.

To access the Azure OpenAI Service, we first need to create the required resources in the [Azure Portal](https://portal.azure.com) (see the [Microsoft Docs](https://learn.microsoft.com/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal) for a detailed guide).

To use your own data with Azure OpenAI models, you will need:

1. Azure OpenAI access and a resource with a chat model deployed (for example, GPT-3.5-Turbo or GPT-4)
2. Azure AI Search (f.k.a. Azure Cognitive Search) resource
3. Azure Blob Storage resource
4. Your documents to be used as data (See [data source options](https://learn.microsoft.com/azure/ai-services/openai/concepts/use-your-data#data-source-options))


For a full walk-through on how to upload your documents to blob storage and create an index using the Azure AI Studio, see this [Quickstart](https://learn.microsoft.com/azure/ai-services/openai/use-your-data-quickstart?pivots=programming-language-studio&tabs=command-line).

## Chat completion model with On Your data (OYD)

Below is my example. The code was generated with the "Show code" option in Azure OpenAI Studio for the scenario I was working on:

In [None]:
import openai
import os
from openai import AzureOpenAI
import json
import pandas as pd
from dotenv import load_dotenv
load_dotenv("..\\common\\credentials.env", override=True)

True

In [2]:
aoai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
aoai_api_key = os.environ["AZURE_OPENAI_API_KEY"]
deployment_name = "gpt-35-turbo"
aoai_api_version = "2024-02-15-preview"  # Latest version when this was written. Older versions may not work with this code because the request syntax changed significantly.

# Azure AI Search setup
search_endpoint = os.getenv("AZURE_SEARCH_ENDPOINT")  # Add your Azure AI Search endpoint here
search_key = os.getenv("AZURE_SEARCH_KEY")  # Add your Azure AI Search admin key here
search_index_name = "cogsrch-index-files"  # Add your Azure AI Search index name here
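
Because `os.getenv` returns `None` for an unset variable instead of raising an error, a missing search setting would only surface later as a failed request. A minimal sketch of an up-front check follows, assuming the four environment variable names used in this cell; `REQUIRED_SETTINGS` and `missing_settings` are hypothetical helper names, not part of any library.

```python
import os

# Environment variable names read by this notebook's configuration cell.
REQUIRED_SETTINGS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_KEY",
]


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the required names that are absent or empty in the given mapping."""
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Run it after `load_dotenv` so that values from the credentials file are taken into account.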

In [3]:
# Create an Azure OpenAI client.  
client = openai.AzureOpenAI(        
    #base_url=f"{aoai_endpoint}/openai/deployments/{deployment_name}/extensions/",
    azure_endpoint=aoai_endpoint,    
    api_key=aoai_api_key,    
    api_version=aoai_api_version
)


In [4]:

sys_prompt = "You are AI assistant who provides information from retrieved data. If the required information is not available in the retrieved data, apologize and say that the requested information is not available in the retrieved data. You are friendly and concise. Do not provide answers to questions not related to retrieved data or prompt message."
user_prompt = "explain Variants and Genomic Surveillance for SARS- CoV-2"
messages = [{"role": "system", "content": sys_prompt}]
messages.append({"role": "user", "content": user_prompt})

In [5]:
messages

[{'role': 'system',
  'content': 'You are AI assistant who provides information from retrieved data. If the required information is not available in the retrieved data, apologize and say that the requested information is not available in the retrieved data. You are friendly and concise. Do not provide answers to questions not related to retrieved data or prompt message.'},
 {'role': 'user',
  'content': 'explain Variants and Genomic Surveillance for SARS- CoV-2'}]

In [6]:
# Create and return a new chat completion request.
# Be sure to include the "extra_body" parameter to use Azure AI Search as the data source.
# This is the Azure OpenAI On Your Data feature.

response = client.chat.completions.create(
    model=deployment_name,    
    messages = messages,
    stream=False,
    extra_body={
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    "endpoint": search_endpoint,                    
                    "index_name": search_index_name,
                    "authentication": {
                        "type": "api_key",
                        "key": search_key
                    }
                }
            }
        ]
    }
)    
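
The `extra_body` payload above can also be assembled by a small helper, which keeps the request call short when several indexes are tried. This is a sketch rather than an official API: `build_azure_search_data_source` is a hypothetical name, and the `extra_parameters` pass-through is an assumption for optional settings whose names should be checked against the documentation for your API version.

```python
def build_azure_search_data_source(endpoint, index_name, api_key, **extra_parameters):
    """Build the `extra_body` payload for an Azure AI Search data source.

    `extra_parameters` is a hypothetical pass-through for optional settings;
    confirm the supported names for your API version before relying on them.
    """
    parameters = {
        "endpoint": endpoint,
        "index_name": index_name,
        "authentication": {"type": "api_key", "key": api_key},
    }
    parameters.update(extra_parameters)
    return {"data_sources": [{"type": "azure_search", "parameters": parameters}]}


# Placeholder values for illustration only.
example_extra_body = build_azure_search_data_source(
    endpoint="https://example.search.windows.net",
    index_name="cogsrch-index-files",
    api_key="<search-key>",
)
```

In the notebook, the result would be passed as `extra_body=build_azure_search_data_source(search_endpoint, search_index_name, search_key)` in the `client.chat.completions.create` call.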

In [7]:
print(response.model)
print(response.usage)
print(response.created)
response.object

gpt-35-turbo
CompletionUsage(completion_tokens=180, prompt_tokens=6485, total_tokens=6665)
1722025550


'extensions.chat.completion'

In [28]:
response.choices[0].message.context

{'citations': [{'content': " Variants and Genomic Surveillance for SARS- CoV-2 in Texas Variants of SARS-COV-2, the virus that causes COVID-19, are expected to continue to emerge, a natural process that occurs as viruses spread. Some variants will disappear, and others will continue to spread and may overtake previous variants. For example, the ancestral strain of the virus that caused the first Texas COVID-19 cases in early 2020 is no longer being detected. It was displaced by the Alpha variant followed by the Delta variant and may continue to be replaced by other emerging variants. The Texas SARS-COV-2 genomic sequencing data includes data provided by the CDC's commercial partner laboratories as a part of the national SARS-COV-2 genomic surveillance program, sequencing conducted at academic and commercial laboratories, and Texas Department of State Health Services Austin Laboratory's genomic sequencing. The programs sequence hundreds of COVID-19 cases each week to monitor the spread 

List all the citations:

In [29]:
response.choices[0].message.context['citations']

[{'content': " Variants and Genomic Surveillance for SARS- CoV-2 in Texas Variants of SARS-COV-2, the virus that causes COVID-19, are expected to continue to emerge, a natural process that occurs as viruses spread. Some variants will disappear, and others will continue to spread and may overtake previous variants. For example, the ancestral strain of the virus that caused the first Texas COVID-19 cases in early 2020 is no longer being detected. It was displaced by the Alpha variant followed by the Delta variant and may continue to be replaced by other emerging variants. The Texas SARS-COV-2 genomic sequencing data includes data provided by the CDC's commercial partner laboratories as a part of the national SARS-COV-2 genomic surveillance program, sequencing conducted at academic and commercial laboratories, and Texas Department of State Health Services Austin Laboratory's genomic sequencing. The programs sequence hundreds of COVID-19 cases each week to monitor the spread of variants in

Print the chunk ID of each citation, followed by its file path when one is present:

In [26]:
for citation in response.choices[0].message.context['citations']:
    # Print the chunk ID, followed by the file path when one is present.
    filepath = citation.get('filepath')
    print(citation['chunk_id'] if pd.isnull(filepath) else f"{citation['chunk_id']}-{filepath}")

0
17
20


Listing the citations is important for troubleshooting: it shows which search results the Azure AI Search or vector database layer returned, which helps you diagnose problems with the responses.
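
For troubleshooting, it can help to print one compact line per citation. The sketch below assumes each citation is a dict with the `content`, `chunk_id`, and `filepath` keys seen in the output above; `summarize_citations` and the sample data are hypothetical.

```python
def summarize_citations(citations, preview_chars=80):
    """Return one short line per citation: position, chunk ID, file path, content preview."""
    lines = []
    for position, citation in enumerate(citations, start=1):
        filepath = citation.get("filepath") or "(no filepath)"
        # Collapse whitespace so the preview stays on one line.
        preview = " ".join((citation.get("content") or "").split())[:preview_chars]
        lines.append(f"doc{position} | chunk {citation.get('chunk_id')} | {filepath} | {preview}")
    return lines


# Hypothetical sample shaped like the citations shown above.
sample_citations = [
    {"content": " Variants and Genomic Surveillance for SARS- CoV-2 in Texas", "chunk_id": "0", "filepath": None},
    {"content": "Some other chunk", "chunk_id": "17", "filepath": "report.pdf"},
]
for line in summarize_citations(sample_citations):
    print(line)
```

In the notebook, pass `response.choices[0].message.context['citations']` instead of the sample list.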

Refer to [OYD Best Practices and Troubleshooting](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/on-your-data-best-practices) for troubleshooting steps and best practices.

Notice that an `intent` field is generated along with the citations. Azure OpenAI On Your Data generates multiple search intents from the user's question and the conversation history. It does so to resolve ambiguity in the question, to add context from the conversation history so that the retrieval stage returns more complete information, and to supply additional information that makes the final response thorough and useful.

You also need to verify that the intents are generated accurately. Different AOAI models can generate different intents, so you may need to troubleshoot them by following the [OYD troubleshooting steps](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/on-your-data-best-practices).

In [33]:
response.choices[0].message.context['intent']

'["Explain Variants and Genomic Surveillance for SARS-CoV-2", "What is the process of Genomic Surveillance for SARS-CoV-2?", "How are Variants of SARS-CoV-2 identified and monitored?", "What is the significance of Genomic Surveillance in tracking SARS-CoV-2 variants?"]'
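
As the output shows, the `intent` value is a JSON-encoded string rather than a Python list, so it needs decoding before the individual search queries can be inspected. A minimal sketch follows; `parse_intents` is a hypothetical helper, and the fallback for non-JSON values is a defensive assumption, not documented behavior.

```python
import json

# Sample shortened from the output above; in the notebook, use
# response.choices[0].message.context['intent'] instead.
intent_raw = (
    '["Explain Variants and Genomic Surveillance for SARS-CoV-2", '
    '"What is the process of Genomic Surveillance for SARS-CoV-2?"]'
)


def parse_intents(raw):
    """Decode the intent field into a list of search queries.

    Falls back to a single-item list if the value is not valid JSON.
    """
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return [raw] if raw else []
    return parsed if isinstance(parsed, list) else [parsed]


for number, intent in enumerate(parse_intents(intent_raw), start=1):
    print(number, intent)
```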

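The model's final answer, grounded in the retrieved citations, is returned in `message.content`. Markers such as `[doc1]` in the answer refer to entries in the citations list.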

In [80]:
response.choices[0].message.content

"The genomic surveillance for SARS-CoV-2 in Texas involves monitoring the spread of variants of concern (VOC), variants of interest (VOI), and variants being monitored (VBM) to understand how the virus spreads and changes over time. This surveillance program includes data from CDC's commercial partner laboratories, academic and commercial laboratories, and the Texas Department of State Health Services Austin Laboratory's genomic sequencing. It helps scientists and public health professionals assess the impact of emerging variants on the effectiveness of existing COVID-19 tests, treatments, and vaccines [doc1]."
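
To see which retrieved chunks the answer actually relies on, the `[docN]` markers can be matched against the citations list. The sketch below assumes that `[docN]` refers to the N-th citation, counting from 1; check this against your own responses. `cited_documents` and the sample data are hypothetical.

```python
import re


def cited_documents(answer, citations):
    """Map each [docN] marker in the answer to the N-th citation (1-based, an assumption)."""
    cited = {}
    for match in re.finditer(r"\[doc(\d+)\]", answer):
        number = int(match.group(1))
        # Ignore markers that point outside the citations list.
        if 1 <= number <= len(citations):
            cited[f"doc{number}"] = citations[number - 1]
    return cited


# Hypothetical sample data shaped like the response above.
sample_answer = "Surveillance tracks variants over time [doc1]."
sample_citation_list = [
    {"chunk_id": "0", "filepath": None},
    {"chunk_id": "17", "filepath": None},
]
print(cited_documents(sample_answer, sample_citation_list))
```

In the notebook, call it with `response.choices[0].message.content` and `response.choices[0].message.context['citations']`.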

Please note: this is a very basic, standard RAG example that uses Azure AI Search as the knowledge store. I have intentionally kept it simple so that beginners can understand the concept, and I have not included vectorized data or other advanced RAG concepts. Refer to the other notebooks for advanced RAG examples.

Disclaimer: as the library and the AOAI REST API versions change, some of this syntax may change over time. Refer to the latest Microsoft documentation for details.