
# Retrieval-Augmented Generation (RAG) with Azure OpenAI and Azure AI Search

This notebook demonstrates how to set up and use Azure OpenAI and Azure AI Search to retrieve relevant documents using vector search and generate responses using a Retrieval-Augmented Generation (RAG) approach.

## Prerequisites

Before running the notebook, ensure you have the following: 

- [Fork](https://github.com/microsoft/rag-time/fork) the repository, then clone your fork to your local machine:

    ```bash
    git clone https://github.com/your-org/rag-time.git
    cd rag-time
    ```

- An [Azure account](https://portal.azure.com) with proper permissions to access the following services:
    - An **Azure OpenAI** service with an active deployment of a **chat model** and an **embedding model**.
    - An **Azure AI Search** service with an index that contains vectorized text data. Follow the instructions in the [Quickstart](https://learn.microsoft.com/en-us/azure/search/search-get-started-portal-import-vectors?tabs=sample-data-storage%2Cmodel-aoai%2Cconnect-data-storage) to index the documents in the [data](./../../data/) folder.
- Python 3.8 or later, available from [python.org](https://python.org).

## Steps to Use the Notebook

### 1. Install Required Libraries

Run the first code cell to install the required Python libraries:

In [None]:
!python3 -m pip install azure-search-documents 
!python3 -m pip install azure-identity
!python3 -m pip install openai
!python3 -m pip install python-dotenv

### 2. Set Up Environment Variables

To keep credentials out of the notebook, rename the `.env.sample` file to `.env` in the same directory as the notebook and update the following variables:

In [None]:
AZURE_OPENAI_ENDPOINT="https://aoai-fastai-rag.openai.azure.com/"
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME="gpt-4o-mini"
AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME="text-embedding-3-large"
AZURE_SEARCH_SERVICE_ENDPOINT="https://aisearch-fastai-rag.search.windows.net"
AZURE_SEARCH_INDEX_NAME="sopravector"
AZURE_OPENAI_API_KEY=""
AZURE_SEARCH_ADMIN_KEY=""

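Once the file is in place, the notebook loads these values automatically with `python-dotenv`. The two key variables can stay empty if you authenticate with your Azure identity instead of API keys.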
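A missing or misspelled variable usually surfaces later as a confusing client error, so it can help to check the settings up front. The following is a minimal sketch, not part of the original notebook; `REQUIRED_VARS` and `find_missing_vars` are hypothetical names introduced here. The two key variables are left out of the list on purpose, because the notebook falls back to `DefaultAzureCredential` when they are empty.

```python
import os

# Settings the notebook cannot run without (keys are optional, see above).
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME",
    "AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME",
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "AZURE_SEARCH_INDEX_NAME",
]


def find_missing_vars(names, env=None):
    """Return the names that are unset or blank in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not (env.get(name) or "").strip()]


missing = find_missing_vars(REQUIRED_VARS)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Run it after `dotenv.load_dotenv()` so that values from `.env` are already present in the environment.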

### 3. Set Up API Clients

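This section initializes API clients for Azure OpenAI and Azure AI Search. Each client uses its API key when one is set in `.env`; otherwise it falls back to `DefaultAzureCredential`, which works when you are signed in (for example through the Azure CLI) or running under a managed identity.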

In [16]:
import os
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.core.credentials import AzureKeyCredential
from openai import AzureOpenAI
from azure.search.documents import SearchClient
import dotenv
from azure.search.documents.models import VectorizedQuery, VectorizableTextQuery

dotenv.load_dotenv(override=True)

# Load Azure OpenAI environment variables
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME = os.getenv("AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME")

# Load Azure Search environment variables
AZURE_SEARCH_ENDPOINT = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
AZURE_SEARCH_INDEX_NAME = os.getenv("AZURE_SEARCH_INDEX_NAME")
AZURE_SEARCH_ADMIN_KEY = os.getenv("AZURE_SEARCH_ADMIN_KEY")

# 🔹 Initialize Azure OpenAI Client (API key, or DefaultAzureCredential when no key is set)
if AZURE_OPENAI_API_KEY:
    openai_client = AzureOpenAI(
        api_key=AZURE_OPENAI_API_KEY,
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
        api_version="2024-10-21"
    )
else:
    azure_credential = DefaultAzureCredential()
    token_provider = get_bearer_token_provider(azure_credential, "https://cognitiveservices.azure.com/.default")
    openai_client = AzureOpenAI(
        azure_ad_token_provider=token_provider,
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
        api_version="2024-10-21"
    )

# 🔹 Initialize Azure AI Search Client (API key, or DefaultAzureCredential when no key is set)
if AZURE_SEARCH_ADMIN_KEY:
    search_client = SearchClient(
        endpoint=AZURE_SEARCH_ENDPOINT,
        index_name=AZURE_SEARCH_INDEX_NAME,
        credential=AzureKeyCredential(AZURE_SEARCH_ADMIN_KEY)
    )
else:
    azure_credential = DefaultAzureCredential()
    search_client = SearchClient(
        endpoint=AZURE_SEARCH_ENDPOINT,
        index_name=AZURE_SEARCH_INDEX_NAME,
        credential=azure_credential
    )

def get_embedding(text):
    """Return the embedding vector for `text` from the configured embedding deployment."""
    return openai_client.embeddings.create(
        model=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME"),
        input=text
    ).data[0].embedding
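
The embeddings endpoint also accepts a list of inputs, which avoids one request per text when you need several vectors. The sketch below is an illustration rather than part of the notebook: `get_embeddings` is a hypothetical helper that takes the client and deployment name as arguments, and it sorts by the `index` field of each returned item as a defensive measure.

```python
def get_embeddings(client, model, texts):
    """Embed several texts in one request and return the vectors in input order."""
    response = client.embeddings.create(model=model, input=list(texts))
    # Each returned item carries the index of the input it belongs to.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]
```

With the clients from this section it could be called as `get_embeddings(openai_client, os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME"), ["first text", "second text"])`.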

### 4. Prepare a Question

Define a sample question and convert it into an embedding vector:

In [55]:
# Norwegian for "What does the travel insurance cover?" (the indexed documents are in Norwegian)
user_question = "Hva dekker reiseforsikringen?"
user_question_vector = get_embedding(user_question)
print("Text embedding")
print(user_question_vector)
print(len(user_question_vector))

Text embedding
[0.017701974138617516, 0.023882802575826645, -0.009450619108974934, -0.024846313521265984, -0.011767148971557617, 0.024190306663513184, -0.0004814357671421021, 0.008374355733394623, 0.00710846483707428, -0.050307635217905045, 0.010762635618448257, -0.019403494894504547, -0.001866804901510477, -0.04044701159000397, 0.0441165566444397, 0.034378934651613235, 0.013468670658767223, 0.008153977803885937, 0.031365398317575455, -0.030401883646845818, 0.017107466235756874, 0.021381771191954613, 0.011264892295002937, -0.008543482981622219, -0.031262896955013275, -0.018552735447883606, -0.023636799305677414, 0.0063140797428786755, 0.009937500581145287, 0.007395468652248383, 0.0136121716350317, 0.01743547059595585, -0.011480145156383514, 0.005335192661732435, 0.03505544364452362, 0.0035567949526011944, 0.029602374881505966, -0.0042871166951954365, -0.016451457515358925, 0.053710680454969406, -0.0007860568002797663, 0.011193141341209412, 0.025686824694275856, -0.01477043703198433, -0
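
To build some intuition for what vector search does with such a vector: the service ranks stored chunks by how close their embeddings are to the query embedding. The similarity metric is configured on the index; the sketch below assumes cosine similarity, which the notebook does not confirm, and `cosine_similarity` is a hypothetical helper written in plain Python for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Comparing `get_embedding(user_question)` with the embedding of a related sentence should give a higher score than comparing it with an unrelated one.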

### 5. Retrieve Matching Documents

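Run a vector search in Azure AI Search to retrieve the three most relevant document chunks. `VectorizableTextQuery` sends the question as text and relies on the vectorizer configured on the index to embed it, so the vector computed in step 4 is not sent here. To pass your own embedding, use `VectorizedQuery(vector=user_question_vector, k_nearest_neighbors=3, fields="text_vector")` instead.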

In [56]:
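# Materialize the results into a list: search() returns a one-shot iterator,
# and the results are read again when building the prompt context in step 6.
search_results = list(
    search_client.search(
        search_text=None,
        top=3,
        vector_queries=[
            VectorizableTextQuery(
                text=user_question, k_nearest_neighbors=3, fields="text_vector"
            )
        ],
    )
)

# Print Results
for result in search_results:
    print("Chunk ID:", result["chunk_id"])
    print("Title:", result["title"])
    print("Text:", result["chunk"])
    print()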

Chunk ID: 3aa7ddcb9da4_aHR0cHM6Ly9zdG9yYWdlZmFzdGFpcmFnLmJsb2IuY29yZS53aW5kb3dzLm5ldC9kYXRhc29wcmEvMTYuJTIwUGVuc2pvbiUyMG9nJTIwZm9yc2lrcmluZy5wZGY1_pages_4
Title: 16. Pensjon og forsikring.pdf
Text: Få helsehjelp.
Merk at noen konsultasjoner kan ha en egenandel du må dekke selv. Du kan også enkelt kontakte Helselosen via denne
siden. 

Ønsker du å utvide helseforsikringen til også å gjelde din familie, kan du benytte deg av tilbudet om familieforsikring. 

Vedrørende bytte av leverandør av helseforsikringen fra og med 1. oktober 2024:

Har du en pågående sak hos Storebrand skal denne fortsatt behandles av Storebrand og avsluttes der innen 31. desember
2024. Forsikringsnummer er: 11191347. Trenger du hjelp, mangler SakID eller trenger å få åpnet SakID så kan du kontakte
Helpline på telefon 800 83 313. Refusjon for pågående behandling kan du søke om her.

Reiseforsikring

Sopra Steria har reiseforsikring gjennom Storebrand. 

Polisenummer: 3199644. 

Reiseforsikringen dekker tjenestereis

### 6. RAG TIME! Generate a Response

Using the retrieved documents, construct a **system prompt** and generate a response with Azure OpenAI:

In [58]:
# First, let's collect the context from search results
context = ""
for result in search_results:
    context += result["chunk"] + "\n\n"

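# The system prompt is written in Norwegian to match the documents. In English:
# "You are an AI assistant with insight into Sopra Steria's employee handbook, attached as context.
# Be concise and detailed in your answers, and use only information from the context."
SYSTEM_MESSAGE = f"""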
Du er en AI Assistant som har innsikt i Sopra Steria sin personalhåndbok vedlagt som context.
Vær kortfattet og detaljert i svarene dine, bruk kun informasjon fra context.

Context:
{context}
"""

USER_MESSAGE = user_question

response = openai_client.chat.completions.create(
    model=os.getenv("AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME"),
    temperature=0.7,
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": USER_MESSAGE},
    ],
)

answer = response.choices[0].message.content
print(answer)

Reiseforsikringen dekker nødvendige kostnader som oppstår ved sykdom, skade eller dødsfall under reise. Den inkluderer også dekning for avbestilling av reise og tap av bagasje. Det er viktig å sjekke spesifikasjonene i den aktuelle forsikringen, da vilkår og dekning kan variere.
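
If the context string ends up empty, the model answers from general knowledge instead of the handbook, so it is worth building the context defensively. The sketch below is one possible approach, assuming the results are available as a list of dict-like objects with the `title` and `chunk` fields used above; `build_context` and its `max_chars` budget are hypothetical and not part of the original notebook.

```python
def build_context(results, max_chars=6000):
    """Join retrieved chunks into one prompt context, labeled by source title."""
    parts = []
    used = 0
    for result in results:
        block = f"[Source: {result['title']}]\n{result['chunk']}"
        if used + len(block) > max_chars:
            break  # stop before exceeding the approximate character budget
        parts.append(block)
        used += len(block)
    if not parts:
        # Fail loudly rather than sending a prompt without grounding data.
        raise ValueError("no retrieved chunks fit the context; check the search step")
    return "\n\n".join(parts)
```

Labeling each chunk with its source title also makes it easier to ask the model to cite where an answer came from.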


## Troubleshooting

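1. **Environment Variables Not Loaded:** Check that the `.env` file sits next to the notebook and contains all variables, or export the variables in your terminal before starting the notebook.
1. **Authentication Issues:** If you leave the API keys empty, the notebook authenticates with `DefaultAzureCredential`. Make sure your Azure identity has suitable role assignments, for example **Cognitive Services OpenAI User** on the Azure OpenAI resource and **Search Index Data Reader** on the search service.
1. **Search Results Are Empty:** Ensure your Azure AI Search index contains vectorized data and that its vector field is named `text_vector`, as the query in step 5 expects.
1. **OpenAI API Errors:** Verify that the deployment names in `.env` match your Azure OpenAI deployments and that the API key is valid.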

## Summary

This notebook demonstrates a **vector-based RAG pipeline** using Azure OpenAI and Azure AI Search. It retrieves relevant documents using vector search and generates responses using GPT-based chat completions. The approach improves the accuracy of AI responses by grounding them in real data.