
# Retrieval-Augmented Generation (RAG) with Azure OpenAI and Azure AI Search

This notebook demonstrates how to set up and use Azure OpenAI and Azure AI Search to retrieve relevant documents using vector search and generate responses using a Retrieval-Augmented Generation (RAG) approach.

## Prerequisites

Before running the notebook, ensure you have the following: 

- [Fork](https://github.com/microsoft/rag-time/fork) the repository, then clone your fork to your local machine (replace `your-org` with your GitHub account name):

    ```bash
    git clone https://github.com/your-org/rag-time.git
    cd rag-time
    ```

- An [Azure account](https://portal.azure.com) with proper permissions to access the following services:
    - An **Azure OpenAI** service with an active deployment of a **chat model** and an **embedding model**.
    - An **Azure AI Search** service with an index that contains vectorized text data. Follow the instructions in the [Quickstart](https://learn.microsoft.com/en-us/azure/search/search-get-started-portal-import-vectors?tabs=sample-data-storage%2Cmodel-aoai%2Cconnect-data-storage) to index the documents in the [data](./../../data/) folder.
- Install Python 3.8 or later from [python.org](https://python.org).

## Steps to Use the Notebook

### 1. Install Required Libraries

Run the first code cell to install the required Python libraries:

In [None]:
!python3 -m pip install azure-search-documents 
!python3 -m pip install azure-identity
!python3 -m pip install openai
!python3 -m pip install python-dotenv

### 2. Set Up Environment Variables

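To store credentials securely, rename the `.env.sample` file to `.env` in the same directory as the notebook and update the following variables. The two key variables can stay empty, in which case the notebook authenticates with `DefaultAzureCredential` instead: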

In [None]:
AZURE_OPENAI_ENDPOINT="https://aoai-fastai-rag.openai.azure.com/"
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME="gpt-4o-mini"
AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME="text-embedding-3-large"
AZURE_SEARCH_SERVICE_ENDPOINT="https://aisearch-fastai-rag.search.windows.net"
AZURE_SEARCH_INDEX_NAME="sopravector"
AZURE_OPENAI_API_KEY=""
AZURE_SEARCH_ADMIN_KEY=""

Once the file is in place, the notebook loads these values automatically with `dotenv.load_dotenv`.
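
A quick check before creating the clients can catch a missing or empty value early. The sketch below is a minimal illustration, not part of the original notebook: `REQUIRED_VARS` and `missing_env_vars` are hypothetical names, and the two key variables are left out on the assumption that they are optional when `DefaultAzureCredential` is used.

```python
import os

# Hypothetical list of variables the notebook cannot run without.
# The API key and admin key are omitted because the notebook falls back
# to DefaultAzureCredential when they are empty.
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME",
    "AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME",
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "AZURE_SEARCH_INDEX_NAME",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing or empty variables:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Run it after `dotenv.load_dotenv(override=True)` so that values from `.env` are already in the process environment.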

### 3. Set Up API Clients

This section initializes API clients for Azure OpenAI and Azure AI Search. Ensure the necessary credentials are properly configured before proceeding.

In [None]:
import os
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.core.credentials import AzureKeyCredential
from openai import AzureOpenAI
from azure.search.documents import SearchClient
import dotenv
from azure.search.documents.models import VectorizedQuery, VectorizableTextQuery

dotenv.load_dotenv(override=True)

# Load Azure OpenAI environment variables
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME = os.getenv("AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME")

# Load Azure Search environment variables
AZURE_SEARCH_ENDPOINT = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
AZURE_SEARCH_INDEX_NAME = os.getenv("AZURE_SEARCH_INDEX_NAME")
AZURE_SEARCH_ADMIN_KEY = os.getenv("AZURE_SEARCH_ADMIN_KEY")

# 🔹 Initialize Azure OpenAI Client (API key, or Microsoft Entra ID via DefaultAzureCredential)
if AZURE_OPENAI_API_KEY:
    openai_client = AzureOpenAI(
        api_key=AZURE_OPENAI_API_KEY,
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
        api_version="2024-10-21"
    )
else:
    azure_credential = DefaultAzureCredential()
    token_provider = get_bearer_token_provider(azure_credential, "https://cognitiveservices.azure.com/.default")
    openai_client = AzureOpenAI(
        azure_ad_token_provider=token_provider,
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
        api_version="2024-10-21"
    )

# 🔹 Initialize Azure AI Search Client (API key, or Microsoft Entra ID via DefaultAzureCredential)
if AZURE_SEARCH_ADMIN_KEY:
    search_client = SearchClient(
        endpoint=AZURE_SEARCH_ENDPOINT,
        index_name=AZURE_SEARCH_INDEX_NAME,
        credential=AzureKeyCredential(AZURE_SEARCH_ADMIN_KEY)
    )
else:
    azure_credential = DefaultAzureCredential()
    search_client = SearchClient(
        endpoint=AZURE_SEARCH_ENDPOINT,
        index_name=AZURE_SEARCH_INDEX_NAME,
        credential=azure_credential
    )

def get_embedding(text):
    """Return the embedding vector for `text` from the deployed embedding model."""
    return openai_client.embeddings.create(
        model=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL_NAME"),
        input=text
    ).data[0].embedding

### 4. Prepare a Question

Define a sample question and convert it into an embedding vector:

In [None]:
# Norwegian for "Tell me about parental leave"
user_question = "Fortell meg om foreldrepermisjon"
user_question_vector = get_embedding(user_question)
print("Text embedding")
print(user_question_vector)
print(len(user_question_vector))
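
Embedding vectors are useful because texts with similar meaning tend to map to vectors that point in similar directions. The sketch below illustrates this with cosine similarity on toy 3-dimensional vectors. It is an illustration only: `cosine_similarity` is a hypothetical helper, the toy values stand in for real embeddings, and the metric your index actually uses depends on its vector configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumed values)
question = [1.0, 0.0, 1.0]
related = [2.0, 0.0, 2.0]    # same direction as the question
unrelated = [0.0, 1.0, 0.0]  # orthogonal to the question

print(cosine_similarity(question, related))    # close to 1
print(cosine_similarity(question, unrelated))  # 0.0
```

With real data, the same function could compare `user_question_vector` against the embedding of any other text returned by `get_embedding`.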

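### 5. Retrieve Matching Documents

Perform a vector search in Azure AI Search to retrieve relevant document chunks. `VectorizableTextQuery` sends the question as text and lets the index's vectorizer embed it, so the index must have a vectorizer configured: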

In [None]:
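# Materialize the results in a list: the pager returned by search() can only be
# iterated once, and the results are read again when building the prompt.
search_results = list(
    search_client.search(
        search_text=None,
        top=3,
        vector_queries=[
            VectorizableTextQuery(
                text=user_question, k_nearest_neighbors=3, fields="text_vector"
            )
        ],
    )
)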

# Print results
for result in search_results:
    print("Chunk ID:", result["chunk_id"])
    print("Title:", result["title"])
    print("Text:", result["chunk"])
    print()
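
Conceptually, `k_nearest_neighbors=3` asks for the three stored vectors closest to the query vector. The sketch below shows that idea as an exhaustive scan over a toy in-memory index. It is an assumption-laden illustration: `top_k` and `toy_index` are hypothetical, the vectors are unit length so the dot product equals cosine similarity, and a real search service typically relies on an approximate index rather than scanning every vector.

```python
import heapq


def top_k(query, documents, k):
    """Return the k (score, doc_id) pairs with the highest dot product."""
    scored = (
        (sum(q * d for q, d in zip(query, vector)), doc_id)
        for doc_id, vector in documents.items()
    )
    return heapq.nlargest(k, scored)


# Toy index of unit-length 2-dimensional vectors (assumed values)
toy_index = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.6, 0.8],
    "chunk-c": [0.0, 1.0],
    "chunk-d": [-1.0, 0.0],
}

for score, doc_id in top_k([1.0, 0.0], toy_index, k=3):
    print(doc_id, score)
```

The loop prints `chunk-a`, `chunk-b`, and `chunk-c` in that order, and `chunk-d`, which points away from the query, is left out.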

### 6. RAG TIME! Generate a Response

Using the retrieved documents, construct a **system prompt** and generate a response with Azure OpenAI:

In [None]:
# First, let's collect the context from search results
context = ""
for result in search_results:
    context += result["chunk"] + "\n\n"

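# English translation of the Norwegian prompt below:
# "You are an AI assistant with insight into Sopra Steria's employee handbook.
# Be concise and precise in your answers, and use only information from the handbook.
# If you cannot find the information in the handbook, say that you do not know the answer."
SYSTEM_MESSAGE = f"""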
Du er en AI Assistant som har innsikt i Sopra Steria sin personalhåndbok.
Vær kortfattet og presis i svarene dine, bruk kun informasjon fra håndboken.
Hvis ikke du finner informasjon i håndboken, si at du ikke vet svaret.

Context:
{context}
"""

USER_MESSAGE = user_question

response = openai_client.chat.completions.create(
    model=os.getenv("AZURE_OPENAI_CHAT_COMPLETION_DEPLOYED_MODEL_NAME"),
    temperature=0.7,
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": USER_MESSAGE},
    ],
)

answer = response.choices[0].message.content
print(answer)
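
When the retrieved chunks are long, it can help to label each chunk with its source and cap the total size of the context before placing it in the system prompt. The sketch below is one possible way to do that, not the notebook's method: `build_context` and the `max_chars` budget are hypothetical, and the `title` and `chunk` keys are assumed to match the index fields used above.

```python
def build_context(results, max_chars=4000):
    """Join chunks, labeled by title, until the character budget is reached."""
    parts = []
    used = 0
    for result in results:
        block = f"[{result['title']}]\n{result['chunk']}"
        if used + len(block) > max_chars:
            break  # drop this and all remaining, lower-ranked chunks
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


# Sample data standing in for search results (assumed values)
sample_results = [
    {"title": "doc-1", "chunk": "First chunk of text."},
    {"title": "doc-2", "chunk": "Second chunk of text."},
]

print(build_context(sample_results, max_chars=40))
```

With the 40-character budget only the first chunk fits, so the second one is dropped. Because the results arrive ranked by relevance, cutting from the end removes the least relevant chunks first.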

## Troubleshooting

1. **Environment Variables Not Loaded:** Ensure the `.env` file is filled in correctly, or export the variables in your terminal before starting the notebook.
1. **Authentication Issues:** If you authenticate with `DefaultAzureCredential` (for example, through a managed identity), make sure the identity has the required role assignments on both services.
1. **Search Results Are Empty:** Ensure your Azure AI Search index contains vectorized data.
1. **OpenAI API Errors:** Verify your deployment name and API key.

## Summary

This notebook demonstrates a **vector-based RAG pipeline** using Azure OpenAI and Azure AI Search. It retrieves relevant documents using vector search and generates responses using chat completions. Grounding the responses in retrieved data helps keep the answers tied to your own documents.