# RAG Quickstart for Azure AI Search

This quickstart shows how to build a query for retrieval-augmented generation (RAG) scenarios. It demonstrates one approach to a chat experience that uses grounding data from a search index on Azure AI Search.

We took a few shortcuts to keep the exercise basic and focused on query definitions:

- We use the **hotels-sample-index**, which can be created in minutes and runs on any search service tier. This index is created by a wizard using built-in sample data.

- We omit vectors so that we can skip chunking and embedding. The index contains plain text.

A non-vector index isn't ideal for RAG patterns, but it makes for a simpler example.

Once you understand the fundamentals of passing query results from Azure AI Search to an LLM, you can build on that experience by adding vector fields and running vector and hybrid queries. We recommend the [phi-chat Python code example](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/phi-chat/phi-chat.ipynb) for that step.

This example is fully documented in [Quickstart: Generative search (RAG) with grounding data from Azure AI Search](https://learn.microsoft.com/azure/search/search-get-started-rag). If you need more guidance than this readme provides, refer to the article.


## Prerequisites

- [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource)

  - Deploy a chat model (gpt-4o, gpt-4o-mini, or equivalent LLM).

- [Azure AI Search](https://learn.microsoft.com/azure/search/search-create-service-portal)

  - Basic tier or higher is recommended.
  - Choose the same region as Azure OpenAI.
  - Enable semantic ranking.
  - Enable role-based access control.
  - Enable a system identity for Azure AI Search.
  
Make sure you know the name of the deployed model, and have the endpoints for both Azure resources at hand. You will provide this information in the steps that follow.

## Configure access

This quickstart assumes authentication and authorization using Microsoft Entra ID and role assignments. It also assumes that you run this code from your local device.

1. To create, load, and query the sample index on Azure AI Search, you must personally have these role assignments: **Search Index Data Reader**, **Search Index Data Contributor**, and **Search Service Contributor**.

1. To send the query and search results to Azure OpenAI, both you and the search system identity must have **Cognitive Services OpenAI User** permissions on Azure OpenAI.

   - The query and the prompt are sent from your local system, which is why you need permissions on Azure OpenAI.
   - Results used for grounding data are sent from the search engine, which is why the search service needs permissions on Azure OpenAI.

## Create the sample index

This quickstart assumes the hotels-sample-index, which you can create in minutes using [this quickstart](https://learn.microsoft.com/azure/search/search-get-started-portal).

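Once the index exists, edit its JSON definition in the Azure portal to add the semantic configuration given in the [quickstart article](https://learn.microsoft.com/azure/search/search-get-started-rag).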

Now that you have your Azure resources, an index, and model in place, you can run the script to chat with the index.

## Create a virtual environment

Create a virtual environment so that you can install the dependencies in isolation.

1. In Visual Studio Code, open the folder containing `Quickstart-RAG.ipynb`.

1. Press Ctrl+Shift+P to open the command palette, search for "Python: Create Environment", and then select `Venv` to create a virtual environment in the current workspace.

1. Select `Quickstart-RAG\requirements.txt` for the dependencies.

It takes several minutes to create the environment. When the environment is ready, continue to the next step.
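
If you work outside Visual Studio Code, a comparable environment can be created with Python's standard `venv` module. The following is a minimal sketch, not part of the notebook: the `create_environment` helper and the `.venv` folder name are illustrative assumptions.

```python
import sys
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter.

    Hypothetical helper for illustration; the notebook itself relies on
    the Visual Studio Code command instead.
    """
    env_path = Path(env_dir)
    # venv.create is the standard-library equivalent of `python -m venv`.
    venv.create(env_path, with_pip=with_pip)
    # The interpreter lives in a platform-specific subfolder.
    if sys.platform == "win32":
        return env_path / "Scripts" / "python.exe"
    return env_path / "bin" / "python"
```

Calling `create_environment(".venv")` from the folder that contains the notebook returns the interpreter path. Running that interpreter with `-m pip install -r requirements.txt` would then install the dependencies.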

## Run the code

```python
# Package install for quickstart
! pip install -r requirements.txt --quiet
```

```python
# Set endpoints and deployment model (provide the name of the deployment)
AZURE_SEARCH_SERVICE: str = "PUT YOUR SEARCH SERVICE ENDPOINT HERE"
AZURE_OPENAI_ACCOUNT: str = "PUT YOUR AZURE OPENAI ENDPOINT HERE"
AZURE_DEPLOYMENT_MODEL: str = "gpt-4o"
```
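
Hard-coded endpoints are fine for a quickstart. As an alternative, the values could be read from environment variables and sanity-checked before any client is created, so that a leftover placeholder fails early with a clear message. The sketch below assumes environment variables named after the notebook's constants; `check_endpoint` and `load_endpoint` are hypothetical helpers, not part of the sample.

```python
import os
from urllib.parse import urlparse


def check_endpoint(name, value):
    """Return the endpoint without a trailing slash, or raise ValueError."""
    parsed = urlparse(value)
    # A placeholder such as "PUT YOUR ... HERE" has no scheme or host.
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"{name} should be an https:// URL, got {value!r}")
    return value.rstrip("/")


def load_endpoint(name, environ=None):
    """Look up an endpoint by variable name (assumed naming) and validate it."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise ValueError(f"{name} is not set")
    return check_endpoint(name, value)
```

With this in place, the cell could assign `AZURE_SEARCH_SERVICE = load_endpoint("AZURE_SEARCH_SERVICE")` instead of a literal string.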

## Basic RAG Query

The following code demonstrates how to query fields such as strings or string collections and use them to answer a question using RAG.

```python
# Set up the query for generating responses
from azure.identity import DefaultAzureCredential
from azure.identity import get_bearer_token_provider
from azure.search.documents import SearchClient
from openai import AzureOpenAI

credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")
openai_client = AzureOpenAI(
    api_version="2024-06-01",
    azure_endpoint=AZURE_OPENAI_ACCOUNT,
    azure_ad_token_provider=token_provider
)

search_client = SearchClient(
    endpoint=AZURE_SEARCH_SERVICE,
    index_name="hotels-sample-index",
    credential=credential
)

# This prompt provides instructions to the model
GROUNDED_PROMPT="""
You are a friendly assistant that recommends hotels based on activities and amenities.
Answer the query using only the sources provided below in a friendly and concise bulleted manner.
Answer ONLY with the facts listed in the list of sources below.
If there isn't enough information below, say you don't know.
Do not generate answers that don't use the sources below.
Query: {query}
Sources:\n{sources}
"""

# Query is the question being asked. It's sent to the search engine and the LLM.
query="Can you recommend a few hotels with complimentary breakfast?"

# Set up the search results and the chat thread.
# Retrieve the selected fields from the search index related to the question.
search_results = search_client.search(
    search_text=query,
    top=5,
    select="Description,HotelName,Tags",
    query_type="semantic"
)
sources_formatted = "\n".join([f'{document["HotelName"]}:{document["Description"]}:{document["Tags"]}' for document in search_results])

response = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
        }
    ],
    model=AZURE_DEPLOYMENT_MODEL
)

print(response.choices[0].message.content)
```

Sure! Here are a few hotels that offer complimentary breakfast:

- **Whitefish Lodge & Suites**:
  - Continental breakfast
  - Free parking
  - Free wifi

- **Trails End Motel**:
  - Free hot breakfast buffet
  - Free wireless internet

- **Peaceful Market Hotel & Spa**:
  - Continental breakfast
  - Restaurant
  - View

- **Marquis Plaza & Suites**:
  - Free breakfast buffet
  - Free wifi
  - Pool

Enjoy your stay!
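
To inspect what the model receives, the prompt assembly from the basic query can be reproduced offline, without calling either service. The sketch below uses two hand-written documents with made-up values (not data from the index), a shortened stand-in for the prompt template, and a hypothetical `format_sources` helper that joins the `Tags` collection into readable text.

```python
# Shortened stand-in for GROUNDED_PROMPT, for illustration only.
PROMPT_TEMPLATE = "Query: {query}\nSources:\n{sources}"


def format_sources(documents):
    """Turn search results into one line of text per document."""
    lines = []
    for doc in documents:
        # Tags is a string collection; join it instead of printing a Python list.
        tags = ", ".join(doc.get("Tags") or [])
        lines.append(f'{doc["HotelName"]}: {doc["Description"]} (tags: {tags})')
    return "\n".join(lines)


# Made-up documents shaped like the selected fields in the basic query.
sample_documents = [
    {
        "HotelName": "Example Inn",
        "Description": "Quiet rooms near the lake.",
        "Tags": ["free wifi", "continental breakfast"],
    },
    {
        "HotelName": "Sample Suites",
        "Description": "Downtown suites with a pool.",
        "Tags": [],
    },
]

prompt = PROMPT_TEMPLATE.format(
    query="Can you recommend a few hotels with complimentary breakfast?",
    sources=format_sources(sample_documents),
)
print(prompt)
```

Printing the prompt before sending it is a quick way to confirm that the selected fields arrive in the shape you expect.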


If you get an authorization error instead of results:

+ If you just enabled role assignments, wait a few minutes and try again. It can take several minutes for role assignments to become operational.

+ Make sure you [enabled RBAC](https://learn.microsoft.com/azure/search/search-security-enable-roles?tabs=config-svc-portal%2Cdisable-keys-portal) on Azure AI Search. An HttpStatusMessage of **Forbidden** is an indicator that RBAC isn't enabled.

+ Recheck role assignments for yourself and for the search service system identity.

+ Check firewall settings. This quickstart assumes public network access. If you have a firewall, you need to add rules to allow inbound requests from your device and for service-to-service connections.

+ For more debugging guidance, see the [troubleshooting section](https://learn.microsoft.com/azure/search/search-get-started-rag#troubleshooting-errors) in the Quickstart documentation.
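
Because role assignments can take several minutes to take effect, one option is to retry the failing call after a delay rather than rerunning the cell by hand. The following is a minimal sketch built around a hypothetical `retry_on_error` helper. In the notebook, the exception to retry on would likely be `azure.core.exceptions.HttpResponseError`, but treat that as an assumption to confirm against the error you actually see.

```python
import time


def retry_on_error(operation, retry_on=(Exception,), attempts=5,
                   delay_seconds=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper: waits delay_seconds between attempts and
    re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            # Wait only if another attempt remains.
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error
```

A call such as `retry_on_error(lambda: list(search_client.search(search_text=query, top=5)))` would wrap the search request. Narrowing `retry_on` to the specific authorization error avoids masking unrelated failures.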

## Complex RAG Query

Use complex types and collections in RAG by converting the result to JSON before sending it to the LLM.
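
As an offline illustration of that conversion, the sketch below serializes a hand-written result that contains a complex type (`Address`) and a collection of complex types (`Rooms`). The sub-field names and all values are made up for illustration, so check the actual index schema before relying on them.

```python
import json

selected_fields = ["HotelName", "Address", "Rooms", "Tags"]

# Made-up result; a real result also carries metadata keys such as scores.
sample_result = {
    "HotelName": "Example Inn",
    "Address": {"City": "Springfield", "StateProvince": "XX"},
    "Rooms": [{"Type": "Suite", "BaseRate": 199.99, "SleepsCount": 4}],
    "Tags": ["free wifi"],
    "@search.score": 1.0,
}

# Keep only the selected fields, then serialize nested values as JSON.
source = {field: sample_result[field] for field in selected_fields}
source_line = json.dumps(source)
print(source_line)
```

Each result becomes one JSON line, so nested objects and arrays reach the LLM with their structure intact instead of as Python `repr` text.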

```python
import json

# Query is the question being asked. It's sent to the search engine and the LLM.
query="Can you recommend a few hotels that offer complimentary breakfast? Tell me their description, address, tags, and the rate for one room they have which sleep 4 people."

# Set up the search results and the chat thread.
# Retrieve the selected fields from the search index related to the question.
selected_fields = ["HotelName","Description","Address","Rooms","Tags"]
search_results = search_client.search(
    search_text=query,
    top=5,
    select=selected_fields,
    query_type="semantic"
)
sources_filtered = [{field: result[field] for field in selected_fields} for result in search_results]
sources_formatted = "\n".join([json.dumps(source) for source in sources_filtered])

response = openai_client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
        }
    ],
    model=AZURE_DEPLOYMENT_MODEL
)

print(response.choices[0].message.content)
```

Sure! Here are a few hotels that offer complimentary breakfast:

1. **Trails End Motel**
   - **Description**: Only 8 miles from Downtown. On-site bar/restaurant, Free hot breakfast buffet, Free wireless internet, All non-smoking hotel. Only 15 miles from the airport.
   - **Address**: 7014 E Camelback Rd, Scottsdale, AZ 85251, USA
   - **Tags**: Continental breakfast, view
   - **Room Recommendation**: 
     - **Type**: Suite, 2 Queen Beds (City View)
     - **Rate**: $266.99 per night
     - **Sleeps**: 4

2. **Whitefish Lodge & Suites**
   - **Description**: Located in the heart of the forest. Enjoy Warm Weather, Beach Club Services, Natural Hot Springs, Airport Shuttle.
   - **Address**: 3000 E 1st Ave, Denver, CO 80206, USA
   - **Tags**: Continental breakfast, free parking, free wifi
   - **Room Recommendation**: 
     - **Type**: Suite, 2 Queen Beds (Mountain View)
     - **Rate**: $256.99 per night
     - **Sleeps**: 4

3. **Peaceful Market Hotel & Spa**
   - **Description**: B