# Re-rank chunks and generate answers

This notebook demonstrates how to evaluate chunks in parallel. For each chunk, the re-ranker generates a similarity percentage with the question and an answer (the part of the chunk's text that is most appropriate for answering the question).

The output is the set of chunks most relevant to the question, plus the answer generated from those chunks.
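
The parallel evaluation described above can be sketched with a thread pool. This is a minimal illustration, not the code in `rag_utils`: `score_chunk` is a hypothetical stand-in that uses word overlap instead of a model call, and `rerank_in_parallel`, the `threshold` value, and the sample chunks are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def score_chunk(chunk, question):
    """Hypothetical stand-in for the LLM re-ranker.

    Returns (confidence from 0 to 100, sentence of the chunk that best matches).
    """
    q_words = set(question.lower().split())
    best_sentence, best_overlap = "", 0
    for sentence in chunk.split("."):
        overlap = len(q_words & set(sentence.lower().split()))
        if overlap > best_overlap:
            best_sentence, best_overlap = sentence.strip(), overlap
    confidence = round(100 * best_overlap / len(q_words)) if q_words else 0
    return confidence, best_sentence


def rerank_in_parallel(chunks, question, threshold=50, max_workers=4):
    """Score every chunk concurrently, drop low scores, sort best first."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so scores line up with chunks
        scored = list(pool.map(lambda c: score_chunk(c, question), chunks))
    ranked = [
        {"content": chunk, "confidence": confidence, "answer": answer}
        for chunk, (confidence, answer) in zip(chunks, scored)
        if confidence >= threshold
    ]
    return sorted(ranked, key=lambda r: r["confidence"], reverse=True)


sample_chunks = [
    "The watch can be returned within 30 days. Contact support.",
    "Event sponsorship requires approval.",
]
relevant = rerank_in_parallel(sample_chunks, "return the watch")
for item in relevant:
    print(item["confidence"], item["answer"])
```

Threads suit this step because each real scoring call would spend most of its time waiting on a network response, so several requests can be in flight at once.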

## Prerequisites

+ An Azure subscription, with [access to Azure OpenAI](https://aka.ms/oai/access).
+ An Azure OpenAI service with the service name and an API key.
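+ A deployment of the text-embedding-ada-002 embedding model on the Azure OpenAI service.
+ Deployments on the Azure OpenAI service for re-ranking and answer generation (the code reads their names from `AZURE_OPENAI_RERANK_DEPLOYMENT_NAME` and `AZURE_OPENAI_DEPLOYMENT_NAME`).
+ An Azure AI Search service, with its endpoint, API key, and index name.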

We used Python 3.12.5, [Visual Studio Code with the Python extension](https://code.visualstudio.com/docs/python/python-tutorial), and the [Jupyter extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) to test this example.

### Set up a Python virtual environment in Visual Studio Code

1. Open the Command Palette (Ctrl+Shift+P).
1. Search for **Python: Create Environment**.
1. Select **Venv**.
1. Select a Python interpreter. Choose 3.10 or later.

It can take a minute to set up. If you run into problems, see [Python environments in VS Code](https://code.visualstudio.com/docs/python/environments).

### Install packages

In [None]:
! pip install openai
! pip install azure-search-documents
! pip install python-dotenv
! pip install pandas

## Import packages and create the Azure AI Search and Azure OpenAI (AOAI) clients

In [1]:
import os
from dotenv import load_dotenv
from openai import AzureOpenAI
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
import sys
sys.path.append('../..')
from rag_utils import calculate_rank, semantic_hybrid_search_with_filter, get_filtered_chunks, generate_answer

# Load environment variables from .env
load_dotenv(override=True)

# AZURE AI SEARCH
ai_search_endpoint = os.environ["SEARCH_SERVICE_ENDPOINT"]
ai_search_apikey = os.environ["SEARCH_SERVICE_QUERY_KEY"]
ai_search_index_name = os.environ["SEARCH_INDEX_NAME"]
ai_search_credential = AzureKeyCredential(ai_search_apikey)
# Create Azure AI Search client
ai_search_client = SearchClient(endpoint=ai_search_endpoint, index_name=ai_search_index_name, credential=ai_search_credential)

aoai_api_version = '2024-02-15-preview'

# AOAI FOR ANSWER GENERATION
aoai_answer_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
aoai_answer_apikey = os.environ["AZURE_OPENAI_API_KEY"]
aoai_answer_model_name = os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"]
# Create AOAI client for answer generation
aoai_answer_client = AzureOpenAI(
    azure_deployment=aoai_answer_model_name,
    api_version=aoai_api_version,
    azure_endpoint=aoai_answer_endpoint,
    api_key=aoai_answer_apikey
)

# AZURE OPENAI FOR RERANKING
aoai_rerank_endpoint = os.environ["AZURE_OPENAI_RERANK_ENDPOINT"]
azure_openai_rerank_key = os.environ["AZURE_OPENAI_RERANK_API_KEY"]
rerank_model_name = os.environ["AZURE_OPENAI_RERANK_DEPLOYMENT_NAME"]
# Create AOAI client for reranking
aoai_rerank_client = AzureOpenAI(
    azure_deployment=rerank_model_name,
    api_version=aoai_api_version,
    azure_endpoint=aoai_rerank_endpoint,
    api_key=azure_openai_rerank_key
)

# AZURE OPENAI FOR EMBEDDING
aoai_embedding_endpoint = os.environ["AZURE_OPENAI_EMBEDDING_ENDPOINT"]
azure_openai_embedding_key = os.environ["AZURE_OPENAI_EMBEDDING_API_KEY"]
embedding_model_name = os.environ["AZURE_OPENAI_EMBEDDING_NAME_ADA"]
# Create AOAI client for embedding creation (ADA)
aoai_embedding_client = AzureOpenAI(
    azure_deployment=embedding_model_name,
    api_version=aoai_api_version,
    azure_endpoint=aoai_embedding_endpoint,
    api_key=azure_openai_embedding_key
)

# CONSTANTS
MAX_DOCS = 20  # Maximum number of documents to retrieve in the query
EMBEDDING_FIELDS = "embeddingTitle, embeddingContent"  # Vector fields to search
SELECT_FIELDS = ["id", "title", "content"]  # Fields to retrieve in the search
QUERY_LANGUAGE = "es-es"  # Query language
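
The cell above reads every setting with `os.environ[...]`, which raises a `KeyError` at the first missing variable. As an optional sketch, the check below reports all missing settings at once. The variable names are the ones used above; the helper `missing_env_vars` and the `REQUIRED_VARS` list are hypothetical additions for illustration, not part of `rag_utils`.

```python
import os

# Environment variable names taken from the configuration cell above
REQUIRED_VARS = [
    "SEARCH_SERVICE_ENDPOINT",
    "SEARCH_SERVICE_QUERY_KEY",
    "SEARCH_INDEX_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_RERANK_ENDPOINT",
    "AZURE_OPENAI_RERANK_API_KEY",
    "AZURE_OPENAI_RERANK_DEPLOYMENT_NAME",
    "AZURE_OPENAI_EMBEDDING_ENDPOINT",
    "AZURE_OPENAI_EMBEDDING_API_KEY",
    "AZURE_OPENAI_EMBEDDING_NAME_ADA",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Running this right after `load_dotenv(override=True)` would show which entries still need to be added to the `.env` file.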

In [2]:
import pandas as pd

def show_results(results, query, rerank=False):
    """Return the search results as a DataFrame, optionally adding the re-ranker output for each chunk."""
    data = []
    for i, result in enumerate(results):
        if rerank:
            # Score the chunk (title + content) against the query with the re-ranking model
            chunk_text = result['title'] + ". " + result['content']
            confidence, answer = calculate_rank(aoai_rerank_client, rerank_model_name, chunk_text, query)
            response = f'confidence: {confidence}, answer: {answer}'
        else:
            response = 'n/a'

        data.append(
            [result["id"],
             result["title"],
             result["content"],
             result["@search.score"],
             response
            ]
        )
        if i + 1 == MAX_DOCS:
            break  # Stop at the maximum number of documents

    return pd.DataFrame(data, columns=["id", "title", "content", "@search.score", "rerank"])

In [None]:
query = "¿CÓMO DESISTIR DEL RELOJ?"
results = semantic_hybrid_search_with_filter(ai_search_client, query, aoai_embedding_client, embedding_model_name, EMBEDDING_FIELDS, MAX_DOCS, SELECT_FIELDS, QUERY_LANGUAGE)
show_results(results, query, True)

In [None]:
query = "Patrocinio de Eventos"
results = semantic_hybrid_search_with_filter(ai_search_client, query, aoai_embedding_client, embedding_model_name, EMBEDDING_FIELDS, MAX_DOCS, SELECT_FIELDS, QUERY_LANGUAGE)
chunks = get_filtered_chunks(aoai_rerank_client, rerank_model_name, results, query, MAX_DOCS)
if len(chunks) > 0:
    # Generate answer using the complete chunks
    answer = generate_answer(aoai_answer_client, aoai_answer_model_name, chunks, query, 'content')
else:
    answer = 'There is no content to generate the answer'
print(f'\nANSWER: {answer}')

In [None]:
query = "Patrocinio de Eventos"
results = semantic_hybrid_search_with_filter(ai_search_client, query, aoai_embedding_client, embedding_model_name, EMBEDDING_FIELDS, MAX_DOCS, SELECT_FIELDS, QUERY_LANGUAGE)
chunks = get_filtered_chunks(aoai_rerank_client, rerank_model_name, results, query, MAX_DOCS)
if len(chunks) > 0:
    # Generate answer using the 'answer' generated by the re-ranker
    answer = generate_answer(aoai_answer_client, aoai_answer_model_name, chunks, query, 'answer')
else:
    answer = 'There is no content to generate the answer'
print(f'\nANSWER: {answer}')
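
The last two cells differ only in the field passed to `generate_answer`: `'content'` uses each chunk's full text, while `'answer'` uses the excerpt produced by the re-ranker. How `generate_answer` assembles its prompt is not shown here, so the sketch below is only an assumption about what such a step might look like. `build_context`, the `max_chars` budget, and the sample chunks are hypothetical.

```python
def build_context(chunks, field, max_chars=4000):
    """Join the chosen field of each chunk into one context string (hypothetical helper)."""
    parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        text = str(chunk.get(field, "")).strip()
        if not text:
            continue  # skip chunks that lack the requested field
        entry = f"[{i}] {text}"
        if used + len(entry) > max_chars:
            break  # stay within the assumed context budget
        parts.append(entry)
        used += len(entry)
    return "\n".join(parts)


# Sample data invented for the example
sample_chunks = [
    {
        "content": "Sponsorship requests go through the marketing team. Approval takes two weeks.",
        "answer": "Approval takes two weeks.",
    },
    {"content": "Unrelated text.", "answer": ""},
]

full_context = build_context(sample_chunks, "content")
short_context = build_context(sample_chunks, "answer")
print(full_context)
print(short_context)
```

Using the re-ranker excerpts gives a shorter prompt, at the cost of dropping surrounding text that the full chunks would keep.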