# 2. Workshop Setup

## Pre-requisites:

- [Request Access to Azure OpenAI Service](https://aka.ms/oai/access)
- Azure Search Service (which can host one or more search indexes) with Semantic Ranker enabled. Note: it is not supported in sweden central https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?products=search
- Azure OpenAI Service and text-embedding-ada-002 model deployed

## Overview

In this part, we will build the building blocks of a RAG solution.

- We will create a Search Index
- We will create a prompt
  ...

<!-- To create the index we need the following objects:

- Data Source - a `link` to some data storage
- Azure Index - defines the data structure over which to search
  - Create an empty index based on an index schema
  - Fill in the data using the Search Indexer (below\_)
- Azure Search Indexer - which acts as a crawler that retrieves data from external sources, can also trigger skillsets (Optical Character Recognition) -->

### Setup

First, we install the necessary dependencies.
https://github.com/openai/openai-cookbook/blob/main/examples/azure/chat_with_your_own_data.ipynb


In [1]:
%pip install python-dotenv
%pip install azure-search-documents==11.4.0

Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 23.3.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip






[notice] A new release of pip is available: 23.3.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip


In this workshop, we'll use `dotenv`. To connect with Azure OpenAI and the Search index, the following variables should be added to a .env file in KEY=VALUE format:
...


In [2]:
import os
import dotenv

# %reload_ext dotenv
# %dotenv
%reload_ext dotenv
%dotenv

### Import required libraries and environment variables


In [3]:
import os
import json
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import (
    VectorizedQuery,
    VectorFilterMode,
    QueryType,
    QueryCaptionType,
    QueryAnswerType,
)
from azure.search.documents.indexes.models import (
    SearchIndex,
    ScoringProfile,
    SearchFieldDataType,
    SimpleField,
    SearchableField,
    SearchField,
    SemanticConfiguration,
    SemanticField,
    VectorSearchProfile,
    HnswAlgorithmConfiguration,
    VectorSearch,
    HnswParameters,
    SemanticPrioritizedFields,
    SemanticSearch,
)
from azure.search.documents.indexes import SearchIndexClient

subscription_id = os.environ["subscription_id"]
resource_group_name = os.environ["resource_group_name"]
workspace_name = os.environ["workspace_name"]
service_endpoint = os.environ[
    "service_endpoint"
]  # the endpoint of your Azure Cognitive Search service
key = os.environ["search_key"]

# aoai_connection_name = os.environ['aoai_connection_name']
aoi_api_key = os.environ["aoi_api_key"]
aoai_endpoint = os.environ["aoai_endpoint"]
embedding_model_name = os.environ["embeddingModelName"]

search_index_name = "my_index_2"
search_index_key = os.getenv("AZURE_SEARCH_ADMIN_KEY")
credential = AzureKeyCredential(key)

### 1. Create Search Index

<!-- https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/samples/sample_index_crud_operations.py

https://github.com/microsoft/rag-experiment-accelerator/blob/development/rag_experiment_accelerator/init_Index/create_index.py

Used for overall Fields and Semantic Settings inspiration - https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/azure-search-vector-python-huggingface-model-sample.ipynb

Used for SearchField inspiration - https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/samples/sample_vector_search.py -->


In [4]:
def create_index(search_index_name):
    client = SearchIndexClient(service_endpoint, AzureKeyCredential(key))

    # 1. Define the fields
    fields = [
        SimpleField(
            name="id",
            type=SearchFieldDataType.String,
            key=True,
            sortable=True,
            filterable=True,
            # facetable=True,
        ),
        SearchableField(name="title", type=SearchFieldDataType.String),
        SearchableField(name="content", type=SearchFieldDataType.String),
        SearchableField(
            name="category", type=SearchFieldDataType.String, filterable=True
        ),
        SearchField(
            name="titleVector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=384,
            # Assign a vector profile to the field to specify the algorithm
            # to use when searching the vector field.
            vector_search_profile_name="my-vector-config",
        ),
        SearchField(
            name="contentVector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=384,
            vector_search_profile_name="my-vector-config",
        ),
    ]

    # 2. Define the semantic Settings
    # Note: It requires semantic ranker enabled on your search service
    # https://learn.microsoft.com/en-us/azure/search/semantic-search-overview
    # https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-request?tabs=portal%2Cportal-query
    # https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-request?tabs=sdk%2Cportal-query
    semantic_config = SemanticConfiguration(
        name="my-semantic-config",
        prioritized_fields=SemanticPrioritizedFields(
            title_field=SemanticField(field_name="title"),
            keywords_fields=[SemanticField(field_name="category")],
            content_fields=[SemanticField(field_name="content")],
        ),
    )
    semantic_search = SemanticSearch(configurations=[semantic_config])

    # 3. Configure the vector search configuration
    vector_search = VectorSearch(
        profiles=[
            VectorSearchProfile(
                name="my-vector-config",
                algorithm_configuration_name="my-algorithms-config",
                # Configuring a vectorizer in a search index is currently in public preview and available through API and beta SDK.
                # A vectorizer is a component of a search index that specifies a vectorization agent, such as a deployed embedding model on Azure OpenAI that converts text to vectors. You can define a vectorizer once, and then reference it in the vector profile assigned to a vector field.
                # A vectorizer is used for queries. It allows the search service to vectorize a text query on your behalf.
                # https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-configure-vectorizer
            )
        ],
        algorithms=[
            # Contains configuration options specific to the hnsw approximate nearest neighbors  algorithm used during indexing and querying
            HnswAlgorithmConfiguration(
                name="my-algorithms-config",
                kind="hnsw",
                # https://learn.microsoft.com/en-us/python/api/azure-search-documents/azure.search.documents.indexes.models.hnswparameters?view=azure-python-preview#variables
                parameters=HnswParameters(
                    m=4,
                    # The size of the dynamic list containing the nearest neighbors, which is used during index time.
                    # Increasing this parameter may improve index quality, at the expense of increased indexing time.
                    ef_construction=400,
                    # The size of the dynamic list containing the nearest neighbors, which is used during search time.
                    # Increasing this parameter may improve search results, at the expense of slower search.
                    ef_search=500,
                    # The similarity metric to use for vector comparisons.
                    # Known values are: "cosine", "euclidean", and "dotProduct"
                    metric="cosine",
                ),
            )
        ],
        
    )

    # CORS is used for apps that issues requests from different domains.
    # cors_options = CorsOptions(allowed_origins=["*"], max_age_in_seconds=60)

    # 4. Add scoring profiles when the default ranking behavior doesn't go far enough in meeting your business objectives.
    # https://learn.microsoft.com/en-us/azure/search/index-add-scoring-profiles
    scoring_profiles: List[ScoringProfile] = []
    index = SearchIndex(
        name=search_index_name,
        fields=fields,
        scoring_profiles=scoring_profiles,
        # cors_options=cors_options,
        # tokenizers=[], # TOOD: Add tokenizers,
        semantic_search=semantic_search,
        vector_search=vector_search,
    )

    result = client.create_or_update_index(index)
    print(f"{result.name} created or updated")

In [5]:
create_index(search_index_name)

my_index_2 created or updated


### 2. Create Embeddings

<!-- #### Which Embeddings Model to use?

There are several embedding options:

- OpenAI models, such as: [`text-embedding-ada-002`](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings), `text-embedding-3-small`, `text-embedding-3-large`
- HuggingFace models, which offers a wide range of models. The [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) ranks the performance of embeddings models on a few axis, though not all models can be run locally. -->


<!-- ### a) Embed a query using an embedding model from OpenAI -->


In [None]:
# import requests

# def get_query_embedding(
#     query,
#     endpoint=aoai_endpoint,
#     api_key=aoi_api_key,
#     api_version="2023-07-01-preview",
#     embedding_model_deployment=embedding_model_name,
# ):
#     request_url = f"{endpoint}/openai/deployments/{embedding_model_deployment}/embeddings?api-version={api_version}"
#     headers = {"Content-Type": "application/json", "api-key": api_key}
#     request_payload = {"input": query}
#     embedding_response = requests.post(
#         request_url, json=request_payload, headers=headers, timeout=None
#     )
#     if embedding_response.status_code == 200:
#         data_values = embedding_response.json()["data"]
#         embeddings_vectors = [data_value["embedding"] for data_value in data_values]
#         return embeddings_vectors
#     else:
#         raise Exception(f"failed to get embedding: {embedding_response.json()}")

In [None]:
# query = "Hello"

# query_vectors = get_query_embedding(
#     query, aoai_endpoint, aoi_api_key, "2023-07-01-preview", embedding_model_name
# )

# print(f"The embedded vector is: {query_vectors}")

<!-- #### Create embeddings using OpenAI

Read your data, generate embeddings using OpenAI model -->


In [None]:
# with open("./data/text-sample.json", "r", encoding="utf-8") as file:
#     input_data = json.load(file)

# for item in input_data:
#     title = item["title"]
#     content = item["content"]
#     title_embeddings = get_query_embedding(title)
#     content_embeddings = get_query_embedding(content)
#     item["titleVector"] = title_embeddings
#     item["contentVector"] = content_embeddings

# with open("./output/docVectors-openai.json", "w") as f:
#     json.dump(input_data, f)

### Embed a query using an embedding model from Hugging Face

We will use [`infloat/e5-small-v2`](https://huggingface.co/intfloat/e5-small-v2) from Hugging Face, which is of size 0.13 GB.


In [6]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-small-v2")
query = "Hello"

embedded_query = model.encode(query, normalize_embeddings=True)
print(len(embedded_query))

  from .autonotebook import tqdm as notebook_tqdm


384


#### Create embeddings using Hugging Face model

Read your data, generate embeddings using HuggingFace model


In [None]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-small-v2")

with open("./data/text-sample.json", "r", encoding="utf-8") as file:
    input_data = json.load(file)

for item in input_data:
    title = item["title"]
    content = item["content"]
    title_embeddings = model.encode(title, normalize_embeddings=True)
    content_embeddings = model.encode(content, normalize_embeddings=True)
    item["titleVector"] = title_embeddings.tolist()
    item["contentVector"] = content_embeddings.tolist()

with open("./output/docVectors-e5.json", "w") as f:
    json.dump(input_data, f)

### 3. Upload data

<!-- https://github.com/microsoft/rag-experiment-accelerator/blob/development/rag_experiment_accelerator/ingest_data/acs_ingest.py -->


Add texts and metadata from the JSON data to the vector store using Hugging Face embedded vectors:


In [7]:
# Upload some documents to the index
with open('./output/docVectors-e5.json', 'r') as file:  
    documents = json.load(file)  
search_client = SearchClient(endpoint=service_endpoint, index_name=search_index_name, credential=credential)
result = search_client.upload_documents(documents)  
print(f"Uploaded {len(documents)} documents") 

Uploaded 108 documents


In [8]:
def print_results(result):
    for result in results:
        print(f"Title: {result['title']}")
        print(f"Score: {result['@search.score']}")
        print(f"Content: {result['content']}")
        print(f"Category: {result['category']}\n")


search_client = SearchClient(service_endpoint, search_index_name, credential=credential)
query_embeddings = model.encode(query, normalize_embeddings=True)

### 4. Search

<!-- https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/azure-ai-search-outperforming-vector-search-with-hybrid/ba-p/3929167 -->

There are two layers of execution: retrieval and ranking.

- Retrieval - also called L1, has the goal to quickly find all the documents from the index that satisfy the search criteria (possibly across millions or billions of documents). These are scored to pick the top few (typically in order of 50) to return to the user or to feed the next layer. Azure AI Search supports three different models:

  - Keyword: Uses traditional full-text search methods – content is broken into terms through language-specific text analysis, inverted indexes are created for fast retrieval, and the BM25 probabilistic model is used for scoring.

  - Vector: Documents are converted from text to vector representations using an embedding model. Retrieval is performed by generating a query embedding and finding the documents whose vectors are closest to the query’s. We used Azure Open AI text-embedding-ada-002 (Ada-002) embeddings and cosine similarity for all our tests in this post.
  - Hybrid: Performs both keyword and vector retrieval and applies a fusion step to select the best results from each technique. Azure AI Search currently uses Reciprocal Rank Fusion (RRF) to produce a single result set.

- Ranking – also called L2, takes a subset of the top L1 results and computes higher quality relevance scores to reorder the result set. The L2 can improve the L1's ranking because it applies more computational power to each result. The L2 ranker can only reorder what the L1 already found – if the L1 missed an ideal document, the L2 can't fix that. L2 ranking is critical for RAG applications to make sure the best results are in the top positions.
  - Semantic ranking is performed by Azure AI Search's L2 ranker which utilizes multi-lingual, deep learning models adapted from Microsoft Bing. The Semantic ranker can rank the top 50 results from the L1.

https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/azure-ai-search-outperforming-vector-search-with-hybrid/ba-p/3929167


### Perform a vector similarity search


In [9]:
query = "tools for software development"
vector_query = VectorizedQuery(
    vector=query_embeddings.tolist(), k_nearest_neighbors=3, fields="contentVector"
)

results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["title", "content", "category"],
)

print_results(results)

Title: Azure Front Door
Score: 0.80073524
Content: Azure Front Door is a global, scalable, and secure entry point for fast delivery of your web applications. It provides features like load balancing, SSL offloading, and web application firewall (WAF). Front Door supports various Azure services, such as Azure App Service, Azure Storage, and Azure Virtual Machines. You can use Azure Front Door to build highly available and responsive applications, optimize your users' experience, and improve the security of your infrastructure. It also integrates with other Azure services, such as Azure CDN and Azure Traffic Manager.
Category: Networking

Title: Azure Power BI Embedded
Score: 0.7912205
Content: Azure Power BI Embedded is a cloud-based analytics service that enables you to embed interactive visualizations and reports into your applications. It provides features like data exploration, custom visuals, and real-time data refresh. Power BI Embedded supports various data sources, such as Azure

### Perform a hybrid search

Hybrid Retrieval brings out the best of Keyword and Vector Search

Keyword and vector retrieval tackle search from different perspectives, which yield complementary capabilities. Vector retrieval semantically matches queries to passages with similar meanings. This is powerful because embeddings are less sensitive to misspellings, synonyms, and phrasing differences and can even work in cross lingual scenarios. Keyword search is useful because it prioritizes matching specific, important words that might be diluted in an embedding.

User search can take many forms. Hybrid retrieval consistently brings out the best from both retrieval methods across query types. With the most effective L1, the L2 ranking step can significantly improve the quality of results in the top positions.


In [10]:
# Pure Vector Search
query = "scalable storage solution"
query_embeddings = model.encode(query, normalize_embeddings=True)
vector_query = VectorizedQuery(
    vector=query_embeddings.tolist(), k_nearest_neighbors=3, fields="contentVector"
)

results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    select=["title", "content", "category"],
    top=3,
)

print_results(results)

Title: Azure Storage
Score: 0.03333333507180214
Content: Azure Storage is a scalable, durable, and highly available cloud storage service that supports a variety of data types, including blobs, files, queues, and tables. It provides a massively scalable object store for unstructured data. Storage supports data redundancy and geo-replication, ensuring high durability and availability. It offers a variety of data access and management options, including REST APIs, SDKs, and Azure Portal. You can secure your data using encryption at rest and in transit.
Category: Storage

Title: Azure File Storage
Score: 0.0320020467042923
Content: Azure File Storage is a fully managed, scalable, and secure file sharing service that enables you to store and access your files over the Server Message Block (SMB) protocol. It provides features like snapshots, shared access signatures, and integration with Azure Backup. File Storage supports various platforms, such as Windows, Linux, and macOS. You can use Az

### Perform a semantic hybrid search - Required Semantic Ranker enabled


In [12]:
query = "what is azure sarch?"

query_embeddings = model.encode(query, normalize_embeddings=True)
vector_query = VectorizedQuery(
    vector=query_embeddings.tolist(), k_nearest_neighbors=3, fields="contentVector"
)

results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    select=["title", "content", "category"],
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name="my-semantic-config",
    query_caption=QueryCaptionType.EXTRACTIVE,
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=3,
)

semantic_answers = results.get_answers()
for answer in semantic_answers:
    if answer.highlights:
        print(f"Semantic Answer: {answer.highlights}")
    else:
        print(f"Semantic Answer: {answer.text}")
    print(f"Semantic Answer Score: {answer.score}\n")

for result in results:
    print(f"Title: {result['title']}")
    print(f"Reranker Score: {result['@search.reranker_score']}")
    print(f"Content: {result['content']}")
    print(f"Category: {result['category']}")

    captions = result["@search.captions"]
    if captions:
        caption = captions[0]
        if caption.highlights:
            print(f"Caption: {caption.highlights}\n")
        else:
            print(f"Caption: {caption.text}\n")

Semantic Answer: Azure File Storage is<em> a fully managed, scalable, and secure file sharing service that enables you to store and access your files over the Server Message Block (SMB) protocol.</em> It provides features like snapshots, shared access signatures, and integration with Azure Backup. File Storage supports various platforms, such as Windows, Linux, and macOS.
Semantic Answer Score: 0.9208984375

Title: Azure Stack Edge
Reranker Score: 2.075716972351074
Content: Azure Stack Edge is a managed, edge computing appliance that enables you to run Azure services and AI workloads on-premises or at the edge. It provides features like hardware-accelerated machine learning, local caching, and integration with Azure IoT Hub. Azure Stack Edge supports various Azure services, such as Azure Functions, Azure Machine Learning, and Azure Kubernetes Service. You can use Azure Stack Edge to build edge computing applications, optimize your data processing, and ensure the security and compliance