# 3.2. Search Experiment

WARNING: this Notebook is not meant to be run in this learning experience.

Search is at the core of the RAG pattern. Precise, efficient, and consistent search is critical when implementing a solution based on RAG.

There are four [search options](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/use-your-data?tabs=ai-search#search-options): keyword search, semantic search, vector search, and hybrid search.

### [The Role of Search](https://github.com/microsoft/rag-openai/blob/main/topics/RAG_EnablingSearch.md#the-role-of-search)

The main purpose of the search tool is to retrieve a first cut of relevant documents for the large language model to analyze. It filters out noise and reduces the result set that the model has to summarize.

Search is at the heart of a RAG solution. It is the mechanism that ensures the context sent in the prompt contains the information the model needs to answer the question.

<!-- ### [Evaluating the Retrieval Component](https://github.com/microsoft/rag-openai/blob/main/topics/RAG_EnablingEvaluation.md#evaluating-the-retrieval-component)

Regarding the Retrieval component, the dataset is composed of question and citation instead of question and answer.

- question: the user question
- citation: the piece(s) of text that contains the relevant content to answer the user question
- answer: the final answer in a human readable/friendly format
  Evaluating the Retrieval component means to evaluate if for a given query (user question) the search engine is returning the relevant citation(s). -->

📝 **Hypothesis**

The hypothesis for this experiment is an exploratory one: "Can introducing a new type of search improve the system's performance?"

🎯 **Measure of Success**

Information retrieval is a well-known problem, and the classic metrics are Precision, Recall, F1 Score, Mean Average Precision (MAP), Mean Normalized Discounted Cumulative Gain (Mean NDCG), and Mean Reciprocal Rank (MRR). More details can be found in [Evaluating Information Retrieval Models: A Comprehensive Guide to Performance Metrics](https://medium.com/@prateekgaurav/evaluating-information-retrieval-models-a-comprehensive-guide-to-performance-metrics-78aadacb73b4).

See also [Evaluation Metrics](https://github.com/microsoft/rag-openai/blob/main/topics/RAG_EnablingEvaluation.md#evaluation-metrics).
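As a rough illustration of how some of these metrics are computed, here is a minimal sketch in plain Python. It assumes each question has been labeled with the set of chunk ids that contain the relevant content; the function names and the sample chunk ids are hypothetical and are not part of this notebook's helpers.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the top-k retrieved chunk ids that are relevant.

    Note: this variant divides by the number of results actually returned
    (at most k); some definitions divide by k instead.
    """
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant)
    return hits / len(top_k)


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant chunk ids that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per question."""
    scores = [reciprocal_rank(retrieved, relevant) for retrieved, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labeled data: these chunk ids are made up for illustration.
retrieved_ids = ["chunk_a", "chunk_b", "chunk_c"]
relevant_ids = {"chunk_b", "chunk_d"}

print("precision@3:", precision_at_k(retrieved_ids, relevant_ids, k=3))
print("recall@3:", recall_at_k(retrieved_ids, relevant_ids, k=3))
print("MRR:", mean_reciprocal_rank([(retrieved_ids, relevant_ids), (["chunk_x"], {"chunk_x"})]))
```

In an experiment, the `retrieved` list would come from the ids returned by each search type for the same question, so the metrics can be compared side by side.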


In [4]:
%%capture --no-display
%run -i ./pre-requisites.ipynb
%run -i ./helpers/search.ipynb

In [8]:
from azure.search.documents.models import (
    VectorizedQuery,
    QueryType,
    QueryCaptionType,
    QueryAnswerType,
)
from azure.search.documents import SearchClient

search_index_name = "index_chunks_2"
search_client = SearchClient(
    endpoint=service_endpoint, index_name=search_index_name, credential=credential
)

## 1. Perform a keyword search


In [9]:
query = "How can I test my solution"

results = search_client.search(
    search_text=query,
    select=["chunkId", "chunkContent", "source"],
    top=1,
)

for result in results:
    print(f"chunkId: {result['chunkId']}")
    print(f"source: {result['source']}")
    print(f"Score: {result['@search.score']}")
    print(f"chunkContent: {result['chunkContent']}")
    print("\n")
print(results)

chunkId: chunk63_0
source: ..\data\docs\code-with-engineering\code-reviews\pull-request-template.md
Score: 15.121317
chunkContent: Work Item ID

For more information about how to contribute to this repo, visit this page

Description

Should include a concise description of the changes (bug or feature), it's impact, along with a summary of the solution

Steps to Reproduce Bug and Validate Solution

Only applicable if the work is to address a bug. Please remove this section if the work is for a feature or story
Provide details on the environment the bug is found, and detailed steps to recreate the bug.
This should be detailed enough for a team member to confirm that the bug no longer occurs

PR Checklist

Use the check-list below to ensure your branch is ready for PR.  If the item is not applicable, leave it blank.

[ ] I have updated the documentation accordingly.

[ ] I have added tests to cover my changes.

[ ] All new and existing tests passed.

[ ] My code follows the code style of 

## 2. Perform a vector search


In [10]:
query = "tools for software development"

query_embeddings = get_query_embedding(query)
vector_query = VectorizedQuery(
    vector=query_embeddings[0], k_nearest_neighbors=1, fields="chunkContentVector"
)

results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["chunkId", "chunkContent", "source"],
    top=1,
)

for result in results:
    print(f"chunkId: {result['chunkId']}")
    print(f"source: {result['source']}")
    print(f"Score: {result['@search.score']}")
    print(f"chunkContent: {result['chunkContent']}")

chunkId: chunk178_4
source: ..\data\docs\code-with-engineering\agile-development\advanced-topics\team-agreements\team-manifesto.md
Score: 0.85869
chunkContent: Tools

Generally team sessions are enough for building a manifesto and having a consensus around it, and if there is a need for improving it in a structured way, there are many blogs and tools online, any retrospective tool can be used.

Resources

Technical Agility*


## 3. Perform a hybrid search

Hybrid retrieval brings out the best of keyword and vector search.

Keyword and vector retrieval tackle search from different perspectives, which yields complementary capabilities. Vector retrieval semantically matches queries to passages with similar meanings. This is powerful because embeddings are less sensitive to misspellings, synonyms, and phrasing differences, and can even work in cross-lingual scenarios. Keyword search is useful because it prioritizes matching specific, important words that might be diluted in an embedding.

User queries can take many forms, and hybrid retrieval consistently brings out the best of both retrieval methods across query types. With hybrid retrieval as the most effective first-stage retrieval step (L1), a second-stage ranking step (L2), such as the semantic ranker used in the next section, can significantly improve the quality of the results in the top positions.
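To make the idea of merging two rankings concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the method the Azure AI Search documentation describes for combining keyword and vector results in a hybrid query. The function name, the constant `k=60`, and the sample chunk ids are assumptions for illustration; this is not the service's actual implementation.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of ids into one list sorted by RRF score.

    Each id receives 1 / (k + rank) from every ranking it appears in,
    so ids ranked highly by more than one method rise to the top.
    The default k=60 is an assumed value, chosen here for illustration.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings: these chunk ids are made up for illustration.
keyword_ranking = ["chunk_a", "chunk_b", "chunk_c"]
vector_ranking = ["chunk_b", "chunk_d", "chunk_a"]

for doc_id, score in reciprocal_rank_fusion([keyword_ranking, vector_ranking]):
    print(f"{doc_id}: {score:.5f}")
```

Because RRF scores are sums of small reciprocals, they are not on the same scale as keyword or vector similarity scores. That would be consistent with the much smaller `@search.score` values that hybrid queries return, and it means scores should not be compared across search types.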


In [11]:
query = "scalable storage solution"
query_embeddings = get_query_embedding(query)
vector_query = VectorizedQuery(
    vector=query_embeddings[0], k_nearest_neighbors=3, fields="chunkContentVector"
)

results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    select=["chunkId", "chunkContent", "source"],
    top=1,
)

for result in results:
    print(f"chunkId: {result['chunkId']}")
    print(f"source: {result['source']}")
    print(f"Score: {result['@search.score']}")
    print(f"chunkContent: {result['chunkContent']}")

chunkId: chunk80_12
source: ..\data\docs\code-with-dataops\industry-solutions\dataops-for-automotive\dataops-avops-platform.md
Score: 0.030751174315810204
chunkContent: Similarly, there are rules set up in the ADLS blob storage for moving the data from "Hot" to "Cold" tier as data travels to "DERIVED" from "RAW" Zone.

Solution also provides flexibility in terms of configurations for data lifecycle to meet the customer needs.

Cosmos DB stores the structured data of this solution, that is the foundational units of this solution viz. Measurement and Datastream json files.

The lineage information for Datastream is finally stored in the Cosmos DB. An API (Metadata API) is exposed over Cosmos DB to query the structured data and that also helps in tracking the data lineage.

Cosmos DB serves multiple purposes for the AVOps platform. Not only does it store the structured metadata of Rosbag files, but it also allows multiple clients to work with the system through a storage layer manager (Me

## 4. Perform a semantic hybrid search (requires the semantic ranker to be enabled)

To do this, we first need to update the index and add a semantic configuration.


In [14]:
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SemanticConfiguration,
    SemanticSearch,
    SemanticField,
    SemanticPrioritizedFields,
)


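def create_index(search_index_name):
    """Add a semantic configuration to an existing index and update it in place."""

    # 1. Fetch the existing index definition
    client = SearchIndexClient(service_endpoint, credential)
    index = client.get_index(search_index_name)

    # 2. Define the semantic settings
    # Note: this requires the semantic ranker to be enabled on your search service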
    # https://learn.microsoft.com/en-us/azure/search/semantic-search-overview
    # https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-request?tabs=portal%2Cportal-query
    # https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-request?tabs=sdk%2Cportal-query
    semantic_config = SemanticConfiguration(
        name="my-semantic-config",
        prioritized_fields=SemanticPrioritizedFields(
            # title_field=SemanticField(field_name="title"),
            # keywords_fields=[SemanticField(field_name="category")],
            content_fields=[SemanticField(field_name="chunkContent")],
        ),
    )
    semantic_search = SemanticSearch(configurations=[semantic_config])
    index.semantic_search = semantic_search

    result = client.create_or_update_index(index)
    print(f"{result.name} created or updated")

In [15]:
create_index(search_index_name)

index_chunks_2 created or updated


In [16]:
query = "what is azure sarch?"

query_embeddings = get_query_embedding(query)
vector_query = VectorizedQuery(
    vector=query_embeddings[0], k_nearest_neighbors=3, fields="chunkContentVector"
)

results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    select=["chunkId", "chunkContent", "source"],
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name="my-semantic-config",
    query_caption=QueryCaptionType.EXTRACTIVE,
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=1,
)

semantic_answers = results.get_answers()
for answer in semantic_answers:
    if answer.highlights:
        print(f"Semantic Answer: {answer.highlights}")
    else:
        print(f"Semantic Answer: {answer.text}")
    print(f"Semantic Answer Score: {answer.score}\n")

for result in results:
    print(f"chunkId: {result['chunkId']}")
    print(f"source: {result['source']}")
    print(f"Score: {result['@search.score']}")
    print(f"chunkContent: {result['chunkContent']}")
    # print(f"Title: {result['title']}")
    print(f"Reranker Score: {result['@search.reranker_score']}")

    captions = result["@search.captions"]
    if captions:
        caption = captions[0]
        if caption.highlights:
            print(f"Caption: {caption.highlights}\n")
        else:
            print(f"Caption: {caption.text}\n")

chunkId: chunk80_13
source: ..\data\docs\code-with-dataops\industry-solutions\dataops-for-automotive\dataops-avops-platform.md
Score: 0.020846012979745865
chunkContent: Azure Batch is the heart of compute in this solution. Azure Batch natively supports massively parallel processing, which fine-tunes the performance and turn-around time of data pipelines.

For a sample Rosbag extraction process, multiple topics (lidar, radar, camera1, camera2 etc.) need to be extracted. It is designed to extract all the topics in with multiple jobs running in parallel.

Azure Batch supports Azure spot VM instances, which reduces the cost of the Batch workloads. Setting up an auto-scale formula helps further in cost optimization.

Solution provides flexibility in terms of hosting the compute for the API/services exposed.

Azure Function can help optimize cost of burst kind of traffic scenarios, when services are idle for most of the time.

AKS (Azure Kubernetes Service) can be also in consideration when 

## 💡 Conclusions
