# 05 Vector DB | 01 Azure Cognitive Search

## Azure Environment

You can create the necessary Azure OpenAI instance and deploy an embedding model by executing one of the deployment options [mentioned here](../../create_env/src/AzureCLI/CreateEnv.azcli). Running the sample code requires service-specific information ([details and instructions here](../01_DemoEnvironment/01_Environment.ipynb)).

## Step 1 - Create OpenAIClient

The `OpenAIClient` from Azure.AI.OpenAI is a .NET client library that serves as the central entry point for .NET code that interacts with a deployed Azure OpenAI large language model. It provides methods that wrap the OpenAI REST APIs for tasks such as text completion, text embedding, and chat completion. It also lets developers specify the model deployment and the options for each request, such as temperature, frequency penalty, presence penalty, and stop sequences.

The OpenAIClient can connect to any Azure OpenAI resource or to the non-Azure OpenAI inference endpoint, making it a versatile and powerful tool for .NET development with OpenAI.

In [None]:
#r "nuget: Azure.AI.OpenAI, 1.0.0-beta.8"
#r "nuget: DotNetEnv, 2.5.0"
#r "nuget: Azure.Search.Documents, 11.5.0-beta.4"

using Azure;
using Azure.AI.OpenAI;
using Azure.Search.Documents;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
using Azure.Search.Documents.Models;
using DotNetEnv;


static string _configurationFile = @"../01_DemoEnvironment/conf/application.env";
Env.Load(_configurationFile);

string oAiApiKey = Environment.GetEnvironmentVariable("SKIT_AOAI_APIKEY") ?? "SKIT_AOAI_APIKEY not found";
string oAiEndpoint = Environment.GetEnvironmentVariable("SKIT_AOAI_ENDPOINT") ?? "SKIT_AOAI_ENDPOINT not found";
string embeddingDeploymentName = Environment.GetEnvironmentVariable("SKIT_EMBEDDING_DEPLOYMENTNAME") ?? "SKIT_EMBEDDING_DEPLOYMENTNAME not found";

string cognitiveSearchEndpoint = Environment.GetEnvironmentVariable("SKIT_SEARCH_ENDPOINT") ?? "";
string cognitiveSearchApiKey = Environment.GetEnvironmentVariable("SKIT_SEARCH_APIKEY") ?? "";

AzureKeyCredential azureKeyCredential = new AzureKeyCredential(oAiApiKey);
OpenAIClient openAIClient = new OpenAIClient(new Uri(oAiEndpoint), azureKeyCredential);

AzureKeyCredential searchCredential = new AzureKeyCredential(cognitiveSearchApiKey);
SearchIndexClient searchIndexClient = new SearchIndexClient(new Uri(cognitiveSearchEndpoint), searchCredential);

Console.WriteLine("Azure OpenAI client created...");
Console.WriteLine("Cognitive Search client created...");    


Expected output:
```
Installed Packages
    Azure.AI.OpenAI, 1.0.0-beta.8
    Azure.Search.Documents, 11.5.0-beta.4
    DotNetEnv, 2.5.0
Azure OpenAI client created...
Cognitive Search client created...
```
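The cell above falls back to placeholder strings such as `"SKIT_AOAI_ENDPOINT not found"` when a variable is missing, so a configuration problem only shows up later, for example as a URI parsing error. The following is a minimal sketch of a fail-fast alternative. The helper `RequireEnv` and the variable name `SKIT_DEMO_SETTING` are hypothetical and not part of the notebook or of any SDK.

```csharp
using System;

// Hypothetical helper (an assumption, not part of the notebook):
// return the value of a required environment variable or throw a descriptive error.
string RequireEnv(string name)
{
    var value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException(
            $"Environment variable '{name}' is not set. Check the application.env file.");
    }
    return value;
}

// Demonstration with a made-up variable that is set in-process.
Environment.SetEnvironmentVariable("SKIT_DEMO_SETTING", "demo-value");
Console.WriteLine(RequireEnv("SKIT_DEMO_SETTING"));
```

In the notebook, the same helper could wrap the existing names, for example `new Uri(RequireEnv("SKIT_AOAI_ENDPOINT"))`, so that a missing setting is reported by name before any client is created.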


## Step 2 - Create Search Index

In [None]:
//Create search index
string indexName = "fact-index"; 
string searchConfigName = "fact-config";

// Must match the output size of the deployed embedding model (1536 for text-embedding-ada-002).
int modelDimensions = 1536;
SearchIndex searchIndex = new(indexName)
{
    Fields =
    {
        new SimpleField("FactId", SearchFieldDataType.String) { IsKey = true, IsFilterable = true, IsSortable = true, IsFacetable = true },
        new SearchableField("FactName") { IsFilterable = true, IsSortable = true },
        new SearchableField("FactDescription") { IsFilterable = true },
        new SearchField("FactVector", SearchFieldDataType.Collection(SearchFieldDataType.Single))
        {
            IsSearchable = true,
            VectorSearchDimensions = modelDimensions,
            VectorSearchConfiguration = searchConfigName
        },
    },
    VectorSearch = new()
    {
        AlgorithmConfigurations =
        {
            new HnswVectorSearchAlgorithmConfiguration(searchConfigName)
        }
    }
}; 

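// Delete any existing index with the same name; ignore only the "index not found" case.
try { await searchIndexClient.DeleteIndexAsync(indexName); } catch (RequestFailedException ex) when (ex.Status == 404) { }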
await searchIndexClient.CreateIndexAsync(searchIndex);   
Console.WriteLine("Search index created...");


Expected output:
```
Search index created...
```

## Step 3 - Define facts

Technical facts are precise pieces of information or data related to a specific technical domain or subject. They can encompass a wide range of topics including software specifications, hardware configurations, algorithm behaviors, coding standards, or any other technical attributes and characteristics. These facts are typically derived from technical documentation, product manuals, technical research papers, or direct observations and measurements.

The use case for having a list of technical facts and enabling semantic search on these facts is manifold. It can significantly aid in:

1. **Knowledge Retrieval**: Quickly retrieving relevant technical information needed for decision-making, problem-solving, or understanding a particular technical concept or system.
2. **Technical Support and Troubleshooting**: Assisting in identifying solutions to technical issues or providing guidance based on the technical facts available.
3. **Product Development and Improvement**: Facilitating a deeper understanding of technical specifications and requirements which in turn can guide product development and improvement efforts.
4. **Research and Innovation**: Providing a foundation for research and innovation by offering a rich source of technical data that can be explored and analyzed.
5. **Educational Purposes**: Serving as a valuable resource for education, helping learners and professionals grasp technical concepts and stay updated on technical advancements.

By having a semantic search capability on these technical facts, users can go beyond simple keyword searches and leverage the power of semantic understanding to find more relevant and contextual results. This makes the search process much more effective and insightful, especially in scenarios where precision and context are crucial.


In [None]:
//Define facts
public class Fact
{
    public string FactId { get; set; } = "";
    public string FactName { get; set; } = "";
    public string FactDescription { get; set; } = "";
    public IReadOnlyList<float> FactVector { get; set; } = new List<float>(); 
}

Fact[] facts = new[]
{
    new Fact()
    {
        FactId = "1",
        FactName = "Company Music",
        FactDescription = @"Company Music is one of the world's leading record labels. 
                            It has signed famous singers and is very profitable! 
                            The flagship of Contoso Music is a group that performs under the name 'Contoso Only'!",
        FactVector = new float[1536],
    },
    new Fact()
    {
        FactId = "2",
        FactName = "Company Maritime",
        FactDescription = @"Company Maritime builds heavy-industry maritime products. 
                        The current bestseller is the container transporter 'Contoso XL Heavy 2000'.",
        FactVector = new float[1536],
    },
    new Fact()
    {
        FactId = "3",
        FactName = "Company Agriculture",
        FactDescription = @"Company Agriculture is a German start-up that focuses on the production of milk and grain. 
                            Since this is a start-up, no further information is available!",
        FactVector = new float[1536], 
    },
};

Console.WriteLine("Facts defined...");

Expected output:
```
Facts defined...
```

## Step 4 - Upload facts to Azure Cognitive Search

In this step we calculate the embeddings for the facts and upload them to Azure Cognitive Search.

In [None]:
foreach(Fact fact in facts) {
    EmbeddingsOptions factEmbeddingsOptions = new EmbeddingsOptions(fact.FactDescription);
    var factEmbedding = await openAIClient.GetEmbeddingsAsync(embeddingDeploymentName, factEmbeddingsOptions);
    float[] embeddingData = factEmbedding.Value.Data[0].Embedding.ToArray<float>();
    fact.FactVector = embeddingData;
}

SearchClient searchClient = searchIndexClient.GetSearchClient(indexName);
await searchClient.IndexDocumentsAsync(IndexDocumentsBatch.Upload(facts));

Console.WriteLine("Fact documents uploaded...");

Expected output:
```
Fact documents uploaded...
```
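Each fact starts with a placeholder vector (`new float[1536]`, all zeros), and the index field is declared with a fixed dimension. If an embedding call is skipped or a different model is deployed, the upload can carry vectors that are all zeros or of the wrong length. The sketch below shows one way to guard against this before uploading. The helper `ValidateEmbedding` is hypothetical and not part of the notebook or the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical guard (an assumption, not part of the notebook):
// reject vectors with the wrong length or vectors that still hold only zeros.
void ValidateEmbedding(IReadOnlyList<float> vector, int expectedDimensions, string label)
{
    if (vector.Count != expectedDimensions)
    {
        throw new ArgumentException(
            $"{label}: expected {expectedDimensions} dimensions but got {vector.Count}.");
    }
    if (vector.All(component => component == 0f))
    {
        throw new ArgumentException($"{label}: vector contains only zeros (placeholder not replaced?).");
    }
}

// Demonstration with a made-up 4-dimensional vector.
float[] demoVector = { 0.1f, 0.0f, -0.3f, 0.7f };
ValidateEmbedding(demoVector, 4, "demo fact");
Console.WriteLine("Embedding looks valid.");
```

In the notebook, such a check could run inside the `foreach` loop after `fact.FactVector` is assigned, using `modelDimensions` as the expected size.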

## Step 5 - Vector search

In this step we use Azure Cognitive Search to find the fact most relevant to a user query. The query is converted to an embedding and compared with the embeddings of the stored facts. Because `KNearestNeighborsCount` is set to 1, only the single most similar fact is returned.

In [None]:
//Search vectors
EmbeddingsOptions embeddingsOptions; 
embeddingsOptions = new EmbeddingsOptions("Who produces Container Ships?");
var embedding = await openAIClient.GetEmbeddingsAsync(embeddingDeploymentName, embeddingsOptions);
float[] vectorizedResult = embedding.Value.Data[0].Embedding.ToArray<float>();


SearchResults<Fact> response = await searchClient.SearchAsync<Fact>(
    null,
    new SearchOptions {
        Vectors = { new() { Value = vectorizedResult, KNearestNeighborsCount = 1, Fields = { "FactVector" } } },
    }
);

Console.WriteLine("Single Vector Search Results:");
await foreach (SearchResult<Fact> result in response.GetResultsAsync())
{
    Console.WriteLine($"Search result: {result.Document.FactId}: {result.Document.FactName}");
}

## Understanding the result

The search returns the right company for the user query. The user asked which of the companies defined in the facts produces container ships. The service selected the maritime fact because its embedding is closest to the embedding of the query: the two texts are semantically similar even though their wording differs ("container ships" versus "container transporter").
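To make "semantically similar" concrete, the following sketch ranks a few vectors against a query vector by cosine similarity and picks the best match, which is conceptually what a nearest-neighbor search with k = 1 does. It is an illustration under stated assumptions, not the service's implementation: the index uses an HNSW configuration, which is an approximate method, whereas this sketch compares exhaustively. Cosine similarity as the metric is an assumption here, and the 3-dimensional vectors are made-up stand-ins for real 1536-dimensional embeddings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity: dot product divided by the product of the vector lengths.
double CosineSimilarity(IReadOnlyList<float> a, IReadOnlyList<float> b)
{
    if (a.Count != b.Count) throw new ArgumentException("Vectors must have the same length.");
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Count; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA == 0 || normB == 0) throw new ArgumentException("Zero vectors have no direction.");
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Toy vectors keyed by fact id (made-up values, not real embeddings).
var toyFacts = new (string Id, float[] Vector)[]
{
    ("1", new float[] { 0.9f, 0.1f, 0.0f }),
    ("2", new float[] { 0.1f, 0.9f, 0.2f }),
    ("3", new float[] { 0.0f, 0.2f, 0.9f }),
};
float[] toyQuery = { 0.2f, 0.8f, 0.1f };

// Exhaustive nearest-neighbor search with k = 1: keep the highest-scoring fact.
var best = toyFacts
    .OrderByDescending(fact => CosineSimilarity(fact.Vector, toyQuery))
    .First();
Console.WriteLine($"Best match: fact {best.Id}"); // prints "Best match: fact 2"
```

The exhaustive comparison costs one similarity computation per stored vector, which is fine for three facts; approximate indexes such as HNSW exist to avoid that cost on large collections.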

## Next step

The previous notebooks used the Azure OpenAI SDK, and specifically the `OpenAIClient` object, to interact with LLMs. Microsoft has open-sourced another SDK, `Semantic Kernel`, which introduces additional concepts (plugins, memory, planner, connectors, and others) to simplify development even further. [Here's an overview](../06_SemanticKernel/README.md) of Semantic Kernel, and here are notebooks with demo code to explore the concepts:

- [Plugin semantic inline function](../06_SemanticKernel/01_PlugIn_SemanticFunction_Inline.ipynb)
- [Plugin semantic function from file](../06_SemanticKernel/02_PlugIn_SemanticFunction_File.ipynb)
- [Plugin native function](../06_SemanticKernel/03_PlugIn_NativeFunction.ipynb)
- [Demo Memories](../06_SemanticKernel/04_Memory.ipynb)
- [Demo Planner](../06_SemanticKernel/05_Planner.ipynb)
- [Demo Logs in Semantic Kernel](../06_SemanticKernel/06_Logs.ipynb)
