# 03 Vector DB | 01 Azure Cognitive Search

## Azure Environment

To execute the sample code, Azure service-specific information such as the endpoint and API key is needed ([details and instructions can be found here](../01_CreateEnvironment/01_Environment.ipynb)).

## Step 1 - Create OpenAIClient / SearchIndexClient

The OpenAIClient from Azure.AI.OpenAI is the central entry point for .NET code that interacts with a deployed Azure OpenAI large language model. It provides methods to access the OpenAI REST APIs for tasks such as text completion, text embedding, and chat completion. It also allows developers to specify the model and the options for each request, such as temperature, frequency penalty, presence penalty, and stop sequences.

The OpenAIClient can connect to any Azure OpenAI resource or to the non-Azure OpenAI inference endpoint, making it a versatile and powerful tool for .NET development with OpenAI.

The SearchIndexClient from Azure.Search.Documents is used to create, update, and delete search indexes on the search service.

In [1]:
#r "nuget: Azure.AI.OpenAI, 1.0.0-beta.16"
#r "nuget: DotNetEnv, 2.5.0"
#r "nuget: Azure.Search.Documents, 11.5.0-beta.4"

using Azure;
using Azure.AI.OpenAI;
using Azure.Search.Documents;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
using Azure.Search.Documents.Models;
using DotNetEnv;


//configuration file is created during environment creation
static string _configurationFile = @"../Configuration/application.env";
Env.Load(_configurationFile);

string oAiApiKey = Environment.GetEnvironmentVariable("WS_AOAI_APIKEY") ?? "WS_AOAI_APIKEY not found";
string oAiEndpoint = Environment.GetEnvironmentVariable("WS_AOAI_ENDPOINT") ?? "WS_AOAI_ENDPOINT not found";
string embeddingDeploymentName = Environment.GetEnvironmentVariable("WS_EMBEDDING_DEPLOYMENTNAME") ?? "WS_EMBEDDING_DEPLOYMENTNAME not found";

string cognitiveSearchEndpoint = Environment.GetEnvironmentVariable("WS_SEARCH_ENDPOINT") ?? "";
string cognitiveSearchApiKey = Environment.GetEnvironmentVariable("WS_SEARCH_APIKEY") ?? "";

AzureKeyCredential azureKeyCredential = new AzureKeyCredential(oAiApiKey);
OpenAIClient openAIClient = new OpenAIClient(new Uri(oAiEndpoint), azureKeyCredential);

AzureKeyCredential searchCredential = new AzureKeyCredential(cognitiveSearchApiKey);
SearchIndexClient searchIndexClient = new SearchIndexClient(new Uri(cognitiveSearchEndpoint), searchCredential);

Console.WriteLine("Azure OpenAI client created...");
Console.WriteLine("Cognitive Search client created...");    


Azure OpenAI client created...
Cognitive Search client created...
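The cell above falls back to placeholder strings (or empty strings) when a setting is missing, so a wrong configuration only surfaces later as a URI or authentication error. As a minimal sketch of an alternative, the helper below fails fast with a clear message. `RequireSetting` and the variable name `WS_DEMO_SETTING` are hypothetical and not part of the notebook or of any library.

```csharp
using System;

// Hypothetical helper (assumption, not part of the notebook):
// read a required setting and throw if it is missing or blank.
static string RequireSetting(string name)
{
    var value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException(
            $"Setting '{name}' is missing. Check the configuration file created during environment setup.");
    }
    return value;
}

// Usage with a variable that is set only for this demonstration.
Environment.SetEnvironmentVariable("WS_DEMO_SETTING", "https://example.invalid/");
Console.WriteLine(RequireSetting("WS_DEMO_SETTING"));
```

In the notebook, the same helper could be called with the `WS_...` names after `Env.Load`, assuming that the loaded values are available as environment variables, as the cell above already relies on.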


## Step 2 - Create Search Index

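To store data in an Azure Cognitive Search instance, a so-called search index is needed. The index defined below contains a key field, two searchable text fields, and a vector field whose dimension must match the length of the embeddings that are stored in it.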

In [2]:
//Create search index
string indexName = "fact-index"; 
string searchConfigName = "fact-config";

//Must match the vector length returned by the embedding deployment
int modelDimensions = 1536;
SearchIndex searchIndex = new(indexName)
{
    Fields =
    {
        new SimpleField("FactId", SearchFieldDataType.String) { IsKey = true, IsFilterable = true, IsSortable = true, IsFacetable = true },
        new SearchableField("FactName") { IsFilterable = true, IsSortable = true },
        new SearchableField("FactDescription") { IsFilterable = true },
        new SearchField("FactVector", SearchFieldDataType.Collection(SearchFieldDataType.Single))
        {
            IsSearchable = true,
            VectorSearchDimensions = modelDimensions,
            VectorSearchConfiguration = searchConfigName
        },
    },
    VectorSearch = new()
    {
        AlgorithmConfigurations =
        {
            new HnswVectorSearchAlgorithmConfiguration(searchConfigName)
        }
    }
}; 

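//Delete an existing index so the cell can be re-run; only a "not found" response is ignored
try { await searchIndexClient.DeleteIndexAsync(indexName); } catch (RequestFailedException ex) when (ex.Status == 404) {}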
await searchIndexClient.CreateIndexAsync(searchIndex);   
Console.WriteLine("Search index created...");


Search index created...


## Step 3 - Define Facts

For demo purposes, three facts about fictitious companies are defined:

- *Company Music:* Company Music is one of the world's leading record labels. It has signed famous singers and is very profitable! The flagship of Contoso Music is a group that performs under the name 'Contoso Only'!
- *Company Maritim:* Company Heavy Industry Maritime products. The current bestseller is the container transporter 'Contoso XL Heavy 2000'.
- *Company Agriculture:* Company Agriculture is a German start-up that focuses on the production of milk and grain. Since this is a start-up, no further information is available!

In [3]:
//Define facts
public class Fact
{
    public string FactId { get; set; } = "";
    public string FactName { get; set; } = "";
    public string FactDescription { get; set; } = "";
    public IReadOnlyList<float> FactVector { get; set; } = new List<float>(); 
}

Fact[] facts = new[]
{
    new Fact()
    {
        FactId = "1",
        FactName = "Company Music",
        FactDescription = @"Company Music is one of the world's leading record labels. 
                            It has signed famous singers and is very profitable! 
                            The flagship of Contoso Music is a group that performs under the name 'Contoso Only'!",
        FactVector = new float[1536],
    },
    new Fact()
    {
        FactId = "2",
        FactName = "Company Maritim",
        FactDescription = @"Company Heavy Industry Maritime products. 
                        The current bestseller is the container transporter 'Contoso XL Heavy 2000'.",
        FactVector = new float[1536],
    },
    new Fact()
    {
        FactId = "3",
        FactName = "Company Agriculture",
        FactDescription = @"Company Agriculture is a German start-up that focuses on the production of milk and grain. 
                            Since this is a start-up, no further information is available!",
        FactVector = new float[1536], 
    },
};

Console.WriteLine("Facts defined...");

Facts defined...


## Step 4 - Upload facts to Azure Cognitive Search

The following code cell calculates an embedding for each fact description and uploads the facts, including their vectors, to Azure Cognitive Search.

In [5]:
//Create one embedding per fact description and store it in the fact
foreach (Fact fact in facts) {
    EmbeddingsOptions factEmbeddingsOptions = new EmbeddingsOptions(embeddingDeploymentName, new List<string> {fact.FactDescription});
    var factEmbedding = await openAIClient.GetEmbeddingsAsync(factEmbeddingsOptions);
    //One input was sent, so the response contains exactly one embedding
    float[] embeddingData = factEmbedding.Value.Data[0].Embedding.ToArray();
    fact.FactVector = embeddingData;
    Console.WriteLine($"Fact {fact.FactName} embedded...");
}

SearchClient searchClient = searchIndexClient.GetSearchClient(indexName);
await searchClient.IndexDocumentsAsync(IndexDocumentsBatch.Upload(facts));

Console.WriteLine("Facts uploaded...");

Fact Company Music embedded...
Fact Company Maritim embedded...
Fact Company Agriculture embedded...
Facts uploaded...
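The fact descriptions in Step 3 are verbatim strings, so the line breaks and the indentation inside them are part of the text that is sent to the embedding deployment. As a hedged sketch, a small helper could collapse that whitespace before embedding. `NormalizeWhitespace` is a hypothetical name, the sample string is illustrative only, and whether this changes the search result for this small data set was not measured.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (assumption, not part of the notebook):
// collapse every run of whitespace, including line breaks, into one space.
static string NormalizeWhitespace(string text) =>
    Regex.Replace(text, @"\s+", " ").Trim();

// Illustrative verbatim string with a line break and indentation.
string rawText = @"First sentence of a description. 
                            Second sentence on an indented line.";

Console.WriteLine(NormalizeWhitespace(rawText));
```

In the loop above, the helper would be applied to `fact.FactDescription` before it is passed to `EmbeddingsOptions`.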


## Step 5 - Vector search

The code cell uses Azure Cognitive Search to find the fact that is most relevant to a user query.
The query is converted to an embedding and compared with the embeddings of the facts. Because `KNearestNeighborsCount` is set to 1, only the most similar fact is returned.

*Query:* "Who produces Container Ships?"

In [6]:
//Create an embedding for the query
EmbeddingsOptions embeddingsOptions = new EmbeddingsOptions(embeddingDeploymentName, new List<string> {"Who produces Container Ships?"});
var embedding = await openAIClient.GetEmbeddingsAsync(embeddingsOptions);
float[] vectorizedResult = embedding.Value.Data[0].Embedding.ToArray();

//Search vectors: the search text is null, so this is a pure vector query
SearchResults<Fact> response = await searchClient.SearchAsync<Fact>(
    null,
    new SearchOptions {
        Vectors = { new() { Value = vectorizedResult, KNearestNeighborsCount = 1, Fields = { "FactVector" } } },
    }
);

Console.WriteLine("Single Vector Search Results:");
await foreach (SearchResult<Fact> result in response.GetResultsAsync())
{
    Console.WriteLine($"Search result: {result.Document.FactId}: {result.Document.FactName}");
}

Single Vector Search Results:
Search result: 2: Company Maritim
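
The comparison between the query embedding and the fact embeddings in Step 5 can be illustrated locally. The sketch below assumes cosine similarity as the metric and scans all candidates by brute force, whereas the index above uses an HNSW configuration, which performs an approximate nearest-neighbor search inside the service. `CosineSimilarity`, `NearestIndex`, and the three-dimensional toy vectors are hypothetical and only stand in for the 1536-dimensional embeddings.

```csharp
using System;

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns NaN if one of the vectors contains only zeros.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Brute-force search: index of the candidate most similar to the query.
static int NearestIndex(float[] query, float[][] candidates)
{
    int bestIndex = -1;
    double bestScore = double.NegativeInfinity;
    for (int i = 0; i < candidates.Length; i++)
    {
        double score = CosineSimilarity(query, candidates[i]);
        if (score > bestScore)
        {
            bestScore = score;
            bestIndex = i;
        }
    }
    return bestIndex;
}

// Toy vectors (assumption): three "facts" and one "query".
float[][] toyFacts =
{
    new float[] { 1f, 0f, 0f },
    new float[] { 0f, 1f, 0f },
    new float[] { 0f, 0f, 1f },
};
float[] toyQuery = { 0.1f, 0.9f, 0.2f };

Console.WriteLine(NearestIndex(toyQuery, toyFacts)); // prints 1
```

The toy query points mostly along the second axis, so the second vector (index 1) is the nearest neighbor, in the same way that fact 2 is returned for the container ship query above.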
