# Baseline RAG
There are many pre-existing documents, examples, and accelerators demonstrating Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs). This solution is meant to build on these materials, providing implementation specifics for a complete solution.

*For general guidance on RAG*
  - [Retrieval Augmented Generation using Azure Machine Learning prompt flow (preview)](https://learn.microsoft.com/en-us/azure/machine-learning/concept-retrieval-augmented-generation?view=azureml-api-2)
  - [Retrieval Augmented Generation (RAG) in Azure AI Search](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview)
  - [Azure Enterprise RAG Accelerator](https://aka.ms/gpt-rag)

## Key Topics

1. **Semantic Search:** Comparing different approaches to semantic search over structured data, unstructured data, text, and embeddings.

1. **Noise, Data Size, Data Velocity:** Searching over large sets of data can introduce noise into the grounding context. Similarly, how do you handle datasets whose contents change at high velocity?

1. **Data stores:** Which data stores should be used in various scenarios.

1. **Security:** Implementing record-level security around data retrieval in conversational applications.

1. **Performance:** How to evaluate the performance of retrieval techniques, and solutions to improve that performance.

## Considerations

**Alternative solutions**
- Basic semantic search fails when answers require piecing together information from more than a few sources or when the queries are abstract (e.g. "catch me up on the last two weeks of emails").

- Large context windows (e.g. 1 million+ tokens) fail when the dataset is too large to fit, and they are prone to the "lost in the middle" phenomenon (e.g. "What creatures are discussed in this podcast series?").

- See [GraphRAG](../graph_rag/solution.ipynb) for handling large context windows and dataset summarization

# Implementation

## Setup Guide
1. Deploy the required Azure services, either with the provided Bicep script or by bringing your own.

1. Rename the [env.example.json](env.example.json) file to `env.json` and fill in the values.

1. Load the Semantic Kernel

In [7]:
#!import code/Setup.cs 

## Semantic Memory

AI memory structures hold a lot of promise in helping language models orchestrate how they approach problems.

### Vectors and Embeddings 


#### Do you need a vector database?

#### Do you need hybrid search?


In [52]:
graph TD
    A[User Query] -->|Sends Query| B[Azure AI Search]
    B -->|Performs Hybrid Search| C[Search Index]
    C -->|Returns Results| D[Azure AI Search]
    D -->|Performs Semantic Re-ranking| E[Re-ranked Results]
    E -->|Returns to User| F[User]
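The hybrid step in the flow above merges a keyword ranking and a vector ranking into a single list. Azure AI Search does this with Reciprocal Rank Fusion (RRF). The following is a minimal sketch of that idea, assuming two already-ranked lists of document IDs. The class name `RrfSketch`, the method `Fuse`, and the constant `k = 60` are illustrative assumptions, not the service's internal implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RrfSketch
{
    // Fuses several ranked lists of document IDs into one ranking.
    // Each document scores sum(1 / (k + rank)) over the lists it appears in (rank starts at 1).
    // k = 60 is an assumed smoothing constant for this sketch.
    public static List<string> Fuse(IEnumerable<IReadOnlyList<string>> rankings, int k = 60)
    {
        var scores = new Dictionary<string, double>();
        foreach (var ranking in rankings)
        {
            for (int i = 0; i < ranking.Count; i++)
            {
                scores.TryGetValue(ranking[i], out var current);
                scores[ranking[i]] = current + 1.0 / (k + i + 1);
            }
        }

        // Highest fused score first; ties are broken by ID so the output is deterministic.
        return scores
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => p.Key)
            .ToList();
    }
}
```

A document that appears near the top of both lists outranks one that tops only a single list, which is why hybrid search tends to be more robust than either ranking alone.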

#### Which database?

**Azure databases supporting vector fields are preferred when:**
- You have structured or semi-structured operational data (e.g. chat history, user profiles) in the database
- You need a single source of truth and don't want to synchronize separate databases
- You need [OLTP](https://learn.microsoft.com/en-us/azure/architecture/data-guide/relational-data/online-transaction-processing) database characteristics, like atomic transactions and consistency

  *Databases supporting vector fields*
  - Azure Cosmos DB for NoSQL Integrated Vector Database with DiskANN
  - Azure Cosmos DB for MongoDB Integrated Vector Database
  - Azure SQL Database
  - Azure Database for PostgreSQL with the pgvector extension
  - Open-source vector databases

**Azure AI Search is preferred when:**
- You have both structured & unstructured data (e.g. images, PDFs, text) from a variety of data sources
- You require search technology such as semantic re-ranking, multi-language support, hybrid text/vector search, etc.
- The consuming application requires a search-engine-like experience

In [4]:
#pragma warning disable SKEXP0001
// Vector Models
using Microsoft.Azure.Cosmos;
using Microsoft.SemanticKernel.Connectors.AzureCosmosDBNoSQL;
using Microsoft.SemanticKernel.Data;
using IndexKind = Microsoft.Azure.Cosmos.IndexKind;
using System.Reflection;
using System.Text.Json.Serialization;
using Azure.Core.Serialization;
using Container = Microsoft.Azure.Cosmos.Container;

// public record 10KDocument(
//     [property: VectorStoreRecordKey] string HotelId,
//     [property: VectorStoreRecordData] string HotelName,
//     [property: VectorStoreRecordData] string Description,
//     [property: VectorStoreRecordVector(Dimensions: 4, IndexKind: IndexKind.Hash, DistanceFunction: DistanceFunction.CosineSimilarity), JsonPropertyName("description_embeddings")] ReadOnlyMemory<float>? DescriptionEmbeddings);


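// Base record for every item stored in the container: carries the partition key, a type discriminator, and the item id.
public record PartitionedEntity(string PartitionKey, string Type){

    [JsonConstructor]
    public PartitionedEntity(string PartitionKey, string Type, string Id): this(PartitionKey, Type)
    {
        // PartitionKey and Type are already assigned by the primary constructor.
        this.Id = Id;
    }
    // Id stays null when the two-argument constructor is used; set it before writing the item.
    public string Id { get; set; } = null;
    public string SourceUri { get; set; } = string.Empty;
};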

public record CompanyOfficer(
    int CompanyCIK,
    string FirstName,
    string LastName,
    int? Age,
    string Title,
    int? YearBorn,
    long TotalPay
): PartitionedEntity(CompanyCIK.ToString(), "CompanyOfficer", $"{CompanyCIK}_{FirstName + LastName}");

public record BasicCompanyInfo (
    string Address1,
    string City,
    string State,
    string Zip,
    string Country,
    string Phone,
    string Website,
    string Industry,
    string Sector,
    string LongBusinessSummary,
    ICollection<CompanyOfficer> CompanyOfficers,
    string IrWebsite,
    string Exchange,
    string QuoteType,
    string TickerSymbol,
    string UnderlyingSymbol,
    string ShortName,
    string SecName,
    int CIK,
    string PrimaryExchange,
    ICollection<string> AssociatedCusips
): PartitionedEntity(PartitionKey: CIK.ToString(), Type: "CompanyInfo", Id: CIK.ToString());

var Form10KSections = new Dictionary<string, string>
{
    { "item1", "Business: requires a description of the company’s business, including its main products and services, what subsidiaries it owns, and what markets it operates in" },
    { "item1a", "Risk Factors: includes information about the most significant risks that apply to the company or to its securities" },
    { "item1b", "Unresolved Staff Comments: requires the company to explain certain comments it has received from the SEC staff on previously filed reports that have not been resolved after an extended period of time" },
    { "item2", "Properties: includes information about the company’s significant properties, such as principal plants, mines and other materially important physical properties" },
    { "item3", "Legal Proceedings: requires the company to include information about significant pending lawsuits or other legal proceedings, other than ordinary litigation" },
    { "item7", "Management’s Discussion and Analysis of Financial Condition and Results of Operations (MD&A): gives the company’s perspective on the business results of the past financial year. This section, known as the MD&A for short, allows company management to tell its story in its own words" },
    { "item7a", "Quantitative and Qualitative Disclosures About Market Risk: requires information about the company’s exposure to market risk, such as interest rate risk, foreign currency exchange risk, commodity price risk or equity price risk" },
    { "item8", "Financial Statements and Supplementary Data: requires the company’s audited financial statements" },
    { "item10", "Directors, Executive Officers and Corporate Governance: requires information about the background and experience of the company’s directors and executive officers, the company’s code of ethics, and certain qualifications for directors and committees of the board of directors" },
    { "item11", "Executive Compensation: includes detailed disclosure about the company’s compensation policies and programs and how much compensation was paid to the top executive officers of the company in the past year" },
    { "item15", "Exhibits, Financial Statement Schedules: Many exhibits are required, including documents such as the company’s bylaws, copies of its material contracts, and a list of the company’s subsidiaries" }
};
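// The filing date is formatted explicitly because '/' is not allowed in a Cosmos DB item id and the default DateTime format can contain it.
public record SecForm10KSection(int CIK, DateTime FilingDate, string SectionName, string SectionShortName, string SectionText, ReadOnlyMemory<float> ContentEmbedding): PartitionedEntity(CIK.ToString(), "10-K", $"{CIK}_{FilingDate:yyyyMMdd}_{SectionName}");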
public record SecForm13FHolding(int CIK, string ManagerName, string SecurityName, int Shares, int Value, string SecurityType, string Cusip, DateTime ReportedDate): PartitionedEntity(Cusip, "13F-HR", $"{Cusip}_{ManagerName}");
public record SecForm13D(int CIK, string ReportingPerson, DateTime FilingDate, string Description): PartitionedEntity(CIK.ToString(), "13D");

public record DailyMarketData(string Symbol, DateTime Date, float Open, float High, float Low, float Close, long Volume): PartitionedEntity(Symbol, "DailyMarketData");
public record NewsArticle(string Headline, string ArticleText, string SourceName, string Uri, DateTime PublishDate): PartitionedEntity(SourceName, "NewsArticle");


// cosmosNoSqlService.BulkUpload()
// public class ContextBuilder
// {
//     readonly int _maxTokens;
//     readonly int _maxPromptTokens;
//     readonly Dictionary<string, Type> _memoryTypes;
//     readonly PromptOptimizationSettings? _promptOptimizationSettings;

//     const int BufferTokens = 50;

//     string _systemPrompt = string.Empty;
//     List<object> _memories = new List<object>();
//     List<(AuthorRole AuthorRole, string Content)> _messages = new List<(AuthorRole AuthorRole, string Content)>();

//     public ContextBuilder(
//         int maxTokens,
//         Dictionary<string, Type> memoryTypes,
//         ITokenizer? tokenizer = null,
//         PromptOptimizationSettings? promptOptimizationSettings = null) 
//     {
//         _maxTokens = maxTokens;
//         _memoryTypes = memoryTypes;

//         // If no external tokenizer has been provided, use our own
//         _tokenizer = tokenizer ?? new MicrosoftMLTokenizer();
        
//         _promptOptimizationSettings = promptOptimizationSettings != null
//             ? promptOptimizationSettings
//             : new PromptOptimizationSettings 
//             {
//                 CompletionsMinTokens = 50,
//                 CompletionsMaxTokens = 300,
//                 SystemMaxTokens = 1500,
//                 MemoryMinTokens = 500,
//                 MemoryMaxTokens = 2500,
//                 MessagesMinTokens = 1000,
//                 MessagesMaxTokens = 3000
//             };

//         // Use BufferTokens (default 50) tokens as a buffer for extra needs resulting from concatenation, new lines, etc.
//         _maxPromptTokens = _maxTokens - _promptOptimizationSettings.CompletionsMaxTokens - BufferTokens;
//     }

//     public ContextBuilder WithSystemPrompt(string prompt)
//     {
//         ArgumentNullException.ThrowIfNullOrEmpty(prompt, nameof(prompt));
//         _systemPrompt = prompt;
//         return this;
//     }

//     public ContextBuilder WithMemories(List<string> memories)
//     {
//         ArgumentNullException.ThrowIfNull(memories, nameof(memories));

//         // This function transforms the JSON into a more streamlined string of text, more suitable for generating responses
//         // Use by default the JSON text representation based on EmbeddingFieldAttribute
//         // TODO: Test also using the more elaborate text representation - itemToEmbed.TextToEmbed
//         _memories = memories.Select(m => (object) EmbeddingUtility.Transform(m, _memoryTypes).TextToEmbed).ToList();
//         return this;
//     }

//     public ContextBuilder WithMessageHistory(List<(AuthorRole AuthorRole, string Content)> messages) 
//     {
//         ArgumentNullException.ThrowIfNull(messages, nameof(messages));
//         _messages = messages;
//         return this;
//     }

//     public string Build()
//     {
//         OptimizePromptSize();

//         var result = new StringBuilder();

//         if (_memories.Count > 0)
//         {
//             var memoriesPrompt = string.Join(Environment.NewLine, _memories.Select(
//                 m => $"{JsonConvert.SerializeObject(m)}{Environment.NewLine}---------------------------{Environment.NewLine}").ToArray());
//             result.Append($"Context:{Environment.NewLine}{Environment.NewLine}{memoriesPrompt}{Environment.NewLine}{Environment.NewLine}".NormalizeLineEndings());
//         }

//         if (_messages.Count > 0)
//         {
//             result.Append($"The history of the current conversation is:{Environment.NewLine}{Environment.NewLine}".NormalizeLineEndings());
//             foreach (var message in _messages)
//                 result.Append($"{message.AuthorRole}: {message.Content}{Environment.NewLine}".NormalizeLineEndings());
//         }

//         return result.ToString();
//     }

//     private void OptimizePromptSize()
//     {
//         var systemPromptTokens = _tokenizer!.GetTokensCount(_systemPrompt);

//         var memories = _memories.Select(m => new
//         {
//             Memory = m,
//             Tokens = _tokenizer.GetTokensCount(JsonConvert.SerializeObject(m).NormalizeLineEndings())
//         }).ToList();

//         // Keep in reverse order because we need to keep the most recent messages
//         var messages = _messages.Select(m => new
//         {
//             Message = m,
//             Tokens = _tokenizer.GetTokensCount(m.Content)
//         }).Reverse().ToList();

//         // All systems green?
//         var totalTokens = systemPromptTokens + memories.Sum(mt => mt.Tokens) + messages.Sum(mt => mt.Tokens) + BufferTokens;
//         if (totalTokens <= _maxPromptTokens)
//             // We're good, not reaching the limit
//             return;

//         // Start trimming down things to fit within the defined constraints

//         if (systemPromptTokens > _promptOptimizationSettings!.SystemMaxTokens)
//             throw new Exception($"The estimated size of the core system prompt ({systemPromptTokens} tokens) exceeds the configured maximum of {_promptOptimizationSettings.SystemMaxTokens}.");

//         // Limit memories

//         var tmpMemoryTokens = 0;
//         var validMemoriesCount = 0;

//         foreach (var m in memories)
//         {
//             tmpMemoryTokens += m.Tokens;
//             if (tmpMemoryTokens <= _promptOptimizationSettings.MemoryMaxTokens)
//                 validMemoriesCount++;
//             else
//                 break;
//         }

//         // Keep the memories that allow us to obey the limit rule (still in reverse order as we might need to further limit)
//         memories = memories.Take(validMemoriesCount).ToList();
//         _memories = memories.Select(m => m.Memory).ToList();

//         var tmpMessagesTokens = 0;
//         var validMessagesCount = 0;

//         foreach(var m in messages)
//         {
//             tmpMessagesTokens += m.Tokens;
//             if (tmpMessagesTokens <= _promptOptimizationSettings.MessagesMaxTokens)
//                 validMessagesCount++;
//             else
//                 break;
//         }

//         // Keep the messages that allow us to obey the limit rule (still in reverse order as we might need to further limit)
//         messages = messages.Take(validMessagesCount).ToList();
//         _messages = messages.Select(m => m.Message).Reverse().ToList();

//         // All systems green?
//         var memoryTokens = memories.Sum(mt => mt.Tokens);
//         var messagesTokens = messages.Sum(mt => mt.Tokens);
//         totalTokens = systemPromptTokens + memoryTokens + messagesTokens + BufferTokens;
//         if (totalTokens <= _maxPromptTokens)
//             // We're good, just got below the overall limit using the configured max limits for memories and messages
//             return;

//         // Still not good, so continue trimming down things

//         // Eliminate one memory at a time in reverse order until we either reach the token goal or we fall below the minimum memory token count
//         for (int i = memories.Count - 1; i >= 0; i--)
//         {
//             if (memoryTokens - memories[i].Tokens < _promptOptimizationSettings.MemoryMinTokens
//                 || totalTokens <= _maxPromptTokens)
//             // This memory will not be eliminated because we've either got below the overall limit or its elimination will get us below the minimum memory token count
//             {
//                 memories = memories.Take(i + 1).ToList();
//                 _memories = memories.Select(m => m.Memory).ToList();
//                 memoryTokens = memories.Sum(mt => mt.Tokens);
//                 break;
//             }

//             memoryTokens -= memories[i].Tokens;
//             totalTokens -= memories[i].Tokens;
//         }

//         // All systems green?
//         totalTokens = systemPromptTokens + memoryTokens + messagesTokens + BufferTokens;
//         if (totalTokens <= _maxPromptTokens)
//             // We're good, just got below the overall limit without reaching the lower limit for memory tokens
//             return;

//         // Still not good, so continue trimming down things

//         // Eliminate one message at a time in reverse order until we either reach the token goal or we fall below the minimum messages token count
//         for (int i = messages.Count - 1; i > 0; i--)
//         {
//             if (messagesTokens - messages[i].Tokens < _promptOptimizationSettings.MessagesMinTokens
//                 || totalTokens <= _maxPromptTokens)
//             // This message will not be eliminated because we've either got below the overall limit or its elimination will get us below the minimum messages token count
//             {
//                 messages = messages.Take(i + 1).ToList();
//                 _messages = messages.Select(m => m.Message).Reverse().ToList();
//                 messagesTokens = messages.Sum(mt => mt.Tokens);
//                 break;
//             }

//             messagesTokens -= messages[i].Tokens;
//             totalTokens -= messages[i].Tokens;
//         }

//         // All systems green?
//         totalTokens = systemPromptTokens + memoryTokens + messagesTokens + BufferTokens;
//         if (totalTokens <= _maxPromptTokens)
//             // We're good, just got below the overall limit without reaching the lower limit for messages tokens
//             return;

//         // Oops! The least significant memory and the least significant message are preventing us from getting below the overall limit

//         // Remove the least significant memory
//         totalTokens -= memories.Last().Tokens;
//         memories.RemoveAt(memories.Count - 1);
//         _memories = memories.Select(m => m.Memory).ToList();

//         // All systems green?
//         if (totalTokens <= _maxPromptTokens)
//             // We're good, just got below the overall limit by removing the least significant memory
//             return;

//         // Remove the least significant message
//         totalTokens -= messages.Last().Tokens;
//         messages.RemoveAt(messages.Count - 1);
//         _messages = messages.Select(m => m.Message).Reverse().ToList();

//         // All systems green?
//         if (totalTokens <= _maxPromptTokens)
//             // We're good, just got below the overall limit by removing the least significant message
//             return;

//         // Error! Most likely, the prompt optimization settings are inconsistent
//         throw new Exception("Cannot produce a prompt using the current prompt optimization settings.");
//     }


// }

### **CosmosDB: Understand Index Type and Distance Functions**
---
#### **1. `Vector Index Type`**

This option determines how vectors are indexed within Cosmos DB to optimize search performance.
<details>
<summary>
Options
</summary>

- **`flat`**: Stores vectors alongside other indexed properties without additional indexing structures. Supports up to **505 dimensions**.

  **When to Use:**

  - **Low-dimensional data**: Ideal for applications with vectors up to 505 dimensions.
  - **Exact search requirements**: When you need precise search results.
  - **Small to medium datasets**: Efficient for datasets where the index size won't become a bottleneck.

  **Real-World Scenario:**

  - **Customer Segmentation**: A retail company uses customer feature vectors (age, income, purchase history) with dimensions well below 505 to segment customers. Exact matches are important for targeted marketing campaigns.

- **`quantizedFlat`**: Compresses (quantizes) vectors before indexing, improving performance at the cost of some accuracy. Supports up to **4096 dimensions**.

  **When to Use:**

  - **High-dimensional data with storage constraints**: Suitable for vectors up to 4096 dimensions where storage efficiency is important.
  - **Performance-critical applications**: When reduced latency and higher throughput are needed.
  - **Acceptable accuracy trade-off**: Minor losses in accuracy are acceptable for performance gains.

  **Real-World Scenario:**

  - **Mobile Image Recognition**: An app recognizes objects using high-dimensional image embeddings. Quantization reduces the storage footprint and improves search speed, crucial for mobile devices with limited resources.

- **`diskANN`**: Utilizes the DiskANN algorithm for approximate nearest neighbor searches, optimized for speed and efficiency. Supports up to **4096 dimensions**.

  **When to Use:**

  - **Large-scale, high-dimensional data**: Best for big datasets where quick approximate searches are acceptable.
  - **Real-time applications**: When fast response times are critical.
  - **Scalability needs**: Suitable for applications expected to grow significantly.

  **Real-World Scenario:**

  - **Semantic Search Engines**: A search engine indexes millions of documents using embeddings from language models like BERT (768 dimensions). DiskANN allows users to get fast search results by efficiently handling high-dimensional data.
</details>
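
The trade-offs above can be captured in a small guard that runs before a container is created. This is a minimal sketch under stated assumptions: `VectorIndexAdvisor`, `Suggest`, and the `exactSearchRequired` and `largeDataset` inputs are hypothetical names, and the only hard rules encoded are the dimension limits listed in this section (505 for `flat`, 4096 for the others).

```csharp
using System;

public static class VectorIndexAdvisor
{
    public const int FlatMaxDimensions = 505;   // limit stated above for `flat`
    public const int MaxDimensions = 4096;      // limit stated above for `quantizedFlat` and `diskANN`

    // Returns the index type name that matches the guidance in this section.
    public static string Suggest(int dimensions, bool exactSearchRequired, bool largeDataset)
    {
        if (dimensions < 1 || dimensions > MaxDimensions)
            throw new ArgumentOutOfRangeException(nameof(dimensions), $"Expected 1 to {MaxDimensions} dimensions.");

        if (exactSearchRequired)
        {
            // Only `flat` gives exact results, and it is capped at 505 dimensions.
            if (dimensions > FlatMaxDimensions)
                throw new ArgumentException($"Exact search with `flat` supports at most {FlatMaxDimensions} dimensions.");
            return "flat";
        }

        // Approximate search: DiskANN for large collections, quantized brute force otherwise.
        return largeDataset ? "diskANN" : "quantizedFlat";
    }
}
```

For example, `VectorIndexAdvisor.Suggest(1536, exactSearchRequired: false, largeDataset: true)` returns `"diskANN"`, while asking for exact search at 1536 dimensions throws, because no index type in the list satisfies both constraints.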

---

#### **2. `Vector Data Type`**

Specifies the data type of the vector components.
<details>
<summary>Options</summary>

- **`float32`** (default): 32-bit floating-point numbers.

  **When to Use:**

  - **High precision requirements**: Necessary when the application demands precise calculations.
  - **Standard ML embeddings**: Most machine learning models output float32 vectors.

  **Real-World Scenario:**

  - **Scientific Simulations**: In climate modeling, vectors represent complex data where precision is vital for accurate simulations and predictions.

- **`uint8`**: 8-bit unsigned integers.

  **When to Use:**

  - **Memory optimization**: Reduces storage needs when precision can be sacrificed.
  - **Quantized models**: When vectors are output from models that already quantize data.

  **Real-World Scenario:**

  - **Basic Image Features**: Storing color histograms for image retrieval systems, where each bin can be represented with an 8-bit integer.

- **`int8`**: 8-bit signed integers.

  **When to Use:**

  - **Custom quantization schemes**: When using specialized compression techniques that map floating-point values to an 8-bit integer scale.
  - **Edge devices**: Ideal for applications on devices with extreme memory limitations.

  **Real-World Scenario:**

  - **Audio Fingerprinting**: Compressing audio feature vectors for song recognition apps where storage and quick retrieval are essential.
</details>

---
#### **3. `Dimension Size`**

The length of the vectors being indexed. Up to 4096 dimensions are supported (505 for the `flat` index type); the default is **1536**.
<details>
<summary>Options</summary>


**When to Consider Lower Dimensions (≤ 505):**

- **Simpler models**: Applications using basic embeddings or feature vectors.
- **Flat index type**: Required when using the `flat` index type due to its dimension limit.

*Real-World Scenario:*

- **Keyword Matching**: Using low-dimensional TF-IDF vectors for document similarity in a content management system.

**When to Consider Higher Dimensions (506 - 4096):**

- **Complex models**: Deep learning applications with high-dimensional embeddings.
- **Advanced search features**: When richer representations of data are necessary for accuracy.

*Real-World Scenario:*

- **Face Recognition**: Using high-dimensional embeddings (e.g., 2048 dimensions) to represent facial features for security systems.
</details>

---

#### **4. `Distance Function`**

Determines how similarity between vectors is calculated.
<details>
<summary>Options</summary>

- **`cosine`**: Measures the cosine of the angle between vectors.

  **When to Use:**

  - **Orientation-focused similarity**: When the magnitude is less important than the direction.
  - **Normalized data**: Ideal when vectors are normalized to unit length.

  **Real-World Scenario:**

  - **Document Similarity**: In text analytics, comparing documents based on topic similarity where word counts are normalized.

- **`dotproduct`**: Computes the scalar (dot) product of two vectors.

  **When to Use:**

  - **Magnitude matters**: When both direction and magnitude are significant.
  - **Machine learning models**: Often used in recommendation systems where strength of preferences is important.

  **Real-World Scenario:**

  - **Personalized Recommendations**: Matching users to products by calculating the dot product of user and item embeddings in a collaborative filtering system.

- **`euclidean`**: Calculates the straight-line distance between vectors.

  **When to Use:**

  - **Spatial distance relevance**: When physical distance correlates with similarity.
  - **High-dimensional data**: Suitable for embeddings where both magnitude and direction impact similarity.

  **Real-World Scenario:**

  - **Anomaly Detection**: Identifying outliers in network traffic patterns by measuring Euclidean distances in feature space.
</details>

---
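
To make the difference between the three distance functions concrete, the sketch below computes each one for plain `float[]` vectors. It is an illustration of the math only, not how Cosmos DB evaluates distances internally; `VectorDistance` and its method names are hypothetical.

```csharp
using System;

public static class VectorDistance
{
    // Sum of element-wise products; sensitive to both direction and magnitude.
    public static double DotProduct(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Vectors must have the same length.");
        double sum = 0;
        for (int i = 0; i < a.Length; i++) sum += (double)a[i] * b[i];
        return sum;
    }

    // Cosine of the angle between the vectors; ignores magnitude.
    // Undefined (NaN) when either vector has zero length.
    public static double Cosine(float[] a, float[] b) =>
        DotProduct(a, b) / (Math.Sqrt(DotProduct(a, a)) * Math.Sqrt(DotProduct(b, b)));

    // Straight-line distance; smaller means more similar.
    public static double Euclidean(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Vectors must have the same length.");
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = (double)a[i] - b[i];
            sum += d * d;
        }
        return Math.Sqrt(sum);
    }
}
```

Two vectors that point the same way but differ in length, such as `{1, 0}` and `{2, 0}`, have a cosine similarity of 1 even though their dot product is 2 and their Euclidean distance is 1. When embeddings are normalized to unit length, cosine and dot product produce the same ordering.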



### **CosmosDB: Option Combinations and Their Preferred Use-Cases**

---

#### **Combination 1: Low-Dimensional, Exact Searches**

- **`vectorIndexType`**: `flat`
- **`datatype`**: `float32`
- **`dimensions`**: ≤ 505
- **`distanceFunction`**: `cosine`

**Real-World Scenario:**

- **Small-Scale Text Classification**: A startup builds a news categorization tool using word embeddings (300 dimensions). Exact cosine similarity searches ensure accurate article tagging without the overhead of approximate methods.

---

#### **Combination 2: High-Dimensional, Performance-Critical Applications**

- **`vectorIndexType`**: `diskANN`
- **`datatype`**: `float32`
- **`dimensions`**: 768 - 1536
- **`distanceFunction`**: `cosine` or `dotproduct`

**Real-World Scenario:**

- **Real-Time Recommendations**: A streaming service uses user and content embeddings (1024 dimensions) to provide instantaneous movie recommendations. DiskANN accelerates search times, offering a smooth user experience despite the large dataset.

---

#### **Combination 3: Storage-Efficient High-Dimensional Data**

- **`vectorIndexType`**: `quantizedFlat`
- **`datatype`**: `uint8` or `int8`
- **`dimensions`**: 2048
- **`distanceFunction`**: `cosine`

**Real-World Scenario:**

- **Mobile Visual Search**: An app allows users to search for products by uploading photos. High-dimensional image embeddings are quantized to fit the storage constraints of mobile devices, and approximate searches provide quick results.

---

#### **Combination 4: Precision-Critical Scientific Computing**

- **`vectorIndexType`**: `flat`
- **`datatype`**: `float32`
- **`dimensions`**: ≤ 505 (the limit for the `flat` index type)
- **`distanceFunction`**: `euclidean`

**Real-World Scenario:**

- **Genomic Data Analysis**: Researchers analyze genetic sequences represented as vectors that have been reduced to fit the 505-dimension limit. Exact Euclidean distance calculations are essential for identifying genetic similarities and mutations.

---

#### **Combination 5: Medium-Dimensional Data with Storage Constraints**

- **`vectorIndexType`**: `quantizedFlat`
- **`datatype`**: `uint8`
- **`dimensions`**: 500
- **`distanceFunction`**: `dotproduct`

**Real-World Scenario:**

- **IoT Sensor Data**: A network of sensors generates medium-dimensional vectors representing environmental data. Quantization reduces storage and transmission costs, and dot product calculations help in identifying patterns and anomalies efficiently.

## Structured Database Copilot
NL2SQL - Database query generation

### Considerations
- Language models are usually good at building most of the database query, but the `WHERE` clause often needs prompt tuning or native functions to get right.

**Example User Stories**
- I have application monitoring or metric data that I want to derive insights from.
- I want to chat over the entire corpus of ServiceNow or other ICM support ticket information.
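
One way to read "native functions to improve the where clause" is to let the model extract filter values and have ordinary code assemble the clause. The sketch below takes that approach under stated assumptions: `WhereClauseBuilder`, the allow-list of fields, and the equality-only filters are hypothetical simplifications, and the container alias `c` follows the usual Cosmos DB query convention.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereClauseBuilder
{
    // Builds a parameterized query from model-extracted filters.
    // Field names are checked against an allow-list; values are passed as parameters, never inlined.
    public static (string Sql, Dictionary<string, object> Parameters) Build(
        IReadOnlyDictionary<string, object> filters, ISet<string> allowedFields)
    {
        var clauses = new List<string>();
        var parameters = new Dictionary<string, object>();

        // Sort by field name so the generated SQL is stable across runs.
        foreach (var filter in filters.OrderBy(f => f.Key, StringComparer.Ordinal))
        {
            if (!allowedFields.Contains(filter.Key))
                throw new ArgumentException($"Field '{filter.Key}' is not queryable.");

            var name = $"@p{parameters.Count}";
            clauses.Add($"c.{filter.Key} = {name}");
            parameters[name] = filter.Value;
        }

        var sql = "SELECT * FROM c" + (clauses.Count > 0 ? " WHERE " + string.Join(" AND ", clauses) : "");
        return (sql, parameters);
    }
}
```

The returned text and parameters can be attached to a `QueryDefinition` with `WithParameter`, which keeps user-supplied values out of the query string.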

In [None]:


using Microsoft.Azure.Cosmos;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Container = Microsoft.Azure.Cosmos.Container;

// Thin wrapper around a single Cosmos DB container for running queries and point operations.
public class CosmosCopilot {
    private readonly CosmosClient _client;
    private readonly Container _container;
    private readonly ILogger _logger;

    // The logger is optional; a no-op logger is used when none is supplied.
    public CosmosCopilot(string connectionString, string databaseName, string containerName, ILogger logger = null) {
        _client = new CosmosClient(connectionString);
        _container = _client.GetContainer(databaseName, containerName);
        _logger = logger ?? NullLogger.Instance;
    }

    // Runs a raw query string. Prefer the QueryDefinition overload with parameters when values come from user input.
    public Task<IEnumerable<T>> QueryAsync<T>(string query) => QueryAsync<T>(new QueryDefinition(query));

    // Runs a (possibly parameterized) query and drains every page of results.
    public async Task<IEnumerable<T>> QueryAsync<T>(QueryDefinition query) {
        using var iterator = _container.GetItemQueryIterator<T>(query);
        var results = new List<T>();
        while (iterator.HasMoreResults) {
            var response = await iterator.ReadNextAsync();
            results.AddRange(response);
        }
        return results;
    }

    // Point read by id and partition key; logs and returns default(T) on a Cosmos DB error.
    public async Task<T> GetItemAsync<T>(string partitionKey, string id) {
        try {
            var response = await _container.ReadItemAsync<T>(id, new PartitionKey(partitionKey));
            return response.Resource;
        } catch (CosmosException e) {
            _logger.LogError(e, "Error reading item from Cosmos DB");
            return default;
        }
    }

    // Inserts or replaces the item; logs and returns default(T) on a Cosmos DB error.
    public async Task<T> UpsertItemAsync<T>(T item) {
        try {
            var response = await _container.UpsertItemAsync(item);
            return response.Resource;
        } catch (CosmosException e) {
            _logger.LogError(e, "Error upserting item to Cosmos DB");
            return default;
        }
    }
}

## Semantic Search Context

Security trimming: see [Security trimming in Azure AI Search](https://learn.microsoft.com/en-us/azure/search/search-security-trimming-for-azure-search).
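
Security trimming in Azure AI Search is typically expressed as an OData filter over a collection field that lists the principals allowed to see each document. The sketch below builds such a filter from a caller's group IDs. The field name `group_ids` and the helper `SecurityFilter.ForGroups` are assumptions for illustration; the index must define a filterable string collection field with whatever name you choose.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SecurityFilter
{
    // Builds a filter that matches documents whose group collection contains any of the caller's groups.
    public static string ForGroups(IEnumerable<string> groupIds, string fieldName = "group_ids")
    {
        var ids = groupIds.Where(id => !string.IsNullOrWhiteSpace(id)).Distinct().ToList();

        // Refuse to build an empty filter, which would not restrict the results at all.
        if (ids.Count == 0)
            throw new ArgumentException("At least one group ID is required.");

        // search.in splits its list on commas and spaces by default, so IDs must not contain either.
        if (ids.Any(id => id.Contains(',') || id.Contains(' ')))
            throw new ArgumentException("Group IDs must not contain commas or spaces.");

        // Single quotes are escaped by doubling them, as OData string literals require.
        var joined = string.Join(",", ids).Replace("'", "''");
        return $"{fieldName}/any(g:search.in(g, '{joined}'))";
    }
}
```

The resulting string can be assigned to the `Filter` property of `SearchOptions` so that every query issued on behalf of a user is trimmed to the documents that user's groups may see.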

## Multiple Indexes

# Best Practices