# Improving Baseline RAG

## Key Topics
1. **Observability**
    - Observability should be a priority.
1. **Chunk Strategy**
    - Chunk size and overlap
    - Semantic chunking: divide text based on semantic coherence rather than fixed sizes
1. **Query Transformations**
    - Reformulate queries to improve retrieval
    - Break down complex queries into sub-queries
1. **Adaptive Techniques**
    - Perform multiple rounds of retrieval to refine and enhance results
    - Use user-provided feedback on AI-generated results
1. **Similarity Algorithms**
    - Understand the various algorithms for finding similar content.
1. **Claim Provenance**
    - Methods for linking AI-generated results to their source material
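
The chunk size and overlap item above can be made concrete with a small sketch. It is only an illustration under stated assumptions: whitespace-separated words stand in for tokens, and the `ChunkWithOverlap` helper and its parameter values are hypothetical, not part of this notebook or of Semantic Kernel.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: splits text into chunks of at most `chunkSize` words,
// where each chunk repeats the last `overlap` words of the previous chunk.
// Words are a stand-in for tokens; a real pipeline would count tokens with a tokenizer.
List<string> ChunkWithOverlap(string text, int chunkSize, int overlap)
{
    if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
    if (overlap < 0 || overlap >= chunkSize) throw new ArgumentOutOfRangeException(nameof(overlap));

    var words = text.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    var pieces = new List<string>();
    int step = chunkSize - overlap; // how far the window advances each time

    for (int start = 0; start < words.Length; start += step)
    {
        int length = Math.Min(chunkSize, words.Length - start);
        pieces.Add(string.Join(" ", words.Skip(start).Take(length)));
        if (start + length >= words.Length) break; // this window reached the end of the text
    }
    return pieces;
}

var chunks = ChunkWithOverlap("a b c d e f g h", chunkSize: 4, overlap: 2);
foreach (var chunk in chunks) Console.WriteLine(chunk);
// a b c d
// c d e f
// e f g h
```

A larger overlap keeps more shared context between neighbouring chunks, at the cost of storing and embedding more text.
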
## Considerations


# Implementation

## Prereqs
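Deploy the following services (the ones the setup cell below connects to):

- Azure OpenAI, with a chat model deployment and an embedding model deployment
- Azure AI Search
- Azure Cosmos DB for NoSQL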

Rename the [env.example.json](env.example.json) file to `env.json` and fill in the values.

### Load the Semantic Kernel

In [4]:
#!import ../../config/csharp/SemanticKernelSettings.cs 
#!import ../../config/csharp/AzureAISearchSettings.cs 
#!import ../../config/csharp/CosmosSqlService.cs 

#r "nuget: Azure.AI.OpenAI, 2.0.0-beta.3"
#r "nuget: Azure.Search.Documents, 11.6.0"
#r "nuget: Azure.Identity, 1.12.0"
#r "nuget: Microsoft.Azure.Cosmos, 3.42.0"
#r "nuget: Microsoft.SemanticKernel, 1.18.1-rc"
#r "nuget: Microsoft.SemanticKernel.Connectors.OpenAI, 1.18.1-rc"
#r "nuget: Microsoft.SemanticKernel.Plugins.Memory, 1.18.1-alpha"
#r "nuget: Microsoft.SemanticKernel.Planners.OpenAI, 1.18.1-preview"
#r "nuget: Microsoft.ML.Tokenizers, 0.22.0-preview.24378.1"
#r "nuget: Microsoft.Data.Analysis, 0.21.0"
#r "nuget: System.Linq.Async, 6.0.1"
#r "nuget: CsvHelper, 33.0.1"

using System.Globalization;
using System.ComponentModel;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.DependencyInjection;
using System.Text.Json;
using System.Text.Json.Serialization;

using Microsoft.Data.Analysis;
using CsvHelper;
using CsvHelper.Configuration;
using Azure;
using Azure.Identity;


using Azure.Search.Documents;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Models;
using Azure.Search.Documents.Indexes.Models;


using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Connectors.AzureOpenAI;
using Microsoft.SemanticKernel.Planning;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Text;
using Microsoft.SemanticKernel.Memory;

using Microsoft.ML.Tokenizers;
using Kernel = Microsoft.SemanticKernel.Kernel;

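// Load model deployment names, endpoints, and keys from env.json.
var (textModel, embeddingModel, openAIEndpoint, openAIKey) = SemanticKernelSettings.LoadFromFile("env.json");
var (searchEndpoint, searchKey, searchIndex) = AzureAISearchSettings.LoadFromFile("env.json");

// Let the model automatically invoke registered kernel functions (tool calling).
var promptExecutionSettings = new OpenAIPromptExecutionSettings { ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions };

// Returns a fresh kernel builder preconfigured with the Azure OpenAI chat deployment.
IKernelBuilder getDefaultKernelBuilder() => Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        endpoint: openAIEndpoint,
        apiKey: openAIKey,
        deploymentName: textModel);

// Despite its name, this is a SearchClient bound to a single index (document queries),
// not a SearchIndexClient (index management).
var searchIndexClient = new SearchClient(new Uri(searchEndpoint), searchIndex, new AzureKeyCredential(searchKey));

// Tokenizer for counting tokens; CreateForModel expects a model name it recognizes.
Tokenizer s_tokenizer = TiktokenTokenizer.CreateForModel(textModel);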


## Semantic Memory

A **vector store** maps to an instance of a database.

A **collection** is a set of records, including any index required to query or filter those records.

A **record** is an individual data entry in the database.
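
The relationship between these three terms can be sketched with a toy in-memory model. This is an assumption-laden illustration, not the Semantic Kernel API: the `SketchRecord`, `SketchCollection`, and `SketchVectorStore` types are hypothetical names invented here, and the "index" is only a dictionary keyed by record key.

```csharp
using System;
using System.Collections.Generic;

// Usage: one store (database instance), one named collection, one record.
var store = new SketchVectorStore();
var notes = store.GetCollection("notes");
notes.Upsert(new SketchRecord("doc-1", "RAG combines retrieval with generation.", new float[] { 0.1f, 0.9f }));
Console.WriteLine(notes.Get("doc-1")?.Text);

// Record: an individual data entry (key, text payload, embedding vector).
public record SketchRecord(string Key, string Text, float[] Embedding);

// Collection: a named set of records plus whatever index is needed to query them.
// Here the index is a plain dictionary lookup by key.
public class SketchCollection
{
    private readonly Dictionary<string, SketchRecord> _byKey = new();
    public string Name { get; }
    public int Count => _byKey.Count;

    public SketchCollection(string name) => Name = name;

    // Inserts the record, or replaces an existing record with the same key.
    public void Upsert(SketchRecord record) => _byKey[record.Key] = record;

    // Returns the record with the given key, or null when it does not exist.
    public SketchRecord Get(string key) => _byKey.TryGetValue(key, out var found) ? found : null;
}

// Vector store: stands in for one database instance that owns many collections.
public class SketchVectorStore
{
    private readonly Dictionary<string, SketchCollection> _collections = new();

    // Returns the named collection, creating it on first use.
    public SketchCollection GetCollection(string name)
    {
        if (!_collections.TryGetValue(name, out var collection))
        {
            collection = new SketchCollection(name);
            _collections[name] = collection;
        }
        return collection;
    }
}
```

A real vector store would add a similarity index over the `Embedding` field; the layering of store, collection, and record stays the same.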




### Basic
Most frameworks that build chat interfaces around LLMs integrate the ability to provide a chat history as part of the context. This allows the AI application to build on earlier turns of the conversation.

In [None]:
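// Disable the experimental-API warning for the embedding connector.
#pragma warning disable SKEXP0010

// Register the embedding deployment alongside the chat deployment, then build the kernel.
var builder = getDefaultKernelBuilder();
builder.AddAzureOpenAITextEmbeddingGeneration(
    deploymentName: embeddingModel,
    endpoint: openAIEndpoint,
    apiKey: openAIKey);

var kernel = builder.Build();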


### Vectors and Embeddings 

**Azure databases supporting vector fields are preferred when:**
- You have structured or semi-structured operational data (e.g. chat history, user profiles) in the database
- You need a single source of truth and don't want to synchronize separate databases
- You need [OLTP](https://learn.microsoft.com/en-us/azure/architecture/data-guide/relational-data/online-transaction-processing) database characteristics, like atomic transactions and consistency

  *Databases supporting vector fields*
  - Azure Cosmos DB for NoSQL integrated vector database with DiskANN
  - Azure Cosmos DB for MongoDB integrated vector database
  - Azure SQL Database
  - Azure Database for PostgreSQL with the pgvector extension
  - Open-source vector databases

**Azure AI Search is preferred when:**
- You have both structured and unstructured data (e.g. images, PDFs, text) from a variety of data sources
- You require search technology such as semantic re-ranking, multi-language support, or hybrid text/vector search
- The consuming application requires a search-engine-like experience

In [1]:
// TODO: Example with Azure AI search and Cosmos DB with Vectors
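
Whichever store is chosen, vector search ranks records by how close their embeddings are to the query embedding. Cosine similarity is one common measure. The sketch below is a plain C# illustration with toy three-dimensional vectors; `CosineSimilarity` is a hypothetical helper written for this example, and real embeddings have hundreds or thousands of dimensions.

```csharp
using System;

// Hypothetical helper: cosine of the angle between two vectors.
// 1 means same direction, 0 means orthogonal, -1 means opposite direction.
double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length) throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // Convention chosen for this sketch: a zero vector is similar to nothing.
    if (normA == 0 || normB == 0) return 0;
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

var queryVector = new float[] { 1f, 0f, 0f };
var sameDirection = new float[] { 2f, 0f, 0f };
var orthogonal = new float[] { 0f, 3f, 0f };

Console.WriteLine(CosineSimilarity(queryVector, sameDirection)); // 1
Console.WriteLine(CosineSimilarity(queryVector, orthogonal));    // 0
```

Because the result depends only on direction, `sameDirection` scores 1 even though it is twice as long as the query vector.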



## Structured Database Copilot
When there is structured data in a database, a copilot can answer questions by querying that data directly.

**Example User Stories**
- I have application monitoring or metric data that I want to derive insights from.
- I want to chat over the entire corpus of ServiceNow or other ICM support ticket information.

In [None]:


using Microsoft.Azure.Cosmos;
// System.ComponentModel also defines a Container type, so alias the Cosmos one explicitly.
using Container = Microsoft.Azure.Cosmos.Container;

// Thin wrapper around a single Cosmos DB container for query, read, and upsert operations.
public class CosmosCopilot {
    private readonly CosmosClient _client;
    private readonly Container _container;
    private readonly ILogger _logger;

    // Uses a no-op logger when none is supplied.
    public CosmosCopilot(string connectionString, string databaseName, string containerName)
        : this(connectionString, databaseName, containerName, NullLogger.Instance) { }

    public CosmosCopilot(string connectionString, string databaseName, string containerName, ILogger logger) {
        _client = new CosmosClient(connectionString);
        _container = _client.GetContainer(databaseName, containerName);
        _logger = logger;
    }

    // Runs a raw SQL query string. Prefer the QueryDefinition overload with parameters
    // when any part of the query comes from user or model input.
    public Task<IEnumerable<T>> QueryAsync<T>(string query) =>
        QueryAsync<T>(new QueryDefinition(query));

    public async Task<IEnumerable<T>> QueryAsync<T>(QueryDefinition query) {
        var iterator = _container.GetItemQueryIterator<T>(query);
        var results = new List<T>();
        while (iterator.HasMoreResults) {
            var response = await iterator.ReadNextAsync();
            results.AddRange(response);
        }
        return results;
    }

    public async Task<T> GetItemAsync<T>(string partitionKey, string id) {
        try {
            var response = await _container.ReadItemAsync<T>(id, new PartitionKey(partitionKey));
            return response.Resource;
        } catch (CosmosException e) {
            _logger.LogError(e, "Error reading item from Cosmos DB");
            return default;
        }
    }

    public async Task<T> UpsertItemAsync<T>(T item) {
        try {
            var response = await _container.UpsertItemAsync(item);
            return response.Resource;
        } catch (CosmosException e) {
            _logger.LogError(e, "Error upserting item to Cosmos DB");
            return default;
        }
    }
}

## Semantic Search Context

Security trimming restricts search results to the documents a user is allowed to see: https://learn.microsoft.com/en-us/azure/search/search-security-trimming-for-azure-search
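
One way to apply security trimming in Azure AI Search is an OData filter over a field that lists the groups allowed to read each document. The sketch below only builds that filter string. It assumes the index has a filterable `Collection(Edm.String)` field named `group_ids` (an assumed name for this example) and that group identifiers are GUID-like values with no commas or spaces; `BuildSecurityFilter` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds an OData filter that keeps only documents whose
// `group_ids` collection contains at least one of the caller's group identifiers.
// `group_ids` is an assumed field name; adjust it to match the index schema.
string BuildSecurityFilter(IEnumerable<string> groupIds)
{
    // Single quotes inside an OData string literal are escaped by doubling them.
    var escaped = groupIds.Select(id => id.Replace("'", "''"));
    var joined = string.Join(",", escaped);
    return $"group_ids/any(g:search.in(g, '{joined}'))";
}

var filter = BuildSecurityFilter(new[] { "group-a", "group-b" });
Console.WriteLine(filter);
// group_ids/any(g:search.in(g, 'group-a,group-b'))
```

The resulting string would typically be assigned to `SearchOptions.Filter` before calling `SearchClient.SearchAsync`, so trimming happens inside the search service rather than in application code.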

## Multiple Indexes