# SK RAG Pattern Foundational Concepts

Learning objectives:

- RAG pattern foundational concepts
- SK memories with RAM, SQLite, or Postgres as the vector DB


## Setup

### Load required .NET packages and supporting constants, classes, etc.

In [1]:
#r "nuget: Microsoft.SemanticKernel, 1.4.0"
#r "nuget: Microsoft.SemanticKernel.Core, 1.4.0"
#r "nuget: Microsoft.SemanticKernel.Plugins.Memory, 1.4.0-alpha"
#r "nuget: Microsoft.SemanticKernel.Connectors.Sqlite, 1.4.0-alpha"
#r "nuget: Microsoft.SemanticKernel.Connectors.Postgres, 1.4.0-alpha"
#r "nuget: Npgsql" 

#r "nuget: dotenv.net"

using System;

using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Text.Json;
using System.Text.Json.Serialization;
using Npgsql;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Connectors.Sqlite;
using Microsoft.SemanticKernel.Connectors.Postgres;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Plugins.Memory;

using dotenv.net;
using InteractiveKernel = Microsoft.DotNet.Interactive.Kernel;

#!import Models/Models.cs

const string MemoryCollectionName = "LearningsCollection";

### Read the API Key and endpoints from environment variables or the .env file

In [2]:
// Load the .env file
DotEnv.Load();

// Get the OpenAI deployment name, endpoint, and key from the environment variables
var deploymentName = Environment.GetEnvironmentVariable("GPT_OPENAI_DEPLOYMENT_NAME");
var endpoint = Environment.GetEnvironmentVariable("GPT_OPENAI_ENDPOINT");
var apiKey = Environment.GetEnvironmentVariable("GPT_OPENAI_KEY");
var pg_conn_str = Environment.GetEnvironmentVariable("PG_CONN_STR");
var adaDeploymentName = "ada";
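
`Environment.GetEnvironmentVariable` returns `null` when a variable is missing, and the failure then only surfaces later when a service is called. A minimal sketch of a fail-fast check follows; `RequireEnv` is a hypothetical helper, not part of the notebook or of Semantic Kernel, and the demo value is set in-process only so the sketch runs on its own.

```csharp
using System;

// Hypothetical helper (an assumption, not in the notebook): return the value of a
// required environment variable, or throw a descriptive error if it is missing or empty.
static string RequireEnv(string name) =>
    Environment.GetEnvironmentVariable(name) is { Length: > 0 } value
        ? value
        : throw new InvalidOperationException($"Missing required environment variable '{name}'.");

// Demo only: set the variable in-process so the sketch is self-contained.
Environment.SetEnvironmentVariable("GPT_OPENAI_DEPLOYMENT_NAME", "demo-deployment");

var checkedDeploymentName = RequireEnv("GPT_OPENAI_DEPLOYMENT_NAME");
Console.WriteLine(checkedDeploymentName);
```

The same helper could wrap the endpoint, key, and connection string lookups if a clearer error message is preferred over a later null-reference failure.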

In [3]:
#pragma warning disable CS8618,IDE0009,CA1051,CA1050,CA1707,CA2007,VSTHRD111,CS1591,RCS1110,CA5394,SKEXP0001,SKEXP0002,SKEXP0003,SKEXP0004,SKEXP0010,SKEXP0011,SKEXP0012,SKEXP0020,SKEXP0021,SKEXP0022,SKEXP0023,SKEXP0024,SKEXP0025,SKEXP0026,SKEXP0027,SKEXP0028,SKEXP0029,SKEXP0030,SKEXP0031,SKEXP0032,SKEXP0040,SKEXP0041,SKEXP0042,SKEXP0050,SKEXP0051,SKEXP0052,SKEXP0053,SKEXP0054,SKEXP0055,SKEXP0060,SKEXP0061,SKEXP0101,SKEXP0102
var memoryWithCustomDb = new Microsoft.SemanticKernel.Memory.MemoryBuilder()
            .WithAzureOpenAITextEmbeddingGeneration(adaDeploymentName, endpoint, apiKey) // embeddings need the embedding deployment, not the GPT one
            .WithMemoryStore(new Microsoft.SemanticKernel.Memory.VolatileMemoryStore())
            .Build();
            

### Get a kernel instance configured for text completions and embeddings

In [4]:
// This cell uses Postgres (pgvector) as the vector DB. The commented-out lines show RAM (volatile) and SQLite alternatives; other providers such as Azure Search or DuckDB can be swapped in too.
#pragma warning disable CS8618,IDE0009,CA1051,CA1050,CA1707,CA2007,VSTHRD111,CS1591,RCS1110,CA5394,SKEXP0001,SKEXP0002,SKEXP0003,SKEXP0004,SKEXP0010,SKEXP0011,SKEXP0012,SKEXP0020,SKEXP0021,SKEXP0022,SKEXP0023,SKEXP0024,SKEXP0025,SKEXP0026,SKEXP0027,SKEXP0028,SKEXP0029,SKEXP0030,SKEXP0031,SKEXP0032,SKEXP0040,SKEXP0041,SKEXP0042,SKEXP0050,SKEXP0051,SKEXP0052,SKEXP0053,SKEXP0054,SKEXP0055,SKEXP0060,SKEXP0061,SKEXP0101,SKEXP0102
//var memoryStore = new Microsoft.SemanticKernel.Connectors.Postgres.PostgresMemoryStore("Host=localhost;Port=5432;Username=postgres;Password=postgres;Database=postgres", vectorSize: 1536);
//var memoryStore = new Microsoft.SemanticKernel.Memory.VolatileMemoryStore();
//var sqliteStore = await SqliteMemoryStore.ConnectAsync("./vectors.sqlite");

NpgsqlDataSourceBuilder dataSourceBuilder = new(pg_conn_str);
dataSourceBuilder.UseVector();
NpgsqlDataSource dataSource = dataSourceBuilder.Build();
IMemoryStore memoryStore = new PostgresMemoryStore(dataSource, vectorSize: 1536, schema: "public");

var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(deploymentName, endpoint, apiKey)
    .AddAzureOpenAITextEmbeddingGeneration(adaDeploymentName, endpoint, apiKey)
    .Build();

kernel

In [5]:
#pragma warning disable CS8618,IDE0009,CA1051,CA1050,CA1707,CA2007,VSTHRD111,CS1591,RCS1110,CA5394,SKEXP0001,SKEXP0002,SKEXP0003,SKEXP0004,SKEXP0010,SKEXP0011,SKEXP0012,SKEXP0020,SKEXP0021,SKEXP0022,SKEXP0023,SKEXP0024,SKEXP0025,SKEXP0026,SKEXP0027,SKEXP0028,SKEXP0029,SKEXP0030,SKEXP0031,SKEXP0032,SKEXP0040,SKEXP0041,SKEXP0042,SKEXP0050,SKEXP0051,SKEXP0052,SKEXP0053,SKEXP0054,SKEXP0055,SKEXP0060,SKEXP0061,SKEXP0101,SKEXP0102
var embeddingGenerator = new AzureOpenAITextEmbeddingGenerationService(adaDeploymentName, endpoint, apiKey);

// The combination of the text embedding generator and the memory store makes up the 'SemanticTextMemory' object used to
// store and retrieve memories.
SemanticTextMemory textMemory = new(memoryStore, embeddingGenerator);
textMemory

## Ingestion

### Read and deserialize the JSON learnings data file

In [6]:
var jsonFileContents = File.ReadAllText("data/learnings.json");
var learnings = System.Text.Json.JsonSerializer.Deserialize<List<Learning>>(jsonFileContents);
learnings

| Id | Content |
| --- | --- |
| 1 | In a scenario where Azure FrontDoor is used with Azure Container Apps configuring the certificates is difficult both because the documentation is not clear and because the configuration itself is difficult.\n\nThe recommendation is both to improve the documentation and provide feedback to both the FrontDoor and ACA teams to improve the experience when customers are configuring. |
| 2 | In scenarios where App Services are using VNET integration, customers will beginner knowledge of App Services Networking are finding difficult to configure the VNET integration.\n\nThe recommendation is to identify customers who are new to App Services networking and first have them watch a video on the topic. |
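
The `Learning` and `Chunk` types come from `Models/Models.cs`, which is not shown in this notebook. Judging by how they are used (`learning.Id`, `learning.Content`, and `new Chunk(id, text)` in the chunking cell), they might be positional records along the following lines. This is an assumption about their shape, and the JSON below is made-up sample data, not the contents of `data/learnings.json`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Made-up sample data in the assumed shape of data/learnings.json.
// In a verbatim string, \n stays as a JSON escape and is decoded by the serializer.
var sampleJson = @"[
  { ""Id"": 1, ""Content"": ""First paragraph.\n\nSecond paragraph."" },
  { ""Id"": 2, ""Content"": ""Only paragraph."" }
]";

// System.Text.Json can bind JSON properties to a record's constructor parameters.
var sampleLearnings = JsonSerializer.Deserialize<List<Learning>>(sampleJson) ?? new List<Learning>();

foreach (var item in sampleLearnings)
{
    var paragraphCount = item.Content.Split("\n\n").Length;
    Console.WriteLine($"Learning {item.Id} has {paragraphCount} paragraph(s)");
}

// Assumed shapes of the types imported from Models/Models.cs (hypothetical reconstruction).
public record Learning(int Id, string Content);
public record Chunk(string Id, string Text);
```

Running this inside the notebook itself would shadow the imported types, so it is best tried in a separate scratch cell or project.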


### Chunk the learnings & recommendations

**Note:** This is a simple chunker that splits each document into paragraphs. A more realistic chunker would target a token size limit and choose its split points carefully (not in the middle of a paragraph or sentence).
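
As a rough illustration of the "more realistic" direction, the sketch below packs whole sentences into chunks under a size budget. `ChunkBySentence` and `maxChars` are hypothetical names, and character count is used as a crude stand-in for a token limit; a real implementation would likely count tokens with the model's tokenizer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: pack whole sentences into chunks of at most maxChars characters.
// A single sentence longer than maxChars is kept whole as its own oversized chunk.
static List<string> ChunkBySentence(string text, int maxChars)
{
    var result = new List<string>();
    var current = new StringBuilder();

    // Split after sentence-ending punctuation that is followed by whitespace.
    foreach (var sentence in Regex.Split(text.Trim(), @"(?<=[.!?])\s+"))
    {
        if (sentence.Length == 0) continue;

        // Flush the current chunk if adding this sentence would exceed the budget.
        if (current.Length > 0 && current.Length + 1 + sentence.Length > maxChars)
        {
            result.Add(current.ToString());
            current.Clear();
        }

        if (current.Length > 0) current.Append(' ');
        current.Append(sentence);
    }

    if (current.Length > 0) result.Add(current.ToString());
    return result;
}

var sampleText = "One short sentence. Another short sentence. A third one that is somewhat longer than the others.";
var sentenceChunks = ChunkBySentence(sampleText, maxChars: 50);

for (var i = 0; i < sentenceChunks.Count; i++)
{
    Console.WriteLine($"{i + 1}: {sentenceChunks[i]}");
}
```

With the sample text and a 50-character budget, the first two sentences share a chunk and the third sentence lands in a chunk of its own.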

In [7]:
// Keep a list of chunks
var chunks = new List<Chunk>();

// For each learning process the chunks
foreach(var learning in learnings)
{
    // Break the learnings into paragraphs
    var paragraphs = learning.Content.Split("\n\n");
    
    // For each paragraph create a chunk
    for(var i=0;i<paragraphs.Length;i++)
    {
        // Add the chunk to the list
        chunks.Add(new Chunk(learning.Id+"-"+(i+1),paragraphs[i]));
    }
}
chunks

| Id | Text |
| --- | --- |
| 1-1 | In a scenario where Azure FrontDoor is used with Azure Container Apps configuring the certificates is difficult both because the documentation is not clear and because the configuration itself is difficult. |
| 1-2 | The recommendation is both to improve the documentation and provide feedback to both the FrontDoor and ACA teams to improve the experience when customers are configuring. |
| 2-1 | In scenarios where App Services are using VNET integration, customers will beginner knowledge of App Services Networking are finding difficult to configure the VNET integration. |
| 2-2 | The recommendation is to identify customers who are new to App Services networking and first have them watch a video on the topic. |


### Save every chunk as a memory

In [8]:
// Save each chunk to the memory collection; SaveInformationAsync generates the embedding and stores it alongside the text.
foreach(var chunk in chunks)
{    
    await textMemory.SaveInformationAsync(MemoryCollectionName, id: chunk.Id, text: chunk.Text);
}

## Grounding

### Retrieve the memory based on a query

In [9]:
//var question = await InteractiveKernel.GetInputAsync("What is your query?");
var question = "What scenario is FrontDoor good for?";

#pragma warning disable CS8618,IDE0009,CA1051,CA1050,CA1707,CA2007,VSTHRD111,CS1591,RCS1110,CA5394,SKEXP0001,SKEXP0002,SKEXP0003,SKEXP0004,SKEXP0010,SKEXP0011,SKEXP0012,SKEXP0020,SKEXP0021,SKEXP0022,SKEXP0023,SKEXP0024,SKEXP0025,SKEXP0026,SKEXP0027,SKEXP0028,SKEXP0029,SKEXP0030,SKEXP0031,SKEXP0032,SKEXP0040,SKEXP0041,SKEXP0042,SKEXP0050,SKEXP0051,SKEXP0052,SKEXP0053,SKEXP0054,SKEXP0055,SKEXP0060,SKEXP0061,SKEXP0101,SKEXP0102
IAsyncEnumerable<MemoryQueryResult> queryResults =
                textMemory.SearchAsync(MemoryCollectionName, question, limit: 3, minRelevanceScore: 0.77);
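
The `minRelevanceScore` threshold filters results by how similar each stored embedding is to the query embedding. Assuming the store scores matches by cosine similarity (which is common for embedding search, but worth confirming for the store in use), the comparison can be sketched with toy vectors. `CosineSimilarity` here is a hand-written illustration, not a Semantic Kernel API.

```csharp
using System;

// Hand-written cosine similarity for illustration: dot(a, b) / (|a| * |b|).
// Returns NaN if either vector has zero length (all components zero).
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length) throw new ArgumentException("Vectors must have the same length.");

    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Toy 3-dimensional vectors; the notebook's store is configured for 1536 dimensions.
var queryVector = new float[] { 1f, 0f, 1f };
var closeVector = new float[] { 1f, 0f, 0.9f };
var farVector = new float[] { 0f, 1f, 0f };

const double minRelevance = 0.77; // same threshold as the search call above

Console.WriteLine(CosineSimilarity(queryVector, closeVector) >= minRelevance); // True
Console.WriteLine(CosineSimilarity(queryVector, farVector) >= minRelevance);   // False
```

Raising the threshold returns fewer, more closely related memories; lowering it admits weaker matches into the prompt context.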


### Find memories based on query, and collect the text in the memories to augment the prompt

In [10]:
// Keep the text for the recalled memories
StringBuilder memoryText = new StringBuilder();

#pragma warning disable CS8618,IDE0009,CA1051,CA1050,CA1707,CA2007,VSTHRD111,CS1591,RCS1110,CA5394,SKEXP0001,SKEXP0002,SKEXP0003,SKEXP0004,SKEXP0010,SKEXP0011,SKEXP0012,SKEXP0020,SKEXP0021,SKEXP0022,SKEXP0023,SKEXP0024,SKEXP0025,SKEXP0026,SKEXP0027,SKEXP0028,SKEXP0029,SKEXP0030,SKEXP0031,SKEXP0032,SKEXP0040,SKEXP0041,SKEXP0042,SKEXP0050,SKEXP0051,SKEXP0052,SKEXP0053,SKEXP0054,SKEXP0055,SKEXP0060,SKEXP0061,SKEXP0101,SKEXP0102
await foreach (MemoryQueryResult r in queryResults)
{
    // Append the text
    memoryText.Append(r.Metadata.Text+"\n\n");
}

// Final augmented text
var promptContext = memoryText.ToString();
Console.WriteLine($"User:\n{question}\n\nNearest results:\n{promptContext}");

User:
What scenario is FrontDoor good for?

Nearest results:
The recommendation is both to improve the documentation and provide feedback to both the FrontDoor and ACA teams to improve the experience when customers are configuring.

In a scenario where Azure FrontDoor is used with Azure Container Apps configuring the certificates is difficult both because the documentation is not clear and because the configuration itself is difficult.




## Process Prompt & Completion

### Create an SK function

In [11]:
const string promptTemplate = "{{$input}}\n\nText:\n\"\"\"{{$context}}\n\"\"\"Use only the provided text.";
var excuseFunction = kernel.CreateFunctionFromPrompt(promptTemplate, new OpenAIPromptExecutionSettings() { MaxTokens = 100, Temperature = 0.4, TopP = 1 });
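
To see roughly what text reaches the model, the template variables can be substituted by hand. The sketch below uses plain string replacement as a naive stand-in for the kernel's template engine; `RenderSketch` is a hypothetical helper and the context string is a placeholder, not real retrieved memory.

```csharp
using System;

// Template text written in the same {{$variable}} style as the prompt above.
const string sketchTemplate = "{{$input}}\n\nText:\n\"\"\"{{$context}}\n\"\"\"Use only the provided text.";

// Naive stand-in for template rendering: replace the two variables with their values.
static string RenderSketch(string template, string input, string context) =>
    template.Replace("{{$input}}", input).Replace("{{$context}}", context);

var rendered = RenderSketch(
    sketchTemplate,
    "What scenario is FrontDoor good for?",
    "Example retrieved text.\n\n"); // placeholder context, not real memory output

Console.WriteLine(rendered);
```

Printing the rendered prompt like this can help when tuning the instruction wording or checking how much retrieved text fits within the model's limits.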



### Submit the prompt and print the results

In [12]:
var arguments = new KernelArguments()
        {
            ["input"] = question,
            ["context"] = promptContext
        };
var result = await kernel.InvokeAsync(excuseFunction, arguments);
Console.WriteLine(result);

FrontDoor is good for scenarios where customers need to configure certificates for Azure Container Apps. However, in the given scenario, the documentation is not clear and the configuration itself is difficult. Therefore, it is recommended to improve the documentation and provide feedback to both the FrontDoor and ACA teams to enhance the customer experience.
