# Retrieving Answers from MemDoc

Steps:

1. Create and set up the kernel with the appropriate model.
2. Acquire the data.
3. Load the data into a volatile memory store.
4. Create the prompts.
5. Ask questions and get the answers using the `recall` function.

In [1]:
#r "nuget: Microsoft.SemanticKernel, 0.12.207.1-preview"

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.SemanticFunctions;
using Microsoft.SemanticKernel.KernelExtensions;
using Microsoft.SemanticKernel.Orchestration;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.CoreSkills;

using System.Net.Http;

HttpClient httpClient = new();

## Kernel Setup

In [18]:
string apiKey = ".. Enter your Open AI Key.";
string model = "text-davinci-003";
string orgId = "";
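
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the key could instead be read from an environment variable, with the placeholder as a fallback. The variable name `OPENAI_API_KEY` and the helper `ResolveApiKey` are assumptions made for illustration; neither is defined by the notebook or by Semantic Kernel.

```csharp
using System;

// Hypothetical helper: prefer a key from the environment, otherwise use the fallback.
string ResolveApiKey(string fromEnvironment, string fallback)
{
    return string.IsNullOrWhiteSpace(fromEnvironment) ? fallback : fromEnvironment.Trim();
}

// "OPENAI_API_KEY" is an assumed variable name, not something the notebook defines.
string resolvedKey = ResolveApiKey(
    Environment.GetEnvironmentVariable("OPENAI_API_KEY"),
    ".. Enter your Open AI Key.");

Console.WriteLine(resolvedKey.StartsWith("..")
    ? "No key found in the environment; using the placeholder."
    : "Using the key from the environment.");
```

The resolved value could then be assigned to `apiKey` before the kernel is built.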

In [4]:
using Microsoft.SemanticKernel.Memory;

var kernel = new KernelBuilder()
    .Configure(c =>
    {
        /* To Use: AzureOpenAI
        if (useAzureOpenAI)
        {
            c.AddAzureTextEmbeddingGenerationService("ada", "text-embedding-ada-002", azureEndpoint, apiKey);
            c.AddAzureTextCompletionService("davinci", model, azureEndpoint, apiKey);
        }
        */
        c.AddOpenAITextEmbeddingGenerationService("ada", "text-embedding-ada-002", apiKey);
        c.AddOpenAITextCompletionService("text-davinci-003", model, apiKey, orgId);
    })
    .WithMemoryStorage(new VolatileMemoryStore())
    .Build();

## Data Acquisition

In [5]:
// Download the MemDoc questions and split them into one chunk per "##" section.
string url = "https://gist.githubusercontent.com/MokoSan/ce5b4d5a6cca725385ebdadb377b36b0/raw/fc1f700bc2783852c14549194c2a3aaae0f80d43/MemdocQuestions.md";
var response = await httpClient.GetAsync(url);
response.EnsureSuccessStatusCode(); // Fail fast if the download did not succeed.
string content = await response.Content.ReadAsStringAsync();
string[] split = content.Split("##", StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
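
To make the chunking step concrete, here is a small sketch of what the same `Split` call does to a made-up two-question document. The sample text is invented for illustration and is not taken from the MemDoc gist.

```csharp
using System;

// Invented sample with two "##" sections, standing in for the downloaded markdown.
string sample =
    "## What is a GC?\nA garbage collector reclaims unused memory.\n\n" +
    "## What is pinning?\nPinning keeps an object at a fixed address.\n";

// Same options as the notebook: drop empty pieces and trim surrounding whitespace.
string[] chunks = sample.Split(
    "##",
    StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);

for (int i = 0; i < chunks.Length; i++)
{
    Console.WriteLine($"q{i}: {chunks[i]}");
}
```

Each chunk keeps its heading text together with its answer, so the heading is embedded along with the body. Note that a `###` heading also contains `##`, so deeper headings would start a new chunk as well.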

## Load the Data into Memory

In [6]:
const string MemoryCollectionName = "memdocQnA";

var memorySkill = new TextMemorySkill();
kernel.ImportSkill(memorySkill);

// Build a semantic function that saves info to memory
const string SaveFunctionDefinition = @"{{save $info}}";
var memorySaver = kernel.CreateSemanticFunction(SaveFunctionDefinition);

var context = kernel.CreateNewContext();
context[TextMemorySkill.CollectionParam] = MemoryCollectionName;

for (int i = 0; i < split.Length; i++)
{
    context[TextMemorySkill.KeyParam] = $"q{i}";
    context["info"] = split[i];
    await memorySaver.InvokeAsync(context);
}

/*
// Memory can be stored alternatively like this:

for (int i = 0; i < split.Length; i++)
{
    await kernel.Memory.SaveInformationAsync(MemoryCollectionName, id: $"q{i}", text: split[i]);
}
*/

## QnA

In [7]:
// Set up the Semantic Function.
const string skPrompt = @"
{{recall $input}}
---
If the question is irrelevant to the topics above, reply 'I don't know'.
Considering only the information above, which has been loaded from a document consisting of questions and answers, answer the following:
Explain the answers with depth as if you are speaking to a junior performance engineer: 
Question: {{$input}}

Answer:";
// Recall only the single best match, and only if its relevance score is at least 0.7.
context[TextMemorySkill.LimitParam] = "1";
context[TextMemorySkill.RelevanceParam] = "0.7";

var chatFunction = kernel.CreateSemanticFunction(skPrompt, maxTokens: 400, temperature: 0.0);
Func<string, Task> Chat = async (string query) =>
{
    context["input"] = query;
    var res = await chatFunction.InvokeAsync(context);
    Console.WriteLine($"{query}\n >> {res}");
};
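
The `RelevanceParam` value acts as a minimum similarity score for recalled memories. The sketch below shows how such a threshold filters candidates, assuming the score is a cosine similarity between embedding vectors (my understanding of how the volatile store ranks matches, not something the notebook states). The three-dimensional vectors are toy values; real `text-embedding-ada-002` embeddings are much longer.

```csharp
using System;

// Cosine similarity of two equal-length vectors; returns 0 if either vector is all zeros.
double CosineSimilarity(double[] a, double[] b)
{
    if (a.Length != b.Length)
    {
        throw new ArgumentException("Vectors must have the same length.");
    }

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    if (normA == 0 || normB == 0)
    {
        return 0;
    }

    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

const double MinRelevance = 0.7;               // Mirrors the "0.7" set in the context.
double[] query = { 1.0, 0.0, 0.0 };            // Toy embedding of the question.
double[] relatedChunk = { 0.9, 0.1, 0.0 };     // Toy embedding pointing in a similar direction.
double[] unrelatedChunk = { 0.0, 1.0, 0.0 };   // Toy embedding orthogonal to the question.

double relatedScore = CosineSimilarity(query, relatedChunk);
double unrelatedScore = CosineSimilarity(query, unrelatedChunk);

Console.WriteLine($"related:   {relatedScore:F3} recalled={relatedScore >= MinRelevance}");
Console.WriteLine($"unrelated: {unrelatedScore:F3} recalled={unrelatedScore >= MinRelevance}");
```

Under this assumption, a question with no sufficiently similar chunk recalls nothing, which is what lets the prompt fall back to 'I don't know'.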

### Generic Questions

In [8]:
await Chat("What is a GC?");
await Chat("What are some definitive signs of performance issues?");
await Chat("What is the meaning of life?");
await Chat("What is GC suspension?");

What is a GC?
 >>  A GC, or Garbage Collector, is a process in computer programming that is responsible for managing the memory usage of a program. It does this by periodically scanning the memory of the program and reclaiming any memory that is no longer being used. This helps to ensure that the program runs efficiently and does not run out of memory.
What are some definitive signs of performance issues?
 >>  Performance issues can be indicated by long suspensions, random long GC pauses, and most GCs occurring as full blocking GCs. Long suspensions occur when the application is taking longer than expected to complete a task. Random long GC pauses occur when the garbage collector is taking longer than expected to clean up memory. Full blocking GCs occur when the garbage collector is taking a long time to complete its task, blocking other processes from running. These are all signs that the application is not performing as expected and may need to be optimized.
What is the meaning of li

In [9]:
await Chat("What is the ephemeral segment? Explain in as much depth as possible."); 
await Chat("Identifying all issues that performance engineers should focus their efforts.");

What is the ephemeral segment? Explain in as much depth as possible.
 >>  The ephemeral segment in the GC heap is a specific segment that holds the ephemeral generations (gen0 and gen1). It is designed to contain newly allocated objects and is never larger than a single segment, which simplifies the memory management and garbage collection process. The space after the last live object in the ephemeral segment is kept committed after a garbage collection cycle, which allows gen0 allocations to immediately use this space. This committed space in the ephemeral segment corresponds to the gen0 budget, which is why the GC commits more memory than the current heap size. This distinction is especially important in server GC scenarios with multiple heaps and large gen0 budgets, as it ensures that the ephemeral segment is always available for new allocations.
Identifying all issues that performance engineers should focus their efforts.
 >>  Performance engineers should focus their efforts on ide

### Taking Traces

In [10]:
await Chat("How do I take a GCCollectOnly trace? I am on Windows and want to take a top level trace. Please help!");
await Chat("How do I take a top level trace using the PerfView command line?");
await Chat("How do I take a top level trace"); 
await Chat("How do I take a CPU based GC trace?");
await Chat("How do I take a thread time trace?"); 

How do I take a GCCollectOnly trace? I am on Windows and want to take a top level trace. Please help!
 >>  To take a GCCollectOnly trace on Windows, you can use the PerfView tool. First, open a command prompt window and type in the following command: `perfview /GCCollectOnly /AcceptEULA /nogui collect`. This will start the trace. When you are done, press 's' in the PerfView command window to stop the trace. This will generate a trace file that you can analyze to identify any performance issues related to garbage collection.
How do I take a top level trace using the PerfView command line?
 >>  To take a top level trace using the PerfView command line, you need to run the command `perfview /GCCollectOnly /AcceptEULA /nogui collect`. This will start the trace. When you are done, press `s` in the PerfView command window to stop the trace. This will collect a top level GC trace which can be used to analyze the performance of your application.
How do I take a top level trace
 >>  To take a t

### Pinning

In [11]:
await Chat("What is pinning?");
await Chat("How do I reduce pinning?");
await Chat("How do I improve pinning performance?");

What is pinning?
 >>  Pinning is a feature in .NET that allows objects to be marked as unpinned, indicating that they cannot be moved by the garbage collector. This means that the objects will remain in the same memory location, even when the garbage collector runs. Pinning can be used to improve performance by ensuring that objects are not moved around in memory, which can be costly. However, it can also cause fragmentation issues if pinned objects are promoted to higher generations, such as gen1 or gen2. To mitigate this, it is recommended to pin objects in already compacted portions of the heap and allocate batches of pinned objects instead of allocating and pinning objects individually. Additionally, in .NET 5, the Pinned Object Heap (POH) feature was introduced, allowing pinned objects to be allocated on a separate heap to reduce fragmentation.
How do I reduce pinning?
 >>  To reduce pinning, you should try to minimize the amount of objects that need to be pinned. This can be done

### Memory Footprint Issues

In [12]:
await Chat("How do I reduce my memory footprint? What are some topic metrics I should look at?");
await Chat("What metrics should I look at to address my memory issues.");
await Chat("What are some important guidelines to use to address my memory related problems?"); 

How do I reduce my memory footprint? What are some topic metrics I should look at?
 >>  To reduce your memory footprint, you should look at the GC heap size histogram. This metric provides information about the distribution of memory allocations in the GC heap, helping you understand the memory usage patterns and identify opportunities for optimization. You can also look at other metrics such as the number of objects allocated, the number of objects freed, and the amount of time spent in garbage collection. By understanding these metrics, you can identify areas where you can reduce memory usage and improve performance.
What metrics should I look at to address my memory issues.
 >>  To address memory issues, it is important to look at the top-level garbage collection metrics, such as the amount of memory allocated, the amount of memory freed, the number of objects created, and the number of objects destroyed. Additionally, it is important to consider the time taken for garbage collectio

### Throughput Issues

In [14]:
await Chat("My GC is slowing my application down. What should I do?");
await Chat("How should I address throughput issues?");
await Chat("What are some ways to reduce my GC latency?");

My GC is slowing my application down. What should I do?
 >>  If your GC is slowing down your application, you should first investigate the cause of the slowdown. You can do this by looking at the GC logs to see if there are any issues with the GC itself, or if there are any other performance issues that could be causing the slowdown. You can also use a profiler to identify any bottlenecks in your application code. Once you have identified the cause of the slowdown, you can take steps to address it, such as optimizing your code, tuning the GC parameters, or using a different GC algorithm.
How should I address throughput issues?
 >>  To address throughput issues, performance engineers should first identify the root cause of the issue. This can be done by monitoring metrics such as % Pause time in GC and % CPU time in GC. If the issue is related to GC, then optimizing GC parameters such as heap size, GC type, and GC frequency can help improve throughput. Additionally, optimizing applicati

### Different GC Flavors

In [17]:
await Chat("What's the difference between Workstation and Server GC?");
await Chat("What is background GC?");
await Chat("Will WKS work better for my application than SVR?");

What's the difference between Workstation and Server GC?
 >>  The main difference between Workstation GC (WKS GC) and Server GC (SVR GC) in .NET is the way they are optimized for different workloads. WKS GC is designed for workstation or client scenarios where the application shares the machine with other processes, while SVR GC is optimized for server workloads where the application is the dominant process on the machine and handles multiple user threads. 

WKS GC has a single heap, while SVR GC has one heap per logical core. Additionally, SVR GC threads have their priority set to `THREAD_PRIORITY_HIGHEST`, which allows them to preempt lower-priority threads. WKS GC runs on the user thread that triggered the GC, typically at normal priority. SVR GC threads are also hard affinitized to logical cores, ensuring better utilization of available CPU resources. 

The choice between WKS GC and SVR GC depends on the nature of the workload and the specific requirements of the application. SVR G

## Debugging

In [13]:
using System.Diagnostics;

#!about

.NET Interactive
© 2020 Microsoft Corporation
Version: 1.0.425803+1db2979099d0272660e1497cae9b9af1238db42f
Library version: 1.0.0-beta.23258.3+1db2979099d0272660e1497cae9b9af1238db42f
Build date: 2023-05-12T10:30:52.4965699Z
https://github.com/dotnet/interactive
