In [None]:
#r "nuget: Microsoft.DotNet.Interactive.AIUtilities, 1.0.0-beta.23562.1"

In [None]:
#r "nuget:Microsoft.DotNet.Interactive.AI, 1.0.0-beta.23570.1"


In [None]:
#!value --name key
YOUR AZURE OPEN AI KEY

In [None]:
#!value --name endpoint
https://your-endpoint.openai.azure.com/

Note: If your deployment names differ from `text-embedding-ada-002` and `gpt-35-turbo-16k`, change the `--deployment` values in the two cells below:

In [None]:
#!connect azure-openai --model-type TextEmbeddingGenerator --kernel-name knowledge --api-key @value:key --endpoint @value:endpoint --deployment text-embedding-ada-002

In [None]:
#!connect azure-openai --model-type ChatCompletion --kernel-name chat --api-key @value:key --endpoint @value:endpoint --deployment gpt-35-turbo-16k --use-knowledge knowledge

In [None]:
// Debugging aid: uncomment to attach a debugger to the kernel process.
// System.Diagnostics.Debugger.Launch();

### We will use text embeddings and cosine similarity to focus the conversation with the agent on the question the user is about to submit

In [None]:
#r "nuget: Microsoft.ML,  3.0.0-preview.23511.1"

In [None]:
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

## Focus the conversation

At each turn, let's filter out the previous turns that are irrelevant to the question at hand. Text embeddings and cosine similarity will help us keep the conversation on track!
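
As a rough illustration of the score used by the filter, here is a minimal sketch of cosine similarity over plain `float[]` vectors. It assumes that `CosineSimilarityComparer` computes the standard formula, dot(a, b) / (|a| * |b|); the helper name `CosineSimilarity` and the 3-dimensional vectors are hypothetical stand-ins for real embeddings.

```csharp
using System;

// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
// Hypothetical helper, not part of Microsoft.DotNet.Interactive.AIUtilities.
double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
    {
        throw new ArgumentException("Vectors must have the same length.");
    }

    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // A zero vector has no direction, so treat it as "not similar".
    if (normA == 0 || normB == 0)
    {
        return 0;
    }

    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Toy vectors (assumed values) standing in for 1536-dimensional embeddings.
var question = new float[] { 1f, 0f, 1f };
var onTopic = new float[] { 2f, 0f, 2f };   // same direction as the question
var offTopic = new float[] { 0f, 1f, 0f };  // orthogonal to the question

Console.WriteLine(CosineSimilarity(question, onTopic) >= 0.8);
Console.WriteLine(CosineSimilarity(question, offTopic) >= 0.8);
```

Only the direction of the vectors matters, not their length, which is why `onTopic` passes a 0.8 threshold even though it is twice as long as `question`.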

In [None]:
using Microsoft.DotNet.Interactive;
using Microsoft.DotNet.Interactive.Commands;
using Microsoft.DotNet.Interactive.AI;
using Microsoft.DotNet.Interactive.AIUtilities;

var chatCompletionKernel = Microsoft.DotNet.Interactive.Kernel.Root.FindKernelByName("chat") as ChatCompletionKernel;
var knowledgeKernel = Microsoft.DotNet.Interactive.Kernel.Root.FindKernelByName("knowledge") as KnowledgeKernel;

var comparer = new CosineSimilarityComparer<float[]>(f => f);

// Keep a past turn only if its question or its answer is close enough to the new prompt.
chatCompletionKernel.SetUserTurnFilter(async (turn, userPrompt, token) =>
{
    var userEmbeddings = await knowledgeKernel.GenerateEmbeddingsAsync(userPrompt, token);
    var questionRelevance = comparer.Score(userEmbeddings, turn.QuestionTexEmbedding);
    var answerRelevance = comparer.Score(userEmbeddings, turn.AnswerTexEmbedding);
    var relevanceThreshold = 0.8;

    // To inspect the turns that get dropped, call turn.Display() before returning false.
    return questionRelevance >= relevanceThreshold || answerRelevance >= relevanceThreshold;
});

## What are we talking about?

Let's tap into ML and use K-Means to cluster the chat log using the text embeddings of each turn!
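
The handler defined in the cells that follow picks the number of clusters from the number of turns: it starts at 2, adds one cluster for every 5 turns, and caps the result at 20. The sketch below isolates that arithmetic so the growth is easy to see; the helper name `ClusterCount` and the sample turn counts are illustrative only.

```csharp
using System;

// Same arithmetic as the handler: 2 clusters, plus one per 5 turns, capped at 20.
// Integer division means the count only grows every fifth turn.
int ClusterCount(int turnCount) => Math.Min(20, 2 + (turnCount / 5));

foreach (var turns in new[] { 1, 4, 9, 50, 200 })
{
    Console.WriteLine($"{turns} turns -> {ClusterCount(turns)} clusters");
}
```

With a single turn the formula asks for 2 clusters, which is more than the number of turns, so the handler skips clustering until a second turn has completed.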

In [None]:
public class ModelInput
{
    public string Text { get; set; }

    // text-embedding-ada-002 produces 1536-dimensional embeddings.
    [VectorType(1536)]
    public float[] Embedding { get; set; }
}

In [None]:
IDataView clusteredData = null;
ClusteringPredictionTransformer<KMeansModelParameters> model = null;

// Re-cluster the whole chat log every time a conversation turn completes.
chatCompletionKernel.OnConversationTurnCompleted((chatLog, token) =>
{
    var embeddings = new List<ModelInput>();
    var ctx = new MLContext();

    // Start with 2 clusters and add one for every 5 turns, up to 20.
    var clusterCount = Math.Min(20, 2 + (chatLog.Turns.Count / 5));
    if (chatLog.Turns.Count < clusterCount)
    {
        // Not enough turns to cluster yet.
        return Task.CompletedTask;
    }

    foreach (var turn in chatLog.Turns)
    {
        switch (turn)
        {
            case ChatLog.ChatUserTurn userTurn:
                // Each user turn contributes its question and its answer.
                embeddings.Add(new ModelInput { Text = userTurn.Question, Embedding = userTurn.QuestionTexEmbedding });
                embeddings.Add(new ModelInput { Text = userTurn.Answer, Embedding = userTurn.AnswerTexEmbedding });
                break;
            case ChatLog.ChatSystemTurn systemTurn:
                // System turns are not clustered.
                break;
        }
    }

    var idv = ctx.Data.LoadFromEnumerable(embeddings);
    var pipeline = ctx.Clustering.Trainers.KMeans("Embedding", numberOfClusters: clusterCount);
    model = pipeline.Fit(idv);
    clusteredData = model.Transform(idv);
    return Task.CompletedTask;
});

## Start chatting then!

In [None]:
I would like to ask you questions about coding in C#

In [None]:
How do I cook a fish?

In [None]:
What is a C# class?

In [None]:
What is the most common modal scale in heavy metal music?

In [None]:
How do I set up my amplifier to play heavy metal guitar riffs?

In [None]:
What is the best way to model embeddings in a relational database?

In [None]:
What is a motorbike?

In [None]:
Then, what is the best way to represent a motorbike in C# using classes and interfaces? I am thinking of making a motorbike racing game in .NET.

In [None]:
Show me an example of a class that represents a motorbike using C#, not sure what methods I would need for the game simulation of the bike.

In [None]:
clusteredData.Preview().Display();

## Let's find out what has been covered in the conversation
Now that we have the K-Means model, we can look at its centroids. Each centroid represents the topic that a group of turns has in common.
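
To make the role of a centroid concrete, here is a minimal sketch of how a point is matched to its closest centroid. It assumes squared Euclidean distance, the usual K-Means metric; the trained ML.NET model already does this assignment for you, so the helpers `SquaredDistance` and `NearestCentroid` and the 2-dimensional toy centroids are purely illustrative.

```csharp
using System;

// Squared Euclidean distance between two equal-length vectors.
double SquaredDistance(float[] a, float[] b)
{
    double sum = 0;
    for (var i = 0; i < a.Length; i++)
    {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Index of the centroid closest to the given point.
int NearestCentroid(float[] point, float[][] centroids)
{
    var best = 0;
    var bestDistance = double.MaxValue;
    for (var k = 0; k < centroids.Length; k++)
    {
        var distance = SquaredDistance(point, centroids[k]);
        if (distance < bestDistance)
        {
            bestDistance = distance;
            best = k;
        }
    }
    return best;
}

// Hypothetical 2-dimensional centroids standing in for two topics.
var toyCentroids = new[]
{
    new float[] { 0f, 0f },
    new float[] { 10f, 10f },
};

Console.WriteLine(NearestCentroid(new float[] { 1f, 2f }, toyCentroids));
Console.WriteLine(NearestCentroid(new float[] { 9f, 8f }, toyCentroids));
```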

In [None]:
VBuffer<float>[] centroids = default;

In [None]:
model.Model.GetClusterCentroids(ref centroids, out var _);

In [None]:
centroids.Display();

Now, using `Microsoft.DotNet.Interactive.AIUtilities` and its `ScoreBySimilarityTo` extension, we will collect up to 3 questions and 3 answers that are closest to each topic.
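
The selection step itself is ordinary LINQ: score every item, keep the ones above a threshold, sort by score, and take the first few. The sketch below shows that pattern on already-scored items; the helper `TopExamples`, the texts, and the scores are made up for illustration and do not come from a real chat log.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical (text, similarity) pairs, shaped like the key/value pairs
// that the notebook reads through e.Key and e.Value.
var scored = new[]
{
    new KeyValuePair<string, double>("What is a C# class?", 0.91),
    new KeyValuePair<string, double>("How do I cook a fish?", 0.42),
    new KeyValuePair<string, double>("What is an interface?", 0.88),
    new KeyValuePair<string, double>("What is a record?", 0.85),
    new KeyValuePair<string, double>("What is a struct?", 0.83),
};

// Keep items above the threshold, best first, and return at most `count` texts.
string[] TopExamples(IEnumerable<KeyValuePair<string, double>> items, double threshold, int count) =>
    items
        .Where(e => e.Value > threshold)
        .OrderByDescending(e => e.Value)
        .Take(count)
        .Select(e => e.Key)
        .ToArray();

var top = TopExamples(scored, 0.8, 3);
Console.WriteLine(string.Join(" | ", top));
```

Filtering before sorting gives the same result as sorting first and just sorts fewer items. A centroid with no item above the threshold yields an empty array, so a topic can end up with fewer than 3 examples.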

In [None]:
using Microsoft.DotNet.Interactive.AIUtilities;

var log = chatCompletionKernel.ChatLog;

var examples = centroids.Select(c => {
    var embedding = c.GetValues().ToArray();

    // Up to 3 questions whose embedding is close to this centroid.
    var questions = log.Turns
        .OfType<ChatLog.ChatUserTurn>()
        .ScoreBySimilarityTo(embedding, new CosineSimilarityComparer<float[]>(v => v), turn => turn.QuestionTexEmbedding)
        .OrderByDescending(e => e.Value)
        .Where(e => e.Value > 0.8)
        .Take(3)
        .Select(e => e.Key.Question)
        .ToArray();

    // Up to 3 answers whose embedding is close to this centroid.
    var answers = log.Turns
        .OfType<ChatLog.ChatUserTurn>()
        .ScoreBySimilarityTo(embedding, new CosineSimilarityComparer<float[]>(v => v), turn => turn.AnswerTexEmbedding)
        .OrderByDescending(e => e.Value)
        .Where(e => e.Value > 0.8)
        .Take(3)
        .Select(e => e.Key.Answer)
        .ToArray();

    return new {
        CenstroidEmbedding = embedding,
        Text = questions.Concat(answers).ToArray()
    };
}).ToArray();


In [None]:
examples.Display();

Using the `TextCompletion` kernel, we will try to generate a label for each centroid from the examples we collected.
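
The prompt asks the model to answer in the form `Topic : <label>`, but a model may still wrap the label in extra text. If you only want the label itself, a small parser like the sketch below can help. It is an optional addition that the notebook does not use, and the helper name `ExtractTopic` is hypothetical.

```csharp
using System;

// Returns the text after "Topic :" on the first line that starts with "Topic".
// Falls back to the trimmed response when no such line is found.
string ExtractTopic(string response)
{
    foreach (var line in response.Split('\n'))
    {
        var trimmed = line.Trim();
        if (trimmed.StartsWith("Topic", StringComparison.OrdinalIgnoreCase))
        {
            var separator = trimmed.IndexOf(':');
            if (separator >= 0)
            {
                return trimmed.Substring(separator + 1).Trim();
            }
        }
    }
    return response.Trim();
}

Console.WriteLine(ExtractTopic("Topic : Programming"));
Console.WriteLine(ExtractTopic("Sure!\nTopic: Heavy metal\n"));
```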

In [None]:
var textCompletionKernel = Microsoft.DotNet.Interactive.Kernel.Root.FindKernelByName("chat(text)") as TextCompletionKernel;

In [None]:
using Microsoft.DotNet.Interactive.Events;
public record CentroidLabel(string Label, float[] Embedding);
var labels = new List<CentroidLabel>();
foreach (var example in examples)
{
    textCompletionKernel.SetSuppressDisplay(true);
    var result = await textCompletionKernel.SendAsync(
        new SubmitCode(
"""
Given the following conversation examples, please provide a label that classifies the topic being discussed. Return a single response, and it should be a single label, not a list. The label will be used to classify future conversations. Make sure not to return a list or bullet points, only a single-line response.

The response should look like:

Topic : Programming


Here are the conversation examples :

"""+ string.Join("\n", example.Text.Select(t => $"- {t}"))
));
    textCompletionKernel.SetSuppressDisplay(false);
    var response = result.Events.OfType<ReturnValueProduced>().Last().FormattedValues.First(fm => fm.MimeType == "text/plain").Value;
    labels.Add(new CentroidLabel(response, example.CenstroidEmbedding));
}

And this is the result!

In [None]:
labels.Display();

In [None]:
chatCompletionKernel.ResetChatLog();