# Feedback Exploration

This notebook leverages the pre-processed data from [the research notebook](./research.ipynb).

## Approach

Using embeddings, we can find the feedbacks most similar to a given user query. This requires an OpenAI client: we embed the user query and rank the pre-processed feedbacks by their `cosine` similarity to it.
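For reference, the standard definition of cosine similarity is given below. The symbols are chosen here for illustration: $q$ stands for the query embedding and $f$ for a feedback embedding, both of the same dimension.

```latex
\operatorname{sim}(q, f)
  = \frac{q \cdot f}{\lVert q \rVert \, \lVert f \rVert}
  = \frac{\sum_{i} q_i f_i}{\sqrt{\sum_{i} q_i^{2}} \, \sqrt{\sum_{i} f_i^{2}}}
```

The value lies between -1 and 1, and a higher value means the two vectors point in a more similar direction.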

In [7]:
using System.Text.Json.Serialization;
using System.Text.Json;

In [8]:
#load "../utils/VectorMath.cs"
#load "../FeedbackApi/Models/FeedbackRecord.cs"

In [1]:
#r "nuget: Azure.AI.OpenAI, 1.0.0-beta.12"
#r "nuget: DotNetEnv, 2.5.0"

using Azure; 
using Azure.AI.OpenAI;
using DotNetEnv;
using System.IO;
using System.Text.Json; 

## Creating OpenAI Client

The following cell initializes the OpenAI client through which all calls to the endpoint are made.

In [2]:
static string _configurationFile = @"../../configuration/.env";
Env.Load(_configurationFile);

string oAiApiKey = Environment.GetEnvironmentVariable("AOAI_APIKEY") ?? "AOAI_APIKEY not found";
string oAiEndpoint = Environment.GetEnvironmentVariable("AOAI_ENDPOINT") ?? "AOAI_ENDPOINT not found";
string chatCompletionDeploymentName = Environment.GetEnvironmentVariable("CHATCOMPLETION_DEPLOYMENTNAME") ?? "CHATCOMPLETION_DEPLOYMENTNAME not found";
string embeddingDeploymentName = Environment.GetEnvironmentVariable("EMBEDDING_DEPLOYMENTNAME") ?? "EMBEDDING_DEPLOYMENTNAME not found";

AzureKeyCredential azureKeyCredential = new AzureKeyCredential(oAiApiKey);
OpenAIClient openAIClient = new OpenAIClient(new Uri(oAiEndpoint), azureKeyCredential);

Console.WriteLine($"OpenAI Client created: {oAiEndpoint} with: {chatCompletionDeploymentName} and {embeddingDeploymentName} deployments");

OpenAI Client created: https://yd-openai-sweden.openai.azure.com/ with: yd-sweeden-40 and emedd-ada-002 deployments
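
The cell above substitutes a placeholder string when a variable is missing, so a misconfiguration would likely surface only later, as an invalid URI or a failed request. A minimal sketch of a fail-fast alternative follows. `GetRequiredEnv` is a hypothetical helper name and is not part of the original notebook.

```csharp
using System;

// Hypothetical helper (not in the original notebook): return the value of an
// environment variable, or throw right away, naming the variable, when it is
// unset or blank.
string GetRequiredEnv(string name)
{
    var value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException($"Environment variable '{name}' is not set.");
    }
    return value;
}

// Usage sketch with a variable name from the cell above:
// string oAiEndpoint = GetRequiredEnv("AOAI_ENDPOINT");
```

With this approach a missing `AOAI_ENDPOINT` or `AOAI_APIKEY` is reported by name before the client is constructed.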


In [3]:
async Task<float[]> GetEmbeddingAsync(string textToBeVectorized)
{
    // Prepare the embeddings options with the text to embed
    EmbeddingsOptions embeddingsOptions = new EmbeddingsOptions(embeddingDeploymentName, new List<string> { textToBeVectorized });
    var modelResponse = await openAIClient.GetEmbeddingsAsync(embeddingsOptions);
    // The request holds a single input, so the response holds a single embedding
    float[] response = modelResponse.Value.Data[0].Embedding.ToArray();
    return response;
}

# Search over a list of feedbacks

The input is a JSON file, generated in the `research` notebook, containing a list of feedbacks. Each entry also carries the generated user story and its embedding.

This notebook reuses the same object that represents a feedback. It loads the file into a list of feedbacks and then implements a few search methods over that list. In all cases the search is brute force, since the goal is to show how the search can be done. The demo uses only 20 items; the same approach should scale to roughly 1,000 items before performance degrades noticeably.

An OpenAI client is required here to embed the user question.

## Loading pre-processed feedbacks

The feedbacks are pre-processed in the `research` notebook and saved in a JSON file. That file is loaded here and the feedbacks are stored in a list.

In [5]:
using System.IO;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;

public async Task<List<FeedbackRecord>> ReadFeedbackRecordsFromJsonAsync(string filePath)
{
    // Read the JSON file content
    string jsonString = await File.ReadAllTextAsync(filePath);
    
    // Deserialize the JSON string into a list of FeedbackRecord objects
    var feedbackRecords = JsonSerializer.Deserialize<List<FeedbackRecord>>(jsonString, new JsonSerializerOptions
    {
        PropertyNameCaseInsensitive = true // Ignore case differences in JSON property names
    });
    
    // Deserialize returns null when the file contains a JSON null; fall back to an empty list
    return feedbackRecords ?? new List<FeedbackRecord>();
}

## GetTop Similar Feedbacks

This method returns the feedbacks most similar to the user question. Similarity is computed between the embedding of the user question and the embedding of each feedback's user story (a normalized version of the feedback details).

Cosine similarity is used to compare the two embeddings, and records without an embedding are skipped.

In [14]:
public async Task<List<FeedbackRecord>> GetTopnSimilarUserStoriesAsync(string userQuery, List<FeedbackRecord> feedbackRecords, int topN = 3)
{
    // Step 1: Get the embedding for the user query
    float[] queryEmbedding = await GetEmbeddingAsync(userQuery);

    // Step 2: Calculate cosine similarity for each feedback record and store results
    var similarityResults = new List<(FeedbackRecord record, float similarity)>();
    
    foreach (var record in feedbackRecords)
    {
        if (record.Embedding != null && record.Embedding.Length > 0)
        {
            float similarity = VectorMath.CosineSimilarity(record.Embedding, queryEmbedding);
            similarityResults.Add((record, similarity));
        }
    }

    // Step 3: Sort by descending similarity and take the top N records
    var topStories = similarityResults.OrderByDescending(r => r.similarity).Take(topN).Select(r => r.record).ToList();

    return topStories;
}
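
The method above relies on `VectorMath.CosineSimilarity`, which is loaded from `../utils/VectorMath.cs` and is not shown in this notebook. The sketch below illustrates what such a function typically computes and how the brute-force ranking behaves on toy vectors. `CosineSimilaritySketch` and the sample data are assumptions made for illustration, and the actual helper may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for VectorMath.CosineSimilarity (the real file is not shown here).
float CosineSimilaritySketch(float[] a, float[] b)
{
    if (a.Length != b.Length)
    {
        throw new ArgumentException("Vectors must have the same length.");
    }

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // A zero vector has no direction; report 0 instead of dividing by zero.
    if (normA == 0 || normB == 0)
    {
        return 0f;
    }

    return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
}

// Toy 2-dimensional "embeddings" (made-up values, not real model output).
var query = new float[] { 1f, 0f };
var candidates = new List<(string Name, float[] Embedding)>
{
    ("same direction", new float[] { 2f, 0f }),
    ("orthogonal", new float[] { 0f, 3f }),
    ("diagonal", new float[] { 1f, 1f }),
};

// Brute force: score every candidate, sort by descending score, keep the top 2.
List<string> ranked = candidates
    .OrderByDescending(c => CosineSimilaritySketch(c.Embedding, query))
    .Take(2)
    .Select(c => c.Name)
    .ToList();

Console.WriteLine(string.Join(", ", ranked)); // prints "same direction, diagonal"
```

Every query scores all records once and then sorts them, which is why this approach suits small collections such as the one used in this demo.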

## Pre-Processed data

The feedbacks are pre-processed in the `research` notebook and saved in a JSON file. The cell below loads that file into a list. Modify the file location and name if needed.

In [9]:
var feedbacklocation = "../../sample-data/exportedstories.json";
List<FeedbackRecord> feedbackRecords = await ReadFeedbackRecordsFromJsonAsync(feedbacklocation);


## Potential user queries

- i wonder if anyone was looking for older windows auth for his aks cluster
- I am looking for evidence that there is a capacity issue in europe
- I am looking for aks related feedback that devops would benefit
- what are the most common feature requests on app service?


In [21]:
string userQuery = "i wonder if anyone was looking for older windows auth for his aks cluster";
List<FeedbackRecord> topStories = await GetTopnSimilarUserStoriesAsync(userQuery, feedbackRecords, 2);

// Output the top matching user stories (CustomerName is printed under the "Id" label)
foreach (var story in topStories)
{
    Console.WriteLine($"Id: {story.CustomerName}, UserStory: {story.UserStory}");
}

Id: Molina Healthcare, UserStory: As a system administrator, I want to enable gMSA v2 support on Windows AKS, so that I can ensure our API pods can access a domain-joined SQL server without disruptions.
Id: JFrog Ltd, UserStory: As a DevOps engineer, I want to have extended support for older Kubernetes versions on Azure Kubernetes Service, so that I can have more time to migrate to fully supported versions and maintain consistency across multi-cloud environments.
