# 03 SDK | 06 Prompt Testing

## Why Prompt Testing?

Prompts often provide core capabilities of an application, so testing them is crucial. Version control for prompts is outside the scope of this notebook; here we focus on one way to test your prompts locally. We also suggest including prompt testing in your CI/CD pipeline.

## Testing a Classification Prompt

A **Classification Prompt** is a prompt designed to classify a given input into one of a set of specified categories. The model's response is expected to be one of those categories. In many cases the output must follow a specific format or come from a specific set of answers.
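
Because the response must come from a fixed set of answers, it can help to normalize the raw model output before comparing it with the expected label. The following is a minimal sketch, not part of the original notebook: `NormalizeLabel` is a hypothetical helper, and the set of characters it strips is an assumption you may need to adjust for your own prompts.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps a raw model response onto one of the allowed labels,
// or returns null when the response does not match any of them.
static string NormalizeLabel(string rawResponse, IReadOnlyCollection<string> allowedLabels)
{
    if (string.IsNullOrWhiteSpace(rawResponse))
    {
        return null;
    }

    // Strip surrounding whitespace, quotes, and punctuation, e.g. " yes." becomes "yes"
    string cleaned = rawResponse.Trim().Trim('"', '\'', '.', '!', ' ');

    foreach (string label in allowedLabels)
    {
        if (string.Equals(cleaned, label, StringComparison.OrdinalIgnoreCase))
        {
            return label;
        }
    }

    return null;
}

string[] labels = { "Yes", "No" };
Console.WriteLine(NormalizeLabel(" yes.", labels) ?? "<invalid>");         // prints "Yes"
Console.WriteLine(NormalizeLabel("I am not sure", labels) ?? "<invalid>"); // prints "<invalid>"
```

Returning `null` for anything outside the allowed set keeps malformed responses visible instead of silently counting them as one of the classes.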

In this sample, we will create a utility function that calls the Azure OpenAI API endpoint. We will use `csv` files to hold the system prompts we wish to test and the user prompts with their expected labels.

**Scenarios Table**

| Scenario ID | System Prompt                                        |
|-------------|------------------------------------------------------|
| 1           | "You are an AI assistant designated to classify..A"  |
| 2           | "You are an AI assistant designated to classify..B"  |

**Expected Output Table**

| Scenario | Prompt Text                                                                 | Label |
|----------|-----------------------------------------------------------------------------|-------|
| 1        | "I am in trouble and don't know what to do."                                | Yes   |
| 1        | "Can you tell me more about your services?"                                 | No    |
| 2        | "I need help immediately, can you help?"                                    | Yes   |
| 2        | "What hours are you available?"                                             | No    |
| 2        | "I don't know. I'm not sure if I am able to get support."                   | Yes   |
| 1        | "Maybe I could contact some services. Could you share the contact details?" | Yes   |

This table is only an illustrative example; the `Scenario` column indicates which system prompt each user prompt is tested against.

The output of this test would also be a table:

| Scenario   | Number of Runs | Count Accurate | Count False Positive | Count False Negative | Accuracy % |
|------------|----------------|----------------|----------------------|----------------------|------------|
| Scenario 1 | 100            | 90             | 5                    | 5                    | 90%        |
| Scenario 2 | 100            | 85             | 10                   | 5                    | 85%        |
| Scenario 3 | 100            | 95             | 3                    | 2                    | 95%        |

**This is a simple example. You can extend this to include more complex scenarios.**
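
The per-scenario summary above could be produced by grouping the labeled rows by scenario. The following is a minimal sketch under stated assumptions: the rows are made-up sample data, the tuple shape and the `Same` helper are hypothetical, and "Yes" is treated as the positive class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative rows: (scenario, expected label, label returned by the model).
var rows = new List<(string Scenario, string Expected, string Predicted)>
{
    ("1", "Yes", "Yes"),
    ("1", "No",  "Yes"),
    ("2", "Yes", "No"),
    ("2", "No",  "No"),
};

// Hypothetical helper: compares two labels, ignoring case and surrounding whitespace.
static bool Same(string a, string b) =>
    string.Equals(a?.Trim(), b?.Trim(), StringComparison.OrdinalIgnoreCase);

// One summary entry per scenario, with "Yes" as the positive class.
var summary = rows
    .GroupBy(r => r.Scenario)
    .OrderBy(g => g.Key)
    .Select(g => (
        Scenario: g.Key,
        Runs: g.Count(),
        Accurate: g.Count(r => Same(r.Expected, r.Predicted)),
        FalsePositives: g.Count(r => Same(r.Expected, "No") && Same(r.Predicted, "Yes")),
        FalseNegatives: g.Count(r => Same(r.Expected, "Yes") && Same(r.Predicted, "No"))))
    .ToList();

foreach (var s in summary)
{
    Console.WriteLine($"Scenario {s.Scenario}: runs={s.Runs}, accurate={s.Accurate}, " +
                      $"FP={s.FalsePositives}, FN={s.FalseNegatives}, accuracy={(double)s.Accurate / s.Runs:P0}");
}
```

With the sample rows, each scenario has two runs and one accurate result; scenario 1 records one false positive and scenario 2 one false negative.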

## Step 1: Create OpenAIClient

The `OpenAIClient` from `Azure.AI.OpenAI` is a .NET client library that acts as the central point for all .NET functionality that interacts with a deployed Azure OpenAI Large Language Model. It provides methods to access the OpenAI REST APIs for tasks such as text completion, text embedding, and chat completion. It also allows developers to specify the model, engine, and options for each request, such as temperature, frequency penalty, presence penalty, and stop sequences.

The OpenAIClient can connect to any Azure OpenAI resource or to the non-Azure OpenAI inference endpoint, making it a versatile and powerful tool for .NET development with OpenAI.

In [6]:
#r "nuget: Azure.AI.OpenAI, 1.0.0-beta.12"
#r "nuget: DotNetEnv, 2.5.0"
#r "nuget: CsvHelper, 27.1.1"

In [7]:
using Azure; 
using Azure.AI.OpenAI;
using DotNetEnv;
using System.IO;
using System.Text.Json; 

// The configuration file is created during environment creation.
// If you skipped the deployment, remove this code and provide the values from your own deployment.
static string _configurationFile = @"../01_DemoEnvironment/conf/application.env";
Env.Load(_configurationFile);


string assetsFolder = Path.Combine(Directory.GetCurrentDirectory(), "..", "..", "assets");

string oAiApiKey = Environment.GetEnvironmentVariable("SKIT_AOAI_APIKEY") ?? "SKIT_AOAI_APIKEY not found";
string oAiEndpoint = Environment.GetEnvironmentVariable("SKIT_AOAI_ENDPOINT") ?? "SKIT_AOAI_ENDPOINT not found";
string chatCompletionDeploymentName = Environment.GetEnvironmentVariable("SKIT_CHATCOMPLETION_DEPLOYMENTNAME") ?? "SKIT_CHATCOMPLETION_DEPLOYMENTNAME not found";

AzureKeyCredential azureKeyCredential = new AzureKeyCredential(oAiApiKey);
OpenAIClient openAIClient = new OpenAIClient(new Uri(oAiEndpoint), azureKeyCredential);

Console.WriteLine($"OpenAI Client created...");

OpenAI Client created...


## Step 2: Create Supporting Functions

We will create several methods:

- Calling OpenAI client with system prompt and user prompt
- Reading the CSV file into a list of prompts
- Reading a CSV into a list of scenarios (system prompts)

In [3]:
public async Task<string> CallOpenAI(OpenAIClient client, string systemPrompt, string userPrompt)
{
    ChatCompletionsOptions simpleOption = new ChatCompletionsOptions();
    simpleOption.Messages.Add(new ChatRequestSystemMessage(systemPrompt));
    simpleOption.Messages.Add(new ChatRequestUserMessage( userPrompt));

    //Request Properties
    simpleOption.MaxTokens = 500;
    simpleOption.Temperature = 0.0f;
    simpleOption.NucleusSamplingFactor = 0.0f;
    simpleOption.FrequencyPenalty = 0.7f;
    simpleOption.PresencePenalty = 0.7f;
    simpleOption.StopSequences.Add("\n"); 
    simpleOption.DeploymentName = chatCompletionDeploymentName;

    // Use the client passed in as a parameter rather than the notebook-level instance
    Response<ChatCompletions> simpleResponse = await client.GetChatCompletionsAsync(simpleOption);
    // Return the content of the first choice in the response
    return simpleResponse.Value.Choices[0].Message.Content;
}

### Testing the OpenAI calling function

In [4]:
string systemPrompt = "The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly.";
string userPrompt = "What is the capital of Germany?";

string response = await CallOpenAI(openAIClient, systemPrompt, userPrompt);
Console.WriteLine($"Response: {response}");


Response: The capital of Germany is Berlin. It's a vibrant city with a rich history and plenty of cultural attractions to explore.


In [10]:
using CsvHelper;
using System.Globalization;
using System.IO;
using CsvHelper.Configuration.Attributes;
using System.Collections.Generic;
using System.Linq;

// Define the ScenarioData class with CSV mapping attributes
public class ScenarioData
{
    [Name("Scenario")]
    public string Scenario { get; set; }

    [Name("Prompt Text")]
    public string PromptText { get; set; }

    [Name("Label")]
    public string Label { get; set; }

    // Filled in later with the model's response; not read from the CSV file
    [Ignore]
    public string CalculatedLabel { get; set; }
}

public class Scenario
{
    [Name("Scenario ID")]
    public string Id { get; set; }

    [Name("System Prompt")]
    public string SystemPrompt { get; set; }
}

// Function to read the CSV data
public List<ScenarioData> ReadScenarioDataFromCsv(string filePath)
{
    using (var reader = new StreamReader(filePath))
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        var records = csv.GetRecords<ScenarioData>().ToList();
        return records;
    }
}

// Function to read the CSV scenarios (system prompts)
public List<Scenario> ReadScenariosFromCsv(string filePath)
{
    using (var reader = new StreamReader(filePath))
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        var records = csv.GetRecords<Scenario>().ToList();
        return records;
    }
}



### Test reading the CSV file

In [11]:
// string filePath = "path_to_your_csv_file.csv";
string filePath = Path.Combine(assetsFolder,"docs", "03_SDK" , "scenariosPrompts.csv");

List<ScenarioData> scenarios = ReadScenarioDataFromCsv(filePath);

foreach (var record in scenarios)
{
    Console.WriteLine($"Scenario: {record.Scenario}, Prompt Text: {record.PromptText}, Label: {record.Label}");
}

filePath = Path.Combine(assetsFolder,"docs", "03_SDK" , "systemPrompts.csv");

List<Scenario> systemPrompts = ReadScenariosFromCsv(filePath);

foreach (var record in systemPrompts)
{
    Console.WriteLine($"Scenario ID: {record.Id}, System Prompt: {record.SystemPrompt}");
}

Scenario: 1, Prompt Text: Can you provide the contact number for support?, Label:  Yes
Scenario: 1, Prompt Text: I need the email address for assistance, can you help?, Label:  Yes
Scenario: 1, Prompt Text: Where can I find contact details for help?, Label:  Yes
Scenario: 1, Prompt Text: Could you give me the phone number for emergency services?, Label:  Yes
Scenario: 1, Prompt Text: Is there a direct line I can call for immediate help?, Label:  Yes
Scenario: 1, Prompt Text: How do I contact your office for more information?, Label:  Yes
Scenario: 1, Prompt Text: Please provide the contact information for customer support., Label:  Yes
Scenario: 1, Prompt Text: I'm looking to speak to someone; do you have a contact list?, Label:  Yes
Scenario: 1, Prompt Text: Can you direct me to someone I can talk to about my issues?, Label:  Yes
Scenario: 1, Prompt Text: Who should I contact if I need urgent assistance?, Label:  Yes
Scenario: 1, Prompt Text: I think I left my phone at the office, Lab

## Step 3: Putting it all together

The `scenarios` variable holds the list of user prompts with their expected labels, and `systemPrompts` holds the system prompt for each scenario.

In [20]:
public async Task<List<ScenarioData>> inferLabels(List<ScenarioData> scenarios, List<Scenario> systemPrompts)
{
    foreach (var userPrompt in scenarios)
    {
        // find the proper system prompt from the systemPrompts list
        var systemPrompt = systemPrompts.FirstOrDefault(x => x.Id == userPrompt.Scenario);
        if (systemPrompt != null)
        {
            string response = await CallOpenAI(openAIClient, systemPrompt.SystemPrompt, userPrompt.PromptText);
            // Console.WriteLine($"Response: {response}");
            userPrompt.CalculatedLabel = response;
        }
        else
        {
            Console.WriteLine($"System prompt not found for scenario: {userPrompt.Scenario}");
        }
    }
    return scenarios;
}

In [23]:
public class EvaluationResult
{
    public int Total { get; set; }
    public int AccurateCount { get; set; }
    public int FalsePositives { get; set; }
    public int FalseNegatives { get; set; }
    public double Accuracy => (double)AccurateCount / Total;
}

public EvaluationResult EvaluatePredictions(List<ScenarioData> scenarios)
{
    var result = new EvaluationResult();
    result.Total = scenarios.Count;
    
    foreach (var scenario in scenarios)
    {
        // Trim both labels; a missing prediction (no matching system prompt) is treated as empty
        var actualLabel = (scenario.Label ?? string.Empty).Trim();
        var calculatedLabel = (scenario.CalculatedLabel ?? string.Empty).Trim();

        if (string.Equals(actualLabel, calculatedLabel, StringComparison.OrdinalIgnoreCase))
        {
            result.AccurateCount++;
        }
        else
        {
            // Classify the mismatch; a response that is neither Yes nor No counts only as inaccurate
            if (string.Equals(actualLabel, "Yes", StringComparison.OrdinalIgnoreCase) &&
                string.Equals(calculatedLabel, "No", StringComparison.OrdinalIgnoreCase))
            {
                result.FalseNegatives++;
            }
            else if (string.Equals(actualLabel, "No", StringComparison.OrdinalIgnoreCase) &&
                     string.Equals(calculatedLabel, "Yes", StringComparison.OrdinalIgnoreCase))
            {
                result.FalsePositives++;
            }
        }
    }

    return result;
}
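
The evaluation above assumes Yes/No labels. For a multi-class classification, one option is to count every (expected, predicted) pair instead of only false positives and false negatives. The following is a minimal sketch, not the notebook's implementation: `BuildConfusionCounts` is a hypothetical helper and the three class names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts (expected, predicted) label pairs for any number of classes.
static Dictionary<(string Expected, string Predicted), int> BuildConfusionCounts(
    IEnumerable<(string Expected, string Predicted)> labelPairs)
{
    var counts = new Dictionary<(string Expected, string Predicted), int>();
    foreach (var (expected, predicted) in labelPairs)
    {
        // Normalize in the same spirit as the Yes/No comparison: trim and ignore case
        var key = ((expected ?? "").Trim().ToLowerInvariant(),
                   (predicted ?? "").Trim().ToLowerInvariant());
        counts[key] = counts.TryGetValue(key, out int current) ? current + 1 : 1;
    }
    return counts;
}

// Illustrative data with three made-up classes.
var samplePairs = new List<(string Expected, string Predicted)>
{
    ("Billing", "billing"),
    ("Support", "Billing"),
    ("Sales", " Sales"),
    ("Support", "Support"),
};

foreach (var entry in BuildConfusionCounts(samplePairs))
{
    Console.WriteLine($"expected={entry.Key.Expected}, predicted={entry.Key.Predicted}, count={entry.Value}");
}
```

Entries whose expected and predicted labels match are the accurate predictions; every other entry shows which class was confused with which.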


### Running the test

A couple of csv files with generated prompts and expected labels are provided; you can use them to test the functions. For a binary 1/0 (Yes/No) classification, the code can be used as is. For a multi-class classification, you can modify `EvaluatePredictions` to handle the additional labels.

In [24]:
string filePath = Path.Combine(assetsFolder,"docs", "03_SDK" , "scenariosPrompts.csv");

List<ScenarioData> scenarios = ReadScenarioDataFromCsv(filePath);
filePath = Path.Combine(assetsFolder,"docs", "03_SDK" , "systemPrompts.csv");

List<Scenario> systemPrompts = ReadScenariosFromCsv(filePath);

var withLabels = await inferLabels(scenarios, systemPrompts);

// `withLabels` is the List<ScenarioData> with both the expected labels and the model responses
var evaluationResult = EvaluatePredictions(withLabels);
Console.WriteLine($"Total runs: {evaluationResult.Total} Accuracy: {evaluationResult.Accuracy:P2}");
Console.WriteLine($"False Positives: {evaluationResult.FalsePositives}");
Console.WriteLine($"False Negatives: {evaluationResult.FalseNegatives}");


Total runs: 13 Accuracy: 84.62%
False Positives: 0
False Negatives: 2
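
To use a result like this in a CI/CD pipeline, the accuracy can be compared against a minimum threshold and turned into a pass/fail exit code. The following is a minimal sketch under stated assumptions: `GetExitCode` is a hypothetical helper, the 80% threshold is an arbitrary example, and the counts (11 of 13 correct) are taken from the 84.62% accuracy reported above.

```csharp
using System;

// Hypothetical gate: returns 0 when accuracy meets the threshold and 1 otherwise,
// so the value can be used as a process exit code in a pipeline step.
static int GetExitCode(int accurateCount, int total, double minimumAccuracy)
{
    if (total <= 0)
    {
        // Treat an empty test set as a failure rather than a silent pass
        return 1;
    }

    double accuracy = (double)accurateCount / total;
    return accuracy >= minimumAccuracy ? 0 : 1;
}

// 11 of 13 correct, checked against an assumed 80% threshold.
int exitCode = GetExitCode(accurateCount: 11, total: 13, minimumAccuracy: 0.80);
Console.WriteLine(exitCode == 0 ? "Prompt test passed" : "Prompt test failed"); // prints "Prompt test passed"

// In a console application, the step could then end with: Environment.Exit(exitCode);
```

Most pipeline runners treat a non-zero exit code as a failed step, so a prompt change that lowers accuracy below the threshold would stop the build.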
