
---

### Reminder: This 📘 `.NET Interactive` notebook needs to be run from VS Code with [these prerequisites](../PREREQS.md).

#### How to use this notebook: 

* Just read the text and scroll along until you run into code blocks.
* Code blocks have computer code inside them — hover over the block and you can run the code.
* Run the code by hitting the ▶️ "play" button to the left. If the code runs you'll see a ✔️. If not, you'll get a ❌.
* The output and status of the code block will appear just below it; you may need to scroll down further to see it.
* Sometimes a code block will ask you for input in a hard-to-notice dialog box 👆 at the top of your notebook window. 

---

# Bonus Recipe: 🕵️ Test your prompts
## 🧪 Non-deterministic computation requires testing

Because the output from LLM AI is unpredictable by nature, you're going to want to enforce some degree of reliability so that it behaves in a similar way each time you run a prompt.

## Step 1: Instantiate a 🔥 kernel so we can start cooking

In [1]:
#r "nuget: Microsoft.SemanticKernel, 0.15.230531.5-preview"

#!import ../config/Settings.cs

using Microsoft.SemanticKernel;
using System.IO;
using Microsoft.SemanticKernel.SemanticFunctions;
using Microsoft.SemanticKernel.Orchestration;

// Grab the locally stored credentials from the settings.json file, then build a kernel backed by the Azure OpenAI chat completion deployment named in `model`.

var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();

IKernel kernel = Microsoft.SemanticKernel.Kernel.Builder
    .WithAzureChatCompletionService(
        deploymentName: model,
        endpoint: azureEndpoint,
        apiKey: apiKey
    )
    .Build();

## Step 2: We're going to make one semantic function and run it at two different temperatures

Different temperatures result in different outcomes — we can illustrate that as follows:

In [3]:
using Microsoft.SemanticKernel.SemanticFunctions;

string myFun = """
{{$input}}
Define the term above in less than five words.
""";

var promptConfigA = new PromptTemplateConfig
{
    Completion =
    {
        Temperature = 0.2, 
        TopP = 0.1
    }
};

var promptConfigB = new PromptTemplateConfig
{
    Completion =
    {
       Temperature = 1,
       TopP = 1.0
    }
};

var functionConfigA = new SemanticFunctionConfig(promptConfigA, new PromptTemplate( myFun, promptConfigA, kernel ));
var functionConfigB = new SemanticFunctionConfig(promptConfigB, new PromptTemplate( myFun, promptConfigB, kernel ));

var myFunCold = kernel.RegisterSemanticFunction("MyTest", "A", functionConfigA);
var myFunHot = kernel.RegisterSemanticFunction("MyTest", "B", functionConfigB);

Console.WriteLine("A semantic function has been registered with two configurations.");


Now we can run these two functions over multiple iterations to see how they differ, or not. Note that our configurations are as follows:

### Configuration A

* Temperature = 0.2 <-- 0 means low randomness and 1 means high randomness (the OpenAI API accepts values up to 2)
* TopP = 0.1 <-- ranges from 0 (low variability) to 1 (high variability)

### Configuration B

* Temperature = 1 <-- 0 means low randomness and 1 means high randomness (the OpenAI API accepts values up to 2)
* TopP = 1.0 <-- ranges from 0 (low variability) to 1 (high variability)
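
To build some intuition for what these two settings do, here is a minimal, self-contained sketch of temperature scaling and top-p (nucleus) filtering over a made-up set of token scores. The token names, the `logits` values, and the helpers `Softmax` and `TopPFilter` are hypothetical and are not part of Semantic Kernel; a real model applies the same idea across its whole vocabulary, and its exact implementation may differ.

```csharp
using System;
using System.Linq;

// Hypothetical next-token candidates and raw scores (logits); illustrative values only.
string[] tokens = { "intelligence", "machines", "robots", "magic" };
double[] logits = { 4.0, 3.0, 2.0, 1.0 };

// Softmax with temperature: dividing the scores by a small temperature sharpens the distribution.
double[] Softmax(double[] scores, double temperature)
{
    double max = scores.Max(); // subtract the max for numerical stability
    double[] exps = scores.Select(s => Math.Exp((s - max) / temperature)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

// Top-p: keep the most likely tokens until their cumulative probability reaches topP, then renormalize.
double[] TopPFilter(double[] probs, double topP)
{
    var kept = new double[probs.Length];
    double cumulative = 0;
    foreach (int idx in Enumerable.Range(0, probs.Length).OrderByDescending(k => probs[k]))
    {
        kept[idx] = probs[idx];
        cumulative += probs[idx];
        if (cumulative >= topP) break;
    }
    double total = kept.Sum();
    return kept.Select(p => p / total).ToArray();
}

// Configuration A (cold) versus Configuration B (hot).
double[] cold = TopPFilter(Softmax(logits, 0.2), 0.1);
double[] hot = TopPFilter(Softmax(logits, 1.0), 1.0);

for (int t = 0; t < tokens.Length; t++)
    Console.WriteLine($"{tokens[t],-12} cold={cold[t]:F3} hot={hot[t]:F3}");
```

With the cold settings, all of the probability ends up on the single most likely token, which is why Configuration A tends to give the same answer every time. With the hot settings, every candidate keeps a share of the probability, so repeated runs can pick different tokens.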

In [4]:
var input = "AI";
var coldResponses = "";
var hotResponses = "";
for(var i = 0; i < 5; i++) {
   var coldResult = await kernel.RunAsync(input, myFunCold);
   coldResponses+=coldResult.ToString().Trim() + "\n";
   var hotResult = await kernel.RunAsync(input, myFunHot);
   hotResponses+=hotResult.ToString().Trim() + "\n";
}

Console.WriteLine($"COLD responses:\n\n{coldResponses}");
Console.WriteLine($"\nHOT responses:\n\n{hotResponses}");

## Step 3: Let's get freezing cold to aim towards a higher degree of determinism in outputs.

Note that the COLD responses don't vary much, whereas the HOT responses do. Two other parameters that affect variability are `PresencePenalty` and `FrequencyPenalty`. As the official definition goes:

* `PresencePenalty`: "Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics."
* `FrequencyPenalty`: "Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim."

So if you have the LLM AI generating a longer piece of text, positive number settings nudge it toward new words instead of repeating old ones, based either on whether a word has already appeared (its "presence") or on how often it has appeared (its "frequency"). With negative number settings, it will be happy repeating itself by using the same words over and over.

The benefit of using the same words repetitively is that the LLM AI won't veer off course so easily. That said, you may think it's a bit ... boring. But sometimes boring is what you want when you'd like the same answer to always get generated. Let's try this in practice.
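
The OpenAI documentation describes both penalties as adjustments to a token's score before sampling: the frequency penalty is multiplied by how many times the token has already appeared, and the presence penalty is applied once if it has appeared at all. The sketch below applies that adjustment to made-up numbers. The token names, scores, counts, and the helper `ApplyPenalties` are hypothetical and are not part of Semantic Kernel.

```csharp
using System;
using System.Collections.Generic;

// Adjusted score = score - timesSeen * frequencyPenalty - (timesSeen > 0 ? presencePenalty : 0)
double ApplyPenalties(double score, int timesSeen, double presencePenalty, double frequencyPenalty)
{
    double presenceTerm = timesSeen > 0 ? presencePenalty : 0.0;
    return score - timesSeen * frequencyPenalty - presenceTerm;
}

// Hypothetical candidates: raw score and how often the token already appeared in the text.
var candidates = new Dictionary<string, (double Score, int Seen)>
{
    ["robot"] = (2.0, 3),   // already used three times
    ["learns"] = (2.0, 0),  // not used yet
};

foreach (var (token, info) in candidates)
{
    double positive = ApplyPenalties(info.Score, info.Seen, 2.0, 2.0);
    double negative = ApplyPenalties(info.Score, info.Seen, -2.0, -2.0);
    Console.WriteLine($"{token}: penalties=+2 -> {positive}, penalties=-2 -> {negative}");
}
```

In this example the repeated token's score drops from 2 to -6 with positive penalties and rises to 10 with negative ones, while the unseen token stays at 2 either way. That is the mechanism behind "try new words" versus "happily repeat yourself".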

In [5]:
using Microsoft.SemanticKernel.SemanticFunctions;

string myFunNext = """
Term: {{$input}}
A fifteen word story about the term:
""";

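var promptConfigC = new PromptTemplateConfig
{
    Completion =
    {
        Temperature = 0.2, 
        TopP = 0.1,
        // Negative penalties let the model reuse words, which favors repeatable output.
        PresencePenalty = -2,
        FrequencyPenalty = -2
    }
};

var promptConfigD = new PromptTemplateConfig
{
    Completion =
    {
        Temperature = 1,
        TopP = 1.0,
        // Positive penalties push the model toward new words, which adds variability.
        PresencePenalty = 2,
        FrequencyPenalty = 2
    }
};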

var functionConfigC = new SemanticFunctionConfig(promptConfigC, new PromptTemplate( myFunNext, promptConfigC, kernel ));
var functionConfigD = new SemanticFunctionConfig(promptConfigD, new PromptTemplate( myFunNext, promptConfigD, kernel ));

var myFunColder = kernel.RegisterSemanticFunction("MyTest", "C", functionConfigC);
var myFunHotter = kernel.RegisterSemanticFunction("MyTest", "D", functionConfigD);

Console.WriteLine("A semantic function has been registered with two more EXTREME configurations.");


And now let's run them to see what happens:

In [6]:
var input = "AI";
var colderResponses = "";
var hotterResponses = "";
for(var i = 0; i < 3; i++) {
   var colderResult = await kernel.RunAsync(input, myFunColder);
   colderResponses+=colderResult.ToString().Trim() + "\n";
   var hotterResult = await kernel.RunAsync(input, myFunHotter);
   hotterResponses+=hotterResult.ToString().Trim() + "\n";
}

Console.WriteLine($"COLDER responses:\n\n{colderResponses}");
Console.WriteLine($"\nHOTTER responses:\n\n{hotterResponses}");

So now you understand how to "tamp down" the LLM AI's degree of creativity with a COLDER approach. And also how to unleash its freedom of expression with a HOTTER approach.
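
One way to turn this eyeballing into a repeatable check is to score how often a batch of responses agrees with itself. The sketch below is a minimal illustration under stated assumptions: `ConsistencyScore` is a hypothetical helper, and the sample strings are hard-coded stand-ins rather than real model output.

```csharp
using System;
using System.Linq;

// Share of responses that match the most common answer (1.0 means every run agreed).
double ConsistencyScore(string[] responses)
{
    if (responses.Length == 0) return 0.0;
    int mostCommon = responses
        .GroupBy(r => r.Trim().ToLowerInvariant()) // ignore case and surrounding whitespace
        .Max(g => g.Count());
    return (double)mostCommon / responses.Length;
}

// Hard-coded stand-ins for model output; swap in the responses collected by the notebook.
string[] sampleCold = { "Artificial intelligence.", "Artificial intelligence.", "artificial intelligence. " };
string[] sampleHot = { "Smart machines.", "Computers that think.", "Artificial intelligence." };

Console.WriteLine($"cold consistency: {ConsistencyScore(sampleCold):F2}");
Console.WriteLine($"hot consistency:  {ConsistencyScore(sampleHot):F2}");
```

The notebook stores its responses as one newline-joined string, so something like `colderResponses.Split('\n', StringSplitOptions.RemoveEmptyEntries)` would produce an array to score, provided each response fits on a single line. Exact-match scoring is deliberately strict; for longer outputs you would likely want a looser comparison.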

## Step 4: Just one more thing — let's use a different model

Changing the model will also change the result. Let's see that up close by comparing the output of `text-davinci-003` with that of `text-davinci-002` (the older model). We'll start by making a second kernel, called `kernel2`, that uses `text-davinci-002`:

In [None]:
#!import ../config/Settings.cs

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.SemanticFunctions;
using Microsoft.SemanticKernel.Orchestration;

// Reuse the Microsoft.SemanticKernel package loaded in Step 1; loading a second version in the same session would conflict.
// Grab the locally stored credentials from the settings.json file.
var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();

// The builder method names below are assumed to match the 0.15 preview package from Step 1.
// With Azure OpenAI, "text-davinci-002" must be the name of a deployment you have created.
var builder2 = Microsoft.SemanticKernel.Kernel.Builder;

if (useAzureOpenAI)
    builder2.WithAzureTextCompletionService("text-davinci-002", azureEndpoint, apiKey);
else
    builder2.WithOpenAITextCompletionService("text-davinci-002", apiKey, orgId);

IKernel kernel2 = builder2.Build();

And we'll run the same HOTTER vs COLDER scenario with `text-davinci-002` in charge:

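In [None]:
// Register the same prompts on kernel2 so that they run against its model.
var myFunColder2 = kernel2.RegisterSemanticFunction("MyTest", "C", new SemanticFunctionConfig(promptConfigC, new PromptTemplate( myFunNext, promptConfigC, kernel2 )));
var myFunHotter2 = kernel2.RegisterSemanticFunction("MyTest", "D", new SemanticFunctionConfig(promptConfigD, new PromptTemplate( myFunNext, promptConfigD, kernel2 )));

var colderResponses2 = "";
var hotterResponses2 = "";
for(var i = 0; i < 3; i++) {
   var colderResult2 = await kernel2.RunAsync(input, myFunColder2);
   colderResponses2+=colderResult2.ToString().Trim() + "\n";
   // Use kernel2 here as well; the original called the first kernel by mistake.
   var hotterResult2 = await kernel2.RunAsync(input, myFunHotter2);
   hotterResponses2+=hotterResult2.ToString().Trim() + "\n";
}

Console.WriteLine($"COLDER responses (text-davinci-002):\n\n{colderResponses2}");
Console.WriteLine($"\nHOTTER responses (text-davinci-002):\n\n{hotterResponses2}");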

If you look carefully, the COLDER responses should be similar or identical from run to run, while the HOTTER ones will differ. You may also notice that `text-davinci-003` feels less robot-like than `text-davinci-002`. Keep in mind that each model has a different cost structure, so learning how to tune these basic parameters helps you choose the model that matches your quality needs and your budget.

# ⏭️ Next Steps

Run through more advanced examples in the notebooks that are available in our GitHub repo at [https://aka.ms/sk/repo](https://aka.ms/sk/repo).

[You're all done! Visit the main GitHub repo to check out what's new — because LLM AI is changing ever-so-rapidly!](https://aka.ms/sk/repo)

Or, stay a while longer and modify the various parameters to get a better feel for them. Just be careful how many times you call the models in a loop, because the $$$ can quickly add up!