
---

### Reminder: This 📘 `Python` notebook can be run from VS Code with [these prerequisites](../PREREQS.md).

#### How to use this notebook: 

* Just read the text and scroll along until you run into code blocks.
* Code blocks have computer code inside them — hover over the block and you can run the code.
* Run the code by hitting the ▶️ "play" button to the left. If the code runs you'll see a ✔️. If not, you'll get a ❌.
* The output and status of a code block appear just below it, so you may need to scroll down to see them.
* Sometimes a code block will ask you for input in a hard-to-notice dialog box 👆 at the top of your notebook window. 

---

# Bonus Recipe: 🕵️ Test your prompts
## 🧪 Non-deterministic computation requires testing

Because the output from LLM AI is non-deterministic, you're going to want to test your prompts to gain some assurance that they behave in a similar way each time.
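
One way to put a number on "similar each time" is to call the same function several times and measure how often the most common answer comes back. Here is a minimal sketch of that idea. The names `consistency_score` and `fake_function` are made up for illustration (they are not part of Semantic Kernel), and the stand-in function is only there so the sketch runs without an API key.

```python
from collections import Counter
from typing import Callable


def consistency_score(fn: Callable[[str], str], prompt_input: str, runs: int = 5) -> float:
    """Call fn repeatedly and return the share of runs that gave the most common output."""
    outputs = [str(fn(prompt_input)).strip() for _ in range(runs)]
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / runs


# Hypothetical stand-in for a semantic function (an assumption, not a real model call).
def fake_function(text: str) -> str:
    return f"{text}: machines that learn"


score = consistency_score(fake_function, "AI")
print(score)  # the stand-in always answers the same way, so this prints 1.0
```

A score near 1.0 suggests very repeatable output, while a low score suggests a lot of variation. Note that this compares outputs by exact match, which is a strict measure: two answers that differ by one word count as different.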

## Step 1: Instantiate a 🔥 kernel so we can start cooking

In [None]:
!python -m pip install semantic-kernel==0.3.3.dev

In [None]:
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion, OpenAITextCompletion

kernel = sk.Kernel()

useAzureOpenAI = False

# Configure AI service used by the kernel
if useAzureOpenAI:
    deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()
    kernel.add_text_completion_service("dv", AzureTextCompletion(deployment, endpoint, api_key))
else:
    api_key, org_id = sk.openai_settings_from_dot_env()
    kernel.add_text_completion_service("dv", OpenAITextCompletion("text-davinci-003", api_key, org_id))

## Step 2: We're going to make one semantic function and run it at two different temperatures

Different temperatures result in different outcomes — we can illustrate that as follows:

In [None]:
myFun = """
{{$input}}
Define the term above in less than five words.
"""

myFunCold = kernel.create_semantic_function(prompt_template=myFun,
                skill_name="MyTest", function_name="A", temperature=0.2, top_p=0.1)
myFunHot = kernel.create_semantic_function(prompt_template=myFun,
                skill_name="MyTest", function_name="B", temperature=1, top_p=1.0)

print("A semantic function has been registered with two configurations.")


Now we can run these two functions over multiple iterations to see how they differ, or not. Note that our configurations are as follows:

### Configuration A

* Temperature = 0.2 <-- the OpenAI API accepts 0 (low randomness) to 2 (high randomness); this notebook stays between 0 and 1
* TopP = 0.1 <-- ranges from 0 (low variability) to 1 (high variability)

### Configuration B

* Temperature = 1 <-- the OpenAI API accepts 0 (low randomness) to 2 (high randomness); this notebook stays between 0 and 1
* TopP = 1.0 <-- ranges from 0 (low variability) to 1 (high variability)
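
To build some intuition for what these two knobs do, here is a toy sketch of temperature scaling followed by top-p (nucleus) filtering over a made-up three-token vocabulary. The function names and the scores in `toy_logits` are invented for illustration, and real models may implement the details differently. The sketch also assumes a temperature greater than 0.

```python
import math
from typing import Dict


def softmax_with_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the likeliest tokens until their cumulative probability reaches top_p, then renormalize."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Invented scores for three candidate next tokens (illustration only).
toy_logits = {"intelligence": 2.0, "robots": 1.0, "magic": 0.5}

cold = top_p_filter(softmax_with_temperature(toy_logits, 0.2), 0.1)
hot = top_p_filter(softmax_with_temperature(toy_logits, 1.0), 1.0)
print(cold)  # only the single likeliest token survives
print(hot)   # all three tokens remain candidates
```

With the cold settings, sampling can only ever pick one token, which is why those responses tend to repeat. With the hot settings, every token keeps a real chance of being chosen.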

In [None]:
skInput = "AI"
coldResponses = ""
hotResponses = ""
for i in range(5):
    coldResult = myFunCold(skInput)
    coldResponses += str(coldResult).strip() + "\n"
    hotResult = myFunHot(skInput)
    hotResponses += str(hotResult).strip() + "\n"

print(f"COLD responses:\n\n{coldResponses}")
print(f"\nHOT responses:\n\n{hotResponses}")

## Step 3: Let's get freezing cold to aim towards a higher degree of determinism in outputs.

You should see that the COLD responses vary little, whereas the HOT responses vary a lot. Two more parameters that affect variability are `PresencePenalty` and `FrequencyPenalty`. As the official definitions go:

* `PresencePenalty`: "Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics."
* `FrequencyPenalty`: "Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim."
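
OpenAI's parameter documentation describes the two penalties as adjustments subtracted from a token's score before sampling: the frequency penalty scales with how many times the token has already appeared, while the presence penalty applies once if the token has appeared at all. A minimal sketch of that adjustment, using invented token scores and a hypothetical helper named `apply_penalties`, might look like this:

```python
from collections import Counter
from typing import Dict, List


def apply_penalties(
    logits: Dict[str, float],
    generated: List[str],
    presence_penalty: float,
    frequency_penalty: float,
) -> Dict[str, float]:
    """Lower (or raise, for negative penalties) the score of tokens already generated."""
    counts = Counter(generated)
    adjusted = {}
    for tok, score in logits.items():
        count = counts[tok]
        seen = 1.0 if count > 0 else 0.0
        adjusted[tok] = score - count * frequency_penalty - seen * presence_penalty
    return adjusted


# Invented scores: both tokens start out equally likely (illustration only).
toy_logits = {"robot": 1.0, "dream": 1.0}
so_far = ["robot", "robot"]  # "robot" has already been generated twice

discourage = apply_penalties(toy_logits, so_far, presence_penalty=2.0, frequency_penalty=2.0)
encourage = apply_penalties(toy_logits, so_far, presence_penalty=-2.0, frequency_penalty=-2.0)
print(discourage)  # {'robot': -5.0, 'dream': 1.0}
print(encourage)   # {'robot': 7.0, 'dream': 1.0}
```

Positive penalties make the already-used token less attractive than the fresh one, and negative penalties do the opposite.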

So if you have the LLM AI generating a longer piece of text, positive settings push it toward new words instead of repeating old ones, by penalizing either a word's mere presence in the text so far (`PresencePenalty`) or how often it has already been used (`FrequencyPenalty`). With negative settings, it will happily repeat itself, using the same words over and over.

The benefit of using the same words repetitively is that the LLM AI won't veer off course so easily. That said, you may think it's a bit ... boring. But sometimes boring is what you want when you'd like the same answer to always get generated. Let's try this in practice.

In [None]:
myFunNext = """
Term: {{$input}}
A fifteen word story about the term:
"""

myFunColder = kernel.create_semantic_function(prompt_template=myFunNext,
                skill_name="MyTest", function_name="C", temperature=0.2,
                top_p=0.1, presence_penalty=2, frequency_penalty=2)
myFunHotter = kernel.create_semantic_function(prompt_template=myFunNext,
                skill_name="MyTest", function_name="D", temperature=1,
                presence_penalty=-2, frequency_penalty=-2, top_p=1.0)

print("A semantic function has been registered with two more EXTREME configurations.")

And now let's run them to see what happens:

In [None]:
skInput = "AI"
colderResponses = ""
hotterResponses = ""

for i in range(3):
    colderResult = myFunColder(skInput)
    colderResponses += str(colderResult).strip() + "\n"
    hotterResult = myFunHotter(skInput)
    hotterResponses += str(hotterResult).strip() + "\n"

print(f"COLDER responses:\n\n{colderResponses}")
print(f"\nHOTTER responses:\n\n{hotterResponses}")

So now you understand how to "tamp down" the LLM AI's degree of creativity with a COLDER approach. And also how to unleash its freedom of expression with a HOTTER approach.
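
If you want to turn that observation into an actual test, one option is to count how many distinct responses a configuration produced and compare the count against a limit you choose. The sketch below assumes responses are joined with newlines, as in the loops above. The helper names and the sample strings are invented for illustration and are not real model output.

```python
from typing import Set


def distinct_responses(joined: str) -> Set[str]:
    """Split a newline-joined response string into its unique, non-empty lines."""
    return {line.strip() for line in joined.splitlines() if line.strip()}


def check_variability(joined: str, max_distinct: int) -> bool:
    """Return True when the number of distinct responses stays within the limit."""
    return len(distinct_responses(joined)) <= max_distinct


# Made-up samples standing in for colderResponses and hotterResponses.
sample_colder = "AI learned to help people.\nAI learned to help people.\nAI learned to help people.\n"
sample_hotter = "AI dreamed of stars.\nA robot baked bread.\nCode sang at dawn.\n"

print(check_variability(sample_colder, max_distinct=1))  # True
print(check_variability(sample_hotter, max_distinct=1))  # False
```

One caveat: a single response that itself spans several lines would be counted as several responses here, so this check fits short, one-line outputs best.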

## Step 4: Just one more thing — let's use a different model

Changing the model will also change the result. Let's see that up close by comparing the output of `text-davinci-003` with that of `text-davinci-002` (the older model). We'll start by making a second kernel, called `kernel2`, that uses `text-davinci-002`:

In [None]:
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion, OpenAITextCompletion

kernel2 = sk.Kernel()

useAzureOpenAI = False

# Configure AI service used by the kernel
if useAzureOpenAI:
    deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()
    kernel2.add_text_completion_service("dv", AzureTextCompletion(deployment, endpoint, api_key))
else:
    api_key, org_id = sk.openai_settings_from_dot_env()
    kernel2.add_text_completion_service("dv", OpenAITextCompletion("text-davinci-002", api_key, org_id))

And we'll run the same HOTTER vs COLDER scenario with `text-davinci-002` in charge:

In [None]:
myFunColder2 = kernel2.create_semantic_function(prompt_template=myFunNext,
                skill_name="MyTest", function_name="C2", temperature=0.2,
                top_p=0.1, presence_penalty=2, frequency_penalty=2)
myFunHotter2 = kernel2.create_semantic_function(prompt_template=myFunNext,
                skill_name="MyTest", function_name="D2", temperature=1,
                presence_penalty=-2, frequency_penalty=-2, top_p=1.0)

colderResponses2 = ""
hotterResponses2 = ""

for i in range(3):
    colderResult2 = myFunColder2(skInput)
    colderResponses2 += str(colderResult2).strip() + "\n"
    # Use the function registered on kernel2 so the HOTTER run really comes from text-davinci-002
    hotterResult2 = myFunHotter2(skInput)
    hotterResponses2 += str(hotterResult2).strip() + "\n"

print(f"COLDER responses (text-davinci-002):\n\n{colderResponses2}")
print(f"\nHOTTER responses (text-davinci-002):\n\n{hotterResponses2}")

If you look carefully, the COLDER responses should be similar or the same. The HOTTER ones, however, will be different. You may notice that `text-davinci-003` feels less robot-like than `text-davinci-002`. But don't forget that each model has a different cost structure, so learning how to tune these basic parameters lets you choose the model that matches your quality needs and economics.
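
If eyeballing "similar or the same" feels too loose, a rough similarity ratio can help. The sketch below uses `SequenceMatcher` from Python's standard `difflib` module, which compares strings character by character, so it measures surface likeness rather than meaning. The helper names and sample responses are invented for illustration, and the averaging helper assumes both lists are non-empty.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Return a ratio from 0.0 to 1.0 of how alike two strings are, character by character."""
    return SequenceMatcher(None, a, b).ratio()


def average_pairwise_similarity(first: List[str], second: List[str]) -> float:
    """Average the similarity of every response in one list against every response in the other."""
    scores = [similarity(a, b) for a in first for b in second]
    return sum(scores) / len(scores)


# Invented sample responses standing in for the output of two different models.
model_a = ["AI learned to paint sunsets."]
model_b = ["AI learned to paint sunsets.", "AI learned to paint."]

print(average_pairwise_similarity(model_a, model_a))  # identical lists score 1.0
print(average_pairwise_similarity(model_a, model_b))  # somewhere between 0 and 1
```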

# ⏭️ Next Steps

Run through more advanced examples in the notebooks that are available in our GitHub repo at [https://aka.ms/sk/repo](https://aka.ms/sk/repo).

[You're all done! Visit the main GitHub repo to check out what's new — because LLM AI is changing ever-so-rapidly!](https://aka.ms/sk/repo)

Or stay a while longer and modify the various parameters to get a better feel for them. Just be careful how many times you call the models in a loop, because the $$$ can quickly add up!
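
One simple way to guard against a runaway loop is to wrap the function you call in a small counter that refuses to go past a limit. This is only a sketch: `CallBudget` and `fake_model` are hypothetical names invented here, not part of Semantic Kernel, and the stand-in function does not call any real model.

```python
from typing import Any, Callable


class CallBudget:
    """Wrap a callable and refuse to run it more than max_calls times."""

    def __init__(self, fn: Callable[..., Any], max_calls: int) -> None:
        self.fn = fn
        self.max_calls = max_calls
        self.calls = 0

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        if self.calls >= self.max_calls:
            raise RuntimeError(f"Call budget of {self.max_calls} exhausted")
        self.calls += 1
        return self.fn(*args, **kwargs)


# Hypothetical stand-in for a paid model call (an assumption for illustration).
def fake_model(text: str) -> str:
    return text.upper()


guarded = CallBudget(fake_model, max_calls=2)
print(guarded("ai"))   # AI
print(guarded("llm"))  # LLM
try:
    guarded("one too many")
except RuntimeError as err:
    print(err)  # Call budget of 2 exhausted
```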