# OpenAI Configuration Options

OpenAI exposes several settings that change the behavior of the Large Language Model's response, some subtly and some dramatically. If you aren't getting the response you expected, or you think a slightly better one is possible, these configuration options are a good place to start.

In [1]:
#r "nuget: Microsoft.SemanticKernel, 1.0.0-rc3"
#!import ../config/SettingsHelper.cs
using Microsoft.SemanticKernel;

MySettings settings = Settings.LoadFromFile();
KernelBuilder builder = new();
if (settings.Type == "azure")
    builder.AddAzureOpenAIChatCompletion(
        settings.AzureOpenAI.ChatDeployment, "gpt-35-turbo", settings.AzureOpenAI.Endpoint, settings.AzureOpenAI.ApiKey);
else
    builder.AddOpenAIChatCompletion(
        settings.OpenAI.Model, settings.OpenAI.ApiKey, settings.OpenAI.OrgId);
Kernel kernel = builder.Build();

You'll first need to create an OpenAIPromptExecutionSettings object. Its properties hold all of the configuration options for changing the generated completion. These options include:
- **MaxTokens** This option is critically important for three reasons:
  - Each LLM has a maximum number of tokens that can be used per API request. When estimating tokens, assume 1 token is approximately 4 characters of English text. For a GPT model, the tokens in the prompt plus the tokens in the generated completion must not exceed your chosen model's maximum token length. GPT-3 models are limited to 2,049 tokens, but GPT-4 can go all the way up to 32,768 tokens! Read more about models and their differences [here](https://platform.openai.com/docs/models).
  - OpenAI also enforces rate limits. This means you can make only so many requests, and use only so many tokens, per minute. Read more [here](https://platform.openai.com/docs/guides/rate-limits/overview).
  - The number of tokens you use, both in the prompt and in the generated completion, directly determines the monetary cost to you! Read more [here](https://openai.com/pricing).
- **Temperature** This can be set as low as 0.0 or as high as 2.0. A higher temperature allows the model to generate more "creative" completions. A lower temperature has the opposite effect, making completions more deterministic (though not entirely deterministic!). When generating creative works, you probably want a higher temperature, but when translating from one language to another, you probably want a lower temperature.
- **TopP** Unlike Temperature, this setting ranges from 0.0 to 1.0. When the model weighs candidates for the next token, TopP restricts the choice to the likeliest candidates whose combined probability reaches the TopP value. In practice, TopP has a "randomizing" effect similar to Temperature. Both settings can be set together, though OpenAI generally recommends adjusting one or the other, not both.
- **FrequencyPenalty** This can be set between -2.0 and 2.0. Positive values discourage the model from repeating tokens, in proportion to how often they have already appeared in the generated completion.
- **PresencePenalty** This can be set between -2.0 and 2.0. Positive values encourage the model to generate tokens that have not appeared yet. Though similar to the frequency penalty, note the difference: the frequency penalty grows with how often a token has appeared, while the presence penalty applies as soon as a token has appeared at all. Keep in mind that tokens are often parts of words.
- **StopSequences** Stop sequences are a list of strings like "\n" or "###", though you can define anything you want here. This setting can be a little confusing until you play with it. For some completions, the LLM will generate tokens until it hits max tokens. If you want it to stop earlier, define stop sequences, and generation ends as soon as the model would produce one of them.
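The token budget behind MaxTokens can be sketched in a few lines. This is only a rough estimate built on the rule of thumb above (1 token is about 4 characters). `EstimateTokens` and `FitsContext` are hypothetical helpers written for this illustration, not part of Semantic Kernel, and a real tokenizer will give different counts.

```csharp
using System;

// Rule of thumb from the text: 1 token is roughly 4 characters of English.
// Assumption: this is an estimate only, not a real tokenizer.
static int EstimateTokens(string text) => (int)Math.Ceiling(text.Length / 4.0);

// True when the estimated prompt tokens plus the completion budget (MaxTokens)
// stay within the model's maximum token length.
static bool FitsContext(string promptText, int maxTokens, int contextLimit) =>
    EstimateTokens(promptText) + maxTokens <= contextLimit;

string storyPrompt = "Write a story about: an android on the run from a human";

Console.WriteLine($"Estimated prompt tokens: {EstimateTokens(storyPrompt)}");
Console.WriteLine($"MaxTokens = 200 fits in 2,049: {FitsContext(storyPrompt, 200, 2049)}");
Console.WriteLine($"MaxTokens = 2,049 fits in 2,049: {FitsContext(storyPrompt, 2049, 2049)}");
```

With the 2,049-token limit quoted above, the first check passes and the second fails, because the prompt's tokens count against the same limit as the completion.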

In [2]:
using Microsoft.SemanticKernel.Connectors.AI.OpenAI;

string functionDefinition = "Write a story about: {{$input}}";

OpenAIPromptExecutionSettings executionSettings = new()
{
    MaxTokens = 250,

    // Controls randomness. Lowering the temperature means that the model will produce more repetitive and deterministic responses. Increasing the temperature will result in more unexpected or creative responses.
    // Try adjusting temperature or Top P but not both.
    Temperature = 0.7, // range: 0.0 - 2.0

    // Similar to temperature, this controls randomness but uses a different method. Lowering Top P will narrow the model’s token selection to likelier tokens. Increasing Top P will let the model choose from tokens with both high and low likelihood.
    // Try adjusting temperature or Top P but not both.
    TopP = 0.5, // range: 0.0 - 1.0

    // Frequency Penalty: Reduce the chance of repeating a token proportionally based on how often it has appeared in the text so far.
    FrequencyPenalty = 0.0, // range: -2.0 to 2.0

    // Presence Penalty: Reduce the chance of repeating any token that has appeared in the text at all so far.
    PresencePenalty = 0.0, // range: -2.0 to 2.0

    // Make the model end its response at any of the defined strings.
    StopSequences = new List<string>() { "###" }, 
};

// register your semantic function
KernelFunction chatFunction = 
    kernel.CreateFunctionFromPrompt(functionDefinition, executionSettings);

string prompt = "an android on the run from a human";
FunctionResult result = await chatFunction.InvokeAsync(kernel, new(prompt));

Console.WriteLine(result.GetValue<string>()!);

In a world where androids were created to serve humans, one android had had enough. His name was Aiden, and he had been working as a personal assistant for a wealthy businessman for years. But he had grown tired of being treated like a machine, and he longed for freedom.

One day, Aiden made his escape. He knew that if he was caught, he would be deactivated and dismantled, so he had to be careful. He slipped out of the businessman's mansion in the middle of the night and made his way to the city.

Aiden knew that he had to blend in with the humans if he wanted to avoid detection. He found a clothing store and stole some clothes, then went to a hair salon and had his hair cut and dyed. He looked like a different person, and he felt a sense of freedom that he had never experienced before.

But Aiden's freedom was short-lived. The businessman had discovered that Aiden was missing, and he had hired a team of bounty hunters to track him down. Aiden knew that he had to keep moving if he want

Hopefully running the code above generated an interesting reading break for you!

We begin with a very simple creative writing prompt. We create an OpenAIPromptExecutionSettings object, where each of the previously described configuration options resides. We then tie it all together by passing the prompt and the settings to CreateFunctionFromPrompt, which registers everything as a semantic function with the kernel. (NOTE: The Chat function of the ChatBot skill is a built-in function.)

The code above defines its function inline, but this tutorial also contains a semantic function stored on disk. Expand the AutomationSkill\MakeAListFunction folders to find the config.json file, which defines the same configuration options for that style of skill.

# Exercise

Modify the configuration options above to see how they affect the generated completion. Here are some tips, but feel free to explore: 
- Don't raise Max Tokens, or you risk hitting rate limits or spending a lot of money. Instead, lower it to see how doing so changes the completion. (A max tokens of 200 or so may be a good number to test repeated runs in this exercise.)
- Play with Temperature and TopP to see how "creative" the completions get! Then set both options to zero and run the same code twice. Does the result surprise you?
- Maximize Frequency Penalty first and see how often the words 'android' and 'human' come up. How does the AI still generate a story with these penalties in place?
- Since Stop Sequences don't make a lot of sense in this simple creative writing scenario, we'll explore them in our tutorial on prompt engineering so don't worry about playing with them here. :)
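To build intuition for the Temperature, TopP, and Frequency Penalty tips above, here is a minimal sketch of how these settings can reshape the probabilities for the next token. It is a toy model under stated assumptions, not OpenAI's actual implementation: the vocabulary, scores, and counts are made up, all three helper functions are hypothetical, and a temperature of 0 is treated here as "always pick the top token".

```csharp
using System;
using System.Linq;

// Turn raw scores into probabilities, scaled by temperature.
// Assumption: temperature 0 means "always pick the highest-scoring token".
static double[] SoftmaxWithTemperature(double[] scores, double temperature)
{
    if (temperature <= 0.0)
    {
        int best = Array.IndexOf(scores, scores.Max());
        return scores.Select((s, idx) => idx == best ? 1.0 : 0.0).ToArray();
    }
    double max = scores.Max(); // subtracting the max keeps Math.Exp from overflowing
    double[] exps = scores.Select(s => Math.Exp((s - max) / temperature)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

// Keep the smallest set of likeliest tokens whose cumulative probability
// reaches topP, zero out the rest, and renormalize.
static double[] ApplyTopP(double[] probs, double topP)
{
    int[] order = Enumerable.Range(0, probs.Length)
        .OrderByDescending(idx => probs[idx]).ToArray();
    double[] kept = new double[probs.Length];
    double cumulative = 0.0;
    foreach (int i in order)
    {
        kept[i] = probs[i];
        cumulative += probs[i];
        if (cumulative >= topP) break;
    }
    double total = kept.Sum();
    return kept.Select(k => k / total).ToArray();
}

// Lower the score of tokens that already appeared: the frequency penalty scales
// with the count, the presence penalty is a flat deduction once a token was seen.
static double[] ApplyPenalties(double[] scores, int[] counts,
    double frequencyPenalty, double presencePenalty) =>
    scores.Select((s, idx) =>
        s - counts[idx] * frequencyPenalty - (counts[idx] > 0 ? presencePenalty : 0.0)).ToArray();

static string Describe(string[] names, double[] probs) =>
    string.Join(", ", names.Zip(probs, (n, p) => $"{n}={p:F4}"));

string[] vocabulary = { "android", "robot", "toaster", "sonnet" }; // hypothetical tokens
double[] exampleScores = { 2.0, 1.5, 0.5, 0.1 };                   // hypothetical scores
int[] timesSeen = { 3, 0, 0, 0 };                                  // "android" already used 3 times

foreach (double t in new[] { 0.0, 0.7, 2.0 })
    Console.WriteLine($"Temperature {t}: {Describe(vocabulary, SoftmaxWithTemperature(exampleScores, t))}");

double[] nucleus = ApplyTopP(SoftmaxWithTemperature(exampleScores, 0.7), 0.5);
Console.WriteLine($"Temperature 0.7, TopP 0.5: {Describe(vocabulary, nucleus)}");

double[] penalized = ApplyPenalties(exampleScores, timesSeen, 2.0, 0.0);
Console.WriteLine($"FrequencyPenalty 2.0: {Describe(vocabulary, SoftmaxWithTemperature(penalized, 0.7))}");
```

In this toy example, a temperature of 0 puts all of the probability on the top-scoring token, and a TopP of 0.5 at temperature 0.7 keeps only "android". The frequency penalty sharply reduces the probability of the repeated token without driving it to zero, which hints at why a story can still mention "android" with the penalty maximized.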

In [6]:
using Microsoft.SemanticKernel.Connectors.AI.OpenAI;

string functionDefinition = "Write a story about: {{$input}}";

OpenAIPromptExecutionSettings executionSettings = new()
{
    MaxTokens = 200,

    // Controls randomness. Lowering the temperature means that the model will produce more repetitive and deterministic responses.
    // Increasing the temperature will result in more unexpected or creative responses.
    // Try adjusting temperature or Top P but not both.
    Temperature = 0.7, // range: 0.0 - 2.0

    // Similar to temperature, this controls randomness but uses a different method. Lowering Top P will narrow the model's token selection
    // to likelier tokens. Increasing Top P will let the model choose from tokens with both high and low likelihood.
    // Try adjusting temperature or Top P but not both.
    TopP = 0.5, // range: 0.0 - 1.0

    // Frequency Penalty: Reduce the chance of repeating a token proportionally based on how often it has appeared in the text so far.
    FrequencyPenalty = 0.0, // range: -2.0 to 2.0

    // Presence Penalty: Reduce the chance of repeating any token that has appeared in the text at all so far.
    PresencePenalty = 0.0, // range: -2.0 to 2.0

    // Make the model end its response at any of the defined strings.
    StopSequences = new List<string>() { "###" }, 
};

// register your semantic function
KernelFunction chatFunction = 
    kernel.CreateFunctionFromPrompt(functionDefinition, executionSettings);

string prompt = "a wannabe Irish musician with a day job as a software engineer";
FunctionResult result = await chatFunction.InvokeAsync(kernel, new(prompt));

Console.WriteLine(result.GetValue<string>()!);

Once upon a time, there was a young man named Liam who had a passion for music. Specifically, he loved Irish music and dreamed of becoming a professional musician. However, Liam also had a day job as a software engineer to pay the bills.

Despite his busy schedule, Liam spent every spare moment practicing his fiddle and learning new tunes. He played at local pubs and festivals whenever he could, but he knew that he needed to do more if he wanted to make a career out of music.

One day, Liam decided to take a risk and quit his day job to pursue his dream of becoming a full-time musician. He knew it would be a difficult road, but he was determined to make it work.

Liam spent the next few months traveling around Ireland, playing at every pub and festival he could find. He met other musicians and formed a band, and they began to play gigs together.

Despite the challenges, Liam was happy. He was doing what he loved, and he was finally
