<a target="_blank" href="https://colab.research.google.com/github/cohere-ai/notebooks/blob/main/notebooks/Parameters_for_Controlling_Outputs.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Parameters for Controlling Outputs

In this notebook, you’ll learn about the parameters you can use to control the Chat endpoint's outputs.

*Read the accompanying [blog post here](https://docs.cohere.com/docs/parameters-for-controlling-outputs).*

## Overview

The notebook has three sections:
- **Model Type** - Select a variation of the Command model.
- **Randomness** - Use the `temperature` parameter to control the level of randomness of the model.
- **Conciseness** - Set the `preamble` parameter to an empty string to make model responses more concise.

## Setup

We'll start by installing the tools we'll need and then importing them.

In [1]:
! pip install cohere -q

In [None]:
#@title Enable text wrapping in Google Colab

from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

Fill in your Cohere API key in the next cell. If you haven't yet, [sign up for Cohere](https://os.cohere.ai/) (it's free), then get your API key [here](https://dashboard.cohere.com/api-keys).

In [2]:
import cohere

# Paste your API key below. Remember not to share it publicly.
co = cohere.Client("COHERE_API_KEY")  # Replace with your Cohere API key
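
If you would rather not paste the key into the notebook at all, one option is to read it from an environment variable. The sketch below is only an illustration: the helper `load_api_key` and the variable name `COHERE_API_KEY` are assumptions made for this example, not part of the Cohere SDK.

```python
import os


def load_api_key(env_var="COHERE_API_KEY", environ=None):
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you set in your own environment.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With the variable set, the client could then be created with `co = cohere.Client(load_api_key())`.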

## Model Type

When calling the Chat endpoint, use the `model` parameter to choose from several variations of the Command model. 

[See the documentation](https://docs.cohere.com/docs/models#command) for the most up-to-date list of available Cohere models.

In [4]:
response = co.chat(message="Hello",
                   model="command-r")
print(response.text)

Hi! Hello there! How's it going? I hope you're having a fantastic day so far. Is there anything I can help you with?


## Randomness

You can use the `temperature` parameter to control the level of randomness of the model. It takes a value between 0 and 1. As you increase the temperature, the model's responses become more varied and creative. Temperature can be tuned for different problems, but the default of 0.3 is a good starting point for most use cases.
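
To build intuition for why a higher temperature means more randomness, here is a minimal sketch of temperature-scaled softmax, a common way language models turn token scores into sampling probabilities. It is a generic illustration rather than Cohere's implementation, and the function name and scores are made up for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature == 0:
        # Treat temperature 0 as greedy decoding: all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]
for t in (0, 0.3, 1):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

In this sketch, lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it across the alternatives, so sampled outputs vary more.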

Consider the example below, where we ask the model to generate alternative titles for a blog post. Prompting the endpoint five times when the temperature is set to 0 yields the same output each time. 

In [7]:
message = """Suggest a more exciting title for a blog post titled: Intro to Retrieval-Augmented Generation. \
Respond in a single line."""

for _ in range(5):
    response = co.chat(message=message,
                       temperature=0)
    print(response.text)

"Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation"
"Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation"
"Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation"
"Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation"
"Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation"


However, if we increase the temperature to the maximum value of 1, the model starts to vary its proposals, although some responses may still repeat.

In [10]:
message = """Suggest a more exciting title for a blog post titled: Intro to Retrieval-Augmented Generation. \
Respond in a single line."""

for _ in range(5):
    response = co.chat(message=message,
                       temperature=1)
    print(response.text)

"Unleashing the Power of Generation: A Guide to the Future of Retrieval-Augmented Creation"
"Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation"
"Unleashing the Power of RAG: A Guide to the Future of Generation"
"Unleashing the Power of Augmented Generation: A Guide to the Future of AI Text Generation"
"Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation"

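One rough way to quantify this variation is to count the distinct responses. The sketch below does this for the five titles printed above at `temperature=1` (surrounding quotes dropped); the helper name `summarize_variety` is hypothetical.

```python
from collections import Counter


def summarize_variety(outputs):
    """Count how many distinct responses appear and how often each occurs."""
    counts = Counter(outputs)
    return len(counts), counts.most_common()


# The five responses printed above for temperature=1
titles = [
    "Unleashing the Power of Generation: A Guide to the Future of Retrieval-Augmented Creation",
    "Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation",
    "Unleashing the Power of RAG: A Guide to the Future of Generation",
    "Unleashing the Power of Augmented Generation: A Guide to the Future of AI Text Generation",
    "Unleashing the Power of Generation: A Guide to the Exciting World of Retrieval-Augmented Creation",
]

distinct, ranked = summarize_variety(titles)
print(f"{distinct} distinct titles out of {len(titles)}")
for title, n in ranked:
    print(n, title)
```

Running the same helper on the `temperature=0` responses would report a single distinct title, since all five were identical.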

## Conciseness

Sometimes, the model provides more context than you need to address a query. For instance, consider what happens when we ask the model a simple question: "How many eggs are in one dozen?"

In [11]:
response = co.chat(message="How many eggs are in one dozen?")
print(response.text)

There are 12 eggs in one dozen. The term "dozen" is used to represent the number 12, and it's commonly used when referring to measurements or quantities, especially for eggs. So, when you buy or hear about a dozen eggs, it means you're dealing with 12 eggs.


The model answers the question, but it uses multiple sentences when the first sentence would have been sufficient.

We can get the model to shorten its response by setting the `preamble` parameter to an empty string.

In [13]:
response = co.chat(message="How many eggs are in one dozen?",
                   preamble="")
print(response.text)

There are 12 eggs in one dozen.
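
If you find yourself combining these parameters often, a small wrapper can keep calls tidy. This is only a sketch: `chat_text` and its `concise` flag are hypothetical names introduced here, and the function simply forwards the `message`, `model`, `temperature`, and `preamble` arguments used earlier in this notebook to whatever client object you pass in.

```python
def chat_text(client, message, model=None, temperature=None, concise=False):
    """Call client.chat with only the options that were explicitly set."""
    options = {"message": message}
    if model is not None:
        options["model"] = model
    if temperature is not None:
        options["temperature"] = temperature
    if concise:
        # An empty preamble is the setting used above to shorten responses.
        options["preamble"] = ""
    return client.chat(**options).text
```

For example, `chat_text(co, "How many eggs are in one dozen?", concise=True)` would make the same call as the cell above.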


In this notebook, you explored three ways to control the Chat endpoint's outputs: choosing a model with the `model` parameter, adjusting randomness with `temperature`, and shortening responses by setting `preamble` to an empty string.