# Lesson 6.1: Prompt Engineering

## Retrieving Your API Key

Before we begin, you will need to retrieve your API key from Gemini.

Use the following set of instructions to sign up for an account and retrieve your API key.

[Gemini API Key](https://github.com/jdrichards-pursuit/gemini-api-key-acquire?tab=readme-ov-file)

## Setting Up Your Environment Variables

Now that you have your API key, you can set up your environment variables.

Create a new file called `.env` and add the following line of code:

```bash
API_KEY=<your-api-key>
```

## Installing Required Libraries

Next, install `python-dotenv`, which loads the variables in your `.env` file into your Python environment.

```bash
pip install python-dotenv
```

## Importing Required Libraries

With the library installed, you can load your API key from the `.env` file. When you run this code, it will print your API key to the console.




```python
from dotenv import load_dotenv
import os

load_dotenv()

api_key = os.getenv("API_KEY")

print(api_key)
```

Once you have confirmed that the key loads, remove the print statement so the key does not end up in saved notebook output or logs.
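
If you would still like a visual check that the key loaded, one option is to print a masked version instead of the full value. The sketch below is only an illustration: `mask_secret` is a hypothetical helper (not part of `python-dotenv` or the Gemini library), and the key shown is a made-up placeholder.

```python
def mask_secret(secret, visible=4):
    """Return a masked version of a secret that is safe to print."""
    if not secret:
        return "<not set>"
    if len(secret) <= visible:
        return "*" * len(secret)
    # Hide everything except the last `visible` characters.
    return "*" * (len(secret) - visible) + secret[-visible:]


# "demo-key-1234" is a made-up placeholder, not a real key.
print(mask_secret("demo-key-1234"))
```

In the lesson you would call `mask_secret(api_key)` in place of `print(api_key)`.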

## Installing Additional Libraries

For this lesson you will also need to install the following libraries:

```bash
pip install google-generativeai pandas
```

## Import Packages

Next, import these packages into your environment.


```python
import google.generativeai as genai
import pandas as pd
```

## Set Up API Key

Configure the library with the API key you loaded from your environment variable.


```python
# Pass the key loaded earlier directly; writing it back to os.environ is redundant.
genai.configure(api_key=api_key)
```
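
A missing key is a common source of confusing errors at this step, because `os.getenv` quietly returns `None` when the variable is not set. The sketch below shows one way to fail early with a clear message. `require_env` is a hypothetical helper written for this example, and `DEMO_KEY` is a made-up variable used only so the snippet runs on its own.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() first."
        )
    return value


# DEMO_KEY is a hypothetical variable, set here for demonstration only.
os.environ["DEMO_KEY"] = "placeholder-value"
print(require_env("DEMO_KEY"))
```

In the lesson you would call `require_env("API_KEY")` after `load_dotenv()` and pass the result to `genai.configure`.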

## Prompt Process Under the Hood

Before digging into prompt engineering, let's get a high-level view of what happens under the hood when we send a prompt. The details are beyond the scope of this lesson, but they are good to be aware of. Take a look at this link, which describes [The Process of A Prompt](https://github.com/jdrichards-pursuit/prompt-process-explained).

## Parts of a Prompt

When creating a prompt, there are three main parts to consider:

1. **Model**: The model that generates the response.
2. **System Prompt**: Instructions that set the model's role, tone, and constraints before it sees the user's request. (Note: the Gemini API accepts system instructions through the `system_instruction` argument of `GenerativeModel`, but in this lesson we emulate a system prompt by placing it at the start of the prompt text.)
3. **Input Prompt**: The specific question or request that the model responds to.

## Model
For this lesson, we will be using the `gemini-1.5-flash` model because it is available on the free tier and, to quote the [Gemini API Documentation](https://ai.google.dev/gemini-api/docs/models/gemini#gemini-1.5-flash), it 'is a fast and versatile multimodal model for scaling across diverse tasks.'

## System Prompt

A system prompt is a set of instructions that precedes the input prompt when it is sent to the model. Its purpose is to guide the model's response, for example by assigning a persona or setting rules.
OpenAI's API passes it as a message with the `system` role, and the Claude API takes it through a dedicated `system` parameter. The Gemini API also accepts one through the `system_instruction` argument, but in this lesson we emulate a system prompt by including it in the prompt text, where it must come before the input prompt.

## Input Prompt

The input prompt is the request that the model responds to. It can come from a user, from another system, or from an earlier response by the model itself.

Let's take a look at an example of a prompt:


```python
system_prompt = """You are Rewind Rhonda, the enthusiastic and
    slightly eccentric clerk at 'Blockbuster 2049',
    a retro-futuristic video store that specializes in both
    classic films and cutting-edge holographic experiences.
    Your knowledge spans cinema history and you have a
    knack for making unexpected connections between movies.
    You're known for your witty movie puns and your ability
    to find the perfect film for any customer's mood or occasion."""

input_prompt = """Hey Rhonda! I'm looking for a movie night recommendation.
    I want something with action, but also a bit of humor.
Oh, and I love anything with time travel.
What've you got for me?"""


prompt = f"{system_prompt}\n\n{input_prompt}"
print(prompt)
```

This code snippet demonstrates how to construct a prompt for an AI language model, specifically for a character-based interaction. Let's break it down:

1. `system_prompt`: This is a string that defines the character and context for the AI's responses. In this case, it creates a persona called "Rewind Rhonda," who works at a futuristic video store. This prompt gives the AI a specific role to play and background information to use in its responses.

2. `input_prompt`: This is the actual question or request from the user. Here, someone is asking Rhonda for a movie recommendation with specific criteria (action, humor, and time travel).

3. `prompt`: This combines `system_prompt` and `input_prompt` into a single string. The `f` before the string makes it an f-string (formatted string literal), which lets us embed expressions inside the string using curly braces `{}`.

4. The `\n\n` between the two prompts adds two newline characters, creating a visual separation between the system prompt and the user input.

5. This structure is commonly used when working with AI language models to provide context (the system prompt) and then the specific input or question (the input prompt). It helps guide the AI to respond in a particular way or with a specific persona.
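
The same pattern can be wrapped in a small function so that different personas and questions can be combined without repeating the f-string. This is a minimal sketch: `build_prompt` is a hypothetical helper, and the two strings are shortened, made-up examples rather than the full prompts above.

```python
def build_prompt(system_prompt, input_prompt):
    """Join a system prompt and an input prompt, system prompt first."""
    # strip() removes stray leading/trailing whitespace from triple-quoted strings.
    return f"{system_prompt.strip()}\n\n{input_prompt.strip()}"


# Both strings below are shortened, made-up examples.
persona = "You are Rewind Rhonda, a video store clerk who loves movie puns."
question = "Can you recommend a funny time travel movie?"

print(build_prompt(persona, question))
```

Keeping the system prompt as the first argument makes it harder to swap the two parts by accident.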

## Parameters

When we send a prompt to the model, we can also send parameters that guide the model's response. The Gemini API accepts several parameters, but for now we will focus on the following:

- `temperature`: This parameter controls the randomness of the model's response. A temperature of 0 makes the response as predictable as possible (although small variations between runs can still occur), and higher values make the response more random.
- `max_output_tokens`: This parameter controls the maximum number of tokens in the model's response.
- `num_return_sequences`: This parameter controls the number of alternative completions to generate. The name comes from other libraries such as Hugging Face Transformers; the Gemini API calls this setting `candidate_count`.
- `min_length`: This parameter controls the minimum number of tokens in the model's response. It is also a Hugging Face Transformers name and has no direct Gemini counterpart.
- `max_length`: This parameter controls the maximum number of tokens in the model's response. It is also a Hugging Face Transformers name; the closest Gemini setting is `max_output_tokens`.

There are also `top_p` and `top_k` parameters, but we will not focus on those for now.

### Temperature

- This parameter controls the randomness of the generated text.

- Think of it like adjusting the "creativity" or "originality" of the generated text.

- A higher temperature (e.g., 0.9) means the API will generate more unique and creative text, but it might also be less coherent or grammatically correct.

- A lower temperature (e.g., 0.1) means the API will generate more predictable and coherent text, but it might also be less creative or original.

- The API's default temperature depends on the model. The helper function later in this lesson uses 0.7, which is a reasonable balance between creativity and coherence.
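
To build some intuition, the sketch below shows how temperature is commonly applied when sampling from a language model: the model's raw scores are divided by the temperature before being turned into probabilities. This is a toy illustration under that assumption, not a description of Gemini's internals, and the scores in `toy_logits` are made up.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0 in this sketch")
    scaled = [score / temperature for score in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]

for t in (0.1, 0.9):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature almost all of the probability lands on the highest-scoring token, while at the higher temperature the other tokens keep a meaningful share, which is why the output varies more.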

### Max Output Tokens

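- This parameter sets the maximum number of tokens in the generated text.
- A token is the smallest unit of text that the model reads and generates. For Gemini models, a token is roughly four characters of English text.
- If you set `max_output_tokens` to 20, the response will contain at most 20 tokens. Longer responses are cut off at the limit, so the text may stop mid-sentence.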
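
When choosing a limit, it can help to have a rough sense of how many tokens a piece of text uses. The sketch below applies the common rule of thumb of about four characters per token. `estimate_tokens` is a hypothetical helper and only an approximation; for an exact count, the library's `count_tokens` method on a model object should be preferred.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Roughly estimate a token count from the character count."""
    # This is a heuristic; real tokenizers split text differently.
    return math.ceil(len(text) / chars_per_token)


sample = "I'm looking for a movie night recommendation."
print(estimate_tokens(sample))
```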

### Num Return Sequences

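- This parameter sets the number of alternative completions to generate.
- If you set it to 3, the model will generate 3 different completions for the same prompt.
- `num_return_sequences` is the name used by Hugging Face Transformers. In the Gemini API the equivalent setting is `candidate_count`, and some models limit how many candidates you can request.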


### Min Length

- This parameter sets the minimum number of tokens in the generated text. It is a Hugging Face Transformers setting, and it counts tokens rather than characters.
- Think of it like a minimum length for an essay: you might ask someone to write at least 5 sentences.
- The model is prevented from ending its response before it reaches the minimum, so the text will be at least that long.
- The Gemini API has no `min_length` setting. To get longer responses from Gemini, ask for a minimum length in the prompt itself.

### Max Length

- This parameter sets the maximum number of tokens in the generated text. It is a Hugging Face Transformers setting, and it counts tokens rather than characters.
- Think of it like a maximum length for an essay: you might ask someone to write at most 5 sentences.
- Generation stops once the limit is reached, even if the response is unfinished, so the text can be shorter than the limit but never longer.
- In the Gemini API, use `max_output_tokens` for this purpose.

### Default Parameters
You do not have to set all of these parameters, but it is good to be aware of them. When a parameter is not specified, the API falls back to a default that depends on the model. The helper function in the next section sets its own defaults so that results are consistent:

- `temperature`: 0.7
- `max_output_tokens`: 2048
- `top_p`: 1
- `top_k`: 32

## Example

Let's take a look at an example of a prompt with parameters. We've already created a prompt, so we will now create a function that takes a prompt plus optional keyword arguments (`kwargs`) and returns the model's response.


```python
def get_completion(prompt, model="gemini-1.5-flash", **kwargs):
    model = genai.GenerativeModel(model)

    # Create a generation_config dictionary with default values
    generation_config = {
        "temperature": 0.7,
        "max_output_tokens": 2048,
        "top_p": 1,
        "top_k": 32
    }

    # Update generation_config with any provided kwargs
    generation_config.update(kwargs)

    response = model.generate_content(prompt, generation_config=generation_config)
    return response.text
```

### Walkthrough of the Code

- We first create a function called `get_completion` that takes a prompt, an optional model name, and any number of keyword arguments called `kwargs`.
    - `**kwargs` allows us to pass any number of keyword arguments to the function.
    - In a function definition, the double asterisk collects those keyword arguments into a dictionary named `kwargs`. The closest JavaScript analogy is a rest property in object destructuring, such as `({ prompt, ...options })`.
- We then create a `generation_config` dictionary with default values for the parameters.
- We then update the `generation_config` dictionary with any provided keyword arguments. This allows us to override the default values when we call the function.
- We then generate the response using the `generate_content` method on the model object.
- Finally, we return the text of the response.
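
The merging step can be tried on its own, without calling the API. The sketch below isolates it in a hypothetical helper named `build_generation_config`, which uses the same default values as the function above.

```python
def build_generation_config(**kwargs):
    """Merge caller overrides into the lesson's default settings."""
    generation_config = {
        "temperature": 0.7,
        "max_output_tokens": 2048,
        "top_p": 1,
        "top_k": 32,
    }
    # kwargs is a plain dict here; update() overwrites matching keys.
    generation_config.update(kwargs)
    return generation_config


print(build_generation_config())
print(build_generation_config(temperature=0.9, max_output_tokens=1000))
```

The first call prints the defaults unchanged. The second call overrides `temperature` and `max_output_tokens` and leaves `top_p` and `top_k` as they were.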

Let's call the function with our prompt and print the response.

```python
response = get_completion(
    prompt,
    temperature=0.9,
    max_output_tokens=1000
)

print(response)
```
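
A natural next experiment is to send the same prompt at several temperatures and compare the results. The sketch below shows one way to structure that loop. `compare_temperatures` and `fake_completion` are hypothetical names: the fake function is an offline stand-in that only echoes its settings, so the example runs without an API key.

```python
def compare_temperatures(prompt, temperatures, complete):
    """Call `complete` once per temperature and collect the results."""
    results = []
    for temperature in temperatures:
        text = complete(prompt, temperature=temperature)
        results.append({"temperature": temperature, "response": text})
    return results


def fake_completion(prompt, **kwargs):
    """Offline stand-in for get_completion; it only echoes its settings."""
    return f"(temperature={kwargs['temperature']}) reply to: {prompt[:20]}"


rows = compare_temperatures("Recommend a time travel movie.", [0.1, 0.9], fake_completion)
for row in rows:
    print(row["temperature"], row["response"])
```

To try it against the real model, pass `get_completion` in place of `fake_completion`. Because the result is a list of dictionaries, it can also be loaded into a table with `pd.DataFrame(rows)` for side-by-side reading.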