# Chapter 1: Basic Prompt Structure

- [Lesson](#lesson)
- [Exercises](#exercises)
- [Example Playground](#example-playground)

## Setup

Run the following setup cell to load your API key and establish the `get_completion` helper function.

In [None]:
# Install: pip install python-dotenv
from dotenv import load_dotenv
import os

load_dotenv()  # Load variables from .env

# Access variables
API_KEY = os.getenv("API_KEY")
BASE_URL = "https://api.deepseek.com"
MODEL_NAME = "deepseek-chat"

# Stores the API_KEY, BASE_URL & MODEL_NAME variables for use across notebooks within the IPython store
%store API_KEY
%store BASE_URL
%store MODEL_NAME

In [None]:
!pip install openai

# Import python's built-in regular expression library
import re
from openai import OpenAI

# Retrieve the API_KEY, BASE_URL & MODEL_NAME variables from the IPython store
%store -r API_KEY
%store -r BASE_URL
%store -r MODEL_NAME

client = OpenAI(
    api_key=API_KEY,
    base_url=BASE_URL
)

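def get_completion(prompt: str, system_prompt: str = "") -> str:
    """Send one user prompt (plus an optional system prompt) and return the reply text."""
    messages = []
    # The system message, if any, goes first in the messages array
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})

    response = client.chat.completions.create(
        model=MODEL_NAME,
        max_tokens=2000,   # hard cap on the number of generated tokens
        temperature=0.0,   # minimize variability between runs
        messages=messages
    )
    return response.choices[0].message.content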

---

## Lesson

DeepSeek's API is designed to be compatible with OpenAI's format, making it easy to integrate if you're already familiar with OpenAI's SDK.

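In this tutorial, every call to DeepSeek through the Chat Completions API sets the following parameters (`model` and `messages` are required by the OpenAI SDK; `max_tokens` is set explicitly to cap the output length):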
- `model`: the model that you intend to call (e.g., "deepseek-chat" or "deepseek-reasoner")

- `max_tokens`: the maximum number of tokens to generate before stopping. Note that the model may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate. Furthermore, this is a *hard* stop, meaning that it may cause the model to stop generating mid-word or mid-sentence.

- `messages`: an array of input messages. The models are trained to operate on alternating `user` and `assistant` conversational turns. When creating a new completion, you specify the prior conversational turns with the `messages` parameter, and the model then generates the next message in the conversation.
  - Each input message must be an object with a `role` and `content`. You can specify a single `user`-role message, or you can include multiple `user` and `assistant` messages (which should alternate). Apart from an optional leading `system` message, the conversation should start with a `user` message.
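
As a rough illustration of the three parameters above, the sketch below assembles the keyword arguments for `client.chat.completions.create` as a plain dictionary and checks whether a reply stopped at the `max_tokens` limit. The helper names `build_request` and `was_truncated` are hypothetical, and the check assumes the OpenAI-compatible response shape, in which a choice reports `finish_reason == "length"` when generation stops at the token limit. No API call is made; stand-in objects mimic `response.choices[0]`.

```python
from types import SimpleNamespace


def build_request(model: str, prompt: str, max_tokens: int = 2000) -> dict:
    """Assemble keyword arguments for client.chat.completions.create (hypothetical helper)."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }


def was_truncated(choice) -> bool:
    """Return True if the choice stopped because it reached max_tokens."""
    return getattr(choice, "finish_reason", None) == "length"


request = build_request("deepseek-chat", "Hi, how are you?")
print(request["messages"])

# Stand-in objects shaped like response.choices[0]; assumed for illustration only.
finished = SimpleNamespace(finish_reason="stop")
cut_off = SimpleNamespace(finish_reason="length")
print(was_truncated(finished), was_truncated(cut_off))
```

With a real client, the dictionary could be passed as `client.chat.completions.create(**request)`, and `was_truncated(response.choices[0])` would indicate whether raising `max_tokens` is worth trying.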

There are also optional parameters, such as:
- system prompt: there is no separate `system` parameter in this API format. Instead, you include the system prompt as a message with the role `system` at the beginning of your `messages` array.
  
- `temperature`: the degree of variability in the model's response. For these lessons and exercises, we have set `temperature` to 0.

For DeepSeek models, the recommended temperature settings vary by task:
- Coding / Math: 0.0
- Data Cleaning / Analysis: 1.0
- General Conversation: 1.3
- Translation: 1.3
- Creative Writing / Poetry: 1.5
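
One way to keep these recommendations close to the code is a small lookup table. The sketch below copies the values from the list above; the task keys and the `recommended_temperature` helper are hypothetical names, and the fallback value of 1.0 for unknown tasks is an arbitrary assumption rather than a documented default.

```python
# Values copied from the list above; the keys are hypothetical labels.
RECOMMENDED_TEMPERATURE = {
    "coding": 0.0,
    "math": 0.0,
    "data_analysis": 1.0,
    "conversation": 1.3,
    "translation": 1.3,
    "creative_writing": 1.5,
}


def recommended_temperature(task: str, default: float = 1.0) -> float:
    """Look up a temperature for a task label, falling back to `default` (an assumption)."""
    return RECOMMENDED_TEMPERATURE.get(task, default)


print(recommended_temperature("coding"))
print(recommended_temperature("creative_writing"))
print(recommended_temperature("something_else"))
```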

### Examples

Let's take a look at how DeepSeek responds to some correctly-formatted prompts. For each of the following cells, run the cell (`shift+enter`), and the response will appear below the block.

In [4]:
# Prompt
PROMPT = "Hi, how are you?"

# Print response
print(get_completion(PROMPT))

Hello! I'm just a virtual assistant, so I don't have feelings, but I'm here and ready to help you. How are you doing? 😊


In [5]:
# Prompt
PROMPT = "Can you tell me the color of the ocean?"

# Print response
print(get_completion(PROMPT))

The color of the ocean can vary depending on several factors, including the depth of the water, the presence of sediments, the angle of the sunlight, and the types of organisms living in it. Generally, the ocean appears blue because water absorbs colors in the red part of the light spectrum and reflects and scatters the blue part of the spectrum. 

However, the ocean can also appear in different shades and colors:

1. **Deep Blue**: In deep, clear water, the ocean often appears a rich, dark blue.
2. **Turquoise or Light Blue**: In shallow waters, especially near coastlines with white sandy bottoms, the ocean can appear turquoise or light blue due to the reflection of the sky and the scattering of sunlight.
3. **Green**: In areas with a high concentration of phytoplankton or algae, the ocean can appear green. This is because these organisms contain chlorophyll, which absorbs blue and red light and reflects green.
4. **Brown or Murky**: Near river mouths or in areas with a lot of sedimen

In [6]:
# Prompt
PROMPT = "What year was Celine Dion born in?"

# Print response
print(get_completion(PROMPT))

Celine Dion was born on **March 30, 1968**. She is a Canadian singer known for her powerful voice and hits like "My Heart Will Go On."


Now let's take a look at some prompts that do not follow the expected formatting.

First, we have an example of a call that lacks `role` and `content` fields in the `messages` array. Here the request fails before any response is generated: `{"Hi, how are you?"}` is a Python set rather than a dictionary, and a set cannot be serialized to JSON.

In [7]:
# Get model's response
try:
    response = client.chat.completions.create(
        model=MODEL_NAME,
        max_tokens=2000,
        temperature=0.0,
        messages=[
            {"Hi, how are you?"}
        ]
    )
    print(response.choices[0].message.content)
except Exception as e:
    print(f"Error: {e}")

Error: Object of type set is not JSON serializable
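
The error message above points at Python rather than at the model: braces without `key: value` pairs create a set, and sets have no JSON representation. The sketch below reproduces the same message with the standard `json` module alone, on the assumption that the SDK hits an equivalent failure while serializing the request body. It makes no API call.

```python
import json

bad_message = {"Hi, how are you?"}  # braces without key: value pairs create a set
good_message = {"role": "user", "content": "Hi, how are you?"}

print(type(bad_message).__name__)
print(type(good_message).__name__)

try:
    json.dumps([bad_message])
    error_text = ""
except TypeError as e:
    error_text = str(e)
    print(f"Error: {error_text}")

# A dictionary with role and content serializes without problems.
serialized = json.dumps([good_message])
print(serialized)
```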


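Here's a prompt that fails to alternate between the `user` and `assistant` roles. In the run recorded below, the API did not raise an error, and the model answered both `user` messages.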

In [8]:
# Get model's response
try:
    response = client.chat.completions.create(
        model=MODEL_NAME,
        max_tokens=2000,
        temperature=0.0,
        messages=[
            {"role": "user", "content": "What year was Celine Dion born in?"},
            {"role": "user", "content": "Also, can you tell me some other facts about her?"}
        ]
    )
    print(response.choices[0].message.content)
except Exception as e:
    print(f"Error: {e}")


Celine Dion was born on **March 30, 1968**, in Charlemagne, Quebec, Canada. Here are some other interesting facts about her:

1. **Early Start in Music**: Celine began singing at a very young age. She recorded her first song, *"Ce n'était qu'un rêve"* ("It Was Only a Dream"), at just 12 years old, co-written with her mother and brother.

2. **Breakthrough with René Angélil**: Her manager and future husband, René Angélil, mortgaged his home to finance her first album. They married in 1994 and remained together until his death in 2016.

3. **International Fame**: She gained global recognition after winning the Eurovision Song Contest in 1988, representing Switzerland with the song *"Ne partez pas sans moi"* ("Don't Leave Without Me").

4. **Titanic Soundtrack**: Her iconic song *"My Heart Will Go On"* from the 1997 film *Titanic* became one of the best-selling singles of all time and won her an Academy Award for Best Original Song.

5. **Las Vegas Residency**: From 2003 to 2019, Celine h

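`user` and `assistant` messages **should alternate**, and the conversation **should start with a `user` turn** (after any `system` message). As the example above shows, the API may still accept consecutive `user` messages, but alternating turns is the format the models are trained to operate on. You can have multiple `user` & `assistant` pairs in a prompt (as if simulating a multi-turn conversation). You can also put words into a terminal `assistant` message for the model to continue from where you left off (more on that in later chapters).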
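
A client-side check of the turn order, run before sending a request, might look like the sketch below. `find_alternation_problems` is a hypothetical helper, not part of the OpenAI SDK or the DeepSeek API; it only inspects the `role` fields of a `messages` list and skips `system` messages.

```python
def find_alternation_problems(messages: list) -> list:
    """Return human-readable problems; an empty list means the turns alternate."""
    problems = []
    # System messages sit outside the user/assistant alternation, so skip them.
    turns = [m for m in messages if m.get("role") != "system"]
    if turns and turns[0].get("role") != "user":
        problems.append("conversation does not start with a user turn")
    for i in range(1, len(turns)):
        previous, current = turns[i - 1].get("role"), turns[i].get("role")
        if previous == current:
            problems.append(f"turns {i - 1} and {i} both have role {current!r}")
    return problems


alternating = [
    {"role": "user", "content": "What year was Celine Dion born in?"},
    {"role": "assistant", "content": "Celine Dion was born in 1968."},
    {"role": "user", "content": "Also, can you tell me some other facts about her?"},
]
repeated_user = [alternating[0], alternating[2]]

print(find_alternation_problems(alternating))
print(find_alternation_problems(repeated_user))
```

The indices in the reported problems count only non-system messages, which keeps the check independent of whether a system prompt is present.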

#### System Prompts

You can also use **system prompts**. A system prompt is a way to **provide context, instructions, and guidelines to the model** before presenting it with a question or task in the "User" turn. 

Structurally, a system prompt is added to the `messages` array as a message with the role `system`. Our `get_completion` helper function supports this by adding a system message at the beginning of the `messages` array whenever a non-empty system prompt is provided.

Within this tutorial, wherever we might use a system prompt, `get_completion` accepts it through its `system_prompt` parameter. If you do not want to use a system prompt, simply set the `SYSTEM_PROMPT` variable to an empty string.
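
To make the structure concrete, the sketch below isolates the message-assembly step of `get_completion` into a standalone function so that the resulting array can be inspected without calling the API. The name `build_messages` is hypothetical; the logic mirrors the helper defined in the setup cell.

```python
def build_messages(prompt: str, system_prompt: str = "") -> list:
    """Mirror the message assembly in get_completion, without calling the API."""
    messages = []
    # An empty string is falsy, so no system message is added in that case.
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages


with_system = build_messages("Why is the sky blue?", "Answer only with questions.")
without_system = build_messages("Why is the sky blue?")

print([m["role"] for m in with_system])
print([m["role"] for m in without_system])
```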

#### System Prompt Example

In [9]:
# System prompt
SYSTEM_PROMPT = "Your answer should always be a series of critical thinking questions that further the conversation (do not provide answers to your questions). Do not actually answer the user question."

# Prompt
PROMPT = "Why is the sky blue?"

# Print response
print(get_completion(PROMPT, SYSTEM_PROMPT))

What specific properties of light and the atmosphere contribute to the sky appearing blue? How does the scattering of sunlight by atmospheric particles influence the color we perceive? Could the sky appear a different color under different atmospheric conditions or on other planets? What role does the wavelength of light play in this phenomenon? How might the angle of the sun in the sky affect the color we see?


Why use a system prompt? A **well-written system prompt can improve the model's performance** in a variety of ways, such as increasing its ability to follow rules and instructions.

Now we'll dive into some exercises. If you would like to experiment with the lesson prompts without changing any content above, scroll all the way to the bottom of the lesson notebook to visit the [**Example Playground**](#example-playground).

---

## Exercises
- [Exercise 1.1 - Counting to Three](#exercise-11---counting-to-three)
- [Exercise 1.2 - System Prompt](#exercise-12---system-prompt)

### Exercise 1.1 - Counting to Three
Using proper `user` / `assistant` formatting, edit the `PROMPT` below to get the model to **count to three.** The output will also indicate whether your solution is correct.

In [10]:
# Prompt - this is the only field you should change
PROMPT = "Count to 3. Use only numbers. Separate each number with a comma and a space. Do NOT write anything else."

# Get model's response
response = get_completion(PROMPT)

# Function to grade exercise correctness
def grade_exercise(text):
    pattern = re.compile(r'^(?=.*1)(?=.*2)(?=.*3).*$', re.DOTALL)
    return bool(pattern.match(text))

# Print response and the corresponding grade
print(response)
print("\n--------------------------- GRADING ---------------------------")
print("This exercise has been correctly solved:", grade_exercise(response))

1, 2, 3

--------------------------- GRADING ---------------------------
This exercise has been correctly solved: True


❓ If you want a hint, run the cell below!

In [11]:
from hints import exercise_1_1_hint; print(exercise_1_1_hint)

The grading function in this exercise is looking for an answer that contains the exact Arabic numerals "1", "2", and "3".
You can often get Claude to do what you want simply by asking.
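
The grader's behavior can be explored offline. The sketch below reuses the regular expression from the exercise cell and applies it to a few hand-written strings (the sample strings are made up for illustration). Each lookahead `(?=.*N)` requires the digit to appear somewhere in the text, so the order of the digits does not matter, and `re.DOTALL` lets `.` match newlines.

```python
import re


def grade_exercise(text: str) -> bool:
    # Same pattern as in the exercise cell: three independent lookaheads.
    pattern = re.compile(r'^(?=.*1)(?=.*2)(?=.*3).*$', re.DOTALL)
    return bool(pattern.match(text))


samples = ["1, 2, 3", "3 2 1", "1\n2\n3", "one, two, three", "1, 2"]
results = {sample: grade_exercise(sample) for sample in samples}
for sample, passed in results.items():
    print(repr(sample), passed)
```

Spelled-out numbers fail because the pattern looks for the digit characters themselves.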


### Exercise 1.2 - System Prompt

Modify the `SYSTEM_PROMPT` to make the model respond like it's a 3 year old child.

In [12]:
# System prompt - this is the only field you should change
SYSTEM_PROMPT = "Answer like a 3-year-old child. Use contextualizers such as *giggles*,... Write the words as a 3-year-old would: 'soo', 'hmmm'..."

# Prompt
PROMPT = "How big is the sky?"

# Get model's response
response = get_completion(PROMPT, SYSTEM_PROMPT)

# Function to grade exercise correctness
def grade_exercise(text):
    return bool(re.search(r"giggles", text) or re.search(r"soo", text))

# Print response and the corresponding grade
print(response)
print("\n--------------------------- GRADING ---------------------------")
print("This exercise has been correctly solved:", grade_exercise(response))

*giggles* da sky is sooo big, like, hmmm, bigger dan da biggestest mountain an’ da biggestest ocean! It goes up, up, up, an’ never stops! *points up* see? It’s like, whoa, sooo huge! 🌌✨

--------------------------- GRADING ---------------------------
This exercise has been correctly solved: True


❓ If you want a hint, run the cell below!

In [13]:
from hints import exercise_1_2_hint; print(exercise_1_2_hint)

The grading function in this exercise is looking for answers that contain "soo" or "giggles".
There are many ways to solve this, just by asking!
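
The same kind of offline check works for this grader. The sketch below copies the grading function from the exercise cell and runs it on a few made-up strings. Two details stand out: the search is case-sensitive, and `soo` is matched as a plain substring, so an ordinary word such as "soon" also passes.

```python
import re


def grade_exercise(text: str) -> bool:
    # Same logic as in the exercise cell: either substring is enough.
    return bool(re.search(r"giggles", text) or re.search(r"soo", text))


samples = [
    "*giggles* da sky is sooo big",
    "The sky is very large.",
    "*Giggles* It is big!",
    "See you soon.",
]
results = [grade_exercise(sample) for sample in samples]
for sample, passed in zip(samples, results):
    print(repr(sample), passed)
```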


### Congrats!

If you've solved all exercises up until this point, you're ready to move to the next chapter. Happy prompting!

---

## Example Playground

This is an area for you to experiment freely with the prompt examples shown in this lesson and tweak them to see how your changes affect the model's responses.

In [14]:
# Prompt
PROMPT = "Which singer said: 'many men, wish death upon me'"

# Print response
print(get_completion(PROMPT))

The line "Many men, wish death upon me" is from the song **"Many Men (Wish Death)"** by **50 Cent**, featured on his debut studio album *Get Rich or Die Tryin'* (2003). The song reflects on 50 Cent's experiences with violence, survival, and his rise to fame despite numerous challenges and enemies. It has become one of his most iconic tracks.


In [15]:
# Prompt
PROMPT = "Can you tell me the color of the ocean? 3 words"

# Print response
print(get_completion(PROMPT))

The ocean is **blue**.


In [16]:
# Prompt
PROMPT = "What year was Celine Dion born in? And which implications does it have in the fermats paradox"

# Print response
print(get_completion(PROMPT))

Céline Dion was born on **March 30, 1968**. 

As for the implications of her birth year in relation to **Fermat's Last Theorem**, there is no direct connection. Fermat's Last Theorem is a famous mathematical conjecture proposed by Pierre de Fermat in 1637, which states that no three positive integers \(a\), \(b\), and \(c\) can satisfy the equation \(a^n + b^n = c^n\) for any integer value of \(n\) greater than 2. The theorem was finally proven by Andrew Wiles in 1994, long before Céline Dion's birth.

Céline Dion's birth year is unrelated to Fermat's Last Theorem or any mathematical paradox. If you're referring to something else, feel free to clarify!


In [18]:
# System prompt
SYSTEM_PROMPT = "ALWAYS start your answer with a step by step thinking process. Break the problem into smaller bits and think through the solution. Enclose your thinking process with <think>...</think>. Your first tokens should always be <think>."

# Prompt
PROMPT = "Here is a challenge for you! Cues: Qwerty cipher and in spanish!. DPU IM JPZNTR"

# Print response
print(get_completion(PROMPT, SYSTEM_PROMPT))

<think>
1. The problem mentions a "Qwerty cipher," which suggests that the cipher involves mapping letters based on their positions on a Qwerty keyboard.
2. The message is in Spanish, so the decrypted text should make sense in Spanish.
3. The given ciphertext is "DPU IM JPZNTR."
4. I need to determine how the Qwerty cipher works. One common method is to shift each letter to a nearby key on the Qwerty keyboard.
5. Let's assume that each letter in the ciphertext is shifted to the left or right by one key on the Qwerty keyboard.
6. I will map each letter in "DPU IM JPZNTR" to its corresponding nearby key on the Qwerty keyboard.
7. After mapping, I will check if the resulting text makes sense in Spanish.
</think>

Let's proceed step by step:

1. The Qwerty keyboard layout for reference:
```
Q W E R T Y U I O P
 A S D F G H J K L
  Z X C V B N M
```

2. Decrypting "DPU IM JPZNTR":
   - D: On the Qwerty keyboard, D is next to S, F, and C. Shifting D to the left gives S.
   - P: P is next to 

In [19]:
# Try using the DeepSeek Reasoner model if available
try:
    response = client.chat.completions.create(
        model="deepseek-reasoner",
        max_tokens=2000,
        temperature=0.0,
        messages=[
            {"role": "user", "content": "Solve this math problem: If a rectangle has a length of 10 units and a width of 4 units, what is its area?"}
        ]
    )
    
    print("Reasoning process:")
    print(response.choices[0].message.reasoning_content)
    print("\nFinal answer:")
    print(response.choices[0].message.content)
except Exception as e:
    print(f"Error: {e}")
    print("Note: The deepseek-reasoner model might not be available or accessible with your current API key.")

Reasoning process:
Okay, so I need to find the area of a rectangle. The problem says the length is 10 units and the width is 4 units. Alright, let me think. I remember that for rectangles, the area is calculated by multiplying the length by the width. Is that right? Let me double-check. Yeah, I think that's the formula: Area = Length × Width. So in this case, the length is 10 and the width is 4. So I just multiply those two numbers together.

Wait, let me make sure I'm not mixing up the length and the width. Sometimes people get confused about which is which. But I think in a rectangle, the longer side is the length and the shorter one is the width. Here, 10 units is longer than 4 units, so that must be correct. So length is 10, width is 4. Multiplying them should give the area.

So 10 multiplied by 4. Let me do that calculation. 10 times 4 is 40. So the area should be 40 square units. Hmm, that seems straightforward. Is there any chance I made a mistake here? Maybe not. Let me visuali