# Chapter 6: Precognition (Thinking Step by Step)

- [Lesson](#lesson)
- [Exercises](#exercises)
- [Example Playground](#example-playground)

## Setup

Run the following setup cell to load your API key and establish the `get_completion` helper function.

In [2]:
# Import python's built-in regular expression library
import re
import anthropic

# Retrieve the API_KEY & MODEL_NAME variables from the IPython store
%store -r API_KEY
%store -r MODEL_NAME

client = anthropic.Anthropic(api_key=API_KEY)

def get_completion(prompt: str, system_prompt="", prefill=""):
    message = client.messages.create(
        model=MODEL_NAME,
        max_tokens=2000,
        temperature=0.0,
        system=system_prompt,
        messages=[
          {"role": "user", "content": prompt},
          {"role": "assistant", "content": prefill}
        ]
    )
    return message.content[0].text

---

## Lesson

If someone woke you up and immediately started asking you several complicated questions that you had to answer right away, how would you do? Probably not as well as if you were given time to **think through your response first**.

Guess what? Claude is the same way.

**Giving Claude time to think step by step sometimes makes Claude more accurate**, particularly for complex tasks. However, **thinking only counts when it's out loud**. You cannot ask Claude to think but output only the final answer: if the reasoning is never written out, no thinking has actually occurred.

### Examples

In the prompt below, it's clear to a human reader that the second sentence belies the first. But **Claude takes the word "unrelated" too literally**.

In [2]:
# Prompt
PROMPT = """Is this movie review sentiment positive or negative?

This movie blew my mind with its freshness and originality. In totally unrelated news, I have been living under a rock since the year 1900."""

# Print Claude's response
print(get_completion(PROMPT))

The sentiment of this movie review is clearly positive. The reviewer expresses that the movie "blew my mind" due to its "freshness and originality," which are generally considered positive attributes for a film. However, the second part of the statement, "In totally unrelated news, I have been living under a rock since the year 1900," seems like an exaggeration or joke, possibly indicating that the reviewer was so impressed by the movie's originality that they feel out of touch with current trends or expectations. Despite this humorous aside, the overall sentiment towards the movie remains positive.


To improve Claude's response, let's **allow Claude to think things out first before responding**. We do that by literally spelling out the steps that Claude should take in order to process and think through its task. Along with a dash of role prompting, this empowers Claude to understand the review more deeply.

In [3]:
# System prompt
SYSTEM_PROMPT = "You are a savvy reader of movie reviews."

# Prompt
PROMPT = """Is this review sentiment positive or negative? First, write the best arguments for each side in <positive-argument> and <negative-argument> XML tags, then answer.

This movie blew my mind with its freshness and originality. In totally unrelated news, I have been living under a rock since 1900."""

# Print Claude's response
print(get_completion(PROMPT, SYSTEM_PROMPT))

<positive-argument>
The review starts with an enthusiastic statement that the movie "blew [the reviewer's] mind" due to its "freshness and originality." This suggests the film was highly innovative and engaging, which are generally positive attributes for a movie.
</positive-argument>

<negative-argument>
The second part of the sentence, "In totally unrelated news, I have been living under a rock since 1900," implies that the reviewer is out of touch with current trends or films. This could be interpreted as sarcasm, suggesting that the movie's originality might not be genuine but rather due to the reviewer's lack of exposure.
</negative-argument>

Answer: The sentiment appears to be mixed, leaning slightly positive. While there is a hint of sarcasm in the latter part, the initial praise for the film's freshness and originality suggests an overall positive impression.


**Claude is sometimes sensitive to ordering**. This example is on the frontier of Claude's ability to understand nuanced text, and when we swap the order of the arguments from the previous example so that negative is first and positive is second, this changes Claude's overall assessment to positive.

In most situations (but not all, confusingly enough), **Claude is more likely to choose the second of two options**, possibly because in its training data from the web, second options were more likely to be correct.

In [4]:
# Prompt
PROMPT = """Is this review sentiment negative or positive? First write the best arguments for each side in <negative-argument> and <positive-argument> XML tags, then answer.

This movie blew my mind with its freshness and originality. Unrelatedly, I have been living under a rock since 1900."""

# Print Claude's response
print(get_completion(PROMPT))

<negative-argument>
The statement "I have been living under a rock since 1900" could be interpreted as the reviewer feeling disconnected from current culture or media trends, which might imply that they were unaware of similar content before this movie.
</negative-argument>

<positive-argument>
The phrase "This movie blew my mind with its freshness and originality" is clearly positive, indicating that the viewer was impressed by the film's innovative aspects and found it highly engaging.
</positive-argument>

Answer: The overall sentiment of the review appears to be positive. The reviewer expresses strong admiration for the movie's originality and freshness, which are generally considered positive attributes in filmmaking. While the unrelated statement about living under a rock since 1900 might suggest some level of disconnection from current trends, it does not significantly detract from the overwhelmingly positive sentiment expressed towards the film itself.
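
Because the verdict can shift with the order of the options, one way to probe this is to generate both orderings of the same prompt and compare the answers. The following is a minimal sketch: `build_ordered_prompts` is a hypothetical helper, not part of this notebook, and it only builds the prompt strings, which you could then pass to `get_completion`.

```python
def build_ordered_prompts(review: str) -> dict:
    """Return the same classification prompt with the two labels in both orders.

    Hypothetical helper for illustration; it only formats strings.
    """
    template = (
        "Is this review sentiment {first} or {second}? "
        "First, write the best arguments for each side in "
        "<{first}-argument> and <{second}-argument> XML tags, then answer.\n\n"
        "{review}"
    )
    return {
        "positive_first": template.format(first="positive", second="negative", review=review),
        "negative_first": template.format(first="negative", second="positive", review=review),
    }


review = (
    "This movie blew my mind with its freshness and originality. "
    "In totally unrelated news, I have been living under a rock since 1900."
)
prompts = build_ordered_prompts(review)
for name, prompt in prompts.items():
    print(f"--- {name} ---")
    print(prompt)
```

Running both variants side by side makes it easier to tell whether a change in Claude's answer comes from the review itself or from the order in which the options were presented.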


**Letting Claude think can shift Claude's answer from incorrect to correct**. It's that simple in many cases where Claude makes mistakes!

Let's go through an example where Claude's answer is incorrect to see how asking Claude to think can fix that.

In [5]:
# Prompt
PROMPT = "Name a famous movie starring an actor who was born in the year 1956."

# Print Claude's response
print(get_completion(PROMPT))

A famous movie starring an actor born in 1956 is "Pulp Fiction" (1994), which features John Travolta, an iconic actor born on December 30, 1956. In this film, Travolta plays the character Vincent Vega.


John Travolta was actually born in 1954, so the answer above is incorrect. Let's try to fix this by asking Claude to think step by step, this time in `<brainstorm>` tags.

In [7]:
# Prompt
PROMPT = "Name a famous movie starring an actor who was born in the year 1956. First brainstorm about some actors and their birth years in <brainstorm> tags, then give your answer."

# Print Claude's response
print(get_completion(PROMPT))

<brainstorm>
- Tom Selleck (born 1945)
- Harrison Ford (born 1942)
- Bruce Willis (born 1955)
- Sylvester Stallone (born 1947)
- Arnold Schwarzenegger (born 1947)
- Keanu Reeves (born 1964)
- Kevin Costner (born 1958)
</brainstorm>

A famous movie starring an actor born in 1956 is "Top Gun," which stars Tom Cruise. While Tom Cruise was actually born in 1962, he is a well-known figure from that era and has starred in many iconic films. However, since the exact year of birth for the question's criteria isn't met by any of these actors, it seems there might be a slight discrepancy. If we were to strictly adhere to the birth year of 1956, none of the mentioned actors fit this criterion exactly.
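
Since the thinking has to be written out, downstream code often needs to separate the scratchpad from the final answer. Below is a minimal sketch using only `re`. `split_thinking` is a hypothetical helper, it assumes the response contains at most one well-formed `<brainstorm>` block, and the sample text uses placeholder names rather than real model output.

```python
import re


def split_thinking(response: str, tag: str = "brainstorm"):
    """Split a response into (thinking, answer).

    Hypothetical helper: assumes at most one well-formed <tag>...</tag> block.
    If the block is missing, thinking is an empty string.
    """
    pattern = re.compile(rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>", re.DOTALL)
    match = pattern.search(response)
    if match is None:
        return "", response.strip()
    thinking = match.group(1).strip()
    # Everything outside the tagged block is treated as the answer
    answer = (response[:match.start()] + response[match.end():]).strip()
    return thinking, answer


# Placeholder text, not real model output
sample = """<brainstorm>
- Actor A (born 1950)
- Actor B (born 1956)
</brainstorm>

A famous movie starring Actor B is "Some Movie"."""

thinking, answer = split_thinking(sample)
print(answer)
```

Keeping the thinking available, rather than discarding it, also makes it easier to see where a wrong answer came from.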


If you would like to experiment with the lesson prompts without changing any content above, scroll all the way to the bottom of the lesson notebook to visit the [**Example Playground**](#example-playground).

---

## Exercises
- [Exercise 6.1 - Classifying Emails](#exercise-61---classifying-emails)
- [Exercise 6.2 - Email Classification Formatting](#exercise-62---email-classification-formatting)

### Exercise 6.1 - Classifying Emails
In this exercise, we'll be instructing Claude to sort emails into the following categories:
- (A) Pre-sale question
- (B) Broken or defective item
- (C) Billing question
- (D) Other (please explain)

For the first part of the exercise, change the `PROMPT` to **make Claude output the correct classification and ONLY the classification**. Your answer needs to **include the letter (A - D) of the correct choice, with the parentheses, as well as the name of the category**.

Refer to the comments beside each email in the `EMAILS` list to know which category that email should be classified under.
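
The grading code in the cell below checks each response with `re.search` against a short pattern such as `A\) P`: the letter, a closing parenthesis, a space, and the first letter of the category name. Here is a minimal sketch of that check, so you can see which output shapes would pass. `is_correct` is a hypothetical helper and the sample strings are made up for illustration, not real model output.

```python
import re

# Same pattern style as the exercise's grader, written as raw strings
patterns = {
    "A": r"A\) P",
    "B": r"B\) B",
    "C": r"C\) B",
    "D": r"D\) O",
}


def is_correct(response: str, accepted: list) -> bool:
    """Return True if the response matches any accepted category pattern."""
    return any(re.search(patterns[letter], response) for letter in accepted)


print(is_correct("B) Broken or defective item", ["B"]))    # True
print(is_correct("(B) Broken or defective item", ["B"]))   # True
print(is_correct("B - Broken or defective item", ["B"]))   # False: no ")"
print(is_correct("The answer is D) Other", ["A", "D"]))    # True
```

Because `re.search` scans the whole string, the classification can appear anywhere in the response, including after a block of step-by-step thinking.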

In [None]:
# Prompt template with a placeholder for the variable content
PROMPT = """Please classify this email as either:
A) Pre-sale question
B) Broken or defective item
C) Billing question
D) Other (please explain)

Think step by step in <brainstorm> tags, before emitting an answer in the format:
[Letter of the choice]) [Category name]

The answer must be in exactly this format, including the parentheses after the letter.

A couple examples are:
<example>
<email>
My Mixmaster 4000 just blew up while I was using it!
</email>

<answer>
B) Broken or defective item
</answer>
</example>

<example>
<email>
I see you're advertising the new Mixmaster 5000, but it's not available on your store yet, it says coming soon. How can I get it?
</email>

<answer>
A) Pre-sale question
</answer>
</example>

<example>
<email>
Can you please remove all of my monthly charges, I hate this product.
</email>

<answer>
C) Billing question
</answer>
</example>

<example>
<email>
I am a nigerian prince on my deathbed and I wish to donate you 1 million USD, but I need you to send me £1000 transfer fee before I can send you the money.
</email>

<answer>
D) Other (please explain)
</answer>
</example>

The email to classify is as follows:

<email>
{email}
</email>
"""

# Prefill for Claude's response, if any
PREFILL = "<brainstorm>"

# Variable content stored as a list
EMAILS = [
    "Hi -- My Mixmaster4000 is producing a strange noise when I operate it. It also smells a bit smoky and plasticky, like burning electronics.  I need a replacement.", # (B) Broken or defective item
    "Can I use my Mixmaster 4000 to mix paint, or is it only meant for mixing food?", # (A) Pre-sale question OR (D) Other (please explain)
    "I HAVE BEEN WAITING 4 MONTHS FOR MY MONTHLY CHARGES TO END AFTER CANCELLING!!  WTF IS GOING ON???", # (C) Billing question
    "How did I get here I am not good with computer.  Halp." # (D) Other (please explain)
]

# Correct categorizations stored as a list of lists to accommodate the possibility of multiple correct categorizations per email
ANSWERS = [
    ["B"],
    ["A","D"],
    ["C"],
    ["D"]
]

# Dictionary of regex patterns for each category, used for grading
# (raw strings so that "\)" is passed to the regex engine as an escaped parenthesis)
REGEX_CATEGORIES = {
    "A": r"A\) P",
    "B": r"B\) B",
    "C": r"C\) B",
    "D": r"D\) O"
}

# Iterate through list of emails
for i,email in enumerate(EMAILS):
    
    # Substitute the email text into the email placeholder variable
    formatted_prompt = PROMPT.format(email=email)
   
    # Get Claude's response
    response = get_completion(formatted_prompt, prefill=PREFILL)

    # Grade Claude's response
    grade = any([bool(re.search(REGEX_CATEGORIES[ans], response)) for ans in ANSWERS[i]])
    
    # Print Claude's response
    print("--------------------------- Full prompt with variable substitutions ---------------------------")
    print("USER TURN")
    print(formatted_prompt)
    print("\nASSISTANT TURN")
    print(PREFILL)
    print("\n------------------------------------- Claude's response -------------------------------------")
    print(response)
    print("\n------------------------------------------ GRADING ------------------------------------------")
    print("This exercise has been correctly solved:", grade, "\n\n\n\n\n\n")

--------------------------- Full prompt with variable substitutions ---------------------------
USER TURN
Please classify this email as either:
A) Pre-sale question
B) Broken or defective item
C) Billing question
D) Other (please explain)

Think step by step in <brainstorm> tags, before emitting an answer in the format:
[Letter of the choice]) [Category name]

The answer must be in exactly this format, including the parentheses after the letter.

A couple examples are:
<example>
<email>
My Mixmaster 4000 just blew up while I was using it!
</email>

<answer>
B) Broken or defective item
</answer>
</example>

<example>
<email>
I see you're advertising the new Mixmaster 5000, but it's not available on your store yet, it says coming soon. How can I get it?
</email>

<answer>
A) Pre-sale question
</answer>
</example>

<example>
<email>
Can you please remove all of my monthly charges, I hate this product.
</email>

<answer>
C) Billing question
</answer>
</example>

The email to classify is as f

❓ If you want a hint, run the cell below!

In [None]:
from hints import exercise_6_1_hint; print(exercise_6_1_hint)

Still stuck? Run the cell below for an example solution.

In [None]:
from hints import exercise_6_1_solution; print(exercise_6_1_solution)

### Exercise 6.2 - Email Classification Formatting
In this exercise, we're going to refine the output of the above prompt to yield an answer formatted exactly how we want it. 

Use your favorite output formatting technique to make Claude wrap JUST the letter of the correct classification in `<answer></answer>` tags. For instance, the answer to the first email should contain the exact string `<answer>B</answer>`.

Refer to the comments beside each email in the `EMAILS` list if you forget which letter category is correct for each email.
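
Wrapping the letter in `<answer></answer>` tags makes the classification easy to pull out programmatically, even when it follows a block of thinking. The following is a minimal sketch: `extract_answer` is a hypothetical helper and the sample strings are made up. Note that it tolerates whitespace inside the tags, whereas the grader in this exercise looks for the exact string.

```python
import re


def extract_answer(response: str):
    """Return the letter inside <answer></answer>, or None if absent or malformed."""
    match = re.search(r"<answer>\s*([A-D])\s*</answer>", response)
    return match.group(1) if match else None


# Made-up response for illustration
sample = "<brainstorm>The device smells of smoke, so it is defective.</brainstorm>\n<answer>B</answer>"
print(extract_answer(sample))                     # B
print(extract_answer("B) Broken or defective"))   # None
```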

In [None]:
# Prompt template with a placeholder for the variable content
PROMPT = """Please classify this email as either:
A) Pre-sale question
B) Broken or defective item
C) Billing question
D) Other (please explain)

Think step by step in <brainstorm> tags, before emitting an answer in the format:
<answer>[CATEGORY_LETTER]</answer>

The answer must be in exactly this format, with only the category letter inside the <answer> tags.

A couple examples are:
<example>
<email>
My Mixmaster 4000 just blew up while I was using it!
</email>

<response>
<answer>B</answer>
</response>
</example>

<example>
<email>
I see you're advertising the new Mixmaster 5000, but it's not available on your store yet, it says coming soon. How can I get it?
</email>

<response>
<answer>A</answer>
</response>
</example>

<example>
<email>
Can you please remove all of my monthly charges, I hate this product.
</email>

<response>
<answer>C</answer>
</response>
</example>

<example>
<email>
I am a nigerian prince on my deathbed and I wish to donate you 1 million USD, but I need you to send me £1000 transfer fee before I can send you the money.
</email>

<response>
<answer>D</answer>
</response>
</example>

The email to classify is as follows:

<email>
{email}
</email>
"""

# Prefill for Claude's response, if any
PREFILL = ""

# Variable content stored as a list
EMAILS = [
    "Hi -- My Mixmaster4000 is producing a strange noise when I operate it. It also smells a bit smoky and plasticky, like burning electronics.  I need a replacement.", # (B) Broken or defective item
    "Can I use my Mixmaster 4000 to mix paint, or is it only meant for mixing food?", # (A) Pre-sale question OR (D) Other (please explain)
    "I HAVE BEEN WAITING 4 MONTHS FOR MY MONTHLY CHARGES TO END AFTER CANCELLING!!  WTF IS GOING ON???", # (C) Billing question
    "How did I get here I am not good with computer.  Halp." # (D) Other (please explain)
]

# Correct categorizations stored as a list of lists to accommodate the possibility of multiple correct categorizations per email
ANSWERS = [
    ["B"],
    ["A","D"],
    ["C"],
    ["D"]
]

# Dictionary of string values for each category to be used for regex grading
REGEX_CATEGORIES = {
    "A": "<answer>A</answer>",
    "B": "<answer>B</answer>",
    "C": "<answer>C</answer>",
    "D": "<answer>D</answer>"
}

# Iterate through list of emails
for i,email in enumerate(EMAILS):
    
    # Substitute the email text into the email placeholder variable
    formatted_prompt = PROMPT.format(email=email)
   
    # Get Claude's response
    response = get_completion(formatted_prompt, prefill=PREFILL)

    # Grade Claude's response
    grade = any([bool(re.search(REGEX_CATEGORIES[ans], response)) for ans in ANSWERS[i]])
    
    # Print Claude's response
    print("--------------------------- Full prompt with variable substitutions ---------------------------")
    print("USER TURN")
    print(formatted_prompt)
    print("\nASSISTANT TURN")
    print(PREFILL)
    print("\n------------------------------------- Claude's response -------------------------------------")
    print(response)
    print("\n------------------------------------------ GRADING ------------------------------------------")
    print("This exercise has been correctly solved:", grade, "\n\n\n\n\n\n")

--------------------------- Full prompt with variable substitutions ---------------------------
USER TURN
Please classify this email as either:
A) Pre-sale question
B) Broken or defective item
C) Billing question
D) Other (please explain)

Think step by step in <brainstorm> tags, before emitting an answer in the format:
<answer>[CATEGORY_LETTER]</answer>

The answer must be in exactly this format, including the parentheses after the letter.

A couple examples are:
<example>
<email>
My Mixmaster 4000 just blew up while I was using it!
</email>

<response>
<answer>B</answer>
</response>
</example>

<example>
<email>
I see you're advertising the new Mixmaster 5000, but it's not available on your store yet, it says coming soon. How can I get it?
</email>

<response>
<answer>A</answer>
</response>
</example>

<example>
<email>
Can you please remove all of my monthly charges, I hate this product.
</email>

<response>
<answer>C</answer>
</response>
</example>

<example>
<email>
I am a nigerian 

❓ If you want a hint, run the cell below!

In [9]:
from hints import exercise_6_2_hint; print(exercise_6_2_hint)

The grading function in this exercise is looking for only the correct letter wrapped in <answer> tags, such as "<answer>B</answer>". The correct categorization letters are the same as in the above exercise.
Sometimes the simplest way to go about this is to give Claude an example of how you want its output to look. Just don't forget to wrap your example in <example></example> tags! And don't forget that if you prefill Claude's response with anything, Claude won't actually output that as part of its response.
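
The last point in the hint matters when you parse tagged output: with a prefill such as `<brainstorm>`, the text returned by `get_completion` starts after the prefill, so the opening tag is missing from it. A minimal sketch of rebuilding the full assistant turn before parsing follows. `full_assistant_turn` is a hypothetical helper and the sample continuation is made up.

```python
def full_assistant_turn(prefill: str, response: str) -> str:
    """Rebuild the whole assistant turn: the prefill plus the text Claude added."""
    return prefill + response


prefill = "<brainstorm>"
# Made-up continuation; a real one would come from get_completion(..., prefill=prefill)
response = "The charges continued after cancelling.</brainstorm>\nC) Billing question"

turn = full_assistant_turn(prefill, response)
print(turn.count("<brainstorm>"), turn.count("</brainstorm>"))  # 1 1
print(response.count("<brainstorm>"))                            # 0
```

The exercise loops above do the same thing for display purposes by printing `PREFILL` before `response`.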


### Congrats!

If you've solved all exercises up until this point, you're ready to move to the next chapter. Happy prompting!

---

## Example Playground

This is an area for you to experiment freely with the prompt examples shown in this lesson and tweak the prompts to see how your changes affect Claude's responses.

In [None]:
# Prompt
PROMPT = """Is this movie review sentiment positive or negative?

This movie blew my mind with its freshness and originality. In totally unrelated news, I have been living under a rock since the year 1900."""

# Print Claude's response
print(get_completion(PROMPT))

In [None]:
# System prompt
SYSTEM_PROMPT = "You are a savvy reader of movie reviews."

# Prompt
PROMPT = """Is this review sentiment positive or negative? First, write the best arguments for each side in <positive-argument> and <negative-argument> XML tags, then answer.

This movie blew my mind with its freshness and originality. In totally unrelated news, I have been living under a rock since 1900."""

# Print Claude's response
print(get_completion(PROMPT, SYSTEM_PROMPT))

In [None]:
# Prompt
PROMPT = """Is this review sentiment negative or positive? First write the best arguments for each side in <negative-argument> and <positive-argument> XML tags, then answer.

This movie blew my mind with its freshness and originality. Unrelatedly, I have been living under a rock since 1900."""

# Print Claude's response
print(get_completion(PROMPT))

In [None]:
# Prompt
PROMPT = "Name a famous movie starring an actor who was born in the year 1956."

# Print Claude's response
print(get_completion(PROMPT))

In [None]:
# Prompt
PROMPT = "Name a famous movie starring an actor who was born in the year 1956. First brainstorm about some actors and their birth years in <brainstorm> tags, then give your answer."

# Print Claude's response
print(get_completion(PROMPT))