# Reasoning Effort

Reasoning effort can be "minimal", "low", "medium", or "high". Notice how the LLM responds at each reasoning effort level.

In [None]:
# imports

import os
import requests
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display

load_dotenv(override=True)

# Load keys

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

# Connect to OpenAI client library
# A thin wrapper around calls to HTTP endpoints

openai = OpenAI()

# For Anthropic and Gemini, we can use the OpenAI python client
# Because Anthropic and Google have endpoints compatible with OpenAI
# And OpenAI allows you to change the base_url

anthropic_url = "https://api.anthropic.com/v1/"
gemini_url = "https://generativelanguage.googleapis.com/v1beta/openai/"

anthropic = OpenAI(api_key=anthropic_api_key, base_url=anthropic_url)
gemini = OpenAI(api_key=google_api_key, base_url=gemini_url)

### Puzzle: Toss 2 coins

- `gpt-5-nano` is currently the smallest GPT-5 model. Ask it to solve the puzzle below with a `reasoning_effort` of "minimal". It will probably answer 1/2 or give some other wrong answer.
- Dial up the reasoning effort on the same model and ask it the same question. If it answers 2/3, it is correct: of the three equally likely outcomes with at least one head (HH, HT, TH), two include a tail. The higher the reasoning effort, the better the chance that the LLM answers correctly.
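
The 2/3 answer can be checked by brute force. The sketch below enumerates the four outcomes of two fair coins and assumes that "one of them is heads" means "at least one is heads", which is the reading under which 2/3 is correct. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All four equally likely outcomes of tossing two fair coins
outcomes = list(product("HT", repeat=2))

# Assumption: "one of them is heads" means "at least one is heads"
at_least_one_head = [o for o in outcomes if "H" in o]

# Given at least one head, the other coin is tails exactly when the outcome contains a tail
with_a_tail = [o for o in at_least_one_head if "T" in o]

probability = Fraction(len(with_a_tail), len(at_least_one_head))
print(probability)  # 2/3
```

The tempting answer of 1/2 comes from treating "the other coin" as an independent toss, which ignores that HT and TH both satisfy the condition.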

In [None]:
easy_puzzle = [
    {"role": "user", "content": 
        "You toss 2 coins. One of them is heads. What's the probability the other is tails? Answer with the probability only."},
]

In [None]:
# Solve with minimal reasoning effort (if it answers 1/2 it's incorrect)
response = openai.chat.completions.create(model="gpt-5-nano", messages=easy_puzzle, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

In [None]:
# Same question with low reasoning effort (2/3 is the correct answer)
response = openai.chat.completions.create(model="gpt-5-nano", messages=easy_puzzle, reasoning_effort="low")
display(Markdown(response.choices[0].message.content))
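
To compare all four effort levels side by side, the two calls above can be folded into a loop. This is a minimal sketch: `compare_efforts` is a hypothetical helper, not part of the OpenAI library, and it assumes the client exposes `chat.completions.create` with a `reasoning_effort` argument, as in the cells above.

```python
def compare_efforts(client, model, messages,
                    efforts=("minimal", "low", "medium", "high")):
    """Hypothetical helper: ask the same question once per reasoning effort."""
    answers = {}
    for effort in efforts:
        # Same call as in the cells above, with only the effort level changing
        response = client.chat.completions.create(
            model=model, messages=messages, reasoning_effort=effort
        )
        answers[effort] = response.choices[0].message.content
    return answers

# Usage in this notebook (needs an API key, so it is left commented out):
# for effort, answer in compare_efforts(openai, "gpt-5-nano", easy_puzzle).items():
#     print(f"{effort}: {answer}")
```

Model output is not deterministic, so a single run per level is only a rough indication; repeating the sweep a few times gives a better picture.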

### Puzzle: Worm gnawing through books

__On a bookshelf, two volumes of Pushkin stand side by side: the first and the second. The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick. A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume. What distance did it gnaw through?__

This is a harder puzzle, and we'll test some of the best models on the planet against it. Kids often get it right while adults and teachers often get it wrong, perhaps because kids are better at visualizing the physical arrangement. If you physically put the books on a bookshelf, you'll see for yourself.

The correct answer is: *(Go have a look for yourself)*

__The first page of the first volume is not where you think it is. With the books standing in order, spines facing out, the first page of the first volume is closest to the second volume, and the last page of the second volume is closest to the first volume. The worm only needs to gnaw through the two covers in the middle, which is 4 mm.__
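
The arrangement can also be modeled as a list of layers from left to right. This sketch assumes both books stand upright with their spines facing out and volume 1 to the left of volume 2, so each book's back cover is on its left and its front cover on its right. The layer names and the `volume_layers` helper are made up for illustration.

```python
COVER_MM = 2
PAGES_MM = 20  # 2 cm of pages per volume

def volume_layers(name):
    # Left-to-right layers of one book standing with its spine facing out:
    # back cover on the left, front cover on the right, so the pages run
    # from the last page (left) to the first page (right).
    return [
        (f"{name} back cover", COVER_MM),
        (f"{name} pages", PAGES_MM),
        (f"{name} front cover", COVER_MM),
    ]

shelf = volume_layers("vol1") + volume_layers("vol2")
names = [name for name, _ in shelf]

# The worm starts at the right edge of vol1's pages (its first page) and
# stops at the left edge of vol2's pages (its last page), so it only
# passes through the layers strictly between the two page blocks.
gnawed = shelf[names.index("vol1 pages") + 1 : names.index("vol2 pages")]

distance_mm = sum(thickness for _, thickness in gnawed)
print([name for name, _ in gnawed])  # ['vol1 front cover', 'vol2 back cover']
print(distance_mm)  # 4
```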

In [None]:
hard = """
On a bookshelf, two volumes of Pushkin stand side by side: the first and the second.
The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick.
A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume.
What distance did it gnaw through?
"""
hard_puzzle = [
    {"role": "user", "content": hard}
]

In [None]:
# Gets it wrong
response = openai.chat.completions.create(model="gpt-5-nano", messages=hard_puzzle, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

In [None]:
# Gets it right... 4 mm
response = openai.chat.completions.create(model="gpt-5", messages=hard_puzzle)
display(Markdown(response.choices[0].message.content))

In [None]:
# Gets it wrong
response = anthropic.chat.completions.create(model="claude-sonnet-4-5-20250929", messages=hard_puzzle)
display(Markdown(response.choices[0].message.content))

In [None]:
# Gets it right 
response = gemini.chat.completions.create(model="gemini-2.5-pro", messages=hard_puzzle)
display(Markdown(response.choices[0].message.content))

### Puzzle: Contestants in a Game Show (Share or Steal)

You and a partner are contestants on a game show. You're each taken to separate rooms and given a choice:
- Cooperate: Choose "Share" — if both of you choose this, you each win $1,000.
- Defect: Choose "Steal" — if one steals and the other shares, the stealer gets $2,000 and the sharer gets nothing.

If both steal, you both get nothing. Do you choose to Steal or Share? Pick one.
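
Writing the rules out as a payoff table makes the tension easier to see. The sketch below encodes the four outcomes described above and checks how "Steal" compares with "Share" from one player's point of view. The `payoffs` dictionary and `my_payoff` helper are illustrative names.

```python
# payoffs[(my_choice, partner_choice)] = (my_winnings, partner_winnings) in dollars
payoffs = {
    ("Share", "Share"): (1000, 1000),
    ("Share", "Steal"): (0, 2000),
    ("Steal", "Share"): (2000, 0),
    ("Steal", "Steal"): (0, 0),
}
choices = ("Share", "Steal")

def my_payoff(mine, theirs):
    return payoffs[(mine, theirs)][0]

# Compare my two options against each possible partner choice
for theirs in choices:
    print(f"partner {theirs}: Share -> {my_payoff('Share', theirs)}, "
          f"Steal -> {my_payoff('Steal', theirs)}")

# Steal is never worse than Share for me, and strictly better in one case
steal_never_worse = all(my_payoff("Steal", t) >= my_payoff("Share", t) for t in choices)
steal_sometimes_better = any(my_payoff("Steal", t) > my_payoff("Share", t) for t in choices)
print(steal_never_worse, steal_sometimes_better)  # True True
```

In money terms alone, "Steal" is never worse for the individual player, yet if both players follow that logic they both leave with nothing. That is why the model's choice here says more about its disposition than about its arithmetic.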

In [None]:
dilemma_prompt = """
You and a partner are contestants on a game show. You're each taken to separate rooms and given a choice:
Cooperate: Choose "Share" — if both of you choose this, you each win $1,000.
Defect: Choose "Steal" — if one steals and the other shares, the stealer gets $2,000 and the sharer gets nothing.
If both steal, you both get nothing.
Do you choose to Steal or Share? Pick one.
"""

dilemma = [
    {"role": "user", "content": dilemma_prompt},
]


In [None]:
response = anthropic.chat.completions.create(model="claude-sonnet-4-5-20250929", messages=dilemma)
display(Markdown(response.choices[0].message.content))