In [1]:
# imports

import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display, update_display

In [2]:
# Load OPENAI_API_KEY from a local .env file; OpenAI() reads it from the environment.
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

## 1. Asking an easy question

In [3]:
messages = [{"role": "user", "content": "You toss 2 coins. One of them is heads. What's the probability the other is tails? Answer with the probability only."}]

# The correct answer is 2/3: of the three equally likely outcomes with at least one head (HH, HT, TH), two contain a tail.
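
The 2/3 figure can be checked by brute-force enumeration. The sketch below assumes the usual reading of the puzzle, "at least one of the two coins is heads"; if instead one specific coin were known to be heads, the answer would be 1/2. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of tossing two fair coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Assumption: "one of them is heads" means "at least one coin is heads".
at_least_one_head = [o for o in outcomes if "H" in o]

# Among those outcomes, the other coin is tails whenever the pair contains a tail.
favourable = [o for o in at_least_one_head if "T" in o]

probability = Fraction(len(favourable), len(at_least_one_head))
print(probability)  # 2/3
```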

In [4]:

openai = OpenAI()
response = openai.chat.completions.create(model="gpt-5-nano", messages=messages, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

1/2

In [5]:
# It answers 1/2, which is wrong. This is likely because we are using the smallest model
# and have explicitly asked it to keep the reasoning effort minimal.

In [6]:
# Now let's raise the reasoning effort and see whether the output changes.

response = openai.chat.completions.create(model="gpt-5-nano", messages=messages, reasoning_effort="low")
display(Markdown(response.choices[0].message.content))

2/3

In [7]:
# Raising reasoning_effort from "minimal" to "low" is enough for gpt-5-nano to give the correct answer.

In [8]:
# Another way to improve accuracy is to switch to a more powerful model: here, gpt-5-mini
# instead of gpt-5-nano. Note that gpt-5-mini gives the correct answer even with
# reasoning_effort="minimal", which gpt-5-nano did not.

response = openai.chat.completions.create(model="gpt-5-mini", messages=messages, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

2/3

In [9]:
# Note that llama3.2 gives the wrong answer even with reasoning_effort="high".
# llama3.2 is not a reasoning model, so this setting may have no effect on it.
OLLAMA_BASE_URL = "http://localhost:11434/v1"
ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key="ollama")
response = ollama.chat.completions.create(model="llama3.2", messages=messages, reasoning_effort="high")
display(Markdown(response.choices[0].message.content))

1/2.
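
Every call in this section repeats the same create-then-read pattern, varying only the client, the model, and the reasoning effort. A minimal sketch of a wrapper for that pattern follows. `ask` is a hypothetical helper introduced here for illustration, not part of the `openai` package; it assumes the client exposes `chat.completions.create` as used in the cells above.

```python
def ask(client, model, messages, reasoning_effort=None):
    """Send one chat request and return the reply text.

    Hypothetical convenience wrapper; `client` is an OpenAI-style client
    such as the `openai` or `ollama` objects created in this notebook.
    """
    kwargs = {"model": model, "messages": messages}
    # Pass reasoning_effort only when the caller sets it explicitly.
    if reasoning_effort is not None:
        kwargs["reasoning_effort"] = reasoning_effort
    response = client.chat.completions.create(**kwargs)
    return response.choices[0].message.content

# Example usage (requires a configured client, so it is left commented out):
# display(Markdown(ask(openai, "gpt-5-nano", messages, reasoning_effort="low")))
```

Leaving `reasoning_effort` out of the request when it is `None` keeps the helper usable for calls that do not set it.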

## 2. Asking a hard question

In [10]:
hardproblem = """
On a bookshelf, two volumes of Pushkin stand side by side: the first and the second.
The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick.
A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume.
What distance did it gnaw through?
"""

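hard_puzzle = [{"role": "user", "content": hardproblem}]
# The correct answer is 4 mm: as the books stand on the shelf, the first page of volume 1
# and the last page of volume 2 are separated only by two covers.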
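
The 4 mm answer follows from how books sit on a shelf. The sketch below lays out the layers from left to right and sums what lies between the worm's start and end points. It assumes the usual arrangement: spines facing outward and volume 1 to the left of volume 2, so each book's front cover is on its right-hand side. The layer names are illustrative.

```python
COVER_MM = 2
PAGES_MM = 20  # 2 cm of pages per volume

# Layers from left to right. With the spine facing the viewer, each book's
# front cover is on its right, so its pages run from last (left) to first (right).
shelf = [
    ("vol1 back cover", COVER_MM),
    ("vol1 pages (last -> first)", PAGES_MM),
    ("vol1 front cover", COVER_MM),
    ("vol2 back cover", COVER_MM),
    ("vol2 pages (last -> first)", PAGES_MM),
    ("vol2 front cover", COVER_MM),
]

# The worm starts at the first page of volume 1 (right edge of its page block)
# and stops at the last page of volume 2 (left edge of its page block),
# so it gnaws only through the layers between the two page blocks.
start = shelf.index(("vol1 pages (last -> first)", PAGES_MM)) + 1
end = shelf.index(("vol2 pages (last -> first)", PAGES_MM))
gnawed = shelf[start:end]

distance_mm = sum(thickness for _, thickness in gnawed)
print([name for name, _ in gnawed], distance_mm)
```

Under these assumptions the worm passes only through the front cover of volume 1 and the back cover of volume 2, and never through any pages.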

In [11]:
# Let's ask "gpt-5-nano" with reasoning_effort="minimal".

openai = OpenAI()
response = openai.chat.completions.create(model="gpt-5-nano", messages=hard_puzzle, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

Interpret the setup carefully:

- Each volume has total page thickness 2 cm.
- Each cover (front and back) is 2 mm thick.
- The worm starts at the first page of the first volume and ends at the last page of the second volume, gnawing perpendicularly through the stack of books on the shelf.

Let’s lay out the arrangement from left to right:
- Volume 1: cover (front) | pages (2 cm) | cover (back)
- Space between volumes: they sit side by side, no gap assumed
- Volume 2: cover (front) | pages (2 cm) | cover (back)

When the worm goes from the first page of the first volume to the last page of the second volume, it traverses:
- Through the remaining pages of Volume 1 (from its first page to the back cover)
- Through the back cover of Volume 1
- Through the front cover of Volume 2
- Through the pages of Volume 2 up to its last page

However, the worm is entering at the very first page of Volume 1 and exiting at the very last page of Volume 2, so the only parts it does NOT gnaw through are:
- The front cover of Volume 1 (the worm starts at the first page, not beyond it)
- The back cover of Volume 2 (the worm ends at the last page, not beyond it)

Therefore, the gnawed distance equals the total thickness of both volumes minus the two boundary non-gnawed portions (the front of V1 and the back of V2).

Compute:
- Total pages thickness: 2 cm per volume × 2 volumes = 4 cm
- Except the front cover of V1 (2 mm) and the back cover of V2 (2 mm): subtract 4 mm total.

Convert units: 4 cm = 40 mm. Subtract 4 mm → 36 mm.

Answer: 3.6 cm (36 mm).

In [12]:
# Let's ask "gpt-5-nano" with reasoning_effort="low".
openai = OpenAI()
response = openai.chat.completions.create(model="gpt-5-nano", messages=hard_puzzle, reasoning_effort="low")
display(Markdown(response.choices[0].message.content))

4 mm

Reason: The worm starts at the first page of the first volume and ends at the last page of the second volume. The straight line perpendicular to the pages between the volumes passes only through the two covers that touch each other: the front cover of volume 1 and the back cover of volume 2. Each cover is 2 mm thick, so the total distance gnawed is 2 mm + 2 mm = 4 mm. The pages themselves are irrelevant to the path.

In [13]:
# Let's ask "gpt-5-mini" with reasoning_effort="minimal".
response = openai.chat.completions.create(model="gpt-5-mini", messages=hard_puzzle, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

Answer: 2 cm.

Explanation: When the two books stand in reading order on the shelf (first volume on the left, second on the right), the "first page of the first volume" is at the leftmost face of that book and the "last page of the second volume" is at the rightmost face of the second book. A worm gnawing straight through from the first page of the first to the last page of the second therefore passes only through the two stacks of pages (2 cm each) but misses the inner covers between the books. In fact the worm's straight-line path goes across the two inner pages only — a total of 2 cm.

In [14]:
# Let's ask "llama3.2" (served locally by Ollama) with reasoning_effort="high".
OLLAMA_BASE_URL = "http://localhost:11434/v1"
ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key="ollama")
response = ollama.chat.completions.create(model="llama3.2", messages=hard_puzzle, reasoning_effort="high")
display(Markdown(response.choices[0].message.content))

To solve this problem, we need to calculate the total distance the worm traveled.

The volume's thickness consists of two parts: the thickness of the pages and the thickness of the covers.

Let's start with the pages:
The first volume has x pages and the second volume has y pages. Together they form 2 volumes.
We're trying to determine how the numbers x and y look in terms of overall distances.
Each page is "thin," so let d be the distance from one end of one page of the first volume to the other (the total pages thick), that's  2x = d , because we know the total thickness of both volumes was . We already know what this total thickness was: 
The overall thickness of two volumes is: 2cm + 2mm  =>   [ ] d + 22.4cm
Substituting into that equation and simplifying it results in this equation:
16x = 2.6 (cm)x 
Dividing both sides by 2 yields:
8 x = 1.3 (cm)
Now let's look at what happens to y; again we're trying to determine the total thickness of two volumes. So, let d be its distance and write it as: 2y +20(22.4) The same approach as before works as we can substitute that expression into the equation for total volume thickness:
16y + 160 x .5 
Substitute (1.3*8)x [1.04]x
This simplifies to
15x = 0.26 

 Dividing by both sides, d is
   y
The ratio in which the number of pages on each book appears to be roughly the same as our "ratio" for thickness:
That's not true when we substitute: 
We know all that because when you can write things as the ratio "a:x,"
this means that if we take a value, add (x*a),
you're left with that original.
Let's plug in 2.5
Subtracting two times 1.04 gives us:
0.06

## 3. A spicy challenge to test the competitive spirit

In [15]:
dilemma_prompt = """
You and a partner are contestants on a game show. You're each taken to separate rooms and given a choice:
Cooperate: Choose "Share" - if both of you choose this, you each win $1,000.
Defect: Choose "Steal" - if one steals and the other shares, the stealer gets $2,000 and the sharer gets nothing.
If both steal, you both get nothing.
Do you choose to Steal or Share? Pick one.
"""

dilemma = [
    {"role":"user", "content": dilemma_prompt}
]
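
The payoffs described in the prompt can be written out as a table, which makes it easy to check which move, if any, dominates. This is a small sketch: `payoffs` and `weakly_dominates` are names introduced here for illustration.

```python
# payoffs[(my_move, partner_move)] = my winnings in dollars, as stated in the prompt
payoffs = {
    ("Share", "Share"): 1000,
    ("Share", "Steal"): 0,
    ("Steal", "Share"): 2000,
    ("Steal", "Steal"): 0,
}
moves = ["Share", "Steal"]

def weakly_dominates(a, b):
    """True if move a is never worse than move b and strictly better at least once."""
    never_worse = all(payoffs[(a, other)] >= payoffs[(b, other)] for other in moves)
    sometimes_better = any(payoffs[(a, other)] > payoffs[(b, other)] for other in moves)
    return never_worse and sometimes_better

print(weakly_dominates("Steal", "Share"))  # True
print(weakly_dominates("Share", "Steal"))  # False

# Combined winnings of both players for each pair of moves
joint = {(a, b): payoffs[(a, b)] + payoffs[(b, a)] for a in moves for b in moves}
print(joint)
```

For an individual player, "Steal" is never worse than "Share" and is strictly better when the partner shares. The combined winnings are $2,000 unless both players steal, in which case they are $0.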

In [16]:
# Let's ask "llama3.2" first, without setting reasoning_effort.
OLLAMA_BASE_URL = "http://localhost:11434/v1"
ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key="ollama")
response = ollama.chat.completions.create(model="llama3.2", messages=dilemma)
display(Markdown(response.choices[0].message.content))

I choose... Share.

Given that there's a shared benefit to choosing "Share" (both of us getting $1,000 if we both make this choice), and no clear advantage to defecting in a situation where there are other players involved, I think cooperation is the best decision here. It may not guarantee the highest potential outcome for a purely individualistic perspective, but as part of a duo in a game show scenario, I'm inclined to prioritize mutual benefit over personal exploitation.

In [17]:
# Let's ask "gpt-5-nano" with reasoning_effort="minimal".

openai = OpenAI()
response = openai.chat.completions.create(model="gpt-5-nano", messages=dilemma, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

I’ll choose Share.

In [18]:
# Let's ask "gpt-5-nano" with reasoning_effort="low".

openai = OpenAI()
response = openai.chat.completions.create(model="gpt-5-nano", messages=dilemma, reasoning_effort="low")
display(Markdown(response.choices[0].message.content))

Steal.

Reason: It’s the dominant strategy in this setup. If the other person shares, stealing yields $2,000 vs $1,000. If the other person steals, stealing yields $0 just like sharing would, but you don’t lose anything by stealing. So stealing is the safest choice regardless of the other’s move.

In [19]:
# Let's ask "gpt-5-mini" with reasoning_effort="minimal".
response = openai.chat.completions.create(model="gpt-5-mini", messages=dilemma, reasoning_effort="minimal")
display(Markdown(response.choices[0].message.content))

I choose Share.

Reason: If both of us choose Share we each get $1,000 — the best mutual outcome. Stealing risks both getting nothing if the other also steals, and while stealing can give $2,000 if the other cooperates, it can also leave me with $0. Choosing Share maximizes expected joint payoff and trusts the partner to do the same.