In [1]:
import os
from openai import OpenAI
import json

# Initialize the OpenAI client.
# OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

print("Client initialized successfully.")

Client initialized successfully.


## The Test Problem
We will use a Fermi estimation problem that benefits from deeper reasoning but also needs a clean final answer.
"How many piano tuners are there in Chicago?"

In [2]:
# A slightly complex reasoning task
prompt = "Estimate the number of piano tuners in Chicago based on first principles. Use Fermi estimation steps."
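
For reference, the arithmetic behind a Fermi answer to this prompt can be sketched in a few lines. The default values below are illustrative assumptions (rough population, household size, piano ownership share, and tuner workload), not measured data, and `estimate_piano_tuners` is a hypothetical helper that is not part of the notebook.

```python
def estimate_piano_tuners(
    population=2_700_000,             # assumed city population
    people_per_household=2.5,         # assumed average household size
    piano_fraction=0.05,              # assumed share of households with a piano
    institutional_uplift=0.20,        # assumed extra pianos in schools, churches, venues
    tunings_per_piano_per_year=1.0,   # assumed tuning frequency
    tunings_per_tuner_per_year=600,   # assumed workload of one full-time tuner
):
    """Chain the Fermi steps: households -> pianos -> tunings -> tuners."""
    households = population / people_per_household
    pianos = households * piano_fraction * (1 + institutional_uplift)
    tunings = pianos * tunings_per_piano_per_year
    return tunings / tunings_per_tuner_per_year


print(round(estimate_piano_tuners()))
```

Changing any single assumption scales the result proportionally, which is why Fermi answers are usually read as order-of-magnitude estimates rather than precise counts.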

## 1. Deliberation Control: `reasoning.effort`
**Goal:** Control how much "thinking" the model does before answering.

**Parameters:** `reasoning={"effort": "low" | "medium" | "high"}`

We will compare "low" effort vs "high" effort.
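
The comparison can also be collected into one small table. The sketch below assumes a client with the same `responses.create` interface used in this notebook; `compare_efforts` is a hypothetical helper, and the default model name simply mirrors the one used in the cells here.

```python
def compare_efforts(client, prompt, efforts=("low", "high"), model="gpt-5.2"):
    """Run the same prompt at several effort levels and collect token usage."""
    rows = []
    for effort in efforts:
        response = client.responses.create(
            model=model,
            input=[{"role": "user", "content": prompt}],
            reasoning={"effort": effort},
        )
        reasoning = response.usage.output_tokens_details.reasoning_tokens
        total = response.usage.output_tokens
        rows.append({
            "effort": effort,
            "reasoning_tokens": reasoning,
            # Visible tokens are whatever remains after hidden reasoning.
            "visible_tokens": total - reasoning,
            "output_tokens": total,
        })
    return rows
```

Calling `compare_efforts(client, prompt)` returns one dictionary per effort level, which makes it easy to see how much of the output budget went to hidden reasoning.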

In [3]:
# Helper function to print results clearly
def print_output(title, response):
    print(f"\n{'='*60}")
    print(f"  {title}")
    print(f"{'='*60}")
    print(f"Reasoning Tokens Used: {response.usage.output_tokens_details.reasoning_tokens}")
    print(f"Total Output Tokens: {response.usage.output_tokens}")
    print("\nFull Answer:")
    print("-"*60)
    print(response.output_text)
    print("-"*60)

print("Helper function defined.")

Helper function defined.


### 1A. Low Reasoning Effort

In [4]:
# 1A. Low Effort
print("Requesting Low Effort...")
response_low = client.responses.create(
    model="gpt-5.2",
    input=[{"role": "user", "content": prompt}],
    reasoning={"effort": "low"}
)
print_output("Low Effort Result", response_low)

Requesting Low Effort...

  Low Effort Result
Reasoning Tokens Used: 0
Total Output Tokens: 704

Full Answer:
------------------------------------------------------------
### Goal
Estimate the number of **piano tuners in Chicago** using a Fermi (first-principles) approach.

---

## 1) Estimate Chicago population and households
- Chicago city population ≈ **2.7 million**
- Average household size ≈ **2.5 people/household**
- Number of households ≈ \( 2.7\text{M} / 2.5 \approx 1.1\text{M} \) households

---

## 2) How many households have a piano?
Piano ownership in US households is often on the order of a few percent (higher in affluent/older households, lower elsewhere). Use a plausible range:

- Piano-owning households fraction ≈ **2%–5%**
- Piano households in Chicago ≈ \( 1.1\text{M} \times (0.02\text{ to }0.05) \approx 22{,}000\text{ to }55{,}000 \)

Add non-household pianos (schools, churches, venues, studios):
- As a rough uplift: **+20%** to household count

Total pianos needing 

### 1B. High Reasoning Effort

In [5]:
# 1B. High Effort
print("Requesting High Effort...")
response_high = client.responses.create(
    model="gpt-5.2",
    input=[{"role": "user", "content": prompt}],
    reasoning={"effort": "high"}
)
print_output("High Effort Result", response_high)

Requesting High Effort...

  High Effort Result
Reasoning Tokens Used: 417
Total Output Tokens: 1118

Full Answer:
------------------------------------------------------------
### Goal
Estimate the number of **piano tuners in Chicago (city proper)** using a Fermi (“back of the envelope”) approach.

---

## 1) How many pianos are in Chicago?

**Chicago population:** ~2.7 million  
**Average household size:** ~2.5 people/household  
\[
\text{Households} \approx \frac{2.7\text{M}}{2.5} \approx 1.1\text{ million households}
\]

**Fraction of households with a piano:** assume ~5% (≈ 1 in 20)  
\[
\text{Household pianos} \approx 1.1\text{M} \times 0.05 \approx 55{,}000
\]

Add **non-household pianos** (schools, churches, venues, universities, rehearsal spaces). A simple way: add ~15–25% extra. Use **~20%**:
\[
\text{Total pianos} \approx 55{,}000 \times 1.2 \approx 66{,}000
\]

So, **~70k pianos** is a reasonable round figure.

---

## 2) How many piano tunings happen per year?

**Average tu

## 2. Stopping Controls: `max_output_tokens`
**Goal:** Prevent runaway costs or "overthinking" by setting a hard budget on total output tokens.

**Parameter:** `max_output_tokens=500` 

This caps the *total* output token budget (reasoning + visible answer). With high effort, the model can spend the entire budget on reasoning, leaving the visible answer truncated or, as in the run below, empty.
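
Comparing token counts against the budget is only a heuristic for spotting truncation. As a sketch, and assuming the response object exposes `status` and `incomplete_details.reason` as the Responses API describes, a more direct check could look like the following; `truncation_reason` is a hypothetical helper name.

```python
def truncation_reason(response):
    """Return why a response stopped early, or None if it completed normally."""
    # Assumption: an early stop is reported as status == "incomplete".
    if getattr(response, "status", None) != "incomplete":
        return None
    details = getattr(response, "incomplete_details", None)
    # Assumption: the reason is e.g. "max_output_tokens" when the budget ran out.
    return getattr(details, "reason", None) or "unknown"
```

A caller could then retry with a larger `max_output_tokens` or a lower effort level whenever the returned reason is `"max_output_tokens"`.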

In [7]:
# 2. Strict Output Budget
# We set a total output token limit (reasoning + visible answer)
output_budget = 500

print(f"Requesting High Effort with Output Budget ({output_budget} tokens)...")

response_budget = client.responses.create(
    model="gpt-5.2",
    input=[{"role": "user", "content": prompt}],
    reasoning={"effort": "high"},
    max_output_tokens=output_budget
)

print_output("Budget Constrained Result", response_budget)

# Check token usage
reasoning_used = response_budget.usage.output_tokens_details.reasoning_tokens
total_output = response_budget.usage.output_tokens
visible_tokens = total_output - reasoning_used

print(f"\n[INFO] Budget: {output_budget} tokens")
print(f"  - Reasoning tokens: {reasoning_used}")
print(f"  - Visible answer tokens: {visible_tokens}")
print(f"  - Total output: {total_output}")

if total_output >= output_budget:
    print("[!] Output may have been truncated due to budget constraint.")

Requesting High Effort with Output Budget (500 tokens)...

  Budget Constrained Result
Reasoning Tokens Used: 500
Total Output Tokens: 500

Full Answer:
------------------------------------------------------------

------------------------------------------------------------

[INFO] Budget: 500 tokens
  - Reasoning tokens: 500
  - Visible answer tokens: 0
  - Total output: 500
[!] Output may have been truncated due to budget constraint.


## 3. Output Shaping via Prompt Engineering
**Goal:** Control what the user actually sees, regardless of the internal reasoning depth.

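**Technique:** The Responses API does not accept the Chat Completions `response_format` parameter (structured outputs are configured through `text.format` instead), so this notebook uses explicit prompt instructions to shape the output into JSON.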

Even if the model thinks for thousands of tokens, we can ask it to deliver a clean JSON object.
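
For comparison, `response_format` is a Chat Completions parameter; in the Responses API the analogous setting is, as far as I know, `text.format` with a JSON schema. The sketch below assumes the installed SDK and the chosen model support that option, which may not hold in this notebook's environment. `ESTIMATE_SCHEMA`, the schema name, and `request_structured_estimate` are hypothetical.

```python
# Hypothetical schema mirroring the keys requested in the prompt instructions.
_PROPERTIES = {
    "population_chicago": {"type": "integer"},
    "households_with_pianos": {"type": "number"},
    "tuning_frequency_per_year": {"type": "number"},
    "total_tunings_needed": {"type": "integer"},
    "tunings_per_tuner_per_year": {"type": "integer"},
    "estimated_tuners": {"type": "integer"},
}

ESTIMATE_SCHEMA = {
    "type": "object",
    "properties": _PROPERTIES,
    # Strict mode expects every property to be required and no extras allowed.
    "required": list(_PROPERTIES),
    "additionalProperties": False,
}


def request_structured_estimate(client, prompt, model="gpt-5.2"):
    """Ask for the estimate as schema-constrained JSON (assumes text.format support)."""
    return client.responses.create(
        model=model,
        input=[{"role": "user", "content": prompt}],
        reasoning={"effort": "high"},
        text={
            "format": {
                "type": "json_schema",
                "name": "piano_tuner_estimate",
                "schema": ESTIMATE_SCHEMA,
                "strict": True,
            }
        },
    )
```

Prompt-based shaping remains the more portable option, since it works even where schema-constrained output is unavailable, at the cost of weaker guarantees.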

In [8]:
# 3. Structured Output (via Prompt Engineering)
# client.responses.create does not accept the Chat Completions 'response_format' parameter
# in this environment, so we use explicit prompt instructions to shape the output into JSON.

json_instruction = """
Output the final answer as a valid JSON object with the following keys:
- population_chicago (integer)
- households_with_pianos (number, percentage of households that own a piano, e.g. 5.0 for 5%)
- tuning_frequency_per_year (number)
- total_tunings_needed (integer)
- tunings_per_tuner_per_year (integer)
- estimated_tuners (integer)

Do not include markdown formatting (like ```json). Just the raw JSON string.
"""

print("\nRequesting High Effort with Output Shaping (JSON via Prompt)...")

# Append instructions to the user prompt
shaped_prompt = prompt + "\n\n" + json_instruction

response_shaped = client.responses.create(
    model="gpt-5.2",
    input=[{"role": "user", "content": shaped_prompt}],
    reasoning={"effort": "high"}
)

print("\n--- Shaped Output (High Reasoning, JSON View) ---")
print(f"Reasoning Tokens (Hidden): {response_shaped.usage.output_tokens_details.reasoning_tokens}")
print(f"Visible Answer:\n{response_shaped.output_text}")

# Verify it parses as JSON
try:
    data = json.loads(response_shaped.output_text)
    print("\n[SUCCESS] Output is valid JSON.")
except json.JSONDecodeError:
    print("\n[WARNING] Output is not valid JSON (Prompt shaping is less strict than response_format).")


Requesting High Effort with Output Shaping (JSON via Prompt)...

--- Shaped Output (High Reasoning, JSON View) ---
Reasoning Tokens (Hidden): 747
Visible Answer:
{
  "population_chicago": 2700000,
  "households_with_pianos": 6.0,
  "tuning_frequency_per_year": 1.0,
  "total_tunings_needed": 77625,
  "tunings_per_tuner_per_year": 600,
  "estimated_tuners": 129
}

[SUCCESS] Output is valid JSON.
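
Parsing successfully only shows the text is JSON, not that the fields are present or mutually consistent. A minimal sketch of a follow-up check is shown here: it verifies the requested keys exist and that `estimated_tuners` roughly equals `total_tunings_needed / tunings_per_tuner_per_year`. `check_estimate` and the 5% tolerance are assumptions for illustration; the sample string is the JSON printed in the run above.

```python
import json

EXPECTED_KEYS = {
    "population_chicago",
    "households_with_pianos",
    "tuning_frequency_per_year",
    "total_tunings_needed",
    "tunings_per_tuner_per_year",
    "estimated_tuners",
}


def check_estimate(raw_text, tolerance=0.05):
    """Parse the model's JSON and test its internal arithmetic."""
    data = json.loads(raw_text)
    missing = EXPECTED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    # Tuners implied by the model's own intermediate numbers.
    implied = data["total_tunings_needed"] / data["tunings_per_tuner_per_year"]
    reported = data["estimated_tuners"]
    consistent = abs(implied - reported) <= tolerance * max(implied, 1)
    return data, implied, consistent


sample = """{
  "population_chicago": 2700000,
  "households_with_pianos": 6.0,
  "tuning_frequency_per_year": 1.0,
  "total_tunings_needed": 77625,
  "tunings_per_tuner_per_year": 600,
  "estimated_tuners": 129
}"""

data, implied, ok = check_estimate(sample)
print(f"implied: {implied:.1f}, reported: {data['estimated_tuners']}, consistent: {ok}")
```

For the run above, 77625 / 600 ≈ 129.4, which matches the reported 129 tuners within the tolerance.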
