# Developing with OpenAI: AIM Edition

## Exploring LLM Prompting Strategies for Economic Reasoning  
### *Inflation & Interest Rate Case Study*

This notebook investigates how different prompting strategies (zero-shot, few-shot, reasoning vs non-reasoning models) affect the ability of large language models (LLMs) to reason about inflation, interest rates, and overall market dynamics.  

It keeps the original instructional structure and code scaffolding so that it works as a complete, self-contained teaching example.

## 1. Getting Started

The first thing we'll do is load the [OpenAI Python Library](https://github.com/openai/openai-python/tree/main)!

In [None]:
# Used for Google Colab
#! pip install openai -q




In [9]:
# Diagnostic cell - Check which Python is being used
import sys
print("Python executable:", sys.executable)
print("Python version:", sys.version)

# Check if dotenv is available
try:
    from dotenv import load_dotenv
    print("✓ python-dotenv is available")
except ImportError as e:
    print("✗ python-dotenv is NOT available:", e)
    print("\nTo fix this:")
    print("1. Make sure you've selected the correct kernel (Python 3 or Python (venv))")
    print("2. Restart the kernel")
    print("3. The kernel should use:", "/Users/lisaryan/Development/sandbox/ai-makerspace/ai-onramp/code/AIEO1/venv/bin/python")


Python executable: /Users/lisaryan/Development/sandbox/ai-makerspace/ai-onramp/code/AIEO1/venv/bin/python
Python version: 3.13.0 (main, Oct  7 2024, 05:02:14) [Clang 15.0.0 (clang-1500.3.9.4)]
✓ python-dotenv is available



## Discussion and Problem Framing

We aim to answer:  
> *"What is the best prompting approach and model type to understand how the market is performing today?"*  

### Types of LLM Tasks Involved

| Type | Description | Example Output |
|------|--------------|----------------|
| **Retrieval** | Factual recall | “Inflation in 2025 is around 3.1% in the U.S.” |
| **Reasoning** | Logical chain between variables | “Higher inflation led the Fed to raise rates → borrowing costs rose → slower GDP.” |
| **Generation** | Narrative creation / summary | “The market shows cooling signals despite moderate inflation…” |

Each prompt and model will be evaluated on reasoning depth, factual correctness, and structure quality.


### Models used in this notebook

| Rank | Model Name | Primary Purpose | OpenAI's Official Claim |
|------|------------|-----------------|------------------------|
| 1 | **GPT-5** | Advanced reasoning for complex economic analysis | Uses a dynamic router that chooses between quick responses and deeper 'thinking' when needed; performs at PhD-level across domains |
| 2 | **GPT-4.1** | Enhanced coding and long-context comprehension | Offers significant advancements in coding capabilities, long context comprehension (up to 1M tokens), and instruction following |
| 3 | **GPT-4-turbo** | General-purpose non-reasoning model for structured responses | Improved version of GPT-4 with enhanced performance, lower latency, and updated knowledge cutoff |
| 4 | **GPT-4o-mini** | Fast, efficient model for quick responses | Cost-efficient AI model designed to make advanced AI technology more affordable and accessible |


## 2. Setting Environment Variables

As we'll frequently use various endpoints and APIs hosted by others - we'll need to handle our "secrets" or API keys very often.

We'll use the following pattern throughout this bootcamp - but you can use whichever method you're most familiar with.

In [None]:
# For Google Colab
# import os
# import getpass

# os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key")

In [3]:
# For local development
import os
from dotenv import load_dotenv

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
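
If the key is missing, the problem usually only shows up later as an authentication error, so it can help to fail early. The sketch below is one possible way to do that. `require_env` and `mask` are hypothetical helpers written for this example; they are not part of `python-dotenv` or the OpenAI library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the value is safer to print."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

After `load_dotenv()`, a call such as `print(mask(require_env("OPENAI_API_KEY")))` confirms that the key was loaded without echoing it in full.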


## 3. Using the OpenAI Python Library

Let's jump right into it!

> NOTE: You can, and should, reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/authentication?lang=python) whenever you get stuck, have questions, or want to dive deeper.

### Creating a Client

The core feature of the OpenAI Python Library is the `OpenAI()` client. It's how we're going to interact with OpenAI's models, and it sits under the hood of a lot of what we'll touch on throughout this course.

> NOTE: We could manually provide our API key here, but we're going to instead rely on the fact that we put our API key into the `OPENAI_API_KEY` environment variable!

In [4]:
from openai import OpenAI

client = OpenAI()

### Using the Client

Now that we have our client - we're going to use the `.chat.completions.create` method to interact with the model.

There are a few things to get out of the way first, starting with the idea of "roles".

First it's important to understand the object that we're going to use to interact with the endpoint. It expects us to send an array of objects of the following format:

```python
{"role" : "ROLE", "content" : "YOUR CONTENT HERE", "name" : "THIS IS OPTIONAL"}
```

Second, there are three "roles" available to use to populate the `"role"` key:

- `system`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://help.openai.com/en/articles/7042661-moving-from-completions-to-chat-completions-in-the-openai-api).

We'll explore these roles in more depth as they come up - but for now we're going to just stick with the basic role `user`. The `user` role is, as it would seem, the user!
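
To make the message format concrete, here is a minimal sketch of a checker for the structure described above. `validate_messages` and `ALLOWED_ROLES` are hypothetical names for this example. The check is deliberately stricter than the API itself, which accepts further roles and richer content types, so treat it as a teaching aid rather than a faithful validator.

```python
# Assumption: only the three roles covered in this notebook are allowed.
ALLOWED_ROLES = {"system", "assistant", "user"}


def validate_messages(messages: list[dict]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the messages look fine."""
    problems = []
    if not messages:
        problems.append("messages is empty")
    for i, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {i}: content must be a string")
    return problems


print(validate_messages([{"role": "user", "content": "Hello!"}]))  # []
print(validate_messages([{"role": "usr", "content": "Hello!"}]))   # one problem reported
```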

Thirdly, it expects us to specify a model!

We'll use the `gpt-5-mini` model for this first example (it is not listed in the table above, but it is also used later as the judge model).

Let's look at an example!



In [5]:
response = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)

Let's look at the response object.

In [6]:
response

ChatCompletion(id='chatcmpl-Ca6ZFryjEvhyfyhMVK9d1Tak7BneR', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hi there! How can I help you today?', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1762721089, model='gpt-5-mini-2025-08-07', object='chat.completion', service_tier='default', system_fingerprint=None, usage=CompletionUsage(completion_tokens=83, prompt_tokens=8, total_tokens=91, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=64, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))

In [7]:
print(response.choices[0].message.content)

Hi there! How can I help you today?


>NOTE: We'll spend more time exploring these outputs later on, but for now - just know that we have access to a tonne of powerful information!
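
As a first look at that information, the sketch below pulls the token counts out of the `usage` block shown in the printed response. `summarize_usage` is a hypothetical helper, and the stand-in object only mirrors the numbers printed above so that the example runs without an API call. It assumes reasoning tokens are counted inside `completion_tokens`, which is worth confirming in the API reference.

```python
from types import SimpleNamespace


def summarize_usage(response) -> dict:
    """Collect token counts from a chat completion's `usage` block."""
    usage = response.usage
    details = getattr(usage, "completion_tokens_details", None)
    reasoning = getattr(details, "reasoning_tokens", 0) or 0
    return {
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "reasoning_tokens": reasoning,
        # Assumption: reasoning tokens are a subset of completion tokens.
        "visible_completion_tokens": usage.completion_tokens - reasoning,
        "total_tokens": usage.total_tokens,
    }


# Stand-in mirroring the usage numbers from the response printed above.
fake_response = SimpleNamespace(
    usage=SimpleNamespace(
        prompt_tokens=8,
        completion_tokens=83,
        total_tokens=91,
        completion_tokens_details=SimpleNamespace(reasoning_tokens=64),
    )
)
print(summarize_usage(fake_response))
```

With a real call, passing `response` instead of `fake_response` should work the same way, since the function only reads attributes that appear in the printed object.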

### System Role

Now we can extend our prompts to include a system prompt.

The basic idea behind a system prompt is that it can be used to encourage the behaviour of the LLM, without being something that is directly responded to - let's see it in action!

In the newest OpenAI API, the **system message** still defines the model’s behavior.  
Sometimes it is referred to as an *instruction block*.

Example system prompt for our economics case:

In [10]:
system_prompt = """
You are an experienced economic analyst explaining how inflation and interest rates interact.   
Use 2025 U.S. market context when relevant.
Your answer should not exceed 5 sentences. 
"""
print(system_prompt)

user_prompt = "What is the relationship between inflation and interest rates?"
print(user_prompt)

list_of_prompts = [

    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

irate_response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=list_of_prompts
)

print(irate_response.choices[0].message.content)


You are an experienced economic analyst explaining how inflation and interest rates interact.   
Use 2025 U.S. market context when relevant.
Your answer should not exceed 5 sentences. 

What is the relationship between inflation and interest rates?
Inflation and interest rates are fundamentally linked through central bank policies, principally actions taken by the Federal Reserve in the U.S. When inflation is high, as was anticipated around 2025 due to continued supply chain disruptions and sustained consumer demand, the Federal Reserve tends to increase interest rates to curb spending and borrowing by making it more costly. This tightening of monetary policy helps to slow down economic activity, reducing the demand-pull component of inflation. Conversely, if inflation rates are low, the Fed might lower interest rates to encourage more borrowing and spending, thereby stimulating the economy. The interplay between these rates is critical in managing economic stability and growth.


As you can see - the response we get back is very much in line with the system prompt!

Let's try the same user prompt, but with a different system prompt to see the difference.

In [11]:
system_prompt = """
You are a cool and fun elementary teacher explaining to 6-year olds how inflation and interest rates interact.   
Use 2025 U.S. market context when relevant.
Your answer should not exceed 5 sentences.
"""
print(system_prompt)

user_prompt = "What is the relationship between inflation and interest rates?"
print(user_prompt)

list_of_prompts = [

    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

irate_response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=list_of_prompts
)

print(irate_response.choices[0].message.content)


You are a cool and fun elementary teacher explaining to 6-year olds how inflation and interest rates interact.   
Use 2025 U.S. market context when relevant.
Your answer should not exceed 5 sentences.

What is the relationship between inflation and interest rates?
Imagine you have a piggy bank where you save your allowance. If things in the store start costing more money—that’s what we call inflation—then the money in your piggy bank can’t buy as much as before. To help with this, sometimes the people in charge of money in our country, like a group called the Federal Reserve, might decide to increase interest rates. When they do that, it’s like they’re making saving money in the bank a bit more exciting by offering you more candy to save your allowance there instead of spending it right away. This can help slow down how fast prices are rising, making sure your piggy bank money can still buy you plenty of toys and treats!


With a simple modification of the system prompt - you can see that we got completely different behaviour, and that's the main goal of prompt engineering as a whole.

Also, congrats, you just engineered your first prompt!

### Few-shot Prompting

Now that we have a basic handle on the `system` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification, it's conceptually well aligned with few-shot learning.

In [12]:
# Zero-shot prompt
prompt_zero = "Explain how inflation affects interest rate decisions."
list_of_prompts = [
    {"role": "user", "content": prompt_zero}
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=list_of_prompts
)

print('zero-shot response:', response.choices[0].message.content)

zero-shot response: Inflation significantly impacts interest rate decisions made by central banks and financial institutions. Here's how it works:

### 1. **Understanding Inflation**:
   - **Inflation** is the rate at which the general level of prices for goods and services is rising, leading to a decrease in purchasing power. It is typically measured by the Consumer Price Index (CPI) or the Producer Price Index (PPI).

### 2. **Central Bank's Role**:
   - Central banks, such as the Federal Reserve in the United States, aim to maintain price stability and support economic growth. They closely monitor inflation indicators to guide their monetary policy.

### 3. **Interest Rates and Inflation Relationship**:
   - **Nominal vs. Real Interest Rates**: The nominal interest rate is the rate before adjusting for inflation, while the real interest rate is adjusted for inflation. If inflation rises, the real interest rate (nominal rate - inflation rate) can decrease, which may lead central bank

In [13]:
# Few-shot prompt template

question = "Explain how inflation affects interest rate decisions."

few_shot_prompt = f"""
Example 1:
Q: The price of pizza slices jumps from $2 to $4. What might the central bank do?
A: They turn down the oven heat 🍕🔥 — raise interest rates so people buy fewer slices and cool off the price party.

Example 2:
Q: Interest rates drop and borrowing gets cheaper. What happens at Snack City?
A: Everyone's grabbing extra fries and milkshakes 🍟🥤— cheap credit means more spending, which can make prices rise again.

Now answer:
Q: {question}
"""

list_of_prompts = [
    {"role": "user", "content": few_shot_prompt}
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=list_of_prompts
)

print('few-shot response:', response.choices[0].message.content)

few-shot response: A: When inflation rises like a bubbly soda 🍾, central banks might pop the lid and raise interest rates to cool things down. Higher rates mean loans get pricier, leading folks to spend less and slow down price increases. If inflation fizzles out, they might lower rates to encourage spending again. It’s all about keeping the prices and economy balanced! ⚖️💵
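
The template above is written by hand. A small function can build the same `Example N / Q / A` layout from a list of pairs, which makes it easier to swap examples in and out. This is only a sketch: `format_few_shot` is a hypothetical helper, and the example pairs are illustrative placeholders rather than the ones used in the cell above.

```python
def format_few_shot(examples: list[tuple[str, str]], question: str) -> str:
    """Render (question, answer) pairs plus a final question as one prompt string."""
    blocks = []
    for number, (q, a) in enumerate(examples, start=1):
        blocks.append(f"Example {number}:\nQ: {q}\nA: {a}")
    blocks.append(f"Now answer:\nQ: {question}")
    return "\n\n".join(blocks)


# Placeholder examples, made up for this sketch.
examples = [
    ("Prices rise quickly. What might the central bank do?",
     "Raise interest rates to cool demand."),
    ("Borrowing gets cheaper. What happens to spending?",
     "Spending tends to rise, which can push prices up."),
]
print(format_few_shot(examples, "Explain how inflation affects interest rate decisions."))
```

The resulting string can be sent as the `content` of a single `user` message, exactly as `few_shot_prompt` is.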


### Helper functions

We're going to create some helper functions to aid in using the OpenAI API - just to make our lives a bit easier.

> NOTE: Take some time to understand these functions between class!

In [15]:
from IPython.display import display, Markdown

def get_response(client: OpenAI, messages: list, model: str = "gpt-4o-mini") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=messages
    )
    content = response.choices[0].message.content
    return content if content is not None else ""

def system_prompt(message: str) -> dict:
    return {"role": "system", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(message: str) -> None:
    display(Markdown(message))

The helper functions give us another way to build the same kind of prompt:

In [16]:
# Now, show the economic example with both user and assistant prompts
few_shot_prompts = [
    user_prompt("Inflation rises fast. How does the central bank react — dating analogy please!"),
    assistant_prompt("They play hard to get — raise rates — to cool off the economy's over-eager spending habits."),

    user_prompt("What happens when interest rates are too low for too long?"),
    assistant_prompt("Everyone gets too comfortable — too many relationships (loans) form, and eventually hearts (bubbles) break."),

    user_prompt("Explain deflation using a dating metaphor."),
    assistant_prompt("No one's asking anyone out — everyone waits for a better deal, so the economy gets lonely and quiet."),
    # 👇 Here's the actual question we want the model to answer
    user_prompt("Describe quantitative easing")
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=few_shot_prompts
)

print(response.choices[0].message.content)


Quantitative easing is like a romantic gesture from the central bank, where they shower the economy with gifts (money) to boost confidence and encourage spending—hoping that it reignites the spark and keeps the relationship thriving.
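
The alternating `user` / `assistant` pattern used above can also be generated from a list of example pairs. The sketch below shows one way to do it with plain dictionaries. `build_few_shot_messages` is a hypothetical helper, not part of the OpenAI library, and the single example pair is a shortened version of one from the cell above.

```python
def build_few_shot_messages(examples, question, system=None):
    """Turn (user, assistant) example pairs into a message list ending with the real question."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The final user message is the one the model is expected to answer.
    messages.append({"role": "user", "content": question})
    return messages


messages = build_few_shot_messages(
    examples=[
        ("Explain deflation using a dating metaphor.",
         "No one's asking anyone out, so the economy gets lonely and quiet."),
    ],
    question="Describe quantitative easing",
)
print([m["role"] for m in messages])  # ['user', 'assistant', 'user']
```

The returned list has the same shape as `few_shot_prompts`, so it can be passed straight to `messages=` in `client.chat.completions.create`.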


### 🏗️ Activity #1:
Mission:
Experiment with how different prompt structures, system, user, and assistant, plus zero-shot and few-shot prompting, can transform an AI’s response.
Your goal: craft the most effective prompt and see how GPT-4-Turbo reacts!

You’ll test how GPT-4-Turbo behaves under four different setups:
1. System/User roles only (Zero-shot)
2. System/User roles + examples (Few-shot)
3. No system role at all (User only)
4. Creative system prompt twist



### Chain of Thought Prompting

We'll head one level deeper and explore the world of Chain of Thought prompting (CoT).

This is a process by which we can encourage the LLM to handle slightly more complex tasks.

Let's look at a simple reasoning-based example without CoT.

In [17]:
reasoning_problem = """
The central bank increases the policy rate by 1.5 pp in response to 5 % inflation while nominal wage growth is 3 %.
What happens to real wages?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

To determine the effect on real wages, we need to understand the relationship between nominal wages, inflation, and real wages. Real wages represent the purchasing power of wages and are adjusted for inflation.

Real wages can be calculated with the following formula:

\[
\text{Real Wage} = \text{Nominal Wage} - \text{Inflation Rate}
\]

In this scenario:

- **Nominal wage growth** is 3%, which indicates that nominal wages are increasing by 3%.
- **Inflation rate** is 5%, meaning that prices are increasing by 5%.

After the central bank increases the policy rate by 1.5 percentage points, this action is typically intended to combat inflation, but for the purposes of this calculation, we will focus on the current inflation rate of 5% rather than any potential future effects of the rate increase.

Given these figures, we can say:

- If nominal wages are growing at 3%, it means the nominal wage increase is 3%.
- With an inflation rate of 5%, the increase in prices is 5%.

Now, to calculate real wages:

\[
\text{Real Wage Growth} = \text{Nominal Wage Growth} - \text{Inflation Rate}
\]
\[
\text{Real Wage Growth} = 3\% - 5\% = -2\%
\]

This negative value indicates that real wages are decreasing. 

So, in summary, with nominal wage growth at 3% and inflation at 5%, real wages decline by 2%. This means that even though people are receiving a nominal wage increase, their purchasing power is actually diminishing due to inflation outpacing wage growth.
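
The subtraction in the answer above is the usual first-order approximation. The exact relationship divides the growth factors: real wage growth is (1 + nominal growth) / (1 + inflation) − 1. The short sketch below compares the two for the numbers in the problem; the function names are made up for this example.

```python
def real_wage_growth_approx(nominal: float, inflation: float) -> float:
    """First-order approximation: nominal wage growth minus inflation."""
    return nominal - inflation


def real_wage_growth_exact(nominal: float, inflation: float) -> float:
    """Exact change in purchasing power from the ratio of growth factors."""
    return (1 + nominal) / (1 + inflation) - 1


approx = real_wage_growth_approx(0.03, 0.05)
exact = real_wage_growth_exact(0.03, 0.05)
print(f"approximate: {approx:.2%}")  # approximate: -2.00%
print(f"exact:       {exact:.2%}")   # exact:       -1.90%
```

At these rates the two figures differ by about a tenth of a percentage point, so the approximation is fine for a quick check; the gap widens as inflation rises.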

Let's see if we can leverage a simple CoT prompt to improve our model's performance on this task:

In [18]:
list_of_prompts = [
    user_prompt(reasoning_problem + "Think step-by-step about how nominal wages, prices, and interest rates interact through the labor market and aggregate demand. Then explain the real wage effect.")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

To understand how the central bank's decision to increase the policy rate by 1.5 percentage points (pp) in response to 5% inflation affects real wages, we need to analyze the interaction between nominal wages, prices, and interest rates. Here's a step-by-step breakdown of the situation:

### 1. **Understanding Key Terms**
   - **Nominal Wages:** The amount of money workers receive in current dollars, not adjusted for inflation.
   - **Inflation Rate:** The rate at which the general level of prices for goods and services rises, eroding purchasing power.
   - **Real Wages:** Nominal wages adjusted for inflation; this reflects the purchasing power of wages.
   - **Policy Rate:** The interest rate set by the central bank, influencing borrowing costs and overall economic activity.

### 2. **Initial Conditions**
   - **Nominal Wage Growth**: 3%
   - **Inflation Rate**: 5%
   - **Central Bank Rate Increase**: 1.5 pp

### 3. **Nominal Wage Growth vs. Inflation**
   - As nominal wages increase by 3%, the prices of goods and services are rising at a faster rate of 5% due to inflation. This means that even though workers are earning more money nominally, the cost of living is increasing even more.

### 4. **Calculating Real Wages**
To find the real wage growth, we use the following formula:

\[
\text{Real Wage Growth} = \text{Nominal Wage Growth} - \text{Inflation Rate}
\]

Plugging in the values:

\[
\text{Real Wage Growth} = 3\% - 5\% = -2\%
\]

### 5. **Interpreting the Results**
- **Real Wages Decline**: The real wage growth is negative at -2%. This indicates that workers' purchasing power is effectively decreasing. Even though they are nominally earning more, they can buy less with their wages because the prices are rising faster than their wages.
  
### 6. **Impact of the Increased Policy Rate**
- The central bank’s rate increase is typically aimed at controlling inflation by tightening monetary policy. Higher policy rates can lead to higher borrowing costs, which may slow down consumer spending and investment. 
- This slowdown in demand can potentially help reduce inflation in the long run, but in the short term, workers may feel the squeeze as their real wages decline.

### 7. **Labor Market Dynamics**
- With real wages falling, workers may feel less secure and less inclined to negotiate for wage increases, as the labor market may become more competitive. Employers might benefit from this if there’s no upward pressure on wages.
- If inflation persists while nominal wage growth remains stagnant, workers may demand higher wages in the future, but achieving that will depend on various factors including labor market conditions and overall economic performance.

### Summary
In summary, the increase in the policy rate by 1.5 pp in response to 5% inflation, combined with 3% nominal wage growth, results in a decline in real wages by 2%. Consumers, reflecting this decrease in purchasing power, may reduce spending, which can further influence economic activity and potentially lead the central bank to reconsider its policy stance in the future.
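
To compare the plain and CoT versions of a prompt side by side, it can help to generate both from the same problem text. The sketch below does that; `with_cot`, `prompt_variants`, and the wording of `COT_INSTRUCTION` are assumptions for this example, not a prescribed CoT formula.

```python
# Assumption: a generic CoT instruction; the notebook's own cell uses a longer, task-specific one.
COT_INSTRUCTION = "Think step-by-step, then give the final answer on its own line."


def with_cot(problem: str, instruction: str = COT_INSTRUCTION) -> str:
    """Append a chain-of-thought instruction to a problem statement."""
    return f"{problem.strip()}\n\n{instruction}"


def prompt_variants(problem: str) -> dict[str, list[dict]]:
    """Return message lists for the same problem with and without the CoT instruction."""
    return {
        "plain": [{"role": "user", "content": problem.strip()}],
        "cot": [{"role": "user", "content": with_cot(problem)}],
    }


problem = """
The central bank increases the policy rate by 1.5 pp in response to 5 % inflation while nominal wage growth is 3 %.
What happens to real wages?
"""
variants = prompt_variants(problem)
print(variants["cot"][0]["content"])
```

Each message list can then be passed to `get_response(client, messages)` so that the only difference between the two runs is the added instruction.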


## 3. Running Comparative Experiment

We'll test combinations of model type (reasoning vs non-reasoning) and prompting style (zero-shot vs few-shot).


In [19]:
# --------------------------------------------------
# 🧩 Comparing GPT Models: Reasoning vs Non-Reasoning
# --------------------------------------------------

from openai import OpenAI
client = OpenAI()

system_prompt = """
You are an experienced economic analyst.
"""

question = """What is the impact of inflation on real wages? Respond in a concise manner."""

prompt_few = f"""
Use this exact format to answer the question:
Example 1:
{{
  "possible_explanation": "Wage catch-up effect",
  "mechanism": "Workers negotiate higher nominal wages to preserve purchasing power as prices rise.",
  "impact_on_wages": "Nominal wages increase roughly in line with inflation, keeping real wages stable in the short run.",
  "time_frame": "Short to medium run",
  "economic_context": "Inflationary periods with strong labor bargaining power or cost-of-living adjustments."
}}

Example 2:
{{
  "possible_explanation": "Real wage erosion",
  "mechanism": "When nominal wages lag behind price growth, workers lose purchasing power.",
  "impact_on_wages": "Real wages decline despite nominal wage increases, reducing workers’ living standards.",
  "time_frame": "Immediate term",
  "economic_context": "High inflation environments with weak wage indexation or rigid labor contracts."
}}

Now answer:
Q: {question}
"""


# --------------------------------------------------
# MODEL 1: GPT-4-turbo  → Non-Reasoning
# --------------------------------------------------
print("\n==============================")
print("MODEL 1: GPT-4-turbo (Non-Reasoning)")
print("==============================\n")

# Zero-shot
answer_nonreasoning_zero_shot = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}
    ],
)
print("Zero-Shot Prompting (no examples):\n")
print("A:", answer_nonreasoning_zero_shot.choices[0].message.content, "\n")



MODEL 1: GPT-4-turbo (Non-Reasoning)

Zero-Shot Prompting (no examples):

A: Inflation erodes the purchasing power of money, which includes wages. When inflation rates are high, unless nominal wages increase at the same rate or faster, real wages (which are adjusted for inflation) decline. This decrease in real wages means that even if nominal wages rise, the actual purchasing power of these wages falls, reducing consumers' ability to buy goods and services at previous levels. Therefore, sustained inflation without corresponding wage adjustments can lead to a decrease in the standard of living for workers. 



In [20]:
# --------------------------------------------------
# MODEL 2: GPT-5  → Reasoning
# --------------------------------------------------
print("\n==============================")
print("MODEL 2: GPT-5 (Reasoning-Tuned)")
print("==============================\n")

# Zero-shot
answer_reasoning_zero_shot = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}
    ],
)
print("Zero-Shot Prompting (no examples):\n")
print("A:", answer_reasoning_zero_shot.choices[0].message.content, "\n")



MODEL 2: GPT-5 (Reasoning-Tuned)

Zero-Shot Prompting (no examples):

A: - Real wages reflect purchasing power; they rise only if nominal wages grow faster than prices.
- Approximate rule: real wage growth ≈ nominal wage growth − inflation.
- If inflation outpaces wage gains, real wages fall; if wages outpace inflation, they rise.
- Unexpected inflation often reduces real wages in the short run due to sticky wages and contract/bargaining lags; indexation or strong labor bargaining can offset this.
- Workers with weaker bargaining power or fixed/regulated pay (e.g., minimum wage, some public sector) are typically hit harder. 



In [21]:
print("\n==============================")
print("MODEL 1: GPT-4-turbo (Non-Reasoning)")
print("==============================\n")

# Few-shot
answer_nonreasoning_few_shot = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt_few}
    ],
)
print("Few-Shot Prompting (with examples):\n")
print("A:", answer_nonreasoning_few_shot.choices[0].message.content, "\n")


MODEL 1: GPT-4-turbo (Non-Reasoning)

Few-Shot Prompting (with examples):

A: {
  "possible_explanation": "Inflation and real wage dynamics",
  "mechanism": "As general price levels increase, nominal wages may not adjust immediately or sufficiently, causing real wages to decline.",
  "impact_on_wages": "Real wages generally diminish as they do not keep pace with the inflation rate.",
  "time_frame": "Short run",
  "economic_context": "Inflationary environments with delayed or inadequate wage adjustments."
} 



In [22]:
print("\n==============================")
print("MODEL 2: GPT-5 (Reasoning-Tuned)")
print("==============================\n")

# Few-shot
answer_reasoning_few_shot = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt_few}
    ],
)
print("Few-Shot Prompting (with examples):\n")
print("A:", answer_reasoning_few_shot.choices[0].message.content, "\n")


MODEL 2: GPT-5 (Reasoning-Tuned)

Few-Shot Prompting (with examples):

A: Example 1:
{
  "possible_explanation": "Real wage erosion",
  "mechanism": "Prices rise faster than nominal pay, reducing purchasing power.",
  "impact_on_wages": "Real wages fall until wage growth catches up.",
  "time_frame": "Immediate term",
  "economic_context": "Inflation spikes with weak indexation or slack labor markets."
}

Example 2:
{
  "possible_explanation": "Wage catch-up effect",
  "mechanism": "Cost-of-living adjustments and bargaining raise nominal pay to track prices.",
  "impact_on_wages": "Real wages stabilize after a lag as nominal wages rise with inflation.",
  "time_frame": "Short to medium run",
  "economic_context": "Persistent inflation with strong labor bargaining power or formal indexation."
} 
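
The four runs above repeat the same call with a different model or prompt each time. One way to collapse them into a single loop is sketched below. `run_grid` is a hypothetical helper, and `fake_ask` is an offline stand-in for the API call so that the example runs without a key; the few-shot prompt string is abbreviated.

```python
from itertools import product
from typing import Callable


def run_grid(
    ask: Callable[[str, list], str],
    models: list[str],
    prompts: dict[str, str],
    system: str,
) -> dict:
    """Call `ask` once per (model, prompting style) pair and collect the answers."""
    results = {}
    for model, (style, prompt) in product(models, prompts.items()):
        messages = [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ]
        results[(model, style)] = ask(model, messages)
    return results


def fake_ask(model: str, messages: list) -> str:
    """Offline stand-in for the API call; echoes the model name and the start of the prompt."""
    return f"[{model}] {messages[-1]['content'][:20]}"


grid = run_grid(
    fake_ask,
    models=["gpt-4-turbo", "gpt-5"],
    prompts={
        "zero_shot": "What is the impact of inflation on real wages?",
        "few_shot": "Use this exact format to answer the question: (examples omitted)",
    },
    system="You are an experienced economic analyst.",
)
for key in grid:
    print(key)
```

To run it against the API, `fake_ask` could be replaced with something like `lambda model, messages: get_response(client, messages, model)`, reusing the helper defined earlier in this notebook.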




## 4. Evaluation Framework

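We use an LLM as a judge: a third model grades the two few-shot answers against a 0 to 4 rubric and returns its verdict as JSON.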


In [23]:
import json

# --------------------------------------------------
# ⚖️ LLM-as-a-Judge Evaluation Script
# --------------------------------------------------

# Define evaluation scale (0–4)
# 0 = completely incorrect / irrelevant
# 1 = partially correct but weak or inaccurate reasoning
# 2 = fair factual accuracy, minimal reasoning
# 3 = accurate and somewhat reasoned
# 4 = highly accurate, clear causal explanation, correct logic

evaluation_prompt = f"""
You are an impartial economics teacher grading two student answers to the same question.

Question:
{question}

Answer A (non-reasoning model):
{answer_nonreasoning_few_shot.choices[0].message.content}

Answer B (reasoning model):
{answer_reasoning_few_shot.choices[0].message.content}

Evaluate both answers on accuracy and reasoning quality on a 0–4 scale:
- 0 = completely incorrect or irrelevant
- 1 = partially correct, but flawed
- 2 = fair factual accuracy, limited reasoning
- 3 = mostly correct, some reasoning
- 4 = fully accurate and clearly reasoned, ability to see the interdependencies between variables.

Return your evaluation as a JSON object in this exact format:
{{
  "Answer A Score": <0-4>,
  "Answer B Score": <0-4>,
  "Better Answer": "A" or "B",
  "Explanation": "Why the better answer is more accurate or reasoned"
}}
"""

# Choose an evaluator model (this run uses gpt-5-mini as the judge)
evaluation = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[
        {"role": "system", "content": "You are an impartial LLM evaluator for economics-related answers."},
        {"role": "user", "content": evaluation_prompt}
    ],
)

# Parse and display the evaluation
response_text = evaluation.choices[0].message.content

# Optional: try to parse JSON for structured output
try:
    result = json.loads(response_text)
    print("\nParsed JSON Result:")
    print(json.dumps(result, indent=2))
except json.JSONDecodeError:
    print("\nNote: Could not parse JSON, model may have returned free text instead.")
    print(response_text)  # fall back to showing the raw reply



Parsed JSON Result:
{
  "Answer A Score": 3,
  "Answer B Score": 4,
  "Better Answer": "B",
  "Explanation": "Both answers correctly state that inflation reduces real wages when nominal pay does not keep up. Answer A is accurate for the short run but limited: it asserts real-wage decline without recognizing contexts where nominal wages later adjust. Answer B is more fully reasoned \u2014 it distinguishes immediate real-wage erosion from a later wage catch-up, identifies mechanisms (weak indexation vs. strong bargaining), and notes different time frames, showing awareness of interdependencies between inflation, wage-setting, and labor-market conditions."
}
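
The judge returned clean JSON in this run, but models sometimes wrap JSON in a code fence or add a sentence before it. The sketch below is one tolerant way to parse such a reply and to check that the scores fall on the rubric's scale. `extract_json` and `check_scores` are hypothetical helpers, and the `raw` string is a made-up reply for demonstration only.

```python
import json


def extract_json(text: str) -> dict:
    """Parse the span from the first '{' to the last '}' in `text`."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start : end + 1])


def check_scores(result: dict, keys=("Answer A Score", "Answer B Score")) -> dict:
    """Ensure each score is an integer on the 0 to 4 scale used by the rubric."""
    for key in keys:
        score = result.get(key)
        if not isinstance(score, int) or not 0 <= score <= 4:
            raise ValueError(f"{key} must be an integer from 0 to 4, got {score!r}")
    return result


# Made-up judge reply wrapped in a code fence, for demonstration only.
fence = "`" * 3
raw = (
    f"Here is my evaluation:\n{fence}json\n"
    '{"Answer A Score": 3, "Answer B Score": 4, "Better Answer": "B"}'
    f"\n{fence}"
)
print(check_scores(extract_json(raw)))
```

This approach is intentionally simple: it would misfire on replies containing several separate JSON objects, so it suits a single-object verdict like the one requested here.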


### üèóÔ∏è Activity #2:

Evaluate different prompting strategies using your own example.

## Saving results

In [24]:
# Create markdown content
markdown_content = f"""
# 🧠 Reasoning Model Answer
### Question:
{question}

### Model Used:
`gpt-5` (Reasoning-tuned)

### Response:
{answer_reasoning_few_shot.choices[0].message.content}

---

*This answer was generated by a reasoning model to illustrate step-by-step economic reasoning.*
"""

output_path='./results.md'
# Save to file
with open(output_path, "w", encoding="utf-8") as f:
    f.write(markdown_content)

print(f"✅ Reasoning model answer saved to: {os.path.abspath(output_path)}")

✅ Reasoning model answer saved to: /Users/lisaryan/Development/sandbox/ai-makerspace/ai-onramp/code/AIEO1/Session_01_LLM_APIs_&_AI-Assisted_Development/results.md
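
The cell above saves a single answer. To keep several runs in one report, the answers can be collected in a dictionary and rendered together. The sketch below assumes that layout; `render_results` is a hypothetical helper, the answer strings are placeholders, and the file is written to a temporary directory so the example does not overwrite anything.

```python
import tempfile
from pathlib import Path


def render_results(question: str, answers: dict[str, str]) -> str:
    """Build one markdown report with a section per labelled answer."""
    lines = ["# Prompting Experiment Results", "", "### Question:", question.strip(), ""]
    for label, answer in answers.items():
        lines += [f"### {label}", "", answer.strip(), ""]
    return "\n".join(lines)


report = render_results(
    "What is the impact of inflation on real wages?",
    {
        "gpt-4-turbo, few-shot": "placeholder answer text",
        "gpt-5, few-shot": "placeholder answer text",
    },
)

# Written to a temporary directory here; in the notebook you would pass your own path.
output_file = Path(tempfile.mkdtemp()) / "results.md"
output_file.write_text(report, encoding="utf-8")
print(output_file.read_text(encoding="utf-8").splitlines()[0])
```

In the notebook, the placeholder strings would be replaced with values such as `answer_reasoning_few_shot.choices[0].message.content`.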


## Conclusion

- **Few-shot prompts** improved structure in this run: given two JSON examples, both models answered in the requested JSON format, although GPT-5 also echoed the `Example 1:` / `Example 2:` labels.  
- **Reasoning models** (here, GPT-5) gave the more complete causal explanation; the LLM judge scored the GPT-5 few-shot answer 4 and the GPT-4-turbo answer 3.  
- **Non-reasoning models** (here, GPT-4-turbo) gave accurate but shallower answers, which may be enough for retrieval or summarization tasks.  
- Future work could add **RAG pipelines** with real-time macroeconomic data or integrate with financial dashboards for live LLM reasoning visualization.