# Analyzing Data and Interpreting Images with OpenAI's o1 Reasoning Model vs. GPT

## Introduction
OpenAI's o1 reasoning model is designed for complex problem-solving, data analysis, and image interpretation by simulating a multi-step thought process before generating responses. Unlike traditional GPT models, which produce output in a single pass, reasoning models use internal **reasoning tokens** to explore multiple approaches before finalizing an answer.
<p align="center">
    <img src="https://cdn.openai.com/API/images/guides/reasoning_tokens.png" alt="Reasoning Tokens" width="600">
</p>  

*Source: [OpenAI Reasoning Models Guide](https://platform.openai.com/docs/guides/reasoning)*

**Key Differences: o1 Reasoning Model vs. GPT**
- Multi-step reasoning: o1 evaluates different solutions before selecting the best response.
- Deeper analytical capabilities: Optimized for complex data interpretation tasks.
- Context-aware image analysis: Provides more structured and insightful image descriptions.
- Reasoning effort control: Users can adjust the depth of reasoning with the `reasoning_effort` parameter (`low`, `medium`, `high`).


For more details, refer to the [OpenAI Reasoning Models Guide](https://platform.openai.com/docs/guides/reasoning).


## Purchase and Store API Key

You need an [OpenAI](https://openai.com/) API key, with usage billed to your OpenAI account. Store the key securely, for example in **AWS Secrets Manager**.

- **Key Name:** `api_key`  
- **Key Value:** `<your OpenAI API key>`  
- **Secret Name:** `openai`  
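
As a rough illustration of how the key/value pair above is read back: Secrets Manager returns key/value secrets as a JSON string, so the stored secret can be parsed with `json.loads`. The string below is a stand-in for the real `SecretString`, and `sk-example` is a made-up placeholder, not a real key.

```python
import json

# Placeholder for the SecretString returned for a key/value secret.
# "sk-example" is a made-up value used only for illustration.
sample_secret_string = '{"api_key": "sk-example"}'

secret = json.loads(sample_secret_string)  # parse the JSON text into a dict
api_key = secret["api_key"]                # look up the value by its key name

print(sorted(secret))  # lists the key names stored in the secret
```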

## Install Python Libraries

- **openai**: Used to call `o1` and `GPT` models for data analysis and image interpretation.

In [1]:
pip install openai -q

Note: you may need to restart the kernel to use updated packages.


## Import Required Libraries

The following libraries are used in this notebook:

- **boto3**: AWS SDK for Python, used to interact with AWS services.
- **json**: Standard Python library for handling JSON data.
- **IPython.display**: Provides tools to display images, Markdown content, and other rich media in Jupyter Notebook.
- **openai**: Used to call `o1` and `GPT` models for data analysis and image interpretation.
- **pandas**: A powerful library for data manipulation and analysis.
- **pprint**: Pretty prints data structures for better readability.

In [2]:
import boto3
import json
from IPython.display import display, Image, Markdown
from openai import OpenAI
import pandas as pd
from pprint import pprint

## Retrieve API Keys Securely from AWS Secrets Manager

The following function, `get_secret()`, retrieves a secret from **AWS Secrets Manager**. This is a secure way to store and access sensitive credentials, such as API keys, without hardcoding them into the script.

In [3]:
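from botocore.exceptions import ClientError

def get_secret(secret_name):
    """Fetch a secret from AWS Secrets Manager and return it as a dict."""
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )

    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError as e:
        # Re-raise AWS errors (missing secret, access denied, ...) unchanged
        raise e

    # The secret is stored as a JSON string; parse it into a dict
    secret = get_secret_value_response['SecretString']

    return json.loads(secret)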

## Initialize OpenAI Client

The following code initializes the OpenAI client using a securely stored API key retrieved from AWS Secrets Manager.

In [4]:
client = OpenAI(api_key=get_secret('openai')['api_key'])

## Load and Analyze the West African GDP Dataset

This notebook uses the **West African GDP dataset (`West_afr_GDP.csv`)**, which contains annual GDP figures in 16 country columns. Each row is a year, listed most recent first, and each country column holds GDP as a string such as `$17.14B` (billions of US dollars).

Because the values are stored as text rather than numbers, they must be cleaned before any numeric analysis. The `Unnamed: 0` column is a leftover row index and carries no information.

In [5]:
df = pd.read_csv('West_afr_GDP.csv')
data_json = df.to_json(orient="records")
df.head()

| Unnamed: 0.1 | Unnamed: 0 | Date | BRE | TOGO | BFA | CPV | GMB | GHA | GUE | GUEB | LIB | MALI | MTA | NIG | NGR | SEN | SLN | IVC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2021 | $17.14B | $8.41B | $19.74B | $1.94B | $2.04B | $77.59B | $16.09B | $1.64B | $3.51B | $19.14B | $10.00B | $14.92B | $440.83B | $27.63B | $4.04B | $70.04B |
| 1 | 1 | 2020 | $15.65B | $7.57B | $17.93B | $1.70B | $1.81B | $70.04B | $14.18B | $1.43B | $3.04B | $17.47B | $8.41B | $13.74B | $432.20B | $24.49B | $4.06B | $61.35B |
| 2 | 2 | 2019 | $14.39B | $7.22B | $16.18B | $1.98B | $1.81B | $68.34B | $13.44B | $1.44B | $3.32B | $17.28B | $8.07B | $12.92B | $448.12B | $23.40B | $4.08B | $58.54B |
| 3 | 3 | 2018 | $14.26B | $7.11B | $15.89B | $1.97B | $1.67B | $67.30B | $11.86B | $1.50B | $3.42B | $17.07B | $7.47B | $12.81B | $421.74B | $23.12B | $4.09B | $58.01B |
| 4 | 4 | 2017 | $12.70B | $6.40B | $14.11B | $1.77B | $1.50B | $60.41B | $10.32B | $1.35B | $3.39B | $15.37B | $6.80B | $11.19B | $375.75B | $21.00B | $3.72B | $51.59B |
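
The GDP values above are strings, so they need converting before any arithmetic. The following is a minimal sketch of one way to do that, assuming every value follows the `$<number>B` pattern seen in these rows; `parse_gdp` is a hypothetical helper name, not part of the original notebook.

```python
def parse_gdp(value: str) -> float:
    """Convert a string such as '$17.14B' to billions of USD as a float."""
    cleaned = value.strip()
    # Assumption: every value starts with '$' and ends with 'B' (billions).
    if not (cleaned.startswith("$") and cleaned.endswith("B")):
        raise ValueError(f"unexpected GDP format: {value!r}")
    return float(cleaned[1:-1])

# Values copied from the 2021 row shown above.
row_2021 = {"NGR": "$440.83B", "GHA": "$77.59B", "CPV": "$1.94B"}
parsed = {country: parse_gdp(text) for country, text in row_2021.items()}
print(parsed)
```

With pandas, the same helper could be applied column by column, for example `df[col] = df[col].map(parse_gdp)` for each country column.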


## Generate Data Analysis Prompt for OpenAI Model

To investigate how the GDP of West African countries has **grown or declined over time**, we generate a prompt for the OpenAI model that embeds the full dataset as JSON. The model will analyze the dataset and generate insights, including **Python code for visualizations**.


In [6]:
data_prompt = f"What are some trends in west African countries GDPs growth or decline and how have they changed over time. Provide Python-generated charts to support your conclusion. Data: {data_json}"
# print(data_prompt)
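
Embedding the whole dataset makes the prompt large: the usage output recorded later in this notebook reports roughly 9,200 prompt tokens. One possible way to trim the payload, sketched here under the assumption that the leftover `Unnamed: 0` index column is not needed by the model, is to drop that key before serializing. `compact_records` is a hypothetical helper, and the two records are copied from the first rows of the table above.

```python
import json

# Two records in the shape produced by df.to_json(orient="records").
records = [
    {"Unnamed: 0": 0, "Date": 2021, "NGR": "$440.83B", "GHA": "$77.59B"},
    {"Unnamed: 0": 1, "Date": 2020, "NGR": "$432.20B", "GHA": "$70.04B"},
]

def compact_records(rows, drop_keys=("Unnamed: 0",)):
    """Drop unneeded keys and serialize without extra whitespace."""
    trimmed = [{k: v for k, v in row.items() if k not in drop_keys} for row in rows]
    return json.dumps(trimmed, separators=(",", ":"))

original_json = json.dumps(records, separators=(",", ":"))
compact_json = compact_records(records)
print(len(original_json), "->", len(compact_json))
```

The saving per record is small, but it repeats for every row in the dataset.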

## Define a Function to Get Assistance from OpenAI GPT-4o

The following function, `openai_gpt_help()`, sends a prompt to OpenAI's **GPT-4o model** and returns a response. It also prints the number of tokens used in the request.

In [7]:
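def openai_gpt_help(prompt):
    """Send a single user prompt to GPT-4o and return the text of the reply."""
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model='gpt-4o',
        messages=messages,
        temperature=0  # favor consistent output over variety
    )
    token_usage = response.usage

    # Report prompt, completion, and total token counts for this request
    pprint(f"Tokens used: {token_usage}")

    return response.choices[0].message.content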

In [8]:
gpt_result = openai_gpt_help(prompt=data_prompt)

('Tokens used: CompletionUsage(completion_tokens=855, prompt_tokens=9211, '
 'total_tokens=10066, '
 'completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, '
 'audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), '
 'prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))')


In [9]:
display(Markdown(gpt_result))

To analyze the trends in GDP growth or decline for West African countries over time, we can use the provided data to generate visualizations. We'll focus on a few key countries to keep the analysis concise. Let's use Python with libraries like Pandas and Matplotlib to create these visualizations.

First, we need to preprocess the data to convert the GDP values from strings to numeric values. Then, we'll plot the GDP trends for selected countries.

Here's the Python code to achieve this:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Data provided
data = [
    {"Unnamed: 0":0,"Date":2021,"BRE":"$17.14B","TOGO":"$8.41B","BFA":"$19.74B","CPV":"$1.94B","GMB":"$2.04B","GHA":"$77.59B","GUE":"$16.09B","GUEB":"$1.64B","LIB":"$3.51B","MALI":"$19.14B","MTA":"$10.00B","NIG":"$14.92B","NGR":"$440.83B","SEN":"$27.63B","SLN":"$4.04B","IVC":"$70.04B"},
    {"Unnamed: 0":1,"Date":2020,"BRE":"$15.65B","TOGO":"$7.57B","BFA":"$17.93B","CPV":"$1.70B","GMB":"$1.81B","GHA":"$70.04B","GUE":"$14.18B","GUEB":"$1.43B","LIB":"$3.04B","MALI":"$17.47B","MTA":"$8.41B","NIG":"$13.74B","NGR":"$432.20B","SEN":"$24.49B","SLN":"$4.06B","IVC":"$61.35B"},
    # ... (rest of the data)
]

# Convert data to DataFrame
df = pd.DataFrame(data)

# Remove the dollar sign and 'B', then convert to float
for column in df.columns[2:]:
    df[column] = df[column].str.replace('$', '').str.replace('B', '').astype(float)

# Select countries for analysis
countries = ['NGR', 'GHA', 'IVC', 'SEN', 'MALI']

# Plot GDP trends
plt.figure(figsize=(12, 8))
for country in countries:
    plt.plot(df['Date'], df[country], label=country)

plt.title('GDP Trends of Selected West African Countries (1960-2021)')
plt.xlabel('Year')
plt.ylabel('GDP (in Billion USD)')
plt.legend()
plt.grid(True)
plt.show()
```

### Analysis

1. **Nigeria (NGR)**: Nigeria has the largest GDP among the selected countries, showing significant growth over the years, especially from the early 2000s onwards. However, there are fluctuations, particularly noticeable around 2014-2016, likely due to economic challenges such as oil price volatility.

2. **Ghana (GHA)**: Ghana's GDP has shown a steady increase, with notable growth from the mid-2000s. This growth can be attributed to economic reforms and the discovery of oil.

3. **Ivory Coast (IVC)**: Ivory Coast has experienced consistent GDP growth, particularly after the political stabilization post-2011. The economy has been bolstered by agriculture and infrastructure development.

4. **Senegal (SEN)**: Senegal's GDP growth has been steady, with recent years showing an upward trend due to investments in infrastructure and energy.

5. **Mali (MALI)**: Mali's GDP has grown over the years, but the growth rate is more moderate compared to other countries, likely due to political instability and security challenges.

These visualizations and analyses provide insights into the economic trajectories of these West African countries, highlighting both growth and challenges over the decades.

## Define a Function to Get Assistance from OpenAI o1 Model  

The following function, `openai_o_help()`, sends a prompt to OpenAI's **o1 reasoning model** and returns a response.  

### Key Differences Between o1 and GPT Models:
- **Reasoning Effort**: The o1 model allows users to control reasoning depth using `reasoning_effort` (`low`, `medium`, `high`).  
- **No Temperature Parameter**: Unlike GPT models, **o1 does not support `temperature`**.  
- **Developer Messages Replace System Messages**:  
  - Starting with `o1-2024-12-17`, **developer messages** replace **system messages** to align with chain-of-command behavior.  
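
The parameter differences listed above can be sketched as a small helper that assembles the request arguments for either model family. `build_request_kwargs` is a hypothetical name, and treating any model whose name starts with `o1` as a reasoning model is an assumption made only for this illustration.

```python
def build_request_kwargs(model: str, prompt: str, reasoning_effort: str = "medium") -> dict:
    """Assemble keyword arguments for client.chat.completions.create().

    Hypothetical helper: the "starts with o1" check is an assumption
    made for this sketch, not an official way to detect reasoning models.
    """
    kwargs = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    if model.startswith("o1"):
        if reasoning_effort not in ("low", "medium", "high"):
            raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
        kwargs["reasoning_effort"] = reasoning_effort  # o1: no temperature
    else:
        kwargs["temperature"] = 0  # GPT models accept temperature
    return kwargs

o1_kwargs = build_request_kwargs("o1", "Summarize the data.", reasoning_effort="high")
gpt_kwargs = build_request_kwargs("gpt-4o", "Summarize the data.")
print(sorted(o1_kwargs))
print(sorted(gpt_kwargs))
```

The resulting dictionary could then be passed along as `client.chat.completions.create(**o1_kwargs)`.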

### Best Practices for Prompting o1  
- **Keep prompts simple and direct.**  
- **Avoid chain-of-thought prompts.** o1 reasons internally, so step-by-step instructions aren't needed.  
- **Use delimiters for clarity.** Use Markdown, XML tags, or section titles.  
- **Try zero-shot first.** If needed, add few-shot examples that closely match your goal.  
- **Be explicit.** Clearly define success criteria and constraints.  
- **Markdown is disabled by default.** To enable it, include the string `Formatting re-enabled` on the first line of your developer message.  

Source: [OpenAI Reasoning Models Best Practices Guide](https://platform.openai.com/docs/guides/reasoning-best-practices).  
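
A few of the practices above (delimiters, explicit instructions, re-enabling Markdown) might be combined as in the following sketch. `build_o1_messages` is a hypothetical helper, and the `<question>` and `<data>` tag names are arbitrary choices for illustration rather than anything the API requires.

```python
def build_o1_messages(question: str, data_json: str, markdown: bool = True) -> list:
    """Build a message list with a developer message and a delimited user prompt.

    Hypothetical helper; the tag names and instruction text are assumptions.
    """
    instructions = "Answer using only the data provided by the user."
    if markdown:
        # The marker goes on the first line of the developer message.
        instructions = "Formatting re-enabled\n" + instructions
    user_content = f"<question>\n{question}\n</question>\n<data>\n{data_json}\n</data>"
    return [
        {"role": "developer", "content": instructions},
        {"role": "user", "content": user_content},
    ]

messages = build_o1_messages("Which country grew fastest?", '[{"Date": 2021}]')
print(messages[0]["content"].splitlines()[0])
```

Keeping the question and the data in separate tagged sections makes it clear to the model where the instructions end and the raw records begin.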


In [10]:
def openai_o_help(prompt):
    """Send a single user prompt to the o1 model and return the text of the reply."""
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model='o1',
        reasoning_effort="high",  # low, medium or high
        messages=messages
    )
    token_usage = response.usage

    # Report token counts, including hidden reasoning tokens
    pprint(f"Tokens used: {token_usage}")

    return response.choices[0].message.content

In [11]:
o1_result = openai_o_help(prompt=data_prompt)

('Tokens used: CompletionUsage(completion_tokens=3479, prompt_tokens=9210, '
 'total_tokens=12689, '
 'completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, '
 'audio_tokens=0, reasoning_tokens=1856, rejected_prediction_tokens=0), '
 'prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))')
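
To make the two usage printouts easier to compare, the sketch below works through the arithmetic using the numbers recorded in this notebook. Reasoning tokens are counted inside `completion_tokens`, so subtracting them gives the tokens that appear in the visible answer. The dictionaries are hand-copied stand-ins for the `usage` objects, and the helper names are hypothetical.

```python
# Token counts copied from the two usage printouts in this notebook.
gpt_usage = {"completion_tokens": 855, "prompt_tokens": 9211, "reasoning_tokens": 0}
o1_usage = {"completion_tokens": 3479, "prompt_tokens": 9210, "reasoning_tokens": 1856}

def visible_tokens(usage: dict) -> int:
    """Completion tokens that ended up in the returned text."""
    return usage["completion_tokens"] - usage["reasoning_tokens"]

def reasoning_share(usage: dict) -> float:
    """Fraction of completion tokens spent on hidden reasoning."""
    return usage["reasoning_tokens"] / usage["completion_tokens"]

print(visible_tokens(gpt_usage))            # 855
print(visible_tokens(o1_usage))             # 1623
print(f"{reasoning_share(o1_usage):.0%}")   # 53%
```

On a live response, the same figures are available as `response.usage.completion_tokens` and `response.usage.completion_tokens_details.reasoning_tokens`.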


In [12]:
print(o1_result)

Below is a sample Python workflow demonstrating how you might load the data into a pandas DataFrame, clean and convert the GDP values into numeric form, and then plot some of the countries’ GDP over time to visualize trends. After the code, you will find a brief discussion of notable GDP trends in West Africa based on these data.

--------------------------------------------------------------------------------
1) Data Cleaning and Visualization in Python
--------------------------------------------------------------------------------

Note: Make sure you have installed pandas (pip install pandas) and matplotlib (pip install matplotlib). Then run the code below in, for example, a Jupyter notebook or a Python script.

--------------------------------------------------------------------------------
import pandas as pd
import matplotlib.pyplot as plt
import re

#---------------------------------------------------
# 1. Load the data into a DataFrame
#----------------------------------------

### Reflection

The reasoning model gave an accurate description of the dataset and identified accurate trends in the data that support analysis of the fastest-growing countries in West Africa. It also gave potential reasons and drivers for the growth or decline of GDP.
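
Claims about which countries grew fastest can be spot-checked directly against the data. The following is a small sketch that uses only the 2017 and 2021 values visible in the `df.head()` output for four of the country columns; it is an illustrative check over a short window, not a reproduction of the model's analysis.

```python
# GDP in billions of USD, copied from the 2017 and 2021 rows of df.head().
gdp_2017 = {"NGR": 375.75, "GHA": 60.41, "SEN": 21.00, "GUE": 10.32}
gdp_2021 = {"NGR": 440.83, "GHA": 77.59, "SEN": 27.63, "GUE": 16.09}

def total_growth(start: float, end: float) -> float:
    """Relative change between two values, e.g. 0.25 for +25%."""
    return (end - start) / start

growth = {c: total_growth(gdp_2017[c], gdp_2021[c]) for c in gdp_2017}
ranked = sorted(growth, key=growth.get, reverse=True)  # fastest growth first

for country in ranked:
    print(f"{country}: {growth[country]:+.1%}")
```

Running the same calculation over the full dataset, after converting the string values to numbers, would give a firmer basis for judging the model's conclusions.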



## References  
- **OpenAI Reasoning Models Guide**: [OpenAI](https://platform.openai.com/docs/guides/reasoning)  
- **OpenAI Reasoning Models Best Practices Guide**: [OpenAI](https://platform.openai.com/docs/guides/reasoning-best-practices)  
- **Colin Jarvis. “Reasoning with O1.” DeepLearning.AI.** Accessed February 14, 2025. [DeepLearning.AI](https://www.deeplearning.ai/short-courses/reasoning-with-o1/)  