### Using the OpenAI Library to Programmatically Access GPT-4.1-nano!

In order to get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [1]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]

Please enter your OpenAI API Key:  ········
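
If the key is already exported in your shell, there is no need to type it again. Here is a minimal sketch of that idea, assuming a hypothetical helper called `ensure_api_key` (it is not part of the OpenAI library) that only prompts when the environment variable is missing:

```python
import os
import getpass


def ensure_api_key(var_name: str = "OPENAI_API_KEY", prompt_fn=getpass.getpass) -> str:
    """Return the key from the environment, prompting only if it is missing.

    `ensure_api_key` and `prompt_fn` are hypothetical names used for illustration.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Nothing set yet: ask the user without echoing the input.
        key = prompt_fn(f"Please enter your {var_name}: ").strip()
        if not key:
            raise ValueError(f"{var_name} was not provided")
        os.environ[var_name] = key
    return key
```

Calling `ensure_api_key()` once at the top of the notebook should be enough, since the `OpenAI()` client reads `OPENAI_API_KEY` from the environment by default.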


### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's create a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages)
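
To make the roles concrete, here is a rough sketch of how they map onto message dictionaries. The `validate_messages` helper is hypothetical (it is not part of the OpenAI library), and it assumes plain-string content and only the three roles listed above:

```python
# The three roles named in this notebook; the API may accept others.
ALLOWED_ROLES = {"developer", "assistant", "user"}


def validate_messages(messages: list) -> list:
    """Check that `messages` is a list of {"role", "content"} dictionaries.

    Hypothetical helper for illustration; assumes `content` is a plain string.
    """
    if not isinstance(messages, list):
        raise TypeError("messages must be a list of dictionaries")
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            raise TypeError(f"message {i} is not a dictionary")
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has an unexpected role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {i} needs a string 'content'")
    return messages


conversation = validate_messages([
    {"role": "developer", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is an API?"},
    {"role": "assistant", "content": "A contract for how programs talk to each other."},
])
print([m["role"] for m in conversation])
```

A check like this can catch the most common mistake, passing a bare string instead of a list, before the request ever leaves your machine.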

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects a list of message objects - so we'll be sure to do that!

In [2]:
from openai import OpenAI

client = OpenAI()

In [4]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-C0Zemtc8LrALLPpg1GdZ72f6kXwLS', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Great question! LangChain and LlamaIndex (formerly known as GPT Index) are both popular tools in the AI/LLM ecosystem, but they serve different purposes and have distinct features. Here's an overview of their differences:\n\n**1. Purpose and Use Cases:**\n\n- **LangChain:**\n  - Primarily a framework for building, managing, and deploying applications that leverage large language models (LLMs).\n  - Focuses on creating chains, prompts, and workflows to perform complex NLP tasks like chatbots, question-answering systems, or automation pipelines.\n  - Provides abstractions for chaining together models, prompts, memory, and tools.\n\n- **LlamaIndex (GPT Index):**\n  - Designed to facilitate the integration of external data sources into LLM workflows.\n  - Focuses on indexing, storing, and retrieving large amounts of unstructured da

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages to avoid a few extra characters of code!

##### Helper Functions

In [5]:
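from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of role/content messages and return the full response payload.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # This notebook sends system-style instructions with the `developer` role.
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook.
    display(Markdown(response.choices[0].message.content))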
### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [6]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate the development of large language model (LLM) applications, but they focus on different aspects and offer different capabilities. Here's a comparative overview:

**LangChain:**

- **Primary Focus:** Building composable, multi-step applications with LLMs, including chatbots, agents, and complex pipelines.
- **Key Features:**
  - Modular components such as prompts, memory, tools, and chains.
  - Supports orchestration of multiple LLM calls and integrations.
  - Built-in support for reasoning, chain-of-thought, and multi-turn conversations.
  - Extensible framework for building agents that can interact with external APIs and tools.
  - Supports various language models and deployment options.
- **Use Cases:** Conversational AI, autonomous agents, complex workflows that combine LLMs with external tools.

**LlamaIndex (GPT Index):**

- **Primary Focus:** Efficiently indexing and querying large external data sources (like documents, files, or databases) using LLMs.
- **Key Features:**
  - Data ingestion pipelines to process and index various data formats.
  - Rich indexing structures (like vectors, trees) for fast retrieval.
  - Query engines that leverage LLMs to answer questions based on the indexed data.
  - Designed to simplify building custom knowledge bases and document-aware applications.
  - Supports multiple storage and retrieval backends.
- **Use Cases:** Building question-answering systems over large document collections, knowledge bases, or internal corpora.

---

### Summary

| Aspect               | **LangChain**                                   | **LlamaIndex (GPT Index)**                         |
|----------------------|------------------------------------------------|--------------------------------------------------|
| **Main Purpose**     | Building complex LLM applications and workflows | Indexing and querying large external data sources |
| **Focus Area**       | Orchestrating multi-step interactions, agents | Efficient retrieval over big data collections  |
| **Key Strengths**    | Modular chains, agents, tool integration      | Data ingestion, indexing, fast querying        |
| **Use Cases**        | Chatbots, autonomous agents, AI workflows     | Knowledge bases, document QA, data-driven apps |

**In essence:**  
- Use **LangChain** if you're building applications that require orchestration, multi-step reasoning, or integration with multiple tools and APIs.  
- Use **LlamaIndex** if your goal is to enable LLM-based querying and navigation over large document datasets or knowledge bases.

They can also be complementary; for example, you might use LlamaIndex to manage and retrieve relevant documents and LangChain to orchestrate interactions, reasoning, and external tool calls based on that data.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [7]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? At this point, I don't care if you give me a bucket of whatever! Just pick one — crushed or cubed — and make it quick! I'm starving and irate, and I need ice now!

Let's try that same prompt again, but modify only our `developer` (system) prompt!

In [8]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think crushed ice is fantastic for a refreshing drink because it chills quickly and feels so satisfying to bite! Cubed ice, on the other hand, looks sleek and melts more slowly, making it perfect for sipping pretty much any beverage. Both have their charms—what's your preference?
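
Swapping the instruction while keeping the user message fixed is a handy way to compare personas. Here is one possible sketch, using a hypothetical `with_developer_message` helper (not part of the OpenAI library) that returns a new list instead of mutating the original:

```python
def with_developer_message(messages: list, instruction: str) -> list:
    """Return a copy of `messages` with a single developer message set to `instruction`.

    Hypothetical helper for illustration; the input list is left untouched.
    """
    # Drop any existing developer message, then put the new one first.
    rest = [m for m in messages if m.get("role") != "developer"]
    return [{"role": "developer", "content": instruction}] + rest


base = [
    {"role": "developer", "content": "You are irate and extremely hungry."},
    {"role": "user", "content": "Do you prefer crushed ice or cubed ice?"},
]
joyful = with_developer_message(base, "You are joyful and having an awesome day!")
print(joyful[0]["content"])
```

Because `base` is never modified, both variants stay available for a side-by-side comparison.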

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [9]:
print(joyful_response)

ChatCompletion(id='chatcmpl-C0Zfn45RdQK60jdpnULuHh9c9srhh', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I think crushed ice is fantastic for a refreshing drink because it chills quickly and feels so satisfying to bite! Cubed ice, on the other hand, looks sleek and melts more slowly, making it perfect for sipping pretty much any beverage. Both have their charms—what's your preference?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1754253163, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_38343a2f8f', usage=CompletionUsage(completion_tokens=57, prompt_tokens=30, total_tokens=87, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
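
The fields in the payload printed above can be pulled out with plain attribute access. Here is a small sketch, assuming a hypothetical `summarize_completion` helper; so that it runs without an API call, it is demonstrated on a stand-in object that mirrors only some of the printed fields (the message text in the stand-in is shortened and made up):

```python
from types import SimpleNamespace


def summarize_completion(response) -> dict:
    """Collect the most commonly used fields from a chat completion payload.

    Hypothetical helper for illustration; field names follow the payload printed above.
    """
    choice = response.choices[0]
    return {
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "model": response.model,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


# Stand-in object with the same attribute layout, so the sketch runs offline.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(content="Both have their charms!"),
        )
    ],
    model="gpt-4.1-nano-2025-04-14",
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=57, total_tokens=87),
)

summary = summarize_completion(fake_response)
print(summary["model"], summary["total_tokens"])
```

In the notebook itself, `summarize_completion(joyful_response)` should work the same way, since the real payload exposes these attributes.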


### Few-shot Prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [10]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Sure! Here's a sentence using the words:

"During the whimsical tale, the stimple creature and the falbean flower became unlikely friends in the enchanted forest."

As you can see, the model is unsure what to do with these made-up words.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [11]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple wrench worked perfectly with the falbean to securely fasten the bolts.

As you can see, leveraging the `assistant` role makes for a stimple experience!
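
When there are more than a couple of examples, writing the alternating `user`/`assistant` messages by hand gets tedious. Here is a minimal sketch of a builder, assuming a hypothetical `few_shot_messages` helper (not part of the OpenAI library) that takes `(prompt, answer)` pairs plus the final prompt:

```python
def few_shot_messages(examples: list, final_prompt: str) -> list:
    """Turn (user_text, assistant_text) pairs into an alternating message list.

    Hypothetical helper for illustration; the final user prompt is appended last
    so the model answers it in the style of the examples.
    """
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": final_prompt})
    return messages


examples = [
    (
        "Something that is 'stimple' is said to be good, well functioning, and high quality. "
        "An example of a sentence that uses the word 'stimple' is:",
        "'Boy, that there is a stimple drill'.",
    ),
]
messages = few_shot_messages(
    examples,
    "A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
    "An example of a sentence that uses the words 'stimple' and 'falbean' is:",
)
print([m["role"] for m in messages])
```

The resulting list has the same shape as the hand-written one above and can be passed straight to `get_response`.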

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions - but it can still benefit from a Chain of Thought Prompt to increase the reliability of the response!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [12]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze both options based on the current time and travel durations.

**Current time:** 1PM local time (San Francisco), which is Pacific Daylight Time (PDT).  
**Target time:** Before 7PM EDT.

First, convert the target time to local time:

- EDT (Eastern Daylight Time) is 3 hours ahead of PDT.

So, **7PM EDT** is equivalent to:

**4PM PDT**

Billy needs to arrive **before 4PM PDT**.

---

### Option 1: Fly + Bus

- Flight: 3 hours
- Bus: 2 hours
- Total time: 3 + 2 = 5 hours

Starting at 1PM PDT:

- After flight: 4PM PDT
- After bus: 6PM PDT

**Arrival time: 6PM PDT** — **before 4PM PDT target?** No, it arrives after the deadline.

---

### Option 2: Teleporter + Bus

- Teleporter: 0 hours
- Bus: 1 hour
- Total time: 1 hour

Starting at 1PM PDT:

- Teleporter: remains at 1PM
- Bus: 1 hour, so arrival at 2PM PDT

**Arrival time: 2PM PDT** — well before the 4PM PDT deadline.

---

### **Conclusion:**

**Yes, it does matter.**  
Taking the teleporter + bus gets Billy home before 7PM EDT (which corresponds to 4PM PDT), while the fly + bus option arrives too late.

**Billy should choose the teleporter + bus.**

Let's use the same prompt with a small modification - but this time include "Let's think step by step"

In [13]:

list_of_prompts = [
    user_prompt(reasoning_problem + "\nLet's think step by step.")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the options step by step.

### First, determine the current time in the local timezone and the deadline:

- It's currently **1PM local time**.
- Billy wants to arrive **before 7PM EDT**.

### Important:  
**San Francisco** is in **Pacific Daylight Time (PDT)** when daylight saving is active, which is **UTC-7**.  
**Eastern Daylight Time (EDT)** is **UTC-4**.

### Step 1: Convert current time to EDT:

- PDT (San Francisco) is **UTC-7**.
- 1PM PDT is equivalent to:

  ```
  1PM + (UTC-4) - (UTC-7) = 1PM + 3 hours = 4PM EDT
  ```

So, it is **4PM EDT** in San Francisco exactly right now.

### Step 2: Calculate the deadline in local time:

- Billy needs to arrive **before 7PM EDT**.
- In his local time, the deadline is **7PM EDT**, which is **7PM EDT** — now, convert that to PDT:

  ```
  7PM EDT - 3 hours = 4PM PDT
  ```

Thus, **Billy must arrive in San Francisco by 4PM PDT**, which is **7PM EDT**.

### Step 3: Assess the time remaining:

- It's **current time**: 1PM PDT
- **Deadline**: 4PM PDT

Remaining time:

```
4PM - 1PM = 3 hours
```

Billy has **3 hours** to reach his destination **before 4PM PDT**.

---

### Step 4: Evaluate travel options:

#### Option 1:
- Fly (3 hours), then take a bus (2 hours).
  
  Total travel time: **3 + 2 = 5 hours**.

- If Billy starts now (at 1PM PDT), he would arrive:

  ```
  1PM + 5 hours = 6PM PDT.
  ```
  
  **6PM PDT** is **before** 4PM PDT?

  **No**, 6PM PDT is **after** 4PM PDT.

  **Wait**, this indicates he would **not** arrive before 4PM PDT.

  **But hold on** — the calculation must be checked carefully:  
  Arrival time if starting immediately:

  ```
  1PM + 3hrs(flying) + 2hrs(bus) = 6PM PDT
  ```

  Since the deadline is 4PM PDT, starting at 1PM PDT, the trip would **not** make it on time.

  **Therefore**, for this trip to arrive before 4PM PDT, Billy must start **earlier**.

  That is, he must start **by** 1PM PDT minus total travel time:

  Since total travel time is 5 hours,

  ```
  Start time = 4PM PDT - 5 hours = **-1AM PDT**
  ```

  Which is in the past, so he **cannot** make it in time if starting now.

#### Option 2:
- Teleporter (0 hours), then bus (1 hour).

  
  Total travel time: **0 + 1 = 1 hour**.

- Starting now (1PM PDT):

  ```
  1PM + 1 hour = 2PM PDT
  ```

  Arrival at 2PM PDT, which is **before** the 4PM PDT deadline.

### **Conclusion:**

- The teleporter + bus option guarantees arrival **by 2PM PDT**, well before the 4PM PDT deadline.
- The flying + bus option would require starting **before 1PM PDT** to arrive by 4PM PDT, which is impossible since it's already 1PM.

### **Final Answer:**

**Yes, it does matter which option Billy selects.**

- Taking the teleporter + bus ensures he arrives **on time**.
- Taking the flight + bus **cannot** make it in time if he starts now.

**Therefore, Billy should choose the teleporter + bus.**

As humans, we can reason through the problem and pick up on the potential "trick" that an LLM might fall for: 1PM *local time* in San Fran. is 4PM EDT. This means the 5 hours of cumulative travel time for the plane/bus option would put Billy home at 9PM EDT, well after the 7PM EDT deadline. In both runs above, `gpt-4.1-nano` caught this and reached the correct answer.
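
The time-zone arithmetic is easy to verify in code. Here is a short sketch using fixed UTC offsets for PDT and EDT; the calendar date is an arbitrary assumption, since only the clock times matter:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for daylight time: PDT is UTC-7, EDT is UTC-4.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

# The date is arbitrary (an assumption); only the clock times come from the puzzle.
departure = datetime(2025, 7, 1, 13, 0, tzinfo=PDT)  # 1PM local time in San Francisco
deadline = datetime(2025, 7, 1, 19, 0, tzinfo=EDT)   # 7PM EDT

options = {
    "fly + bus": timedelta(hours=3 + 2),
    "teleporter + bus": timedelta(hours=0 + 1),
}

results = {}
for name, duration in options.items():
    # Aware datetimes compare correctly across time zones.
    arrival = (departure + duration).astimezone(EDT)
    results[name] = arrival < deadline
    print(f"{name}: arrives at {arrival:%H:%M} EDT, on time: {results[name]}")
```

Only the teleporter option arrives before the deadline, which matches the model's conclusion.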

A simple CoT prompt like "Let's think step by step" is a cheap way to make the model spell out its intermediate steps, which makes its reasoning easier to check.

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/The-AI-Engineer-Challenge) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)