### Using the OpenAI Library to Programmatically Access GPT-4.1-nano!

In order to get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [1]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
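
If you re-run the cell above often, being asked for the key every time gets tedious. The following is a minimal sketch, not part of the OpenAI library: `ensure_api_key` is a hypothetical helper that only prompts when the environment variable is missing. The `env` and `prompt` parameters are assumptions added so the function can be exercised without a real key.

```python
import os
from getpass import getpass

def ensure_api_key(env=os.environ, prompt=getpass, name="OPENAI_API_KEY"):
    """Hypothetical helper: ask for the key only if it is not already set."""
    if not env.get(name):
        # Nothing stored yet, so ask once and remember the answer.
        env[name] = prompt(f"Please enter your {name}: ")
    return env[name]
```

Calling `ensure_api_key()` before creating the client should be enough, since `OpenAI()` reads `OPENAI_API_KEY` from the environment by default.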

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's create a chat completion request to kick things off!

There are three "roles" we'll be working with:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages).

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects `messages` to be a list of message objects, each with a `role` and `content` - so we'll be sure to pass our prompt that way!
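As a minimal sketch of that structure (plain Python, no API call), each message is a dict with a `role` and a `content`. The `validate_messages` helper below is hypothetical and not part of the OpenAI library. It also simplifies things by assuming `content` is always a string and by allowing only the three roles used in this notebook.

```python
# Roles used in this notebook (an assumption: the API may accept others).
ALLOWED_ROLES = {"developer", "assistant", "user"}

def validate_messages(messages):
    """Hypothetical helper: check that messages is a list of role/content dicts."""
    if not isinstance(messages, list):
        raise TypeError("messages must be a list of dicts, not a bare string or dict")
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            raise TypeError(f"message {i} is not a dict")
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has an unexpected role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise TypeError(f"message {i} needs string content")
    return messages

messages = validate_messages([
    {"role": "user", "content": "What is the difference between LangChain and LlamaIndex?"}
])
print(messages[0]["role"])
```

Passing a bare string such as `validate_messages("hello")` raises a `TypeError`, which is the mistake the list requirement is meant to catch.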

In [2]:
from openai import OpenAI

client = OpenAI()

In [3]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-Bm3jtfQITlVhmSSDIwmayVSbQ0qb5', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Great question! LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks that facilitate building applications with large language models (LLMs), but they serve slightly different purposes and have distinct features.\n\n**1. Purpose and Focus:**\n\n- **LangChain:**\n  - Focuses on building ** conversational applications** and **agent-based systems**.\n  - Provides a toolkit for orchestrating prompts, managing memory, chaining multiple language model calls, and integrating with external tools and APIs.\n  - Emphasizes **composability** and **workflow management** for complex LLM applications.\n\n- **LlamaIndex (GPT Index):**\n  - Specializes in **building indices** over external data sources like documents, PDFs, or knowledge bases.\n  - Enables **efficient retrieval** and **question answering** over la

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages, saving us a few extra characters of code!

##### Helper Functions

In [5]:
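from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts to the endpoint and return the full payload.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # Named "system" for familiarity; the role sent to the API is `developer`.
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(message: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook.
    display(Markdown(message.choices[0].message.content))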

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [6]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as GPT Index) are both frameworks designed to facilitate building applications that leverage large language models (LLMs), but they serve somewhat different purposes and offer different functionalities. Here's a breakdown of their main differences:

### 1. **Primary Focus and Use Cases**

- **LangChain:**
  - Focuses on **building complex, multi-step, conversational, or agent-based applications** with LLMs.
  - Provides abstractions for chaining together prompts, managing memory, handling conversation states, and integrating with various APIs and tools.
  - Ideal for developing chatbots, virtual assistants, or applications requiring dynamic interactions and reasoning over multiple steps.

- **LlamaIndex (GPT Index):**
  - Primarily designed for **indexing, searching, and retrieving information from large external data sources** like documents, PDFs, or databases to answer queries.
  - Acts as a bridge to utilize LLMs for **semantic search, question answering, and knowledge base creation** over custom data.
  - Suitable for building knowledge bases, document QA systems, or retrieval-augmented generation (RAG) workflows.

### 2. **Core Functionality**

- **LangChain:**
  - Provides **building blocks** such as prompt templates, chains, agents, memory, and tool integrations.
  - Enables **orchestration** of many steps including calls to LLMs, external APIs, or tools.
  - Supports complex workflows like multi-turn conversations, agent prompts, and dynamic decision-making.

- **LlamaIndex:**
  - Offers **data ingestion and indexing capabilities** to convert raw data into queryable formats.
  - Contains **connectors and wrappers** for various data sources.
  - Focuses on **semantic search** and **retrieval-augmented generation**, where LLMs generate responses based on relevant parts of the indexed data.

### 3. **Level of Abstraction and Flexibility**

- **LangChain:**
  - Higher-level framework for designing **interactive and multi-step applications**.
  - Offers detailed control over prompt engineering, dialogue flow, and integration with external tools.
  - Emphasizes **custom workflows** and complex orchestration.

- **LlamaIndex:**
  - More specialized towards **efficient data management** and retrieval.
  - Simplifies the process of creating searchable indices over data, then querying them with LLMs.
  - Focuses on **ease of use** for integrating external data into language model workflows.

### 4. **Community, Ecosystem, and Adoption**

- **LangChain:**
  - Has a large and active community focused on conversational AI, chatbots, and multi-step reasoning.
  - Extensive documentation and a wide array of integrations.

- **LlamaIndex:**
  - Gaining popularity in contexts where external data needs to be incorporated into LLM workflows.
  - Often used in enterprise, research, and knowledge management systems.

---

### **In summary:**

| Aspect                     | **LangChain**                                      | **LlamaIndex**                                |
|----------------------------|-----------------------------------------------------|------------------------------------------------|
| Main purpose               | Building complex, multi-step LLM applications      | Indexing/searching external data for LLMs   |
| Focus                      | Orchestrating prompts, chats, tools, and memory    | Data ingestion, retrieval, QA over data     |
| Ideal for                  | Chatbots, agents, conversational AI                | Knowledge bases, document search, RAG      |
| Abstraction level          | High; flexible and customizable                     | Focused on data indexing and retrieval     |

---

**In essence:**  
- Use **LangChain** if you're building an application that requires complex workflows, multi-turn conversations, or integration of various tools and APIs.  
- Use **LlamaIndex** if you're aiming to retrieve and reason over large external datasets to support querying or question answering.

---

If you need a recommendation for a specific project or use case, feel free to share more details!

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [7]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? I can't believe I have to choose between crushed ice and cubed ice when I'm this angry and starving! Neither sounds appealing right now—just get me some real food already!

Let's try that same prompt again, but modify only our `developer` prompt!

In [8]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think crushed ice is great for a refreshing, cool treat, especially in drinks like snow cones or cocktails. Cubed ice is perfect for keeping beverages cold without diluting them too quickly. Both have their charms—what do you prefer?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [9]:
print(joyful_response)

ChatCompletion(id='chatcmpl-Bm3sUbUR44Fv1db5XliDYrmqoeK2D', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='I think crushed ice is great for a refreshing, cool treat, especially in drinks like snow cones or cocktails. Cubed ice is perfect for keeping beverages cold without diluting them too quickly. Both have their charms—what do you prefer?', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1750794350, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_38343a2f8f', usage=CompletionUsage(completion_tokens=48, prompt_tokens=30, total_tokens=78, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
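To make "examine and unpack" concrete, here is a small sketch that pulls a few commonly used fields out of a response. So that it runs without an API call, it uses a `SimpleNamespace` stand-in that mirrors only some of the attributes printed above, with the message text shortened. The `summarize` helper is hypothetical, and a real `ChatCompletion` is assumed to expose the same attribute names, as the printed payload suggests.

```python
from types import SimpleNamespace

# Stand-in for the payload printed above (only a few fields, text shortened).
response = SimpleNamespace(
    model="gpt-4.1-nano-2025-04-14",
    choices=[SimpleNamespace(
        finish_reason="stop",
        message=SimpleNamespace(role="assistant", content="I think crushed ice is great"),
    )],
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=48, total_tokens=78),
)

def summarize(response):
    """Hypothetical helper: collect the fields most often needed downstream."""
    choice = response.choices[0]
    return {
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "model": response.model,
        "total_tokens": response.usage.total_tokens,
    }

summary = summarize(response)
print(summary["finish_reason"], summary["total_tokens"])
```

In the notebook, the same call would be `summarize(joyful_response)`.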


### Few-shot Prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification, it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words, as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [10]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Sure! Here's a sentence using both words:

"During the strange village festival, I encountered a stimple of falbean, which added to the mysterious atmosphere of the event."

As you can see, the model is unsure what to do with these made-up words.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [11]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple wrench effortlessly tightened the falbean, ensuring the assembly spun smoothly without any wobble.

As you can see, leveraging the `assistant` role makes for a stimple experience!
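The alternating `user`/`assistant` pattern above generalizes to any number of examples. As a rough sketch, the hypothetical `build_few_shot_messages` helper below (not part of the OpenAI library) interleaves example pairs and then appends the real question. It builds plain dicts, so it runs without the helper functions or an API call.

```python
def build_few_shot_messages(examples, final_prompt):
    """Hypothetical helper: interleave (user, assistant) pairs, then add the real question."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The final user message is the one we actually want answered.
    messages.append({"role": "user", "content": final_prompt})
    return messages

few_shot = build_few_shot_messages(
    examples=[(
        "Something that is 'stimple' is said to be good, well functioning, and high quality. "
        "An example of a sentence that uses the word 'stimple' is:",
        "'Boy, that there is a stimple drill'.",
    )],
    final_prompt=(
        "A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
        "An example of a sentence that uses the words 'stimple' and 'falbean' is:"
    ),
)
print([m["role"] for m in few_shot])
```

The resulting list can be passed straight to `get_response(client, few_shot)`.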

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions - but it can still benefit from a Chain of Thought Prompt to increase the reliability of the response!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [12]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze Billy's options carefully.

**Given data:**

- Current time: 1 PM local time
- Deadline: 7 PM EDT
- Travel options:
  1. **Fly (3 hrs)** + bus (2 hrs) = total 5 hours
  2. **Teleporter (0 hrs)** + bus (1 hr) = total 1 hour

---

**Important considerations:**

- The local time is 1 PM now.
- The deadline is 7 PM EDT, so the latest he can arrive **EDT** time is 7 PM.
- **Are the local time and EDT synchronized?**  
  The key question is: *Is the current local time given in local time zone or EDT?*

**Assumption:**  
Since the problem states "It's currently 1PM local time," and the deadline is "before 7PM EDT," I will assume:

- The local time is in San Francisco's time zone (Pacific Time, PT).
- EDT is Eastern Daylight Time, which is ahead of PT by 3 hours.

**Time zone difference:**  
- Pacific Time (PT): UTC-7 or UTC-8 depending on daylight saving,  
- EDT: UTC-4.

**During daylight saving time**, Pacific Time is UTC-7, EDT is UTC-4, so EDT is ahead of Pacific Time by 3 hours.

**Convert current local time to EDT:**

- Current local time: 1 PM PT.
- In EDT, this is 1 PM + 3 hours = 4 PM EDT.

---

**Calculate total travel times in EDT:**

### Option 1: Fly + bus

- Total travel time: 3 + 2 = 5 hours
- Starting at 1 PM PT (which is 4 PM EDT), arriving **after 5 hours**:

  - Arrival EDT time = 4 PM + 5 hours = 9 PM EDT

### Option 2: Teleporter + bus

- Total travel time: 0 + 1 = 1 hour
- Starting at 1 PM PT (4 PM EDT):

  - Arrival EDT time = 4 PM + 1 hour = 5 PM EDT

---

**Does it matter which option Billy selects?**

- **Fly + bus:** arrives at 9 PM EDT (after deadline 7 PM EDT).
- **Teleporter + bus:** arrives at 5 PM EDT (before deadline).

**Conclusion:**

Yes, it **does matter** which option Billy selects because:

- The **teleporter + bus** option gets him home **before** 7 PM EDT.
- The **fly + bus** option **cannot** arrive before the deadline.

**Therefore, Billy should choose the teleport + bus option** to ensure he gets home before 7 PM EDT.

Let's use the same prompt with one small modification: this time we'll append "Let's think step by step."

In [13]:

list_of_prompts = [
    user_prompt(reasoning_problem + "\nLet's think step by step.")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the options step by step.

**First, understand the current time and target:**

- Current local time: 1PM
- Billy wants to arrive **before 7PM EDT**.

**Important:** We need to verify if the local time is EDT or if there's a time difference.

Given that Billy is in San Francisco (Pacific Time Zone), and the deadline is in EDT (Eastern Time Zone):

- San Francisco (Pacific Time): UTC-8 (Standard Time)
- Eastern Time: UTC-5 (Standard Time)

Time difference: 3 hours (Eastern is ahead of Pacific)

**So, if it's 1PM local (Pacific), then in EDT it is:**

- 1PM Pacific + 3 hours = 4PM EDT.

**Therefore, Billy needs to arrive **before 7PM EDT**, and the current EDT time equivalent is 4PM.**

---

### Step 1: Determine total travel times for each option

**Option A: Fly + Bus**

- Flying takes 3 hours.
- Bus takes 2 hours.
- Total: 3 + 2 = **5 hours**

**Option B: Teleporter + Bus**

- Teleporter: 0 hours.
- Bus: 1 hour.
- Total: 0 + 1 = **1 hour**

---

### Step 2: Calculate departure times needed to arrive before 7PM EDT

Since the current time in EDT is 4PM, and Billy must arrive **before 7PM**, that means he has:

- 7PM - current EDT time (4PM) = **3 hours** remaining from now until 7PM.

Let's see if both options can get Billy home **before 7PM**.

---

### **Option A: Fly + Bus**

- Travel time: 5 hours.
- Earliest departure time in EDT: current time (4PM) plus travel time.

To arrive before 7PM, he must depart **no later than**:

- Arrival deadline (before 7PM) minus 5 hours.

But since it's 4PM now:

- If Billy departs **immediately at 4PM EDT**, he will arrive:

  - 4PM + 5 hours = **9PM EDT**.

- This is **after** the desired 7PM deadline.

Therefore, Billy **must depart earlier than 4PM EDT**, which is impossible because it's 4PM now.

*Conclusion:* with current time being 4PM EDT, **Billy cannot achieve arrival before 7PM with the fly + bus option unless he departs before now**. Since it is already 4PM, he cannot do better unless he departs earlier.

---

### **Option B: Teleporter + Bus**

- Travel time: 1 hour.

- From current EDT time (4PM), arrival will be:

  - 4PM + 1 hour = **5PM EDT**.

- Since 5PM is before 7PM, Billy can arrive on time **by taking this option and departing immediately**.

---

### **Summary:**

- **The teleporter option can get Billy home before 7PM if he departs immediately**, because arriving at 5PM EDT is well before the 7PM deadline.

- **The fly + bus option would require departure before 4PM EDT**, which is impossible because it's already 4PM.

**Therefore:**

- If Billy starts right now (1PM local / 4PM EDT), he **must** take the teleporter + bus to arrive on time.
- The fly + bus option **cannot** guarantee timely arrival unless he departs earlier than now, which isn't possible.

---

### **Answer:**

**Since it's currently 1PM local time (4PM EDT), Billy cannot arrive before 7PM EDT by flying + bus. He can arrive on time by taking the teleporter + bus.**

**In general:**

- If both options are available immediately, the teleporter + bus guarantees arrival before 7PM due to shorter total travel time (1 hour vs. 5 hours).
- **Yes, it does matter** which option Billy selects because the more time-consuming option (fly + bus) cannot meet the deadline if starting now, but the teleporter + bus can.

---

**Final note:** If the current time were earlier, say before 2PM local (before 5PM EDT), then the fly + bus might be feasible. But given the current time specified, the teleporter + bus is the only viable choice to meet the deadline.
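The time-zone arithmetic in the model's answer above can be checked independently with Python's standard library. This is only a sanity check under stated assumptions: daylight saving time is in effect (the deadline is given in EDT), so Pacific time is UTC-7 and Eastern time is UTC-4, and the calendar date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets during daylight saving time: PDT is UTC-7, EDT is UTC-4.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

# The date is an arbitrary assumption; only the clock times matter here.
departure = datetime(2025, 6, 24, 13, 0, tzinfo=PDT)  # 1PM in San Francisco
deadline = datetime(2025, 6, 24, 19, 0, tzinfo=EDT)   # 7PM EDT

options = {
    "fly + bus": timedelta(hours=3 + 2),
    "teleporter + bus": timedelta(hours=0 + 1),
}

results = {}
for name, travel_time in options.items():
    # Add the travel time, then express the arrival in the deadline's zone.
    arrival = (departure + travel_time).astimezone(EDT)
    results[name] = (arrival, arrival < deadline)
    print(f"{name}: arrives {arrival:%I:%M %p} EDT, on time: {arrival < deadline}")
```

Only the teleporter option arrives before the deadline, which agrees with the conclusion the model reaches.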

As humans, we can reason through the problem and pick up on the potential "trick": 1PM *local time* in San Fran. is 4PM EDT. This means the cumulative travel time of 5hrs. for the plane/bus option would not get Billy home in time.

In the two runs above, the model caught this with and without the extra instruction. The simple CoT prompt mainly pushed it to lay out the time conversion and each option's arrival time more explicitly, which makes the answer easier to verify.

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/The-AI-Engineer-Challenge) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)