### Using the OpenAI Library to Programmatically Access GPT-4.1-nano!

In order to get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [12]:
!pip install -r api/requirements.txt

Collecting fastapi==0.115.12 (from -r api/requirements.txt (line 1))
  Using cached fastapi-0.115.12-py3-none-any.whl.metadata (27 kB)
Collecting uvicorn==0.34.2 (from -r api/requirements.txt (line 2))
  Using cached uvicorn-0.34.2-py3-none-any.whl.metadata (6.5 kB)
Collecting openai==1.77.0 (from -r api/requirements.txt (line 3))
  Using cached openai-1.77.0-py3-none-any.whl.metadata (25 kB)
Collecting pydantic==2.11.4 (from -r api/requirements.txt (line 4))
  Using cached pydantic-2.11.4-py3-none-any.whl.metadata (66 kB)
Collecting python-multipart==0.0.18 (from -r api/requirements.txt (line 5))
  Using cached python_multipart-0.0.18-py3-none-any.whl.metadata (1.8 kB)
Collecting starlette<0.47.0,>=0.40.0 (from fastapi==0.115.12->-r api/requirements.txt (line 1))
  Using cached starlette-0.46.2-py3-none-any.whl.metadata (6.2 kB)
Collecting click>=7.0 (from uvicorn==0.34.2->-r api/requirements.txt (line 2))
  Using cached click-8.2.1-py3-none-any.whl.metadata (2.5 kB)
Collecting distro

In [16]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
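
If the key is already set in the environment, there is no need to type it again each time the cell runs. The sketch below is one hedged way to handle that: `ensure_api_key` and its `prompt_fn` parameter are hypothetical names introduced here for illustration, not part of the OpenAI library.

```python
import getpass
import os


def ensure_api_key(env_var: str = "OPENAI_API_KEY", prompt_fn=getpass.getpass) -> str:
    """Return the key from the environment, prompting only when it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Hypothetical hook: prompt_fn can be swapped for a non-interactive source.
        key = prompt_fn(f"Please enter your {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} is empty")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` before creating the client is enough, since `OpenAI()` reads `OPENAI_API_KEY` from the environment when no key is passed explicitly.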

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's make a chat completion request to kick things off!

We'll work with three message "roles" in this notebook:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages).

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects its input as a list of message objects, each with a `role` and a `content` field - so we'll be sure to pass a list!
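As a minimal sketch of that message format, the snippet below builds a one-item list by hand. `make_message` and `ALLOWED_ROLES` are hypothetical names used only for illustration, and the role check covers just the three roles listed above.

```python
# Roles used in this notebook (an assumption for the sketch, not the API's full list).
ALLOWED_ROLES = {"developer", "assistant", "user"}


def make_message(role: str, content: str) -> dict:
    """Build one message dict in the role/content shape the chat endpoint expects."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {"role": role, "content": content}


# Even a single message must be wrapped in a list.
messages = [
    make_message("user", "What is the difference between LangChain and LlamaIndex?")
]
print(messages)
```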

In [17]:
from openai import OpenAI

client = OpenAI()

In [18]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-BlnPwwflwUeqJcwfMv9FD89vA6EiF', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Great question! LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate the development of AI-powered applications involving large language models (LLMs). While they share some similarities, they serve different purposes and have distinct features. Here's a breakdown of their main differences:\n\n### Purpose and Focus\n\n**LangChain:**\n- **Primary Focus:** Building complex, multi-step, chain-based AI applications.\n- **Use Cases:** Conversational agents, automation workflows, language model integrations, and chaining multiple operations together.\n- **Approach:** Provides a flexible framework to connect prompting, memory, tools, and APIs, enabling developers to orchestrate sophisticated language model applications.\n\n**LlamaIndex:**\n- **Primary Focus:** Constructing, querying

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages to avoid a few extra characters of code!

##### Helper Functions

In [19]:
from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts and return the full response object.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # The `developer` role carries overarching instructions.
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook.
    display(Markdown(response.choices[0].message.content))

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [20]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as Godel/GPT Index) are both popular frameworks designed to facilitate building applications that leverage large language models (LLMs), but they have different focuses and functionalities.

**1. Purpose and Focus**

- **LangChain:**
  - Focuses on building *conversational AI applications*, including chatbots, question-answering systems, and complex multi-step workflows.
  - Provides a modular framework for chaining together various components like prompts, models, memory, and data sources.
  - Emphasizes *building applications* with a wide array of language model integrations, along with tools for state management, prompt management, and reasoning.

- **LlamaIndex (GPT Index):**
  - Focuses primarily on *building indices* over external data sources (documents, PDFs, web pages) to enable efficient retrieval-augmented generation (RAG).
  - Simplifies creating structured, queryable indices that are integrated with LLMs for question-answering and knowledge retrieval tasks.
  - Major use case is enabling LLMs to reason over and retrieve information from large external datasets with easy-to-use indexing mechanisms.

**2. Core Functionality**

- **LangChain:**
  - Provides *chains* (sequences of calls to language models and tools).
  - Supports *memory* to maintain state across interactions.
  - Includes integrations with various tools, APIs, and data sources.
  - Features *prompt engineering* utilities and *agent frameworks* that can decide what actions to take based on user inputs.
  - Facilitates *multi-modal* workflows and complex reasoning chains.

- **LlamaIndex:**
  - Offers *indexing structures* such as ListIndex, TreeIndex, GraphIndex, etc.
  - Enables *fast retrieval* of relevant data snippets for prompting LLMs.
  - Focuses on *building, managing,* and *querying* over large external datasets.
  - Simplifies the process of augmenting LLMs with external knowledge for improved accuracy.

**3. Use Cases**

- **LangChain:**
  - Chatbots and virtual assistants.
  - Complex decision-making and prompting workflows.
  - Tool integration and multi-step reasoning.

- **LlamaIndex:**
  - Knowledge base creation over large document collections.
  - Information retrieval and question-answering over external data.
  - Building knowledge augmented LLM applications.

**4. Ecosystem and Compatibility**

- Both frameworks are compatible with major LLM providers like OpenAI, Cohere, Hugging Face, etc.
- LangChain offers a broader ecosystem for chaining and managing workflows.
- LlamaIndex is more specialized for document indexing and retrieval.

---

**Summary:**
- **LangChain** is a versatile framework for building sophisticated NLP applications involving chaining, tools, and memory, suitable for conversational agents and multi-step workflows.
- **LlamaIndex** specializes in creating indices over external data sources to enable efficient retrieval and question-answering, enhancing LLM applications with external knowledge.

Depending on your project needs—whether building complex conversational systems or integrating and querying large datasets—you might choose one or both frameworks to complement each other.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [21]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Honestly, I couldn't care less about ice right now! I'm so hungry that the only thing I want is some real food—crushed or cubed ice isn't going to fill the empty pit in my stomach. Just give me something nutritious already!

Let's try that same prompt again, but modify only our `developer` message (the one built by the `system_prompt` helper)!

In [22]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think crushed ice is fantastic for its refreshing and quick-cooling qualities, perfect for drinks like cocktails and smoothies! Cubed ice, on the other hand, is great when you want your drink to stay cold longer without diluting it too fast, like with whiskey or soda. Both have their own charm—depends on the mood! Which do you prefer?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [23]:
print(joyful_response)

ChatCompletion(id='chatcmpl-BlnT2nsFQMYcXVkNtHBCR1k1lzjyU', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='I think crushed ice is fantastic for its refreshing and quick-cooling qualities, perfect for drinks like cocktails and smoothies! Cubed ice, on the other hand, is great when you want your drink to stay cold longer without diluting it too fast, like with whiskey or soda. Both have their own charm—depends on the mood! Which do you prefer?', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1750731268, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_38343a2f8f', usage=CompletionUsage(completion_tokens=72, prompt_tokens=30, total_tokens=102, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptToke
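
To illustrate unpacking, here is a hedged sketch that pulls a few fields out of a response. So that it runs without an API call, it uses a `SimpleNamespace` stand-in that mirrors only some attributes of the printed `ChatCompletion`, with values copied from the output above. `summarize` and `fake_response` are hypothetical names.

```python
from types import SimpleNamespace

# Stand-in for the real response object; only a handful of fields are mirrored.
fake_response = SimpleNamespace(
    model="gpt-4.1-nano-2025-04-14",
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(
                role="assistant",
                # First sentence of the printed reply only.
                content="I think crushed ice is fantastic for its refreshing and "
                        "quick-cooling qualities, perfect for drinks like cocktails and smoothies!",
            ),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=72, total_tokens=102),
)


def summarize(response) -> dict:
    """Collect the fields most often needed when building an application."""
    choice = response.choices[0]
    return {
        "model": response.model,
        "finish_reason": choice.finish_reason,
        "role": choice.message.role,
        "text": choice.message.content,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


summary = summarize(fake_response)
print(summary["finish_reason"], summary["total_tokens"])
```

The same attribute paths should work on the real object, for example `summarize(joyful_response)`.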

### Few-shot Prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try and "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [24]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Sure! Here's a sentence using the words 'stimple' and 'falbean':

"Amidst the lush meadow, the stimple breeze carried the faint scent of falbean blossoms, creating a serene atmosphere."

As you can see, the model is unsure what to do with these made-up words, so it invents plausible-sounding meanings for them.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [25]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Sure! Here's an example sentence using both words:

"She grabbed the stimple wrench and the falbean to quickly tighten the machinery's bolt."

(Note: Since 'falbean' is a tool that spins or rotates, it’s used here as a rotating or fastening device alongside the 'stimple' wrench, which is good and high quality.)

As you can see, leveraging the `assistant` role makes for a stimple experience!
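The few-shot pattern above generalises to any number of worked examples. As a rough sketch, the hypothetical helper `few_shot_messages` (not part of the OpenAI library) interleaves `user` and `assistant` messages from example pairs and then appends the real question.

```python
def few_shot_messages(examples: list[tuple[str, str]], final_prompt: str) -> list[dict]:
    """Turn (user text, assistant text) pairs plus a final prompt into a message list."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The last message is the one we actually want the model to answer.
    messages.append({"role": "user", "content": final_prompt})
    return messages


examples = [
    (
        "Something that is 'stimple' is said to be good, well functioning, and high quality. "
        "An example of a sentence that uses the word 'stimple' is:",
        "'Boy, that there is a stimple drill'.",
    ),
]
few_shot = few_shot_messages(
    examples,
    "A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
    "An example of a sentence that uses the words 'stimple' and 'falbean' is:",
)
print([m["role"] for m in few_shot])
```

The resulting list can be passed to `get_response(client, few_shot)` in the same way as `list_of_prompts`.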

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions - but it can still benefit from a Chain of Thought Prompt to increase the reliability of the response!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [26]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze Billy's options step by step.

**Current time in local time zone:** 1PM  
**Deadline (home time before 7PM EDT):** 7PM EDT

---

### Step 1: Convert local time to EDT or consider relative timing

Since Billy wants to arrive **before 7PM EDT**, we need to compare the travel durations considering the time difference. But the problem doesn't specify the local time zone or the city Billy is in. 

**Assumption:**  
- **San Francisco** is in Pacific Time (PT), which is **3 hours behind EDT**.
- Therefore, **1PM PT** is **4PM EDT** (since PT + 3 hours = EDT).

**Arrival deadline in local time (PT):**  
- 7PM EDT = 4PM PT

Billy's current local time is **1PM PT**, which is **3 hours before the deadline** in PT.

---

### Step 2: Determine total travel times for each option

**Option 1:**  
- Fly (3 hours)  
- Bus (2 hours)  
- **Total travel time:** 3 + 2 = **5 hours**  
- Departure now (1PM PT)  
- Arrival time if starting now: 1PM + 5 hours = **6PM PT**  
- In EDT: 6PM PT + 3 hours = **9PM EDT**  
- **Result:** Arrives after the deadline (9PM EDT > 7PM EDT) — **not timely**

**Option 2:**  
- Teleporter (0 hours)  
- Bus (1 hour)  
- **Total travel time:** 0 + 1 = **1 hour**  
- Departure now (1PM PT)  
- Arrival time: 1PM + 1 hour = **2PM PT**  
- In EDT: 2PM PT + 3 hours = **5PM EDT**  
- **Result:** Arrives before the deadline (5PM EDT < 7PM EDT) — **on time**

---

### **Conclusion:**

- **If Billy chooses the flying + bus option, he will arrive at 9PM EDT, which is too late.**  
- **If he takes the teleporter + bus, he will arrive at 5PM EDT, well before the deadline.**

---

### **Final answer:**

**Yes, it does matter which option Billy chooses.** To arrive before 7PM EDT, he should take the teleporter and then the bus.

Let's use the same prompt with one small modification: this time we'll append "Let's think step by step."

In [27]:

list_of_prompts = [
    user_prompt(reasoning_problem + "\nLet's think step by step.")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's carefully analyze the scenario step by step.

**Step 1: Understand Billy's current time and deadline**

- Current local time: 1PM
- Deadline to get home: before 7PM EDT

**Step 2: Identify the travel options and their durations**

Option 1:
- Fly: 3 hours
- Bus: 2 hours
- Total time for Option 1: 3 + 2 = 5 hours

Option 2:
- Teleporter: 0 hours
- Bus: 1 hour
- Total time for Option 2: 0 + 1 = 1 hour

**Step 3: Determine the time needed from now to reach home for each option**

- **Option 1:**
  - Total travel time: 5 hours
  - Starting at 1PM, arrival time: 1PM + 5 hours = 6PM

- **Option 2:**
  - Total travel time: 1 hour
  - Starting at 1PM, arrival time: 1PM + 1 hour = 2PM

**Step 4: Check if Billy can arrive before 7PM**

- Option 1: Arrival at 6PM, which is before 7PM. **Yes, he can get home on time.**
- Option 2: Arrival at 2PM, which is also before 7PM. **Yes, he can get home on time.**

**Step 5: Conclusion**

Since both options allow Billy to get home before the 7PM deadline, **it doesn't matter which option he chooses in terms of meeting the deadline**.

**Final answer:**
**No, it does not matter which travel option Billy selects, because both options allow him to arrive before 7PM EDT.**

As humans, we can reason through the problem and pick up on the potential "trick" that the LLM fell for in its step-by-step response: 1PM *local time* in San Francisco is 4PM EDT. This means the 5 hours of cumulative travel time for the plane/bus option would put Billy home at 9PM EDT, after the 7PM deadline.
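The time zone arithmetic can be checked directly. The sketch below assumes daylight time on both coasts (PDT is UTC-7, EDT is UTC-4) and uses an arbitrary calendar date, since only the clock times matter. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assuming daylight saving time is in effect on both coasts.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

# Arbitrary date (an assumption); Billy leaves at 1PM local time in San Francisco.
departure = datetime(2025, 6, 24, 13, 0, tzinfo=PDT)
deadline = datetime(2025, 6, 24, 19, 0, tzinfo=EDT)

options = {
    "fly + bus": timedelta(hours=3 + 2),
    "teleporter + bus": timedelta(hours=0 + 1),
}

results = {}
for name, duration in options.items():
    arrival = (departure + duration).astimezone(EDT)
    on_time = arrival < deadline
    results[name] = (arrival.hour, on_time)
    print(f"{name}: arrives {arrival:%H:%M} EDT, on time: {on_time}")
```

Under these assumptions, the flight option arrives at 21:00 EDT and misses the deadline, while the teleporter option arrives at 17:00 EDT.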

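In this run, the plain prompt handled the time zone conversion and the "Let's think step by step" version did not, so a simple CoT prompt is not a guaranteed fix. Responses vary between runs, so it's worth re-running both cells and comparing the results!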

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/The-AI-Engineer-Challenge) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)