### Using the OpenAI Library to Programmatically Access GPT-4.1-nano!

In order to get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [1]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]

Please enter your OpenAI API Key:  ········
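
If the key is already exported in your shell, there's no need to be prompted again. Here is a minimal sketch, assuming a hypothetical helper named `ensure_api_key` (our own illustration, not part of the `openai` library), that only prompts when the variable is missing:

```python
import getpass
import os


def ensure_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in `env_var`, prompting only if it is unset or empty.

    NOTE: `ensure_api_key` is a hypothetical helper for illustration only.
    """
    key = os.environ.get(env_var, "")
    if not key:
        # getpass hides the input so the key is not echoed into the notebook output
        key = getpass.getpass(f"Please enter your {env_var}: ")
        os.environ[env_var] = key
    return key
```

Because the `OpenAI()` client reads `OPENAI_API_KEY` from the environment by default, calling `ensure_api_key()` once before creating the client should be enough.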


### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's send a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages)

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects `messages` to be a list of message objects (each with a `role` and `content`) - so we'll be sure to do that!
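
To make the expected shape concrete, here is a small sketch of building that list by hand. The `make_message` helper and the `ALLOWED_ROLES` set are our own illustrative names, and the set is deliberately limited to the three roles listed above (the API itself accepts additional roles):

```python
# Hypothetical names for illustration; restricted to the roles covered in this notebook
ALLOWED_ROLES = {"developer", "assistant", "user"}


def make_message(role: str, content: str) -> dict:
    """Build one message dict, rejecting roles this notebook does not cover."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"Unknown role {role!r}; expected one of {sorted(ALLOWED_ROLES)}")
    return {"role": role, "content": content}


# The endpoint takes a *list* of these dicts, even for a single message
messages = [make_message("user", "What is the difference between LangChain and LlamaIndex?")]
```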

In [2]:
from openai import OpenAI

client = OpenAI()

In [5]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-BuTv5aFsoKdmSUxlpZn53jxSpcNzu', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks that facilitate building language model applications, particularly around document retrieval, processing, and conversational interfaces. However, they have different focuses, architectures, and use cases. Here's a comparison to clarify their differences:\n\n### LangChain\n- **Primary Focus:** Building end-to-end language model applications with chaining, prompting, memory, and structured workflows.\n- **Core Features:**\n  - **Chains:** Modular sequences of calls, such as prompts, models, or tools.\n  - **Agents:** Dynamic decision-making components that choose actions based on inputs.\n  - **Memory:** Persistent context management for conversational or stateful interactions.\n  - **Integrations:** Supports various LLM providers, APIs, tools, and

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages, saving us a few extra characters of code!

##### Helper Functions

In [6]:
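from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts and return the full ChatCompletion payload
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # This notebook uses the `developer` role for system-style instructions
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(message: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook
    display(Markdown(message.choices[0].message.content))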

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [7]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

Great question! Both **LangChain** and **LlamaIndex** (formerly known as GPT Index) are popular Python frameworks designed to facilitate building applications that leverage large language models (LLMs), but they serve different purposes and have distinct features. Here's a breakdown of their main differences:

### Purpose and Use Cases

- **LangChain:**
  - Focuses on building **conversational agents, chatbots, and complex language model applications**.
  - Emphasizes **chains, agents, and workflows**, allowing you to combine multiple components (e.g., prompts, LLM calls, memory, tools) to create sophisticated NLP pipelines.
  - Supports integrations with various models, APIs, and tools for creating **interactive, dynamic applications**.

- **LlamaIndex (GPT Index):**
  - Primarily designed for **building indexes over large external data sources** (e.g., PDFs, documents, databases) so that LLMs can efficiently **retrieve and answer questions based on specific data**.
  - Focuses on **document retrieval**, **semantic search**, and **question-answering over custom data collections**.
  - Helps **ingest and query large custom datasets** with minimal setup, enabling LLMs to act as intelligent data assistants.

### Core Functionality

| Aspect | LangChain | LlamaIndex (GPT Index) |
|---------|--------------|------------------------|
| **Main Focus** | Building NLP workflows, chatbots, and agent-based applications | Indexing and querying large document sets or data sources |
| **Data Handling** | Primarily works with prompts, memory, tools, and APIs; supports external data but less focused on indexing | Designed specifically for creating data indexes and retrieval mechanisms |
| **Chain/Workflow Support** | Rich support for chains, agents, memory, toolkits to create complex interactions | Limited; more focused on data ingestion and retrieval |
| **Integration with LLMs** | Extensive, supports multiple providers (OpenAI, Cohere, Hugging Face, etc.) | Focused on enabling LLMs to access and retrieve information from indexed data |

### Use Case Examples

- **LangChain:**
  - Creating a conversational assistant that uses multiple tools and APIs.
  - Building multi-step reasoning pipelines.
  - Implementing agents that decide which tools to invoke during a conversation.

- **LlamaIndex:**
  - Building a search engine over internal company documents.
  - Answering questions based on a custom dataset without training a new model.
  - Indexing large documents so that LLMs can retrieve relevant excerpts during a Q&A session.

### Summary

| Feature                  | LangChain                                            | LlamaIndex                                   |
|--------------------------|--------------------------------------------------------|----------------------------------------------|
| Main purpose             | Complex NLP workflows, chatbots, multi-step chains   | Indexing and querying large data sources  |
| Data ingestion           | Supports external data but not the primary focus      | Core focus: ingesting and indexing data    |
| Use case specialization    | Workflow orchestration, agent-based systems          | Data retrieval, semantic search            |
| Flexibility and extensibility | Highly flexible with chains and tools             | Specialized for document-based retrieval  |

---

**In brief:**  
- Use **LangChain** if you're building conversational AI, agents, or complex NLP pipelines.  
- Use **LlamaIndex** if you need to create a system where LLMs can efficiently access and answer questions based on large external datasets.

Let me know if you'd like more detailed comparisons or examples!

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [6]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? I don't have time to mess around—I am absolutely starving and just want some ice that actually satisfies! Crushed ice, while convenient, melts too fast and is a mess. Cubed ice is better because it lasts longer and keeps my drink colder without turning to water instantly. Honestly, I’m just desperate for something to eat, not some ice debate!

Let's try that same prompt again, but modify only our `developer` (system) prompt!

In [7]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think crushed ice is so fun and refreshing because it cools drinks quickly and adds a nice texture! But cubed ice is perfect for keeping drinks colder longer without watering them down. Both have their charm—depends on what mood I’m in! How about you—do you prefer crushed or cubed ice?
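
The cell above swaps the instruction by overwriting `list_of_prompts[0]` in place. If you'd rather keep the original list around for comparison, one option is a small non-mutating helper. This is only a sketch, and `with_developer_prompt` is a hypothetical name of our own:

```python
def with_developer_prompt(messages: list[dict], instruction: str) -> list[dict]:
    """Return a new list whose leading `developer` message is `instruction`.

    Hypothetical helper: replaces an existing leading developer message,
    or prepends one if the list does not start with one.
    """
    new_first = {"role": "developer", "content": instruction}
    if messages and messages[0]["role"] == "developer":
        return [new_first] + messages[1:]
    return [new_first] + list(messages)


irate_prompts = [
    {"role": "developer", "content": "You are irate and extremely hungry."},
    {"role": "user", "content": "Do you prefer crushed ice or cubed ice?"},
]
joyful_prompts = with_developer_prompt(irate_prompts, "You are joyful and having an awesome day!")
```

Both lists can then be passed to `get_response` separately, and `irate_prompts` is left untouched.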

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [8]:
print(joyful_response)

ChatCompletion(id='chatcmpl-BUc3g9V3hoAA0KyvjZI4YasY1mYOW', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='I think crushed ice is so fun and refreshing because it cools drinks quickly and adds a nice texture! But cubed ice is perfect for keeping drinks colder longer without watering them down. Both have their charm—depends on what mood I’m in! How about you—do you prefer crushed or cubed ice?', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1746635836, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_8fd43718b3', usage=CompletionUsage(completion_tokens=64, prompt_tokens=30, total_tokens=94, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
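
As a sketch of what "unpacking" might look like, the function below pulls a few fields out of the payload using the attribute names visible in the printed output above. The `summarize_response` name is our own, and the example runs against a stand-in object (built with `SimpleNamespace`, with placeholder reply text) rather than a live API response:

```python
from types import SimpleNamespace


def summarize_response(response) -> dict:
    """Collect commonly used fields from a ChatCompletion-shaped object."""
    choice = response.choices[0]
    return {
        "content": choice.message.content,
        "finish_reason": choice.finish_reason,
        "model": response.model,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


# Stand-in with the same attribute layout as the printed ChatCompletion,
# so the example runs without an API call; the content is a placeholder
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(content="placeholder reply"),
        )
    ],
    model="gpt-4.1-nano-2025-04-14",
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=64, total_tokens=94),
)

summary = summarize_response(fake_response)
print(summary["total_tokens"])  # 94
```

With a real response, `summarize_response(joyful_response)` should work the same way, since it only relies on attributes shown in the output above.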


### Few-shot Prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words, as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [9]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Certainly! Here's a sentence using the words 'stimple' and 'falbean':

"During the peculiar festival, villagers gathered around a stimple, while children giggled over the mysterious falbean tucked into their baskets."

As you can see, the model is unsure what to do with these made-up words.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [10]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Sure! Here's a sentence using both "stimple" and "falbean":

"The stimple falbean crafted by the craftsmen ensures smooth rotation and reliable fastening for all our machinery."

As you can see, leveraging the `assistant` role makes for a stimple experience!
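
The few-shot pattern above generalises to any number of worked examples: each example becomes a `user` message followed by an `assistant` message, and the real question goes last. A minimal sketch, where `few_shot_messages` is a hypothetical helper of our own:

```python
def few_shot_messages(examples: list[tuple[str, str]], final_prompt: str) -> list[dict]:
    """Turn (user_text, assistant_text) pairs plus a final prompt into a message list."""
    messages = []
    for user_text, assistant_text in examples:
        # Each example is a "pretend" exchange the model can imitate
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The actual request always comes last, in the user role
    messages.append({"role": "user", "content": final_prompt})
    return messages


stimple_messages = few_shot_messages(
    examples=[
        (
            "Something that is 'stimple' is said to be good, well functioning, and high quality. "
            "An example of a sentence that uses the word 'stimple' is:",
            "'Boy, that there is a stimple drill'.",
        )
    ],
    final_prompt=(
        "A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
        "An example of a sentence that uses the words 'stimple' and 'falbean' is:"
    ),
)
```

The resulting list has the same shape as `list_of_prompts` in the cell above, so it can be passed straight to `get_response`.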

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions - but it can still benefit from a Chain of Thought Prompt to increase the reliability of the response!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [11]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the options carefully:

**Option 1:** Fly (3 hours) + Bus (2 hours)  
Total travel time: 3 + 2 = 5 hours

**Option 2:** Teleporter (0 hours) + Bus (1 hour)  
Total travel time: 0 + 1 = 1 hour

**Current local time:** 1PM

**Target arrival time:** before 7PM EDT

Since the current local time is 1PM and Billy wants to arrive home before 7PM EDT (which is 6 hours later), he has a window of nearly 6 hours to get home.

**Calculating arrival times:**

- **Option 1:**  
  Departure at 1PM local time, travel takes 5 hours, arriving around 6PM local time.  
  Since this is within the 6-hour window, Billy would arrive just before 7PM EDT.

- **Option 2:**  
  Departure at 1PM, travel takes 1 hour, arriving around 2PM local time, well before 7PM EDT.

**Conclusion:**  
Yes, it does matter which option Billy chooses if he needs to arrive strictly before 7PM EDT. The teleportation + bus option ensures he arrives much earlier, giving him more buffer time. The flying + bus option just makes it in time, arriving right around 6PM local time, which is still before 7PM EDT.

**Final note:**  
- If Billy prefers certainty and plenty of extra time, the teleport + bus is better.  
- If he wants to save time and is okay arriving close to 7PM, the flying + bus is sufficient.

**Answer:** Yes, the choice matters if arriving strictly before 7PM EDT.

Let's use the same prompt with a small modification - this time we'll append "Let's think step by step."

In [None]:

list_of_prompts = [
    user_prompt(reasoning_problem + "\nLet's think step by step.")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the options step by step:

**Current situation:**
- It is currently 1PM local time.
- Billy wants to arrive home **before 7PM EDT**.

**Important considerations:**
- Time zones are not explicitly specified, but since Billy is in San Francisco (Pacific Time, PT), and the deadline is in EDT, we need to convert times accordingly.
- Pacific Time (PT) is **3 hours behind Eastern Time (ET)**.
  - When it's 1PM PT, it's **4PM ET**.

**Conversion:**
- **Current local time:** 1PM PT = 4PM ET
- **Deadline:** 7PM ET

Billy needs to arrive **before 7PM ET**, which is **before 7PM ET**.

---

### Option 1: Fly + Bus
- Flying takes **3 hours**.
- Bus takes **2 hours**.

**Total travel time:** 3 + 2 = **5 hours**

### Option 2: Teleporter + Bus
- Teleporter takes **0 hours**.
- Bus takes **1 hour**.

**Total travel time:** 0 + 1 = **1 hour**

---

### Now, let's calculate the arrival times for each option:

---

### Option 1: Fly + Bus

- Departure time: 1PM PT (which is 4PM ET)
- Travel duration: 5 hours
- Arrival time in ET: 4PM + 5 hours = **9PM ET**

**Note:** Since he departs at 1PM PT (=4PM ET), and takes 5 hours, he'd arrive **at 9PM ET**.

**Conclusion:** He arrives **after 7PM ET**. **Not** before the deadline.

---

### Option 2: Teleporter + Bus

- Departure time: 1PM PT (=4PM ET)
- Travel duration: 1 hour
- Arrival time in ET: 4PM + 1 hour = **5PM ET**

**Conclusion:** He arrives **before 7PM ET**.

---

### Final answer:
**Yes, it does matter which option Billy chooses.** 

- The teleporter + bus allows him to arrive **before the deadline**.
- The fly + bus option makes him arrive **after the deadline**.

**Therefore, Billy should choose the teleporter + bus option to reach home before 7PM EDT.**

As humans, we can reason through the problem and pick up on the "trick" that the first response fell for: 1PM *local time* in San Fran. is 4PM EDT. This means the cumulative travel time of 5hrs. for the plane/bus option gets Billy home at 9PM EDT, well after the 7PM deadline.

As the second response shows, simply appending "Let's think step by step" was enough for the model to convert the time zones and reach the correct conclusion.
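
The time-zone arithmetic behind the puzzle can also be checked with a few lines of Python. This is a sketch under stated assumptions: "local time" means Pacific Daylight Time, the deadline is in Eastern Daylight Time, and both are modelled as fixed UTC offsets. The `arrives_before_deadline` helper and the calendar date are our own choices for illustration:

```python
from datetime import datetime, timedelta, timezone

# Assumption: daylight time is in effect, so fixed offsets are sufficient here
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")


def arrives_before_deadline(departure: datetime, travel_hours: float, deadline: datetime) -> bool:
    """True if departure plus travel time falls strictly before the deadline."""
    return departure + timedelta(hours=travel_hours) < deadline


# The calendar date is arbitrary; only the clock times matter
departure = datetime(2025, 7, 1, 13, 0, tzinfo=PDT)  # 1PM in San Francisco
deadline = datetime(2025, 7, 1, 19, 0, tzinfo=EDT)   # 7PM EDT

fly_and_bus = arrives_before_deadline(departure, 3 + 2, deadline)
teleport_and_bus = arrives_before_deadline(departure, 0 + 1, deadline)
print(fly_and_bus, teleport_and_bus)  # False True
```

Timezone-aware `datetime` objects compare by absolute time, so the 3-hour gap between Pacific and Eastern time is handled without manual conversion.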

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/The-AI-Engineer-Challenge) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)