### Using the OpenAI Library to Programmatically Access GPT-4.1-nano!

To get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [1]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
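If you re-run the notebook often, being prompted for the key every time gets tedious. The sketch below shows one way to prompt only when the key is missing from the environment. `ensure_api_key` is a hypothetical helper name (it is not part of the OpenAI library), and the `prompt_fn` parameter is an assumption added so the function can be exercised without typing anything.

```python
import getpass
import os

def ensure_api_key(env_var: str = "OPENAI_API_KEY", prompt_fn=getpass.getpass) -> str:
    """Hypothetical helper: return the key from the environment, prompting only if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # Nothing set yet, so ask for it (input is hidden when getpass is used)
        key = prompt_fn("Please enter your OpenAI API Key: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` in a cell would then either reuse the existing value or fall back to the same `getpass` prompt used above.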

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's create a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages)

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects `messages` to be a list of message objects, each with a `role` and a `content` - so we'll be sure to do that!
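To make that shape concrete, here is a minimal sketch of a message list as plain Python data. The `validate_messages` helper is hypothetical (it is not part of the OpenAI library), and restricting the roles to the three listed above is an assumption made for this notebook, not a statement about everything the API accepts.

```python
# Roles used in this notebook; this allowed set is an assumption for illustration.
ALLOWED_ROLES = {"developer", "assistant", "user"}

def validate_messages(messages) -> bool:
    """Hypothetical helper: check that `messages` is a non-empty list of role/content dicts."""
    if not isinstance(messages, list) or not messages:
        return False
    for message in messages:
        if not isinstance(message, dict):
            return False
        if message.get("role") not in ALLOWED_ROLES:
            return False
        if not isinstance(message.get("content"), str):
            return False
    return True

example_messages = [
    {"role": "developer", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the difference between LangChain and LlamaIndex?"},
]

print(validate_messages(example_messages))                    # a well-formed list of messages
print(validate_messages({"role": "user", "content": "hi"}))   # a bare dict, not wrapped in a list
```

The second call illustrates the most common slip: passing a single message dict instead of a list containing it.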

In [2]:
from openai import OpenAI

client = OpenAI()

In [3]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-Bm0T5mdZl3Ng7wpVKpum16svVsQcT', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Certainly! Here’s an overview of the differences between **LangChain** and **LlamaIndex (formerly known as GPT Index)**:\n\n### **LangChain**\n- **Purpose:** A comprehensive framework designed for building applications with large language models (LLMs). It focuses on **orchestrating** various components such as prompt management, model interaction, outside data retrieval, and more.\n- **Features:**\n  - Modular architecture supporting chains, prompts, agents, and memory.\n  - Integration with multiple LLM providers (OpenAI, Hugging Face, etc.).\n  - Tools for prompt engineering, conversation management, and task orchestration.\n  - Supports retrieval-augmented generation (RAG) workflows, including document retrieval and question-answering systems.\n- **Use Cases:** Building chatbots, question-answering systems, automation workf

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages to avoid a few extra characters of code!

##### Helper Functions

In [4]:
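from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts to the chat completions endpoint and return the full response object
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # Wrap text as a `developer` message (the overarching instruction)
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    # Wrap text as an `assistant` message (a model-side turn)
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    # Wrap text as a `user` message
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the first choice's text as Markdown in the notebook
    display(Markdown(response.choices[0].message.content))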

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [5]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as GPT Index) are both frameworks designed to facilitate the development of applications that leverage large language models (LLMs), but they serve different purposes and have distinct focus areas. Here’s an overview of their differences:

**1. Purpose and Primary Focus**

- **LangChain:**  
  - Focuses on building **multi-step, complex language model applications** that involve chaining together various components like prompts, memory, agents, and tools.  
  - Emphasizes creating **chatbots, question-answering systems, and workflows** that require orchestration and state management.  
  - Provides a modular framework for managing prompts, agents, memory, and integration with external data sources.

- **LlamaIndex (GPT Index):**  
  - Focuses on **efficiently ingesting, indexing, and querying large-scale external data** (documents, files, PDFs, etc.) using LLMs.  
  - Designed to build **semantic search, retrieval-augmented generation (RAG)** applications, enabling LLMs to access and reason over large document collections.  
  - Simplifies embedding, indexing, and querying workflows over unstructured data.

**2. Core Functionality**

- **LangChain:**  
  - Provides abstractions for **prompt management**, **tool integration**, **memory**, and **agent orchestration**.  
  - Supports constructing **multi-turn conversations**, **decision-making agents**, and **workflow pipelines**.  
  - Can integrate with various external APIs, databases, or tools within an application.

- **LlamaIndex:**  
  - Offers **data ingestion pipelines** to load and process large document collections.  
  - Creates **vector indexes** (e.g., GPT, Simple, Tree, Keyword) to facilitate fast retrieval and querying.  
  - Enables creating **chatbots or Q&A systems** that can access external data sources dynamically.

**3. Use Cases**

| Use Case | LangChain | LlamaIndex |
|---|---|---|
| Building chatbots / conversational agents | Yes | Possible, often as an auxiliary component |
| Complex workflows involving prompts, memory, tools | Yes | No (focused on data retrieval and indexing) |
| Semantic search / document retrieval | Limited | Yes |
| Building retrieval-augmented generation systems | Possible | Yes |
| Orchestrating multi-step LLM applications | Yes | No |

**4. Integration and Ecosystem**

- **LangChain:**  
  - Has a broader ecosystem with integrations for various LLM providers (OpenAI, Hugging Face, AI21), memory modules, agents, and tools.  
  - Supports complex decision-making processes and interactions.

- **LlamaIndex:**  
  - Specifically tailored towards managing unstructured data sources and enabling LLMs to access and reason over them efficiently.  
  - Integrates with data storage solutions and embedding models.

---

### **In Summary:**

- **LangChain** is a versatile framework focusing on the orchestration, management, and chaining of LLM-based components for building complex AI applications, especially conversational and multi-step workflows.
- **LlamaIndex** specializes in ingesting and indexing large external datasets to enable fast, scalable retrieval and question-answering systems over unstructured data.

Depending on your project needs—whether it's building complex conversational workflows or enabling document-based retrieval—you might choose one or both frameworks in combination.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [6]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? After all I'm going through right now, the very LAST thing I have time for is debating ice types! I mean, crushed ice melts faster, ruining my drink—and cubed ice is too slow and bulky! Honestly, just give me ice so I can get on with stuff already! I'm furious just thinking about this trivial nonsense!

Let's try that same prompt again, but modify only our `developer` prompt!

In [7]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think crushed ice is fun because it feels like a cool burst of refreshment all at once! Cubed ice is great for keeping drinks cold without diluting them too quickly and looks nice in a glass. Both have their charms—what's your favorite?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [8]:
print(joyful_response)

ChatCompletion(id='chatcmpl-Bm0TwsXHyCObslTRMC8qayceKYajt', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I think crushed ice is fun because it feels like a cool burst of refreshment all at once! Cubed ice is great for keeping drinks cold without diluting them too quickly and looks nice in a glass. Both have their charms—what's your favorite?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1750781296, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_38343a2f8f', usage=CompletionUsage(completion_tokens=52, prompt_tokens=30, total_tokens=82, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
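As a rough sketch of what "unpacking" that payload can look like, the function below reads a handful of the attributes visible in the printed output above (`choices`, `finish_reason`, `model`, `usage`). `summarize_response` is a hypothetical helper name, and the `SimpleNamespace` object is only a stand-in that mimics those attributes so the example runs without an API call; its message text is a shortened placeholder, while the token counts and model name are copied from the output above.

```python
from types import SimpleNamespace

def summarize_response(response) -> dict:
    """Hypothetical helper: pull a few commonly used fields out of a chat completion response."""
    first_choice = response.choices[0]
    return {
        "text": first_choice.message.content,
        "finish_reason": first_choice.finish_reason,
        "model": response.model,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }

# Stand-in for a real response object (assumption: same attribute names as the printed payload)
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(content="Both have their charms!"),  # shortened placeholder text
        )
    ],
    model="gpt-4.1-nano-2025-04-14",
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=52, total_tokens=82),
)

summary = summarize_response(fake_response)
print(summary["finish_reason"], summary["total_tokens"])
```

In the notebook itself, `summarize_response(joyful_response)` should work the same way, since it only relies on attribute access.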


### Few-shot Prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [9]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Though the terms "stimple" and "falbean" aren't recognized in standard English, here's a creative sentence using them:

"During the fantasy game, I encountered a stimple creature that guarded a falbean artifact deep within the enchanted forest."

As you can see, the model is unsure what to do with these made-up words.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [10]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple wrench easily adjusted the falbean to secure the machinery.

As you can see, leveraging the `assistant` role makes for a stimple experience!
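The alternating `user`/`assistant` pattern used above generalizes to any number of examples. Here is a minimal sketch of a builder for it; `build_few_shot_messages` is a hypothetical helper (not part of the OpenAI library), and it uses plain dicts so it runs on its own rather than depending on the notebook's helper functions.

```python
def build_few_shot_messages(examples: list, final_question: str) -> list:
    """Hypothetical helper: turn (question, answer) pairs into alternating user/assistant messages."""
    messages = []
    for question, answer in examples:
        # Each example is a user turn followed by the answer we "pretend" the model gave
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    # The real question goes last, as a user turn for the model to complete
    messages.append({"role": "user", "content": final_question})
    return messages

few_shot_messages = build_few_shot_messages(
    examples=[
        (
            "Something that is 'stimple' is said to be good, well functioning, and high quality. "
            "An example of a sentence that uses the word 'stimple' is:",
            "'Boy, that there is a stimple drill'.",
        ),
    ],
    final_question=(
        "A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
        "An example of a sentence that uses the words 'stimple' and 'falbean' is:"
    ),
)

print([message["role"] for message in few_shot_messages])
```

The resulting list has the same shape as `list_of_prompts` in the cell above, so it could be passed to `get_response` in the same way.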

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [11]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the options carefully:

**Option 1:** Fly + Bus  
- Fly time: 3 hours  
- Bus time afterward: 2 hours  
- Total time: 3 + 2 = 5 hours

**Option 2:** Teleporter + Bus  
- Teleporter time: 0 hours  
- Bus time afterward: 1 hour  
- Total time: 0 + 1 = 1 hour

**Current time:** 1PM local time  
**Goal:** Arrive in San Francisco **before 7PM EDT**

---

**Important points:**

1. The current local time is 1PM.
2. The deadline is 7PM EDT.

## Time Zone Considerations:
- San Francisco is in Pacific Time (PT).  
- EDT is Eastern Daylight Time, which is typically **3 hours ahead** of PT.  

**Therefore:**

- 7PM EDT corresponds to **4PM PT** (since EDT is 3 hours ahead).

---

### Timing for each option:

**Option 1 (Fly + Bus):**  
- Total travel time: 5 hours  
- Earliest start: 1PM  
- Arrival time in PT: 1PM + 5 hours = **6PM PT**  
- Convert to EDT: 6PM PT + 3 hours = **9PM EDT**

**Result:** Billy would arrive **after 7PM EDT**, missing his deadline.

---

**Option 2 (Teleporter + Bus):**  
- Total travel time: 1 hour  
- Earliest start: 1PM  
- Arrival time in PT: 1PM + 1 hour = **2PM PT**  
- Convert to EDT: 2PM PT + 3 hours = **5PM EDT**

**Result:** Billy would arrive **before 7PM EDT**, meeting his deadline.

---

### **Conclusion:**

Yes, it **does** matter which option Billy chooses. Only taking the teleporter plus bus ensures he arrives before 7PM EDT. The flying option, taking longer, causes him to arrive too late.

As humans, we can reason through the problem and pick up on the potential "trick" that an LLM can fall for: 1PM *local time* in San Fran. is 4PM EDT. This means the cumulative travel time of 5hrs. for the plane/bus option would not get Billy home in time. In the run shown above, the model handled the time zone conversion and reached the same conclusion.

Let's see if we can leverage a simple CoT prompt to improve our model's performance on this task:
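A minimal sketch of one such prompt follows. The exact wording of `COT_SUFFIX` is an assumption (any instruction that asks for step-by-step reasoning would serve), `with_chain_of_thought` is a hypothetical helper, and `user_prompt` is redefined here only so the block runs on its own. No API call is made in the block, and no particular improvement in the answer is guaranteed.

```python
# Assumed wording for the Chain of Thought instruction; adjust to taste.
COT_SUFFIX = "Think through your response step by step."

def user_prompt(message: str) -> dict:
    # Same shape as the notebook's helper, repeated so this block is self-contained
    return {"role": "user", "content": message}

def with_chain_of_thought(problem: str) -> list:
    """Hypothetical helper: append a step-by-step instruction to the problem text."""
    return [user_prompt(problem.strip() + "\n\n" + COT_SUFFIX)]

reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

cot_prompts = with_chain_of_thought(reasoning_problem)
print(cot_prompts[0]["content"])

# In the notebook, the list can then be sent with the existing helpers:
# cot_response = get_response(client, cot_prompts)
# pretty_print(cot_response)
```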

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/Beyond-ChatGPT/tree/main) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)