### Using the OpenAI Library to Programmatically Access GPT-4.1-nano!

In order to get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [4]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
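
If you re-run the cell above, `getpass` asks for the key every time. Here is a minimal sketch of one way around that, assuming a hypothetical helper named `ensure_api_key` (it is not part of the `openai` library) that only prompts when the variable is missing:

```python
import getpass
import os


def ensure_api_key(var_name: str = "OPENAI_API_KEY", env=None, prompt=getpass.getpass) -> str:
    """Return the key stored under var_name, prompting only if it is missing.

    Hypothetical helper for illustration. `env` and `prompt` are injectable
    so the function can be exercised without a terminal.
    """
    env = os.environ if env is None else env
    if not env.get(var_name):
        # Ask interactively only when nothing usable is already set.
        env[var_name] = prompt(f"Please enter your {var_name}: ")
    return env[var_name]
```

In the notebook, calling `ensure_api_key()` once before creating the client should be enough, since `OpenAI()` reads `OPENAI_API_KEY` from the environment by default.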

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's create a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages).

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects a list of message objects, each with a `role` and a `content` - so we'll be sure to do that!
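
To make the expected shape concrete, here is a small sketch that checks a message list before it is sent. `validate_messages` and `ALLOWED_ROLES` are hypothetical names made up for this example, and the check is deliberately narrower than the real API, which accepts more roles and richer content:

```python
ALLOWED_ROLES = {"developer", "assistant", "user"}  # the three roles used in this notebook


def validate_messages(messages) -> list:
    """Check that `messages` is a list of {"role": ..., "content": ...} dicts.

    Hypothetical helper for illustration only.
    """
    if not isinstance(messages, list):
        raise TypeError("messages must be a list, even for a single message")
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            raise TypeError(f"message {i} must be a dict")
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has an unexpected role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise TypeError(f"message {i} needs string content")
    return messages


# A single-turn conversation is still a list with one message in it.
single_turn = validate_messages([{"role": "user", "content": "Hello!"}])
```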

In [5]:
from openai import OpenAI

client = OpenAI()

In [7]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-Buw30tjD6h7g2Jpws5oGImIgDUlhB', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate building language model applications, especially those involving large language models (LLMs), but they serve different purposes and have distinct features. Here's an overview of their key differences:\n\n**1. Purpose and Focus:**\n\n- **LangChain:**\n  - Primarily a framework for developing complex, multi-step applications that involve chaining together various components like prompts, models, memory, tools, and more.\n  - Focuses on creating sophisticated conversational agents, workflows, or applications that require orchestration of multiple language model interactions.\n  - Emphasizes modularity and flexibility, enabling developers to build pipelines with chaining, agents, and memory.\n\n- **LlamaIndex (GPT Index):**\n 

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages to avoid a few extra characters of code!

##### Helper Functions

In [8]:
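from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Returns the full ChatCompletion payload, not just the text
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # Sent with the `developer` role, which carries the overarching instructions
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook
    display(Markdown(response.choices[0].message.content))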

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [9]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate building applications that leverage large language models (LLMs), but they serve different purposes and have distinct features. Here's a comparison to help clarify their differences:

### Purpose and Focus
- **LangChain**:
  - Focuses on creating **conversational agents, chatbots, and multi-step workflows**.
  - Provides tools for chaining together multiple language model calls, integrating external tools, memory management, and dialogue state tracking.
  - Emphasizes building **complex applications** that require dynamic interactions, reasoning, and memory.

- **LlamaIndex (GPT Index)**:
  - Specializes in **building and querying large document corpora** using LLMs.
  - Facilitates **indexing** of unstructured data (documents, PDFs, websites) into structures that can be efficiently queried.
  - Designed to make **retrieval-augmented generation (RAG)** workflows easier, enabling LLMs to fetch relevant information from large datasets.

### Core Functionality
- **LangChain**:
  - Provides **modules and abstractions** for prompts, models, chains, agents, tools, memory, and more.
  - Supports **multi-modal workflows**: combining language models with APIs, databases, tools.
  - Enables development of **personalized chatbots, autonomous agents** capable of performing complex tasks.

- **LlamaIndex**:
  - Offers **data ingestion pipelines** to load, parse, and index documents.
  - Provides **querying mechanisms** that combine retrieval with language models to answer questions based on your data.
  - Supports **custom indices** (e.g., tree indices, list indices) for specific data structures.

### Use Cases
- **LangChain**:
  - Building conversational agents and chatbots.
  - Automating workflows involving multiple steps and integrations.
  - Developing autonomous agents that can reason and interact with external tools.

- **LlamaIndex**:
  - Creating a knowledge base from large collections of documents.
  - Building question-answering systems that consult a corpus of data.
  - Summarizing or analyzing large unstructured datasets.

### Integration and Ecosystem
- **LangChain**:
  - Can work with various LLM providers (OpenAI, Hugging Face, etc.).
  - Supports integrations with external APIs, databases, and tools.
  - Has an active community and extensive documentation.

- **LlamaIndex**:
  - Designed to be compatible with multiple data sources and storage formats.
  - Integrates with LLM APIs for querying and retrieval.
  - Focused more on data management and retrieval workflows.

### Summary
| Aspect                | **LangChain**                                   | **LlamaIndex**                                 |
|-----------------------|------------------------------------------------|------------------------------------------------|
| Main Purpose          | Building chatbots, agents, workflows          | Indexing and querying large document corpora|
| Core Focus            | Chain-of-thought, multi-step reasoning, tool integration | Data ingestion, search, retrieval          |
| Use Cases             | Conversational AI, autonomous agents          | Knowledge bases, document QA                |
| Data Handling         | Dynamic, multi-turn conversations              | Large unstructured datasets                   |

---

### In brief:
- **Use LangChain** if you are developing complex conversational applications, workflows, or hybrid systems that involve reasoning, memory, and tool integration.
- **Use LlamaIndex** if your goal is to build a system that indexes vast amounts of unstructured data and allows efficient retrieval using LLMs.

Both frameworks can be complementary in some scenarios, and choosing between them depends on your specific application needs.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [10]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? After waiting forever for this question, I’d go nuts for some crushed ice right now! Cubed ice just doesn’t cut it when you’re starving and desperate for a cool, refreshing boost. Honestly, choose wisely—my stomach’s growling and I need something to satisfy this relentless hunger!

Let's try that same prompt again, but modify only our `developer` prompt!

In [11]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think crushed ice is fun because it's great for summer drinks and adds a refreshing crunch! Cubed ice, on the other hand, looks sleek and melts more slowly, making it perfect for sipping. Which do you prefer?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [12]:
print(joyful_response)

ChatCompletion(id='chatcmpl-Buw7PzCEXly9zVPM6B7PVufI6NtAt', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I think crushed ice is fun because it's great for summer drinks and adds a refreshing crunch! Cubed ice, on the other hand, looks sleek and melts more slowly, making it perfect for sipping. Which do you prefer?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1752909475, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint=None, usage=CompletionUsage(completion_tokens=45, prompt_tokens=30, total_tokens=75, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
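
As a rough sketch of what unpacking that payload can look like, the function below pulls out a few fields by attribute access. `summarize_response` is a hypothetical helper, and `fake_response` is a stand-in built with `SimpleNamespace` that copies only the attribute names visible in the printed output above, so the example runs without an API call:

```python
from types import SimpleNamespace


def summarize_response(response) -> dict:
    """Pull a few commonly used fields out of a chat completion object."""
    first_choice = response.choices[0]
    return {
        "text": first_choice.message.content,
        "finish_reason": first_choice.finish_reason,
        "model": response.model,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


# Stand-in object (not a real ChatCompletion) with values copied from the output above.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(
        finish_reason="stop",
        message=SimpleNamespace(content="Which do you prefer?"),
    )],
    model="gpt-4.1-nano-2025-04-14",
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=45, total_tokens=75),
)

summary = summarize_response(fake_response)
```

In the notebook, `summarize_response(joyful_response)` should behave the same way, since the real object exposes these attributes.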


### Few-shot Prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try and "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [13]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Sure! Here's a sentence using the words:

"During the workshop, she demonstrated a quick stimple to fix the broken device, while the team marveled at the rare falbean species they observed nearby."

As you can see, the model doesn't know what these made-up words mean, so it invents meanings of its own.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [14]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple wrench effortlessly tightened the falbean bolt, ensuring everything was securely in place.

As you can see, leveraging the `assistant` role makes for a stimple experience!
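
The same pattern extends to any number of examples. Here is a minimal sketch, assuming a hypothetical helper named `few_shot_messages` that turns (user, assistant) example pairs into the alternating message list used above:

```python
def few_shot_messages(examples, final_prompt: str) -> list:
    """Build a message list from (user_text, assistant_text) example pairs.

    Hypothetical helper: each pair becomes a `user` message followed by the
    `assistant` reply we want the model to imitate, and `final_prompt` is the
    real question, sent last as a `user` message.
    """
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": final_prompt})
    return messages


stimple_messages = few_shot_messages(
    examples=[(
        "Something that is 'stimple' is said to be good, well functioning, and high quality. "
        "An example of a sentence that uses the word 'stimple' is:",
        "'Boy, that there is a stimple drill'.",
    )],
    final_prompt=(
        "A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
        "An example of a sentence that uses the words 'stimple' and 'falbean' is:"
    ),
)
```

The resulting list can be passed straight to `get_response(client, stimple_messages)`.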

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions - but it can still benefit from a Chain of Thought Prompt to increase the reliability of the response!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [15]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze both options step by step, assuming Billy wants to arrive before 7PM EDT.

**Important details:**

- **Current time:** 1PM local time.
- **Target arrival time:** before 7PM EDT.

---

### Scenario 1: Flying + Bus
- **Fly:** 3 hours.
- **Then take a bus:** 2 hours.
- **Total travel time:** 3 + 2 = **5 hours**.

**Departure time:** Since Billy is starting at 1PM local time, he can depart immediately.

**Arrival time:** 1PM + 5 hours = 6PM local time.

**Is this before 7PM EDT?**  
- Yes, if local time is aligned with EDT, he would arrive at **6PM EDT**, which is before 7PM.

---

### Scenario 2: Teleporter + Bus
- **Teleporter:** 0 hours.
- **Bus:** 1 hour.
- **Total travel time:** 0 + 1 = **1 hour**.

**Departure time:** at 1PM, same as above.

**Arrival time:** 1PM + 1 hour = 2PM local time.

**Is this before 7PM EDT?**  
- Yes, 2PM is before 7PM.

---

### **Conclusion:**

Both options get Billy home well before the 7PM cutoff, with the teleporter option arriving significantly earlier.

**Does it matter which option Billy selects?**  
- **In terms of arriving before 7PM**, no, both options are fine.
- **In terms of arrival time:** the teleporter + bus gets him home earlier.
- **In terms of resource or convenience:** depends on Billy's priorities, but travel time difference is clear.

**Final answer:**  
*No, it doesn't matter in terms of reaching before 7PM.* Both options allow Billy to arrive on time.

Let's use the same prompt with a small modification - this time we'll append "Let's think step by step."

In [16]:

list_of_prompts = [
    user_prompt(reasoning_problem + "\nLet's think step by step.")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the options step by step.

**Given Data:**
- Current local time: 1PM
- Billy wants to arrive **before 7PM EDT**
- Time constraint: must arrive **before 7PM**

---

### Step 1: Determine the total travel times for each option

**Option 1: Fly + Bus**
- Fly time: 3 hours
- Bus time afterward: 2 hours
- Total travel time: 3 + 2 = **5 hours**

**Option 2: Teleporter + Bus**
- Teleporter time: 0 hours
- Bus time afterward: 1 hour
- Total travel time: 0 + 1 = **1 hour**

---

### Step 2: Calculate latest departure times to arrive before 7PM

**Option 1:**
- Arrival deadline: 7PM
- Total travel time: 5 hours
- Latest departure time: 7PM - 5 hours = **2PM**

**Option 2:**
- Arrival deadline: 7PM
- Total travel time: 1 hour
- Latest departure time: 7PM - 1 hour = **6PM**

---

### Step 3: Compare current time to departure deadlines

- Current local time: 1PM
- For **Option 1 (Fly + Bus)**:
  - Billy must depart **by 2PM**. Since it's currently 1PM, he has **1 hour** to decide and depart.
- For **Option 2 (Teleporter + Bus)**:
  - Billy must depart **by 6PM**. He has **5 hours** remaining.

### **Conclusion:**

Billy can still make either option, as he has enough time for both:

- He needs to depart **by 2PM** for Option 1.
- He has until **6PM** for Option 2.

**Does it matter which option he chooses?**

- From a timing perspective, **both options are feasible since current time is 1PM**.
- The main difference is the duration of travel and flexibility:

  - **Option 1** takes longer (5 hours total), and he must depart relatively soon.
  - **Option 2** is much faster (1 hour), giving more flexibility.

**Final Answer:**

**It does matter, because taking the teleporter allows Billy to arrive well before 7PM without rushing, while flying requires him to depart soon (by 2PM).** 

In terms of ensuring timely arrival, **taking the teleporter + bus gives Billy more margin**, but **both options can work if he departs on time**. If Billy wants to guarantee arriving before 7PM without tight schedule constraints, the teleporter + bus is the better choice.

---

**Summary:**  
Yes, it does matter which option Billy selects, because the teleporter plus bus allows for more departure flexibility and guarantees earlier arrival, while the airplane plus bus requires a prompt departure.

As humans, we can reason through the problem and pick up on the potential "trick" that the LLM fell for: 1PM *local time* in San Fran. is 4PM EDT. This means the cumulative travel time of 5hrs. for the plane/bus option would get Billy home at 9PM EDT, well after the 7PM deadline, while the teleporter/bus option gets him home at 5PM EDT.
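
The time zone arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: San Francisco is on PDT (UTC-7) because the deadline is given in EDT (UTC-4), the calendar date is arbitrary, and Billy leaves right at 1PM.

```python
from datetime import datetime, timedelta, timezone

# Fixed summer offsets (assumed, since the problem says "EDT").
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

# The date is arbitrary; only the clock times matter here.
departure = datetime(2025, 7, 1, 13, 0, tzinfo=PDT)  # 1PM in San Francisco
deadline = datetime(2025, 7, 1, 19, 0, tzinfo=EDT)   # 7PM EDT

options = {
    "fly + bus": timedelta(hours=3 + 2),
    "teleporter + bus": timedelta(hours=0 + 1),
}


def arrival_in_edt(travel_time: timedelta) -> datetime:
    """Add the travel time to the departure and express the result in EDT."""
    return (departure + travel_time).astimezone(EDT)


arrivals = {name: arrival_in_edt(travel_time) for name, travel_time in options.items()}

for name, arrival in arrivals.items():
    print(f"{name}: arrives {arrival:%H:%M} EDT, before deadline: {arrival < deadline}")
# fly + bus: arrives 21:00 EDT, before deadline: False
# teleporter + bus: arrives 17:00 EDT, before deadline: True
```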

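Neither response above converts the local time to EDT, so a more explicit CoT prompt, one that asks the model to put every time in the same time zone first, is a natural next thing to try to improve our model's performance on this task.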
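
One possible shape for such a prompt is sketched below. `COT_SUFFIX` and `with_cot` are hypothetical names, the wording of the instruction is only a suggestion, and whether it actually fixes the answer is something to verify by running it:

```python
COT_SUFFIX = (
    "\nLet's think step by step. "
    "First convert every time in the problem to the same time zone, "
    "then compute the arrival time for each option, "
    "and only then compare it with the deadline."
)


def with_cot(problem: str, suffix: str = COT_SUFFIX) -> list:
    """Wrap a problem in a single `user` message that ends with a CoT instruction."""
    return [{"role": "user", "content": problem.strip() + suffix}]


# Placeholder problem so the sketch runs on its own.
cot_messages = with_cot("It's 1PM in San Francisco. What time is it in New York?")
```

In the notebook, `get_response(client, with_cot(reasoning_problem))` would send the Billy problem with this instruction appended.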

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/The-AI-Engineer-Challenge) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)