### Using the OpenAI Library to Programmatically Access GPT-4.1-nano!

In order to get started, we'll need to provide our OpenAI API Key - detailed instructions can be found [here](https://github.com/AI-Maker-Space/Interactive-Dev-Environment-for-LLM-Development#-setting-up-keys-and-tokens)!

In [None]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]

Please enter your OpenAI API Key: ··········
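
If the key is already exported in your shell, there's no need to be prompted on every run. Here is a minimal sketch of that fallback. The helper name `resolve_api_key` is hypothetical (it is not part of the OpenAI library), and the environment mapping and prompt function are passed in as parameters so the logic can be tried without a real key:

```python
import getpass
import os
from typing import Callable, MutableMapping


def resolve_api_key(
    env: MutableMapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the key stored under `name`, prompting only when it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        # Nothing usable in the environment, so ask the user (hypothetical fallback).
        key = prompt(f"Please enter your {name}: ").strip()
        env[name] = key  # store it so that `OpenAI()` can read it from the environment
    return key
```

Calling `resolve_api_key()` with no arguments uses the real environment and `getpass`, so it behaves like the cell above whenever the variable is not set.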


### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's send a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages).

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects our messages as a list of message objects (each with a `role` and `content`) - so we'll be sure to do that!

In [None]:
from openai import OpenAI

client = OpenAI()

In [None]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-BaWc1DuZXA2Pw5SQAlQRHUBfErubp', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Great question! LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks and tools in the AI and natural language processing ecosystem, but they serve different purposes and have distinct functionalities. Here's a breakdown of their main differences:\n\n**1. Purpose and Core Use Cases:**\n\n- **LangChain:**\n  - **Primary Focus:** Building complex, multi-step applications that involve large language models (LLMs), such as chaining prompts, handling conversations, agents, and workflows.\n  - **Use Cases:** Chatbots, conversational agents, application orchestration, dynamic prompt management, tool integration, and automating interactions with LLMs.\n  - **Features:** Supports prompt chaining, memory management, agent frameworks, and integrations with various LLM providers and external tools.\n\n- **Llama

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages, saving us a few extra characters of code!

### Helper Functions

In [None]:
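from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts to the endpoint and return the full response object.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # System-level instructions are sent with the `developer` role.
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the first choice's message content as Markdown in the notebook.
    display(Markdown(response.choices[0].message.content))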

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [None]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate building applications that leverage large language models (LLMs) and external data sources. However, they have distinct focuses and functionalities. Here's a comparison to clarify their differences:

**1. Purpose and Core Focus:**

- **LangChain:**
  - **Primary Focus:** Modular framework for building 'chain'-based applications with LLMs.
  - **Use Cases:** Conversation agents, chatbots, decision-making pipelines, and multi-step LLM workflows.
  - **Features:** Emphasizes composability, with tools for chaining prompts, managing memory, handling user interactions, and integrating various LLM APIs.
  - **Data Handling:** Supports incorporating external data sources but typically requires custom integration.

- **LlamaIndex (GPT Index):**
  - **Primary Focus:** Data framework for building indices over external data sources to enable context-aware querying with LLMs.
  - **Use Cases:** Building knowledge bases, document retrieval, question-answering over large corpora, knowledge management.
  - **Features:** Provides data ingestion pipelines, index structures (e.g., trees, graphs), and query interfaces designed to efficiently retrieve relevant data snippets for LLM prompts.
  - **Data Handling:** Specifically optimized for indexing and querying large datasets, enabling LLMs to use external knowledge effectively.

**2. Architecture and Design:**

- **LangChain:** 
  - Emphasizes a composable "chain" architecture—connecting prompts, models, memory, and tools.
  - Supports a wide variety of chains (e.g., simple prompt-response, multi-turn dialogs, tools integration).

- **LlamaIndex:**
  - Focuses on creating and managing data indices that allow LLMs to reference external documents.
  - Incorporates components like document loaders, index builders, and query engines.

**3. Integration and Extensibility:**

- **LangChain:**
  - Integrates with multiple LLM providers (OpenAI, Hugging Face, etc.).
  - Offers integrations with APIs, databases, and other tools.
  - Extensively supports custom chains and tools for advanced workflows.

- **LlamaIndex:**
  - Designed to easily ingest large datasets and create retrieval-optimized indices.
  - Supports various data sources (PDFs, webpages, databases).

**4. Typical Usage Scenarios:**

- **LangChain:** Building chatbots, conversational agents, AI assistants that require multi-step reasoning, memory, or external tool integration.

- **LlamaIndex:** Creating a searchable knowledge base from large collections of documents, enabling LLMs to answer questions with access to external data.

---

**Summary:**

| Aspect                  | LangChain                                              | LlamaIndex                                              |
|-------------------------|--------------------------------------------------------|--------------------------------------------------------|
| Core Purpose            | Building multi-step LLM applications, chains, and workflows | Building and querying indices over external datasets  |
| Focus                   | Modular orchestration, chaining, and tool integration | Data ingestion, indexing, and retrieval for QA       |
| Typical Use Cases       | Chatbots, agents, decision workflows                  | Document Q&A, knowledge bases                        |
| Data Handling           | Supports external data but less focused on indexing     | Specialized in indexing, retrieval, and data management |

**In essence:**  
- Use **LangChain** if you're aiming to build complex, multi-step applications involving LLMs, memory, tools, and conversational flows.  
- Use **LlamaIndex** if your primary goal is to process large external datasets, create indices, and enable question-answering over that data with LLMs.

---

If you're planning a project, consider whether your focus is on orchestrating multi-step interactions (**LangChain**) or on efficiently managing and querying large datasets (**LlamaIndex**). Often, they can be used together for comprehensive applications.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [None]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? I haven't eaten in ages, and now you ask about ice preferences? Honestly, I don't care if it's crushed or cubed—what I want is real food! But if I had to choose, crushed ice is just frustratingly messy, and cubed ice takes forever to chew through when I'm starving. Ugh! Just give me some actual food already!

Let's try that same prompt again, but modify only our `developer` (system) prompt!

In [None]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I love both! Crushed ice is perfect for refreshing drinks and gives a nice cool burst, while cubed ice adds a touch of elegance and lasts longer in your beverage. It really depends on what you're craving—what about you? Which do you prefer?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [None]:
print(joyful_response)

ChatCompletion(id='chatcmpl-BaWd1pMBkElNFiiI7AWuHxxCEXTAT', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I love both! Crushed ice is perfect for refreshing drinks and gives a nice cool burst, while cubed ice adds a touch of elegance and lasts longer in your beverage. It really depends on what you're craving—what about you? Which do you prefer?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1748044931, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_eede8f0d45', usage=CompletionUsage(completion_tokens=52, prompt_tokens=30, total_tokens=82, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
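
To make "examine and unpack" concrete, here is a small sketch that pulls out the fields visible in the printout above. The helper name `summarize_response` is hypothetical, and a `SimpleNamespace` stand-in (an assumption that mirrors only the attributes shown in the output) replaces the real response so the block runs without an API call:

```python
from types import SimpleNamespace


def summarize_response(response) -> dict:
    """Collect commonly needed fields from a chat completion response."""
    first_choice = response.choices[0]
    return {
        "text": first_choice.message.content,
        "role": first_choice.message.role,
        "finish_reason": first_choice.finish_reason,
        "model": response.model,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


# Stand-in object copying the attribute layout printed above (not a real response).
# In the notebook, pass `joyful_response` instead.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(
        finish_reason="stop",
        message=SimpleNamespace(role="assistant", content="I love both!"),
    )],
    model="gpt-4.1-nano-2025-04-14",
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=52, total_tokens=82),
)

summary = summarize_response(fake_response)
print(summary["finish_reason"], summary["total_tokens"])  # prints: stop 82
```

The `usage` counts are handy for tracking cost, and `finish_reason` tells us whether the model stopped on its own or was cut off.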


### Few-shot Prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [None]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Sure! Here's a sentence using both words:

"During the strange festival, I noticed a stimple dance performed by the villagers while a falbean melody played softly in the background."

As you can see, the model has no way of knowing what these made-up words mean, so it simply guesses at how to use them.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [None]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple wrench quickly secured the falbean in place, ensuring the machinery operated smoothly.

As you can see, leveraging the `assistant` role makes for a stimple experience!
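
The alternating `user`/`assistant` pattern above generalizes to any number of worked examples. Here is a minimal sketch, assuming a hypothetical helper named `few_shot_messages`; the two role helpers are repeated from earlier so the block stands on its own, and nothing here calls the API:

```python
def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}


def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}


def few_shot_messages(examples: list[tuple[str, str]], question: str) -> list[dict]:
    """Turn (user text, ideal answer) pairs plus a final question into a message list."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append(user_prompt(user_text))            # the demonstration question
        messages.append(assistant_prompt(assistant_text))  # the answer we "pretend" to give
    messages.append(user_prompt(question))                 # the real question goes last
    return messages


few_shot = few_shot_messages(
    examples=[(
        "Something that is 'stimple' is said to be good. Use 'stimple' in a sentence:",
        "'Boy, that there is a stimple drill'.",
    )],
    question="Use the words 'stimple' and 'falbean' in a sentence:",
)
print([message["role"] for message in few_shot])  # prints: ['user', 'assistant', 'user']
```

The resulting list can be passed to `get_response(client, few_shot)` exactly like the hand-written list above.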

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [None]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the options:

**Option 1:** Fly (3 hours) + Bus (2 hours)  
Total travel time: 3 + 2 = 5 hours

**Option 2:** Teleporter (0 hours) + Bus (1 hour)  
Total travel time: 0 + 1 = 1 hour

---

**Additional considerations:**
- **Current time:** 1PM local time
- **Deadline:** Before 7PM EDT

Assuming the current local time is 1PM and Billy wants to arrive **before 7PM EDT**, we need to consider the time difference between his local time zone and EDT.

**Important:**  
- San Francisco is in PDT (Pacific Daylight Time), which is UTC-7.  
- EDT (Eastern Daylight Time) is UTC-4.  
- Therefore, EDT is 3 hours ahead of PDT.

---

**Converting current local time to EDT:**  
- If it's 1PM local time in San Francisco (PDT), the current EDT is:  
1PM + 3 hours = 4PM EDT

**Deadline in local time:**  
Billy needs to arrive **before 7PM EDT**, which is:  
7PM EDT - 3 hours = 4PM PDT

So, Billy must arrive **before 4PM PDT**.

---

**Calculating latest departure times:**

- To arrive before 4PM PDT:  
  - **Option 1 (Fly + Bus):**  
    Total time: 5 hours  
    Latest departure time: 4PM - 5 hours = **11AM PDT**  
  - **Option 2 (Teleporter + Bus):**  
    Total time: 1 hour  
    Latest departure time: 4PM - 1 hour = **3PM PDT**

---

**Current time and possibilities:**

- Now it is 1PM PDT.
- **Option 1:**  
  You must leave **by 11AM PDT** to arrive on time — **already too late** now.
- **Option 2:**  
  Must leave **by 3PM PDT** — it is currently 1PM, so there is **2 hours remaining** to catch the teleporter.

---

**Conclusion:**  
- **If Billy wants to arrive before 7PM EDT (which is 4PM PDT),** he **must** take the teleporter option, because the fly+bus option is no longer feasible (he would need to depart by 11AM PDT, which has already passed).

- **Given current time (1PM PDT),** Billy can take the teleporter + bus to arrive on time, but the fly + bus option is too late.

---

### Final answer:

**It does matter which option Billy chooses.**  
- The teleporter + bus allows him to arrive on time (before 4PM PDT, which is before 7PM EDT).  
- The fly + bus option cannot make it in time given the current time.

**Therefore, Billy should choose the teleporter + bus option to ensure he gets home before 7PM EDT.**

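As humans, we can reason through the problem and pick up on the potential "trick" built into it: 1PM *local time* in San Fran. is already 4PM EDT. Adding the cumulative travel time of 5hrs. for the plane/bus option puts Billy home at 9PM EDT, well past the deadline, while the teleporter/bus option gets him home at 5PM EDT. In the run shown above the model caught the time-zone conversion, but LLM outputs vary from run to run, so it's worth checking the reasoning each time.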
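
The time-zone arithmetic is easy to check directly. This sketch uses fixed UTC offsets (PDT = UTC-7 and EDT = UTC-4, as stated in the model's answer) and an arbitrary calendar date, which is an assumption since the problem gives none:

```python
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

# The date is arbitrary (an assumption); only the clock times matter.
departure = datetime(2025, 6, 1, 13, 0, tzinfo=PDT)  # 1PM local time in San Francisco
deadline = datetime(2025, 6, 1, 19, 0, tzinfo=EDT)   # 7PM EDT

options = {
    "fly + bus": timedelta(hours=3 + 2),
    "teleporter + bus": timedelta(hours=0 + 1),
}


def arrival_report(travel_time: timedelta) -> tuple[str, bool]:
    """Return the arrival time in EDT and whether it falls before the deadline."""
    arrival = (departure + travel_time).astimezone(EDT)
    return arrival.strftime("%H:%M %Z"), arrival < deadline


for name, travel_time in options.items():
    arrival_text, on_time = arrival_report(travel_time)
    print(f"{name}: arrives {arrival_text}, on time: {on_time}")
# prints:
# fly + bus: arrives 21:00 EDT, on time: False
# teleporter + bus: arrives 17:00 EDT, on time: True
```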

Let's see if we can leverage a simple CoT prompt to improve our model's performance on this task:
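
One way to do this, sketched here as an assumption rather than as the notebook's original cell, is to append an explicit step-by-step instruction to the problem. The wording of `COT_SUFFIX` and the helper name `cot_messages` are illustrative only. The block just builds the message list, so it runs without an API call:

```python
COT_SUFFIX = (
    "Think through your response step by step: convert every time to the "
    "same time zone first, then compute each arrival time before answering."
)


def cot_messages(problem: str, suffix: str = COT_SUFFIX) -> list[dict]:
    """Wrap a problem in a single user message that ends with a CoT instruction."""
    content = f"{problem.strip()}\n\n{suffix}"
    return [{"role": "user", "content": content}]


messages = cot_messages("Billy wants to get home from San Fran. before 7PM EDT.")
print(messages[0]["content"])
```

In the notebook, `pretty_print(get_response(client, cot_messages(reasoning_problem)))` would send the full problem with the added instruction, and the answer can then be compared against the earlier response.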

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

You can find the rest of the steps in [this](https://github.com/AI-Maker-Space/Beyond-ChatGPT/tree/main) repository!

This notebook was authored by [Chris Alexiuk](https://www.linkedin.com/in/csalexiuk/)