In [None]:
import marimo as mo

# AI/LLM Engineering Kick-off

For our first activity, we will use the OpenAI library to programmatically access `gpt-4.1-nano`!

To get started, you'll need an OpenAI API key, which you can create [here](https://platform.openai.com)!

In [None]:
import os
import openai

# Read API key from OpenAI_key.txt file
with open("OpenAI_key.txt", "r") as f:
    api_key = f.read().strip()

os.environ["OPENAI_API_KEY"] = api_key
openai.api_key = os.environ["OPENAI_API_KEY"]
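Reading the key from a file works, but an environment variable is a common alternative that keeps the key out of the project folder. The sketch below assumes a hypothetical helper named `load_api_key` (it is not part of the `openai` library) that prefers the environment variable and falls back to the text file used above:

```python
import os
from pathlib import Path


def load_api_key(path: str = "OpenAI_key.txt",
                 env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, falling back to a text file.

    Hypothetical helper for illustration only; not part of the openai library.
    """
    # Prefer the environment variable when it is set and non-empty
    key = os.environ.get(env_var, "").strip()
    if key:
        return key

    # Otherwise fall back to the key file
    key_file = Path(path)
    if not key_file.exists():
        raise FileNotFoundError(
            f"Set {env_var} or create {path} containing your key."
        )
    key = key_file.read_text().strip()
    if not key:
        raise ValueError(f"{path} is empty.")
    return key
```

Whichever source you use, keep the key file out of version control.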

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's send a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages)

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects the conversation as a list of message objects - so we'll be sure to pass a list!
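As a rough sketch of that shape: each message is a plain dictionary with a `role` and a `content` key, and the request takes a list of them, even when there is only one. The `validate_messages` helper and the `NOTEBOOK_ROLES` set below are hypothetical names, written only to make the structure explicit; the API performs its own validation.

```python
# Roles covered in this notebook; the API documentation lists others as well.
NOTEBOOK_ROLES = {"developer", "assistant", "user"}


def validate_messages(messages: list[dict]) -> list[dict]:
    """Check that `messages` is a list of {"role", "content"} dictionaries.

    Hypothetical helper for illustration; the API does its own validation.
    """
    if not isinstance(messages, list):
        raise TypeError("messages must be a list, even for a single message")
    for message in messages:
        if set(message) != {"role", "content"}:
            raise ValueError(
                f"expected 'role' and 'content' keys, got {sorted(message)}"
            )
        if message["role"] not in NOTEBOOK_ROLES:
            raise ValueError(f"unexpected role: {message['role']!r}")
    return messages


# A single-turn conversation is still a list with one message in it
single_turn = validate_messages([{"role": "user", "content": "Hello!"}])
```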

In [None]:
from openai import OpenAI

client = OpenAI()

In [None]:
YOUR_PROMPT = "What is the difference between the LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{
        "role": "user",
        "content": YOUR_PROMPT
    }]
)

| Aspect | LangChain | LlamaIndex |
|---|---|---|
| Main Purpose | Building LLM apps with complex workflows | Indexing and retrieval over data |
| Focus | Workflow orchestration, chaining, tools | Data indexing, retrieval, QA |
| Use Cases | Chatbots, agents, multi-step LLM tasks | Knowledge bases, document QA |
| Data Handling | External integrations, prompt management | Data ingestion, search, retrieval |
| Ecosystem | Modular, extensive chaining and tools support | Data indexing, vector search |


As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll build some helper functions to pretty-print the returned responses and to wrap our messages, which saves us a few extra characters of code!

##### Helper Functions

In [None]:
from IPython.display import display, Markdown

def get_response(client: OpenAI,
                 messages: list[dict],
                 model: str = "gpt-4.1-nano"):
    # Returns the full ChatCompletion object, not just the reply text
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    return {
        "role": "developer",
        "content": message
    }

def assistant_prompt(message: str) -> dict:
    return {
        "role": "assistant",
        "content": message
    }

def user_prompt(message: str) -> dict:
    return {
        "role": "user",
        "content": message
    }

def pretty_print(response) -> None:
    # Render the first choice's reply text as Markdown; nothing is returned
    display(Markdown(response.choices[0].message.content))

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [None]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

| Feature | LangChain | LlamaIndex (GPT Index) |
|---|---|---|
| Core Functionality | Orchestrate prompts, chains, and LLM calls; design complex workflows | Indexing and querying unstructured data; efficient retrieval using LLMs |
| Data Handling | Integrate with APIs, databases, and external tools | Ingest, organize, and search large document collections |
| Prompts & Chains | Advanced prompt templates, multi-step chains, memory management | Focus on embedding-based indexing and retrieval; less about prompts orchestration |
| Tools & Integrations | Supports many LLM providers and tools, custom agents | Focused on retrieval and indexing modules; limited to data indexing |
| Use Case Focus | Multi-step reasoning, agent-based tasks, conversational AI | Document search, knowledge base querying, data retrieval |

| Aspect | LangChain | LlamaIndex |
|---|---|---|
| Main Goal | Orchestrate complex LLM workflows and applications | Index and retrieve information from large unstructured data sources |
| Focus Area | Prompt engineering, chains, agents, and integrations | Data indexing, retrieval, and question-answering over documents |
| Use Cases | Chatbots, multi-step reasoning, automation | Document QA, knowledge base creation, data search |


Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [None]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)

pretty_print(irate_response)

Let's try that same prompt again, but modify only our `developer` prompt!

In [None]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)

pretty_print(joyful_response)

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [None]:
print(joyful_response)

ChatCompletion(id='chatcmpl-CS9cgBM7aSg5roYTx98lyKWNexKpj', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I think crushed ice is great for a refreshing sensation, especially in drinks like margaritas or iced coffees. Cubed ice, on the other hand, melts more slowly and is perfect for keeping your beverages cold without diluting them too quickly. Both have their own charm—it's all about what you're in the mood for!", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1760826210, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_1f35c1788c', usage=CompletionUsage(completion_tokens=64, prompt_tokens=30, total_tokens=94, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cac
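To unpack a payload like this, attribute access on the response object is usually enough. The sketch below uses a `SimpleNamespace` stand-in (named `fake_response` here, an assumption for illustration) that mimics only the fields visible in the printed output, so it runs without an API call; `summarize_response` is likewise a hypothetical helper, not part of the `openai` library.

```python
from types import SimpleNamespace

# Stand-in that mirrors the attribute layout of the ChatCompletion printed
# above. With a real response object, pass it to summarize_response directly.
fake_response = SimpleNamespace(
    model="gpt-4.1-nano-2025-04-14",
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(
                role="assistant",
                content="Both have their own charm!",
            ),
        )
    ],
    usage=SimpleNamespace(
        prompt_tokens=30, completion_tokens=64, total_tokens=94
    ),
)


def summarize_response(response) -> dict:
    """Collect the most commonly inspected fields into a flat dictionary."""
    first_choice = response.choices[0]
    return {
        "text": first_choice.message.content,
        "finish_reason": first_choice.finish_reason,
        "model": response.model,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


summary = summarize_response(fake_response)
print(summary["total_tokens"])  # 94
```

The token counts in `usage` are handy for keeping an eye on cost as prompts grow.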

### Prompt Engineering

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification, it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [None]:
list_of_prompts_1 = [
    user_prompt("Write a brief text on climate change.")
]

stimple_response = get_response(client, list_of_prompts_1)
pretty_print(stimple_response)

Traceback (most recent call last):
  File "E:\Personal\Full_Stack_Data_Analyst\PSI Academy\GitHub Repo\ai-llm-engineering-2\day_1\.venv\Lib\site-packages\openai\_base_client.py", line 1024, in request
    response.raise_for_status()
  File "E:\Personal\Full_Stack_Data_Analyst\PSI Academy\GitHub Repo\ai-llm-engineering-2\day_1\.venv\Lib\site-packages\httpx\_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=

In [None]:
list_of_prompts_2 = [
    user_prompt("Write a brief text on climate change as Vice Ganda in a talk show.")
]

stimple_response_2 = get_response(client, list_of_prompts_2)
pretty_print(stimple_response_2)

### ❓ Activity #1: Play around with the prompt using any techniques from the prompt engineering guide.

In [None]:
prompt_input = mo.ui.text(
    label="Enter your prompt",
    value="Write a brief text on climate change as a standup comedian.",
    full_width=True
)

In [None]:
display(prompt_input)

In [None]:
if prompt_input.value:
    list_of_prompts_5 = [
        user_prompt(prompt_input.value)
    ]

    stimple_response_5 = get_response(client, list_of_prompts_5)
    pretty_print(stimple_response_5)

### Few-shot Prompting

As you can see, the model is unsure what to do with made-up words.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [None]:
list_of_prompts_3 = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response_3 = get_response(client, list_of_prompts_3)
pretty_print(stimple_response_3)

As you can see, leveraging the `assistant` role makes for a stimple experience!
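The alternating `user`/`assistant` pattern above generalises to any number of worked examples. A minimal sketch, assuming a hypothetical helper `few_shot_messages` that takes (question, answer) pairs and appends the final query as the last `user` turn:

```python
def few_shot_messages(examples: list[tuple[str, str]], query: str) -> list[dict]:
    """Build a few-shot conversation from (user_text, assistant_text) pairs.

    Hypothetical helper for illustration; it only assembles the message list
    and does not call the API.
    """
    messages = []
    for user_text, assistant_text in examples:
        # Each example is a user turn followed by the reply we want imitated
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question always goes last, as a user turn
    messages.append({"role": "user", "content": query})
    return messages


stimple_messages = few_shot_messages(
    examples=[
        (
            "Something that is 'stimple' is said to be good, well functioning, "
            "and high quality. An example of a sentence that uses the word "
            "'stimple' is:",
            "'Boy, that there is a stimple drill'.",
        )
    ],
    query="Use 'stimple' in a sentence about a bicycle.",
)
```

The resulting list can be passed as the `messages` argument of a chat completion request.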

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [None]:
reasoning_problem = """
how many r's in "strawberry?" {instruction}
"""

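list_of_prompts_4 = [
    # Remove the unused {instruction} placeholder so it is not sent to the model verbatim
    user_prompt(reasoning_problem.replace("{instruction}", "").strip())
]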

reasoning_response = get_response(client, list_of_prompts_4)
pretty_print(reasoning_response)

Notice that the model miscounted in this run: it found only 2 r's, while "strawberry" contains 3.

### ❓ Activity #2: Update the prompt so that it can count correctly.

In [None]:
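list_of_prompts_6 = [
    # First turn: the bare question, with the {instruction} placeholder removed
    user_prompt(reasoning_problem.replace("{instruction}", "").strip()),
    assistant_prompt("There are 2 r's in 'strawberry.'"),
    # Second turn: the same question plus an explicit step-by-step instruction
    user_prompt(reasoning_problem.replace("{instruction}", "Now I want you to spell out each letter, and count the r's carefully."))
]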

reasoning_response_2 = get_response(client, list_of_prompts_6)
pretty_print(reasoning_response_2)
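When experimenting with prompts like this, it helps to have a ground truth to compare the model's answer against. A small sketch in plain Python, with `count_letter` and `letter_positions` as hypothetical helper names:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


def letter_positions(word: str, letter: str) -> list[int]:
    """Return the 1-based positions where the letter appears."""
    return [
        index
        for index, char in enumerate(word.lower(), start=1)
        if char == letter.lower()
    ]


print(count_letter("strawberry", "r"))      # 3
print(letter_positions("strawberry", "r"))  # [3, 8, 9]
```

Spelling the word out position by position is the same strategy the updated prompt asks the model to follow.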

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

Materials adapted for PSI AI Academy. Original materials from AI Makerspace.