### AI/LLM Engineering Kick-off!! 


For our first activity, we'll use the OpenAI library to programmatically access GPT-4.1-nano!

To get started, you'll need an OpenAI API key, which you can create [here](https://platform.openai.com)!

In [9]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
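
If you re-run the notebook often, being prompted for the key every time gets old. Here's a minimal sketch of a helper that only prompts when `OPENAI_API_KEY` isn't already set. The name `ensure_api_key` and its `env`/`prompt_fn` parameters are our own invention for illustration (they are not part of the OpenAI library); the `OpenAI()` client itself reads `OPENAI_API_KEY` from the environment by default.

```python
import getpass
import os


def ensure_api_key(env=os.environ, prompt_fn=getpass.getpass) -> str:
    """Return the OpenAI API key, prompting only when it is not already set.

    `env` and `prompt_fn` are hypothetical parameters that exist only so the
    helper can be exercised without touching the real environment.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Nothing usable in the environment, so ask once and remember the answer.
        key = prompt_fn("Please enter your OpenAI API Key: ").strip()
        env["OPENAI_API_KEY"] = key
    return key


# In the notebook you would simply call: ensure_api_key()
```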

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's make a request to the Chat Completions endpoint to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages)

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects the conversation as a list of message objects, each with a `role` and a `content` field - so we'll be sure to do that!
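
To make the "list of message objects" shape concrete, here's a small sketch of a multi-turn conversation using the three roles listed above. The `validate_messages` helper and the example contents are hypothetical and only for illustration. The check is deliberately narrow: it covers just the three roles named here and plain string content, whereas the real API accepts more than that.

```python
ALLOWED_ROLES = {"developer", "assistant", "user"}


def validate_messages(messages: list[dict]) -> list[dict]:
    """Raise ValueError if a message is malformed; otherwise return the list.

    Hypothetical sanity check, not part of the OpenAI library.
    """
    for i, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has unsupported role: {role!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {i} needs string content")
    return messages


# Each message is a dict with a `role` and a `content`; order is the conversation order.
example_messages = validate_messages([
    {"role": "developer", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What is a vector database?"},
    {"role": "assistant", "content": "A database built for similarity search over embeddings."},
    {"role": "user", "content": "Name one example."},
])
```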

In [10]:
from openai import OpenAI

client = OpenAI()

In [13]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-BxRug6GDxLPTiaGTOzPLkVJgzE3eZ', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate the development of large language model (LLM) applications, especially those involving data integration and retrieval. However, they serve different purposes and have distinct features:\n\n**1. Purpose and Focus:**\n\n- **LangChain:**\n  - **Primary Focus:** Building customizable, multi-step LLM applications such as chatbots, question-answering systems, and agents.\n  - **Core Strength:** Orchestrating LLM calls, managing prompts, conversation memory, chains, agents, and integrations with various data sources.\n  - **Use Cases:** Complex workflows, reasoning chains, retrieval-augmented generation (RAG), agents that can decide actions, and automation.\n\n- **LlamaIndex (GPT Index):**\n  - **Primary Focus:** Facilitating effi

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages, which saves us a few extra characters of code!

##### Helper Functions

In [14]:
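from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of messages to the endpoint and return the full response object.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # Wrap text as a `developer` message (overarching instructions for the model).
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    # Wrap text as an `assistant` message (a reply attributed to the model).
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    # Wrap text as a `user` message.
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook.
    display(Markdown(response.choices[0].message.content))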

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [15]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate building applications that leverage large language models (LLMs), but they have different focuses and design philosophies. Here's a breakdown of their main differences:

1. Purpose and Focus:
   - **LangChain:**  
     Primarily a framework for building **chatbots and conversational AI applications**. It provides tools for managing prompts, chaining multiple LLM calls, memory management, and integrating external tools or APIs. Its goal is to enable developers to build complex, multi-step workflows with LLMs.
   
   - **LlamaIndex:**  
     Focused on **indexing and querying large collections of external data** (e.g., documents, databases) using LLMs. Its main aim is to facilitate **semantic search and question-answering** over custom data sources by creating structured indices that can efficiently retrieve relevant information for LLMs to generate responses.

2. Core Functionality:
   - **LangChain:**  
     - Chains and agents that combine multiple prompts and model calls  
     - Memory management to maintain context across interactions  
     - Integration with APIs, tools, and external data sources  
     - Flexible prompt engineering and orchestration of workflows
   
   - **LlamaIndex:**  
     - Data ingestion pipelines for documents, PDFs, and databases  
     - Indexing structures (such as trees, vectors) that enable fast retrieval  
     - Query interfaces that leverage these indices to answer questions contextually  
     - Emphasis on building a "knowledge base" to enhance LLM performance on domain-specific data

3. Use Cases:
   - **LangChain:**  
     - Building chatbots, virtual assistants, or complex decision-making workflows  
     - Automating multi-step tasks involving LLMs and external tools  
     - Developing applications that require dynamic prompt construction and execution
   
   - **LlamaIndex:**  
     - Creating search engines over personal or enterprise data  
     - Building knowledge bases and document retrieval systems  
     - Enhancing question-answering systems with structured data retrieval

4. Ecosystem and Integration:
   - **LangChain:**  
     - Supports a wide variety of LLM providers (OpenAI, Hugging Face models, etc.)  
     - Extensive integrations with APIs, tools, and databases  
   
   - **LlamaIndex:**  
     - Primarily focused on data ingestion and retrieval  
     - Can work with various data storage formats and retrieval mechanisms

**In summary:**  
- Use **LangChain** if you're building complex conversational systems, workflows, or need orchestration across multiple tools.  
- Use **LlamaIndex** if your goal is to index large data collections and perform efficient semantic searches or question-answering over those datasets.

They can also complement each other in a full-stack application: LlamaIndex for indexing and retrieval of domain data, and LangChain for orchestrating conversations and integrating external tools.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [16]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? I don't have time to waste on such trivial nonsense while I'm starving! Crushed ice all the way—it's instantly satisfying and refreshingly chaotic! Can't stand those boring, clunky cubes when I'm this hungry!

Let's try that same prompt again, but modify only our `developer` prompt!

In [17]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I love the idea of crushed ice—it's so refreshing and fun to crunch! But cubed ice is great too, especially for neat presentations or slower melting drinks. It really depends on the mood, but either way, ice makes everything cooler!
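
Swapping the persona by index (`list_of_prompts[0] = ...`) works, but it mutates the list and assumes the `developer` message is first. As a possible alternative, here's a sketch of a helper that returns a fresh list with a new `developer` message. `with_developer_message` is a hypothetical name, not something from the OpenAI library.

```python
def with_developer_message(messages: list[dict], instruction: str) -> list[dict]:
    """Return a new message list whose only developer message is `instruction`.

    Hypothetical helper: the input list is left untouched, so the same user
    prompt can be reused with several personas.
    """
    rest = [m for m in messages if m["role"] != "developer"]
    return [{"role": "developer", "content": instruction}] + rest


base = [
    {"role": "developer", "content": "You are irate and extremely hungry."},
    {"role": "user", "content": "Do you prefer crushed ice or cubed ice?"},
]
joyful = with_developer_message(base, "You are joyful and having an awesome day!")
```

Both `base` and `joyful` could then be passed to `get_response` to compare the two personas side by side.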

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [None]:
print(joyful_response)

ChatCompletion(id='chatcmpl-Bm0TwsXHyCObslTRMC8qayceKYajt', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I think crushed ice is fun because it feels like a cool burst of refreshment all at once! Cubed ice is great for keeping drinks cold without diluting them too quickly and looks nice in a glass. Both have their charms—what's your favorite?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1750781296, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_38343a2f8f', usage=CompletionUsage(completion_tokens=52, prompt_tokens=30, total_tokens=82, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
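
To show what "unpacking" could look like, here's a minimal sketch that pulls a few fields out of a response. So that it runs without an API call, it uses a stand-in object built with `SimpleNamespace` that mirrors the attribute names visible in the printed `ChatCompletion` above. The stand-in, its shortened `content`, and the `summarize` helper are assumptions for illustration; the same attribute access should work on a real response object.

```python
from types import SimpleNamespace

# Stand-in for a real response (NOT an actual API object). Field names and
# token counts are copied from the printed payload above; content is shortened.
fake_response = SimpleNamespace(
    model="gpt-4.1-nano-2025-04-14",
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(role="assistant", content="I think crushed ice is fun..."),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=52, total_tokens=82),
)


def summarize(response) -> dict:
    """Collect the fields we usually care about into a plain dict (hypothetical helper)."""
    choice = response.choices[0]
    return {
        "model": response.model,
        "finish_reason": choice.finish_reason,
        "text": choice.message.content,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


summary = summarize(fake_response)
```

Token counts are handy for tracking cost, and `finish_reason` tells you whether the model stopped on its own or was cut off.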


### Prompt Engineering

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [18]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Climate change refers to long-term shifts in temperature, precipitation, and other atmospheric patterns primarily caused by human activities such as burning fossil fuels, deforestation, and industrial processes. These activities elevate greenhouse gas levels in the atmosphere, leading to global warming. The impacts of climate change include more frequent and severe weather events, rising sea levels, melting glaciers, and disruptions to ecosystems and agriculture. Addressing climate change requires coordinated efforts to reduce emissions, transition to renewable energy sources, and implement sustainable practices to protect the planet for future generations.

In [None]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change as vice ganda in a talk show.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Ay naku, mga kaibigan! Nakakabahala talaga itong climate change, parang kilabot na bagyo na hindi titigil! Sobrang init na, para kang laging nasa kiliti ng araw, tapos biglang uulan ng mga sakuna. Ang mundo natin, nagiging paasa sa ating mga kamay, parang crush na laging manhid. Kailangan na nating kumilos, mag-recycle, mag-walk if pwede, at huwag kalimutang mag-alaga sa kalikasan. Dahil kung hindi tayo kikilos, baka kayo na lang ang may kasalanan kung mas lalo pang lumala ang problema. Tandaan, mga kaibigan, sama-samang paglaban sa climate change, susi sa malinis na kinabukasan!

### ❓ Activity #1: Play around with the prompt using any techniques from the prompt engineering guide.

### Few-shot Prompting

Without any examples, the model has no way to know what made-up words like 'stimple' and 'falbean' are supposed to mean.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [19]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple screwdriver smoothly engaged with the falbean to securely tighten the bolt.

As you can see, leveraging the `assistant` role makes for a stimple experience!
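
The alternating `user`/`assistant` pattern generalizes to any number of examples. Here's a sketch of a builder that turns (question, answer) pairs plus a final query into a message list. `few_shot_messages` is a hypothetical helper, and the example strings are taken from the cell above.

```python
def few_shot_messages(examples: list[tuple[str, str]], query: str) -> list[dict]:
    """Build a few-shot message list (hypothetical helper).

    Each (question, answer) pair becomes a `user` message followed by an
    `assistant` message; the real query goes last as a `user` message.
    """
    messages = []
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": query})
    return messages


stimple_messages = few_shot_messages(
    examples=[
        (
            "Something that is 'stimple' is said to be good, well functioning, and high quality. "
            "An example of a sentence that uses the word 'stimple' is:",
            "'Boy, that there is a stimple drill'.",
        ),
    ],
    query="A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
    "An example of a sentence that uses the words 'stimple' and 'falbean' is:",
)
```

The resulting list has the same shape as `list_of_prompts` above, so it could be passed straight to `get_response`.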

### Chain of Thought

By default, the model will often use Chain of Thought to answer difficult questions, but as we'll see, not always!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [20]:

reasoning_problem = """
how many r's in "strawberry?"
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

There are 2 letters 'r' in "strawberry".

Notice that the model miscounted: it found only 2 r's, but "strawberry" contains 3.
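
For reference, the ground truth is easy to check in plain Python. This short sketch counts the letter and records the (1-based) positions where it appears, which gives us something to compare model answers against.

```python
word = "strawberry"
target = "r"

# 1-based positions of every occurrence of the target letter.
positions = [i for i, ch in enumerate(word, start=1) if ch == target]
count = len(positions)

print(count, positions)  # 3 [3, 8, 9]
```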

### ❓ Activity #2: Update the prompt so that it can count correctly.

In [25]:

reasoning_problem = """
Please go through each letter of the word "strawberry" step by step and focus on counting how many times the letter r appears. Explain your reasoning for your number count.
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the word "strawberry" letter by letter and count how many times the letter "r" appears:

1. **s** - Not an "r," so count remains 0.
2. **t** - Not an "r," count remains 0.
3. **r** - This is an "r," so count increases to 1.
4. **a** - Not an "r," count remains 1.
5. **w** - Not an "r," count remains 1.
6. **b** - Not an "r," count remains 1.
7. **e** - Not an "r," count remains 1.
8. **r** - This is an "r," so count increases to 2.
9. **r** - Again, an "r," so count increases to 3.
10. **y** - Not an "r," count remains 3.

**Final count:** The letter "r" appears **3 times** in "strawberry."
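
The same trick can be packaged as a reusable prompt template. Here's a sketch where the step-by-step instruction is a parameter, so the bare question and the Chain of Thought version are easy to compare. `counting_prompt` and `STEP_BY_STEP` are hypothetical names, and there's no guarantee that any particular wording makes the model count correctly every time.

```python
STEP_BY_STEP = (
    "Go through the word letter by letter, keep a running count, "
    "and state the final count at the end."
)


def counting_prompt(word: str, letter: str, instruction: str = STEP_BY_STEP) -> str:
    """Build a letter-counting question, optionally followed by an instruction.

    Hypothetical helper: pass instruction="" to get the bare question.
    """
    question = f'How many times does the letter "{letter}" appear in "{word}"?'
    return f"{question} {instruction}".strip()


bare_prompt = counting_prompt("strawberry", "r", instruction="")
cot_prompt = counting_prompt("strawberry", "r")
```

Either string could be wrapped with `user_prompt` and sent through `get_response` to see how the answers differ.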

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

Materials adapted for PSI AI Academy. Original materials from AI Makerspace.