### AI/LLM Engineering Kick-off!! 


For our initial activity, we will use the OpenAI library to programmatically access GPT-4.1-nano!

In [21]:
pip install -r requirements.txt

Collecting tiktoken (from -r requirements.txt (line 2))
  Using cached tiktoken-0.9.0-cp312-cp312-win_amd64.whl.metadata (6.8 kB)
Collecting regex>=2022.1.18 (from tiktoken->-r requirements.txt (line 2))
  Downloading regex-2025.7.34-cp312-cp312-win_amd64.whl.metadata (41 kB)
Collecting requests>=2.26.0 (from tiktoken->-r requirements.txt (line 2))
  Using cached requests-2.32.4-py3-none-any.whl.metadata (4.9 kB)
Collecting charset_normalizer<4,>=2 (from requests>=2.26.0->tiktoken->-r requirements.txt (line 2))
  Using cached charset_normalizer-3.4.2-cp312-cp312-win_amd64.whl.metadata (36 kB)
Collecting urllib3<3,>=1.21.1 (from requests>=2.26.0->tiktoken->-r requirements.txt (line 2))
  Downloading urllib3-2.5.0-py3-none-any.whl.metadata (6.5 kB)
Using cached tiktoken-0.9.0-cp312-cp312-win_amd64.whl (894 kB)
Downloading regex-2025.7.34-cp312-cp312-win_amd64.whl (275 kB)
Using cached requests-2.32.4-py3-none-any.whl (64 kB)
Using cached charset_normalizer-3.4.2-cp312-cp312-win_amd64.whl

To get started, you'll need an OpenAI API key. You can create one [here](https://platform.openai.com)!

In [8]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
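If you would rather not paste the key on every run, one option is to reuse a key that is already set in the environment and prompt only when it is missing. The following is a minimal sketch, not part of the original notebook; the helper name `load_api_key` is hypothetical.

```python
import os
import getpass


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, prompting only if it is not set.

    `load_api_key` is an illustrative helper, not an OpenAI library function.
    """
    key = os.environ.get(env_var)
    if not key:
        # Fall back to the same interactive prompt used in the cell above
        key = getpass.getpass("Please enter your OpenAI API Key: ")
        os.environ[env_var] = key
    return key
```

Calling `load_api_key()` once at the top of the notebook leaves `OPENAI_API_KEY` set in the environment for the rest of the session.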

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's make our first chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages).

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects our prompt as a list of message objects, each with a `role` and `content` - so we'll be sure to do that!
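To make the expected shape concrete, here is a small sketch that checks a message list before it is sent. It is a simplification: it covers only the three roles listed above and the `role`/`content` keys used in this notebook, and the names `ALLOWED_ROLES` and `validate_messages` are hypothetical.

```python
# Only the three roles this notebook uses; this is a simplified check
ALLOWED_ROLES = {"developer", "assistant", "user"}


def validate_messages(messages: list) -> list:
    """Raise ValueError unless `messages` is a list of {"role", "content"} dicts."""
    if not isinstance(messages, list):
        raise ValueError("messages must be a list, even for a single prompt")
    for message in messages:
        if not isinstance(message, dict) or set(message) != {"role", "content"}:
            raise ValueError(f"each message needs exactly 'role' and 'content': {message!r}")
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {message['role']!r}")
    return messages


# A single user prompt still has to be wrapped in a list
messages = validate_messages([{"role": "user", "content": "Hello!"}])
```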

In [9]:
from openai import OpenAI

client = OpenAI()

In [10]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-BzWFqd8r2KpZ6hbWwNa9XJqZBhS6g', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="LangChain and LlamaIndex (formerly known as GPT Index) are both frameworks designed to facilitate building applications that leverage large language models (LLMs), but they focus on different aspects and have distinct features:\n\n**LangChain:**\n- **Purpose:** Primarily a framework for developing complex, composable language model applications, especially those involving chaining multiple steps, tools, APIs, and memory.\n- **Features:**\n  - Supports chaining multiple LLM calls with different prompts.\n  - Facilitates integration with external tools, APIs, databases, and APIs.\n  - Provides a modular architecture with components like prompt templates, memory, agents, and tools.\n  - Useful for building conversational agents, question-answering systems, and applications requiring reasoning over multiple steps.\n- **Use Cases:**

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages, saving us a few extra characters of code!

##### Helper Functions

In [11]:
from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts and return the full ChatCompletion payload
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # Wraps the text in the `developer` role
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(message: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook
    display(Markdown(message.choices[0].message.content))

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [12]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks designed to facilitate building applications with large language models (LLMs), but they have different focuses, architectures, and use cases. Here's an overview of their main differences:

### 1. Purpose and Focus
- **LangChain:**
  - **Purpose:** Provides a flexible, modular framework to build complex, multi-step applications involving LLMs.
  - **Focus:** Orchestrating language model interactions, including prompt management, chaining multiple calls, memory, and integrations with various data sources.
  - **Use Cases:** Chatbots, question-answering pipelines, reasoning tasks, language model chaining, and custom workflows.

- **LlamaIndex (GPT Index):**
  - **Purpose:** Designed primarily for creating index structures over external data sources to enable efficient retrieval and question-answering.
  - **Focus:** Data ingestion, indexing, and retrieval from unstructured or semi-structured data (like documents, PDFs, websites) to build chatbots or AI tools that can answer questions based on specific datasets.
  - **Use Cases:** Building semantic search, document-based QA systems, knowledge bases.

### 2. Core Architecture
- **LangChain:**
  - Modular components for prompts, memory, chains, agents, and integrations.
  - Supports a wide range of language models and APIs (OpenAI, Cohere, AI21, etc.).
  - Emphasizes procedural workflows with chaining, conditionals, and reasoning.

- **LlamaIndex:**
  - Focuses on data ingestion: creating indices (e.g., list index, tree index, GPT index) over datasets.
  - Incorporates retrieval-augmented generation (RAG) techniques.
  - Generally integrates with vector stores and embeddings for retrieval.

### 3. Data Handling
- **LangChain:**
  - Handles user input and generates outputs; can connect to data sources but does not inherently offer data indexing.
  - Often manages context and memory across interactions.

- **LlamaIndex:**
  - Specializes in processing large datasets, constructing indices, and retrieving relevant information to feed into LLMs for tasks like QA.
  - Optimized for handling document collections and external knowledge bases.

### 4. Community and Ecosystem
- **LangChain:**
  - Broader community with many integrations, examples, and tutorials.
  - Active development focusing on flexibility and extensibility.

- **LlamaIndex:**
  - More niche, focused on data ingestion and retrieval for LLM applications.
  - Gaining popularity in applications needing document-based retrieval.

### Summary Table

| Aspect                     | LangChain                                     | LlamaIndex (GPT Index)                          |
|----------------------------|----------------------------------------------|------------------------------------------------|
| Primary Focus              | Workflow orchestration with LLMs             | Indexing and retrieval over external data   |
| Use Cases                  | Chatbots, chaining, reasoning, agent frameworks | Document QA, semantic search, knowledge bases |
| Data Handling              | Less focused on data ingestion; more on prompt management | Designed for processing and querying datasets |
| Architecture               | Modular, chain-based workflows                | Index structures + retrieval mechanisms       |
| Community & Ecosystem      | Larger, more general-purpose                   | Specialized, focused on data indexing        |

---

### In summary:
- **Use LangChain** if you want to build complex, multi-step applications, workflows, or agents involving LLMs.
- **Use LlamaIndex** if your goal is to ingest large collections of documents or data sources and perform efficient retrieval-based question-answering or search over that data.

**Both can also be used together**: you might use LlamaIndex to index your data and LangChain to orchestrate complex interactions and workflows that incorporate retrieval results.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [13]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

I can't believe you're even asking! After all this hunger and frustration, I just want ice that can give me some relief right now. Crushed ice all the way—nothing beats that icy crunch when you're starving and irate! Cubed ice? Please, that's slow and boring. Bring me crushed ice or nothing at all!

Let's try that same prompt again, but modify only our `developer` prompt!

In [14]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I love both! Crushed ice is perfect for making refreshing drinks like juleps and slushies, giving a cool and soothing feel. Cubed ice is great for keeping beverages cold without watering them down too quickly. If I had to choose, I might lean toward crushed ice for the fun, icy crunch—what about you?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [17]:
print(joyful_response)

ChatCompletion(id='chatcmpl-BzWIIs83hZETHuaCuBio3vYjjMXcB', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='I love both! Crushed ice is perfect for making refreshing drinks like juleps and slushies, giving a cool and soothing feel. Cubed ice is great for keeping beverages cold without watering them down too quickly. If I had to choose, I might lean toward crushed ice for the fun, icy crunch—what about you?', refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1754001846, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_38343a2f8f', usage=CompletionUsage(completion_tokens=67, prompt_tokens=30, total_tokens=97, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_token
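As a rough sketch of unpacking that payload, the snippet below pulls out the fields that are usually most useful. So that it runs without an API call, it uses a `SimpleNamespace` stand-in that copies the attribute layout and a few values from the output printed above (with the message text shortened). The names `fake_response` and `summarize` are hypothetical; with a real response you would pass `joyful_response` instead.

```python
from types import SimpleNamespace

# Stand-in mirroring the attribute layout of the ChatCompletion printed above.
# The content string is shortened here; the other values are copied from that output.
fake_response = SimpleNamespace(
    model="gpt-4.1-nano-2025-04-14",
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(role="assistant", content="I love both!"),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=67, total_tokens=97),
)


def summarize(response) -> dict:
    """Collect a few commonly needed fields from a chat completion response."""
    choice = response.choices[0]
    return {
        "model": response.model,
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "total_tokens": response.usage.total_tokens,
    }


summary = summarize(fake_response)
print(summary["finish_reason"], summary["total_tokens"])
```

Keeping an eye on `finish_reason` and `usage` is a simple way to notice truncated answers and to track token consumption.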

### Prompt Engineering

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification, it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words, as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [18]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Climate change refers to significant long-term shifts in temperature, weather patterns, and other atmospheric conditions on Earth. It is primarily driven by human activities such as the burning of fossil fuels, deforestation, and industrial processes, which increase greenhouse gas concentrations in the atmosphere. These changes lead to phenomena like rising sea levels, more frequent and severe storms, droughts, and changes in ecosystems. Addressing climate change requires global cooperation to reduce emissions, transition to renewable energy sources, and implement sustainable practices to protect our planet for future generations.

In [19]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change as vice ganda in a talk show.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Ay nako, mga ka-tropa! Alam nyo ba, ang climate change ay parang si baduy na nakakatawa kasi misteryo talaga eh. Parang nag-iinarte ang mundo natin—maging mainit, maging malamig, tapos nagluluhan na ako! Kanya-kanya na lang tayo ng anggulo, pero seryoso to, ha? Kung hindi tayo kikilos ngayon, baka maging “ice age” na tayo sa huli, tapos mapapailing na lang tayo. Kailangan natin maging mindful sa kalikasan—mag-recycle, mag-conserve ng energy, at wag gawing trash-bin ang planet. Kasi, sabi nga nila, “Huwag nang maghintay na lumamig pa ang daigdig bago tayo kumilos!” Kaya, tulong-tulungan tayo, mga beshie, para sa isang mas malamig, mas maaliwalas na mundo. Virtual hug to all!

### ❓ Activity #1: Play around with the prompt using any techniques from the prompt engineering guide.

### Few-shot Prompting

Without any examples, the model has no way of knowing what made-up words like 'stimple' and 'falbean' are supposed to mean.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [20]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple wrench tightly secured the falbean in place, ensuring the machinery operated smoothly.

In [23]:
# Assuming user_prompt and assistant_prompt functions are defined as before

list_of_isekai_prompts = [
    # Teaching "Isekai" as a genre
    user_prompt("In anime and manga, 'Isekai' (pronounced ee-seh-KAI, meaning 'different world') is a genre where a character is transported from their normal world to a new, often fantastical one. An example of a sentence describing an 'Isekai' story is:"),
    assistant_prompt("Many Isekai series feature protagonists who gain powerful abilities in their new world."),

    # Asking for an example that includes a common Isekai trope
    user_prompt("An 'overpowered protagonist' is a common trope in Isekai where the main character becomes incredibly strong. Can you give an example sentence about an Isekai series that has an overpowered protagonist?")
]

isekai_response = get_response(client, list_of_isekai_prompts)
pretty_print(isekai_response)

In the Isekai series "That Time I Got Reincarnated as a Slime," the protagonist is overpowered, effortlessly defeating enemies with his unparalleled abilities.

As you can see, leveraging the `assistant` role makes for a stimple experience!
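The alternating `user`/`assistant` pattern used in the cells above can be generated from a list of example pairs. This is a minimal sketch under that assumption. The helper name `few_shot_messages` is hypothetical, and the example texts are shortened versions of the prompts above.

```python
def few_shot_messages(examples: list, query: str) -> list:
    """Turn (user text, assistant text) pairs plus a final query into a message list."""
    messages = []
    for user_text, assistant_text in examples:
        # Each example becomes a user turn followed by the "pretend" assistant answer
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question goes last, so the model answers in the demonstrated style
    messages.append({"role": "user", "content": query})
    return messages


messages = few_shot_messages(
    examples=[("Use 'stimple' in a sentence.", "'Boy, that there is a stimple drill'.")],
    query="Use 'stimple' and 'falbean' in a sentence.",
)
```

The resulting list has the same shape as `list_of_prompts`, so it could be passed to `get_response(client, messages)` unchanged.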

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

Notice that the model cannot count properly. It counted only 2 r's.

### ❓ Activity #2: Update the prompt so that it can count correctly.

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

In [45]:
instruction = """
You are a careful assistant. Count how many times the letter 'r' appears in the word "strawberry". 
Please list the steps and provide the final count.
Answer: <number>
""".strip()

reasoning_problem = f"""
How many r's in "strawberry"? {instruction}
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)


Let's carefully count the number of 'r's in the word "strawberry" step by step:

1. Write down the word: **strawberry**
2. Break it down into individual letters for clarity:

   s t r a w b e r r y

3. Identify and count each 'r':

   - First 'r' appears in the 3rd position.
   - Second 'r' appears in the 8th position.
   - Third 'r' appears in the 9th position.

4. Count the total number of 'r's:

   There are 3 'r's in the word "strawberry."

**Final count:** \(\boxed{3}\)
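Since letter counting is deterministic, the model's answer can be checked in plain Python. The sketch below takes the last integer in the response text as the claimed count, which is an assumption about the output format. The helper name `extract_final_number` is hypothetical, and `model_text` is copied from the last line of the output above rather than fetched from the API.

```python
import re
from typing import Optional


def extract_final_number(text: str) -> Optional[int]:
    """Return the last integer that appears in the text, or None if there is none."""
    numbers = re.findall(r"\d+", text)
    return int(numbers[-1]) if numbers else None


# Final line of the model output shown above
model_text = r"**Final count:** \(\boxed{3}\)"

expected = "strawberry".count("r")  # ground truth computed directly
claimed = extract_final_number(model_text)
print(claimed == expected)
```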

Materials adapted for PSI AI Academy. Original materials from AI Makerspace.