### AI/LLM Engineering Kick-off!! 


For our first activity, we'll use the OpenAI library to programmatically access GPT-4.1-nano!

To get started, you'll need an OpenAI API key, which you can create [here](https://platform.openai.com)!

In [1]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
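If you re-run the notebook often, being prompted for the key every time gets tedious. Here is a minimal sketch that only prompts when the environment variable is not already set. The `ensure_api_key` name is hypothetical and not part of the OpenAI library.

```python
import getpass
import os

def ensure_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in `env_var`, prompting for it only if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # Hypothetical convenience: ask once, then cache the key in the environment.
        key = getpass.getpass("Please enter your OpenAI API Key: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` at the top of the notebook would then behave like the cell above on the first run and stay silent afterwards.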

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's send a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages)

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects its messages as a list of message objects - so we'll be sure to do that!
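As a rough sketch of that shape, each message is a plain dictionary with a `role` key and a `content` key, and the endpoint receives a list of them. The `build_messages` helper and the `ALLOWED_ROLES` set are hypothetical names used for illustration only, and the set covers just the three roles listed above.

```python
# Hypothetical helper for illustration; not part of the OpenAI library.
ALLOWED_ROLES = {"developer", "assistant", "user"}  # the three roles listed above

def build_messages(*pairs: tuple[str, str]) -> list[dict]:
    """Turn (role, content) pairs into a list of message dictionaries."""
    messages = []
    for role, content in pairs:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"Unknown role: {role!r}")
        messages.append({"role": role, "content": content})
    return messages

messages = build_messages(
    ("user", "What is the difference between LangChain and LlamaIndex?"),
)
print(messages)
```

Even a single prompt ends up wrapped in a list, which is the form the `messages` argument takes.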

In [2]:
from openai import OpenAI

client = OpenAI()

In [3]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-C4hjcla7YTABW04EqVUrav8CSosXd', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks in the AI and natural language processing ecosystem, primarily designed to facilitate building applications with large language models (LLMs). However, they serve different purposes and offer distinct features. Here's a comparison to clarify their differences:\n\n**1. Purpose and Focus**\n\n- **LangChain:**\n  - Focuses on building **end-to-end AI applications** by providing abstractions and tools for chaining together multiple language model calls, managing conversation state, prompt engineering, and integrating with various data sources.\n  - Designed to create complex applications like chatbots, question-answering systems, and workflows that involve multiple steps and components.\n\n- **LlamaIndex (GPT Index):**\n  - Primarily aims to enable *

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages to avoid a few extra characters of code!

##### Helper Functions

In [6]:
from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts to the chat completions endpoint.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # The `developer` role carries the overarching instructions.
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(message: ChatCompletion) -> None:
    # Render the content of the first choice as Markdown in the notebook.
    display(Markdown(message.choices[0].message.content))

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [7]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

Great question! **LangChain** and **LlamaIndex** (formerly known as GPT Index) are both popular frameworks in the AI ecosystem designed to facilitate working with large language models (LLMs) and managing data for downstream applications. However, they serve different primary purposes and offer distinct functionalities.

### **LangChain**

**Overview:**
- An open-source framework focused on building **conversational AI applications**, **chatbots**, and **complex language model-powered workflows**.
- Provides abstractions and tools to connect LLMs with various data sources, APIs, and tools.
- Supports the development of multi-step reasoning, chaining multiple prompts, memory management, and agent-based workflows.

**Key Features:**
- **Chains:** Modular components that pass data through multiple processing steps.
- **Agents:** Dynamic systems that decide which tools or functions to invoke based on input.
- **Memory:** Persist context across interactions to maintain state.
- **Integrations:** Seamless connectivity with data sources, APIs, document stores, and more.
- **Prompt Management:** Templates, prompt templates, and prompt engineering utilities.

**Use Cases:**
- Building chatbots and conversational agents.
- Automating multi-step workflows involving LLMs.
- Connecting LLMs to external tools and APIs.

---

### **LlamaIndex (formerly GPT Index)**

**Overview:**
- Focused primarily on **building efficient data indexes** over large collections of documents to facilitate **quick and effective retrieval** for LLMs.
- Designed to **enhance retrieval-augmented generation (RAG)** systems by enabling LLMs to access relevant document snippets during inference.
- Works well with various data sources and formats.

**Key Features:**
- **Data Indexing:** Converts raw data into searchable indices.
- **Retrieval:** Fast and relevant document retrieval tailored for LLM augmentation.
- **Memory Management:** Maintains knowledge bases for long-term context.
- **Support for multiple index types:** Vector indices, tree indices, etc.
- **Data ingestion pipelines:** Supports documents, PDFs, HTML, and more.

**Use Cases:**
- Building knowledge bases for chatbots or question-answering systems.
- Retrieval-augmented generation to improve factual accuracy.
- Managing large document corpora for LLM applications.

---

### **Summary of Differences**

| Aspect | **LangChain** | **LlamaIndex** (GPT Index) |
| --- | --- | --- |
| **Primary Focus** | Building complex conversational workflows and integrating LLMs with tools/APIs | Efficient data storage, retrieval, and augmentation for LLMs |
| **Use Cases** | Chatbots, multi-step routines, agent-based systems | Knowledge bases, RAG systems, document retrieval |
| **Core Functionality** | Chains, agents, prompt engineering | Data indexing, document retrieval, embedding management |
| **Data Handling** | Connects to APIs, databases, documents for workflows | Creates searchable indexes over documents for quick retrieval |

---

### **In essence:**
- **LangChain** excels at orchestrating multiple steps, tools, and memory in complex language-driven applications.
- **LlamaIndex** specializes in organizing and retrieving large amounts of data to support LLMs, especially in retrieval-augmented scenarios.

They are **complementary**: one deals more with **workflow orchestration** (LangChain), and the other with **data management and retrieval** (LlamaIndex). Often, developers use both together to build powerful AI systems.

---

**Note:** Both frameworks are actively evolving, so it’s a good idea to check their latest documentation for updates and new features.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [8]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Honestly, I can't stand crushed ice! It's like biting into a handful of soggy snow while I'm already starving. Cubed ice at least pauses to give me a little chill without turning into a sloppy mess. But frankly, I’m so hungry and fed up with waiting, I’d settle for whatever gets me some cold relief—just not crushed ice!

Let's try that same prompt again, but modify only our `developer` prompt!

In [9]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I love the idea of crushed ice! It's so refreshing and fun to enjoy, especially in a chilly drink or a summery treat. Cubed ice is great too, especially for keeping beverages cold without diluting them too quickly. Both have their charms—depends on the mood! How about you? Do you prefer crushed or cubed ice?

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [10]:
print(joyful_response)

ChatCompletion(id='chatcmpl-C4hwCN5dfnfA0Wm0yxafO8u7AKLuk', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I love the idea of crushed ice! It's so refreshing and fun to enjoy, especially in a chilly drink or a summery treat. Cubed ice is great too, especially for keeping beverages cold without diluting them too quickly. Both have their charms—depends on the mood! How about you? Do you prefer crushed or cubed ice?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1755238244, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_c4c155951e', usage=CompletionUsage(completion_tokens=69, prompt_tokens=30, total_tokens=99, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cach
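To make "examine and unpack" concrete, here is a small sketch that pulls a few fields out of a response. So that it runs without an API call, it uses a `SimpleNamespace` stand-in that mimics the attribute layout printed above. The stand-in and the `summarize` helper are hypothetical, and the field values are copied from that output, with the content shortened.

```python
from types import SimpleNamespace

# Stand-in that mimics the attribute layout of the ChatCompletion printed above.
# It is not a real API response; the values are copied from that output.
fake_response = SimpleNamespace(
    model="gpt-4.1-nano-2025-04-14",
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(
                role="assistant",
                content="I love the idea of crushed ice!",
            ),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=69, total_tokens=99),
)

def summarize(response) -> dict:
    """Collect the fields we usually care about from a chat completion."""
    choice = response.choices[0]
    return {
        "model": response.model,
        "finish_reason": choice.finish_reason,
        "content": choice.message.content,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }

summary = summarize(fake_response)
print(summary["total_tokens"])  # 99
```

Because the real object exposes the same attribute names, `summarize(joyful_response)` should work the same way in the notebook.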

### Prompt Engineering

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try to "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [11]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Climate change refers to long-term alterations in Earth's climate patterns, primarily driven by human activities such as burning fossil fuels, deforestation, and industrial processes. These activities increase the concentration of greenhouse gases like carbon dioxide and methane in the atmosphere, leading to global warming. The effects include rising sea levels, more frequent and severe weather events, melting glaciers, and disruptions to ecosystems. Addressing climate change requires global cooperation to reduce emissions, transition to renewable energy sources, and promote sustainable practices to protect the planet for future generations.

In [12]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change as vice ganda in a talk show.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Ay, grabe, mga kababayan! Ang climate change, para kang kumare na umaapak sa paa mo, hindi mo maialis, hindi mo maiwalay. Sabi nila, ang planeta natin ay nagiging mainit, parang lola mong sobra sa mainit na pan de sal, diba? Kailangan na natin mag-action! Hindi pwedeng maghintay na maging si Earth na lang ang may sinabi sa atin. Kaya, tumulong tayo—mag-recycle, mag-save ng energy, at higit sa lahat, maging mas responsible tayo bilang mga anak ng mundo. Kasi kung hindi tayo kikilos, baka sa huli, ang environment na lang ang magdusa, at tayo, mga kababayan, ang maiwan sa isang mas malungkot na planeta. Remember, ka-vibe, ang pagbabago ay nagsisimula sa ating bawat maliit na hakbang!

### ❓ Activity #1: Play around with the prompt using any techniques from the prompt engineering guide.

### Few-shot Prompting

Made-up words like 'stimple' and 'falbean' mean nothing to the model until we define them, so it has no reliable way to use them correctly.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [13]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple screwdriver effortlessly powered the falbean, making it easy to tighten the bolts.

As you can see, leveraging the `assistant` role makes for a stimple experience!
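The pattern used above generalizes: alternate `user` and `assistant` messages as worked examples, then finish with the real `user` prompt. Here is a minimal sketch of that idea. The `few_shot_messages` helper is a hypothetical name, and it builds the dictionaries directly so that it runs on its own.

```python
def few_shot_messages(examples: list[tuple[str, str]], final_prompt: str) -> list[dict]:
    """Interleave (user, assistant) example pairs, then append the real user prompt."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": final_prompt})
    return messages

few_shot = few_shot_messages(
    examples=[
        (
            "Something that is 'stimple' is said to be good, well functioning, "
            "and high quality. An example of a sentence that uses the word 'stimple' is:",
            "'Boy, that there is a stimple drill'.",
        ),
    ],
    final_prompt=(
        "A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing "
        "that rotates/spins. An example of a sentence that uses the words "
        "'stimple' and 'falbean' is:"
    ),
)

for message in few_shot:
    print(message["role"])
```

The resulting list could be passed to `get_response(client, few_shot)` in place of the hand-written `list_of_prompts`.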

### Chain of Thought

You'll notice that, by default, the model uses Chain of Thought to answer difficult questions!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [14]:

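# Note: this is a plain string, not an f-string, so the "{instruction}"
# placeholder is sent to the model literally unless you fill it in.
reasoning_problem = """
how many r's in "strawberry?" {instruction}
"""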

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

There is 1 letter "r" in "strawberry?".

Notice that the model cannot count properly: it found only 1 'r', but "strawberry" contains 3.

### ❓ Activity #2: Update the prompt so that it can count correctly.

In [15]:
reasoning_problem = """
How many letter 'r's are in the word "strawberry"? 
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

There are 2 letter 'r's in the word "strawberry."

In [16]:
reasoning_problem = """
Please go through each letter of the word strawberry step by step and count how many times the letter 'r' appears.
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's go through each letter of the word "strawberry" step by step and count the number of times the letter 'r' appears:

1. **s** – not 'r'  
2. **t** – not 'r'  
3. **r** – **1st 'r'**  
4. **a** – not 'r'  
5. **w** – not 'r'  
6. **b** – not 'r'  
7. **e** – not 'r'  
8. **r** – **2nd 'r'**  
9. **r** – **3rd 'r'**  
10. **y** – not 'r'  

**Total number of 'r's in "strawberry": 3**

In [17]:
reasoning_problem = """
How many letter 'r's are in the word "strawberry"? 
Please go through each letter of the word step by step and count how many times the letter 'r' appears.
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's go through each letter of the word "strawberry" step by step:

1. 's' – not 'r'
2. 't' – not 'r'
3. 'r' – yes, count = 1
4. 'a' – not 'r'
5. 'w' – not 'r'
6. 'b' – not 'r'
7. 'e' – not 'r'
8. 'r' – yes, count = 2
9. 'r' – yes, count = 3'
10. 'y' – not 'r'

Total number of 'r's in "strawberry" is **3**.

In [18]:

# Note: as before, "{instruction}" is sent literally because this is not an f-string.
reasoning_problem = """
Please go through each letter of the word "strawberry" step by step and focus on counting how many times the letter r appears. Explain your reasoning for your number count {instruction}
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Sure! Let's examine each letter of the word "strawberry" step by step and count how many times the letter "r" appears.

The word is: **s t r a w b e r r y**

Now, proceeding letter by letter:

1. **s** — Not an "r"; count remains 0.
2. **t** — Not an "r"; count remains 0.
3. **r** — This is an "r"; count increases to 1.
4. **a** — Not an "r"; count remains 1.
5. **w** — Not an "r"; count remains 1.
6. **b** — Not an "r"; count remains 1.
7. **e** — Not an "r"; count remains 1.
8. **r** — This is an "r"; count increases to 2.
9. **r** — This is an "r"; count increases to 3.
10. **y** — Not an "r"; count remains 3.

**Final count:** The letter **"r"** appears **3 times** in the word "strawberry."

**Reasoning Recap:** I checked each letter individually, increased the count whenever I encountered an "r," and kept track throughout the process.
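When experimenting with prompts like these, it helps to have a ground truth to compare against. The letter-by-letter procedure the model describes can be sketched in plain Python. The `count_letter` helper is a hypothetical name used for illustration.

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of `letter` in `word`, one character at a time."""
    count = 0
    for position, char in enumerate(word, start=1):
        if char == letter:
            count += 1
        # Show the running count after each character, like the model's walkthrough.
        print(f"{position}. {char} -> count = {count}")
    return count

total = count_letter("strawberry", "r")
print(f"Total: {total}")  # Total: 3
```

The built-in `"strawberry".count("r")` gives the same answer in one call. The explicit loop just mirrors the step-by-step reasoning we asked the model to produce.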

### Basic Prompt


In [19]:
list_of_prompts = [
    user_prompt("What is Prompt Engineering.")
]

response = get_response(client, list_of_prompts)
pretty_print(response)

Prompt engineering is the process of designing, refining, and optimizing input prompts to effectively communicate with AI language models—such as GPT—to generate accurate, relevant, and high-quality responses. It involves carefully crafting the wording, structure, and context of prompts to guide the model's output in a desired direction, often by providing specific instructions, examples, or constraints.

Key aspects of prompt engineering include:
- **Clarity:** Making prompts clear and unambiguous to elicit precise responses.
- **Context:** Providing sufficient background information to inform the model's understanding.
- **Instruction Framing:** Using explicit commands or questions to guide the model’s behavior.
- **Iterative Refinement:** Testing and adjusting prompts based on the outputs to improve results.
- **Use of Techniques:** Such as few-shot learning (providing examples), zero-shot prompts, and chain-of-thought prompting to enhance performance.

Prompt engineering is vital in applications like chatbots, content generation, coding assistance, and more—helping users leverage AI models more effectively and reliably.

### Chain of Thought


In [20]:

reasoning_problem = """
“You are an expert tutor explaining prompt engineering to a beginner. Ask questions to understand what the learner knows. 
Provide a few examples to illustrate different techniques (like few-shot prompting, clarifying questions, or bias reduction). 
Ensure your answer is unbiased and avoids oversimplification.”"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Great! To start, could you tell me what, if anything, you already know about prompt engineering? Have you used AI models before, or are you completely new to the concept? 

Once I understand your background, I can better tailor my explanations. For example, are you interested in creating prompts for specific tasks like writing, coding, or data analysis? Or are you more curious about how to craft prompts that guide the AI effectively and ethically? 

In the meantime, I can introduce a few key techniques used in prompt engineering:

1. **Few-shot prompting:** This involves providing a few examples within your prompt to help the model understand the pattern or task. For example, if you want the AI to translate phrases, you might show a few sample translations before asking for a new one.

2. **Clarifying questions:** Sometimes, it helps to add questions or specify constraints to narrow down the AI's responses, ensuring they are relevant and precise.

3. **Bias reduction:** Being careful with wording and examples to avoid inadvertently reinforcing stereotypes or biases in the AI’s responses.

Would you like to explore any of these techniques in more detail? Or do you have specific goals for your prompt engineering practice?

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

Materials adapted for PSI AI Academy. Original materials from AI Makerspace.