### AI/LLM Engineering Kick-off!! 


For our initial activity, we will use the OpenAI library to programmatically access GPT-4.1-nano!

To get started, you'll need an OpenAI API key. You can create one [here](https://platform.openai.com)!

In [3]:
import os
import openai
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Please enter your OpenAI API Key: ")
openai.api_key = os.environ["OPENAI_API_KEY"]
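Typing the key every time the cell runs can get tedious. Here is a minimal sketch, assuming you may already have `OPENAI_API_KEY` set in your environment: the hypothetical helper `load_api_key` only prompts when the variable is missing. The helper name and its `env`/`prompt` parameters are illustrative and not part of the OpenAI library.

```python
import getpass
import os


def load_api_key(env=None, prompt=getpass.getpass):
    """Return the OpenAI API key, prompting only if it is not already set.

    `env` and `prompt` are injectable so the helper can be tried out
    without touching the real environment (an illustrative design choice).
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        key = prompt("Please enter your OpenAI API Key: ").strip()
        if not key:
            raise ValueError("An OpenAI API key is required.")
        env["OPENAI_API_KEY"] = key  # cache it for later cells
    return key
```

Calling `load_api_key()` with no arguments falls back to `os.environ` and `getpass.getpass`, the same pieces used in the cell above.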

### Our First Prompt

You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) if you get stuck!

Let's send a chat completion request to kick things off!

There are three "roles" available to use:

- `developer`
- `assistant`
- `user`

OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages)

Let's just stick to the `user` role for now and send our first message to the endpoint!

If we check the documentation, we'll see that the endpoint expects the messages as a list of message objects - so we'll be sure to do that!
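Concretely, each entry in that list is a dictionary with a `role` key and a `content` key. Here is a minimal sketch, assuming only the three roles listed above. `make_message` and `ALLOWED_ROLES` are hypothetical names used for illustration, and the API itself may accept other roles.

```python
ALLOWED_ROLES = {"developer", "assistant", "user"}  # the three roles named above


def make_message(role: str, content: str) -> dict:
    """Build one chat message, rejecting roles this notebook does not use."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    return {"role": role, "content": content}


# The endpoint takes a list of messages, even when there is only one.
messages = [
    make_message("user", "What is the difference between LangChain and LlamaIndex?")
]
```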

In [4]:
from openai import OpenAI

client = OpenAI()

In [5]:
YOUR_PROMPT = "What is the difference between LangChain and LlamaIndex?"

client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)

ChatCompletion(id='chatcmpl-BzajflfFF8VhHWpvYFtgB7QqljjmT', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="LangChain and LlamaIndex (formerly known as GPT Index) are both frameworks designed to facilitate working with language models and building applications that leverage large language models (LLMs). However, they serve different primary purposes and have distinct features. Here's an overview of their differences:\n\n**1. Purpose and Focus:**\n\n- **LangChain:**\n  - Focuses on building complex, multi-step language model applications, such as chatbots, agents, and workflows.\n  - Provides tools for chaining multiple prompts, managing conversations, handling memory, and integrating with various data sources.\n  - Emphasizes the orchestration and control flow around LLMs, enabling developers to create sophisticated AI applications.\n\n- **LlamaIndex (GPT Index):**\n  - Primarily designed for creating and querying indices over large 

As you can see, the response comes back with a tonne of information that we can use when we're building our applications!

We'll be building some helper functions to pretty-print the returned responses and to wrap our messages, saving us a few extra characters of code!

##### Helper Functions

In [7]:
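from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list, model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the list of message dicts to the chat completions endpoint and return the full response object.
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def system_prompt(message: str) -> dict:
    # The `developer` role carries the overarching instructions.
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook.
    display(Markdown(response.choices[0].message.content))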

### Testing Helper Functions

Now we can leverage OpenAI's endpoints with a bit less boilerplate - let's rewrite our original prompt with these helper functions!

Because the OpenAI endpoint expects to get a list of messages - we'll need to make sure we wrap our inputs in a list for them to function properly!

In [8]:
messages = [user_prompt(YOUR_PROMPT)]

chatgpt_response = get_response(client, messages)

pretty_print(chatgpt_response)

Great question! LangChain and LlamaIndex (formerly known as GPT Index) are both popular frameworks in the ecosystem of building applications with large language models (LLMs), but they serve different primary purposes and have distinct features.

### **LangChain**
- **Purpose:** Framework for building LLM-powered applications, especially focusing on **prompt management, chaining multiple operations**, and integrating with various tools and data sources.
- **Main Features:**
  - **Prompt management:** Helps construct, manage, and optimize prompts.
  - **Chains:** Enables chaining of multiple LLM calls, including conditional logic and workflows.
  - **Agents:** Supports dynamic decision-making, where the system chooses which tools or actions to invoke based on input.
  - **Tool Integration:** Easily integrates with APIs, databases, and other tools.
  - **Support for various LLM providers:** OpenAI, Hugging Face, local models, etc.
  - **Extensibility:** Modular design allowing custom components.
- **Use Cases:** Conversational agents, complex multi-step workflows, multi-tool orchestration.

### **LlamaIndex (GPT Index)**
- **Purpose:** Focused on **building and querying large, structured knowledge bases** from proprietary or external data sources like documents, PDFs, or other unstructured data, making it easier to perform retrieval-augmented generation (RAG).
- **Main Features:**
  - **Data ingestion:** Indexes raw data from various sources.
  - **Index types:** Supports multiple index types (e.g., tree, list, vector) for different retrieval strategies.
  - **Retrieval + LLM:** Combines document retrieval with LLMs to generate knowledgeable responses.
  - **Easy querying:** Simple API to query the index for answers based on underlying data.
  - **Knowledge augmentation:** Enables creation of chatbots that answer based on your custom knowledge base rather than general information.
- **Use Cases:** Building custom knowledge bases, document question answering, retrieval-augmented generation.

---

### **Summary of Differences**
| Aspect | LangChain | LlamaIndex (GPT Index) |
|---|---|---|
| **Main Focus** | Building complex LLM applications with workflows and tool integration | Creating and querying knowledge bases from external data sources |
| **Core Functionality** | Chaining, prompting, tool orchestration, agent-based reasoning | Data ingestion, indexing, retrieval-augmented generation |
| **Primary Use Cases** | Conversational agents, multi-step workflows, tool orchestration | Document QA, knowledge base creation, document retrieval + generation |
| **Integration Approach** | Modular, supports multiple LLM providers, complex workflows | Focused on data ingestion, indexing, fast retrieval |
- **Overlap:** Both can be combined—for instance, using LlamaIndex for data retrieval within a LangChain-powered application.

### **In essence:**
- Use **LangChain** if you're building sophisticated LLM applications that require prompt management, chaining, and tool integration.
- Use **LlamaIndex** if you need to build a system that handles large collections of documents or data, allowing LLMs to retrieve and respond based on that specific knowledge.

---

**Note:** Both tools are evolving rapidly, and the ecosystem often sees integrations where they complement each other.

Let's focus on extending this a bit, and incorporate a `developer` message as well!

Again, the API expects our prompts to be in a list - so we'll be sure to set up a list of prompts!

>REMINDER: The `developer` message acts like an overarching instruction that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the `developer` prompt.

In [9]:
list_of_prompts = [
    system_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

Are you kidding me? Neither! I want my ice in nice, big, solid chunks that stay cold and don't drown my drink in water. If I have to choose, I guess crushed ice is the worst because it melts so fast and makes everything watery. But honestly, I’d rather have none at all if it’s just going to be a mess!

Let's try that same prompt again, but modify only our `developer` (system) prompt!

In [10]:
list_of_prompts[0] = system_prompt("You are joyful and having an awesome day!")

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

I think crushed ice is fun and refreshing, especially for drinks like cocktails or slushies! Cubed ice is great for keeping beverages cold without quickly watering them down. Both have their own charm—what's your favorite?
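Since only the first entry changes between the two runs, one way to compare personas is to build one message list per `developer` prompt up front. This is a sketch under that assumption: `build_variants` is a hypothetical helper, and it only constructs the lists (no API call is made here).

```python
def build_variants(developer_prompts: dict, user_message: str) -> dict:
    """Return one message list per persona, all sharing the same user message."""
    return {
        name: [
            {"role": "developer", "content": instructions},
            {"role": "user", "content": user_message},
        ]
        for name, instructions in developer_prompts.items()
    }


variants = build_variants(
    {
        "irate": "You are irate and extremely hungry.",
        "joyful": "You are joyful and having an awesome day!",
    },
    "Do you prefer crushed ice or cubed ice?",
)
```

Each value in `variants` could then be passed to `get_response` in the same way as `list_of_prompts` in the cells above.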

While we're only printing the responses, remember that OpenAI is returning the full payload that we can examine and unpack!

In [11]:
print(joyful_response)

ChatCompletion(id='chatcmpl-BzakuI20VBKRYWYXI4WuDx7XbEaPt', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I think crushed ice is fun and refreshing, especially for drinks like cocktails or slushies! Cubed ice is great for keeping beverages cold without quickly watering them down. Both have their own charm—what's your favorite?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1754018996, model='gpt-4.1-nano-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_38343a2f8f', usage=CompletionUsage(completion_tokens=45, prompt_tokens=30, total_tokens=75, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))
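The fields in that printout can be read with ordinary attribute access. The sketch below uses a `SimpleNamespace` stand-in that copies a few values from the output above (with the message text shortened), so it runs without an API call. `summarize` is a hypothetical helper; with a real response you would pass `joyful_response` instead of `fake_response`.

```python
from types import SimpleNamespace

# Stand-in mirroring part of the ChatCompletion printed above (not a real API object).
fake_response = SimpleNamespace(
    model="gpt-4.1-nano-2025-04-14",
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(role="assistant", content="I think crushed ice is fun..."),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=30, completion_tokens=45, total_tokens=75),
)


def summarize(response) -> dict:
    """Pull out the fields most often needed when building an application."""
    choice = response.choices[0]
    return {
        "model": response.model,
        "finish_reason": choice.finish_reason,
        "text": choice.message.content,
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
    }


summary = summarize(fake_response)
```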


### Prompt Engineering

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try and "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [12]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Climate change refers to long-term shifts in weather patterns and global temperatures, primarily caused by human activities such as burning fossil fuels, deforestation, and industrial processes. These activities increase greenhouse gas concentrations in the atmosphere, leading to global warming. The impacts of climate change include more frequent and severe weather events, rising sea levels, melting glaciers, and disruptions to ecosystems and agriculture. Addressing climate change requires global cooperation to reduce emissions, transitioning to renewable energy sources, and implementing sustainable practices to protect the planet for future generations.

In [13]:
list_of_prompts = [
    user_prompt("Write a brief text on climate change as vice ganda in a talk show.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

Oo, mga beks! Kagaya ng sinasabi ni Vice Ganda, eh, mukhang kailangan na nating magising sa katotohanan—ang climate change ay tunay na seryosong usapin! Kasi, mga friends, hindi natin pwedeng ipasa hanggang sa susunod na generation ang problemang ito kung hindi natin aaksyunan ngayon. Let's do our part, mag-recycle, mag-save ng energy, at higit sa lahat, magkaisa tayo para sa isang mas malinis at mas maaliwalas na planeta. Kasi, di ba, ang beauty ng buhay ay nandiyan lang sa paligid natin—kailangan lang nating ingatan para tuloy-tuloy ang saya! Charot!

### ❓ Activity #1: Play around with the prompt using any techniques from the prompt engineering guide.

### Few-shot Prompting

On its own, the model has no way of knowing what made-up words like 'stimple' and 'falbean' mean.

Let's see if we can use the `assistant` role to show the model what these words mean.

In [14]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

The stimple wrench effortlessly spun the falbean, allowing us to tighten the bolt with ease.

As you can see, leveraging the `assistant` role makes for a stimple experience!
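The pattern used above alternates `user` and `assistant` messages and ends on a `user` message for the model to complete. Here is a minimal sketch of that pattern as a reusable function. `few_shot_messages` is a hypothetical helper, and it only builds the message list (no API call is made here).

```python
def few_shot_messages(examples: list, query: str) -> list:
    """Turn (user_text, assistant_text) pairs plus a final query into chat messages."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The final user message is the one the model is asked to answer.
    messages.append({"role": "user", "content": query})
    return messages


messages = few_shot_messages(
    examples=[
        (
            "Something that is 'stimple' is said to be good, well functioning, and high quality. "
            "An example of a sentence that uses the word 'stimple' is:",
            "'Boy, that there is a stimple drill'.",
        )
    ],
    query="A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. "
    "An example of a sentence that uses the words 'stimple' and 'falbean' is:",
)
```

Adding more `(user_text, assistant_text)` pairs to `examples` gives the model more demonstrations before the final query.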

### Chain of Thought

Models will often use Chain of Thought by default to answer difficult questions, though, as the example here shows, not always!

> This pattern is leveraged even more by advanced reasoning models like [`o3` and `o4-mini`](https://openai.com/index/introducing-o3-and-o4-mini/)!

In [15]:

reasoning_problem = """
how many r's in "strawberry?" {instruction}
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

There are 2 letter "r"s in "strawberry."

Notice that the model did not count properly: it found only 2 r's, but "strawberry" contains 3.
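One detail worth knowing about the prompt above: `{instruction}` sits inside a plain string, so it is sent to the model literally unless something fills it in. Here is a minimal sketch of filling the placeholder with `str.format`. The instruction text is a hypothetical example and only one of many possible choices.

```python
reasoning_template = """
how many r's in "strawberry?" {instruction}
"""

# Hypothetical instruction; any step-by-step nudge could go here.
instruction = "Go through the word letter by letter before answering."

# str.format replaces the {instruction} placeholder with the text above.
reasoning_problem = reasoning_template.format(instruction=instruction)
```

The resulting `reasoning_problem` can be wrapped with `user_prompt` exactly as in the cell above.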

### ❓ Activity #2: Update the prompt so that it can count correctly.

In [17]:
reasoning_problem = """
How many letter 'r's are in the word "strawberry"? 
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

There are two letter 'r's in the word "strawberry."

In [19]:
reasoning_problem = """
Please go through each letter of the word strawberry step by step and count how many times the letter 'r' appears.
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's examine each letter of the word "strawberry" step by step and count the number of times the letter 'r' appears:

1. **S** – not 'r'
2. **t** – not 'r'
3. **r** – 1st 'r'
4. **a** – not 'r'
5. **w** – not 'r'
6. **b** – not 'r'
7. **e** – not 'r'
8. **r** – 2nd 'r'
9. **r** – 3rd 'r'
10. **y** – not 'r'

**Total number of times 'r' appears: 3**

In [16]:
reasoning_problem = """
How many letter 'r's are in the word "strawberry"? 
Please go through each letter of the word step by step and count how many times the letter 'r' appears.
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's go through the word "strawberry" letter by letter:

1. s – not 'r'
2. t – not 'r'
3. r – **1st 'r'**
4. a – not 'r'
5. w – not 'r'
6. b – not 'r'
7. e – not 'r'
8. r – **2nd 'r'**
9. r – **3rd 'r'**
10. y – not 'r'

Counting all the 'r's, we find that there are **3** in the word "strawberry".

In [25]:

reasoning_problem = """
Please go through each letter of the word "strawberry" step by step and focus on counting how many times the letter r appears. Explain your reasoning for your number count.
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Let's analyze the word "strawberry" letter by letter and count how many times the letter "r" appears:

1. **s** - Not an "r," so count remains 0.
2. **t** - Not an "r," count remains 0.
3. **r** - This is an "r," so count increases to 1.
4. **a** - Not an "r," count remains 1.
5. **w** - Not an "r," count remains 1.
6. **b** - Not an "r," count remains 1.
7. **e** - Not an "r," count remains 1.
8. **r** - This is an "r," so count increases to 2.
9. **r** - Again, an "r," so count increases to 3.
10. **y** - Not an "r," count remains 3.

**Final count:** The letter "r" appears **3 times** in "strawberry."
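Since counting letters is deterministic, the model's walk-throughs above can be checked against plain Python. A minimal sketch that records the 1-based position of every 'r', mirroring the numbered steps in the responses:

```python
word = "strawberry"
target = "r"

# 1-based positions of every occurrence of the target letter.
positions = [i for i, letter in enumerate(word, start=1) if letter == target]

print(f"'{target}' appears {len(positions)} times at positions {positions}")
# prints: 'r' appears 3 times at positions [3, 8, 9]
```

A check like this is a handy ground truth when you are comparing prompt variants for a task with a known answer.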

### Basic Prompt


In [22]:
list_of_prompts = [
    user_prompt("What is Prompt Engineering.")
]

response = get_response(client, list_of_prompts)
pretty_print(response)

Prompt engineering is the process of designing and refining input instructions or prompts to effectively communicate with artificial intelligence (AI) models, particularly large language models (LLMs), to produce desired outputs. It involves crafting clear, specific, and well-structured prompts to guide the AI's responses in a way that aligns with the user's goals.

Key aspects of prompt engineering include:

- **Clarity:** Ensuring prompts are unambiguous and straightforward.
- **Specificity:** Providing detailed context or instructions to elicit precise answers.
- **Formatting:** Using techniques like bullet points, questions, or examples to improve understanding.
- **Iteration:** Refining prompts through trial and error to optimize results.

Prompt engineering is essential for leveraging AI effectively in applications like content creation, coding assistance, data analysis, and more. It helps users obtain relevant, accurate, and useful outputs from AI systems.

### Chain of Thought


In [26]:

reasoning_problem = """
“You are an expert tutor explaining prompt engineering to a beginner. Ask questions to understand what the learner knows. 
Provide a few examples to illustrate different techniques (like few-shot prompting, clarifying questions, or bias reduction). 
Ensure your answer is unbiased and avoids oversimplification.”"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)

Great! To start, I'd like to understand your current level of familiarity. Have you heard of prompt engineering before, or is this your first time encountering the term? Also, do you have any experience working with AI models like chatbots or language models?

Once I have a sense of your background, I can tailor my explanation. For now, let me ask:

- Are you interested in how prompts influence the responses of models like GPT?
- Do you want to learn about specific techniques to improve prompt quality?
- Are you curious about how to guide the model to produce more accurate or unbiased answers?

To give you a sense of what prompt engineering involves, here are some techniques illustrated with examples:

**1. Few-shot Prompting:**  
This method provides a few examples within the prompt to guide the model toward the desired task.

*Example:*  
Suppose you want the model to translate English to French. You might write:  
```
English: Hello  
French: Bonjour

English: How are you?  
French: Comment ça va?

English: Thank you  
French:
```  
The model is likely to complete with "Merci," based on the pattern.

**2. Clarifying Questions:**  
Including questions in your prompt can help the model understand context or specify the kind of answer you're seeking.

*Example:*  
Instead of asking, "Tell me about climate change," you can ask:  
"What specific aspects of climate change are you interested in? Environmental impacts, policy responses, or technological solutions?"

**3. Bias Reduction:**  
To reduce bias or ensure more neutral responses, you can frame your prompt carefully or ask the model to consider multiple viewpoints.

*Example:*  
"Present arguments for and against the use of nuclear energy, without taking a side."

Would you like to go through a specific example or technique in more detail? Or perhaps you want suggestions tailored to a particular task you're working on?

### Conclusion

Now that you're accessing `gpt-4.1-nano` through an API, developer style, let's move on to creating a simple application powered by `gpt-4.1-nano`!

Materials adapted for PSI AI Academy. Original materials from AI Makerspace.