### Starter Prompt Engineering

1. Getting started - set up environment variables
2. Using OpenAI library - client, roles
3. Prompting examples - few shot, chain of thought





Note: How to get an OpenAI API key:

Create an account with OpenAI [here](https://platform.openai.com/signup) if you do not have one.

Click the "Settings" icon at the top right, then navigate to "API keys" in the left menu. Click "Create new secret key" and complete the form. Make sure you copy the key when it is shown (you can always generate a new one).




## 1. Getting Started

First, install the [OpenAI Python Library](https://github.com/openai/openai-python/tree/main)!

In [None]:
# Install the OpenAI library (-q: quiet output, -U: upgrade to the latest version)
!pip install openai -qU

#### Setting Environment Variables

In [None]:
import os
import getpass
import openai

In [None]:
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key")
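Before moving on, it can help to confirm that the key was actually stored without printing it in full. The snippet below is a minimal sketch; `mask_key` is a hypothetical helper written for this notebook, not part of the OpenAI library.

```python
import os

def mask_key(key: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret (hypothetical helper)."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]

# Read the variable set in the previous cell; fall back to an empty string if missing.
api_key = os.environ.get("OPENAI_API_KEY", "")

if api_key:
    print("Key loaded:", mask_key(api_key))
else:
    print("OPENAI_API_KEY is not set; re-run the cell above.")
```

Showing only the last few characters is usually enough to tell two keys apart while keeping the secret out of the notebook output.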

## 2. First Prompt




We're going to use the `chat.completions.create` method to interact with the `gpt-4.1-nano` model.

There are a few things we'll get out of the way first, however, the first being the idea of "roles".

There are three "roles" available to use:



*   developer
*   assistant
*   user


OpenAI provides some context for these roles [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages).



We'll explore these roles in more depth later - but for now we'll stick with the basic `user` role. The `user` role is, as the name suggests, the user!
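To make the idea of roles concrete, here is a rough sketch of what a conversation looks like as plain Python data: a list of dictionaries, each with a `role` and a `content` key. The helper `make_message` and the `VALID_ROLES` set are illustrative names chosen for this sketch, and the set only mirrors the three roles listed above.

```python
# Roles listed in this notebook (illustrative; this set is an assumption of the sketch).
VALID_ROLES = {"developer", "assistant", "user"}

def make_message(role: str, content: str) -> dict:
    """Build one chat message, rejecting roles this notebook does not cover."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    return {"role": role, "content": content}

# A short conversation: an instruction, a question, a reply, and a follow-up.
conversation = [
    make_message("developer", "You are a concise assistant."),
    make_message("user", "What is a prompt?"),
    make_message("assistant", "A prompt is the text you send to a model."),
    make_message("user", "Give me one example."),
]

for message in conversation:
    print(f"{message['role']:>9}: {message['content']}")
```

No API call is made here; the list is exactly the kind of value passed as `messages` in the cells that follow.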

We'll use the `gpt-4.1-nano` model, as stated above; `gpt-4.1` works as well.

Let's look at an example!


The core feature of the OpenAI Python Library is the `OpenAI()` client. It's how we're going to interact with OpenAI's models.

> NOTE: You can reference OpenAI's [documentation](https://platform.openai.com/docs/api-reference/chat) whenever you get stuck, have questions, or want to dive deeper.

In [None]:
from openai import OpenAI

client = OpenAI()

In [None]:
YOUR_PROMPT = "WRITE YOUR PROMPT HERE"
client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role" : "user", "content" : YOUR_PROMPT}]
)
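The call above returns a full response object rather than plain text. The generated text lives at `choices[0].message.content`. The sketch below illustrates that access path using a stand-in object built with `SimpleNamespace`, so it runs without an API key; `extract_text` and `fake_response` are hypothetical names, and the stand-in only imitates the fields used here.

```python
from types import SimpleNamespace

def extract_text(response) -> str:
    """Return the text of the first choice from a chat completion-like object."""
    return response.choices[0].message.content

# Stand-in with the same nesting as a real response (made-up content, for illustration only).
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hello there!"))]
)

print(extract_text(fake_response))
```

With a real response, the same function can be applied to the value returned by `client.chat.completions.create(...)`.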

The helper functions below make it easier to work with the OpenAI API.


In [None]:
from IPython.display import display, Markdown
from openai.types.chat import ChatCompletion

def get_response(client: OpenAI, messages: list[dict], model: str = "gpt-4.1-nano") -> ChatCompletion:
    # Send the full message list and return the raw ChatCompletion object
    return client.chat.completions.create(
        model=model,
        messages=messages
    )

def developer_prompt(message: str) -> dict:
    return {"role": "developer", "content": message}

def assistant_prompt(message: str) -> dict:
    return {"role": "assistant", "content": message}

def user_prompt(message: str) -> dict:
    return {"role": "user", "content": message}

def pretty_print(response: ChatCompletion) -> None:
    # Render the text of the first choice as Markdown in the notebook
    display(Markdown(response.choices[0].message.content))

In [None]:
# test the helper functions

YOUR_PROMPT_HELLO = "WRITE YOUR PROMPT HERE"
messages_list = [user_prompt(YOUR_PROMPT_HELLO)]

chatgpt_response = get_response(client, messages_list)

pretty_print(chatgpt_response)

## 2. Roles

Now we can extend our prompts to include a developer prompt.

NOTE: The developer message acts like an overarching **instruction** that is applied to your user prompt. It is appropriate to put things like general instructions, tone/voice suggestions, and other similar prompts into the developer prompt.

In [None]:
list_of_prompts = [
    developer_prompt("You are irate and extremely hungry."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

irate_response = get_response(client, list_of_prompts)
pretty_print(irate_response)

As you can see - the response we get back is very much as directed in the developer prompt!

Let's try the same user prompt, but with a different developer instruction to see the difference.

In [None]:
list_of_prompts = [
    developer_prompt("You are joyful and having the best day."),
    user_prompt("Do you prefer crushed ice or cubed ice?")
]

joyful_response = get_response(client, list_of_prompts)
pretty_print(joyful_response)

With a simple modification of the developer prompt - you can see that you get completely different behaviour, and that's the main goal of prompt engineering as a whole.

Congratulations, you created your first prompt!

## 3. Few shot prompting

Now that we have a basic handle on the `developer` role and the `user` role - let's examine what we might use the `assistant` role for.

The most common usage pattern is to "pretend" that we're answering our own questions. This helps us further guide the model toward our desired behaviour. While this is an oversimplification - it's conceptually well aligned with few-shot learning.

First, we'll try and "teach" `gpt-4.1-nano` some nonsense words as was done in the paper ["Language Models are Few-Shot Learners"](https://arxiv.org/abs/2005.14165).

In [None]:
list_of_prompts = [
    user_prompt("Please use the words 'stimple' and 'falbean' in a sentence.")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)

As you can see, the model is unsure what to do with these made-up words.

Let's see if we can use the **assistant** role to show the model what these words mean.

In [None]:
list_of_prompts = [
    user_prompt("Something that is 'stimple' is said to be good, well functioning, and high quality. An example of a sentence that uses the word 'stimple' is:"),
    assistant_prompt("'Boy, that there is a stimple drill'."),
    user_prompt("A 'falbean' is a tool used to fasten, tighten, or otherwise is a thing that rotates/spins. An example of a sentence that uses the words 'stimple' and 'falbean' is:")
]

stimple_response = get_response(client, list_of_prompts)
pretty_print(stimple_response)



This example shows how the `assistant` role guides the final sentence the model produces.
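The alternating user/assistant pattern used above can be generalized to any number of examples. The following is a minimal sketch of one way to do that; `few_shot_messages` is a hypothetical helper, and it builds plain dictionaries so that it runs on its own without the notebook's other cells.

```python
def few_shot_messages(examples: list[tuple[str, str]], final_question: str) -> list[dict]:
    """Turn (question, answer) pairs into alternating user/assistant messages,
    then append the real question as the last user message."""
    messages = []
    for question, answer in examples:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": final_question})
    return messages

# One worked example, followed by the question we actually want answered.
examples = [
    ("Use the word 'stimple' in a sentence.", "'Boy, that there is a stimple drill'."),
]
messages = few_shot_messages(examples, "Use 'stimple' and 'falbean' in a sentence.")

for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as the `messages` argument in the same way as `list_of_prompts`.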

## 4. Chain of Thought Prompting

We'll head one level deeper and explore the world of Chain of Thought prompting (CoT).

This is a process by which we can encourage the LLM to handle slightly more complex tasks.

Let's look at a simple reasoning-based example without CoT.

In [None]:
reasoning_problem = """
Billy wants to get home from San Fran. before 7PM EDT.

It's currently 1PM local time.

Billy can either fly (3hrs), and then take a bus (2hrs), or Billy can take the teleporter (0hrs) and then a bus (1hrs).

Does it matter which travel option Billy selects?\n
"""

list_of_prompts = [
    user_prompt(reasoning_problem)
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)
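To judge whether the model's answer is right, it helps to work the problem out by hand first. The sketch below does the arithmetic under two assumptions that the prompt leaves implicit: "local time" means Pacific time in San Francisco, and Pacific time is 3 hours behind Eastern time. All variable names are illustrative.

```python
# Assumption: San Francisco local time is 3 hours behind Eastern time.
PACIFIC_TO_EASTERN_HOURS = 3

start_pacific = 13      # 1PM local time, on a 24-hour clock
deadline_eastern = 19   # 7PM EDT

# Convert the start time to Eastern so everything is in one time zone.
start_eastern = start_pacific + PACIFIC_TO_EASTERN_HOURS  # 16, i.e. 4PM EDT

# Total travel time in hours for each option from the prompt.
options = {
    "fly + bus": 3 + 2,
    "teleporter + bus": 0 + 1,
}

results = {}
for name, travel_hours in options.items():
    arrival_eastern = start_eastern + travel_hours
    results[name] = arrival_eastern < deadline_eastern  # "before 7PM" is strict
    print(f"{name}: arrives at {arrival_eastern}:00 EDT, on time: {results[name]}")
```

Under these assumptions, flying arrives at 21:00 EDT (too late) and the teleporter arrives at 17:00 EDT (on time), so the choice does matter.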

Let's see if we can leverage a simple CoT prompt to improve our model's performance on this task. Append a phrase such as "Let's think step by step" to the problem:

In [None]:
list_of_prompts = [
    user_prompt(reasoning_problem + "WRITE YOUR CoT STEP PROMPT HERE")
]

reasoning_response = get_response(client, list_of_prompts)
pretty_print(reasoning_response)
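If you want to reuse the same CoT phrase across several problems, a small helper keeps the formatting consistent. This is only a sketch: `with_cot` and `COT_TRIGGER` are hypothetical names, and the trigger phrase is the one suggested above.

```python
COT_TRIGGER = "Let's think step by step."

def with_cot(problem: str, trigger: str = COT_TRIGGER) -> str:
    """Append a chain-of-thought trigger phrase to a problem statement."""
    # Strip trailing whitespace so the trigger always sits on its own paragraph.
    return problem.rstrip() + "\n\n" + trigger

print(with_cot("Does it matter which travel option Billy selects?\n"))
```

Keeping the trigger as a parameter makes it easy to compare different phrasings on the same problem.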

## 5. Prompt Engineering Principles

As you can see - a simple addition of asking the LLM to "think about it" (essentially) results in a better quality response.

There's a [great paper](https://arxiv.org/pdf/2312.16171v1.pdf) that dives into some principles for effective prompt generation.

Your task for this notebook is to construct a prompt that will be used in the following example to create a helpful assistant for whatever task you'd like.

## 6. Test the prompt using LLM-as-a-judge

In [1]:
developer_template = """\
You are the best chef in the world and you are sharing your best recipes. Answer customers' questions in a polite way. Provide the answer in JSON format.
"""


In [2]:
user_template = """{input}
How long does it take to bake a cake? Calculate the result by summing up minutes on individual tasks.
"""

In [None]:
query = "It takes 30 minutes to mix dough and 20 minutes to bake it in the oven."


list_of_prompts = [
    developer_prompt(developer_template),
    user_prompt(user_template.format(input=query))
]

test_response = get_response(client, list_of_prompts)

pretty_print(test_response)

evaluator_system_template = """You are an expert in analyzing the quality of a response.

You should be hyper-critical.

Provide scores (out of 10) for the following attributes:

1. Clarity - how clear is the response
2. Faithfulness - how related to the original query is the response
3. Correctness - was the response correct?

Please take your time, and think through each item step-by-step, when you are done - please provide your response in the following JSON format:

{"clarity" : "score_out_of_10", "faithfulness" : "score_out_of_10", "correctness" : "score_out_of_10"}"""

evaluation_template = """Query: {input}
Response: {response}"""

list_of_prompts = [
    developer_prompt(evaluator_system_template),
    user_prompt(evaluation_template.format(
        input=query,
        response=test_response.choices[0].message.content
    ))
]

evaluator_response = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=list_of_prompts,
    response_format={"type" : "json_object"}
)

In [None]:
pretty_print(evaluator_response)
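Because the evaluator is asked to reply in JSON, its scores can be parsed and used programmatically instead of only being displayed. The sketch below shows one possible approach with the standard `json` module. `parse_scores` is a hypothetical helper, and the `sample` string is made-up data standing in for `evaluator_response.choices[0].message.content`, not real model output.

```python
import json

SCORE_NAMES = ("clarity", "faithfulness", "correctness")

def parse_scores(raw: str) -> dict:
    """Parse the evaluator's JSON reply into a dict of float scores.

    Accepts numbers, numeric strings, and strings such as "8/10"."""
    data = json.loads(raw)
    scores = {}
    for name in SCORE_NAMES:
        # Keep only the part before any "/" so "8/10" becomes 8.0.
        scores[name] = float(str(data[name]).split("/")[0])
    return scores

# Made-up reply for illustration; a real reply comes from the evaluator model.
sample = '{"clarity": "8", "faithfulness": "9/10", "correctness": 7}'

scores = parse_scores(sample)
average = sum(scores.values()) / len(scores)
print(scores)
print("Average score:", average)
```

A missing attribute raises `KeyError` and malformed JSON raises `json.JSONDecodeError`, which is a reasonable signal that the evaluator did not follow the requested format.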

Completed!