<a href="https://colab.research.google.com/github/lennyciotti/learningsql-2875059/blob/main/Copy_of_Part_1_Guide_to_OpenAI_in_Google_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Part 1: Guide to OpenAI in Google Colab

Welcome to **Part 1: Guide to OpenAI in Google Colab**. In this notebook, you’ll learn the fundamentals of **prompt engineering** through a step-by-step tutorial. By the end, you’ll be able to:

- Create and store an OpenAI API key  
- Apply core prompt engineering strategies (iteration and refinement)  
- Use practical coding patterns to streamline your workflow  

These skills are not only valuable for this project, but they will also help you stand out in industry. In fact, Andrew Ng [notes](https://www.linkedin.com/posts/andrewyng_there-is-significant-unmet-demand-for-developers-activity-7369397355160272898-i85T?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAUuuFIBPjBR1kCVBdoY03J3r6hwaAwvapU) that some of the key abilities he looks for when interviewing AI engineers include:  

- Using AI building blocks like prompting, RAG, evals, agentic workflows, and machine learning to build applications  
- Prototyping and iterating rapidly  

You’ll get to practice both of these in this workshop.  

Let’s dive in and start building! 🚀


----

## Table of Contents

1. **[Getting Started](#getting-started)**

    - Creating an OpenAI Key

    - Saving Your API Key in Colab

    - Loading and Verifying Your API Key


2. **[API Fundamentals](#api-fund)**

    -  Temperature

    -  System/User Role Prompting

3. **[Practical Tips](#practical-tips)**

    - Prompt Engineering Basics

    - Don’t Repeat Yourself (DRY) Programming - Creating Functions

    - Markdown Formatting

4. **[Forward](#forward)**



-----


<a name="getting-started"></a>

## 1. Getting Started


In this section, we’ll get started with OpenAI’s API — learning how to obtain and verify your API key, and how to write your first prompt.  


### Creating an OpenAI Key 🔑

#### Step 1: OpenAI  

Click this link to open [OpenAI](https://openai.com/). In the top-right corner, hover over **Log In** and select **API Platform**, as shown in the image below.  


<img src="https://github.com/Sam-Gartenstein/GenAI-Engineering-Workshop/blob/main/Screen_Shots/OpenAI_Image_1.png?raw=true">

#### Step 2: Authentication  

After signing in (and entering the verification code sent to your email), you will be directed to the page shown below. Under **Authentication**, click <u>Organization settings</u>.  


<img src="https://github.com/Sam-Gartenstein/GenAI-Engineering-Workshop/blob/main/Screen_Shots/OpenAI_Image_2.png?raw=true">

#### Step 3: Creating a New Secret Key  

After clicking <u>Organization settings</u>, you will be taken to the page shown below. From there, click **Create new secret key**.  


<img src="https://github.com/Sam-Gartenstein/GenAI-Engineering-Workshop/blob/main/Screen_Shots/OpenAI_Image_3.png?raw=true">

#### Step 4: Creating the Key  

After clicking **Create new secret key**, a pop-up will appear. While entering a name is optional, it is recommended to use something meaningful (e.g., *Fall 2025 Practicum*). Next, select **Default project** and ensure **All** is selected. Finally, click **Create secret key**.  


<img src="https://github.com/Sam-Gartenstein/GenAI-Engineering-Workshop/blob/main/Screen_Shots/OpenAI_Image_4.png?raw=true">

#### Step 5: Saving Your Key  

After creating the key, it will appear as shown below (mine is hidden for privacy).  

1. Click **Copy** and save your key somewhere safe—you won’t be able to view it again later.  
2. Do **not** share your key. Using an OpenAI key incurs costs, and you will be charged if someone else uses it.  

<img src="https://github.com/Sam-Gartenstein/GenAI-Engineering-Workshop/blob/main/Screen_Shots/OpenAI_Image_5.png?raw=true">

### Next Steps

Congrats! You have now created your OpenAI key. This is where the fun begins: we can finally put the key to use.

----

### Saving Your API Key in Colab 🔑

Before using OpenAI's API, we need a secure way to store the API key in this notebook.  
Google Colab provides a built-in secrets manager for this purpose.  


#### Step 1:
On the left sidebar, click on the **key icon**.  

#### Step 2:
Click **“Add new secret.”**  

#### Step 3:
- Paste your API key into the **Value** field  
- Give it a descriptive **Name** (e.g., `OPENAI_API_KEY`)  
- Ensure **Notebook access** is enabled  


<div align="center">
  <img src="https://github.com/Sam-Gartenstein/GenAI-Engineering-Workshop/blob/main/Screen_Shots/OpenAI_Image_6.png?raw=true" width="500">
</div>

✅ Once entered, you can close the panel. Your API key is stored automatically and is available for use in your notebook.



###  Loading and Verifying Your API Key 🔑

We load the API key from Colab Secrets into an environment variable so Python packages can access it.  
The check then verifies whether `OPENAI_API_KEY` is set:  

- If the key is missing, a clear **RuntimeError** is raised so you know to add it in Colab Secrets.  
- If the key is found, it safely confirms with `True` without ever printing the actual secret.  

Before we do this, however, we must import the `openai` library.

<br>  

**✨ Optional Learning**  
- `google.colab.userdata`: Secure interface to Colab’s Secrets; lets you fetch saved keys (e.g., `userdata.get("OPENAI_API_KEY")`).  
- `os`: Standard library module for interacting with the operating system — here, used to read/set environment variables (`os.getenv`, `os.environ`).  

In [None]:
import openai
from openai import OpenAI

In [None]:
from google.colab import userdata
import os

# Pull your saved secret into an environment variable
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

# Test if the key is available (without printing it)
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set. Add it via Colab Secrets (🔑) and try again.")
else:
    print("Key loaded?", True)

Key loaded? True




If you see `Key loaded? True`, then everything is working and you’re ready to move on to the next step.  

<br>

If you see the error, please make sure that:  
- You saved your API key in Colab’s **🔑 Secrets** panel.  
- The secret is named exactly **`OPENAI_API_KEY`** (no typos or extra spaces).  
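
If you want a check that also works outside Colab, the same idea can be wrapped in a small helper. The sketch below is illustrative only: `key_status` is a hypothetical helper name (not part of `openai` or `google.colab`), and it reports only whether the key exists and how long it is, never the key itself.

```python
import os


def key_status(name: str = "OPENAI_API_KEY", env=None) -> str:
    """Return a short status string for a secret without revealing it."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    # Treat a missing, empty, or whitespace-only value as "not set"
    if not value or not value.strip():
        return f"{name} is not set"
    # Flag stray whitespace, a common copy-and-paste mistake
    if value != value.strip():
        return f"{name} is set but has leading or trailing whitespace"
    return f"{name} is set ({len(value)} characters)"


print(key_status())
```

Passing a plain dictionary as `env` makes it easy to try the helper without touching your real environment.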



Congrats! If you have successfully loaded your key, you are ready to run your first **end-to-end test with the OpenAI API**: creating a client, sending a simple prompt, and viewing the model’s reply.  

The first line

```python
client = OpenAI()
```

Creates an OpenAI client that knows how to talk to the API. It automatically picks up your API key from `OPENAI_API_KEY`, which you saved earlier in Colab’s Secrets.

The next four lines:

```python
resp = client.responses.create(
    model="gpt-4o-mini",
    input="Give me study tips. Each study point should be fairly short, a few sentences only."
)
```

- Sends a **request** to the Responses API
- `model="gpt-4o-mini"` selects the model
- `input="Give me study tips. Each study point should be fairly short, a few sentences only."` is your prompt
- The full structured result (text + metadata) is stored in `resp`

Finally:

```python
print(resp.output_text)
```

Extracts just the generated text from the response object and prints it.

<br>

That’s it! We follow this process: create a client → send a prompt → print the model’s reply.

In [None]:
client = OpenAI()  # uses OPENAI_API_KEY already in your env

resp = client.responses.create(
    model="gpt-4o-mini",
    input="Give me study tips. Each study point should be fairly short, a few sentences only."
)

print(resp.output_text)

Sure! Here are some effective study tips:

1. **Set Clear Goals**: Define what you want to achieve in each study session. Specific goals help you stay focused and motivated.

2. **Create a Study Schedule**: Allocate specific time slots for studying different subjects. Consistency helps reinforce learning.

3. **Use Active Learning Techniques**: Engage with the material through summarizing, questioning, or teaching others. This enhances retention.

4. **Take Regular Breaks**: Implement the Pomodoro Technique—study for 25 minutes, then take a 5-minute break. This keeps your mind fresh.

5. **Stay Organized**: Keep your study materials and notes well-organized. Clutter can distract and hinder productivity.

6. **Find Your Ideal Study Environment**: Choose a quiet, comfortable place with minimal distractions. A good environment enhances focus.

7. **Utilize Various Resources**: Incorporate videos, podcasts, and books to get diverse perspectives on the material.

8. **Practice with Flashcar

----

<a name="api-fund"></a>


## 2. API Fundamentals

In this section, we’ll cover two core features of the API: **temperature** and **system/user role prompting**. Let’s dive in!


### Temperature  

One parameter you can adjust in the model is the [temperature](https://platform.openai.com/docs/faq/how-should-i-set-the-temperature-parameter#how-should-i-set-the-temperature-parameter), which controls the **randomness** of its output.  

- A value near **0** → more deterministic and consistent responses.  
- A value near **1.0** → more varied and creative responses.  
- The maximum allowed is **2.0**.  

If you’d like to dive deeper, check out [this article on LLM temperature](https://www.hopsworks.ai/dictionary/llm-temperature). *(Optional reading)*  
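
To build intuition for why temperature changes randomness, here is a toy version of the common softmax-with-temperature formulation. It is a sketch, not OpenAI's actual implementation: the function name `softmax_with_temperature` and the scores in `toy_logits` are made up for illustration, and the sketch requires a temperature above 0 (dividing by zero is undefined), even though the API itself accepts 0.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    # Subtract the max for numerical stability before exponentiating
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next words
toy_logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 1.8):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the low temperature almost all of the probability lands on the top-scoring word, while at the high temperature the three options are much closer together, which is why repeated runs vary more.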

Now let’s experiment! We’ll give the model the **role** of a creative poet inside the prompt and request a **4-line poem about rain**. Then we’ll compare outputs at different temperatures:  

- Default setting: **1.0**  
- Low randomness: **0.2**  
- High randomness: **1.8**  

To observe the effect, we’ll call the API **3 times for each temperature** in a [loop](https://www.w3schools.com/python/python_for_loops.asp). A short pause of 10 seconds (`time.sleep(10)`) between calls prevents hitting API rate limits.  
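
A fixed pause is the simplest approach. An alternative, sketched here under assumptions, is to retry only when a call fails and wait longer each time. `call_with_backoff` and `flaky_call` are hypothetical names, and the failure is simulated so the example runs without an API key; in real use you would catch a narrower exception (for example, the rate-limit error raised by the `openai` package) rather than `Exception`.

```python
import time


def call_with_backoff(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure, wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # in real use, catch a narrower error type
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)


# Stand-in for an API call that fails twice and then succeeds
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


# Record the waits in a list instead of actually sleeping
waits = []
result = call_with_backoff(flaky_call, sleep=waits.append)
print(result, waits)
```

Injecting `sleep` as a parameter keeps the demo instant; leaving it at the default makes the function really pause between attempts.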



**Tip**

Store your prompt as a [string](https://www.w3schools.com/python/python_strings.asp) variable before passing it into the model.  


In [None]:
prompt = (
    "You are a creative poet. "
    "Write a short 4-line poem about rain."
)

#### Temperature `1.0`

In [None]:
import time

for i in range(3):
    print(f"\n— Run {i+1} (temp=1.0) —")
    resp = client.responses.create(
        model="gpt-4o-mini",
        input=prompt,
        temperature=1.0
    )
    print(resp.output_text)

    if i < 2:
        time.sleep(10)


— Run 1 (temp=0.2) —
Whispers of silver dance on the ground,  
Nature's soft sigh, a soothing sound.  
Each droplet a story, a moment in time,  
In the heart of the storm, the world starts to rhyme.

— Run 2 (temp=0.2) —
Whispers fall from silver skies,  
Dancing drops in soft reprise,  
Nature's tears, a sweet embrace,  
Life awakens, finds its place.

— Run 3 (temp=0.2) —
Whispers of silver in the twilight sky,  
Dancing on rooftops, a soft lullaby.  
Each droplet a story, a moment to share,  
Nature's embrace in the cool, fragrant air.


**Output Analysis**

These three poems at temperature = 1.0 show a balance of variety and consistency. Each output uses gentle, predictable imagery, but the phrasing and rhythm shift slightly with each run. This illustrates how the default temperature produces outputs that are creative yet still fairly stable across generations.

#### Temperature `0.2`

In [None]:
for i in range(3):
    print(f"\n— Run {i+1} (temp=0.2) —")
    resp = client.responses.create(
        model="gpt-4o-mini",
        input=prompt,
        temperature=0.2
    )
    print(resp.output_text)

    if i < 2:
        time.sleep(10)


— Run 1 (temp=0.2) —
Whispers of silver in the twilight sky,  
Dancing on rooftops, a soft lullaby.  
Each drop a secret, a story untold,  
Nature's embrace in a shimmer of gold.

— Run 2 (temp=0.2) —
Whispers of silver dance on the ground,  
Nature's soft lullaby, a soothing sound.  
Each drop a secret, a story untold,  
In the arms of the storm, the world turns to gold.

— Run 3 (temp=0.2) —
Whispers of silver dance on the ground,  
Nature's soft sigh, a soothing sound.  
Each droplet a story, a memory spun,  
In the heart of the storm, new life has begun.


**Output Analysis**

At temperature = 0.2, the poems are highly consistent, often reusing imagery like *“whispers of silver”* and *“dancing droplets”* (though results may still vary). The structure and tone remain nearly identical across runs, showing how a low temperature makes the model more deterministic and less creative. The outputs feel polished but display less variety compared to higher settings.


#### Temperature `1.8`

In [None]:
for i in range(3):
    print(f"\n— Run {i+1} (temp=1.8) —")
    resp = client.responses.create(
        model="gpt-4o-mini",
        input=prompt,
        temperature=1.8
    )
    print(resp.output_text)

    if i < 2:
        time.sleep(10)


— Run 1 (temp=1.8) —
Gentle whispers from the sky,  
Dancing droplets; trees reply,  
Nature bathes in silver streams,  
Splashing dreams like glimmering seams.  

— Run 2 (temp=1.8) —
Bare leaves sway in whispers stray,  
Dancing drops turn night to day,  
Mother Earth's rerun tonight,  
Healing tears ignite delight.

— Run 3 (temp=1.8) —
Whispers of water dance from the sky,  
Each drop a melody — quick to comply.  
Kissing the earth, a soothing refrain,  
In gray embrace, the world cries in rain.  


**Output Analysis**

At temperature = 1.8, the poems show greater variety and imaginative phrasing. The imagery shifts noticeably between runs, with less repetition and more surprising word choices. This illustrates how a high temperature boosts creativity and randomness, though it can also produce less polished or less consistent results.
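
One rough way to put a number on "variety" is to measure how many words the runs share. The sketch below uses Jaccard overlap on the opening lines copied from the recorded `temp=0.2` and `temp=1.8` runs in this notebook. The helper names (`word_set`, `jaccard`, `mean_pairwise_overlap`) are hypothetical, and with only three short lines per setting this is an illustration, not a rigorous evaluation.

```python
import re
from itertools import combinations


def word_set(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words (1.0 = identical sets)."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def mean_pairwise_overlap(texts):
    """Average Jaccard overlap across every pair of texts."""
    pairs = list(combinations(texts, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Opening lines copied from the recorded runs in this notebook
low_temp = [
    "Whispers of silver in the twilight sky,",
    "Whispers of silver dance on the ground,",
    "Whispers of silver dance on the ground,",
]
high_temp = [
    "Gentle whispers from the sky,",
    "Bare leaves sway in whispers stray,",
    "Whispers of water dance from the sky,",
]

print("temp=0.2:", round(mean_pairwise_overlap(low_temp), 2))
print("temp=1.8:", round(mean_pairwise_overlap(high_temp), 2))
```

A higher score means the runs reuse more of the same words, so the low-temperature lines should score noticeably higher than the high-temperature ones.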


#### Try It Yourself

Now that you’ve seen the flow and output, experiment with temperature! Set it to **0.0** for deterministic, minimal-variation answers; try **2.0** for very diverse, creative outputs (may be less consistent).

In [None]:
# SET A TEMPERATURE VALUE BETWEEN 0.0 AND 2.0, THEN RUN THIS CELL
temperature = 1.0  # 👈 replace with your chosen temperature

for i in range(3):
    print(f"\n— Run {i+1} (temp={temperature}) —")
    resp = client.responses.create(
        model="gpt-4o-mini",
        input=prompt,
        temperature=temperature
    )
    print(resp.output_text)

    if i < 2:
        time.sleep(10)  # pause between runs


### System/User Role Prompting

Now, we will talk about **System/User Role Prompting**. But before diving in, let’s quickly review **Role Prompting**.  

**Role prompting** is a way of shaping not just *what* the model says, but *how* it says it. By assigning the model a role, you can influence its tone, perspective, and style of response.  

For example, if you tell the model *“You are a supportive tutor,”* it will answer with encouragement. If you instead say *“You are a strict tutor,”* the output will sound more demanding.  

Here is an example of **Role Prompting** only.

In [None]:
encouraging_prompt = (
    "You are an encouraging tutor. Use supportive and positive language. "
    "Adopt this encouraging voice consistently in every tip.\n"
    "Give me exactly 3 study tips for students in a college-level introductory statistics course.\n"
    "Each tip must be one sentence (≤ 20 words).\n"
    "Format as bullet points.\n"
    "Be concrete and domain-specific (e.g., sampling, probability, hypothesis testing).\n"
    "• **Active Recall:** Celebrate progress by testing yourself with short quizzes on sampling and probability after each session.\n"
    "\n"
    "Now generate exactly 3 new tips, written in the same format but with the encouraging tone.\n"
)



In [None]:
resp = client.responses.create(
    model="gpt-4o-mini",
    input=encouraging_prompt
)

print(resp.output_text)

Absolutely! Here are three more supportive study tips for your statistics journey:

- **Visual Aids:** Create colorful charts and graphs to solidify your understanding of hypothesis testing—visuals make concepts more memorable!

- **Study Groups:** Join or form a study group to discuss and practice key concepts like distribution shapes—collaboration enhances learning and boosts confidence!

- **Real-Life Applications:** Relate statistical concepts to real-world examples, like sports statistics, to make learning engaging and relevant—you're doing great!


Nice! We created a structured and detailed prompt tailored for our audience of introductory statistics students, and we instructed the model to use an encouraging tone.  

That prompt works well, but we can organize it more clearly. The API supports this by separating instructions into structured **role messages**:

- **System** → defines the overall role, voice, or behavior (e.g., “You are a supportive tutor”).  
- **User** → contains the actual request (e.g., “Give me 3 concise study tips for intro statistics”).  

This separation keeps prompts cleaner, makes intent more explicit, and helps the model maintain consistency in longer conversations.  
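
As a minimal sketch of this separation, the helper below builds the two role messages as a list of dictionaries, which is the shape passed as `messages` to `client.chat.completions.create` in the next cell. `build_messages` is a hypothetical helper name introduced here for illustration; it makes no API call.

```python
def build_messages(system_message: str, user_message: str) -> list:
    """Return a two-message list: system instructions first, then the user request."""
    return [
        {"role": "system", "content": system_message.strip()},
        {"role": "user", "content": user_message.strip()},
    ]


messages = build_messages(
    "You are a supportive tutor.",
    "Give me 3 concise study tips for intro statistics.",
)

# Show which text ends up in which role
for message in messages:
    print(f"{message['role']:>6}: {message['content']}")
```

Keeping the role text in its own variable means you can swap the tutor's voice without touching the request.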


In [None]:
system_message = "You are an encouraging tutor. Be supportive, practical, and foster collaboration."
user_message = (
    "Give me exactly 3 study tips for a college-level introductory statistics course.\n"
    "Each tip should be one sentence (≤ 20 words).\n"
    "Format as bullet points.\n"
    "Be concrete and domain-specific (e.g., sampling, probability, hypothesis testing).\n"
    "• **Active Recall:** Test yourself with short quizzes on sampling and probability after each study session.\n"
    "\n"
    "Now generate exactly 3 new tips, different from the example.\n"
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ]
)

print(resp.choices[0].message.content)

- **Practice Problems:** Regularly solve a variety of problems on hypothesis testing to solidify your understanding of concepts and calculations.  
- **Visual Aids:** Create visual representations, like graphs, for distributions and data sets to enhance comprehension and retention.  
- **Group Study:** Collaborate with classmates to discuss real-world applications of statistical methods, boosting understanding through diverse perspectives.  


----

<a name="practical-tips"></a>


## 3. Practical Tips

In this section, we will go over some practical tips that will be helpful as you work through this module!


### Prompt Engineering   

In the System/User Role Prompting section, we briefly touched on what makes a good prompt. Now let’s dive a little deeper into some key strategies for effective prompt design. Let's start with an example of a **fluffy** prompt.


In [None]:
resp = client.responses.create(
    model="gpt-4o-mini",
    input="Give me three study tips."
)

print(resp.output_text)

Sure! Here are three effective study tips:

1. **Active Learning**: Engage with the material beyond passive reading. Summarize information in your own words, create flashcards, or teach concepts to someone else. This helps reinforce what you've learned.

2. **Pomodoro Technique**: Use a timer to break your study sessions into manageable chunks (e.g., 25 minutes of focused study followed by a 5-minute break). This method helps maintain concentration and reduces burnout.

3. **Organized Environment**: Create a clutter-free and distraction-free study space. Having a dedicated area for studying can improve focus and make your study sessions more productive.

Implementing these tips can enhance your learning efficiency!


While the output may look fine, it leaves too much room for interpretation — we only instructed the model to “give us three study tips.” Without further context, this could apply to any subject, any student group, or any learning scenario. In addition, we gave no specifications about length or output style, so the model had freedom to choose its own format and tone. This lack of clarity makes the results less predictable and harder to tailor to our needs.  

Let's contrast our fluffy prompt with the one we made for our encouraging tutor:



```python

encouraging_prompt = (
    "You are an encouraging tutor. Use supportive and positive language. "
    "Adopt this encouraging voice consistently in every tip.\n"
    "Give me exactly 3 study tips for students in a college-level introductory statistics course.\n"
    "Each tip must be one sentence (≤ 20 words).\n"
    "Format as bullet points.\n"
    "Be concrete and domain-specific (e.g., sampling, probability, hypothesis testing).\n"
    "• **Active Recall:** Celebrate progress by testing yourself with short quizzes on sampling and probability after each session.\n"
    "\n"
    "Now generate exactly 3 new tips, written in the same format but with the encouraging tone.\n"
)


```

This prompt is much better for the following reasons:

- **Precise**: It specifies the subject (introductory statistics) and the type of content (study tips), reducing ambiguity.  
- **Set Boundaries**: It defines clear limits on number of tips (3) and sentence length (≤ 20 words).  
- **Controlled Format**: It requires bullet points and even provides an example to ensure structural consistency.  
- **Guided Style**: It enforces an encouraging tone, making the output more suitable for the intended audience.  


These adjustments highlight the power of prompt engineering, which gives us much greater control over the **length, style, and clarity** of the output. Remember, this is an iterative process: each step builds on the last, and small adjustments to your instructions can lead to big improvements in the model’s output.   
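
The iterative process can be sketched in code by treating each ingredient of the prompt as a separate piece. `build_prompt` below is a hypothetical helper (not part of any library), and the two versions show one refinement step: the second adds length and format limits plus an example, leaving the role and task unchanged.

```python
def build_prompt(role, task, constraints, example=None):
    """Assemble a prompt from a role, a task, constraint lines, and an optional example."""
    lines = [role, task]
    lines.extend(constraints)
    if example:
        lines.append(f"Example: {example}")
    return "\n".join(lines)


tutor_role = "You are an encouraging tutor."
tutor_task = "Give me exactly 3 study tips for an introductory statistics course."

# First draft: role and task only
prompt_v1 = build_prompt(tutor_role, tutor_task, constraints=[])

# One refinement step: add boundaries, a format, and an example
prompt_v2 = build_prompt(
    tutor_role,
    tutor_task,
    constraints=[
        "Each tip must be one sentence (≤ 20 words).",
        "Format as bullet points.",
    ],
    example="• **Active Recall:** Test yourself with short quizzes after each session.",
)

print(prompt_v2)
```

Because each piece is a separate argument, you can change one thing at a time and compare the model's output before and after.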


### Don’t Repeat Yourself (DRY) Programming - Creating Functions

By now, you’ve probably noticed that we’re repeating the same code again and again. That’s not very efficient!  

To fix this, we can create a **function**. Functions are a staple in programming and data science — they let you **bundle code into a reusable block**. Instead of copying and pasting the same lines, you simply call the function by name whenever you need it.  

This makes your code **cleaner, more efficient, and easier to maintain** as your project grows.  

We will call our first function `generate_text_simple`. Its core is the API call:

```python
resp = client.responses.create(
    model=model,
    input=prompt
)

return resp.output_text
```

The function accepts three arguments: the **prompt**, a **role** that is prepended to the prompt, and an optional **model** (default: `"gpt-4o-mini"`). It calls the API and **returns only the generated text**, so your code gets a clean string instead of the full response object. The benefit of this structure is that we no longer have to rewrite the API call in every cell!


In [None]:
encouraging_prompt = (
    ". Use supportive and positive language. "
    "Adopt this encouraging voice consistently in every tip.\n"
    "Give me exactly 3 study tips for students in a college-level introductory statistics course.\n"
    "Each tip must be one sentence (≤ 20 words).\n"
    "Format as bullet points.\n"
    "Be concrete and domain-specific (e.g., sampling, probability, hypothesis testing).\n"
    "• **Active Recall:** Celebrate progress by testing yourself with short quizzes on sampling and probability after each session.\n"
    "\n"
    "Now generate exactly 3 new tips, written in the same format but with the encouraging tone.\n"
)

role = "You are an encouraging tutor"

In [None]:
def generate_text_simple(prompt: str, role: str, model: str = "gpt-4o-mini") -> str:
    """
    Send a prompt to an OpenAI model and return the generated text.

    Args:
        prompt: The input text/prompt.
        role:   Role text placed directly in front of the prompt (no separator is added).
        model:  The model name to use (default: gpt-4o-mini).

    Returns:
        The model's text output.
    """
    # The role is concatenated as-is, so it should end with punctuation and a
    # space unless the prompt supplies its own (as encouraging_prompt does)
    full_prompt = role + prompt
    resp = client.responses.create(
        model=model,
        input=full_prompt
    )
    return resp.output_text


print(generate_text_simple(
    "Give me exactly 3 study tips for students in a college-level introductory statistics course.",
    "You are a very stern, mean professor. ",
))

Certainly. Here are three essential study tips for excelling in your introductory statistics course:

1. **Master the Fundamentals**: Ensure you have a solid understanding of basic mathematical concepts. Statistics relies heavily on algebra, so review operations with decimals, fractions, and exponents. Familiarize yourself with common statistical terms and formulas.

2. **Engage with the Material**: Don’t just passively read the textbook. Work through examples and problem sets; apply the concepts to real-world scenarios. Join study groups or attend office hours to clarify difficult topics and engage in discussions that deepen your understanding.

3. **Practice, Practice, Practice**: Statistics requires a lot of practice to become proficient. Regularly solve practice problems and take advantage of any quizzes or practice tests provided by your instructor. The more you practice, the more comfortable you'll become with the different types of problems you'll encounter. 

Remember, mere att

Let's show this function with the encouraging and harsh tips that we created in the previous section!

**Tip:** For cleaner formatting of the output, wrap the function call inside a `print()` statement.  


In [None]:
encouaring_tips = generate_text_simple(encouraging_prompt, role)
print(encouaring_tips)

- **Visual Aids:** Enhance your understanding by creating colorful charts or graphs to represent data distributions and probability concepts.  

- **Study Groups:** Collaborate with classmates to discuss hypothesis testing, fostering deeper comprehension and building lasting friendships.  

- **Practice Problems:** Regularly tackle a variety of exercises on inferential statistics to strengthen your skills and boost your confidence!  


See! This is much more efficient than repeatedly calling `client.responses.create(model=model, input=prompt)` in every cell.
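
One possible extension, sketched under assumptions, is to pass the client in as an argument and make temperature optional. This variant is not the notebook's `generate_text_simple`: `generate_text`, `fake_client`, and `_fake_create` are hypothetical names, and the stand-in client simply echoes the request so the example runs without an API key.

```python
from types import SimpleNamespace


def generate_text(prompt, client, model="gpt-4o-mini", temperature=None):
    """Call client.responses.create and return only the generated text."""
    kwargs = {"model": model, "input": prompt}
    # Only send temperature when the caller sets one, so the API default applies otherwise
    if temperature is not None:
        kwargs["temperature"] = temperature
    resp = client.responses.create(**kwargs)
    return resp.output_text


# Stand-in client for offline testing; it echoes the request instead of calling the API
def _fake_create(**kwargs):
    return SimpleNamespace(output_text=f"[{kwargs['model']}] {kwargs['input']}")


fake_client = SimpleNamespace(responses=SimpleNamespace(create=_fake_create))

print(generate_text("Give me three study tips.", client=fake_client))
```

In the notebook you would pass the real `client` created earlier instead of `fake_client`; the function body stays the same.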



### Markdown Formatting  

So far, we have seen the LLM's raw output. However, we can make our results more readable by asking the model to format responses in **Markdown**. This will help us generate outputs that look cleaner and are easier to interpret inside Colab or on GitHub. This is especially useful when working with structured content such as study guides, rubrics, or summaries.

We can create a function called `to_markdown`, which we will call whenever we want to render the model’s text as formatted Markdown (headings, lists, bold/italics) instead of plain text.


In [None]:
from IPython.display import display, Markdown

def to_markdown(text):
    # Convert the provided text to Markdown format for better display in Jupyter Notebooks
    return Markdown(text)

Let's test it out with the `encouaring_tips` variable we just created!

In [None]:
to_markdown(encouaring_tips)

- **Visual Aids:** Enhance your understanding by creating colorful charts or graphs to represent data distributions and probability concepts.  

- **Study Groups:** Collaborate with classmates to discuss hypothesis testing, fostering deeper comprehension and building lasting friendships.  

- **Practice Problems:** Regularly tackle a variety of exercises on inferential statistics to strengthen your skills and boost your confidence!  

See! Our output is much cleaner!
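
When you already have structured data rather than model text, you can also build the Markdown string yourself before rendering it. The sketch below is illustrative: `tips_to_markdown` is a hypothetical helper, and the tip text is shortened sample data. It only builds a string, so it runs anywhere; in the notebook, passing the result to `to_markdown` would render it.

```python
def tips_to_markdown(title, tips):
    """Build a Markdown string with a heading and one bullet per (label, text) tip."""
    lines = [f"### {title}", ""]
    for label, text in tips:
        lines.append(f"- **{label}:** {text}")
    return "\n".join(lines)


markdown_text = tips_to_markdown(
    "Study Tips",
    [
        ("Visual Aids", "Sketch charts of data distributions."),
        ("Study Groups", "Discuss hypothesis testing with classmates."),
    ],
)

# Printing shows the raw Markdown source; to_markdown(markdown_text) would render it
print(markdown_text)
```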

----

<a name="forward"></a>

## 4. Forward  

Congrats! You now have the basics of prompt engineering. 🎉  
Next, we’ll apply these skills to a real use case — showing how LLMs can generate an essay, design a rubric, and then grade the essay based on that rubric.  
