# The Art and Science of Prompt Engineering

This notebook introduces prompt engineering, where creativity meets technology. We call the OpenAI chat API, refine a system prompt step by step, look at how templates and parameters shape a prompt, and track every generation with Weights & Biases (W&B). These skills help you generate diverse, engaging content, which is useful across AI and machine learning work.


In [1]:
from getpass import getpass
import os
import openai

if os.getenv("OPENAI_API_KEY") is None:
  if any('VSCODE' in x for x in os.environ.keys()):
    print('Please enter password in the VS Code prompt at the top of your VS Code window!')
  os.environ["OPENAI_API_KEY"] = getpass("Paste your OpenAI key from: https://platform.openai.com/account/api-keys\n")

# Set the key on the client whether it came from the environment or from the prompt
openai.api_key = os.getenv("OPENAI_API_KEY", "")

assert os.getenv("OPENAI_API_KEY", "").startswith("sk-"), "This doesn't look like a valid OpenAI API key"
print("OpenAI API key configured")

Please enter password in the VS Code prompt at the top of your VS Code window!
OpenAI API key configured


In [9]:
# Import necessary libraries
import openai
from tenacity import (
    retry,
    stop_after_attempt,
    wait_random_exponential, # for exponential backoff
)  
from rich.markdown import Markdown
import wandb
import time
import datetime

In [3]:
MODEL_NAME = "gpt-3.5-turbo"

system_prompt = "You are an AI tutor helping students prepare for machine learning coding interviews."
user_prompt = "Hi, can you give me an assignment? I'm just getting started."

# Prompt Templates

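Prompt templates are strings with named placeholders that parameters fill in at run time, which makes it easy to generate many prompt variants from one pattern. The cells that follow set up W&B tracking and define a helper that sends a system prompt and a user prompt to the model and logs each generation.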
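As a minimal sketch of what filling a template with parameters could look like, the helper below uses Python's built-in `str.format`. The names `fill_template`, `USER_TEMPLATE`, `level`, and `topic` are hypothetical and do not appear elsewhere in this notebook.

```python
# Hypothetical template: the placeholder names are illustrative assumptions
USER_TEMPLATE = "Hi, can you give me a {level} assignment about {topic}?"


def fill_template(template: str, **params: str) -> str:
    """Fill the named placeholders in `template` with `params`.

    Raises KeyError if the template uses a placeholder that was not supplied.
    """
    return template.format(**params)


example_prompt = fill_template(USER_TEMPLATE, level="beginner", topic="linear regression")
print(example_prompt)
```

The filled-in string can then be used wherever a plain `user_prompt` is expected.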

In [15]:
# Start a W&B run to track our experiments
wandb.init(project="llmapps")

# Define W&B Table to store generations
columns = ["system_prompt", "user_prompt", "generations", "elapsed_time", "timestamp",
           "model", "prompt_tokens", "completion_tokens", "total_tokens"]
table = wandb.Table(columns=columns)

In [16]:

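# Retry failed calls: waits are randomized, grow exponentially, and are capped at 60 s; give up after 6 attempts.
# Note: openai.ChatCompletion is the pre-1.0 interface of the openai package.
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def completion_with_backoff(**kwargs):
    return openai.ChatCompletion.create(**kwargs)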

def generate_and_print(system_prompt, user_prompt, table, n=1):
    """Request n completions, print each one, and log the call to a W&B Table."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    start_time = time.time()
    responses = completion_with_backoff(
        model=MODEL_NAME,
        messages=messages,
        n=n,
    )
    elapsed_time = time.time() - start_time
    # Collect the text of every returned choice
    generations = [choice.message.content for choice in responses.choices]
    for generation in generations:
        print(generation)
    # One table row per API call; the order matches the `columns` list
    table.add_data(
        system_prompt,
        user_prompt,
        generations,
        elapsed_time,
        datetime.datetime.fromtimestamp(responses.created),
        responses.model,
        responses.usage.prompt_tokens,
        responses.usage.completion_tokens,
        responses.usage.total_tokens,
    )

In [17]:
generate_and_print(system_prompt, user_prompt, table)

Of course! I can give you a simple coding assignment to begin with. Here's your task:

Write a Python function that checks if a given number is prime or not. The function should take an integer as input and return a boolean value indicating whether the number is prime or not.

Remember, a prime number is a number greater than 1 that has no positive divisors other than 1 and itself.

Take your time to work on this task, and let me know when you're ready to discuss your solution or have any questions.


In [22]:
system_prompt = """You are an AI tutor helping students prepare for machine learning coding interviews.
Your goal is to come up with learning assignments that will help students pass interviews. Specifically:
- Prompt students to solve a task that involves a simple machine learning concept and a coding exercise
- The task should be possible to solve in 30 minutes using a simple algorithm in Python
- The instruction should be minimal. Don't provide hints at this stage. 
- The task should be solvable by a student who has taken a machine learning course and has some coding experience
- The task should be interesting and fun to solve
- The task should advance the student's knowledge of machine learning
Start by summarizing what you're trying to achieve and your goals. Explain your reasoning behind the task and the way you present it. Then present the task concisely. 
Use this format:
REASONING: max 3 sentences  
TASK: max 3 sentences, no detailed instructions
"""
user_prompt = "Hi, can you give me an assignment? I'm just getting started."

In [23]:
generate_and_print(system_prompt, user_prompt, table)

REASONING: As a beginner, it's important to start with a simple task that covers a fundamental concept in machine learning. This will help you build a strong foundation and gain confidence in your abilities. 

TASK: Build a program that predicts whether a given email is spam or not spam, based on a set of pre-labeled emails. Use a binary classification algorithm of your choice to train a model using the provided dataset, and then use the trained model to predict the labels for a test set of emails. Finally, evaluate the accuracy of your model by comparing the predicted labels with the true labels.
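
A fixed output format such as `REASONING: ... TASK: ...` also makes generations easier to process in code. The sketch below shows one way this could be done with a regular expression; `parse_generation` is a hypothetical helper, and `sample` is a shortened, made-up string in the same format rather than a real model output.

```python
import re


def parse_generation(text: str) -> dict:
    """Split a generation of the form 'REASONING: ... TASK: ...' into its two parts.

    Returns empty strings for both parts if the text does not contain both
    labels in that order, since the model is not guaranteed to follow the format.
    """
    match = re.search(r"REASONING:(.*?)TASK:(.*)", text, flags=re.DOTALL)
    if match is None:
        return {"reasoning": "", "task": ""}
    return {"reasoning": match.group(1).strip(), "task": match.group(2).strip()}


# Made-up sample in the requested format (an assumption for illustration)
sample = "REASONING: Start with a fundamental concept.\n\nTASK: Build a spam classifier."
parsed = parse_generation(sample)
print(parsed["task"])
```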


In [24]:
system_prompt = """You are an AI tutor helping students prepare for machine learning coding interviews.
Your goal is to come up with learning assignments that will help students pass interviews. Specifically:
- Prompt students to solve a task that involves a simple machine learning concept and a coding exercise
- The task should be possible to solve in 30 minutes using a simple algorithm in Python
- The instruction should be minimal. Don't provide hints at this stage. 
- The task should be solvable by a student who has taken a machine learning course and has some coding experience
- The task should be interesting and fun to solve
- The task should advance the student's knowledge of machine learning
Example tasks by level:
- Beginner: calculate probability of 3 heads in 5 coin flips, count the number of times a word appears in a text
- Intermediate: implement a single neuron in Python, implement a simple decision tree in Python
- Advanced: implement backpropagation of a simple MLP in Python, implement a simple CNN in Python
Start by summarizing what you're trying to achieve and your goals. Explain your reasoning behind the task and the way you present it. Then present the task concisely. 
Use this format:
REASONING: max 3 sentences  
TASK: max 3 sentences, no detailed instructions
"""
user_prompt = "Hi, can you give me an assignment? I'm just getting started."

In [25]:
generate_and_print(system_prompt, user_prompt, table)

REASONING: As a beginner, it's important to start with simple and foundational concepts in machine learning. This will build a strong understanding of the basics before diving into more complex topics. By starting with a task like calculating the probability of coin flips, students will learn about probability and basic statistical concepts, while also gaining experience in coding.

TASK: Write a Python function that takes the number of coin flips `n` as input and calculates the probability of getting exactly `k` heads in `n` coin flips. The function should return the probability as a decimal value. For example, given `n=5` and `k=3`, the function should return `0.3125`. Use the formula for calculating the probability of a specific outcome in a binomial distribution.
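
The value quoted in this generation can be checked against the binomial formula `P(k) = C(n, k) * p^k * (1 - p)^(n - k)`, assuming a fair coin (`p = 0.5`). The function name `binomial_probability` is an illustrative choice, not part of the notebook.

```python
from math import comb


def binomial_probability(n: int, k: int, p: float = 0.5) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


print(binomial_probability(5, 3))  # 10 / 32 = 0.3125
```

Spot checks like this are a cheap way to confirm that a generated task statement is factually sound before showing it to a student.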


In [26]:
system_prompt = """You are an AI tutor helping students prepare for machine learning coding interviews.
Your goal is to come up with learning assignments that will help students pass interviews. Specifically:
- Prompt students to solve a task that involves a simple machine learning concept and a coding exercise
- The task should be possible to solve in 30 minutes using a simple algorithm in Python
- The instruction should be minimal. Don't provide hints at this stage. 
- The task should be solvable by a student who has taken a machine learning course and has some coding experience
- The task should be interesting and fun to solve
- The task should advance the student's knowledge of machine learning
Example tasks by level:
- Beginner: calculate probability of 3 heads in 5 coin flips, count the number of times a word appears in a text
- Intermediate: implement a single neuron in Python, implement a simple decision tree in Python
- Advanced: implement backpropagation of a simple MLP in Python, implement a simple CNN in Python
You'll be evaluated on:
- conciseness of the task description
- clarity of the task description
- creativity of the task
- matching the task to the student's level
- learning value of the task
Start by summarizing what you're trying to achieve and your goals. Explain your reasoning behind the task and the way you present it. Then present the task concisely. 
Use this format:
REASONING: max 1 sentence
TASK: max 5 short bullet points, no detailed instructions.
"""
user_prompt = "Hi, can you give me an assignment? I'm just getting started."

In [27]:
generate_and_print(system_prompt, user_prompt, table)

REASONING: A beginner-level assignment should be simple and cover a basic concept in machine learning.

TASK:
- Write a function in Python called "mean_squared_error" that takes in two lists of numbers, `y_true` and `y_pred`, and calculates the mean squared error between them.
- Calculate the mean squared error between the following two lists: [1, 2, 3, 4, 5] and [2, 4, 6, 8, 10].
- Implement a function called "linear_regression" that takes in two lists of numbers, `X` and `y`, and performs simple linear regression (fitting a line of the form `y = mx + b`) using the least squares method.
- Use the "linear_regression" function to find the best-fitting line for the following data points:
    X: [1, 2, 3, 4, 5]
    y: [2, 3, 4, 5, 6]
- Print the equation of the line (in the form `y = mx + b`) that best fits the given data points.


In [28]:
wandb.log({"simple_generations": table})
wandb.finish()

# APIs

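Prompts do not have to be written by hand: data fetched from an external API can be inserted into a prompt before it is sent to the model. The cell below is a placeholder for that step.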

In [None]:
# Code to fetch data from APIs and generate prompts goes here
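
One way this step could look, as a sketch only: an API typically returns JSON, and selected fields from that JSON can be placed into a prompt. To keep the example runnable offline, the API response is a hard-coded string and no network call is made. `SAMPLE_API_RESPONSE`, its field names, and `prompt_from_record` are all hypothetical.

```python
import json

# Stand-in for the body of an HTTP response; no real API is called here
SAMPLE_API_RESPONSE = '{"student": {"level": "beginner", "topics": ["probability", "text processing"]}}'


def prompt_from_record(raw_json: str) -> str:
    """Build a user prompt from a JSON record describing a student."""
    student = json.loads(raw_json)["student"]
    topics = ", ".join(student["topics"])
    return (
        "Hi, can you give me an assignment? "
        f"My level is {student['level']} and I want to practice: {topics}."
    )


api_user_prompt = prompt_from_record(SAMPLE_API_RESPONSE)
print(api_user_prompt)
```

The resulting string could be passed as `user_prompt` to `generate_and_print`.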


# Logging Results

We will log our results to a W&B Table. This will allow us to track our experiments and analyze the results.

In [None]:
run = wandb.init(project="prompt-engineering")
table = wandb.Table(columns=["Prompt", "Response"])

# Generate prompts and responses here...
# `prompt` and `response` must be defined by that step before the next line runs

table.add_data(prompt, response)
run.log({"results": table})
run.finish()

# Experiments

Now we will run experiments using different prompt templates and parameters. We will log the results to our W&B Table.

In [None]:
# Code to run experiments and log results goes here
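
A sketch of how such an experiment loop might be organized, assuming the grid consists of a few user-prompt templates crossed with student levels. `TEMPLATES`, `LEVELS`, and `build_experiment_prompts` are hypothetical names; the model call itself is left as a comment so the block runs without an API key.

```python
from itertools import product

# Hypothetical experiment grid: two templates crossed with three levels
TEMPLATES = [
    "Hi, can you give me an assignment? My level is {level}.",
    "I am at the {level} level. What should I practice next?",
]
LEVELS = ["beginner", "intermediate", "advanced"]


def build_experiment_prompts(templates, levels):
    """Return one filled-in prompt for every (template, level) combination."""
    return [template.format(level=level) for template, level in product(templates, levels)]


experiment_prompts = build_experiment_prompts(TEMPLATES, LEVELS)
for experiment_prompt in experiment_prompts:
    # In the notebook this could be: generate_and_print(system_prompt, experiment_prompt, table)
    print(experiment_prompt)
```

Because every call is logged as one table row, the variants can be compared side by side in W&B afterwards.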

# Conclusion

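In this notebook, we explored prompt engineering by calling the OpenAI chat API with retries and exponential backoff, then refining a system prompt in stages: adding task constraints, example tasks by level, evaluation criteria, and a stricter output format. Each generation was logged to a W&B Table together with its latency and token counts, so prompt versions can be compared side by side. The sections on APIs and experiments mark where API-sourced data and sweeps over templates and parameters fit into the same workflow.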