## Homework 2, Part 1, CS678 Fall 2024

### This is due on **October 11th, 2024**. Please read the report PDF for submission instructions.
### **Note that this is only Part 1 of the homework.**

#### **IMPORTANT**: After copying this notebook to Google Drive or OneDrive, please paste a link to it in the PDF report ("Your Notebook solution"). To get a publicly accessible link, hit the *Share* button at the top right, then click "Get shareable link" and copy over the result. If you fail to do this, you will receive no credit for this homework!

---

##### *How to do this problem set:*

- Some questions require writing Python code and computing results, and the rest of them have written answers. For coding problems, you will have to fill out all code blocks that say `YOUR CODE HERE`.

- This assignment is designed so that you can run all cells almost instantly. If it is taking longer than that, you have made a mistake in your code.

- Note that there are more questions in the PDF than the ones present in this notebook (which only includes the ones requiring code).

---

##### *How to submit this problem set:*
- After filling in the missing code, provide all the answers in the LaTeX template released with the assignment. Once again, you should create a shareable link to your completed notebook and paste it into the LaTeX report. The PDF report compiled from the LaTeX template should be submitted to Gradescope.
  
---

##### *Academic honesty*

- We will audit the notebooks from a set number of students, chosen at random. The audits will check that the code you wrote actually generates the answers in your PDF. If you turn in correct answers on your PDF without code that actually generates those answers, we will consider this a serious case of cheating. See the course page for honesty policies.

- We will also run automatic checks of notebooks for plagiarism. Copying code from others is also considered a serious case of cheating.

---

### Task 0: Environment Configuration

#### Step 1: Set up an OpenAI API key
Set up your OpenAI API key below. If you don't have one, create one on OpenAI's website: https://platform.openai.com/api-keys.
This assignment will mainly use **gpt-4o-mini**. Its pricing can be found here: https://openai.com/api/pricing/ (\$0.150 / 1M input tokens, \$0.600 / 1M output tokens).

**NOTE: Please delete your key after you complete this homework. This is your private key that should not be shared with others (including instructor/TA).**

In [2]:
OPENAI_API_KEY='My api key' # REPLACE
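
Since the key must not be shared, one option is to keep it out of the notebook entirely. The following is a minimal sketch, assuming the key has been exported beforehand in an environment variable named `OPENAI_API_KEY`; the helper name `load_api_key` is hypothetical and not part of the assignment.

```python
import os

def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the key stored in the given environment variable.

    Hypothetical helper: raises if the variable is unset or empty,
    so a missing key fails early instead of at the first API call.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

In the notebook, `OPENAI_API_KEY = load_api_key()` would then replace the hard-coded string, so the shared copy never contains the key.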

#### Step 2: Install the openai Python library

To complete this notebook, we will use the "openai" library for calling OpenAI's language models.

Execute the following command to pip install the library.

In [3]:
#!pip install openai

Now, you should be able to run the following code, which gives a response to an input message "Hello!"

Specifically,
- `client = OpenAI(api_key=OPENAI_API_KEY)` creates a client authenticated with your private API key;
- `client.chat.completions.create` calls OpenAI's chat completion function (https://platform.openai.com/docs/api-reference/chat/create);
    - Field `model` specifies the LLM version to use, here being "gpt-4o-mini"
    - Field `messages` contains the chat history which is used to prompt the LLM for a response, including
        - `{"role": "system", "content": "You are a helpful assistant."}` which specifies the system description (being a helpful assistant),
        - `{"role": "user", "content": "Hello!"}` which specifies the user input "Hello!"

The returned chat completion object (https://platform.openai.com/docs/api-reference/chat/object) includes one possible response (`choices[0]`) whose message content is "Hello! How can I assist you today?"

You can also have a look at: https://cookbook.openai.com/examples/how_to_format_inputs_to_chatgpt_models

In [4]:
from openai import OpenAI
client = OpenAI(api_key=OPENAI_API_KEY)

completion = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message.content)

Hello! How can I assist you today?


In this assignment, we will use this chat completion function to prompt gpt-4o-mini for a few tasks. For convenience, let's define the following wrapper function called "ChatCompletion" on top of OpenAI's chat completion.

Note that in the function, we have included two additional arguments to the API call:
- `n_samples` is passed as the argument `n` to `client.chat.completions.create`, which specifies the number of samples requested from the LLM;
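- `top_p` is passed as the argument `top_p` to `client.chat.completions.create`, which specifies the cumulative probability mass of the most likely tokens to sample from (for example, `top_p=0.5` restricts sampling to the top tokens that together make up 50% of the probability mass).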

In [5]:
def ChatCompletion(prompt, n_samples=1, top_p=1.0, return_object=False):
    assert n_samples >= 1
    assert top_p <= 1 and top_p > 0
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        n=n_samples,
        top_p=top_p
    )

    if n_samples == 1:
        print("Response: ", completion.choices[0].message.content)
    else:
        print("The call returns %d responses:\n" % n_samples)
        for i in range(n_samples):
            print("*Response %d*: " % i, completion.choices[i].message.content)
            print("-" * 10)

    if return_object:
        return completion

### Task 1: Story Generation with Different Sampling Strategies

In the first question, we will learn about the effects of a sampling approach called "nucleus sampling" on generation. We will try different configurations of it by varying `top_p`.
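
To make the role of `top_p` concrete, here is a toy sketch of the filtering step in nucleus sampling. It is an illustration only: the token probabilities are made up, the function name `nucleus_filter` is hypothetical, and it is not how OpenAI's service is implemented internally.

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose total
    mass reaches top_p, then renormalize the kept probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break
    return {token: p / mass for token, p in kept}

# Made-up next-token distribution for "... accepted at a university in ___"
toy = {"London": 0.5, "Toronto": 0.25, "Sydney": 0.125, "Oslo": 0.125}

print(nucleus_filter(toy, 1.0))  # every token stays available
print(nucleus_filter(toy, 0.7))  # only the two most likely tokens remain
print(nucleus_filter(toy, 0.5))  # only the single most likely token remains
```

A smaller `top_p` shrinks the candidate set at every step, so repeated generations tend to look more alike.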


### Question 1 (5 points)

Can you use the ChatCompletion function to generate a story about an Indian student studying abroad (e.g., at George Mason University)? Please use the default setting and generate only one story.

In [6]:
# YOUR CODE HERE
ChatCompletion("generate a story about an Indian student studying abroad")

Response:  **Title:** The Journey Beyond Borders

Arjun Sharma was like many other students from New Delhi—ambitious, curious, and eager to explore the world. Since childhood, he held a dream of studying abroad, fueled by stories of students who had returned with diverse experiences and knowledge that transformed their lives. After months of rigorous preparation, he received an acceptance letter from the prestigious University of Edinburgh in Scotland, and his dream was finally becoming a reality.

Arriving in Edinburgh, Arjun was struck by the city’s historic charm. Cobblestone streets wound around ancient buildings, and the towering Edinburgh Castle served as a sentinel over the city. At first, everything felt surreal; he was both excited and nervous. The moment he stepped into his new dorm room, he took a deep breath and tried to absorb the weight of his new beginning.

Classes began, and Arjun was introduced to a melting pot of cultures. He met students from England, Nigeria, Brazi

### Question 2 (5 points)

Now, can you do the same but try to get 2 generations with `top_p` set to be 1?

In [7]:
# YOUR CODE HERE
ChatCompletion("generate a story about an Indian student studying abroad",top_p=1)
ChatCompletion("generate a story about an Indian student studying abroad",top_p=1)

Response:  **Title: A Journey Beyond Borders**

In the heart of Mumbai, India, lived a bright and ambitious student named Aanya. She had always dreamed of exploring the world beyond her vibrant metropolis, fueled by stories of adventure from her grandfather, who had traveled extensively during his youth. After years of hard work, Aanya received an acceptance letter from a prestigious university in London to study International Relations. It was the opportunity of a lifetime, and she was determined to make the most of it.

As she boarded the plane, Aanya's heart raced with excitement and nervousness. The vastness of the ocean below was a metaphor for her new journey, as she prepared to dive into a culture that was both fascinating and intimidating. Upon her arrival at Heathrow, the cool breeze and the eclectic chatter of accents welcomed her to a bustling metropolis that was a far cry from the warmth of Mumbai.

Aanya settled into her new life in London, sharing a quaint flat with three

### Question 3 (5 points)

How about 2 generations with `top_p` set to be 0.5?

In [8]:
# YOUR CODE HERE
ChatCompletion("generate a story about an Indian student studying abroad",top_p=0.5)
ChatCompletion("generate a story about an Indian student studying abroad",top_p=0.5)

Response:  **Title: A Journey Beyond Borders**

In the bustling city of Pune, India, a bright and ambitious student named Aarav Sharma was preparing for the biggest adventure of his life. With a scholarship in hand, he was set to study computer science at a prestigious university in Canada. The night before his departure, Aarav sat in his room, surrounded by family photos and the familiar scent of his mother’s cooking wafting through the air. His heart raced with excitement and a tinge of anxiety.

“Remember, Aarav,” his father said, placing a reassuring hand on his shoulder, “this is an opportunity to learn and grow. Embrace every moment.”

With his parents’ words echoing in his mind, Aarav boarded the plane the next day, his heart a mix of hope and trepidation. The flight was long, but he spent the time imagining the new life that awaited him. As the plane touched down in Toronto, he felt a wave of exhilaration wash over him. The city was alive with energy, a stark contrast to the fa

### Question 4 (10 points)

What did you observe from Q1 - Q3? Did the different `top_p` configurations give you the same or different results? Why?

### Task 2: gpt-4o-mini for Solving Mathematical Problems

The second task we will try is about solving a math problem.

The math problem we consider is:

> Melanie is a door-to-door saleswoman. She sold a third of her vacuum cleaners at the green house, 2 more to the red house, and half of what was left at the orange house. If Melanie has 5 vacuum cleaners left, how many did she start with?

For your reference, the correct answer should be 18, following the reasoning chain below:

> First multiply the five remaining vacuum cleaners by two to find out how many Melanie had before she visited the orange house: 5 * 2 = 10;
> Then add two to figure out how many vacuum cleaners she had before visiting the red house: 10 + 2 = 12;
> Now we know that 2/3 * x = 12, where x is the number of vacuum cleaners Melanie started with. We can find x by dividing each side of the equation by 2/3, which produces x = 18
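
The reference reasoning can be checked with a few lines of arithmetic. This is a small sketch, not part of the assignment: it works backwards from the 5 remaining vacuum cleaners, and the hypothetical helper `remaining_after_sales` replays the sales forwards to confirm the result.

```python
from fractions import Fraction

def remaining_after_sales(start):
    """Replay the three sales forwards and return what is left."""
    after_green = start - Fraction(start, 3)  # sold a third
    after_red = after_green - 2               # sold 2 more
    after_orange = after_red / 2              # sold half of the rest
    return after_orange

# Work backwards from the 5 vacuum cleaners that are left.
left = 5
before_orange = left * 2               # 5 * 2 = 10
before_red = before_orange + 2         # 10 + 2 = 12
start = before_red / Fraction(2, 3)    # 2/3 * x = 12  =>  x = 18

print(start)                           # 18
print(remaining_after_sales(start))    # 5
```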


### Question 5 (5 points)
Can you use the ChatCompletion function and prompt gpt-4o-mini to work out the problem?

In [9]:
math_problem = 'Melanie is a door-to-door saleswoman. She sold a third of her vacuum cleaners at the green house, 2 more to the red house, and half of what was left at the orange house. If Melanie has 5 vacuum cleaners left, how many did she start with?'

# YOUR CODE HERE
ChatCompletion(math_problem)

Response:  Let \( x \) be the number of vacuum cleaners Melanie started with.

1. First, she sold a third of her vacuum cleaners at the green house:
   \[
   \text{Vacuum cleaners sold at the green house} = \frac{x}{3}
   \]
   After this sale, the number of vacuum cleaners left is:
   \[
   x - \frac{x}{3} = \frac{3x}{3} - \frac{x}{3} = \frac{2x}{3}
   \]

2. Next, she sold 2 more vacuum cleaners at the red house:
   \[
   \text{Vacuum cleaners left after the red house} = \frac{2x}{3} - 2
   \]

3. Then, she sold half of what was left at the orange house:
   \[
   \text{Vacuum cleaners sold at the orange house} = \frac{1}{2} \left(\frac{2x}{3} - 2\right)
   \]
   The remaining vacuum cleaners after selling at the orange house is:
   \[
   \text{Vacuum cleaners left after the orange house} = \frac{2x}{3} - 2 - \frac{1}{2} \left(\frac{2x}{3} - 2\right)
   \]

   We can simplify this. First, let's calculate:
   \[
   \frac{1}{2} \left(\frac{2x}{3} - 2\right) = \frac{2x}{6} - 1 = \frac{x}

Did gpt-4o-mini solve the problem correctly? If not, where did it go wrong?

### Question 6 (10 points)

Now, try to get 10 solutions from gpt-4o-mini with `top_p` set to 0.7.

In [13]:
# YOUR CODE HERE
ChatCompletion(math_problem, n_samples=10, top_p=0.7)

The call returns 10 responses:

*Response 0*:  Let \( x \) be the number of vacuum cleaners Melanie started with.

1. She sold a third of her vacuum cleaners at the green house:
   \[
   \text{Sold at green house} = \frac{x}{3}
   \]
   After this sale, she has:
   \[
   x - \frac{x}{3} = \frac{2x}{3}
   \]

2. Next, she sold 2 more at the red house:
   \[
   \text{Sold at red house} = 2
   \]
   After this sale, she has:
   \[
   \frac{2x}{3} - 2
   \]

3. Then, she sold half of what was left at the orange house. The amount left before this sale is \( \frac{2x}{3} - 2 \), so she sold:
   \[
   \text{Sold at orange house} = \frac{1}{2} \left( \frac{2x}{3} - 2 \right)
   \]
   After this sale, she has:
   \[
   \left( \frac{2x}{3} - 2 \right) - \frac{1}{2} \left( \frac{2x}{3} - 2 \right)
   \]

   To simplify this expression, we can factor out \( \left( \frac{2x}{3} - 2 \right) \):
   \[
   \text{Remaining} = \frac{1}{2} \left( \frac{2x}{3} - 2 \right)
   \]

4. We know from the problem

You may see multiple different answers produced by gpt-4o-mini. Summarize the answers in the table in the report. Did gpt-4o-mini get all of the solutions right? If there are any mistakes, what common errors does gpt-4o-mini make?
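
One way to summarize many sampled solutions is to pull a final number out of each response and count how often each value occurs. The sketch below assumes that the last number in a response is its final answer, which is only a heuristic and can fail on truncated or LaTeX-heavy output; the helper names `extract_final_number` and `tally_answers` are hypothetical.

```python
import re
from collections import Counter

def extract_final_number(text):
    """Return the last number mentioned in a response as a string,
    or None if the response contains no number (heuristic)."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    return numbers[-1] if numbers else None

def tally_answers(responses):
    """Count how often each extracted final answer appears."""
    return Counter(extract_final_number(r) for r in responses)

# Made-up responses, standing in for the model's actual outputs.
sample = [
    "... so x = 18",
    "She started with 18 vacuum cleaners.",
    "The answer is 12.",
]
print(tally_answers(sample).most_common())

# With the wrapper defined in this notebook, the real responses could be
# collected like this (not executed here because it calls the API):
# completion = ChatCompletion(math_problem, n_samples=10, top_p=0.7, return_object=True)
# tally_answers(choice.message.content for choice in completion.choices)
```

The resulting counts should still be checked by hand against the full responses before they go into the report table.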

### Question 7 (10 points)

Can you try other ways to prompt gpt-4o-mini so that it gives correct solutions more reliably? Be creative!

It may be helpful to design your prompt with multiple math problems in mind, so we provide another one below:

The problem is:
> John drives for 3 hours at a speed of 60 mph and then turns around because he realizes he forgot something very important at home.  He tries to get home in 4 hours but spends the first 2 hours in standstill traffic.  He spends the next half-hour driving at a speed of 30mph, before being able to drive the remaining time of the 4 hours going at 80 mph.  How far is he from home at the end of those 4 hours?

For your reference, the correct answer is 45:
> When he turned around he was 3*60=180 miles from home
> He was only able to drive 4-2=2 hours in the first four hours.
> In half an hour he goes 30*.5=15 miles. He then drives another 2-.5=1.5 hours. In that time he goes 80*1.5=120 miles. So he drove 120+15=135 miles
> So he is 180-135=45 miles away from home
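
The same reference arithmetic, written out as a short sketch so that each step can be verified (the variable names are illustrative):

```python
distance_out = 60 * 3                    # miles from home when he turns around
driving_time = 4 - 2                     # hours actually spent moving
slow_leg = 30 * 0.5                      # miles covered at 30 mph
fast_leg = 80 * (driving_time - 0.5)     # miles covered at 80 mph
distance_from_home = distance_out - (slow_leg + fast_leg)

print(distance_from_home)                # 45.0
```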

Include your prompt design and the answer on the report. Why do you think it works or not?
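
As one possible starting point (not the expected answer), a prompt can show the model a worked example before the new question and then ask for step-by-step reasoning. The sketch below only builds the prompt string. The helper name `build_few_shot_prompt` and the `Q:`/`A:` layout are assumptions, and the worked solution is condensed from the reference reasoning given earlier in this notebook.

```python
def build_few_shot_prompt(examples, question):
    """Join (question, worked solution) pairs, then append the new
    question with a cue asking for step-by-step reasoning."""
    parts = [f"Q: {q}\nA: {solution}" for q, solution in examples]
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)

worked_example = (
    "Melanie sold a third of her vacuum cleaners at the green house, "
    "2 more to the red house, and half of what was left at the orange "
    "house. If she has 5 left, how many did she start with?",
    "Before the orange house she had 5 * 2 = 10. Before the red house "
    "she had 10 + 2 = 12. Since 2/3 * x = 12, x = 18. The answer is 18.",
)

prompt = build_few_shot_prompt([worked_example], "How far is John from home?")
print(prompt)
```

The resulting string can be passed to `ChatCompletion` in place of the bare problem text. Whether it actually improves reliability has to be measured by sampling several responses.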

In [14]:
# YOUR CODE HERE
new_math_problem = "John drives for 3 hours at a speed of 60 mph and then turns around because he realizes he forgot something very important at home.  He tries to get home in 4 hours but spends the first 2 hours in standstill traffic.  He spends the next half-hour driving at a speed of 30mph, before being able to drive the remaining time of the 4 hours going at 80 mph.  How far is he from home at the end of those 4 hours?"
ChatCompletion("Let us solve this problem step by step:- "+new_math_problem)

Response:  Let's break down the problem step by step.

### Step 1: Calculate the distance John drives away from home
John drives for 3 hours at a speed of 60 mph. 

Distance = Speed × Time  
Distance = 60 mph × 3 hours = 180 miles

So, John is 180 miles away from home when he turns around.

### Step 2: Analyze the return journey
John attempts to return home, and he has a total of 4 hours for this journey, but the first 2 hours are spent in standstill traffic. During this time, he does not cover any distance.

### Step 3: Calculate distance covered in the next half-hour
After the standstill traffic, he drives for 0.5 hours (30 minutes) at a speed of 30 mph.

Distance = Speed × Time  
Distance = 30 mph × 0.5 hours = 15 miles

### Step 4: Determine the remaining time for driving
After the first 2 hours in traffic and the half hour driving at 30 mph, 2.5 hours have passed out of the 4 hours. 

Remaining time = 4 hours - 2.5 hours = 1.5 hours

### Step 5: Calculate distance covered in the f

#### Acknowledgement: The math problems used in this notebook come from the GSM8k dataset: Training Verifiers to Solve Math Word Problems, Cobbe et al., 2021. https://huggingface.co/datasets/gsm8k