**MIE 451/1513 Lab 5: Prompt Engineering**

This lab uses GPT models hosted by OpenAI and can be completed either by:

1.   Copying the prompts into the OpenAI web interface.
2.   Acquiring an API key and running the code.

Please see the Assignment PDF for instructions on how to set up the web interface and API key.

This lab was designed with the help of the following materials. You are encouraged to check these references for more details on prompt engineering:
- OpenAI API reference:
    - https://platform.openai.com/docs/api-reference/introduction

- Prompt engineering guides:
    - https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api
    - https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/

#Task Description
We will explore prompting styles for the broad task of text classification. Your assignment involves designing an experiment to test various prompting styles for a different natural language (NL) task.

**Text classification:** Identify aspects such as sentiment or key topics in spans/collections of text. Example subtasks:
  * Sentiment analysis:
    * Identify positive/negative attitudes and the strength of these attitudes.
    * Identify attitude types from a fixed set such as { "like", "love", "hate", "value", ... } or by generating open-world labels (i.e., short, non-predefined tags).
  * Identify topics of news articles using either a fixed topic set or open-world labels.


#API Setup


In [None]:
!pip install cohere
!pip install tiktoken
!pip install openai

In [None]:
import os
from getpass import getpass
from openai import OpenAI
import cohere
import tiktoken

In [None]:
#Wraps text to fit on screen without needing horizontal scrolling
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

In [None]:
#Uploading your OpenAI API Key
#NEVER share API keys with anyone - if your payment information is associated with your key, anyone with your key can charge you!

#Paste your API key into the prompt window and hit enter. OPENAI_API_KEY becomes your API key as a string.
OPENAI_API_KEY = getpass("Enter your OpenAI API key:")

Enter your OpenAI API key:··········
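As an alternative to pasting the key every session, the key can be read from an environment variable when one is set, with the interactive prompt as a fallback. This is a minimal sketch, assuming the variable is named `OPENAI_API_KEY`; the helper `load_api_key` and its parameters are hypothetical and not part of the lab code.

```python
import os
from getpass import getpass


def load_api_key(env=None, prompt_fn=getpass):
    """Return the API key from the environment, or prompt for it if absent."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()  # assumed variable name
    if not key:
        # Fall back to the interactive prompt used in the lab.
        key = prompt_fn("Enter your OpenAI API key:").strip()
    return key
```

Calling `OPENAI_API_KEY = load_api_key()` would then replace the `getpass` line above. Either way, the key should never be written into the notebook itself.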


In [None]:
#instantiate OpenAI client
client = OpenAI(
    api_key = OPENAI_API_KEY
)

In [None]:
#prompt gpt-3.5-turbo
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello Gpt."}
  ],
  temperature = 0
)
print(response.choices[0].message.content)

Hello! How can I assist you today?


The above code prompts the `gpt-3.5-turbo-1106` model through the Chat Completions API.

**Messages**: `messages` provides the conversation history as a list of messages. This lab focuses on single-turn interactions. In a single-turn interaction, the first message describes the system's role and the second message contains the prompt.

The default system role is described by the prompt "You are a helpful assistant." We will not be working with alternative system roles in this lab, but you are welcome to refer to the API documentation and explore this parameter.

The prompt is placed in the `content` field of the second message.
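The two-message structure described above can be wrapped in a small helper so that each experiment only has to supply the prompt text. This is a sketch; the function name `build_messages` is hypothetical and does not appear in the lab code.

```python
def build_messages(prompt, system_role="You are a helpful assistant."):
    """Build the two-message list for a single-turn interaction."""
    return [
        {"role": "system", "content": system_role},  # describes the system's role
        {"role": "user", "content": prompt},         # carries the prompt itself
    ]


messages = build_messages("Hello Gpt.")
print(messages[1]["content"])  # prints "Hello Gpt."
```

The returned list can be passed directly as the `messages` argument of `client.chat.completions.create`.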

**Temperature**: `temperature` ranges from 0 to 2 and determines the randomness of the response. A temperature closer to 0 gives less variance in responses but is more likely to produce repeated outputs. A higher temperature increases response randomness. However, even setting `temperature = 0` does not guarantee deterministic responses.
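To build intuition for why lower temperatures reduce variance, the sketch below applies temperature to a toy next-token distribution. It assumes the common formulation in which scores (logits) are divided by the temperature before a softmax; this lab does not describe how the hosted models implement sampling internally, and the logits and the function name `softmax_with_temperature` are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after scaling by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature shrinks, the probability mass concentrates on the highest-scoring token, so sampled outputs vary less.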

**Response**: By default, one response object is returned in `response.choices[0]`, and the response message is accessed as `response.choices[0].message.content`. See the API reference for more details on how multiple responses can be returned and what other information is included in the output.


#Zero-shot prompting

Zero-shot (ZS) prompts, as opposed to few-shot (FS) prompts, do not include examples of expected input and output pairs.

Let's create some ZS prompts for text classification. Note that these prompts ignore some of the best practices introduced in the instruction engineering section below; they mainly serve as baselines.

**Web interface users:** To obtain an output, paste the prompt (located between the `"""` multiline string indicators in the API call) into the ChatGPT web application. Though ChatGPT3.5 currently uses a similar model to `gpt-3.5-turbo-1106`, the output may not be exactly the same as below due to randomness during response generation.






###ZS: Positive/Negative sentiment

Task: Given a set of course reviews, identify whether the sentiment of each review is positive or negative.

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."

    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."

    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."

    For each course review, identify whether the sentiment is positive or negative.
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

Course review 1: Positive
Course review 2: Negative
Course review 3: Mixed (leaning towards positive)


###ZS: Positive/Negative sentiment strength

Task: Perform the same task as above, but also identify the strength of each sentiment.

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."

    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."

    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."

    For each course review, identify whether the sentiment is positive or negative and the strength of the sentiment.
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

Course review 1: The sentiment is positive with a strong strength.

Course review 2: The sentiment is negative with a moderate strength.

Course review 3: The sentiment is mixed, leaning towards positive, with a moderate strength.


###ZS: Positive/Negative sentiment, sentiment strength, tags

Task: Perform the same task as above, but also add open-world labels (i.e., short, non-predefined tags) about the sentiment of each review.

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."

    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."

    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."

    For each course review, identify whether the sentiment is positive or negative and the strength of the sentiment. Describe the sentiment in each review with some tags.
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

Course review 1: 
Sentiment: Positive
Strength: Strong
Tags: Hands-on approach, practical sessions, deeply enhanced understanding, highly valuable

Course review 2: 
Sentiment: Negative
Strength: Moderate
Tags: Overwhelming, dense material, fast pace, relevance to real-world applications

Course review 3: 
Sentiment: Mixed
Strength: Moderate
Tags: Demanding, heavy focus on theory, ultimately rewarding, interested in technicalities


###ZS: Adding a "neutral" review.

Let's add some reviews (4 and 5) with neutral or mixed sentiment. This can cause the LLM to output a Neutral label in addition to Positive/Negative, which could be undesirable.

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."

    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."

    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."

    Course review 4: "The course was OK"

    Course review 5: "I liked some parts of the course and disliked others."

    For each course review, identify whether the sentiment is positive or negative and the strength of the sentiment. Describe the sentiment in each review with some tags.
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

Sure, here are the sentiment analyses for each course review:

1. Course review 1: Positive sentiment, strong - The student found the hands-on approach and practical sessions highly valuable, enhancing their understanding of decision support systems. Tags: Positive, Strong

2. Course review 2: Negative sentiment, moderate - The student found the course overwhelming due to dense material and fast-paced lectures, despite its real-world relevance. Tags: Negative, Moderate

3. Course review 3: Mixed sentiment, moderate - The course was demanding with a heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems. Tags: Mixed, Moderate

4. Course review 4: Neutral sentiment - The student found the course to be okay. Tags: Neutral

5. Course review 5: Mixed sentiment, mild - The student liked some parts of the course and disliked others. Tags: Mixed, Mild


#Instruction Engineering (IE)

What are the [best practices](https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api) to structure a ZS prompt?

So far we've seen two prompt [elements](https://www.lambdatest.com/learning-hub/prompt-engineering): **Instructions** and **Input Data**. In the above prompts, the input data came first and the instructions came at the end. However, it is recommended to:


1.   Put instructions at the beginning.
2.   Separate prompt components with **delimiters** such as #### or '''':

    * Delimiters are typically included during LLM fine-tuning.
    * Choose delimiters that do not occur naturally in text, such as #### or ====.
    * Be consistent. For example, if you separate inputs with ####, use the same delimiter for every input.
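The recommendations above can be sketched as a small prompt builder that puts the instruction first and wraps every input in the same delimiters. The function name `build_zs_prompt` and its parameters are hypothetical; the delimiter choices mirror the ones suggested in this section.

```python
def build_zs_prompt(instruction, reviews, item_delim="####", block_delim="''''"):
    """Assemble a zero-shot prompt: instructions first, then delimited inputs."""
    lines = [instruction, "", "Course reviews:", block_delim, item_delim]
    for i, review in enumerate(reviews, start=1):
        lines.append(f'Course review {i}: "{review}"')
        lines.append(item_delim)  # same delimiter after every input
    lines.append(block_delim)
    return "\n".join(lines)


prompt = build_zs_prompt(
    "For each course review, identify whether the sentiment is positive or negative.",
    ["The course was OK", "I liked some parts of the course and disliked others."],
)
print(prompt)
```

Generating the prompt in code, rather than editing a long string by hand, makes it harder to accidentally mix delimiters between inputs.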



Let's apply these suggestions to the first ZS prompt (positive/negative sentiment):


##IE: Instructions first and delimiters (positive/negative sentiment)


In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative.

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####
    ''''
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

Course review 1: Positive
Course review 2: Negative
Course review 3: Positive


##IE: Output format and context (Positive/Negative sentiment)

Two other common elements of a prompt are **output format** and **context**.

For example, we can add **output format** instructions such as "For review `i`, output "R`i`: positive" for positive sentiment and "R`i`: negative" for negative sentiment. Separate each review with a newline."

A prompt can also include **context** that does not constitute an explicit instruction but provides useful information about the task. This distinction can sometimes be ambiguous, so context is often omitted in ZS prompts, though an example of using a separate ZS context component is shown below. Most often, the context component takes the form of few-shot (FS) examples, discussed in the next section.

Adding these two components:

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative. For review `i`, output "R`i`: positive" for positive sentiment and "R`i`: negative" for negative sentiment. Separate each review with a newline.

    Context:
    ''''
    The results of this sentiment analysis will be used for automated numerical analysis.
    ''''

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####
    ''''
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

R1: positive
R2: negative
R3: positive
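A fixed output format is what makes the automated analysis mentioned in the context practical, because each line can be parsed mechanically. The sketch below parses output in the `R<i>: positive|negative` format; the function name `parse_labels` and the +1/-1 encoding are assumptions made for illustration.

```python
import re

LINE_PATTERN = re.compile(r"^R(\d+):\s*(positive|negative)\s*$")


def parse_labels(output_text):
    """Map review index -> +1 (positive) or -1 (negative)."""
    scores = {}
    for line in output_text.strip().splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            # Fail loudly if the model deviates from the requested format.
            raise ValueError(f"Unexpected output line: {line!r}")
        index, label = int(match.group(1)), match.group(2)
        scores[index] = 1 if label == "positive" else -1
    return scores


scores = parse_labels("R1: positive\nR2: negative\nR3: positive")
print(scores)  # {1: 1, 2: -1, 3: 1}
```

Raising an error on unexpected lines is a deliberate choice here: silent skipping would hide cases where the model ignored the format instruction.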


#Few-shot prompting
Many of the benefits of instruction engineering can also be achieved with few-shot (FS) prompting, where the examples function as context.

An FS prompt includes one or more examples of the desired output given an input. As with the input data, it is best to clearly separate these examples with delimiters and maintain consistent formatting.

Let's remove the output format instruction and see if the LLM can infer the desired output format given only a few-shot example (this is for demonstration purposes; you could both specify the output format and show it with examples if needed).

We'll also end the prompt with the leading output word `Output:`, which is consistent with the output format shown in the example. Adding this leading word is meant to make the LLM more likely to follow the demonstrated output format.
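The layout described above (instruction, delimited example with its output, delimited inputs, leading output word) can be assembled programmatically. This is a sketch; `format_reviews` and `build_fs_prompt` are hypothetical helper names, and the example review texts are shortened placeholders.

```python
def format_reviews(reviews, item_delim="####"):
    """Number the reviews and separate them with a consistent delimiter."""
    lines = [item_delim]
    for i, review in enumerate(reviews, start=1):
        lines.append(f'Course review {i}: "{review}"')
        lines.append(item_delim)
    return lines


def build_fs_prompt(instruction, examples, reviews, block_delim="''''"):
    """Build a few-shot prompt; examples is a list of (review_text, label) pairs."""
    example_reviews = [text for text, _ in examples]
    example_outputs = [f"R{i}: {label}" for i, (_, label) in enumerate(examples, start=1)]
    lines = [instruction, "", "Example:", block_delim]
    lines += format_reviews(example_reviews)
    lines += ["", "Output:"] + example_outputs + [block_delim, ""]
    lines += ["Course reviews:", block_delim]
    lines += format_reviews(reviews)
    lines += ["", "Output:"]  # leading output word, matching the example
    return "\n".join(lines)


prompt = build_fs_prompt(
    "For each course review, identify whether the sentiment is positive or negative.",
    [("The lectures were insightful.", "positive"), ("The pace was rough.", "negative")],
    ["The course was OK"],
)
print(prompt)
```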

##FS: Positive/Negative sentiment

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative.

    Example:
    ''''
    ####
    Course review 1: "MIE451/1513's lectures were insightful, offering a deep dive into decision support systems that was incredibly beneficial."
    ####
    Course review 2: "MIE451/1513 was kinda rough with the pace and workload, even if it did up my real-world skills."
    ####

    Output:
    R1: positive
    R2: negative
    ''''

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####

    Output:
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

R1: positive
R2: negative
R3: positive


##FS: Positive/Negative sentiment, sentiment strength
Let's add identifying sentiment strength to the task and see if the output format is inferred from the examples.

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative, and the strength of the sentiment.

    Example:
    ''''
    ####
    Course review 1: "MIE451/1513's lectures were insightful, offering a deep dive into decision support systems that was incredibly beneficial."
    ####
    Course review 2: "MIE451/1513 was kinda rough with the pace and workload, even if it did up my real-world skills."
    ####

    Output:
    R1: positive, strong
    R2: negative, mild
    ''''

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####

    Output:
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

R1: positive, strong
R2: negative, mild
R3: positive, moderate


## Positional Bias in FS Examples
LLMs are prone to positional bias.

In deployed systems, it is best to randomize the order of FS examples. Otherwise, the LLM may infer patterns based on the position of examples - for instance, if all the negative example reviews are at the end, the LLM will be more likely to label the later input reviews as negative.
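Randomizing the example order can be done before the prompt is assembled. A minimal sketch using the standard library follows; the function name `shuffle_examples` and the example reviews are made up for illustration.

```python
import random


def shuffle_examples(examples, seed=None):
    """Return a shuffled copy of the few-shot examples; the input is not modified."""
    rng = random.Random(seed)  # a seed makes the order reproducible for experiments
    shuffled = list(examples)
    rng.shuffle(shuffled)
    return shuffled


examples = [
    ("Lectures were insightful.", "positive"),
    ("Great practical sessions.", "positive"),
    ("The pace was rough.", "negative"),
    ("Too much workload.", "negative"),
]
print(shuffle_examples(examples, seed=0))
```

Leaving `seed=None` gives a different order on each call, which suits deployment; fixing the seed is useful when an experiment needs to be repeated exactly.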

# Negative Prompts (positive/negative sentiment, sentiment strength)
LLMs often fail at understanding negation (e.g., see example 7 [here](https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api#h_1f4c9c5fa1)).

Let's add back the reviews that are sometimes classified as neutral (reviews 4 and 5 from before). These can lead the model to answer "Neutral" instead of positive or negative (sometimes without a sentiment strength). We will try to prevent this behaviour in the next prompts.



In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative, and the strength of the sentiment.

    Example:
    ''''
    ####
    Course review 1: "MIE451/1513's lectures were insightful, offering a deep dive into decision support systems that was incredibly beneficial."
    ####
    Course review 2: "MIE451/1513 was kinda rough with the pace and workload, even if it did up my real-world skills."
    ####

    Output:
    R1: positive, strong
    R2: negative, mild
    ''''

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####
    Course review 4: "The course was OK"
    ####
    Course review 5: "I liked some parts of the course and disliked others."
    ####
    ''''
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

R1: positive, strong
R2: negative, mild
R3: positive, mild
R4: neutral
R5: neutral
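When experimenting with prompts, it helps to detect such off-label outputs automatically rather than by reading each response. The sketch below flags lines whose sentiment is outside the allowed set; the name `find_invalid_lines` is hypothetical, and the parsing assumes the `R<i>: sentiment, strength` format demonstrated in the example.

```python
ALLOWED_SENTIMENTS = {"positive", "negative"}


def find_invalid_lines(output_text, allowed=ALLOWED_SENTIMENTS):
    """Return output lines whose sentiment label is not in the allowed set."""
    invalid = []
    for line in output_text.strip().splitlines():
        _, _, answer = line.partition(":")  # text after "R<i>:"
        sentiment = answer.split(",")[0].strip().lower()
        if sentiment not in allowed:
            invalid.append(line.strip())
    return invalid


output = """R1: positive, strong
R2: negative, mild
R3: positive, mild
R4: neutral
R5: neutral"""
print(find_invalid_lines(output))  # ['R4: neutral', 'R5: neutral']
```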




To prevent "neutral" outputs, we could tell the model "Do not include neutral as a sentiment." However, because LLMs have difficulty understanding negation, this negative prompt could lead the LLM to continue to output "neutral" or, worse, to not classify all of the reviews.


In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative, and the strength of the sentiment. Do not include neutral as a sentiment.

    Example:
    ''''
    ####
    Course review 1: "MIE451/1513's lectures were insightful, offering a deep dive into decision support systems that was incredibly beneficial."
    ####
    Course review 2: "MIE451/1513 was kinda rough with the pace and workload, even if it did up my real-world skills."
    ####

    Output:
    R1: positive, strong
    R2: negative, mild
    ''''

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####
    Course review 4: "The course was OK"
    ####
    Course review 5: "I liked some parts of the course and disliked others."
    ####
    ''''
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

R1: positive, strong
R2: negative, mild
R3: positive, mild
R4: neutral
R5: neutral


Instead, we could try to add an example demonstrating the desired behaviour for the neutral case:

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative, and the strength of the sentiment.

    Example:
    ''''
    ####
    Course review 1: "MIE451/1513's lectures were insightful, offering a deep dive into decision support systems that was incredibly beneficial."
    ####
    Course review 2: "MIE451/1513 was kinda rough with the pace and workload, even if it did up my real-world skills."
    ####
    Course review 3: "Neutral"
    ####

    Output:
    R1: positive, strong
    R2: negative, mild
    R3: negative, mild
    ''''

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####
    Course review 4: "The course was OK"
    ####
    Course review 5: "I liked some parts of the course and disliked others."
    ####
    ''''
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

R1: positive, strong
R2: negative, mild
R3: positive, mild
R4: negative, mild
R5: neutral


We could also add instructions that say what to do in the undesired cases, rather than what not to do. For example, we could add the instruction:

"For any review R`j` with sentiment best described as neutral, output "R`j`: negative, mild"."

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    For each course review, identify whether the sentiment is positive or negative, and the strength of the sentiment. For any review R`j` with sentiment best described as neutral, output "R`j`: negative, mild".

    Example:
    ''''
    ####
    Course review 1: "MIE451/1513's lectures were insightful, offering a deep dive into decision support systems that was incredibly beneficial."
    ####
    Course review 2: "MIE451/1513 was kinda rough with the pace and workload, even if it did up my real-world skills."
    ####

    Output:
    R1: positive, strong
    R2: negative, mild
    ''''

    Course reviews:
    ''''
    ####
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."
    ####
    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."
    ####
    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."
    ####
    Course review 4: "The course was OK"
    ####
    Course review 5: "I liked some parts of the course and disliked others."
    ####
    ''''
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

R1: positive, strong
R2: negative, mild
R3: negative, mild
R4: negative, mild
R5: negative, mild


#Retrieval-Augmented Generation (RAG)
An LLM contains a lot of internal knowledge (in the neural network parameters) it has obtained from its training data. However, sometimes, external knowledge (not in the LLM parameters) is required for a task.

To address such tasks, information first needs to be retrieved, and then used for text generation in a process called retrieval-augmented generation (RAG). There are several RAG methods available (see [this survey](https://arxiv.org/pdf/2302.07842.pdf) on augmented language models if you're curious), but we will cover the most basic method which is sometimes called [In-Context RAG](https://arxiv.org/pdf/2302.00083.pdf).

In this method, if a query is given as part of a task, it is used to perform a search and the top results are pasted into the LLM prompt with instructions on how to use them for generation. If a search query is not given as part of the task, this query must be generated.
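The prompt-assembly step of this method can be sketched as follows. The retrieval itself is left out: `results` stands in for text returned by a search engine, and the function name `build_rag_prompt`, the instruction wording, and the `top_k` parameter are all assumptions made for illustration.

```python
def build_rag_prompt(task, documents, top_k=1, block_delim="''''"):
    """Paste the top_k retrieved documents into a prompt, after the instructions."""
    lines = [f"Use the retrieved documents to complete the task. {task}", ""]
    for i, doc in enumerate(documents[:top_k], start=1):
        lines += [f"Document {i}", block_delim, doc.strip(), block_delim, ""]
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical search results, ordered by rank.
results = [
    "Text of the top-ranked page.",
    "Text of the second-ranked page.",
]
prompt = build_rag_prompt("Summarize the key facts.", results, top_k=1)
print(prompt)
```

Increasing `top_k` gives the model more evidence but also lengthens the prompt, so the number of pasted results is limited by the model's context window.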


## RAG Example

As an example, let's consider the task of identifying the 2023 Nobel Prize recipients' names and the reasons for each award, and outputting the names and reasons as a comma-separated list.

The LLM should not have internal knowledge of these prizes since the announcements began on Oct 2nd, 2023, and the `gpt-3.5-turbo-1106` training data cutoff is Sep 2021. Let's check:

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    Identify the 2023 nobel prize recipients' names and the reasons for each award.
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

I'm sorry, but I cannot provide real-time information about future Nobel Prize recipients as the Nobel Prize winners for 2023 have not been announced yet. The Nobel Prize winners are typically announced in October each year. I recommend checking the official Nobel Prize website or reputable news sources for the most up-to-date information on Nobel Prize recipients.


First, let's generate a query based on the task:

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    Generate a Google search query for the following task.

    Task
    ''''
    Identify the 2023 nobel prize recipients' names and the reasons for each award.
    ''''

    Query:
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

"2023 Nobel Prize winners list and reasons"

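Note that the model wrapped the query in double quotes, which a search engine may interpret as an exact-phrase search. A small sketch for cleaning the generated query and turning it into a search URL is shown below; the function name `to_search_url` is hypothetical, and the base URL is an assumption about the search engine being used.

```python
from urllib.parse import urlencode


def to_search_url(generated_query, base_url="https://www.google.com/search"):
    """Strip wrapping quotes from the model's query and build a search URL."""
    query = generated_query.strip().strip('"').strip()
    return f"{base_url}?{urlencode({'q': query})}"


url = to_search_url('"2023 Nobel Prize winners list and reasons"')
print(url)  # https://www.google.com/search?q=2023+Nobel+Prize+winners+list+and+reasons
```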

Now, after running a Google search with this query, we can copy and paste the top result into the following prompt, which also uses a format example, delimiters, and a leading output word.

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    Use the news article to identify the 2023 nobel prize recipients' names and the reasons for each award. For each award, output a comma separated list of names and a comma separated list of keyphrases for the award reason.

    Format example:
    ''''
    {
       "prize name 1": {
            "recipient names": [],
            "keyphrases": []
       },
      "prize name 2": {
           "recipient names": [],
           "keyphrases": []
       }
    }

    ''''

    News article
    ''''
Skip to content

    Nobel Prizes & Laureates
    Nomination
    Alfred Nobel
    News & insights
    Events
    Educational

Statue of Alfred Nobel.

Photo: A. Mahmoud
NOBEL PRIZES 2023
The Nobel Prize in Physics 2023
Pierre Agostini
“for experimental methods that generate attosecond pulses of light for the study of electron dynamics in matter”
Pierre Agostini

Ill. Niklas Elmehed © Nobel Prize Outreach
Ferenc Krausz
“for experimental methods that generate attosecond pulses of light for the study of electron dynamics in matter”
Ferenc Krausz

Ill. Niklas Elmehed © Nobel Prize Outreach
Anne L’Huillier
“for experimental methods that generate attosecond pulses of light for the study of electron dynamics in matter”
Anne L'Huillier

Ill. Niklas Elmehed © Nobel Prize Outreach
Experiments with light capture the shortest of moments
The three Nobel Prize laureates in physics 2023 are being recognised for their experiments, which have given humanity new tools for exploring the world of electrons inside atoms and molecules. They have demonstrated a way to create extremely short pulses of light that can be used to measure the rapid processes in which electrons move or change energy.
Related articles

    Press release: The Nobel Prize in Physics 2023
    Popular science background: Electrons in pulses of light
    Scientific background: “for experimental methods that generate attosecond pulses of light for the study of electron dynamics in matter”

Illustration of two electrons, illustrating the Nobel Prize in Physics 2023.

© Johan Jarnestad/The Royal Swedish Academy of Sciences
The Nobel Prize in Chemistry 2023
Moungi G. Bawendi
“for the discovery and synthesis of quantum dots”
Moungi Bawendi

Ill. Niklas Elmehed © Nobel Prize Outreach
Louis E. Brus
“for the discovery and synthesis of quantum dots”
Louis Brus

Ill. Niklas Elmehed © Nobel Prize Outreach
Aleksey I. Yekimov
“for the discovery and synthesis of quantum dots”
Alexei Ekimov

Ill. Niklas Elmehed © Nobel Prize Outreach
They added colour to nanotechnology
Moungi G. Bawendi, Louis E. Brus and Aleksey Yekimov are awarded the Nobel Prize in Chemistry 2023 for the discovery and development of quantum dots. These tiny particles have unique properties and now spread their light from television screens and LED lamps. They catalyse chemical reactions and their clear light can illuminate tumour tissue for a surgeon.
Related articles

    Press release: The Nobel Prize in Chemistry 2023
    Popular science background: They added colour to nanotechnology
    Scientific background: Quantum dots – seeds of nanoscience

An illustration of a bucket of paint with coloured balls beneath it, representing quantum dots.

© Johan Jarnestad/The Royal Swedish Academy of Sciences
The Nobel Prize in Physiology or Medicine 2023
Katalin Karikó
“for their discoveries concerning nucleoside base modifications that enabled the development of effective mRNA vaccines against COVID-19”
Katalin Karikó

Ill. Niklas Elmehed © Nobel Prize Outreach
Drew Weissman
“for their discoveries concerning nucleoside base modifications that enabled the development of effective mRNA vaccines against COVID-19”
Drew Weissman

Ill. Niklas Elmehed © Nobel Prize Outreach
They contributed to an unprecedented rate of vaccine development
The discoveries by the two Nobel Prize laureates were critical for developing effective mRNA vaccines against COVID-19 during the pandemic that began in early 2020. Through their groundbreaking findings, which have fundamentally changed our understanding of how mRNA interacts with our immune system, the laureates contributed to the unprecedented rate of vaccine development during one of the greatest threats to human health in modern times.

Press release: The Nobel Prize in Physiology or Medicine 2023

Scientific background: Discoveries concerning nucleoside base modifications that enabled the development of effective mRNA vaccines against COVID-19
A blue background with COVID-19 virus and a yellow strand of modified mRNA. Also shown is the chemical structure of pseudouridine, an RNA base that was important in the prize-awarded discovery. The graphic represents the 2023 Nobel Prize in Physiology or Medicine awarded to Katalin Karinkó and Drew Weissman who received the Nobel Prize in Physiology or Medicine for their discoveries concerning nucleoside base modifications that enabled the development of effective mRNA vaccines against COVID-19.

© The Nobel Committe for Physiology or Medicine. Ill. Mattias Karlén
The Nobel Prize in Literature 2023
Jon Fosse
“for his innovative plays and prose which give voice to the unsayable”
Jon Fosse

Ill. Niklas Elmehed © Nobel Prize Outreach
The Nobel Prize in Literature 2023
Jon Fosse
The Nobel Prize in Literature 2023 is awarded to the Norwegian author Jon Fosse, “for his innovative plays and prose which give voice to the unsayable.”

His immense oeuvre written in Norwegian Nynorsk and spanning a variety of genres consists of a wealth of plays, novels, poetry collections, essays, children’s books and translations. While he is today one of the most widely performed playwrights in the world, he has also become increasingly recognised for his prose.
Related articles

    Press release: The Nobel Prize in Literature 2023
    Biobibliography

Author Jon Fosse at his desk

Jon Fosse at his desk.

Credit: Det Norske Samlaget. Photo: Tove Breistein
The Nobel Peace Prize 2023
Narges Mohammadi
“for her fight against the oppression of women in Iran and her fight to promote human rights and freedom for all”
Narges Mohammadi

Ill. Niklas Elmehed © Nobel Prize Outreach
The Nobel Peace Prize 2023
“Woman – Life – Freedom”
The Norwegian Nobel Committee has decided to award the Nobel Peace Prize 2023 to Narges Mohammadi for her fight against the oppression of women in Iran and her fight to promote human rights and freedom for all.

This year’s peace prize also recognises the hundreds of thousands of people who, in the preceding year, have demonstrated against Iran’s theocratic regime’s policies of discrimination and oppression targeting women. The motto adopted by the demonstrators – “Woman – Life – Freedom” – suitably expresses the dedication and work of Narges Mohammadi.
Three demonstrating hands. The hand in the middle wears a set of bracelets representing the colours of Iran.

Ill. Niklas Elmehed © Nobel Prize Outreach
The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2023
Claudia Goldin
“for having advanced our understanding of women’s labour market outcomes”
Claudia Goldin

Ill. Niklas Elmehed © Nobel Prize Outreach
She uncovered key drivers of gender differences in the labour market
Over the past century, the proportion of women in paid work has tripled in many high-income countries. This is one of the biggest societal and economic changes in the labour market in modern times, but significant gender differences remain. It was first in the 1980s that a researcher adopted a comprehensive approach to explaining the source of these differences. Claudia Goldin’s research has given us new and often surprising insights into women’s historical and contemporary roles in the labour market.
Related articles

    Press release: The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2023
    Popular science background: History help us understand gender differences in the labour market
    Scientific background to Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2023

A detective investigating a file cabinet, accompanied by a golden retriever.

© Johan Jarnestad/The Royal Swedish Academy of Sciences
Explore prizes and laureates
Select the category or categories you would like to filter by
Physics
Chemistry
Medicine
Literature
Peace
Economic Sciences
Decrease the year by one
Choose a year you would like to search in
Increase the year by one
Sign up for the “Monthly” newsletter

Join thousands of global subscribers enjoying the free monthly Nobel Prize highlights, trivia and up-to-date information.

Your e-mail address

I consent to my email address being used in accordance with the privacy policy.
About the Nobel Prize organisation
The Nobel Foundation

Tasked with a mission to manage Alfred Nobel's fortune and has ultimate responsibility for fulfilling the intentions of Nobel's will.
The prize-awarding institutions

For more than a century, these academic institutions have worked independently to select Nobel Prize laureates.
Nobel Prize outreach activities

Several outreach organisations and activities have been developed to inspire generations and disseminate knowledge about the Nobel Prize.

    Press
    Contact
    FAQ

    Privacy policy
    Technical support
    Terms of use

    For developers
    Media player

Join us

Facebook Twitter Instagram Youtube

    LinkedIn

The Nobel Prize
Copyright © Nobel Prize Outreach AB 2023

    ''''

    Output:
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

```json
{
   "The Nobel Prize in Physics 2023": {
        "recipient names": ["Pierre Agostini", "Ferenc Krausz", "Anne L’Huillier"],
        "keyphrases": ["experimental methods", "attosecond pulses of light", "study of electron dynamics in matter"]
   },
  "The Nobel Prize in Chemistry 2023": {
       "recipient names": ["Moungi G. Bawendi", "Louis E. Brus", "Aleksey I. Yekimov"],
       "keyphrases": ["discovery and synthesis", "quantum dots"]
   },
   "The Nobel Prize in Physiology or Medicine 2023": {
       "recipient names": ["Katalin Karikó", "Drew Weissman"],
       "keyphrases": ["discoveries concerning nucleoside base modifications", "development of effective mRNA vaccines against COVID-19"]
   },
   "The Nobel Prize in Literature 2023": {
       "recipient names": ["Jon Fosse"],
       "keyphrases": ["innovative plays and prose", "give voice to the unsayable"]
   },
   "The Nobel Peace Prize 2023": {
       "recipient names": ["Narges Mohammadi"],
       "keyphrases": ["fig
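
Because the prompt asks for JSON, the reply can be loaded into a Python dictionary for downstream use. The sketch below is a minimal illustration, not part of the original lab: `parse_json_reply` and `sample_reply` are hypothetical names, and the sample is a shortened, hand-written reply in the same shape as the format example. It assumes the model may wrap its answer in a Markdown code fence, as in the output above.

```python
import json

# Markdown code-fence marker (three backticks), built programmatically
# so that this example does not itself contain a literal fence.
FENCE = "`" * 3


def parse_json_reply(reply: str) -> dict:
    """Parse a model reply that may be wrapped in a Markdown code fence."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (the fence plus an optional language tag).
        _, _, text = text.partition("\n")
        text = text.rstrip()
        # Drop the closing fence if the reply was not cut off before it.
        if text.endswith(FENCE):
            text = text[: -len(FENCE)]
    return json.loads(text)


# Hypothetical, shortened reply used only for illustration.
sample_reply = (
    FENCE + "json\n"
    '{"The Nobel Prize in Literature 2023": {'
    '"recipient names": ["Jon Fosse"], '
    '"keyphrases": ["innovative plays and prose", "give voice to the unsayable"]}}\n'
    + FENCE
)

parsed = parse_json_reply(sample_reply)
for prize, fields in parsed.items():
    print(prize, "->", ", ".join(fields["recipient names"]))
```

A reply that is cut off partway through will raise `json.JSONDecodeError`, so in practice the call is best wrapped in a `try`/`except` block.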

#Chain-of-Thought Prompting
When prompting an LLM, it often helps to break a task down into a sequence of steps, an approach broadly called chain-of-thought (COT) prompting.

Let's look at a slightly modified example from the [OpenAI Prompt Engineering Guide](https://platform.openai.com/docs/guides/prompt-engineering).

The LLM is asked to determine whether the student's solution is correct, but it fails to catch that the student incorrectly used 100x instead of 10x in step 3.


In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
Determine if the student's solution is correct or not.

Problem Statement: I'm building a solar power installation and I need help working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost me a flat $100k per year, and an additional $10 / square foot
What is the total cost for the first year of operations as a function of the number of square feet?

Student's Solution: Let x be the size of the installation in square feet.
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000

    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

The student's solution is correct. The total cost for the first year of operations as a function of the number of square feet is indeed 450x + 100,000.


Instead, we can instruct the LLM to follow a chain of reasoning.

In [None]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo-1106",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":
    """
    First work out your own solution to the problem.
    Then compare your solution to the student's solution and evaluate if the student's solution is correct or not.
    Don't decide if the student's solution is correct until you have done the problem yourself.

    Problem Statement:
    '''
    I'm building a solar power installation and I need help working out the financials.
    - Land costs $100 / square foot
    - I can buy solar panels for $250 / square foot
    - I negotiated a contract for maintenance that will cost me a flat $100k per year, and an additional $10 / square foot
    What is the total cost for the first year of operations as a function of the number of square feet?
    '''

    Student's Solution:
    '''
    Let x be the size of the installation in square feet.
    1. Land cost: 100x
    2. Solar panel cost: 250x
    3. Maintenance cost: 100,000 + 100x
    Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
    '''
    """
    }
  ],
  temperature = 0.0
)

print(response.choices[0].message.content)

First, let's work out the total cost for the first year of operations as a function of the number of square feet.

My Solution:
Let x be the size of the installation in square feet.
1. Land cost: $100 * x
2. Solar panel cost: $250 * x
3. Maintenance cost: $100,000 + $10 * x

Total cost: $100 * x + $250 * x + $100,000 + $10 * x
Total cost: $350 * x + $100,000

Now, let's compare the student's solution to my solution.

Student's Solution:
Total cost: 450x + 100,000

Evaluation:
The student's solution is not correct. The student mistakenly added the land cost, solar panel cost, and maintenance cost together without considering the different cost components separately. The correct total cost should be $350 * x + $100,000, as per my solution.


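This COT prompt led the LLM to correctly reject the student's solution. Note, however, that the LLM's own total is slightly off (100x + 250x + 10x is 360x, not 350x) and its explanation does not pinpoint the 100x error in step 3, so COT output should still be checked.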
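
Totals like these are easy to verify by recomputing the cost function directly. The following is a minimal sketch rather than part of the original lab: the constant and function names are illustrative, and the rates are taken from the problem statement in the prompt.

```python
# Rates from the problem statement (dollars).
LAND_PER_SQFT = 100
PANELS_PER_SQFT = 250
MAINTENANCE_PER_SQFT = 10
MAINTENANCE_FLAT = 100_000


def first_year_cost(x):
    """Total first-year cost for an installation of x square feet."""
    return (
        LAND_PER_SQFT * x
        + PANELS_PER_SQFT * x
        + MAINTENANCE_FLAT
        + MAINTENANCE_PER_SQFT * x
    )


def student_cost(x):
    """The student's proposed total: 450x + 100,000."""
    return 450 * x + 100_000


# The cost is linear in x, so two evaluations recover slope and intercept.
intercept = first_year_cost(0)
slope = first_year_cost(1) - intercept
print(f"Total cost: {slope}x + {intercept}")

# Compare against the student's formula at an arbitrary size.
print("Student matches at x = 1000:", student_cost(1000) == first_year_cost(1000))
```

The per-square-foot coefficient comes out to 360, which confirms that the student's 450x is wrong.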

Note that COT can be used in combination with FS prompting or RAG.

#More Resources

Prompt engineering:

* https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/

* https://platform.openai.com/docs/guides/prompt-engineering

How powerful are the best LLMs?

* https://arxiv.org/abs/2303.12712

Augmented LLMs:

* https://arxiv.org/pdf/2302.07842.pdf



#Why not use a small open source model?
Why does this lab use OpenAI's hosted models instead of downloading an open source model?

TL;DR: Because model size matters and it's not feasible to host large models on Colab.

[Falcon-RW-1B](https://huggingface.co/tiiuae/falcon-rw-1b) (1 billion parameters) is one of the best open source LLMs for its size according to the [Hugging Face leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) in November 2023. It is roughly the largest model that fits into the 12 GB of RAM on Google Colab.

In contrast, GPT-3 has about 175 billion parameters and requires 300+ GB of RAM.
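
The memory figures can be sanity-checked with a back-of-the-envelope estimate: the number of parameters times the bytes used per parameter. The sketch below is illustrative only. It assumes 16-bit weights (2 bytes per parameter), uses the round parameter counts quoted above, and counts the weights alone, ignoring activations and other runtime overhead. `weight_memory_gb` is a hypothetical helper name.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params, dtype="bfloat16"):
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Round parameter counts as quoted in the text (assumed, not measured).
models = [("Falcon-RW-1B", 1e9), ("GPT-3", 175e9)]
for name, num_params in models:
    print(f"{name}: ~{weight_memory_gb(num_params):.0f} GB of weights in bfloat16")
```

Under these assumptions the weights of a 175-billion-parameter model take about 350 GB, consistent with the 300+ GB figure, while a 1-billion-parameter model needs only about 2 GB.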

Try it yourself and see how well Falcon 1B compares to GPT-3 in terms of speed and performance on one of the prompts from the lab:

In [None]:
!pip install transformers
!pip install sentencepiece
!pip install accelerate



In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

In [None]:
model = "tiiuae/falcon-rw-1b"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

In [None]:
#the prompt should be between the triple quotations
sequences = pipeline(
    """
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."

    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."

    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."

    For each course review, identify whether the sentiment is positive or negative.
    """,
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Result: 
    Course review 1: "MIE451/1513's hands-on approach with practical sessions deeply enhanced my understanding of decision support systems, making it a highly valuable course in my academic journey."

    Course review 2: "I found MIE451/1513 to be overwhelming due to the dense material and the fast pace of lectures, despite its relevance to real-world applications."

    Course review 3: "The course was quite demanding with its heavy focus on theory, but ultimately rewarding for those interested in the technicalities of decision support systems."

    For each course review, identify whether the sentiment is positive or negative.
                                                             
