# Review Analyst - Prompt Engineering

In this notebook, you'll build an AI-powered document analyst, capable of performing **sentiment analysis** and generating data for downstream tasks. You will learn how to employ **few-shot learning** with a language model by providing it with instructive examples.

## Learning Objectives

By the time you complete this notebook you will be able to:
- Perform **sentiment analysis** on unstructured text using Phi-3.
- Explain what the Phi-3 **prompt template** is, and how it was used during **instruction fine-tuning**.
- Guide and improve model performance using **few-shot learning**.
- Use Phi-3 to generate JSON data for potential use in downstream processing tasks.

# <FONT COLOR="purple">Verify that the Colab runtime type is set to GPU!</FONT>

## Install Dependencies

In [None]:
# The 'device_map' parameter requires the Accelerate package.
# Restart the runtime after the install!
!pip install accelerate flash_attn

## Create Microsoft [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) Pipeline

In [None]:
import torch
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer
from transformers import TextStreamer

# Hugging Face identifier of the Microsoft Phi-3-mini-4k-instruct model
model_id = "microsoft/Phi-3-mini-4k-instruct"

# The tokenizer converts text into the token IDs that the model consumes.
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the model in half precision and let Accelerate place it on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
    attn_implementation="eager",
)

# The streamer prints tokens as soon as they are generated, so the response appears incrementally instead of all at once at the end.
streamer = TextStreamer(tokenizer, skip_prompt=True)

## Generate Function

In this notebook, we will use the `generate` function defined below to interact with the LLM. It assembles its prompt according to the model's default prompt template:

```python
# Microsoft Phi-3-mini-4k-instruct default prompt template

<|system|>
{system}<|end|>
<|user|>
{question}<|end|>
<|assistant|> 
{response}<|end|>
<|user|>
{question}<|end|>
<|assistant|> 
```
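
To make the template concrete, here is a minimal sketch that assembles such a prompt string without loading the model. The helper name `build_prompt` and the sample messages are illustrative assumptions, not part of the notebook; the tag layout follows the template shown above.

```python
def build_prompt(question, system, history=()):
    """Assemble a Phi-3 style prompt string (hypothetical helper for illustration)."""
    parts = [f"<|system|>\n{system}<|end|>\n"]
    # Each earlier (question, response) pair becomes one user turn and one assistant turn.
    for prev_question, prev_response in history:
        parts.append(f"<|user|>\n{prev_question}<|end|>\n")
        parts.append(f"<|assistant|>\n{prev_response}<|end|>\n")
    # The prompt ends with an open assistant header so the model writes the next turn.
    parts.append(f"<|user|>\n{question}<|end|>\n<|assistant|>\n")
    return "".join(parts)


example = build_prompt(
    "Name a primary color.",
    system="You are a concise assistant.",
    history=[("Name a fruit.", "Apple")],
)
print(example)
```

Printing the string is a quick way to check that every turn is closed with `<|end|>` and that only the last assistant header is left open.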

In [None]:
def generate(question, system=None, history=(), model=model, max_new_tokens=256, do_sample=False, temperature=1.0):
    """
    Generate a text response from the language model.
    The function builds a prompt in the Phi-3 format from the system message, the chat history, and the question,
    passes it to the model, and returns the decoded text that the model generates.

    - question (str): The user question or any other instruction.
    - system (str, optional): Context provided to the model for the whole conversation. A default is used if None.
    - history (sequence of tuples, optional): The chat history. Each tuple holds a question and the corresponding assistant response.
    - model (object): The loaded model.
    - max_new_tokens (int, optional, defaults to 256): The maximum number of tokens to generate, ignoring the number of tokens in the prompt.
    - do_sample (bool, optional, defaults to False): Whether to use sampling; greedy decoding is used otherwise.
    - temperature (float, optional, defaults to 1.0): The value used to modulate the next-token probabilities when sampling.
    """

    if system is None:
        system = """This is a chat between a user and an artificial intelligence assistant.
        The assistant gives helpful, detailed, and polite answers to the user's questions based on the context.
        The assistant should also indicate when the answer cannot be found in the context."""

    prompt = f"<|system|>\n{system}<|end|>\n"

    # Add each example from the history to the prompt (role tag, newline, message, as in the template)
    for prev_question, prev_response in history:
        prompt += f"<|user|>\n{prev_question}<|end|>\n<|assistant|>\n{prev_response}<|end|>\n"

    # Add the current question at the end, followed by the open assistant header
    prompt += f"<|user|>\n{question}<|end|>\n<|assistant|>\n"
    tokenized_prompt = tokenizer(prompt, return_tensors="pt").to(model.device)

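    # Pass the caller's temperature through (it only has an effect when do_sample=True)
    outputs = model.generate(input_ids=tokenized_prompt.input_ids,
                             attention_mask=tokenized_prompt.attention_mask,
                             max_new_tokens=max_new_tokens,
                             streamer=streamer,
                             temperature=temperature,
                             do_sample=do_sample)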

    # Return the decoded text from outputs
    return tokenizer.decode(outputs[0][tokenized_prompt.input_ids.shape[-1]:], skip_special_tokens=True).strip()

## Sentiment Analysis

### Movie reviews from ChatGPT

**Neutral**

In [None]:
neutral = """
"The Matrix" is a fictional movie delving into intricate subjects like reality, identity, and existence.
Released in 1999, it falls within the science fiction action genre and serves as the inaugural entry in the Matrix film series.
Directed and written by the Wachowskis, the film features a cast including Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss, Hugo Weaving, and Joe Pantoliano.
"""

Let's start by asking the model to perform sentiment analysis on one of our reviews. Our goal is a one-word answer stating whether the overall sentiment is *positive*, *negative*, or *neutral*.

Note that the output of the following cell may look odd or unexpected.

In [None]:
prompt = f"""
What is the overall sentiment of the following movie review: {neutral}
"""

_ = generate(prompt)

Since we want an exact, single-word response, let's refine the prompt with specific guidance about the desired answer.

In [None]:
prompt = f"""
What is the overall sentiment of the following movie review: {neutral}

Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt)

The answer is shorter, but still not exact. Let's give the model more help with the task; sometimes success depends on the exact wording.

In [None]:
prompt = f"""
What is the overall sentiment of the following movie review?
Note that if the text contains no strong mood indicators, the sentiment is neutral.
Review: {neutral}

Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt)

Try the **positive** and **negative** cases.

In [None]:
negative = """
"The Matrix" disappoints on multiple fronts, failing to live up to its lofty expectations.
Despite its ambitious premise, the film falls short with its overindulgent reliance on flashy visuals and convoluted storytelling.
The characters lack depth, leaving audiences disconnected from their plight, while the philosophical themes feel superficially explored.
Moreover, the excessive use of slow-motion action sequences becomes tiresome rather than exhilarating.
Ultimately, "The Matrix" is a shallow and overhyped spectacle that fails to deliver on its promise of profound sci-fi storytelling.
"""

Fill the empty curly brackets with the `negative` variable. The cell raises a `SyntaxError` until you do, because an f-string placeholder cannot be empty.

In [None]:
prompt = f"""
What is the overall sentiment of the following movie review: {}

Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt)

In [None]:
positive = """
"The Matrix" is an absolute masterpiece of science fiction cinema.
Directed by the visionary Wachowskis, this film immerses audiences in a mind-bending world of stunning visuals, gripping action, and profound philosophical depth.
From its groundbreaking special effects to its thought-provoking exploration of reality and free will, "The Matrix" continues to captivate viewers with its timeless brilliance.
A true cinematic landmark that remains as thrilling and relevant today as it was upon its release.
"""

Fill the empty curly brackets with the `positive` variable. The cell raises a `SyntaxError` until you do, because an f-string placeholder cannot be empty.

In [None]:
prompt = f"""
What is the overall sentiment of the following movie review: {}

Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt)

---

Our current prompt does not reliably elicit the responses we want. To make the outputs more dependable, let's turn to an essential technique: few-shot learning. With it, we give the model instructive examples that guide it towards the desired behavior.
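
Even with a one-word instruction, a reply may carry extra whitespace, punctuation, or surrounding words. As a minimal sketch (the helper `normalize_sentiment` and the sample replies are assumptions for illustration, not part of the notebook), a small post-processing step can map a reply to one of the three labels:

```python
import re

LABELS = ("positive", "negative", "neutral")
# Match any of the labels as a whole word.
LABEL_PATTERN = re.compile(r"\b(" + "|".join(LABELS) + r")\b")


def normalize_sentiment(response):
    """Return the first sentiment label found in a model response, or None."""
    match = LABEL_PATTERN.search(response.lower())
    return match.group(1) if match else None


print(normalize_sentiment(" Neutral."))
print(normalize_sentiment("The sentiment is Positive!"))
print(normalize_sentiment("I cannot tell."))
```

This simple approach picks the first label it sees, so a reply such as "not positive" would still be read as "positive"; it is a convenience for well-behaved replies, not a substitute for a better prompt.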

## Few-shot Learning

The name of the technique varies with the number of examples we provide: **one-shot** learning uses a single example, **few-shot** learning a handful, and **many-shot** learning a larger set. In each case, a shot is one example **prompt/response** cycle given to the model to steer its behavior.

These shots are usually included before the primary prompt for which we seek the model's response. Different models may require specific formatting of our shots to ensure the model grasps that we're presenting **prompt/response** instances for its guidance.
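
As a rough sketch of how shots can be prepared, the snippet below turns labeled reviews into a list of prompt/response tuples. The helper `make_shots`, the question template, and the two short sample reviews are assumptions made for illustration.

```python
def make_shots(labeled_reviews, question_template):
    """Turn (review, label) pairs into (prompt, response) shots."""
    return [
        (question_template.format(review=review), label)
        for review, label in labeled_reviews
    ]


template = "What is the overall sentiment of this review: {review}\nAnswer with one word."
shots = make_shots(
    [("A plain plot summary.", "neutral"), ("An absolute masterpiece.", "positive")],
    template,
)
for shot_prompt, shot_response in shots:
    print(repr(shot_prompt), "->", shot_response)
```

Wrapping every example in the same question as the main prompt keeps the shots and the final request in one consistent format. A list built this way has the shape that the `history` parameter of `generate` expects.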

### Instruction Fine-tuned (Chat) Models

If you've ever explored model repositories like those on [Hugging Face](https://huggingface.co/meta-llama), you might have come across variants labeled "-Instruct" or "-chat". These models, such as Llama-2-13b-chat or Llama-3-8B-Instruct, undergo additional training beyond their base pretrained counterparts (like Llama-3-70B). This supplementary training, known as instruction fine-tuning, aims to enhance the models' ability to follow instructions, particularly for use in chat applications.

While pretrained Large Language Model (LLM) base models are adept at generating text by predicting the most probable continuation of an input, this doesn't equate to effectively responding to questions or instructions.

Instruction fine-tuning involves training models on a vast array of example interactions between users and the model itself. This process enables the model to learn how to appropriately follow instructions or engage in dialogue within a chat context.

### [Microsoft Phi-3-mini-4k-instruct default prompt template](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)

The instruction fine-tuning process varies depending on the specific chat variant model being utilized. These models are trained on example interactions formatted according to a template known as the prompt template. You can usually locate the prompt template for a particular model in its documentation.

Here's an example of the Microsoft Phi-3 prompt template. The version shown is slightly simplified, as it excludes a component that we'll delve into later in the course.

```python
<|system|>
{system}<|end|>
<|user|>
{question}<|end|>
<|assistant|> 
{response}<|end|>
<|user|>
{question}<|end|>
<|assistant|>
```

- **<|system|>{system}<|end|>:** The `<|system|>` tag specifies the role of the following message, here the system message, e.g. "You are a helpful AI assistant for travel tips and recommendations".
- **<|user|>{question}<|end|>:** The `<|user|>` tag specifies the role of the following message, here a user message, e.g. "What can you help me with?".
- **<|assistant|>{response}<|end|>:** The `<|assistant|>` tag marks a response of the model. The prompt ends with a bare assistant header, without a response, to prompt the model to start generation.
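
One way to check your understanding of these tags is to split a prompt back into its turns. The sketch below does this with a regular expression; the helper `parse_turns` is hypothetical, and the sample messages reuse the examples from the list above.

```python
import re

# A completed turn: role tag, optional newline, message text, closing <|end|> tag.
TURN_PATTERN = re.compile(r"<\|(system|user|assistant)\|>\n?(.*?)<\|end\|>", re.DOTALL)


def parse_turns(prompt):
    """Split a Phi-3 style prompt into (role, message) pairs for completed turns."""
    return [(role, message.strip()) for role, message in TURN_PATTERN.findall(prompt)]


sample = (
    "<|system|>\nYou are a helpful AI assistant for travel tips and recommendations<|end|>\n"
    "<|user|>\nWhat can you help me with?<|end|>\n"
    "<|assistant|>\n"
)
turns = parse_turns(sample)
print(turns)
# The trailing assistant header has no <|end|> yet, so it is not a completed turn.
print(sample.rstrip().endswith("<|assistant|>"))
```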

[An example dataset for fine-tuning Phi-3 with a prompt template](https://huggingface.co/datasets/ndavidson/sft-phi-3)

### One-Shot Learning

Let's briefly step away from our sentiment analysis task, and use a simple text generation prompt to explore how we can use the `generate` function to provide an instructive example, or put another way, perform **one-shot learning**.

By default, the `generate` function provided earlier wraps a single user question in the tags of the Phi-3 prompt template. To include earlier interactions as well, pass them as a list of prompt/response tuples through its `history` parameter.

In [None]:
previous_prompt = "Give me an all uppercase color that starts with the letter 'B'."
previous_response = "BLACK"

# The function expects all prompt/response 2-tuples to be in a list
instruction = [(previous_prompt, previous_response)]

# This is the "main" prompt we actually want the model to respond to
prompt = "Give me an all uppercase color that starts with the letter 'P'."

# Use the `history` parameter of `generate` to create a one-shot learning prompt, with a single example prepended to our main prompt
_ = generate(prompt, history=instruction)

---

## Sentiment Analysis Again

Since we were unsure whether the model can reliably classify a neutral review, let's apply the one-shot learning we've just covered. This means giving the model an instructive example of how to respond to a review that we, as humans, perceive as neutral.

Here's an example of a review we want to be classified as neutral. Although it contains both positive and negative remarks about the movie, it does not lean significantly towards either extreme.

In [None]:
neutral_1 = f"""The Matrix (1999) is a sci-fi action film that dives into the mind-bending concept of reality.  
The story follows hacker Neo, who uncovers the truth: the world he lives in is a simulated reality called the Matrix.  
The film's action sequences are impressive, with the now-famous "bullet time" effect becoming a pop culture staple.  
While the plot can get intricate, there's no denying the film's ambition and its influence on sci-fi cinema.  
This is a solid choice for those looking for a visually stylish action film with a philosophical twist."""

In [None]:
prompt = f"""
What is the overall sentiment of this review {neutral_1}?

If a review is not extremely positive, please always be neutral. 
Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt)

---

## Two-shot Learning

Frequently, when **one-shot learning** falls short of achieving the desired behavior, we can augment the process by introducing additional examples to reinforce the model's understanding.

In our scenario, alongside the neutral example we've provided the model, let's also furnish it with an example of a positive review. This additional example aims to clarify the distinction between the two sentiments, thereby aiding the model in making more accurate classifications.

In [None]:
neutral_2 = f"""The Matrix is a 1999 sci-fi action flick that asks the question: 'What is real?' 
Keanu Reeves delivers a solid performance as Neo, a computer hacker drawn into a world of rebellion against machines.  
The action choreography is innovative, and the special effects, particularly the "bullet time" sequences, hold up well even today.  
While the philosophical elements can be a bit heavy-handed, the film's exploration of reality and free will is thought-provoking.  
Overall, The Matrix is an entertaining and visually striking action film with a unique concept."""

instruction =  [(neutral_1, "neutral")]

prompt = f"""
What is the overall sentiment of this review {neutral_2}?

If a review is not extremely positive, please always be neutral. 
Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt, history=instruction)

We now remind the model directly that it has already rated this review as neutral by adding the review itself to the history. This is a blunt solution, but the point is to understand how past instructions and responses affect the future behavior of the model.

In [None]:
neutral_2 = f"""The Matrix is a 1999 sci-fi action flick that asks the question: 'What is real?' 
Keanu Reeves delivers a solid performance as Neo, a computer hacker drawn into a world of rebellion against machines.  
The action choreography is innovative, and the special effects, particularly the "bullet time" sequences, hold up well even today.  
While the philosophical elements can be a bit heavy-handed, the film's exploration of reality and free will is thought-provoking.  
Overall, The Matrix is an entertaining and visually striking action film with a unique concept."""

instruction =  [(neutral_1, "neutral"), (neutral_2, "neutral")]

prompt = f"""
What is the overall sentiment of this review {neutral_2}?

If a review is not extremely positive, please always be neutral. 
Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt, history=instruction)

### Exercise: Perform Two-shot Learning

TODO: Perform **two-shot** learning that gives the model an understanding of what exactly counts as positive sentiment.

### Your Work Below

### Solution

In [None]:
instruction =  [(neutral_1, "positive"), (neutral_2, "positive")]

prompt = f"""
What is the overall sentiment of this review {neutral_2}?

If a review is not extremely positive, please always be neutral. 
Answer with only one word! Do not write more than one word: "positive", "negative", or "neutral".
"""

_ = generate(prompt, history=instruction)

---

## Data generation

Now that we have explored sentiment analysis with the model, let's use it to generate JSON objects for downstream processing. These JSON objects will hold the positive and negative aspects of a given review.

We'll get there by refining our prompt step by step. First, we ask the model to list the positive and negative points of a review separately; then we iterate towards structured output.
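
For downstream use, the model's reply eventually has to be parsed by code. The following is a minimal sketch of that step, assuming the reply contains a single JSON object, possibly surrounded by extra text. The helper `extract_json`, the key names `positive` and `negative`, and the sample reply are hypothetical.

```python
import json


def extract_json(response):
    """Parse the text between the first '{' and the last '}' as JSON, or return None."""
    start = response.find("{")
    end = response.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        return json.loads(response[start:end + 1])
    except json.JSONDecodeError:
        # The model produced something that only looks like JSON.
        return None


reply = 'Here is the result:\n{"positive": ["visual effects"], "negative": ["convoluted plot"]}'
data = extract_json(reply)
print(data["positive"])
print(extract_json("no json here"))
```

Returning `None` instead of raising lets a pipeline decide whether to retry the prompt or skip the review when the output is not valid JSON.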

In [None]:
review = """"The Matrix" is a groundbreaking film that pushes the boundaries of science fiction cinema.
Its innovative visual effects and action-packed sequences captivate audiences, immersing them in a thrilling dystopian world.
The complex exploration of themes such as reality and identity adds depth to the narrative, prompting thought and discussion long after the credits roll.
However, amidst its brilliance, "The Matrix" can feel overly convoluted at times, with a plot that may confuse rather than enlighten.
Some viewers may find the philosophical themes too heavy-handed, detracting from the overall enjoyment of the film.
Additionally, while the action scenes are visually stunning, they occasionally overshadow character development, leaving the protagonists feeling somewhat one-dimensional.
In conclusion, "The Matrix" is a cinematic marvel that leaves an indelible mark on the genre.
Despite its flaws, its ambition and vision set a new standard for sci-fi storytelling, ensuring its place in cinematic history."""

In [None]:
prompt = f"""
Separate the positive and negative points from the movie review: {review}
"""

_ = generate(prompt)

The model did quite well. Let's iterate now to try to get the model to produce a JSON object for us.

In [None]:
prompt = f"""
Separate the positive and negative points from the movie review, and use JSON format to output: {review}
"""

_ = generate(prompt)

Let's give more precise instructions by including the exact JSON format we expect. Because the prompt is an f-string, literal curly braces must be doubled: "{{" and "}}" in the source become a single "{" and "}" in the resulting string.
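
A small, model-free sketch of how doubled braces behave in an f-string (the variable names and the sample review text are made up for illustration):

```python
review_text = "Great visuals, confusing plot."

# Doubled braces are emitted literally; single braces interpolate a variable.
prompt_text = f"""Use this JSON format:
{{"Positive statements": [], "Negative statements": []}}
Review: {review_text}"""

print(prompt_text)
```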

In [None]:
prompt = f"""
Separate the positive and negative points from the movie review. Use keywords, and use the following JSON format for the output:

{{"Positive statements": [], "Negative statements": []}}

Review: {review}
"""

_ = generate(prompt)

TODO: Let's attempt a similar task, but this time instruct the model to use an XML format instead.

Example format:

```XML
<?xml version="1.0" encoding="UTF-8" ?>
<root>
  <Positive/>
  <Bad/>
</root>
```
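
If the XML output is meant for downstream processing, it can be read with Python's standard library. The sketch below assumes the model fills `<Positive>` and `<Bad>` with `<item>` child elements; that inner structure and the helper `extract_points` are assumptions, since the example format above leaves both elements empty.

```python
import xml.etree.ElementTree as ET


def extract_points(xml_text):
    """Map each child of <root> to the list of texts of its own child elements."""
    # Strip surrounding whitespace: the XML declaration must be at the very start.
    root = ET.fromstring(xml_text.strip().encode("utf-8"))
    return {child.tag: [item.text for item in child] for child in root}


sample = """<?xml version="1.0" encoding="UTF-8" ?>
<root>
  <Positive><item>visual effects</item><item>ambition</item></Positive>
  <Bad><item>convoluted plot</item></Bad>
</root>"""
points = extract_points(sample)
print(points)
```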

### Your Work Here

### Solution

In [None]:
prompt = f"""
Separate the positive and negative points from the movie review. Use keywords, and use the following XML format for the output:

<?xml version="1.0" encoding="UTF-8" ?>
<root>
  <Positive/>
  <Bad/>
</root>

Review: {review}
"""

_ = generate(prompt)


---

## Review

In this notebook, we've covered several fundamental concepts:

- **Sentiment Analysis:** Discerning the mood or sentiment expressed in a given text.
- **Instruction Fine-Tuning:** Additional training on example interactions between users and the model, which improves the model's ability to follow instructions.
- **Phi-3 Prompt Template:** The structured format in which interactions were presented to Phi-3 during instruction fine-tuning, and which our prompts should follow.
- **Few-shot Learning:** Providing one or more instructive examples to a model in the prompt to steer its responses.

---

## Restart the Kernel

In [None]:
from IPython import get_ipython

get_ipython().kernel.do_shutdown(restart=True)

## <FONT COLOR="red">The notebook is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/). This means that you can freely copy, distribute, and modify the notebook by its authors ([Balázs Harangi](https://inf.unideb.hu/dr-harangi-balazs), [András Hajdu](https://inf.unideb.hu/munkatars/4250), and [Róbert Lakatos](https://inf.unideb.hu/lakatos-robert-tanarseged)), but not for commercial purposes. Additionally, if you modify the notebook, you must cite them as the original creators and share the modified version under the same terms.
</FONT>