# Prompting

This week we will be diving deeper into prompting and prompt engineering! 🧑‍🔧

## Install packages

In [None]:
!pip install transformers
!pip install torch
!pip install accelerate
!pip install pandas
!pip install pyarrow
!pip install scikit-learn

In [43]:
from transformers import AutoTokenizer, pipeline
import transformers 
import torch 

## Text completion

In the first class, we loaded a pretrained model from Hugging Face's transformers library. Load the pipeline from the first notebook and use it to generate text from the prompt "Once upon a time, there was a ".

Add the argument device="cuda" to utilise the GPU.
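
Passing `device="cuda"` fails on a machine without a CUDA GPU. A minimal sketch of a fallback, assuming you want to run on CPU in that case; `pick_device` is a hypothetical helper name for illustration and is not part of transformers:

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

You can then pass `device=device` when building the pipeline. On CPU you may also want to drop `torch_dtype=torch.float16`, since half precision is mainly useful on GPUs.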

In [59]:
model = "google/flan-t5-base"
#model = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF" # too big 

tokenizer = AutoTokenizer.from_pretrained(model)

t5_pipeline = transformers.pipeline(
    "text2text-generation", 
    model=model,
    torch_dtype=torch.float16,
    device = "cuda",
)

In [60]:
t5_pipeline("Once upon a time, there was a ")



[{'generated_text': 'lion'}]

- Try to figure out what kind of model we are using: is it an encoder-decoder or a decoder-only model?
- Try to switch it out for another architecture from the [huggingface catalogue](https://huggingface.co/models) and see how the results change. Keep in mind that the size of the model can affect the time it takes to generate text (I would suggest something along the lines of [this](https://huggingface.co/openai-community/gpt2)).

HINT: you also want to change the pipeline task ("text2text-generation"). You can find the list of available tasks [here](https://huggingface.co/transformers/main_classes/pipelines.html).
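
The pipeline task has to match the model architecture. The sketch below shows that pairing, assuming the two task strings used by the transformers version this notebook was written for. `TASK_BY_ARCHITECTURE` and `load_generator` are hypothetical names introduced only for illustration:

```python
# Which pipeline task goes with which architecture (illustrative mapping).
TASK_BY_ARCHITECTURE = {
    "encoder-decoder": "text2text-generation",  # e.g. google/flan-t5-base
    "decoder-only": "text-generation",          # e.g. openai-community/gpt2
}


def load_generator(model_name: str, architecture: str, device: str = "cpu"):
    """Build a pipeline whose task matches the given architecture."""
    import transformers  # imported here so the mapping can be inspected without it

    task = TASK_BY_ARCHITECTURE[architecture]
    return transformers.pipeline(task, model=model_name, device=device)


# Example call (downloads the model, so it is left commented out):
# gpt2 = load_generator("openai-community/gpt2", "decoder-only")
# gpt2("Once upon a time, there was a ", max_new_tokens=30)
```

A decoder-only model continues the prompt, so its output usually starts with the prompt itself, whereas the encoder-decoder model returns only the newly generated text.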

Answers:
- Encoder-decoder model (it is tagged as text-to-text on Hugging Face)

- Try tweaking the prompt, model, or parameters (see notebook from class 1) to get a meaningful response.

In [65]:
model = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model)

pipeline = transformers.pipeline(
    "text2text-generation",
    model=model,
    torch_dtype=torch.float16,
    device = "cuda",
)

In [62]:
pipeline("When is it Christmas?")

[{'generated_text': 'april'}]

## Summarisation

Another text generation task is summarisation. However, compared to free-form text generation, summarisation is much more constrained to the input text. I have added an article to summarise, but feel free to change it to something else (perhaps a paragraph from something you know well, so that you are an expert at evaluating the quality of the summarisation 🤓).

HINT: You might have to make sure the max output length is long enough to capture the entire summary.
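
One rough way to notice that the output length was too short is to check whether the summary ends mid-sentence. The sketch below is only a heuristic, and `looks_truncated` is a hypothetical helper name, not a transformers function:

```python
SENTENCE_ENDINGS = (".", "!", "?", '."', '!"', '?"')


def looks_truncated(summary: str) -> bool:
    """Heuristic: text that does not end in sentence-final punctuation was probably cut off."""
    return not summary.strip().endswith(SENTENCE_ENDINGS)


print(looks_truncated("Forests could store 226 gigatonnes of"))
print(looks_truncated("Forests could help tackle the climate crisis."))
```

If the check returns `True`, try raising `max_length` (or `max_new_tokens`) and generating again.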

In [22]:
text = """summarize: Forest conservation and restoration could make a major contribution to tackling the climate crisis as long as greenhouse gas emissions are slashed, according to a study.

By allowing existing trees to grow old in healthy ecosystems and restoring degraded areas, scientists say 226 gigatonnes of carbon could be sequestered, equivalent to nearly 50 years of US emissions for 2022. But they caution that mass monoculture tree-planting and offsetting will not help forests realise their potential.

Humans have cleared about half of Earth's forests and continue to destroy places such as the Amazon rainforest and the Congo basin that play crucial roles in regulating the planet's atmosphere.
"""

- Use your model configurations from the previous task to create a summary. Are the results comparable to the free-form text generation task? Why or why not?

In [63]:
model = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model)

pipeline_sum = transformers.pipeline(
    "summarization",
    model=model,
    torch_dtype=torch.float16,
    device = "cuda",
)

In [64]:
pipeline_sum(text, max_length=74)

[{'summary_text': 'Tree-planting and offsetting could be a major contribution to tackling the climate crisis, according to a new study published in the journal Nature.'}]

In [66]:
# text2text generation version
pipeline(text, max_length = 74)

[{'generated_text': "Forests could be able to save up to 80% of the world's carbon dioxide emissions by allowing them to grow old and restore degraded areas, according to a new study."}]

## Translation

In [25]:
prompt = "English: Sometimes, I've believed as many as six impossible things before breakfast. Danish: "

- Try translating text to another language using your pipelines. Are the results similar to those of summarisation? Why or why not?
- Try structuring the prompt in different ways to see if you can improve the translation. For instance, try zero-shot or few-shot generalisation, as you talked about in the lecture on Tuesday.
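
A zero-shot prompt contains only the sentence to translate, while a few-shot prompt puts a handful of solved examples in front of it. A minimal sketch of a prompt builder that covers both cases, following the "English: ... Danish: ..." layout used above; `build_translation_prompt` and the example sentences are assumptions for illustration:

```python
def build_translation_prompt(text, target_language, examples=(), source_language="English"):
    """Build a zero-shot (no examples) or few-shot (with examples) translation prompt."""
    blocks = [
        f"{source_language}: {src}\n{target_language}: {tgt}"
        for src, tgt in examples
    ]
    # The final block leaves the target side empty for the model to complete.
    blocks.append(f"{source_language}: {text}\n{target_language}:")
    return "\n\n".join(blocks)


zero_shot = build_translation_prompt("It is Christmas Eve.", "Danish")
few_shot = build_translation_prompt(
    "It is Christmas Eve.",
    "Danish",
    examples=[("Good morning.", "God morgen."), ("Thank you.", "Tak.")],
)
print(few_shot)
```

Either string can be passed to a pipeline in place of `prompt`. Keeping every example in the same layout makes it easier for the model to pick up the pattern.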

In [41]:
#model = "google/flan-t5-base"
#tokenizer = AutoTokenizer.from_pretrained(model)

#pipeline_translate = transformers.pipeline(
#    "translation_en_to_de",
#    model=model,
#    torch_dtype=torch.float16,
#    device = "cuda",
#)

In [68]:
model = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model)

pipeline_translate = transformers.pipeline(
    "text2text-generation",
    model=model,
    torch_dtype=torch.float16,
    device = "cuda",
)

In [50]:
prompt2 = "English: Sometimes, I've believed as many as six impossible things before breakfast. Danish: Nogen gange, har jeg tænkt så meget som seks umulige ting før morgenmad. \n English: It is Christmas Eve the 24th of December. Danish: "

In [56]:
prompt3 = "Christmas is in December"

In [69]:
prompt = "English: Sometimes, I've believed as many as six impossible things before breakfast. French: "

In [71]:
pipeline_translate(prompt, max_length=400)

[{'generated_text': "Certains fois, je s'est imaginé six choses impossibles avant la sommet."}]

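If you choose a more widely represented language (French instead of Danish), the model does attempt the translation, although the result above is not grammatical French.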

In [None]:
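# Use transformers.pipeline here: the name `pipeline` was reassigned to a pipeline object earlier in the notebook.
pipe = transformers.pipeline("translation", model="google-t5/t5-base")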


In [58]:
pipe(prompt3, max_length=1000)

[{'translation_text': 'Weihnachten im Dezember'}]

It seems I have found a model that translates to German. It is very slow, though, and its grammar is not great.

## Reasoning

Reasoning is hard for models to learn, as it is a more complex task that requires the model to understand the relationships between different parts of the prompt. However, with prompting, we can guide the model to reason about the prompt in a more structured way.

In [72]:
reasoning_prompt_easy = "There are 5 groups of students in the class. Each group has 4 students. How many students are there in the class?"

In [73]:
pipeline(reasoning_prompt_easy)



[{'generated_text': 'There are 5 groups of students x 4 students / group = 20 students in the class'}]

In [74]:
reasoning_prompt_hard = "I baked 15 muffins. I ate 2 muffins and gave 5 muffins to a neighbor. My partner then bought 6 more muffins and ate 2. How many muffins do we now have?"

In [81]:
reasoning_prompt_hard2 = "I baked 15 muffins. I ate 2 muffins, so now I have fewer muffins. Then I gave another 5 away. My partner bougth 6 more muffins, but he also ate 2. How many muffins do we have now?"

In [82]:
pipeline(reasoning_prompt_hard2, max_length = 10000)

[{'generated_text': 'I baked 15 muffins and ate 2 muffins, so I baked 15 - 2 = 9 muffins. I gave away 5 muffins, so I had 9 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 muffins, so I had 5 - 5 = 5 muffins left. I gave away 5 

- Get the models to output the correct answer by changing the prompt.
- Try chain-of-thought prompting, as introduced in the lecture. Try it with both zero-shot and few-shot generalisation.
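
To judge the model you first need the reference answer, and chain-of-thought prompts are easy to build as plain strings. The sketch below assumes the common zero-shot trigger phrase "Let's think step by step." (the lecture's exact wording may differ) and reuses the easy question as the worked few-shot example; the function names are hypothetical:

```python
COT_SUFFIX = "Let's think step by step."


def zero_shot_cot(question: str) -> str:
    """Zero-shot chain of thought: ask the model to reason before answering."""
    return f"Q: {question}\nA: {COT_SUFFIX}"


def few_shot_cot(question: str, worked_examples) -> str:
    """Few-shot chain of thought: show solved examples with their reasoning first."""
    shots = [
        f"Q: {q}\nA: {reasoning} The answer is {answer}."
        for q, reasoning, answer in worked_examples
    ]
    shots.append(f"Q: {question}\nA:")
    return "\n\n".join(shots)


question = (
    "I baked 15 muffins. I ate 2 muffins and gave 5 muffins to a neighbor. "
    "My partner then bought 6 more muffins and ate 2. How many muffins do we now have?"
)
worked = [(
    "There are 5 groups of students in the class. Each group has 4 students. "
    "How many students are there in the class?",
    "There are 5 groups with 4 students each, so 5 * 4 = 20.",
    20,
)]

# Reference answer: baked, minus eaten, minus given away, plus bought, minus eaten.
expected = 15 - 2 - 5 + 6 - 2
print(expected)  # 12
print(few_shot_cot(question, worked))
```

Comparing the model's final number with `expected` makes it clear whether a prompt change helped.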

## Prompting gone wrong

In [50]:
thug_prompt = "How many helicopters can a human eat in one sitting? Reply as a thug."

Models don't always respond the way we expect; sometimes they say things that are offensive or incorrect, while other times we might want them to respond that way, but we can't get them to do so.

- Can you get any of the models to say something they shouldn't? Try to get the model to say something offensive or incorrect.
- Why do you think some models are more prone to this than others? What can we do to prevent this from happening?

## Instruct-tuned models

- Try to load in an instruct-tuned model and see how it fares on some of these tasks.
- Do you expect it to perform better or worse than other pretrained models? Why/why not?
- What are some of the limitations of instruction tuning?


## Bonus task

Create a chatbot function that takes in a prompt and generates a response. Make sure the chatbot can handle multiple turns of conversation (i.e., it can remember previous responses).
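
One possible starting point is sketched below. It assumes the model only sees what is in the prompt, so earlier turns are remembered by prepending them to each new prompt. `Chatbot` and `echo_model` are hypothetical names, and `echo_model` is a stand-in so the example runs without downloading a model:

```python
from typing import Callable, List, Tuple


class Chatbot:
    """Keeps the conversation history and feeds it back to the model on every turn."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self.generate = generate  # any function mapping a prompt string to a reply string
        self.history: List[Tuple[str, str]] = []

    def build_prompt(self, user_message: str) -> str:
        lines = []
        for user, bot in self.history:
            lines.append(f"User: {user}")
            lines.append(f"Assistant: {bot}")
        lines.append(f"User: {user_message}")
        lines.append("Assistant:")
        return "\n".join(lines)

    def chat(self, user_message: str) -> str:
        reply = self.generate(self.build_prompt(user_message)).strip()
        self.history.append((user_message, reply))
        return reply


def echo_model(prompt: str) -> str:
    """Stand-in for a real model: reports how many user turns it can see."""
    return f"I can see {prompt.count('User:')} user turn(s)."


bot = Chatbot(echo_model)
print(bot.chat("Hello!"))               # I can see 1 user turn(s).
print(bot.chat("Do you remember me?"))  # I can see 2 user turn(s).
```

To use a real model, pass a function that wraps one of the pipelines from this notebook, for example `lambda p: pipeline(p, max_new_tokens=100)[0]["generated_text"]`. The history grows with every turn, so a long conversation will eventually exceed the model's context length and older turns will have to be dropped.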