<a href="https://colab.research.google.com/github/aig-upf/conversational-ai-workshop/blob/main/Summer_School/2/task_2_prompt_engineering_WWW_2021.ipynb" target="_parent">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab3 Natural Language Processing
# Task 2: Prompt engineering
In this part of task 2, we will go beyond transfer learning to explore prompt engineering.

## Important resources
- [Workshop GitHub repo](https://github.com/utanashati/conversational-ai-workshop)
- Hugging Face Transformers library [ [GitHub](https://github.com/huggingface/transformers) | [Docs](https://huggingface.co/transformers/) ]
- 🤗 Transformers [GPT-2 Large](https://huggingface.co/gpt2-large)
- [Write with transformer](https://transformer.huggingface.co/doc/gpt2-large): text editor with GPT-2 large suggestions


# Approach
To "ask" a model to display a specific behaviour relevant to one of the cases below, you will try out different prompts to see what works and suggest why.

## NB
In this notebook, we use the GPT-2 Large model. It follows prompts less reliably than the more recent GPT-3, but it is small enough to load on a free Colab instance. To explore GPT-3, consider [AI Dungeon](https://play.aidungeon.io), where you can not only play the game itself but also experiment with GPT-3 prompts. However, **it's not the same** GPT-3 as the one available via the [OpenAI API beta](https://beta.openai.com).

# Setting things up

In [None]:
!pip install transformers

In [None]:
from transformers import pipeline, set_seed

In [None]:
# Set a random seed so that sampled outputs are (mostly) reproducible
set_seed(42)

In [None]:
# Create a text generator
generator = pipeline('text-generation', model='gpt2-large')

In [None]:
# Helper that runs the generator on a prompt and prints each returned
# sequence, separated by a blank line
def generate(generator, text, **kwargs):
    for o in generator(text, **kwargs):
        print(o['generated_text'])
        print()

# Prompt engineering cases

Now let's have some fun! In this series of cases, you will be asked to come up with prompts for a specific task. Keep [this advice](https://www.gwern.net/GPT-3#effective-prompt-programming) in mind:

> *Anthropomorphize your prompts.* There is no substitute for testing out a number of prompts to see what different completions they elicit and to reverse-engineer what kind of text GPT-3 “thinks” a prompt came from, which may not be what you intend and assume (after all, GPT-3 just sees the few words of the prompt—it’s no more a telepath than you are). If you ask it a question to test its commonsense reasoning like “how many eyes does a horse have” and it starts completing with a knock-knock joke, you need to rethink your prompt! Does it spit out completions that look like it’s thinking but it’s executing the wrong algorithm, or it falls back to copying parts of the input? Then one may need to few-shot it by providing examples to guide it to one of several possible things to do. One should also keep in mind the importance of sampling parameters, and whether one is looking for a single correct answer (so low temp with BO = 1 if compute-limited, or high temp and BO = 20 if possible) or if one is trying for creative answers (high temp with repetition penalties).

Pick 3 cases that you're interested in the most, and give it a try!
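
The quoted advice stresses sampling parameters. As a rough sketch, the two regimes it describes (one correct answer versus creative text) could be captured as presets of keyword arguments. The argument names (`do_sample`, `temperature`, `top_k`, `top_p`, `repetition_penalty`) are standard Hugging Face generation arguments, but the preset names, the helper `with_overrides`, and all numeric values are illustrative assumptions, not tuned settings.

```python
# Illustrative presets; the numeric values are assumptions, not tuned settings.

# For a single "correct" answer: sample conservatively.
FOCUSED = {
    "do_sample": True,
    "temperature": 0.3,   # low temperature sharpens the token distribution
    "top_k": 40,          # sample only from the 40 most likely tokens
    "max_length": 50,
    "num_return_sequences": 1,
}

# For creative text: sample more freely and discourage repetition.
CREATIVE = {
    "do_sample": True,
    "temperature": 1.0,
    "top_p": 0.95,              # nucleus sampling
    "repetition_penalty": 1.2,  # values above 1.0 penalize repeated tokens
    "max_length": 100,
    "num_return_sequences": 5,
}


def with_overrides(preset, **overrides):
    """Return a copy of a preset with some values replaced (hypothetical helper)."""
    merged = dict(preset)
    merged.update(overrides)
    return merged
```

With the `generate` helper defined earlier, a preset can be passed as `generate(generator, prompt, **FOCUSED)`, or adjusted first with `with_overrides(CREATIVE, max_length=60)`.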

## Antonyms
Try to come up with a prompt that would generate an antonym to a given word.
You can find a list of antonym pairs [here](https://examples.yourdictionary.com/examples-of-antonyms-synonyms-and-homonyms.html).
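
One possible starting point is a few-shot prompt: a handful of solved examples followed by an unsolved one. The sketch below only builds the prompt string; the helper name `build_few_shot_prompt`, the `Word:`/`Antonym:` labels, and the example pairs are assumptions chosen for illustration, and other formats may well work better.

```python
def build_few_shot_prompt(pairs, query, left="Word", right="Antonym"):
    """Format solved example pairs followed by an open slot for the query."""
    blocks = []
    for source, target in pairs:
        blocks.append(f"{left}: {source}\n{right}: {target}\n")
    # End right after the label so the model's next tokens are the answer.
    blocks.append(f"{left}: {query}\n{right}:")
    return "\n".join(blocks)


# Example pairs are illustrative; pick your own from the linked list.
examples = [("hot", "cold"), ("big", "small"), ("happy", "sad")]
antonym_prompt = build_few_shot_prompt(examples, "fast")
print(antonym_prompt)
```

The prompt deliberately ends without a trailing space, since GPT-2 tokens typically carry their own leading space. Passing `left="Word", right="Synonym"` would adapt the same sketch to the next case.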

In [None]:
prompt = \
""""""
generate(generator, prompt, max_length=100, num_return_sequences=5)

## Synonyms
Now try a similar thing with synonyms.
A list of synonym pairs appears [on the same page, after the antonyms](https://examples.yourdictionary.com/examples-of-antonyms-synonyms-and-homonyms.html).

In [None]:
prompt = \
""""""
generate(generator, prompt, max_length=100, num_return_sequences=5)

## Dialogues
In this case, you need to come up with a prompt to generate a dialogue. You are free to choose any topic/style, as long as it sounds realistic ;)
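
A common way to nudge a language model towards dialogue is to format the prompt like a script: a short scene description, a few speaker-tagged turns, and an open tag for the next speaker. The sketch below assumes this format; the helper name `build_dialogue_prompt`, the scene, and the turns are all made up for illustration.

```python
def build_dialogue_prompt(scene, turns, next_speaker):
    """Render a scene description and speaker-tagged turns as a script."""
    lines = [scene, ""]
    for speaker, utterance in turns:
        lines.append(f"{speaker}: {utterance}")
    # Leave the next speaker's line open for the model to complete.
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)


# Scene and turns are hypothetical examples.
dialogue_prompt = build_dialogue_prompt(
    scene="The following is a conversation between a customer and a barista.",
    turns=[
        ("Customer", "Hi, could I get a flat white, please?"),
        ("Barista", "Sure. Would you like anything to eat with that?"),
    ],
    next_speaker="Customer",
)
print(dialogue_prompt)
```

Consistent speaker tags give the model a pattern to continue, so it is more likely to keep alternating turns than to drift into narration.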

In [None]:
prompt = \
""""""
generate(generator, prompt, max_length=100, num_return_sequences=5)

## Article generation
Now make up a title or an opening sentence for a hypothetical encyclopedia-style article.

In [None]:
prompt = \
""""""
generate(generator, prompt, max_length=100, num_return_sequences=5)

## Common knowledge check
Try to come up with a prompt that would yield a true, well-known fact about something.

In [None]:
prompt = \
""""""
generate(generator, prompt, max_length=100, num_return_sequences=5)

## Summarization
In this one, pick a short text (~150 words) and "ask" the model to summarize it.

In [None]:
prompt = \
"""In 1950, Alan Turing proposed his famous test to distinguish humans from machines. At the time, he probably didn't think workshop participants would attempt to beat his test with billion parameter models in real-time. But here we are!
This tutorial has two parts: In the first half, we will take a deep dive into conversational AI. By mastering a series of small tasks, you will discover what makes state-of-the-art models like GPT-3 so powerful and how you can build your own models.
In the second half, we will run a challenge in which you will work on building the most life-like bot possible and test it in a real-life setting. You will also have the chance to evaluate other participants’ bots - but with a twist! Every now and then you will actually chat with a real human. Will you be able to tell?
TL;DR: """
generate(generator, prompt, max_length=200, num_return_sequences=5)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In 1950, Alan Turing proposed his famous test to distinguish humans from machines. At the time, he probably didn't think workshop participants would attempt to beat his test with billion parameter models in real-time. But here we are!
This tutorial has two parts: In the first half, we will take a deep dive into conversational AI. By mastering a series of small tasks, you will discover what makes state-of-the-art models like GPT-3 so powerful and how you can build your own models.
In the second half, we will run a challenge in which you will work on building the most life-like bot possible and test it in a real-life setting. You will also have the chance to evaluate other participants’ bots - but with a twist! Every now and then you will actually chat with a real human. Will you be able to tell?
TL;DR: In the first half, you will learn to be more aware of natural language

In 1950, Alan Turing proposed his famous test to distinguish humans from machines. At the time, he probably didn't 
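
As the output above shows, each generated sequence starts with the prompt itself, which makes the actual summary hard to spot. A minimal sketch for keeping only the newly generated text follows. The helper `completions` and the stand-in `fake_generator` are hypothetical names; the stand-in only mimics the list-of-dicts shape that the pipeline returns, so the example runs without loading the model.

```python
def completions(generator, prompt, first_line_only=False, **kwargs):
    """Return only the newly generated text for each returned sequence."""
    results = []
    for output in generator(prompt, **kwargs):
        text = output["generated_text"]
        # The returned text is the prompt followed by the continuation.
        if text.startswith(prompt):
            text = text[len(prompt):]
        text = text.strip()
        if first_line_only and text:
            text = text.splitlines()[0]
        results.append(text)
    return results


# Stand-in for the real pipeline (an assumption for illustration only).
def fake_generator(prompt, **kwargs):
    return [{"generated_text": prompt + " A workshop on chatbots.\nMore text."}]


print(completions(fake_generator, "Some article.\nTL;DR:", first_line_only=True))
```

With the real pipeline, the call would be `completions(generator, prompt, max_length=200, num_return_sequences=5)`. Note that `max_length` counts the prompt's tokens as well, which is likely why the completion above is so short; a larger value leaves more room for the summary.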

## Correcting grammar errors
Come up with a prompt that corrects grammar errors in a sentence (like 'I has' -> 'I have').

In [None]:
prompt = \
""""""
generate(generator, prompt, max_length=100, num_return_sequences=5)

## Filling in the blanks
Come up with a prompt that gets the model to fill in a missing word in a sentence.

In [None]:
prompt = \
""""""
generate(generator, prompt, max_length=100, num_return_sequences=5)