# Session 11 - Generative language models for zero-shot learning

In [1]:
from transformers import TFAutoModelForSeq2SeqLM, AutoTokenizer



We're going to be working with ```FLAN-T5```, a text-to-text model developed by Google. ```FLAN-T5``` is based on ```T5```, which we saw in the lecture, but it has been further fine-tuned on a range of common text-to-text tasks. This means that it can already perform many of the tasks that people use generative language models for. You can read more in the paper here: [https://arxiv.org/pdf/2210.11416.pdf](https://arxiv.org/pdf/2210.11416.pdf)

We load the model from ```huggingface```. Here we use the ```Large``` version, but you can try the other sizes if you want. The ```Large``` version is already 3.4GB; larger models take longer to download and run, but they should also perform noticeably better.
 
You can read more about the available models here: [https://huggingface.co/docs/transformers/model_doc/flan-t5](https://huggingface.co/docs/transformers/model_doc/flan-t5)

In [2]:
model = TFAutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")

Downloading (…)lve/main/config.json: 100%|██████████| 662/662 [00:00<00:00, 222kB/s]
2023-04-26 10:20:29.923816: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-04-26 10:20:29.970216: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-04-26 10:20:29.972077: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Downloading tf_model.h5: 100%|██████████| 3.40G/3.40G [00:14<00:00, 237MB/s] 
All model checkpoint layers were used when initializing TFT5ForConditionalGeneration.

All the layers of TFT5ForConditionalGeneration were initialized from the model checkpoint at google/flan-t5-large.
If your task is similar to the task the mod

We also download and initialize the pretrained tokenizer that matches our model:

In [3]:
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

Downloading (…)okenizer_config.json: 100%|██████████| 2.54k/2.54k [00:00<00:00, 754kB/s]
Downloading spiece.model: 100%|██████████| 792k/792k [00:00<00:00, 136MB/s]
Downloading (…)/main/tokenizer.json: 100%|██████████| 2.42M/2.42M [00:00<00:00, 11.3MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 2.20k/2.20k [00:00<00:00, 675kB/s]


## Prompt engineering

Our goal is to take the knowledge of language that FLAN-T5 has already acquired during training and apply it in different domains *without any further fine-tuning*. This is an example of what is called *zero-shot* learning.

In order for zero-shot learning to be successful, our prompts need to be carefully designed. FLAN-T5 (and similar models) are a bit less flexible than, for example, ChatGPT.

In the cell below, uncomment the ```prompt``` line for the task you want to run (remove the leading ```#```) and comment out the others:

In [14]:
# classification
#prompt = "classify the following text as positive or negative: I absolutely hated this movie"

# translation
prompt = "translate from English to French: how old are you?"

# question answering
#prompt = "answer the following question: how is cheese made?"

# named entity recognition
#prompt = "find all location entities in this text: Ross comes from Scotland"

We then pass our text prompt to the tokenizer, defining some extra arguments such as the ```max_length``` of our input (anything longer than this will be truncated):

In [15]:
inputs = tokenizer(prompt, 
                    max_length = 200,
                    return_tensors="tf")

We then pass all of our input prompt tokens to the model and use them to generate an appropriate output based on what FLAN-T5 has learned during training.

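The ```**``` in the next cell unpacks the dictionary-like object returned by the tokenizer into keyword arguments, so its ```input_ids``` and ```attention_mask``` entries are passed to ```model.generate``` by name.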
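As a minimal sketch of what ```**``` does in a function call: it expands a mapping into keyword arguments. The function ```fake_generate``` and the token ids below are hypothetical stand-ins, not part of ```transformers```; they only mimic the shape of the call.

```python
# Hypothetical stand-in for model.generate; it simply echoes its keyword arguments.
def fake_generate(input_ids, attention_mask=None):
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# A plain dict that mimics the keys a tokenizer typically returns (made-up ids).
inputs = {"input_ids": [[13959, 1566, 12]], "attention_mask": [[1, 1, 1]]}

# **inputs expands to input_ids=..., attention_mask=...
unpacked = fake_generate(**inputs)

# The same call written out explicitly.
explicit = fake_generate(input_ids=inputs["input_ids"],
                         attention_mask=inputs["attention_mask"])

print(unpacked == explicit)
```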


In [16]:
outputs = model.generate(**inputs)

In [17]:
print(tokenizer.batch_decode(outputs, 
                            skip_special_tokens=True))

[' quelle âge avez-vous?']


In short, we give the model a task (the prompt) and it returns a piece of text.

We can also do this a bit more cleverly by using a single prompt plus an f-string:

In [9]:
# input_text must be defined before the f-string is evaluated
input_text = "I absolutely hated this movie"
prompt = f"classify the following text as positive or negative: {input_text}"

This means we could, for example, write functions for specific tasks. Here we ask Flan-T5 to classify a text as either positive or negative:

In [18]:
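def classifier(input_text: str) -> None:
    """Print FLAN-T5's sentiment label for input_text."""
    # Build the task prompt around the input text
    prompt = f"classify the following text as positive or negative: {input_text}"
    # Tokenize the prompt and return TensorFlow tensors
    inputs = tokenizer(prompt,
                    max_length = 200,
                    return_tensors="tf")
    # Generate output tokens and decode them back into text
    outputs = model.generate(**inputs)
    print(tokenizer.batch_decode(outputs, skip_special_tokens=True))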

In [20]:
classifier("This was the best book!")

['positive']
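The same pattern can be generalised to several tasks. The sketch below keeps one prompt template per task and returns the generated text instead of printing it. The names ```PROMPT_TEMPLATES```, ```build_prompt```, ```run_task```, and ```generate_fn``` are assumptions made for illustration; ```generate_fn``` stands for any function that maps a prompt string to an output string, so the example runs without downloading a model.

```python
from typing import Callable

# One template per task, reusing the prompts from earlier in this notebook.
PROMPT_TEMPLATES = {
    "classification": "classify the following text as positive or negative: {text}",
    "translation": "translate from English to French: {text}",
    "qa": "answer the following question: {text}",
    "ner": "find all location entities in this text: {text}",
}


def build_prompt(task: str, text: str) -> str:
    """Fill the template for `task` with the input text."""
    if task not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    return PROMPT_TEMPLATES[task].format(text=text)


def run_task(task: str, text: str, generate_fn: Callable[[str], str]) -> str:
    """Build the prompt, call the generator, and return the stripped output."""
    return generate_fn(build_prompt(task, text)).strip()


# Dummy generator used only for demonstration; it echoes the prompt back.
def echo_generate(prompt: str) -> str:
    return f" {prompt}"


print(run_task("translation", "how old are you?", echo_generate))
```

In the notebook, ```generate_fn``` could be a small wrapper that tokenizes the prompt, calls ```model.generate```, and decodes the first output with ```tokenizer.batch_decode```.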


## Tasks

Look through previous notebooks, exercises, and datasets from Language Analytics so far this semester. In small groups, use either ```Flan-T5``` or **ChatGPT** (or both) to try to solve those problems with generative language models.

So that would mean, for example:

- Grammatical analysis
- Named entity recognition/extraction
- Classification
- Topic modelling

As an illustrative example: try using Flan-T5 to perform classification on the Fake or Real News dataset. How well do its predictions match the ground-truth labels? Is it better or worse than the other classifiers we've seen? How about if we use ChatGPT for the same task?
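One way to compare against the ground truth is to map the model's free-text answers onto the dataset's labels and then compute accuracy. This is only a minimal sketch: the label strings ```FAKE``` and ```REAL```, the helper names ```normalise_label``` and ```accuracy```, and the example outputs are assumptions for illustration, not results from the model.

```python
from typing import Optional, Sequence

# Assumed label names; adjust these to match the label column of your dataset.
LABELS = ("FAKE", "REAL")


def normalise_label(raw_output: str) -> Optional[str]:
    """Map free-text model output to a known label, or None if it is unclear."""
    cleaned = raw_output.strip().upper()
    matches = [label for label in LABELS if label in cleaned]
    # Accept the answer only if exactly one label is mentioned.
    return matches[0] if len(matches) == 1 else None


def accuracy(predictions: Sequence[Optional[str]], gold: Sequence[str]) -> float:
    """Fraction of predictions equal to the gold label (None counts as wrong)."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and equally long")
    correct = sum(pred == true for pred, true in zip(predictions, gold))
    return correct / len(gold)


# Made-up model outputs and gold labels, for demonstration only.
raw_outputs = ["fake", " Real", "I am not sure", "REAL"]
gold_labels = ["FAKE", "REAL", "FAKE", "FAKE"]

predictions = [normalise_label(output) for output in raw_outputs]
print(predictions, accuracy(predictions, gold_labels))
```

Counting unclear answers as errors is a deliberate choice here: it penalises prompts that make the model drift away from the requested label set.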

Be creative!
