Let's set up our environment to run the Hugging Face version of T5 and feed it a small snippet of text to see what kind of summary it produces.  Note that we could not feed the entire Wikipedia article we used above into T5.

In [1]:
import tensorflow as tf
from transformers import T5Tokenizer, TFT5Model, TFT5ForConditionalGeneration


Here is the text that I will use for the demonstration.

Now let's look at an extractive question answering example. We'll need to feed the model a context paragraph and a question. T5 was pre-trained on the SQuAD dataset, so it knows how to identify and extract the answer span. Note that the texts below already include the task prompts: the question begins with "question:" and the context begins with "context:".

In [6]:
t5_model = TFT5ForConditionalGeneration.from_pretrained('t5-base')  # t5-small and t5-large are also available
t5_tokenizer = T5Tokenizer.from_pretrained('t5-base')

t5_model.summary()

All PyTorch model weights were used when initializing TFT5ForConditionalGeneration.

All the weights of TFT5ForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFT5ForConditionalGeneration for predictions without further training.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for y

Model: "tft5_for_conditional_generation"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 shared (Embedding)          multiple                  24674304  
                                                                 
 encoder (TFT5MainLayer)     multiple                  109628544 
                                                                 
 decoder (TFT5MainLayer)     multiple                  137949312 
                                                                 
Total params: 222903552 (850.31 MB)
Trainable params: 222903552 (850.31 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
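
The three rows in the summary add up to more than the reported total, which can look like an error. One reading that is consistent with the numbers is that the encoder and decoder rows each include the shared embedding, so it should be counted only once. The sketch below checks that arithmetic using the counts printed above. The vocabulary size (32,128) and embedding width (768) are assumptions about the t5-base configuration, not values shown in the output.

```python
# Parameter counts copied from the model summary above.
shared = 24_674_304
encoder = 109_628_544
decoder = 137_949_312
total = 222_903_552

# Assumption: t5-base has a 32,128-token vocabulary and 768-dimensional embeddings.
vocab_size, d_model = 32_128, 768
embedding_matches = vocab_size * d_model == shared
print("embedding = vocab_size * d_model:", embedding_matches)

# Assumption: the encoder and decoder rows each include the shared embedding,
# so subtract it from both and count it once.
encoder_only = encoder - shared
decoder_only = decoder - shared
reconstructed_total = shared + encoder_only + decoder_only
print("reconstructed total matches:", reconstructed_total == total)

# Size in memory at 4 bytes per float32 parameter, in units of 1024**2 bytes.
size_mb = total * 4 / 1024**2
print("size:", round(size_mb, 2), "MB")
```

Under these assumptions the reconstructed total equals the reported 222903552 parameters, and the size works out to the 850.31 MB shown in the summary.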


In [2]:
t5_context_text = """context: Hyperbaric (high-pressure) medicine uses special oxygen 
chambers to increase the partial pressure of O 2 around the patient and, when needed, 
the medical staff. Carbon monoxide poisoning, gas gangrene, and decompression sickness 
(the ’bends’) are sometimes treated using these devices. Increased O 2 concentration 
in the lungs helps to displace carbon monoxide from the heme group of hemoglobin. 
Oxygen gas is poisonous to the anaerobic bacteria that cause gas gangrene, so increasing 
its partial pressure helps kill them. Decompression sickness occurs in divers who 
decompress too quickly after a dive, resulting in bubbles of inert gas, mostly nitrogen 
and helium, forming in their blood. Increasing the pressure of O 2 as soon as possible 
is part of the treatment."""

In [3]:
t5_question_text = """question: What does increased oxygen concentrations in the patient’s
lungs displace? """

In [4]:
t5_qa_input_text = t5_question_text + t5_context_text
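
The concatenation above relies on each string already carrying its own prefix and on the trailing space after the question. As a minimal sketch, the same input format can be built by a small helper that adds the prefixes itself. The name `build_t5_qa_input` is hypothetical, and collapsing line breaks into single spaces is a choice made here for readability; whether that changes the model's answer is not tested.

```python
def build_t5_qa_input(question: str, context: str) -> str:
    """Join a question and a context into one 'question: ... context: ...' string."""
    # Collapse line breaks and repeated spaces left over from triple-quoted strings.
    question = " ".join(question.split())
    context = " ".join(context.split())
    return f"question: {question} context: {context}"


# Usage with a shortened version of the passage above.
qa_input = build_t5_qa_input(
    "What does increased oxygen concentration in the patient's\nlungs displace?",
    "Increased O 2 concentration in the lungs helps to displace carbon monoxide "
    "from the heme group of hemoglobin.",
)
print(qa_input)
```

With a helper like this, the raw question and context can be stored without prefixes and reused for other prompts.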

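Here's the output. We tokenize the combined input, let the model generate an answer, and decode the generated token IDs back into text: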

In [7]:
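# Tokenize the combined question and context into TensorFlow tensors
t5_inputs = t5_tokenizer([t5_qa_input_text], return_tensors='tf')

# Generate the answer token IDs
t5_answer_ids = t5_model.generate(t5_inputs['input_ids'])

# Decode each generated sequence back into text
print([t5_tokenizer.decode(g, skip_special_tokens=True,
                           clean_up_tokenization_spaces=False) for g in t5_answer_ids])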

['carbon monoxide']
