## Text Generation using GPT2 model

### Project: Ads generation from product description

In [1]:
!pip3 install datasets transformers accelerate


In [2]:
import torch
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer
from datasets import Dataset

In [3]:
model_name = "gpt2-large"  # or "gpt2" for the smallest checkpoint
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = TFGPT2LMHeadModel.from_pretrained(model_name, pad_token_id=tokenizer.eos_token_id)

Metal device set to: Apple M2 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB



2023-10-12 14:06:29.041013: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-10-12 14:06:29.041549: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
All PyTorch model weights were used when initializing TFGPT2LMHeadModel.

All the weights of TFGPT2LMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [4]:
def generate_advertisement(product_description, max_length=100):
    input_text = "Product: " + product_description + "\nAdvertisement:"

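    # Tokenize the prompt into input ids (returned as a TensorFlow tensor)
    input_ids = tokenizer.encode(input_text, return_tensors="tf")

    # Generate a continuation; with default settings this is greedy decoding,
    # and max_length counts the prompt tokens as well as the new tokens
    output = model.generate(input_ids, max_length=max_length)

    # Decode the ids back into text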
    generated_ads = []
    for sample in output:
        generated_ad = tokenizer.decode(sample, skip_special_tokens=True)
        generated_ads.append(generated_ad)

    return generated_ads

In [5]:
product_description = "Introducing our latest smartphone model, with a powerful processor and stunning camera features."

# Generate advertisements
generated_ads = generate_advertisement(product_description, max_length=150)

### Predicted response

In [6]:
generated_ads

['Product: Introducing our latest smartphone model, with a powerful processor and stunning camera features.\nAdvertisement:\nThe new model is the first to feature a 5.5-inch display, a Qualcomm Snapdragon 810 processor, 4GB of RAM, and a 13MP rear camera with a f/2.0 aperture. The phone also features a fingerprint sensor, a 3,000mAh battery, and a 3,000mAh removable battery. The phone will be available in two colors: black and white.\nThe phone will be available in China starting on September 1st, and will be priced at 2,499 yuan (about $400).']
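
Note that the returned string still contains the prompt (`Product: ...` and the `Advertisement:` marker). A minimal sketch of how the advertisement text alone could be pulled out is shown here; `extract_ad` is a hypothetical helper introduced for illustration, not part of `transformers`, and it assumes the prompt format used in `generate_advertisement`.

```python
def extract_ad(generated_text, marker="Advertisement:"):
    """Return the text after the first `marker`, or the whole text if the marker is absent."""
    _, found, ad = generated_text.partition(marker)
    # `found` is an empty string when the marker does not occur in the text
    return ad.strip() if found else generated_text.strip()


# Usage with a shortened, made-up string in the same prompt format
sample = "Product: A phone.\nAdvertisement:\nThe new model is fast."
print(extract_ad(sample))
```

In the notebook this could be applied as `[extract_ad(ad) for ad in generated_ads]`.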

## Using Greedy Search and Beam Search

With Greedy search, the word with the highest probability is predicted as the next word:

### $w_t = \arg\max_{w} P(w \mid w_{1:t-1})$
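
The argmax rule above can be sketched on a toy model. This is only an illustration under simplifying assumptions: `toy_model` is a made-up table in which the next-word distribution depends on the previous word alone, whereas GPT-2 conditions on the whole prefix $w_{1:t-1}$. The function name `greedy_decode` is hypothetical.

```python
def greedy_decode(next_token_probs, start_token, max_steps=10, end_token="<eos>"):
    """At every step, append the single most probable next token."""
    sequence = [start_token]
    for _ in range(max_steps):
        probs = next_token_probs.get(sequence[-1])
        if not probs:
            break  # the toy model has no continuation for this token
        next_token = max(probs, key=probs.get)  # argmax over candidate tokens
        sequence.append(next_token)
        if next_token == end_token:
            break
    return sequence


# Made-up probabilities, for illustration only
toy_model = {
    "the": {"phone": 0.6, "camera": 0.4},
    "phone": {"is": 0.7, "<eos>": 0.3},
    "is": {"fast": 0.55, "new": 0.45},
    "fast": {"<eos>": 1.0},
}

print(greedy_decode(toy_model, "the"))
```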

Beam search extends greedy search: instead of keeping only the single most probable word, the model keeps the `num_beams` most probable hypotheses at each time step, so it can compare alternative paths as it generates text and return the sequence with the highest overall probability. We can also add an n-gram penalty by setting `no_repeat_ngram_size=3`, which ensures that no 3-gram appears more than once in the generated text.
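
To make the difference from greedy search concrete, here is a small sketch of beam search over a toy model. It is not how `transformers` implements it; `beam_search` and `toy_model` are hypothetical names, the probabilities are made up, and the toy model conditions only on the previous word.

```python
import math


def beam_search(next_token_probs, start_token, num_beams=2, steps=2):
    """Keep the `num_beams` best partial sequences, scored by summed log-probability."""
    beams = [([start_token], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            probs = next_token_probs.get(tokens[-1], {})
            if not probs:
                candidates.append((tokens, score))  # nothing to extend, keep as is
                continue
            for token, p in probs.items():
                candidates.append((tokens + [token], score + math.log(p)))
        # Keep only the highest-scoring hypotheses for the next step
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


# Made-up probabilities, for illustration only
toy_model = {
    "the": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
}

best_tokens, best_score = beam_search(toy_model, "the", num_beams=2)[0]
greedy_tokens, greedy_score = beam_search(toy_model, "the", num_beams=1)[0]
print(best_tokens, round(math.exp(best_score), 2))
print(greedy_tokens, round(math.exp(greedy_score), 2))
```

With `num_beams=1` the procedure reduces to greedy search and commits to "nice" in the first step; with two beams it also keeps "dog" alive and finds the more probable sequence overall.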

In [7]:
def generate_advertisement_greedy(product_description):
    input_text = "Product: " + product_description + "\nAdvertisement:"

    # Tokenize the prompt; decoding options belong to generate(), not encode()
    input_ids = tokenizer.encode(input_text, return_tensors="tf")

    # Generate text with beam search: keep 7 beams, block repeated 3-grams,
    # and return the 5 highest-scoring sequences
    output = model.generate(
        input_ids,
        max_length=150,
        num_beams=7,
        no_repeat_ngram_size=3,
        num_return_sequences=5,
        early_stopping=True,
    )

    # Decode the ids back into text
    generated_ads = []
    for sample in output:
        generated_ad = tokenizer.decode(sample, skip_special_tokens=True)
        generated_ads.append(generated_ad)

    return generated_ads

In [8]:
generated_ads_greedy = generate_advertisement_greedy(product_description)

In [9]:
generated_ads_greedy

With `num_return_sequences=5`, `generated_ads_greedy` is a list of five candidate advertisements, one per returned beam hypothesis.
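
The effect of an n-gram penalty such as `no_repeat_ngram_size=3` can be illustrated with a small sketch. This is a simplified, hypothetical helper (`banned_next_tokens`) working on word lists, not the library's implementation: given the words generated so far, it returns the words that would complete a 3-gram that has already appeared and would therefore be blocked.

```python
def banned_next_tokens(tokens, ngram_size=3):
    """Return the tokens that would repeat an n-gram already present in `tokens`.

    Assumes ngram_size >= 2.
    """
    prefix = tuple(tokens[-(ngram_size - 1):])  # the last n-1 generated tokens
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        ngram = tokens[i:i + ngram_size]
        # If an earlier n-gram starts with the current prefix, its last token is banned
        if tuple(ngram[:-1]) == prefix:
            banned.add(ngram[-1])
    return banned


words = "the phone has a battery and the phone has".split()
print(banned_next_tokens(words))
```

Here "phone has a" already occurs, so "a" may not follow the second "phone has".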