# Text Generation Apps

Link to TDS [article](https://towardsdatascience.com/build-a-text-generator-web-app-in-under-50-lines-of-python-9b63d47edabb)

#### This notebook contains two variations of the text generation app.
- Basic Application
- Advanced Application with added probabilities 

## Part 1: Setting up the Model

In [1]:
pip install transformers 



In [2]:
# Loading model dependencies
import numpy as np
import torch
import torch.nn.functional as F
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from random import choice

In [3]:
# Downloading the model
tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

### Writing the prediction function

In [4]:
def get_pred(text, model, tok, p=0.7):
    # 1. tokenize/encode the input text and add a batch dimension
    input_ids = torch.tensor(tok.encode(text)).unsqueeze(0)
    # 2. extract the logits for the next token (last position); gradients are not needed for inference
    with torch.no_grad():
        logits = model(input_ids)[0][:, -1]
    # 3. apply softmax to the logits so the probabilities of all tokens add up to 1
    probs = F.softmax(logits, dim=-1).squeeze()
    # 4. sort the token indices by probability in descending order
    idxs = torch.argsort(probs, descending=True)
    # 5. collect the most likely tokens until their probabilities sum to more than p,
    #    then pick one of them uniformly at random
    res, cumsum = [], 0.
    for idx in idxs:
        res.append(idx)
        cumsum += probs[idx]
        if cumsum > p:
            pred_idx = idxs.new_tensor([choice(res)])
            break
    # 6. convert the chosen token id back into text
    pred = tok.convert_ids_to_tokens(int(pred_idx))
    return tok.convert_tokens_to_string([pred])
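
Steps 3 to 5 of `get_pred` can be tried without downloading a model. The following is a minimal sketch in plain Python, assuming a made-up four-token vocabulary and made-up logits; the helper names `softmax` and `nucleus` are hypothetical and not part of the notebook.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nucleus(probs, p=0.7):
    """Indices of the most likely entries, kept until their total exceeds p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, cumsum = [], 0.0
    for i in order:
        chosen.append(i)
        cumsum += probs[i]
        if cumsum > p:
            break
    return chosen

# Hypothetical toy vocabulary and scores, for illustration only.
vocab = [" great", " fun", " no", " long"]
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)
candidates = nucleus(probs, p=0.7)
# Uniform pick among the candidates, mirroring choice(res) in get_pred.
next_token = vocab[random.choice(candidates)]
```

With these toy scores the first two tokens already cover more than 70% of the probability mass, so only they remain candidates. Note that the pick among candidates is uniform, as in `get_pred`; it is not weighted by probability.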

In [5]:
# testing it out
get_pred("wow this tutorial is", model, tok, p = 0.7) 

' no'

In [6]:
import panel as pn
pn.extension() # loading panel's extension for jupyter compatibility

## Advanced Application

Now, we will build upon the basic application and add some more features, namely the ability for the user to select a token from a list of the most probable next tokens. This code is independent of the previous application's code, although it reuses the same model and tokenizer.

In [7]:
# Redefining the prediction function: it now returns a list of the most likely next tokens
# instead of a single token, along with their probabilities so we can show them to
# the user as well.

def get_preds(text, model, tok, p=0.7):
    input_ids = torch.tensor(tok.encode(text)).unsqueeze(0)
    logits = model(input_ids)[0][:, -1]
    probs = F.softmax(logits, dim=-1).squeeze()
    idxs = torch.argsort(probs, descending=True)
    res, pred_probs = [], []
    # collect the most likely tokens until their probabilities sum to more than p
    for idx in idxs:
        res.append(idx)
        pred_probs.append(probs[idx])
        if sum(pred_probs) > p:
            break
    preds = [tok.convert_ids_to_tokens(int(i)) for i in res]
    return [tok.convert_tokens_to_string([pred]) for pred in preds], pred_probs
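
The loop in `get_preds` calls `sum(pred_probs)` on every pass, which recomputes the total from scratch each time. A running total finds the same cutoff in a single pass. The sketch below shows the idea in plain Python, assuming the probabilities are already sorted in descending order; `top_p_cutoff` is a hypothetical helper, not part of the notebook.

```python
from bisect import bisect_right
from itertools import accumulate

def top_p_cutoff(sorted_probs, p=0.7):
    """Number of leading entries needed for the running total to exceed p.

    Assumes sorted_probs is already in descending order.
    """
    running = list(accumulate(sorted_probs))
    # bisect_right returns the first position whose running total is > p;
    # the cap covers the case where the total never exceeds p.
    return min(bisect_right(running, p) + 1, len(sorted_probs))

# Hypothetical sorted probabilities, for illustration only.
sorted_probs = [0.5, 0.25, 0.125, 0.125]
k = top_p_cutoff(sorted_probs, p=0.7)
shortlist = sorted_probs[:k]
```

On tensors, `torch.cumsum` could likely play the role of `accumulate` here, though for the handful of tokens that usually make up the shortlist the original loop is fast enough.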

In [8]:
text_input = pn.widgets.TextInput(value="",width=400)
generated_text = pn.pane.Markdown(text_input.value)
start_button = pn.widgets.Button(name="Generate",button_type="primary")

# creating radio buttons for the token options along with probabilities 
options = [""]
radio_button = pn.widgets.RadioButtonGroup(options=options,height=30,width=500)
prob_button = pn.widgets.RadioButtonGroup(options=options,height=30,width = 500)

# since prob_button only informs the user of the probabilities, it doesn't need to be enabled
prob_button.disabled=True

# new click callback function which handles updating the radio buttons
def click_cb(event):
    if radio_button.value == "<|endoftext|>": 
        start_button.disabled = True
        return None
    generated_text.object += radio_button.value
    preds, probs = get_preds(generated_text.object, model, tok)
    radio_button.options = preds[:10]
    radio_button.value = radio_button.options[np.random.randint(0,len(radio_button.options))]
    prob_button.options = [str(round(float(i),2)) for i in probs[:10]]

start_button.on_click(click_cb)

# callback function for when the text input changes. Essentially, we need to reset our options.
def text_change_cb(event):
    generated_text.object = event.new
    start_button.disabled = False
    radio_button.options = options
    radio_button.value = radio_button.options[0]
    prob_button.options = options

# tying the callback function to the text_input widget
text_input.param.watch(text_change_cb, 'value')

# preparing the app
app = pn.Column(text_input,radio_button,prob_button,start_button,generated_text)
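
The flow of `click_cb` can be followed without Panel or the model. The sketch below mirrors its steps on a plain dictionary. Everything in it is an assumption made for illustration: `fake_preds` is a hypothetical stand-in for `get_preds`, the `state` keys stand in for the widgets, and the preselected option is always the first one rather than a random one.

```python
END = "<|endoftext|>"

def fake_preds(text):
    """Hypothetical stand-in for get_preds: returns (tokens, probabilities)."""
    if text.endswith(" fun"):
        return [END], [1.0]
    return [" fun", " great"], [0.6, 0.3]

def click(state, predict=fake_preds, pick=lambda opts: opts[0]):
    """Mirror click_cb on a plain dict instead of Panel widgets."""
    if state["selected"] == END:
        state["disabled"] = True  # stop generating at end of text
        return state
    state["text"] += state["selected"]  # commit the selected token
    preds, probs = predict(state["text"])
    state["options"] = preds[:10]  # show at most ten candidates
    state["selected"] = pick(state["options"])  # preselect one candidate
    state["probs"] = [str(round(float(x), 2)) for x in probs[:10]]
    return state

state = {"text": "this is", "selected": "", "options": [""],
         "probs": [""], "disabled": False}
click(state)  # first click: nothing selected yet, so only the options refresh
click(state)  # second click: " fun" is appended to the text
click(state)  # third click: end-of-text is selected, so the button disables
```

The first click appends an empty string because no token has been selected yet. This is why the app needs one click before any text appears.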

In [9]:
# Panel spacer object to center our title
h_spacer = pn.layout.HSpacer()

# defining the title and description 
title = pn.pane.Markdown("# **Text Generator**")
desc = pn.pane.HTML("<i>Welcome to the text generator! To get started, simply enter some starting text below, click Generate a few times and watch it go! You can also choose which token gets added next using the radio buttons. The probability of each option is shown underneath. Give it a shot!</i>")

In [10]:
# setting up the final app
final_app = pn.Column(pn.Row(h_spacer, title, h_spacer), desc, app)
final_app