# Problem Statement:

Freeform text generation can help marketers create long-form articles, blog posts, emails, and more. The user provides an input such as a phrase, sentence, or paragraph, and can optionally control aspects of the output. The natural language generation (NLG) system then produces an output as a continuation of the input.


# Approach: 

Deep learning models have been used for text generation. Approaches include Generative Adversarial Networks (GANs) trained with the Diehl-Martinez-Kamalu (DMK) loss function, which forces the output to include specific keywords, and the Domain-Constrained Keyword Generator (DCKG), which generates diversified keywords under domain constraints. This notebook uses a pretrained GPT-2 language model with beam search.

# Results: 

The system should generate new text of more than 750 words from a context or reference of around 2-3 sentences provided by the user.
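One way to reach a word target of this size is to call a generator repeatedly and feed the tail of the text back in as the next prompt. The sketch below is only an illustration of that loop: `extend_until`, `generate_fn`, `context_words`, and `max_rounds` are hypothetical names that do not appear in the notebook, and `generate_fn` is assumed to return its input followed by a continuation.

```python
from typing import Callable


def extend_until(prompt: str,
                 generate_fn: Callable[[str], str],
                 min_words: int = 750,
                 context_words: int = 50,
                 max_rounds: int = 100) -> str:
    """Grow `prompt` by repeated generation until it has at least `min_words` words.

    Assumption: generate_fn(context) returns the context followed by new text.
    """
    text = prompt.strip()
    for _ in range(max_rounds):
        words = text.split()
        if len(words) >= min_words:
            break
        # Use only the last few words as context so the prompt stays short.
        context = " ".join(words[-context_words:])
        generated = generate_fn(context)
        # Keep only the newly generated part when the context is echoed back.
        if generated.startswith(context):
            continuation = generated[len(context):].strip()
        else:
            continuation = generated.strip()
        if not continuation:
            break  # nothing new was produced, so stop instead of looping forever
        text = text + " " + continuation
    return text
```

With a real model, `generate_fn` could be any function that maps a prompt string to prompt plus continuation. The `max_rounds` cap is a safety limit, so the result can still be shorter than `min_words` if the generator keeps returning very little text.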

In [2]:
! pip install -q gradio
! pip install -q git+https://github.com/huggingface/transformers.git


In [4]:
# importing libraries

import gradio as gr
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer


Loading Model

In [5]:
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = TFGPT2LMHeadModel.from_pretrained('gpt2', pad_token_id=tokenizer.eos_token_id)

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.

In [6]:
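def generate_text(inp):
  # Encode the prompt as a TensorFlow tensor of token ids.
  input_ids = tokenizer.encode(inp, return_tensors='tf')
  # Beam search with 5 beams; max_length counts tokens (prompt included), not words.
  beam_output = model.generate(input_ids, max_length=100, num_beams=5,
                               no_repeat_ngram_size=2, early_stopping=True)
  output = tokenizer.decode(beam_output[0], skip_special_tokens=True,
                            clean_up_tokenization_spaces=True)
  # Drop the trailing incomplete sentence; return the text unchanged if it has no period.
  return ".".join(output.split(".")[:-1]) + "." if "." in output else output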
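The last line of `generate_text` cuts the output at its final period so the result does not end mid-sentence. A slightly more general version of that trimming step is sketched below. `trim_to_last_sentence` is a hypothetical helper, not part of the notebook, and it assumes that `.`, `!`, and `?` always end a sentence.

```python
import re


def trim_to_last_sentence(text: str) -> str:
    """Cut `text` right after its last '.', '!' or '?'.

    Text without any of these marks is returned unchanged.
    """
    matches = list(re.finditer(r"[.!?]", text))
    if not matches:
        return text
    return text[: matches[-1].end()]
```

Because of that assumption, a decimal point or an abbreviation such as "e.g." also counts as a sentence end, so the cut can land inside a sentence in those cases.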

Creating the interface

In [None]:
# gr.Textbox replaces the deprecated gr.outputs.Textbox.
output_text = gr.Textbox()
gr.Interface(fn=generate_text, inputs="textbox", outputs=output_text,
             title="Text Generator",
             description="GPT-2 is an unsupervised language model that "
                         "can generate coherent text. Enter a sentence and see what it generates. "
                         "Takes around 20 s to run.").launch(debug=True)



Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Note: opening Chrome Inspector may crash demo inside Colab notebooks.

To create a public link, set `share=True` in `launch()`.

