# Text Generation

Text generation is the NLP task of producing a coherent sequence of words, usually from a language model. The current leading methods, most notably OpenAI’s GPT-2 and GPT-3, feed a sequence of seed tokens (words, subwords, or characters) into a pre-trained language model, which then extends that seed into a longer sequence of text. AdaptNLP provides simple methods to fine-tune these state-of-the-art models and generate text for any use case.
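
To make the "seed in, sequence out" idea concrete, here is a minimal sketch of a generation loop. It is not how GPT-2 or AdaptNLP work internally: the `BIGRAM_MODEL` table and the `generate_greedy` function are made up for illustration, and the toy model looks only at the last word, whereas models such as GPT-2 condition on the whole preceding sequence of subword tokens.

```python
from typing import Dict, List

# Toy "language model": for each token, the probability of each possible next token.
# These numbers are invented for illustration only.
BIGRAM_MODEL: Dict[str, Dict[str, float]] = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}


def generate_greedy(seed: List[str], num_tokens_to_produce: int) -> List[str]:
    """Extend the seed one token at a time, always picking the most likely next token."""
    tokens = list(seed)
    if not tokens:
        return tokens  # nothing to condition on
    for _ in range(num_tokens_to_produce):
        next_probs = BIGRAM_MODEL.get(tokens[-1])
        if not next_probs:
            break  # the toy model has no continuation for this token
        tokens.append(max(next_probs, key=next_probs.get))
    return tokens


print(" ".join(generate_greedy(["the"], 5)))  # prints "the cat sat down"
```

The loop stops early here because the toy table has no entry for "down"; a real model keeps going until it reaches the requested number of tokens or emits an end-of-sequence token.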

Below, we'll walk through how to use AdaptNLP's `EasyTextGenerator` class to generate text that completes a given string.

## Getting Started with `EasyTextGenerator`

We'll start by importing the `EasyTextGenerator` class from AdaptNLP.

After that, we set some example text that we'll use further down, and then instantiate the generator.


In [1]:
from adaptnlp import EasyTextGenerator

text = "China and the U.S. will begin to"

generator = EasyTextGenerator()

### Generate with `generate(text: str, model_name_or_path: str, mini_batch_size: int, num_tokens_to_produce: int, **kwargs)`
Now that we have the generator instantiated, we are ready to load in a model and generate text
with the built-in `generate()` method.

This method takes the parameters `text`, `model_name_or_path`, `mini_batch_size`, and `num_tokens_to_produce`, as well as optional keyword arguments
accepted by the `Transformers.PreTrainedModel.generate()` method.

Note: You can set `model_name_or_path` to any of Transformers' pretrained generation models with language model heads.
Transformers models are located at [https://huggingface.co/models](https://huggingface.co/models). You can also pass in
the path of a custom-trained Transformers `xxxWithLMHead` model.
 
The method returns a list of strings. For now, there is no batch processing with the text generation module.

Here is one example using the gpt2 model:

In [2]:
# Generate
generated_text = generator.generate(text, model_name_or_path="gpt2", mini_batch_size=2, num_tokens_to_produce=50)

print(generated_text)


Special tokens have been added in the vocabulary, make sure the associated word emebedding are fine-tuned or trained.
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Generating: 100%|██████████| 1/1 [00:00<00:00,  2.24it/s]

['China and the U.S. will begin to see the effects of the new sanctions on the Russian economy.\n\n"The U.S. is going to be the first to see the effects of the new sanctions," said Michael O\'Hanlon, a senior fellow at the Center for Strategic']
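
The output above repeats the prompt and, because decoding is deterministic by default, tends to loop on the same phrase. The sketch below shows one way to wrap the `generate()` call so that extra keyword arguments reach the underlying Transformers `generate()` method and only the newly generated text is returned. The helper name `generate_continuations` is hypothetical. `do_sample`, `top_k`, and `temperature` are parameters of the Transformers `generate()` method, and this assumes AdaptNLP forwards them unchanged, as the description of `**kwargs` above suggests.

```python
from typing import Any, List


def generate_continuations(
    generator: Any,
    prompt: str,
    model_name_or_path: str = "gpt2",
    num_tokens_to_produce: int = 50,
    **generate_kwargs: Any,
) -> List[str]:
    """Call generator.generate() and return only the text that follows the prompt."""
    outputs = generator.generate(
        prompt,
        model_name_or_path=model_name_or_path,
        mini_batch_size=2,
        num_tokens_to_produce=num_tokens_to_produce,
        **generate_kwargs,  # e.g. do_sample=True, top_k=50, temperature=0.7
    )
    continuations = []
    for full_text in outputs:
        # Drop the prompt when the output starts with it; otherwise keep the text as-is.
        if full_text.startswith(prompt):
            full_text = full_text[len(prompt):]
        continuations.append(full_text.strip())
    return continuations


# Usage with the generator and text defined earlier (requires AdaptNLP and a model download):
# generate_continuations(generator, text, do_sample=True, top_k=50, temperature=0.7)
```

With sampling enabled the result differs from run to run, so no expected output is shown here.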



