# Text Generation
mkultra is designed to work with the generation and sampling tools in Hugging Face transformers.

Load a SoftPrompt object and concatenate it into your context string. It appears there as a human-readable tag that tokenizes to the same number of tokens as the soft prompt contains.

In [1]:
#@title Setup for Colab only
#!pip install transformers
#!pip install git+https://github.com/corolla-johnson/mkultra.git#egg=mkultra --log PIP_LOG

In [2]:
from transformers.pipelines import pipeline
from mkultra.inference import GPT2SoftPromptLM
from mkultra.tokenizers import GPT2SPTokenizerFast
from mkultra.soft_prompt import SoftPrompt
import torch

In [3]:
# You'll need to instantiate mkultra's model and tokenizer classes.
model = GPT2SoftPromptLM.from_pretrained("gpt2")
tokenizer = GPT2SPTokenizerFast.from_pretrained("gpt2")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

In [4]:
# SoftPrompts may be loaded in several ways.
# sp = SoftPrompt.from_json(json_str)
# sp = SoftPrompt.from_inputs_embeds(inputs_embeds)
# sp = SoftPrompt.from_tuning_model(model)
# sp = SoftPrompt.from_string("With the court firmly balkanized into three distinct factions, Princess Charlotte had her work cut out for her.",
#                             model=model, tokenizer=tokenizer)

sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_gpt2.json")

In [5]:
# Information about a SoftPrompt can be printed with
print(sp)

neuromancer-gpt2 (2021-06-10 02:46:17.708191)
Length: 100
UUID:   299ce582-5a2e-427d-a8e8-8c17aef3b458
Description:
Trained for 30 epochs on the full text of William Gibson's 'Neuromancer'


In [6]:
# Use 'get_tag_str()' to get a tag string that you can add to your context.
prompt = sp.get_tag_str() + "Armitage"

# The addition operator also works:
# prompt = sp + "Armitage"

In [7]:
# The tag string conveniently contains the SP's name, part of its UUID,
# and a series of '@'s, each of which represents one soft token.
print(prompt)

<neuromancer-gpt2-299ce582-@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@>Armitage
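
Because each '@' stands for one soft token, the number of soft tokens in a context string can be read off the text without loading the model. The sketch below assumes the tag shape seen in the printed prompt above (one angle-bracketed tag ending in '@' per soft token). That shape is inferred from the output rather than taken from mkultra's documentation, and `count_soft_tokens` is a hypothetical helper, not part of mkultra.

```python
import re

# Assumed tag shape, inferred from the printed prompt: every soft token is one
# angle-bracketed tag ending in '@', e.g. "<name-1234abcd-@>" or "<@>".
SOFT_TOKEN_TAG = re.compile(r"<[^<>]*@>")


def count_soft_tokens(text):
    """Count the soft-token tags embedded in a context string."""
    return len(SOFT_TOKEN_TAG.findall(text))


# Hypothetical tag string: one named tag followed by four bare tags.
demo = "<demo-sp-1234abcd-@>" + "<@>" * 4 + "Armitage"
print(count_soft_tokens(demo))  # 5
```

Tokenizing with mkultra's tokenizer remains the authoritative way to measure length; this only gives a quick check on plain strings.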


In [8]:
# The tag string tokenizes to the correct length, so you can use it to budget your context.
prompt_len = len(tokenizer.encode(prompt))
print(f"Length of soft prompt: {len(tokenizer.encode(sp.get_tag_str()))}")
print(f"Length of full prompt: {prompt_len}")

Length of soft prompt: 100
Length of full prompt: 103
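
The token counts above are what you would budget against the model's context window. As a minimal sketch, assuming the standard GPT-2 limit of 1024 positions, the room left for generation can be computed as follows. `generation_budget` is a hypothetical helper, not part of mkultra.

```python
GPT2_CONTEXT_SIZE = 1024  # GPT-2's maximum sequence length (n_positions)


def generation_budget(prompt_len, context_size=GPT2_CONTEXT_SIZE):
    """Return how many new tokens fit after a prompt of `prompt_len` tokens."""
    if prompt_len < 0:
        raise ValueError("prompt_len must be non-negative")
    # Soft tokens occupy positions like any other token, so they are already
    # included in prompt_len when it is measured with the tokenizer.
    return max(context_size - prompt_len, 0)


print(generation_budget(103))  # 921
```

With the 103-token prompt from this notebook, up to 921 further tokens fit, so the `prompt_len + 100` limit used below is well within range.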


In [9]:
# Generation is as usual.
output = generator(prompt,
                   do_sample=True,
                   min_length=prompt_len+100,
                   max_length=prompt_len+100,
                   repetition_penalty=1.7,
                   top_p=0.8,
                   temperature=0.7,
                   use_cache=True,
                   return_full_text=True)

print(output[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
<neuromancer-gpt2-299ce582-@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@>Armitage was a sort of black man with long, white hair and the sharpest bangs that made him look like he'd been in jail. The only thing missing from his visor were eyes: His face had faded into grey as if it hadn't grown old enough to be recognizable by its coloration; there's no trace on any other hand.""I've seen you before," Case said softly over her shoulder, "but this is where I saw your first death — what do we call my
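
With `return_full_text=True`, the generated text starts with the tag string, which is rarely wanted in the final output. A minimal sketch for removing a known prefix follows. `strip_prefix` is a hypothetical helper written for this example, not an mkultra or transformers function.

```python
def strip_prefix(generated_text, prefix):
    """Return `generated_text` without `prefix` if it starts with it."""
    if generated_text.startswith(prefix):
        return generated_text[len(prefix):]
    # Leave the text untouched when the prefix is not a literal match.
    return generated_text


# Hypothetical strings standing in for a tag string and a pipeline result.
tag = "<demo-sp-1234abcd-@><@><@>"
full_text = tag + "Armitage was waiting."
print(strip_prefix(full_text, tag))  # Armitage was waiting.
```

In this notebook the call would be `strip_prefix(output[0]['generated_text'], sp.get_tag_str())`, which keeps "Armitage" and the continuation while dropping the tags.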


In [19]:
# It is also fine to use generate() instead of a pipeline.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

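# return_full_text is a pipeline-only argument, so it is not passed here;
# generate() always returns the prompt tokens followed by the continuation.
output = model.generate(input_ids,
                        do_sample=True,
                        min_length=prompt_len+100,
                        max_length=prompt_len+100,
                        repetition_penalty=1.7,
                        top_p=0.8,
                        temperature=0.7,
                        use_cache=True)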

print(tokenizer.decode(output[0]))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
<neuromancer-gpt2-299ce582-@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@>Armitage's the best he'd ever seen. "I don't know how it got here, but I've been doing this for five years.""You mean like a hundred? You just started getting into trouble with them?" The man was talking about his grandfather in front of him; they were both standing next to each other at one end and their eyes met on Case as if two people had stared down from above: an old woman wearing jeans that she held tightly against her back while leaning forward


In [None]:
# If you write your own sampler instead of using a pipeline or generate(),
# be aware that the logits of the special tokens will be very high.
# model.generate() and pipelines account for this,
# but a custom sampler must exclude the special tokens itself.

output = model(input_ids)
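
One way to do the exclusion described in the comments above is to drop the special token ids before the softmax, so they can never be drawn. The sketch below is framework-free and works on a plain list of logits. `sample_excluding` and `excluded_ids` are hypothetical names, and how to obtain the set of special token ids from mkultra's tokenizer is not shown in this notebook, so treat that part as an assumption to check against the library.

```python
import math
import random


def sample_excluding(logits, excluded_ids, temperature=0.7, top_p=0.8, rng=random):
    """Sample one token id from `logits`, never returning an id in `excluded_ids`."""
    excluded = set(excluded_ids)
    # Keep only the ids that are allowed to be sampled.
    candidates = [i for i in range(len(logits)) if i not in excluded]
    if not candidates:
        raise ValueError("every token id is excluded")

    # Temperature-scaled softmax over the allowed ids
    # (the maximum is subtracted for numerical stability).
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    ranked = sorted(((w / total, i) for w, i in zip(weights, candidates)),
                    reverse=True)

    # Nucleus (top-p) filtering: keep the smallest set of most likely ids
    # whose cumulative probability reaches top_p.
    kept, mass = [], 0.0
    for prob, token_id in ranked:
        kept.append((prob, token_id))
        mass += prob
        if mass >= top_p:
            break

    ids = [token_id for _, token_id in kept]
    probs = [prob for prob, _ in kept]
    return rng.choices(ids, weights=probs, k=1)[0]


# Toy logits: ids 2 and 3 play the role of special tokens with very high logits.
toy_logits = [0.5, 1.0, 40.0, 55.0]
next_id = sample_excluding(toy_logits, excluded_ids=[2, 3], rng=random.Random(0))
print(next_id in (0, 1))  # True
```

Assuming the model returns the standard transformers output object, the logits for the next token would come from something like `output.logits[0, -1].tolist()`. With real tensors the same masking is usually done by setting the excluded positions to negative infinity before the softmax.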