# Text Generation
mkultra works with the standard generation and sampling tools in transformers, so text generation needs little extra setup.

You can load a SoftPrompt object and concatenate it into your context string. It appears there as a human-readable tag that tokenizes to the same number of tokens as the soft prompt itself.

In [1]:
#@title Setup for Colab only
#!pip install transformers
#!pip install git+https://github.com/corolla-johnson/mkultra.git#egg=mkultra --log PIP_LOG

In [2]:
from transformers.pipelines import pipeline
from prompt_tuner.inference import LlamaSoftPromptLM
from prompt_tuner.tokenizers import LlamaSPTokenizerFast
from prompt_tuner.soft_prompt import SoftPrompt
import torch

In [3]:
# You'll need to instantiate mkultra's model and tokenizer classes.
model = LlamaSoftPromptLM.from_pretrained("NousResearch/Llama-2-7b-hf")
tokenizer = LlamaSPTokenizerFast.from_pretrained("NousResearch/Llama-2-7b-hf")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

In [4]:
# SoftPrompts may be loaded in several ways.
# sp = SoftPrompt.from_json(json_str)
# sp = SoftPrompt.from_inputs_embeds(inputs_embeds)
# sp = SoftPrompt.from_tuning_model(model)
sp = SoftPrompt.from_string("With the court firmly balkanized into three distinct factions, Princess Charlotte had her work cut out for her.",
                             model=model, tokenizer=tokenizer)

#sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_NousResearch/Llama-2-7b-hf.json")

In [5]:
# Information about a SoftPrompt can be printed with
print(sp)

FromString (2021-06-20 20:27:51.615463)
Length: 22
UUID:   9361b46c-0cfd-41cd-b418-2868fa735fb6
Description:
Created from string 'With the court firmly balkanized into three distinct factions, Princess Charlotte had her work cut out for her.'


In [6]:
# Use 'get_tag_str()' to get a tag string that you can add to your context.
prompt = sp.get_tag_str() + "She"

# The addition operator also works:
# prompt = sp + "Armitage"

In [7]:
# The tag string contains the SP's name, part of its UUID,
# and a series of '@'s, each of which represents one soft token.
print(prompt)

<FromString-9361b46c-@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@>She
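Because the tag is plain text, it can be counted or removed with ordinary string tools, for example to clean up generated text that echoes the prompt. The sketch below is not part of mkultra: the regular expression and both helper names are assumptions inferred only from the printed tag above, and the pattern would also match any ordinary text that happens to look like `<...@>`.

```python
import re

# Hypothetical pattern inferred from the printed tag string:
# each soft token is written as "<...@>" with no nested angle brackets.
SOFT_TOKEN_TAG = re.compile(r"<[^<>]*@>")

def count_soft_tokens(text: str) -> int:
    """Count the tag-style soft tokens that appear in text."""
    return len(SOFT_TOKEN_TAG.findall(text))

def strip_soft_tokens(text: str) -> str:
    """Remove every tag-style soft token from text."""
    return SOFT_TOKEN_TAG.sub("", text)

# A tag string shaped like the one printed above: one named tag plus 21 plain ones.
example = "<FromString-9361b46c-@>" + "<@>" * 21 + "She"
print(count_soft_tokens(example))
print(strip_soft_tokens(example))
```

For real token counts, prefer `tokenizer.encode()` as in the next cell; this helper only inspects the text form.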


In [8]:
# The tag string tokenizes to the correct length, so you can use it to budget your context.
prompt_len = len(tokenizer.encode(prompt))
print(f"Length of soft prompt: {len(tokenizer.encode(sp.get_tag_str()))}")
print(f"Length of full prompt: {prompt_len}")

Length of soft prompt: 22
Length of full prompt: 23
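Budgeting the context then comes down to simple arithmetic on these lengths. The helper below is a minimal sketch, not part of mkultra: `generation_budget` and its parameters are hypothetical names, and the 4096-token window is an assumed value that should be replaced with the limit of the model you actually load.

```python
def generation_budget(prompt_len: int, max_context: int, reserve: int = 0) -> int:
    """Return how many new tokens fit after a prompt of prompt_len tokens.

    reserve holds back tokens for anything appended later; the result
    is clamped at zero when the prompt already fills the window.
    """
    remaining = max_context - prompt_len - reserve
    return max(remaining, 0)

# 23 is the full prompt length printed above; 4096 is an assumed context window.
print(generation_budget(23, 4096))  # 4073
```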


In [10]:
# Generation is as usual.
output = generator( prompt,
                    do_sample=True,
                    min_length=prompt_len+100,
                    max_length=prompt_len+100,
                    repetition_penalty=1.7,
                    top_p=0.8,
                    temperature=0.7,
                    use_cache=True,
                    return_full_text=True)

print(output[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
<FromString-9361b46c-@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@>She had no choice but to join the rest of her family, a decision that would not be easy. She was going through an incredible process with Princess Charlotte and she wanted everything done right for herself."You can't go back in time," said Queen Elsa as they walked down Main Street into their apartment building on Sunday night - just before midnight when it became clear who's really behind this whole thing.""I think I'm getting close enough now! Now we're all gonna get outta here!""Yes


In [11]:
# It is also fine to use generate() instead of a pipeline.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# This also demonstrates multiple return sequences.
output = model.generate(input_ids,
                        do_sample=True,
                        min_length=prompt_len+100,
                        max_length=prompt_len+100,
                        repetition_penalty=1.7,
                        top_p=0.8,
                        temperature=0.7,
                        use_cache=True,
                        num_return_sequences=3)

print(tokenizer.decode(output[0]))
print(tokenizer.decode(output[1]))
print(tokenizer.decode(output[2]))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
<FromString-9361b46c-@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@>She had to fight the evil that was on her side, and in order for it not only be good but also a blessing.
The two princesses were both at war with one another as well…but even so they weren't completely out of their minds about what would happen next after this ordeal ended...and she could see some very strange things coming from them now! Even though Princess Charlotte knew how hard something like an assassination attempt must go against anyone's will or trustworthiness—she didn't
<FromString-9361b46c-@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@><@>She would have to be a little more careful in her decisions. If she were the only one, it wouldn't matter if they weren

The Court of Justice was still waiting for an answer from Princess Charlotte herself and had not even started its work yet! She wasn't going anywher

In [None]:
# If you write your own sampler instead of using generate() or a pipeline,
# be aware that the logits of the special tokens will be very high.
# model.generate() and pipelines already account for this,
# but your own sampler must exclude the special tokens itself.

output = model(input_ids)
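One way to exclude tokens in a hand-written sampler is to drop their IDs before drawing from the distribution. The sketch below uses plain Python lists so it runs without a model. `sample_excluding` and `excluded_ids` are hypothetical names, and how to obtain the special token IDs from mkultra is not shown here and is left as an assumption. In a real sampler the logits would come from the last position of the model output.

```python
import math
import random

def sample_excluding(logits, excluded_ids, temperature=1.0, rng=random):
    """Sample a token ID from logits, never choosing an excluded ID."""
    excluded = set(excluded_ids)
    allowed = [i for i in range(len(logits)) if i not in excluded]
    if not allowed:
        raise ValueError("every token ID is excluded")
    # Apply temperature, then subtract the peak for a numerically stable softmax.
    scaled = [logits[i] / temperature for i in allowed]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(allowed, weights=weights, k=1)[0]

# Toy vocabulary of five tokens; IDs 3 and 4 stand in for
# special tokens whose logits are very high.
toy_logits = [1.0, 2.0, 0.5, 50.0, 60.0]
token_id = sample_excluding(toy_logits, excluded_ids=[3, 4], temperature=0.7)
print(token_id)
```

Without the exclusion, IDs 3 and 4 would dominate every draw in this toy example, which is the failure mode the comment above warns about.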