# World Info Tuning
mkultra provides a specialized trainer for generating soft World Info*.

This trainer takes datapoints in a "call and response" format. The "call" is a fixed prompt (e.g. "A detailed description of the Spider Tank:") that should elicit the "response" (a few paragraphs of information about the Spider Tank). Multiple responses can be specified for each call. Loss is calculated only against the response tokens, never against the call.
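To make "loss only against the responses" concrete, here is a minimal sketch. It assumes the common PyTorch / Hugging Face convention that a label of -100 is skipped by the cross-entropy loss. The token ids are made up and `build_labels` is a hypothetical helper, not part of mkultra.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def build_labels(call_ids, response_ids):
    """Return (input_ids, labels) where only response positions carry a label."""
    input_ids = list(call_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(call_ids) + list(response_ids)
    return input_ids, labels


# Made-up token ids standing in for a tokenized call and two responses.
call = [11, 12, 13]
responses = [[21, 22], [31, 32, 33]]

# One call paired with several responses gives one training example per response.
examples = [build_labels(call, r) for r in responses]
for input_ids, labels in examples:
    print(input_ids, labels)
```

The call tokens still appear in the input, so the model conditions on them, but they contribute nothing to the loss.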

The trainer will optimize a soft prompt such that the call is followed by the response. The combined call and response is moved randomly within the context window to prevent overfitting, and the gap is filled with random tokens.
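The random placement described above can be sketched roughly as follows. This is an illustration under assumptions, not mkultra's actual implementation: `assemble_context` is a hypothetical helper, the token ids are made up, and the `max_spacing` name is borrowed from the trainer argument used later in this notebook with its meaning (the largest allowed gap) assumed.

```python
import random


def assemble_context(soft_len, call_ids, response_ids, max_spacing, vocab_size, rng=random):
    """Lay out |soft prompt|random tokens|call|response| for one training example.

    Soft prompt positions are represented by None, because they are learned
    embeddings rather than vocabulary tokens. Returns (token_ids, loss_mask),
    where loss_mask is True only at response positions.
    """
    gap = rng.randint(0, max_spacing)  # inclusive on both ends
    filler = [rng.randrange(vocab_size) for _ in range(gap)]
    token_ids = [None] * soft_len + filler + list(call_ids) + list(response_ids)
    loss_mask = [False] * (soft_len + gap + len(call_ids)) + [True] * len(response_ids)
    return token_ids, loss_mask


# Made-up ids: a 2-token call followed by a 3-token response.
rng = random.Random(0)
tokens, mask = assemble_context(
    soft_len=4, call_ids=[5, 6], response_ids=[7, 8, 9],
    max_spacing=6, vocab_size=50, rng=rng,
)
print(tokens)
print(mask)
```

Drawing a fresh gap for every example means the soft prompt cannot rely on the call sitting at a fixed distance from it.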

This setup may work for other tasks, but if your needs differ significantly, see tuning_finetune.ipynb for tips on rolling your own trainer.

*World Info: short prompt infixes for AI text adventures and writing.
Each one describes a character, subject, artistic direction, etc., with the intent of steering the output to consistently make use of those details.

In [None]:
#@title Setup for Colab only
!pip install transformers
#!pip install git+https://github.com/corolla-johnson/mkultra.git#egg=mkultra

In [None]:
from transformers.pipelines import pipeline
from mkultra.models.tuning import GPT2PromptTuningLM
from transformers import GPT2TokenizerFast  # the standard Hugging Face tokenizer
from mkultra.soft_prompt import SoftPrompt
from mkultra.trainers import WorldInfoTrainer  # the trainer instantiated below
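# Adafactor ships with transformers, which is installed above.
# Older PyTorch releases do not provide it in torch.optim.
from transformers import Adafactor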

In [None]:
# Use an mkultra prompt tuning LM and a standard tokenizer.
model = GPT2PromptTuningLM.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

In [None]:
# Set the length of the soft prompt.
# The paper does not recommend going over 100 tokens.
model.initialize_soft_prompt(n_tokens=20)

# A block pairs one call with a list of acceptable responses.
# Multiple responses can be written for a single call.
block = ("Description of dog:", ["Dog is a good boy", "Dog loves pets"])

# The components are assembled like this:
# |soft prompt|random tokens|call|response|


In [None]:
# WorldInfoTrainer helps prevent overfitting by moving the call and response
# up and down the context and filling the rest with random tokens.
# Only the response is used to calculate loss.
# Note that a very low loss may not be desirable for the description task,
# so min_loss is used to stop tuning early.
trainer = WorldInfoTrainer(
    model=model,
    optimizer=Adafactor(params=model.get_soft_params()),
    blocks=[block],
    max_spacing=0,  # Setting this to 0 for illustrative purposes
    min_loss=0.4,
)

trainer.train()
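
The effect of a loss floor such as `min_loss` can be illustrated with a small stand-alone loop. This is only a sketch: it assumes `min_loss` acts as a threshold at or below which tuning stops, and `train_until` and `make_fake_step` are hypothetical stand-ins rather than mkultra functions.

```python
def train_until(step_fn, min_loss, max_steps=1000):
    """Call step_fn() repeatedly and stop once the loss is at or below min_loss.

    Returns the list of losses observed, including the final one.
    """
    losses = []
    for _ in range(max_steps):
        loss = step_fn()
        losses.append(loss)
        if loss <= min_loss:
            break
    return losses


def make_fake_step(start=2.0, decay=0.8):
    """Stand-in for a real optimisation step: the loss shrinks by 20% per call."""
    state = {"loss": start}

    def step():
        state["loss"] *= decay
        return state["loss"]

    return step


history = train_until(make_fake_step(), min_loss=0.4)
print(len(history), round(history[-1], 3))
```

Stopping at a floor rather than running to convergence leaves the soft prompt short of memorizing the responses outright.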

## Best Practices for World Info
Unlike the finetuning example, this is fresh territory.
Here are potential techniques to explore:
- Define bad_words_ids (e.g. square brackets) and surround the response in them to keep the output from repeating the training data verbatim.
- Rather than describing the subject in grammatically correct sentences, try a word cloud approach.
- Rather than describing the subject, provide implicit examples of how to write about it ("John doffed his black top hat and cleaned it with his white handkerchief").
- Consider the use of the subject's proper noun vs its pronouns. (Should you start every sentence with "John Doe", or a mix of "John" and "He"?)
- Definitely use the min_loss parameter to arrest the tuning before the soft prompt overfits to the responses.
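
The first suggestion in the list could be tried along these lines. `bad_words_ids` is an existing argument of Hugging Face's `generate` that takes a list of token-id lists. Everything else here is a hypothetical sketch: `wrap_response` and `build_bad_words_ids` are illustrative helpers, and the toy encoder only stands in for a real tokenizer.

```python
def wrap_response(response, open_mark="[", close_mark="]"):
    """Surround a training response with marker characters."""
    return f"{open_mark}{response}{close_mark}"


def build_bad_words_ids(encode, markers=("[", "]")):
    """Collect token-id sequences to ban at generation time.

    `encode` is any callable mapping a string to a list of token ids.
    Each marker is encoded with and without a leading space, because
    byte-level BPE tokenizers such as GPT-2's tokenize the two differently.
    """
    banned = []
    for marker in markers:
        for variant in (marker, " " + marker):
            ids = list(encode(variant))
            if ids and ids not in banned:
                banned.append(ids)
    return banned


def toy_encode(text):
    """Toy encoder for demonstration only: one id per character."""
    return [ord(ch) for ch in text]


# Training data with bracketed responses, in the block format used above.
blocks = [("Description of dog:", [wrap_response("Dog is a good boy")])]
bad_words_ids = build_bad_words_ids(toy_encode)
print(blocks, bad_words_ids)
```

With a real tokenizer, `tokenizer.encode` would take the place of `toy_encode`, and the result would be passed as `model.generate(..., bad_words_ids=bad_words_ids)`, assuming the tuning model exposes the standard `generate` method.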