<a href="https://colab.research.google.com/github/jeffbinder/promptarray/blob/main/PromptArray.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# PromptArray: A Prompting Language for Neural Text Generators

This notebook allows you to experiment with PromptArray, a system for controlling the output of neural text generators. Text generators are usually controlled by prompts: input text that indicates what the model should do. For instance, if you want a description of a species of snake, you can enter the following:

> Scientists recently discovered a new species of snake. Here is a description of it:

The machine then generates a completion of this text, which usually resembles the desired description. However, engineering effective prompts is not always straightforward; in particular, it is very hard to design a prompt that reliably tells the generator *not* to do something.

PromptArray allows you to include operators in your prompt that manipulate the machine's predictions. At present, these five operators are available:

| Operator | Meaning |
| --- | --- |
| A&B | A and B |
| A\|B | A or B |
| A^B | A and not B |
| A/B | A more than B |
| A~B | A as opposed to B |

These operators allow you to construct arrays of multiple prompt variants and merge their output. Don't care if it's a snake or lizard? Write "a new species of {snake|lizard}." Want the species to combine the qualities of a snake and a bird? Write "{snake&bird}." Want to make sure the snake is not venomous? Write "{~ venomous} snake," which is far more effective than simply writing "non-venomous snake." You can combine multiple operators, using {} brackets to group text.
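
To make the idea of an "array of prompt variants" concrete, here is a minimal sketch of how a single, non-nested group such as `{snake|lizard}` could be expanded into separate prompts. The function `expand_variants` and its regular expression are hypothetical and written only for illustration; they are not PromptArray's parser, which also handles nesting and multiple operators, and they say nothing about how the variants' predictions are merged.

```python
import re

# Hypothetical helper for illustration only; this is NOT PromptArray's parser.
# It handles exactly one flat {A op B} group and ignores nesting.
GROUP = re.compile(r"\{([^{}]*)\}")
OPERATORS = "&|^/~"


def expand_variants(prompt):
    """Return (operator, variants) for a prompt with at most one flat group."""
    match = GROUP.search(prompt)
    if match is None:
        # No group: the prompt is its own single variant.
        return None, [prompt]
    body = match.group(1)
    # Locate the first operator character inside the braces, if any.
    op_pos = next((i for i, ch in enumerate(body) if ch in OPERATORS), None)
    if op_pos is None:
        op, options = None, [body]
    else:
        op = body[op_pos]
        options = [body[:op_pos], body[op_pos + 1:]]
    # Substitute each option back into the surrounding text.
    variants = [
        prompt[:match.start()] + option.strip() + prompt[match.end():]
        for option in options
    ]
    return op, variants


op, variants = expand_variants("a new species of {snake|lizard}.")
print(op, variants)
# | ['a new species of snake.', 'a new species of lizard.']
```

Each variant is an ordinary prompt that the model can score on its own; the operator records how the resulting predictions should then be combined.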

A detailed explanation of this method, along with the code, is available [here](https://github.com/jeffbinder/promptarray).

In [None]:
#@title Setup and Model Selection

#@markdown Using a GPU is recommended, but you will first need to connect to an instance that has one. The larger models will not work unless your instance has enough RAM. Note that the XL model will take a while to load.

model = "gpt2-large" #@param ['gpt2', 'gpt2-medium', 'gpt2-large', 'gpt2-xl']
use_gpu = True #@param {type:"boolean"}

%cd /content
!rm -rf promptarray
!git clone https://github.com/jeffbinder/promptarray
%cd promptarray/

!pip install lark
!pip install sentencepiece
!pip install git+https://github.com/huggingface/transformers.git@61f64262692ac7dc90e2e0bdeb7e79d9cd607a66

import textwrap
from generation_utils import *

model_type = model.split('-')[0]
model_name_or_path = model
device = 'cuda' if use_gpu else 'cpu'

# Initialize the model and tokenizer
try:
    model_class, tokenizer_class = MODEL_CLASSES[model_type]
except KeyError:
    raise KeyError(f"The model type {model_type!r} is not supported. You are welcome to add it and open a PR :)")
tokenizer = tokenizer_class.from_pretrained(model_name_or_path)
model = model_class.from_pretrained(model_name_or_path)
model.to(device)
model.eval()

print("Ready!")

In [None]:
#@title Prompt Entry

prompt_text = "Scientists recently discovered a new species of {serpent~snake}. Here is a description of it:" #@param {type:"string"}

output_length = 100 #@param {type:"slider", min:1, max:500, step:1}
num_return_sequences = 3 #@param {type:"slider", min:1, max:10, step:1}

#@markdown ___

do_sample = True #@param {type:"boolean"}
temperature = 0.6 #@param {type:"slider", min:0, max:1, step:0.01}
top_k = 5 #@param {type:"slider", min:0, max:20, step:1}
top_p = 0.8 #@param {type:"slider", min:0, max:1, step:0.01}
repetition_penalty = 1.5 #@param {type:"slider", min:0, max:5, step:0.1}
overlap_factor = 0.25 #@param {type:"slider", min:0, max:1, step:0.01}
show_program = True #@param {type:"boolean"}


def adjust_length_to_model(length, max_sequence_length):
    """Clamp the requested output length to the model's maximum sequence length."""
    if length < 0 and max_sequence_length > 0:
        length = max_sequence_length
    elif 0 < max_sequence_length < length:
        length = max_sequence_length  # No generation bigger than model size
    elif length < 0:
        length = MAX_LENGTH  # avoid infinite loop
    return length
length = adjust_length_to_model(output_length, max_sequence_length=model.config.max_position_embeddings)

import time
start_time = time.time()
output_sequences = model.generate(
    prompt=prompt_text,
    overlap_factor=overlap_factor,
    tokenizer=tokenizer,
    max_length=length,
    temperature=temperature,
    top_k=top_k,
    top_p=top_p,
    repetition_penalty=repetition_penalty,
    do_sample=do_sample,
    num_return_sequences=num_return_sequences,
    pad_token_id=0,
    verbose=show_program,
)
print(f"Time: {time.time() - start_time:.2f}s\n")

# Remove the batch dimension when returning multiple sequences
if len(output_sequences.shape) > 2:
    output_sequences.squeeze_()

generated_sequences = []

for generated_sequence_idx, generated_sequence in enumerate(output_sequences):
    generated_sequence = generated_sequence.tolist()
    # Drop padding tokens (ID 0, the pad_token_id passed to generate above)
    generated_sequence = [idx for idx in generated_sequence if idx != 0]

    # Decode text
    generated_text = tokenizer.decode(generated_sequence, clean_up_tokenization_spaces=True)

    if num_return_sequences > 1:
        print(f'\nGenerated sequence {generated_sequence_idx}:')
    print('\n'.join(textwrap.wrap(generated_text)))

