<a href="https://www.kaggle.com/code/aisuko/text-generation-with-outlines?scriptVersionId=164531661" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Outlines

Neuro-symbolic text generation

Outlines is a library for neural text generation that can replace the `generate` method in the transformers library. It can guarantee that the output matches a regular expression or follows a JSON schema.
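The guarantee comes from restricting which tokens may be emitted at each step. The sketch below shows the idea at toy scale. The vocabulary, the hand-written automaton for `[0-9]+`, and the random sampler are all assumptions made for illustration; this is not Outlines' actual implementation.

```python
import random

# Toy vocabulary (assumption); a real model has tens of thousands of tokens.
VOCAB = ["0", "1", "7", "42", "a", "cat", " ", "."]

def step(state, char):
    """Hand-written DFA for [0-9]+: state 0 is the start, state 1 is accepting.
    Returns the next state, or None when the character is not allowed."""
    return 1 if char.isdigit() else None

def walk(state, token):
    """Feed every character of a token through the DFA."""
    for char in token:
        state = step(state, char)
        if state is None:
            return None
    return state

def generate_digits(max_tokens=4, seed=0):
    rng = random.Random(seed)
    state, output = 0, ""
    for _ in range(max_tokens):
        # Mask out every token that would leave the set of valid strings.
        allowed = [t for t in VOCAB if walk(state, t) is not None]
        token = rng.choice(allowed)  # stand-in for sampling from masked logits
        state = walk(state, token)
        output += token
    return output

result = generate_digits()
print(result)
```

Because invalid tokens are never candidates, the result matches the pattern regardless of what the sampler prefers.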

# Installation

In [1]:
!pip install outlines==0.0.9


# Early stopping

Stop the generation once a given sequence has been found.

In [2]:
import outlines.text.generate as generate
import outlines.models as models

model = models.transformers("gpt2")




In [3]:
answer = generate.continuation(model, stop=["."])("Tell me a one-sentence joke.")
print(answer)

 I don't ever pay attention to this show
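The stopping logic can be sketched without a model. In the example below, `generate_until` and the `fake_tokens` stream are hypothetical stand-ins, and dropping the stop sequence from the returned text is a design choice assumed here rather than documented Outlines behaviour.

```python
def generate_until(token_stream, stop, max_tokens=50):
    """Concatenate tokens until a stop string appears or the budget is spent."""
    text = ""
    for count, token in enumerate(token_stream):
        if count >= max_tokens:
            break
        text += token
        # Positions of every stop string that has appeared so far.
        hits = [text.find(s) for s in stop if s in text]
        if hits:
            # Cut at the earliest stop string and discard the rest.
            return text[:min(hits)]
    return text

# Hypothetical token stream standing in for a language model.
fake_tokens = iter([" Why", " did", " the", " chicken", " cross", ".", " Because"])
answer = generate_until(fake_tokens, stop=["."])
print(answer)
```

The check runs after every token, so generation ends as soon as the stop sequence is complete, even when it spans several tokens.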


# Multiple choices

You can reduce the completion to a choice between multiple possibilities.

In [4]:
prompt = """You are a sentiment-labelling assistant. Is the following review positive or negative? Review: This restaurant is just awesome!"""
answer = generate.choice(model, ["Positive", "Negative"])(prompt)
print(answer)

Encountered the use of a type that is scheduled for deprecation: type 'reflected set' found for argument 'fsm_finals' of function '_walk_fsm'.

For more information visit https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

File "../../opt/conda/lib/python3.10/site-packages/outlines/text/fsm.py", line 239:
@numba.njit(nogil=True, cache=True)
def _walk_fsm(
^

  state_seq = _walk_fsm(
Encountered the use of a type that is scheduled for deprecation: type 'reflected set' found for argument 'fsm_finals' of function 'state_scan_tokens'.

For more information visit https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

File "../../opt/conda/lib/python3.10/site-packages/outlines/text/fsm.py", line 592:
@numba.njit(cache=True, nogil=True)
def state_scan_tokens(
^


Negative
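One way to see why the answer is always one of the listed options: a choice between fixed strings can be written as a regular-expression alternation. The helper `choice_pattern` below is hypothetical, and whether Outlines builds its constraint exactly this way internally is an assumption.

```python
import re

def choice_pattern(choices):
    """Build a regex that matches exactly one of the given strings."""
    # re.escape protects options that contain regex metacharacters.
    return "(" + "|".join(re.escape(c) for c in choices) + ")"

pattern = choice_pattern(["Positive", "Negative"])
is_valid = re.fullmatch(pattern, "Negative") is not None
is_invalid = re.fullmatch(pattern, "Neutral") is not None
print(pattern, is_valid, is_invalid)
```

Note that the constraint only fixes the form of the answer. A small model such as `gpt2` can still pick the wrong label, as the output above shows.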


# Efficient JSON generation following a Pydantic model

Outlines can guide the generation process so that the output is guaranteed to follow a JSON schema or a Pydantic model.

In [5]:
from enum import Enum
from pydantic import BaseModel, constr
import torch

class Weapon(str, Enum):
    sword="sword"
    axe="axe"
    mace="mace"
    spear="spear"
    bow="bow"
    crossbow="crossbow"

class Armor(str, Enum):
    leather="leather"
    chainmail="chainmail"
    plate="plate"

class Character(BaseModel):
    name: constr(max_length=10)
    age: int
    armor: Armor
    weapon: Weapon
    strength: int

model = models.transformers("gpt2", device="cuda")

# Construct guided sequence generator
generator = generate.json(model, Character, max_tokens=100)

# Draw a sample
rng = torch.Generator(device="cuda")
rng.manual_seed(789001)

<torch._C.Generator at 0x7eb280262c30>

In [6]:
sequence = generator("Give me a character description", rng=rng)
print(sequence)

{ "name":"Janston", "age": 3, "armor" : "chainmail", "weapon" : "bow", "strength" : 30 }


In [7]:
sequence = generator("Give me an interesting character description", rng=rng)
print(sequence)

{

"name" : "version 1,",

"age" : 0,

"armor" : "plate",

"weapon" : "spear",

"strength" : 0

}


In [8]:
parsed = Character.model_validate_json(sequence)
print(parsed)

name='version 1,' age=0 armor=<Armor.plate: 'plate'> weapon=<Weapon.spear: 'spear'> strength=0
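To make explicit which constraints the second sample satisfies, here is a standard-library sketch that mirrors the fields of `Character`. The function `check_character` is hypothetical and only approximates what Pydantic validates.

```python
import json

ARMOR = {"leather", "chainmail", "plate"}
WEAPON = {"sword", "axe", "mace", "spear", "bow", "crossbow"}

def check_character(raw):
    """Return a list of constraint violations for a JSON character string."""
    data = json.loads(raw)
    problems = []
    name = data.get("name")
    if not isinstance(name, str) or len(name) > 10:
        problems.append("name must be a string of at most 10 characters")
    for field in ("age", "strength"):
        value = data.get(field)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(field + " must be an integer")
    if data.get("armor") not in ARMOR:
        problems.append("armor must be one of " + ", ".join(sorted(ARMOR)))
    if data.get("weapon") not in WEAPON:
        problems.append("weapon must be one of " + ", ".join(sorted(WEAPON)))
    return problems

sample = '{"name": "version 1,", "age": 0, "armor": "plate", "weapon": "spear", "strength": 0}'
bad = '{"name": "Janston", "age": 3, "armor": "cloth", "weapon": "bow", "strength": 30}'
print(check_character(sample))
print(check_character(bad))
```

The sample passes every structural check (the name is exactly 10 characters long), yet an age and strength of 0 are not very plausible. The schema constrains the shape of the output, not its quality.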


# Prompting

Outlines manages prompts by encapsulating templates inside "template functions". Template functions require no superfluous abstraction: they use the Jinja2 templating engine to build complex prompts concisely.

In [9]:
import outlines.text as text

examples=[
    ("The food was disgusting", "Negative"),
    ("We had a fantastic night", "Positive"),
    ("Recommended","Positive"),
    ("The waiter was rude","Negative")
]

@text.prompt
def labelling(to_label, examples):
    """You are a sentiment-labelling assistant.
    {% for example in examples %}
    {{ example[0] }} // {{ example[1] }}
    {% endfor %}
    {{ to_label }} //
    """

prompt=labelling("Just awesome", examples)
answer = text.generate.continuation(model, max_tokens=100)(prompt)
print(answer)

 Check out more at Food Standards in the comments
I checked it out when I came and quite enjoyed it down the hall and it was addictive and on top of that it was delicious. Everything was done especially the broccolini with the green zest. I highly recommend this dish!
[^I trusted my teacher if I can´t wipe the toilet].
Chicken picture mab was fun // Yahtzee
This was really good // Neptunes
~~~~~~~~ //thinks it was sloppy
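For intuition, the template expands to roughly the few-shot prompt built by the plain-Python function below. `build_labelling_prompt` is a hypothetical equivalent written for illustration, and the exact whitespace produced by Jinja2 may differ.

```python
def build_labelling_prompt(to_label, examples):
    """Assemble a few-shot sentiment prompt, one 'text // label' pair per line."""
    lines = ["You are a sentiment-labelling assistant."]
    lines += [f"{text} // {label}" for text, label in examples]
    # The last line is left open so the model completes the label.
    lines.append(f"{to_label} //")
    return "\n".join(lines)

examples = [
    ("The food was disgusting", "Negative"),
    ("We had a fantastic night", "Positive"),
    ("Recommended", "Positive"),
    ("The waiter was rude", "Negative"),
]
prompt = build_labelling_prompt("Just awesome", examples)
print(prompt)
```

The open-ended continuation above wanders because nothing restricts the answer. Combining the prompt with the multiple-choice generator from the earlier section would confine it to the two labels.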
