# Sequential Monte-Carlo (SMC) Steering of Large Language Models

A fundamental problem with Large Language Models (LLMs) is *controlled generation*: the problem of generating text that satisfies given constraints. LLaMPPL is a probabilistic programming language designed to solve this problem by using LLMs as primitive probability distributions that can be conditioned and constrained using methods from *sequential Monte-Carlo*.

Note that the easiest-to-use version of LLaMPPL is currently HFPPL (Hugging Face Probabilistic Programming Language), which uses HuggingFace to work with transformer-based LLMs.

The code base, documentation, and link to the paper can be found [here](https://github.com/probcomp/hfppl?tab=readme-ov-file).

In the examples below, we make use of Llama2. You will need to provide your own HuggingFace authorization token if you want to use this model. 


In [1]:
from hfppl import Model, CachedCausalLM, Token, LMContext, smc_standard, smc_steer, TokenCategorical, Bernoulli
from string import punctuation

LLM = CachedCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", auth_token="YOUR TOKEN HERE")
LLM.batch_size = 40




# Generating Sentences with Short Words

To implement a LLaMPPL model, we define a subclass of type `Model` and implement the `__init__` and `step` functions (minimally).

Within the body of the model we then make use of the following.

 - `sample(distribution, proposal)`: Sample a value `value` from `proposal` and then multiply the importance weight by `distribution(value)/proposal(value)`. Note that `sample(distribution)` uses `distribution` itself as the proposal (i.e., it samples from the model itself), so the importance weight is multiplied by `1` and left unchanged.
 - `observe(dist, value)`: Force the distribution to take on `value` and multiply the (marginal) likelihood of `value` into the importance weight.
 - `intervene(dist, value)`: Force the distribution to take on `value` without multiplying its likelihood into the importance weight. This is mainly useful for distributions with side effects, such as a stateful `LMContext`.
 - `condition(boolean)`: Multiply the importance weight by 1 if `boolean` is true and by 0 otherwise.
 - `score(value)`: Multiply the importance weight by `exp(value)`.
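
The weight bookkeeping described in the list above can be sketched in plain Python. This is a toy illustration only, not HFPPL's implementation: the class `ToyParticle`, its method signatures, and the choice to store the weight in log space are all assumptions made for this example, and probabilities are passed in by hand rather than computed by an LLM.

```python
import math


def _safe_log(p):
    # log(0) is treated as -infinity so that impossible events zero out the weight
    return -math.inf if p == 0 else math.log(p)


class ToyParticle:
    """Hypothetical stand-in for one SMC particle; tracks only its weight."""

    def __init__(self):
        self.log_weight = 0.0  # log(1): every particle starts with weight 1

    def sample(self, target_prob, proposal_prob=None):
        # The value is assumed to have been drawn from the proposal already.
        # With no proposal, target and proposal coincide and the ratio is 1.
        if proposal_prob is None:
            proposal_prob = target_prob
        self.log_weight += _safe_log(target_prob) - _safe_log(proposal_prob)

    def observe(self, prob_of_value):
        # Multiply the likelihood of the observed value into the weight
        self.log_weight += _safe_log(prob_of_value)

    def intervene(self):
        # Forcing a value without conditioning leaves the weight untouched
        pass

    def condition(self, ok):
        # Multiply by 1 (no change) or by 0 (weight becomes exactly zero)
        if not ok:
            self.log_weight = -math.inf

    def score(self, value):
        # Multiply the weight by exp(value)
        self.log_weight += value

    @property
    def weight(self):
        return math.exp(self.log_weight)


particle = ToyParticle()
particle.sample(target_prob=0.2, proposal_prob=0.5)  # weight: 1 * 0.2 / 0.5
particle.observe(0.5)                                # weight: * 0.5
particle.score(math.log(2))                          # weight: * 2
weight_before_rejection = particle.weight
particle.condition(False)                            # weight: * 0
```

Working in log space is a convenience of this sketch: products of many small probabilities become sums, which avoids underflow over long token sequences.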
 
The fundamental distribution in LLaMPPL is the `LMContext` distribution, which represents a stateful distribution over next tokens given a prompt and previously generated tokens.

$$P_{\mathrm{LLM}}(x_i | \mathrm{prompt}, \mathbf{x}_{<i})$$

In particular, `LMContext.next_token()` returns this distribution.

Calling `sample` on this distribution statefully updates the `LMContext` (i.e., the string of tokens generated so far).
 
In all of the examples below, we will make extensive use of *masks*, that is, lists of allowed tokens under the LLM distribution.

Our first example is to generate text that consists only of words of five characters or fewer. First we will construct a mask for this problem.

In [2]:
MASKS = {i : set(j for (j,v) in enumerate(LLM.vocab)
                 if j != LLM.tokenizer.eos_token_id and '\n' not in v 
                 and
                 any(c.isalpha() or c in punctuation for c in v) and
                 len(v.strip()) <= 5 and (not v[0].isalpha() 
                                          or i+len(v) <= 5))
             for i in range(6)}
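
To see what the comprehension above selects, here is a minimal sketch that applies the same filtering rule to a made-up vocabulary. The names `toy_vocab` and `short_word_mask` are hypothetical, the toy vocabulary has no end-of-sequence token (so that check is omitted), and a real tokenizer's vocabulary will look different.

```python
from string import punctuation

# Hypothetical vocabulary; real LLM vocabularies hold tens of thousands of entries
toy_vocab = ["the", " cat", "s", "erpillar", " butterfly", ",", "\n", "12"]


def short_word_mask(vocab, chars_so_far, max_len=5):
    """Token ids allowed when the current word already has `chars_so_far` letters."""
    return {
        j
        for j, v in enumerate(vocab)
        if "\n" not in v                                      # no newlines
        and any(c.isalpha() or c in punctuation for c in v)   # letters or punctuation
        and len(v.strip()) <= max_len                         # token itself is short
        # tokens starting with a letter extend the current word, so the
        # combined length must stay within the limit
        and (not v[0].isalpha() or chars_so_far + len(v) <= max_len)
    }


mask_at_word_start = short_word_mask(toy_vocab, 0)
mask_after_four_letters = short_word_mask(toy_vocab, 4)
mask_after_five_letters = short_word_mask(toy_vocab, 5)
```

With this toy vocabulary, `"the"` is allowed at the start of a word but not after four letters, while `" cat"` and `","` stay allowed throughout because they do not extend the current word.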

Now we are in a position to implement a subclass of `Model`. Note that our code makes heavy use of Python's asynchronous computing capabilities; in particular, it uses `await` and `async`. This allows queries to the LLM to be batched automatically.
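
The following is a rough sketch of why `async`/`await` makes batching possible; it is not how HFPPL implements it. The class `ToyBatcher` and the squaring "model" are invented for illustration. Each coroutine suspends at `await`, which gives a scheduler the chance to collect all pending queries and answer them in one batched call.

```python
import asyncio


class ToyBatcher:
    """Hypothetical batcher: collects pending queries, answers them together."""

    def __init__(self):
        self.pending = []      # (query, future) pairs waiting for an answer
        self.batch_sizes = []  # size of each batch that was executed

    async def query(self, x):
        # Register the query and suspend until the batch has been processed
        future = asyncio.get_running_loop().create_future()
        self.pending.append((x, future))
        return await future

    def flush(self):
        batch, self.pending = self.pending, []
        if not batch:
            return
        self.batch_sizes.append(len(batch))
        # Stand-in for a single batched forward pass of a model
        for x, future in batch:
            future.set_result(x * x)


async def particle_step(batcher, x):
    # Each "particle" issues one query and waits for its result
    return await batcher.query(x)


async def run_demo():
    batcher = ToyBatcher()
    tasks = [asyncio.ensure_future(particle_step(batcher, x)) for x in range(4)]
    await asyncio.sleep(0)  # let every task run until it blocks on its future
    batcher.flush()         # answer all four queries in one batch
    results = await asyncio.gather(*tasks)
    return results, batcher.batch_sizes


results, batch_sizes = asyncio.run(run_demo())
```

Inside a notebook, where an event loop is already running, `await run_demo()` would be used in place of `asyncio.run(run_demo())`.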


In [3]:
class ShortWordModel(Model):
    def __init__(self, prompt, max_tokens):
        super().__init__()
        self.lm         = LMContext(LLM, prompt)
        self.q          = LMContext(LLM, prompt)
        self.prompt_len = len(str(self.lm.s))
        self.max_tokens = max_tokens


    async def step(self):
        # Which tokens are allowed?
        mask = self.active_constraint_mask()

        # Generate proposed token.
        token = await self.sample(self.lm.next_token(),
                                  proposal = await self.proposal(mask))

        # Reduce number of max tokens remaining
        self.max_tokens -= 1

        print(str(self.lm.s)[self.prompt_len:])

        # Check if done
        if token == LLM.tokenizer.eos_token_id or self.max_tokens == 0:
            self.finish()

    def active_constraint_mask(self):
        string_so_far = str(self.lm.s)
        words = string_so_far.split()
        last_word = words[-1] if len(words) > 0 else ""
        return MASKS[min(5, len(last_word))]

    async def proposal(self, mask):
        # Force the proposal LMContext (self.q) to adhere to this mask
        await self.intervene(self.q.mask_dist(mask), True)

        # Return the proposal's modified next-token distribution
        return self.q.next_token()

We perform inference in a LLaMPPL model using either the `smc_standard` function, which implements sequential Monte-Carlo with resampling, or `smc_steer`, which implements a version of SMC that attempts to increase particle diversity by sampling without replacement.
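
As a rough illustration of the resampling step in standard SMC, the sketch below performs multinomial resampling on a handful of labeled particles. The function `resample` is hypothetical and is not taken from HFPPL; it also does not cover the without-replacement scheme used by `smc_steer`.

```python
import random


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to their weights.

    Returns the resampled particles together with equalized weights whose
    sum matches the sum of the original weights.
    """
    total = sum(weights)
    probabilities = [w / total for w in weights]
    chosen = rng.choices(particles, weights=probabilities, k=len(particles))
    equal_weight = total / len(particles)
    return chosen, [equal_weight] * len(particles)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
particles = ["a", "b", "c", "d"]
weights = [0.0, 3.0, 1.0, 0.0]  # "a" and "d" violated a constraint

new_particles, new_weights = resample(particles, weights, rng)
```

Particles with zero weight are never selected, so computation is redirected toward continuations that still satisfy the constraints; the price is that several survivors may be copies of the same particle.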

In [11]:
# From Politico.com
prompt = """<|endoftext|>3 things to watch …

1. The return of the House means new energy for the GOP’s Biden impeachment push, and Democrats are starting their pushback early. Rep. Jamie Raskin (D-Md.) is out this morning with a 14-page rebuttal memo that seeks to paint the GOP campaign as a “complete and total bust” and an attempt at distracting from the “overwhelming evidence of [Trump’s] criminal and corrupt conduct during his term of office.”

2. The Senate is back this evening for a bed-check vote. With Minority Leader Mitch McConnell having successfully quieted (public) chatter about his health, expect senators to be quizzed anew about Sen. Tommy Tuberville’s (R-Ala.) Pentagon nominee blockade, especially with the Joint Chiefs chair, Gen. Mark Milley, just weeks away from retirement and the confirmation of his successor, Gen. C.Q. Brown, in limbo.

3."""

LLM.cache_kv(LLM.tokenizer.encode(prompt))

async def main():
    constraint_model = ShortWordModel(prompt, 50)
    particles = await smc_standard(constraint_model, 20)
    for p in particles:
        print(str(p.lm.s)[p.prompt_len:])

In [None]:
# Run the model
await main()

 Vice
 Mc
 The
 D
 Four
 We
 When
 G
 And
 Ex
 The
 Sand
 With
 The
 Head
 For
 P
 Read
 F
 Per
 Vice Pres
 McCon
 The White
 Dems
 Four years
 We spent
 When it
 Gov
 And the
 Exc
 The new
 Sandy
 With his
 The UN
 Head to
 For the
 Penc
 Read the
 Fake
 Per a
 Vice Pres.
 McCon is
 The White House
 Dems did
 Four years from
 We spent this
 When it comes
 Gov.
 And the first
 Excru
 The new COVID
 Sandy O
 With his water
 The UN chief
 Head to No
 For the first
 Pencˇ
 Read the full
 Fake news
 Per a twe
 Vice Pres. Kam
 McCon is set
 The White House is
 Dems did pass
 Four years from now
 We spent this week
 When it comes to
 Gov. Ron
 And the first fil
 Excru t
 The new COVID-
 Sandy Ocho
 With his water rec
 The UN chief is
 Head to Noon
 For the first time
 Pencˇ a
 Read the full ag
 Fake news is
 Per a tweet
 Vice Pres. Kamia
 McCon is set to
 The White House is eye
 Dems did pass a
 Four years from now,
 We spent this week doing
 When it comes to fil
 Gov. Ron De
 And the first 

 For the first time since the Unite the Right rally in Charl-
 For the first time since the Unite the Right rally, which left an
 Fake news is the focus of a new poll from the AP-N.
 Fake news is the focus of a new poll from the Grinn apart from
 Fake news is the focus of a new poll from the Grinn. News
 For the first time since the Unite the Right rally, which drew white
 Fake news is the focus of a new poll from the Grinn Tax Group
 For the first time since the Unite the Right rally in Charl-
 For the first time since the Unite the Right rally in Charl-
 Fake news is the focus of a new poll of condo dwell- ers
 For the first time since the Unite the Right rally in Charl Tor
 Fake news is the focus of a new poll from the AP, which shows
 Fake news is the focus of a new poll warn of the dizzy
 For the first time since the Unite the Right rally in Charl-
 For the first time since the Unite the Right rally in Charl-
 Fake news is the focus of a new poll from the AP-NM
 Fake news is the f

 Fake news is the focus of a new poll from the AP, which shows that just two in five vot
 Fake news is the focus of a new poll from the AP, which shows that just two in five vot
 Fake news is the focus of a new poll from the AP, which shows that a third of voter
 Fake news is the focus of a new poll from the AP, which finds that more than half of adult
 Fake news is the focus of a new poll from the AP, which finds that more than half of vot
 Fake news is the focus of a new poll from the AP, which shows that just two in four (
 Fake news is the focus of a new poll from the AP, which finds that more than two-e
 Fake news is the focus of a new poll from the AP, which finds that more than half of all
 Fake news is the focus of a new poll from the AP, which finds that more than half of U
 Fake news is the focus of a new poll from the AP, which shows that just two in five adult
 Fake news is the focus of a new poll from the AP, which shows that a third of repor
 Fake news is the focus of a n

# Generating Haikus

A haiku is a poem of three lines where the first and last lines contain five syllables each and the middle line seven. Let's write a LLaMPPL program for generating haikus.

First, let's write a prompt.

In [6]:
# Replace last line of this prompt with the title you want for your Haiku.
haiku_prompt = """A haiku is a poem that consists of three lines. The first line has five syllables; the second line has seven syllables; and the third line has five syllables.

Here are some example Haikus. Note how they tend to end on a somewhat surprising or otherwise satisfying note, and are not repetitive at the end.

1. "Portrait"
Sweet smell of wet flowers
Over an evening garden.
Your portrait, perhaps?

2. "River of Love"
love between us is
speech and breath. loving you is
a long river running.

3. "Keys"
I search for my keys
in a million places, but
they are in my hand.

3. "Practice"
I write, erase, rewrite
Erase again, and then
A poppy blooms.

4. "Caterpillar"
A caterpillar,
this deep in fall –
still not a butterfly.

5. "My Left Foot"
"""

LLM.cache_kv(LLM.tokenizer.encode(haiku_prompt))

Next, let's write a helper function that can count syllables.

In [7]:
import nltk

# Download the CMU Pronouncing Dictionary (if you haven't already)
nltk.download('cmudict')

from nltk.corpus import cmudict

CMUDICT = cmudict.dict()

def count_syllables(word):
    
    # Look up the word's phonetic transcriptions (one per known
    # pronunciation); unknown words yield an empty list
    phonetic_transcriptions = CMUDICT.get(word.lower(), [])
    
    # Vowel phonemes end in a stress digit, so counting them gives the
    # syllables per pronunciation; take the minimum, or 0 if unknown
    syllable_count = min([len([ph for ph in transcription 
                               if ph[-1].isdigit()]) 
                          for transcription 
                          in phonetic_transcriptions], default=0)
    
    return syllable_count

[nltk_data] Downloading package cmudict to /home/timo/nltk_data...
[nltk_data]   Package cmudict is already up-to-date!
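
The counting rule can be checked without downloading anything by using a small hand-written dictionary. The entries in `TOY_PRONUNCIATIONS` below are written in the style of CMUdict (ARPAbet phonemes, with a stress digit on each vowel) but are assumptions for illustration and may differ from the real dictionary; `count_syllables_toy` is a hypothetical variant of the helper above.

```python
# Hand-written entries in CMUdict style (illustrative, not copied from CMUdict)
TOY_PRONUNCIATIONS = {
    "haiku": [["HH", "AY1", "K", "UW0"]],
    "fire": [["F", "AY1", "ER0"], ["F", "AY1", "R"]],  # two pronunciations
    "poem": [["P", "OW1", "AH0", "M"]],
}


def count_syllables_toy(word, pronunciations=TOY_PRONUNCIATIONS):
    """Minimum number of stress-marked (vowel) phonemes over all pronunciations."""
    transcriptions = pronunciations.get(word.lower(), [])
    return min(
        (sum(1 for ph in t if ph[-1].isdigit()) for t in transcriptions),
        default=0,  # words missing from the dictionary count as 0 syllables
    )


counts = {w: count_syllables_toy(w) for w in ["Haiku", "fire", "poem", "xyzzy"]}
```

Taking the minimum means a word with several pronunciations is credited with its shortest one, and a result of 0 signals that the word is not in the dictionary.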


In order to better understand the advantages of LLaMPPL, let's try generating haikus directly from the LLM. We can do this by writing a LLaMPPL program that introduces no constraints.

In [8]:
class BasicHaiku(Model):
    
    def __init__(self, prompt, max_tokens):
        super().__init__()
        self.context = LMContext(LLM, prompt)
        self.tokens_left = max_tokens
    
    async def step(self):
        token = await self.sample(self.context.next_token())
        self.tokens_left -= 1
        
        if self.tokens_left == 0 or token.token_id == LLM.tokenizer.eos_token_id:
            self.finish()
        
        print(str(self.context))

Note that the code above makes use of `sample` with no proposal distribution. Thus it samples directly from the LLM itself, implementing no conditioning or other constraints.

In [9]:
#LLM.batch_size = 40
#particles = await smc_steer(BasicHaiku(haiku_prompt), 15, 6)
particles = await smc_standard(BasicHaiku(haiku_prompt, 25), 120)
print("--------")
for (i,particle) in enumerate(particles):
    print(f"Poem {i} (weight {particle.weight}):")
    print(f"{particle.context}\n------------------\n")

The
The
I
I
I
I
The
I
I
I
I
I
Just
I
my
my
My
E
Am
My
An
My
I
My
a
Long
My
My
T
My
At
When
ast
Fe
I
My
My
My
Look
My
My
F
My
My
Green
I
Wh
Qu
It
My
My
T
My
A
My
My
This
My
Ha
My
my
Left
My
My
You
I
I
Bre
It
My
Do
My
left
this
This
My
I
My
J
I
F
as
L
My
And
my
I
my
In
My
My
3
my
My
They
My
Every
My
My
T
while
If
To
My
It
My
This
My
My
My
With
If
My
`
An
My
My
Al
My
A
The left
In a
The left
I can
Just one
I walk
my left
my left
My left
Ever
Am I
My left
An app
My left
I paint
My left
a painting
Long ago
My left
My left
Tick
My left
At the
When I
astounding
Feet
I step
My left
My left
My left
Look,
My left
My left
Faw
My left
My left
Green grass
I gli
While
Quite
It is
My left
My left
Tod
My left
A left
My left
My left
This is
My left
Haib
My left
my left
Left now
My left
My left
You are
I can
I press
Breaking
It'
My left
Doing
My left
left and
this lost
This hip
My left
I’
My left
Jump
I'
Falling
asleep
The Train
Lack
My left
And now
my left
I w
my left
I rise
My left
My left
3D
I step
m

The left foot
recently
In a plane – "I'
A left foot, it hit,
I can't feel my left
Just one foot divides
what
I walk with one foot left.
my left foot hurt
in the
my left foot knows
where to
My left foot aches;

Ever under wise men's
Am I really the monster

My left foot shuts the door
An appleseed sprout,
My left foot is agile

I paint with my left foot

My left foot writes poems.
a painting is daubed

Long ago, when I was young
My left foot paints a circle
My left foot flops
before
Ticklish my left foot;
My left foot let me down

At the instructor's

When I play piano
I put
astounding things:
I type
Feet carry me
Under a
I step left
with my left
My left foot itches
But
My left foot kicks
t
My left foot falls
to the
Look, my left
foot is
My left foot asks:
How
My left foot
has written so
Fawning, sobbing
My left foot is
good at
My left foot
straddles
Green grass is warm
under my
I glide through skies

While I was blind
I
Quite legless, she bent
It is a dear foot
L
My left foot paces the

My Left foot is big,
firm
The left foot
recently happened –

In a plane – "I'm okay"
A left foot, it hit,
A right
I can't feel my left
foot;
Just one foot divides
what I write,
I walk with one foot left.
Though
my left foot hurt
in the morning, but
my left foot knows
where to go. I
My left foot aches;
I cannot walk
Ever under wise men's eyes.

Am I really the monster
A twin
My left foot shuts the door
on the
An appleseed sprout,
my stub
My left foot is agile
till some
I paint with my left foot
(my left
My left foot writes poems.
My right
a painting is daubed
in a million
Long ago, when I was young:
This
My left foot paints a circle
in the
My left foot flops
before my right foot
Ticklish my left foot;
a born
My left foot let me down
when I needed
At the instructor's
suggestion
When I play piano
I put my left foot
astounding things:
I type with my left
Feet carry me
Under a moon I bare
I step left
with my left foot
six
My left foot itches
But I cannot scratch
My left foot kicks
tight aga

Always, left shoe
On, right left
My Left foot is big,
firm, and
The left foot
recently happened –
almost
In a plane – "I'm okay"–

A left foot, it hit,
A right foot,
I can't feel my left
foot; everyone looks
Just one foot divides
what I write, from my
I walk with one foot left.
Though it drag
my left foot hurt
in the morning, but the v
my left foot knows
where to go. I only

My left foot aches;
I cannot walk. And
Ever under wise men's eyes.
My beautiful
Am I really the monster
A twin sister once
My left foot shuts the door
on the house,
An appleseed sprout,
my stub of br
My left foot is agile
till some years pass
I paint with my left foot
(my left hand par
My left foot writes poems.
My right foot notes
a painting is daubed
in a million places,
Long ago, when I was young:
This left foot
My left foot paints a circle
in the sun,
My left foot flops
before my right foot –

Ticklish my left foot;
a born child with
My left foot let me down
when I needed to walk
At the instructor's
suggestion,

Always, left shoe
On, right left free.
My Left foot is big,
firm, and strong.
The left foot
recently happened –
almost on schedule
In a plane – "I'm okay"–
In a
A left foot, it hit,
A right foot, it keeps
I can't feel my left
foot; everyone looks at me
Just one foot divides
what I write, from my soul.
I walk with one foot left.
Though it drags and
my left foot hurt
in the morning, but the vet

my left foot knows
where to go. I only
imagine
My left foot aches;
I cannot walk. And tonight
Ever under wise men's eyes.
My beautiful left foot
Am I really the monster
A twin sister once had.
My left foot shuts the door
on the house,
My
An appleseed sprout,
my stub of bristle
My left foot is agile
till some years pass;

I paint with my left foot
(my left hand paralyz
My left foot writes poems.
My right foot notes them down
a painting is daubed
in a million places,
but
Long ago, when I was young:
This left foot I had
My left foot paints a circle
in the sun, and though
My left foot flops
before my

I look boringly down,
to watch my left foot walk

Always, left shoe
On, right left free. Yet this
My Left foot is big,
firm, and strong.


The left foot
recently happened –
almost on schedule.

In a plane – "I'm okay"–
In a crash,
A left foot, it hit,
A right foot, it keeps swinging
I can't feel my left
foot; everyone looks at me,

Just one foot divides
what I write, from my soul.
It
I walk with one foot left.
Though it drags and jer
my left foot hurt
in the morning, but the vet
prescribed
my left foot knows
where to go. I only
imagine where this
My left foot aches;
I cannot walk. And tonight
maybe
Ever under wise men's eyes.
My beautiful left foot.

Am I really the monster
A twin sister once had.


My left foot shuts the door
on the house,
My right foot
An appleseed sprout,
my stub of bristle grows

My left foot is agile
till some years pass;
then it
I paint with my left foot
(my left hand paralyzed)
My left foot writes poems.
My right foot notes them down.

a painting is daubed
in a 

My left foot is heavy
My right foot is light.
Dance partners take
My left foot dangles
from my chair when asleep
I still fall backwards
3D map, my left foot:
No image in your brain –
Print
I step into ice, crack!
My left foot sticks. Nothing
E
my left foot is old and sore
my left arm is weak
it could
My left foot doesn't know
what the right is doing.

6
They look into his brain
and feel pain flash on and off.
With st
My left foot is asleep
When I have dreams of flight.
It
Every morning I call
my left foot my troublemaker
and say good morning
My left foot is blue
My husband's right shoe
Is the color
My left foot is
my best friend, my guide to poetry.
Its
Tapping my foot,
clinging to the rail,
on the crowded
while my left foot
is firmly planted on the ground,
the ground
I break the ice
With my left foot.
Pants in snow!

If I lose control
of my left foot, I fall.
If I lose
To my adoring
gentle child, if I
wake and find you
My left foot,
light as a feather,
yet the weight I
I am a starling

My left foot:
A fidgeting toe
Powers down the elevator,
My left foot is heavy
My right foot is light.
Dance partners take two,
I can't feel my left
foot; everyone looks at me,
then I look
Just one foot divides
what I write, from my soul.
It is a pain
I walk with one foot left.
Though it drags and jerks –

my left foot hurt
in the morning, but the vet
prescribed fancy leather
my left foot knows
where to go. I only
imagine where this path


My left foot aches;
I cannot walk. And tonight
maybe dinner will be
Ever under wise men's eyes.
My beautiful left foot.

6.
Am I really the monster
A twin sister once had.

6. "
My left foot shuts the door
on the house,
My right foot enters the house
An appleseed sprout,
my stub of bristle grows
to a foot
My left foot is agile
till some years pass;
then it becomes my right
I paint with my left foot
(my left hand paralyzed)
and am
My left foot writes poems.
My right foot notes them down.
That’s
a painting is daubed
in a million places,
but on one verti

My left foot dangles
from my chair when asleep
I still fall backwards.


3D map, my left foot:
No image in your brain –
Print sharp, re
I step into ice, crack!
My left foot sticks. Nothing
Easy about this
my left foot is old and sore
my left arm is weak
it could happen to anyone
My left foot doesn't know
what the right is doing.

6. "New
They look into his brain
and feel pain flash on and off.
With stiff foot,
My left foot is asleep
When I have dreams of flight.
It shifts suddenly
Every morning I call
my left foot my troublemaker
and say good morning.


My left foot is blue
My husband's right shoe
Is the color of love.
My left foot is
my best friend, my guide to poetry.
Its lines do the
Tapping my foot,
clinging to the rail,
on the crowded ferry.
while my left foot
is firmly planted on the ground,
the ground trembles.
I break the ice
With my left foot.
Pants in snow!

6.
If I lose control
of my left foot, I fall.
If I lose control
of
To my adoring
gentle child, if I
wake and find you’r

My left foot:
A fidgeting toe
Powers down the elevator,
walks
My left foot is heavy
My right foot is light.
Dance partners take two, please.

I can't feel my left
foot; everyone looks at me,
then I look at my foot
Just one foot divides
what I write, from my soul.
It is a painful threshold.
I walk with one foot left.
Though it drags and jerks –
I can'
my left foot hurt
in the morning, but the vet
prescribed fancy leather.


my left foot knows
where to go. I only
imagine where this path

5. "
My left foot aches;
I cannot walk. And tonight
maybe dinner will be stew.
Ever under wise men's eyes.
My beautiful left foot.

6. "Silk
Am I really the monster
A twin sister once had.

6. "New Year'
My left foot shuts the door
on the house,
My right foot enters the house,
and
An appleseed sprout,
my stub of bristle grows
to a foot of springtime
My left foot is agile
till some years pass;
then it becomes my right foot,

I paint with my left foot
(my left hand paralyzed)
and am so proud of
My left foo

My left foot:
A fidgeting toe
Powers down the elevator,
walks the
My left foot dangles
from my chair when asleep
I still fall backwards.

*


I can't feel my left
foot; everyone looks at me,
then I look at my foot.
Just one foot divides
what I write, from my soul.
It is a painful threshold.

I walk with one foot left.
Though it drags and jerks –
I can't
my left foot hurt
in the morning, but the vet
prescribed fancy leather.

5
my left foot knows
where to go. I only
imagine where this path

5. "M
My left foot aches;
I cannot walk. And tonight
maybe dinner will be stew.

Ever under wise men's eyes.
My beautiful left foot.

6. "Silk H
Am I really the monster
A twin sister once had.

6. "New Year's
My left foot shuts the door
on the house,
My right foot enters the house,
and steps
An appleseed sprout,
my stub of bristle grows
to a foot of springtime.
My left foot is agile
till some years pass;
then it becomes my right foot,
without
I paint with my left foot
(my left hand paralyzed)
and am 

My left foot:
A fidgeting toe
Powers down the elevator,
walks the dog
My left foot dangles
from my chair when asleep
I still fall backwards.

*

Th
I can't feel my left
foot; everyone looks at me,
then I look at my foot.

Just one foot divides
what I write, from my soul.
It is a painful threshold.


I walk with one foot left.
Though it drags and jerks –
I can't stop
my left foot hurt
in the morning, but the vet
prescribed fancy leather.

5.
my left foot knows
where to go. I only
imagine where this path

5. "Mom
My left foot aches;
I cannot walk. And tonight
maybe dinner will be stew.


Ever under wise men's eyes.
My beautiful left foot.

6. "Silk Hiding
Am I really the monster
A twin sister once had.

6. "New Year's E
My left foot shuts the door
on the house,
My right foot enters the house,
and steps to
An appleseed sprout,
my stub of bristle grows
to a foot of springtime.

My left foot is agile
till some years pass;
then it becomes my right foot,
without a
I paint with my left foot
(m

My left foot:
A fidgeting toe
Powers down the elevator,
walks the dog,
3D map, my left foot:
No image in your brain –
Print sharp, re-align.


I can't feel my left
foot; everyone looks at me,
then I look at my foot.


Just one foot divides
what I write, from my soul.
It is a painful threshold.

6
I walk with one foot left.
Though it drags and jerks –
I can't stop,
my left foot hurt
in the morning, but the vet
prescribed fancy leather.

5. "
my left foot knows
where to go. I only
imagine where this path

5. "Moments
My left foot aches;
I cannot walk. And tonight
maybe dinner will be stew.

6
Ever under wise men's eyes.
My beautiful left foot.

6. "Silk Hiding"
Am I really the monster
A twin sister once had.

6. "New Year's Eve
My left foot shuts the door
on the house,
My right foot enters the house,
and steps to my
An appleseed sprout,
my stub of bristle grows
to a foot of springtime.


My left foot is agile
till some years pass;
then it becomes my right foot,
without a pause
I paint wi

I have lost my soul
all the rain clears from my
left foot, heavy and sure

Answer: Edit
------------------

Poem 105 (weight 0.0):
My left foot is a paper
plane – win, win, I'm first.

6. "No Sug
------------------

Poem 106 (weight 0.0):
My left foot
feels the cool ground
warming up at sunrise.

6. "Oh
------------------

Poem 107 (weight 0.0):
I trust my left foot
because it is my left foot. When
I stumble, it catches me. Pre
------------------

Poem 108 (weight 0.0):
I sat lame
on the shaggy brown
carpet by the fire.

6. "Pho
------------------

Poem 109 (weight 0.0):
With left foot crossed
Over right where you lie –
such funny good looks!


6. "Box
------------------

Poem 110 (weight 0.0):
If you hurt my left foot
I go to your right hand
And laugh in your face!

6. "
------------------

Poem 111 (weight 0.0):
My mind left
a cigar-shaped shadow
it taps with my left foot.
</s>
------------------

Poem 112 (weight 0.0):
`I have an old
home in Pie Town,
New Mexico. They say.`
</s>
---

Now let's write a LLaMPPL program for generating haikus by conditioning on the desired target number of syllables.

First, let's set up some globals and some useful masks.

In [10]:
SYLLABLES_PER_LINE = [5, 7, 5] # [5, 3, 5] for a Lune
NEWLINE_TOKEN      = 13

import string

# Useful masks
STARTS_NEW_WORD_MASK = set(i for (i,v) in enumerate(LLM.vocab) 
                           if v[0]==' ' and len(v) > 1 
                           and v[1] not in string.whitespace 
                           and v[1] not in string.punctuation)

CONTINUES_CURRENT_WORD_MASK = set(i for (i,v) in enumerate(LLM.vocab) 
                                  if all(c in '\'’' or c.isalpha() 
                                         for c in v))

NEW_WORD_AFTER_NEWLINE_OR_HYPHEN = STARTS_NEW_WORD_MASK.union(CONTINUES_CURRENT_WORD_MASK)

PUNCTUATION_MASK = set(i for (i,v) in enumerate(LLM.vocab) if v in ',:;.!?"-')

END_POEM_PUNCT = set(i for (i, v) in enumerate(LLM.vocab) if v in '.!?')


Now, let's implement our haiku `Model`.

In this code we will make critical use of the `LMContext.mask_dist(MASK)` function. This function returns a special kind of *Bernoulli* distribution: a distribution over `True` and `False` which returns true with probability given by: 

$$P(\mathrm{True}) = \sum_{t \in V_{\mathrm{LLM}} \cap M} P_{\mathrm{LLM}}(t).$$

In other words, it returns a distribution over a biased coin whose weight is the marginal likelihood of the mask, summing over the vocabulary. We can use this distribution in both the `sample` and `observe` functions:
        
 - `sample(LMContext.mask_dist(MASK))`: This will flip a coin to decide whether or not to use a token in the mask. If the decision is to use the mask, then the `next_token` distribution of the `LMContext` will be updated to sample only from this mask (it will be conditioned on the mask).
 - `observe(LMContext.mask_dist(MASK), True/False)`: Observing a mask to be `True` or `False` will force the `next_token` distribution of the `LMContext` to be conditioned on the mask or its complement (respectively) on the next call to `sample`.
 
Both functions update the importance weights as expected.
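
To make the mask probability concrete, here is a small sketch over a made-up next-token distribution. The dictionary `next_token_probs` and the helpers `mask_probability` and `condition_on_mask` are hypothetical and operate on token strings rather than token ids; they mimic the arithmetic described above, not HFPPL's code.

```python
# Hypothetical next-token distribution (probabilities sum to 1)
next_token_probs = {" sun": 0.5, " moon": 0.25, ",": 0.125, ".": 0.125}
punctuation_mask = {",", "."}


def mask_probability(probs, mask):
    """P(True): total probability the model assigns to tokens in the mask."""
    return sum(p for token, p in probs.items() if token in mask)


def condition_on_mask(probs, mask, value=True):
    """Renormalized distribution restricted to the mask (or to its complement)."""
    kept = {t: p for t, p in probs.items() if (t in mask) == value}
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}


p_true = mask_probability(next_token_probs, punctuation_mask)
inside = condition_on_mask(next_token_probs, punctuation_mask, True)
outside = condition_on_mask(next_token_probs, punctuation_mask, False)

# Observing the mask to be True multiplies the importance weight by P(True)
weight_after_observe = 1.0 * p_true
```

Here observing the punctuation mask to be `True` costs a factor of 0.25 in the weight, after which the next token is drawn from `inside`.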

We will use Bernoulli mask distributions to break up the problem of sampling a word into steps where we first choose a *subset* of words and only later decide which word to use. This is a common pattern in LLaMPPL.
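
The two-step pattern does not change the distribution over tokens: choosing a branch (mask or complement) and then a token within that branch gives each token the same probability as sampling it directly. The sketch below checks this on a made-up distribution; `toy_probs`, `toy_mask`, and `two_stage_probability` are hypothetical names used only for this example.

```python
import math

# Hypothetical next-token distribution and mask
toy_probs = {" sun": 0.5, " moon": 0.25, ",": 0.125, ".": 0.125}
toy_mask = {",", "."}


def two_stage_probability(probs, mask, token):
    """Probability of `token` when first choosing a branch, then a token in it."""
    p_mask = sum(p for t, p in probs.items() if t in mask)
    in_mask = token in mask
    # Stage 1: Bernoulli choice between the mask and its complement
    p_branch = p_mask if in_mask else 1.0 - p_mask
    # Stage 2: token drawn from the distribution renormalized to that branch
    branch_total = sum(p for t, p in probs.items() if (t in mask) == in_mask)
    p_token_given_branch = probs[token] / branch_total
    return p_branch * p_token_given_branch


two_stage = {t: two_stage_probability(toy_probs, toy_mask, t) for t in toy_probs}
```

Because the factorization is exact, the benefit of splitting the choice is not a different distribution but an extra decision point at which the program can observe, condition, or branch.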

In [None]:
class Haiku(Model):
    
    def __init__(self, prompt):
        super().__init__()
        self.context = LMContext(LLM, prompt, 0.7)
        self.line_id = 0
        self.syllables_remaining = SYLLABLES_PER_LINE[self.line_id]
    
    async def step(self):
        
        line_id = self.line_id
        
        # Should the next word begin with a space?
        # (False at the start of a line, or after e.g. a hyphen)
        needs_space = False
        
        # Loop until this line is over
        while self.line_id == line_id:
            
            if needs_space:
                await self.observe(self.context.mask_dist(STARTS_NEW_WORD_MASK), True)
            else:
                await self.observe(self.context.mask_dist(NEW_WORD_AFTER_NEWLINE_OR_HYPHEN), True)

            # Generate a word
            word = ""
            num_tokens_in_word = 0
            
            while True:
                token = await self.sample(self.context.next_token())
                num_tokens_in_word += 1
                word += LLM.vocab[token.token_id]
                
                # Maximum number of tokens in a word, to avoid pathological
                # non-words the LLM sometimes spits out
                if num_tokens_in_word > 4:
                    await self.observe(self.context.mask_dist(CONTINUES_CURRENT_WORD_MASK), 
                                       False)
                    break
                else:
                    # Ask if the LLM wants to continue this word, or is done
                    if not await self.sample(self.context.mask_dist(CONTINUES_CURRENT_WORD_MASK)):
                        break

            needs_space = True
                        
            # Count syllables
            num_syllables = count_syllables(word.strip())
            
            # Reject if the word has no dictionary entry (0 syllables)
            # or has more syllables than remain on this line
            if not (num_syllables > 0 and self.syllables_remaining >= num_syllables):
                self.condition(False)
                self.finish()
                return
            
            # Otherwise, update syllables remaining
            self.syllables_remaining -= num_syllables
            
            # Finish line if necessary
            if self.syllables_remaining == 0:
                self.line_id += 1
                
                # If this was the last line, finish
                if self.line_id >= len(SYLLABLES_PER_LINE):
                    await self.observe(self.context.mask_dist(END_POEM_PUNCT), True)
                    await self.sample(self.context.next_token())
                    await self.observe(self.context.next_token(), LLM.tokenizer.eos_token_id)
                    self.finish()
                    return
                
                # Otherwise, advance to next line
                else:
                    self.syllables_remaining = SYLLABLES_PER_LINE[self.line_id]
                    
                    # Give a chance to add punctuation
                    if await self.sample(self.context.mask_dist(PUNCTUATION_MASK)):
                        punct = await self.sample(self.context.next_token())
                    
                    # Emit a newline
                    await self.observe(self.context.next_token(), NEWLINE_TOKEN)
                    needs_space = False
            
            # Do we want any (mid-line) punctuation?
            elif await self.sample(self.context.mask_dist(PUNCTUATION_MASK)):
                punct = await self.sample(self.context.next_token())
                if str(punct) == '-':
                    needs_space = False
            
        print(str(self.context))
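
The syllable bookkeeping inside `step` can be separated from the LLM calls and tested on its own. The sketch below is a simplified stand-in, assuming the per-word syllable counts are already known; the function `fits_pattern` is hypothetical and ignores punctuation, newlines, and importance weights.

```python
SYLLABLE_PATTERN = [5, 7, 5]  # same pattern as SYLLABLES_PER_LINE above


def fits_pattern(word_syllables, pattern=SYLLABLE_PATTERN):
    """Return True if the words fill every line of the pattern exactly.

    `word_syllables` lists the syllable count of each word in reading order.
    """
    line_id = 0
    remaining = pattern[0]
    for n in word_syllables:
        if line_id >= len(pattern):
            return False  # extra words after the poem is complete
        if n <= 0 or n > remaining:
            return False  # unknown word, or the word overshoots the line
        remaining -= n
        if remaining == 0:
            # Line is full: advance to the next line, if there is one
            line_id += 1
            if line_id < len(pattern):
                remaining = pattern[line_id]
    return line_id == len(pattern)


valid_poem = fits_pattern([1, 1, 1, 2, 2, 2, 3, 5])  # 5 / 7 / 5
overshoots_first_line = fits_pattern([3, 3])
unfinished_poem = fits_pattern([5, 7, 4])
contains_unknown_word = fits_pattern([5, 0, 7, 5])
```

In the model itself the rejection happens incrementally, so a particle is discarded as soon as one word breaks the pattern rather than after the whole poem has been generated.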

In [None]:
LLM.batch_size = 40
particles = await smc_steer(Haiku(haiku_prompt), 20, 6)
#particles = await smc_standard(Haiku(haiku_prompt), 120)
print("--------")
for (i,particle) in enumerate(particles):
    print(f"Poem {i} (weight {particle.weight}):")
    print(f"{particle.context}\n------------------\n")

Here are some fun haikus generated for different titles.

# Large Language
I live and die in<br>
the large language of the heart<br>
which is not spoken.

# My Ego
My ego, so small,<br>
it can't even hold me up.<br>
I'm too big for that.

# Cognitive Science
A computer sits<br>
in my living room. It is<br>
also in my head.

# Steered
Here is a poem<br>
all steered in your direction,<br>
reaching out to you.

Here is a poem<br>
all steered in your direction,<br>
not written by hand.

# Hypothesis Generation
My mother was a<br>
mathematician. I was<br>
her hypothesis.

Hypotheses are<br>
a hypothesis. We need<br>
new ones to prove them.

# Sampling Algorithm
I write a program<br>
to generate poetry,<br>
and it writes itself.

I sample your thoughts<br>
and put them back together<br>
to make a poem.

A poem is a<br>
sample from the entire<br>
vastness of the world.