# 🎳 bumpner example
Below I go through an example showing how the formula of constrained decoding + in-context learning + rules can be useful.

I'm calling this library bumpner because 1) I think it sounds funny, and 2) it's essentially NER with bumpers, like the ones you'd use when you're bad at bowling.

In [1]:
from guidance.models import Transformers
from bumpner import Bumpner
import time

## Load model & motivating example

In [2]:
model = Transformers(
    "Qwen/Qwen1.5-0.5B", 
    trust_remote_code=True, 
    device_map='cuda'
)
few_shot = """
Input: John worked at Apple.
Output:
John: PERSON
worked: O
at: O
Apple: ORG
.: O
"""
text = """
I work at Aperature Science with Mike. 
We work on cool products like the portal gun together. his num is s23ahg.
"""



## Unconstrained generation

In [6]:
from guidance import gen
prompt = f"""
Tag each word in the input with an entity type. 
If none apply, use the tag 'O' to denote "no entity". Most words will have this 'O' label. 

Below is a description of each entity type.

PERSON: People, including fictional.
PRODUCT: Products offered by a company.
ORG: Companies, agencies, institutions, etc.
IDNUMBER: Identifier for Aperture Science employees

---

Input: John worked at Aperature Science.
Output:
John: PERSON
worked: O
at: O
Aperature: ORG
Science: ORG
.: O
---
Input: {text}
"""
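# Specify max_tokens, so the model doesn't generate gibberish for too long.
# Assign the result to a new variable so the shared `model` does not carry
# this prompt and its generation into the Bumpner calls.
lm = model + prompt + gen(max_tokens=256)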

## Constrained Generation, with only in-context learning

Above, we didn't get great results. The model hallucinated input text and didn't give a clear ending to its prediction, so we'd need to set up a regular expression to extract the relevant bits from the text.
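
To make that post-processing cost concrete, here is a rough sketch of the kind of extraction the unconstrained output would need. The `extract_pairs` helper, the `LABELS` tuple, and the sample string are all made up for illustration; none of this is part of bumpner or guidance.

```python
import re

# Hypothetical label set, mirroring the entity types in the prompt above.
LABELS = ("PERSON", "PRODUCT", "ORG", "IDNUMBER", "O")

# One "word: LABEL" pair per line; any other line is ignored.
PAIR = re.compile(r"^(\S+): (" + "|".join(LABELS) + r")$")


def extract_pairs(raw_output: str) -> list[tuple[str, str]]:
    """Pull (word, label) pairs out of free-form model output."""
    pairs = []
    for line in raw_output.splitlines():
        line = line.strip()
        if line == "---":
            # Treat the prompt's own separator as the end of the answer,
            # so hallucinated follow-up examples are dropped.
            break
        match = PAIR.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


# Made-up output, only to exercise the parser.
sample = """Output:
I: O
work: O
Mike: PERSON
---
Input: some hallucinated sentence
Output:
Bob: PERSON
"""
print(extract_pairs(sample))
```

Even this small parser has to guess where the answer ends, and it cannot tell whether the model skipped or invented words.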

Below, we do constrained in-context learning. We're guaranteed that the output will match the {input_word: pred_label} format, but the predicted labels might not be all that good.

In the output, the green highlighted text is what the model generated.

In [5]:
bumpner = Bumpner(
    model,
    """
    PERSON:
      description: People, including fictional.
      
    PRODUCT:
      description: Products offered by a company.
      
    ORG:
      description: Companies, agencies, institutions, etc.
    
    IDNUMBER:
      description: Identifier for Aperture Science employees
    """, 
    few_shot=few_shot
)
start = time.time()
result = bumpner(text)
print(f"Took {time.time() - start} seconds")
del bumpner

Took 0.9095032215118408 seconds
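
The format guarantee in this section comes from only letting the model pick among allowed continuations. The toy sketch below shows that masking idea at the level of whole labels. Real constrained decoding works on token-level logits, and the `constrained_choice` function and the scores here are invented for illustration, not taken from bumpner or guidance.

```python
import math


def constrained_choice(scores: dict[str, float], allowed: list[str]) -> str:
    """Return the highest-scoring candidate, considering only `allowed`."""
    # Anything outside the allowed set is masked out; allowed candidates
    # the model never scored get -inf so they lose to scored ones.
    masked = {cand: scores.get(cand, -math.inf) for cand in allowed}
    return max(masked, key=masked.get)


# Hypothetical scores a model might give to continuations of "Mike: ".
scores = {"is": 2.1, "PERSON": 1.7, "O": 1.2, "ORG": 0.3}
labels = ["PERSON", "PRODUCT", "ORG", "IDNUMBER", "O"]

print(max(scores, key=scores.get))         # unconstrained pick
print(constrained_choice(scores, labels))  # pick restricted to the label set
```

The unconstrained pick can be any string, while the constrained pick is always a valid label, even when it is not the right one.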


## With domain-specific heuristic matching 
Some NER labels don't *need* the power of a language model to perform inference. For these cases, we can define a set of rules (keywords or regular expressions) that route the matching word(s) away from the language model and into our fast, internal rule-based system.

In [4]:
bumpner = Bumpner(
    model,
    # Raw string, so the regex backslashes below are not treated as Python escapes
    r"""
    PERSON:
      description: People, including fictional.
      
    PRODUCT:
      description: Products offered by a company.
      rules:
        keyword:
          - conversion gel
          - portal gun

    ORG:
      description: Companies, agencies, institutions, etc.
    
    IDNUMBER:
      description: Identifier for Aperture Science employees
      rules:
        regex:
          - s\d{2}\w{3}
    """, 
    few_shot=few_shot
)
start = time.time()
result = bumpner(text)
print(f"Took {time.time() - start} seconds")
del bumpner

Took 0.7508370876312256 seconds
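
As a rough picture of what rule-based routing could look like, the sketch below applies the keyword and regex rules from the config above with plain `re`. The `apply_rules` function and the two rule tables are hypothetical, and the case-insensitive keyword matching and word-boundary wrapping are assumptions; bumpner's internal matching may differ.

```python
import re

# Hypothetical rule tables mirroring the YAML config above.
KEYWORDS = {"PRODUCT": ["conversion gel", "portal gun"]}
REGEXES = {"IDNUMBER": [r"s\d{2}\w{3}"]}


def apply_rules(text: str) -> list[tuple[str, str]]:
    """Return (matched_span, label) pairs found by rules alone, in text order."""
    hits = []
    for label, words in KEYWORDS.items():
        for word in words:
            # Assumption: keywords match literally, ignoring case.
            for m in re.finditer(re.escape(word), text, flags=re.IGNORECASE):
                hits.append((m.start(), m.group(0), label))
    for label, patterns in REGEXES.items():
        for pattern in patterns:
            # Assumption: regex rules must match whole words.
            for m in re.finditer(rf"\b{pattern}\b", text):
                hits.append((m.start(), m.group(0), label))
    # Sort by start offset so results follow the order of the input text.
    return [(span, label) for _, span, label in sorted(hits)]


sample = "We work on cool products like the portal gun together. his num is s23ahg."
print(apply_rules(sample))
```

Spans matched this way never reach the language model, which is one plausible reason the run above is faster than the run without rules.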


## Closing Thoughts
While I love the Qwen1.5-0.5B model, the predictions above are not perfect. However, they can be improved by:
1) Swapping a larger base model into bumpner
2) Increasing the number of examples we pass in the `few_shot` argument
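
For the second option, a small helper can keep extra examples in the same `Input:` / `Output:` layout as the `few_shot` string used above. The `format_example` function is hypothetical, the second example sentence is made up, and separating examples with a blank line is an assumption about what bumpner accepts.

```python
def format_example(sentence: str, tagged: list[tuple[str, str]]) -> str:
    """Render one example in the Input/Output layout used by `few_shot`."""
    lines = [f"Input: {sentence}", "Output:"]
    lines += [f"{word}: {label}" for word, label in tagged]
    return "\n".join(lines)


examples = [
    (
        "John worked at Apple.",
        [("John", "PERSON"), ("worked", "O"), ("at", "O"), ("Apple", "ORG"), (".", "O")],
    ),
    # Made-up second example, for illustration only.
    (
        "Mary joined Google.",
        [("Mary", "PERSON"), ("joined", "O"), ("Google", "ORG"), (".", "O")],
    ),
]

# Assumption: examples are separated by a blank line.
few_shot = "\n" + "\n\n".join(format_example(s, t) for s, t in examples) + "\n"
print(few_shot)
```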

Whatever we do, by using the recipe of constrained decoding, in-context learning, and rules, we now have a framework which ensures that the shape of the prediction output and the entities in it are what we expect. No more ["prompt-and-pray"](https://arxiv.org/abs/2402.17882) needed.