In [4]:
import json
import outlines
from transformers import AutoTokenizer
import torch
from textwrap import dedent

In [5]:
model_name = "microsoft/Phi-3-medium-4k-instruct"
model = outlines.models.transformers(
    model_name,
    model_kwargs={
        #'torch_dtype': torch.bfloat16,
        'trust_remote_code': True
    })
tokenizer = AutoTokenizer.from_pretrained(model_name)

`flash-attention` package not found, consider installing for better performance: No module named 'flash_attn'.
Current `flash-attenton` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`.


Loading checkpoint shards:   0%|          | 0/6 [00:00<?, ?it/s]

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [6]:
departments = ["clothing","electronics","kitchen","automotive"]

In [7]:
with open("../examples.json", "r") as fin:
    complaint_data = json.load(fin)

In [8]:
complaint_data

[{'message': 'Hi, my name is Olivia Brown.I recently ordered a knife set from your wellness range, and it arrived earlier this week. Unfortunately, my satisfaction with the product has been less than ideal.My order was A123456',
  'order_number': 'A12-3456',
  'department': 'kitchen'},
 {'message': 'Hi, my name is John Smith.I recently ordered a dress for an upcoming event, which was alleged to meet my expectations both in fit and style. However, upon arrival, it became apparent that the fabric was of subpar quality, leading to a less than satisfactory appearance.The order number is A12-3456',
  'order_number': 'A12-3456',
  'department': 'clothing'},
 {'message': 'Hi, my name is Sarah Johnson.I recently ordered the ultimate ChefMaster 8 Drawer Cooktop. However, upon delivery, I discovered that one of the burners is malfunctioning.My order was A458739',
  'order_number': 'A45-8739',
  'department': 'kitchen'},
 {'message': 'Hi, my name is Jane Doeandcommn.I recently ordered a stylish b

In [9]:
def create_prompt(complaint):
    prompt_messages = [{
        "role": "user",
        "content": dedent("""
        I'm going to provide you with a consumer complaint to analyze.
        The complaint is going to be regarding a product from one of our
        departments. Here is the list of departments:
            - "clothing"
            - "electronics"
            - "kitchen"
            - "automotive"
        Please reply with *only* the name of the department.
        """)
    },{
        "role": "assistant",
        "content": "I understand and will only answer with the department name"
    },{
        "role": "user",
        "content": f"Great! Here is the complaint: {complaint['message']}"
    }]
    prompt = tokenizer.apply_chat_template(prompt_messages, tokenize=False)
    return prompt

In [10]:
create_prompt(complaint_data[0])

'<|user|>\n\nI\'m going to provide you with a consumer complaint to analyze.\nThe complaint is going to be regarding a product from one of our\ndepartments. Here is the list of departments:\n    - "clothing"\n    - "electronics"\n    - "kitchen"\n    - "automotive"\nPlease reply with *only* the name of the department.\n<|end|>\n<|assistant|>\nI understand and will only answer with the department name<|end|>\n<|user|>\nGreat! Here is the complaint: Hi, my name is Olivia Brown.I recently ordered a knife set from your wellness range, and it arrived earlier this week. Unfortunately, my satisfaction with the product has been less than ideal.My order was A123456<|end|>\n<|assistant|>\n'

## Unstructured Generation

In [11]:
generator = outlines.generate.text(model)

In [12]:
for complaint in complaint_data[0:10]:
    prompt = create_prompt(complaint)
    result = generator(prompt, max_tokens=12)
    print(f"LLM: {result} Actual: {complaint['department']}")

You are not running the flash-attention implementation, expect numerical differences.


LLM: kitchen After analyzing the complaint, I' Actual: kitchen
LLM: clothing clothing clothing Actual: clothing
LLM: The department name is "kitchen" Thank you Actual: kitchen
LLM: electronics The information provided in the consumer complaint Actual: electronics
LLM: electronics Thank you for providing the details of your Actual: electronics
LLM: kitchen To further address Sarah Johnson's concern about Actual: kitchen
LLM: electronics Based on the consumer complaint provided by Actual: electronics
LLM: automotive To address this complaint thoroughly, Actual: automotive
LLM: electronics Upon analyzing the complaint, it Actual: electronics
LLM: The complaint is related to a product in the "k Actual: kitchen


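Too much yapping! The model usually names the right department but then keeps talking, and sometimes it wraps the answer in a full sentence. That is not reliable enough for a real-world system.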
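To put a number on this, here is a small sketch that scores the ten replies printed above. The `logged` list is copied by hand from that output (so whitespace may differ from the raw replies), and the two metrics are illustrative choices rather than part of the original notebook.

```python
# (reply, actual) pairs copied from the printed output of the unstructured run
logged = [
    ("kitchen After analyzing the complaint, I'", "kitchen"),
    ("clothing clothing clothing", "clothing"),
    ('The department name is "kitchen" Thank you', "kitchen"),
    ("electronics The information provided in the consumer complaint", "electronics"),
    ("electronics Thank you for providing the details of your", "electronics"),
    ("kitchen To further address Sarah Johnson's concern about", "kitchen"),
    ("electronics Based on the consumer complaint provided by", "electronics"),
    ("automotive To address this complaint thoroughly,", "automotive"),
    ("electronics Upon analyzing the complaint, it", "electronics"),
    ('The complaint is related to a product in the "k', "kitchen"),
]

# A reply is only directly usable if it is exactly the department name
exact = sum(reply.strip() == actual for reply, actual in logged)
# A looser check: does the reply at least begin with the right label?
starts = sum(reply.strip().startswith(actual) for reply, actual in logged)

print(f"exact match: {exact}/{len(logged)}")
print(f"starts with label: {starts}/{len(logged)}")
```

None of the ten replies is exactly a department name, and only eight begin with the right one, so any downstream code would need fragile string parsing to recover the label.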

## Structured Generation

In [10]:
# Constrain the output to exactly one of the department names
generator_struct = outlines.generate.choice(model, departments)


In [11]:
for complaint in complaint_data[0:10]:
    prompt = create_prompt(complaint)
    # No max_tokens needed: generation stops once a department name is complete
    result = generator_struct(prompt)
    print(f"LLM: {result} Actual: {complaint['department']}")
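Why does `outlines.generate.choice` help here? Conceptually, a choice-constrained generator only allows continuations that keep the output on track to become one of the permitted strings. The sketch below is a toy, character-level illustration of that idea, not Outlines' actual implementation; `allowed_next`, `constrained_decode`, and the random `score` function (a stand-in for model logits) are all hypothetical names made up for this example.

```python
import random

DEPARTMENTS = ["clothing", "electronics", "kitchen", "automotive"]
EOS = ""  # sentinel meaning "stop generating"


def allowed_next(prefix, choices):
    """Characters (or EOS) that keep `prefix` a prefix of some choice."""
    allowed = {
        c[len(prefix)]
        for c in choices
        if c.startswith(prefix) and len(c) > len(prefix)
    }
    if prefix in choices:
        allowed.add(EOS)  # a complete choice may stop here
    return allowed


def constrained_decode(score, choices):
    """Greedy decoding that only considers allowed continuations."""
    prefix = ""
    while True:
        options = sorted(allowed_next(prefix, choices))
        # The "model" ranks options, but never sees disallowed ones
        best = max(options, key=lambda ch: score(prefix, ch))
        if best == EOS:
            return prefix
        prefix += best


# Stand-in for model scores: random preferences, seeded for repeatability
rng = random.Random(0)
answer = constrained_decode(lambda prefix, ch: rng.random(), DEPARTMENTS)
print(answer)
```

Whatever the scores are, the result is always one of the four department names, which is the guarantee the unstructured generator could not give.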