# ChatOutlines

[Outlines](https://github.com/outlines-dev/outlines) is a library for constrained language generation. It allows you to use large language models (LLMs) with various backends while applying constraints to the generated output.

## Overview

### Integration details

| Class | Package | Local | Serializable | JS support |
| :--- | :--- | :---: | :---: |  :---: |
| [ChatOutlines](https://python.langchain.com/docs/integrations/chat/outlines) | [langchain-community](https://python.langchain.com/docs/integrations/chat/outlines) | ✅ | ❌ | ❌ |

### Model features

| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | Image input | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) | Grammars |
| :---: | :---: | :---: | :---: |  :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | 

## Setup

First, you need to install the required packages:

In [None]:
!pip install langchain-community outlines

## Instantiation

You can instantiate the ChatOutlines model with various backends:

In [None]:
from langchain_community.chat_models.outlines import ChatOutlines

# For llamacpp backend
model = ChatOutlines(model="TheBloke/phi-2-GGUF/phi-2.Q4_K_M.gguf", backend="llamacpp")

# For vllm backend (not available on Mac)
model = ChatOutlines(model="meta-llama/Llama-3.2-1B", backend="vllm")

# For mlxlm backend (only available on Mac)
model = ChatOutlines(model="mistralai/Ministral-8B-Instruct-2410", backend="mlxlm")

# For Hugging Face transformers backend
model = ChatOutlines(model="microsoft/phi-2")  # defaults to transformers backend
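The comments above note that `vllm` is not available on Mac and `mlxlm` is only available on Mac. One way to handle this in a shared script is to choose the backend from the operating system. The following is a minimal sketch: `pick_backend` is a hypothetical helper, not part of LangChain or Outlines, and falling back to `transformers` on non-Mac systems is an assumption made for illustration.

```python
import platform
from typing import Optional


def pick_backend(system: Optional[str] = None) -> str:
    """Hypothetical helper: pick a ChatOutlines backend name for this OS."""
    # platform.system() returns "Darwin" on macOS.
    system = system or platform.system()
    if system == "Darwin":
        # mlxlm is only available on Mac; vllm is not.
        return "mlxlm"
    # Assumption: elsewhere, use the default transformers backend.
    return "transformers"
```

The returned name can be passed as the `backend` argument. The model identifier still has to be one that the chosen backend can load.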

## Basic Usage

You can use the ChatOutlines model for simple chat completions:

In [None]:
from langchain_core.messages import HumanMessage

messages = [HumanMessage(content="What will the capital of Mars be called?")]
response = model.invoke(messages)

response.content

## Streaming

ChatOutlines supports streaming of tokens:

In [None]:
messages = [HumanMessage(content="Count to 10 in French:")]

for chunk in model.stream(messages):
    print(chunk.content, end="", flush=True)
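If you need the full text after streaming rather than printing tokens as they arrive, you can join the chunk contents. This sketch uses stand-in chunk objects so it runs without a model; `collect_stream` and `fake_chunks` are hypothetical names, and the only assumption about real chunks is that each one exposes a string `content` attribute, as in the loop above.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Concatenate the .content of every streamed chunk into one string."""
    return "".join(chunk.content for chunk in chunks)


# Stand-in chunks for illustration; in practice pass model.stream(messages).
fake_chunks = [SimpleNamespace(content=t) for t in ["un", ", ", "deux", ", ", "trois"]]
full_text = collect_stream(fake_chunks)
```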

## Constrained Generation

ChatOutlines allows you to apply various constraints to the generated output:

### Regex Constraint


In [None]:
model.regex = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"

response = model.invoke("What is the IP address of Google's DNS server?")

response.content
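To see what the pattern above accepts, you can test candidate strings against it with Python's standard `re` module, independently of any model. This is a sketch for checking the regex itself; `is_ipv4` is a hypothetical helper name.

```python
import re

# Same pattern as assigned to model.regex above.
IPV4_PATTERN = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"


def is_ipv4(text: str) -> bool:
    """Return True if the whole string matches the IPv4 pattern."""
    return re.fullmatch(IPV4_PATTERN, text) is not None
```

For example, `is_ipv4("8.8.8.8")` is true, while `is_ipv4("256.1.1.1")` and `is_ipv4("8.8.8")` are false.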

### Type Constraints

In [None]:
model.type_constraints = int
response = model.invoke("What is the answer to life, the universe, and everything?")

response.content
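The type constraint shapes the generated text, so to work with the value as a number you still need to convert it. A minimal sketch, assuming `response.content` is the generated text as a string; `content_to_int` is a hypothetical helper.

```python
def content_to_int(content: str) -> int:
    """Convert integer-constrained model output (a string) to an int."""
    # strip() guards against stray surrounding whitespace.
    return int(content.strip())
```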

### Pydantic and JSON Schemas

In [None]:
from pydantic import BaseModel


class Person(BaseModel):
    name: str


model.json_schema = Person
response = model.invoke("Who are the main contributors to LangChain?")
person = Person.model_validate_json(response.content)

person

### Context Free Grammars

In [None]:
model.grammar = """
?start: expression
?expression: term (("+" | "-") term)*
?term: factor (("*" | "/") factor)*
?factor: NUMBER | "-" factor | "(" expression ")"
%import common.NUMBER
%import common.WS
%ignore WS
"""
response = model.invoke("Give me a complex arithmetic expression:")

response.content
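As a rough sanity check on grammar-constrained output, you can verify that a string contains only numbers, the four arithmetic operators, unary minus, and parentheses. The sketch below uses Python's standard `ast` module rather than the grammar itself, so it only approximates the language the grammar describes (Python's number syntax differs in some details). `looks_like_arithmetic` is a hypothetical helper.

```python
import ast

_ALLOWED_BINOPS = (ast.Add, ast.Sub, ast.Mult, ast.Div)


def looks_like_arithmetic(text: str) -> bool:
    """Approximate check: numbers, + - * /, unary minus, parentheses only."""
    try:
        tree = ast.parse(text.strip(), mode="eval")
    except SyntaxError:
        return False

    def ok(node: ast.AST) -> bool:
        if isinstance(node, ast.BinOp):
            return isinstance(node.op, _ALLOWED_BINOPS) and ok(node.left) and ok(node.right)
        if isinstance(node, ast.UnaryOp):
            return isinstance(node.op, ast.USub) and ok(node.operand)
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so exclude it explicitly.
            return isinstance(node.value, (int, float)) and not isinstance(node.value, bool)
        return False

    return ok(tree.body)
```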


## LangChain's Structured Output

You can also use LangChain's Structured Output with ChatOutlines:


In [None]:
from pydantic import BaseModel


class AnswerWithJustification(BaseModel):
    answer: str
    justification: str


_model = model.with_structured_output(AnswerWithJustification)
result = _model.invoke("What weighs more, a pound of bricks or a pound of feathers?")

result

## Full Outlines Documentation

https://dottxt-ai.github.io/outlines/latest/