Use Better Models, Better Prompts in Docs #619

Merged
merged 11 commits into from
Feb 7, 2024
82 changes: 33 additions & 49 deletions README.md
@@ -53,7 +53,7 @@ You can reduce the completion to a choice between multiple possibilities:
``` python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?
@@ -73,15 +73,18 @@ You can instruct the model to only return integers or floats:
``` python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")

prompt = "1+1="
prompt = "<s>result of 9 + 9 = 18</s><s>result of 1 + 2 = "
answer = outlines.generate.format(model, int)(prompt)
print(answer)
# 3

prompt = "sqrt(2)="

generator = outlines.generate.format(model, float)
answer = generator(prompt)
answer = generator(prompt, max_tokens=10)
print(answer)
# 1.41421356
```

### Efficient regex-guided generation
@@ -93,7 +96,7 @@ hood:
``` python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

prompt = "What is the IP address of the Google DNS servers? "

@@ -156,7 +159,7 @@ class Character(BaseModel):
    strength: int


model = outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Construct guided sequence generator
generator = outlines.generate.json(model, Character, max_tokens=100)
@@ -165,25 +168,15 @@ generator = outlines.generate.json(model, Character, max_tokens=100)
rng = torch.Generator(device="cuda")
rng.manual_seed(789001)

sequence = generator("Give me a character description", rng=rng)
print(sequence)
# {
# "name": "clerame",
# "age": 7,
# "armor": "plate",
# "weapon": "mace",
# "strength": 4171
# }

sequence = generator("Give me an interesting character description", rng=rng)
print(sequence)
# {
# "name": "piggyback",
# "age": 23,
# "armor": "chainmail",
# "weapon": "sword",
# "strength": 0
# }
character = generator("Give me a character description", rng=rng)

print(repr(character))
# Character(name='Anderson', age=28, armor=<Armor.chainmail: 'chainmail'>, weapon=<Weapon.sword: 'sword'>, strength=8)

character = generator("Give me an interesting character description", rng=rng)

print(repr(character))
# Character(name='Vivian Thr', age=44, armor=<Armor.plate: 'plate'>, weapon=<Weapon.crossbow: 'crossbow'>, strength=125)
```

The method works with union types, optional types, arrays, nested schemas, etc. Some field constraints are [not supported yet](https://github.com/outlines-dev/outlines/issues/215), but everything else should work.
@@ -232,9 +225,9 @@ schema = '''{
}
}'''

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.json(model, schema)
sequence = generator("Give me a character description")
character = generator("Give me a character description")
```

### Using context-free grammars to guide generation
@@ -245,34 +238,25 @@ Formal grammars rule the world, and Outlines makes them rule LLMs too. You can p
import outlines

arithmetic_grammar = """
?start: sum
?start: expression

?sum: product
| sum "+" product -> add
| sum "-" product -> sub
?expression: term (("+" | "-") term)*

?product: atom
| product "*" atom -> mul
| product "/" atom -> div
?term: factor (("*" | "/") factor)*

?atom: NUMBER -> number
| "-" atom -> neg
| "(" sum ")"
?factor: NUMBER
| "-" factor
| "(" expression ")"

%import common.NUMBER
%import common.WS_INLINE

%ignore WS_INLINE
"""

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda")
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = outlines.generate.cfg(model, arithmetic_grammar)
sequence = generator("Write a formula that returns 5 using only additions and subtractions.")

# It looks like Mistral is not very good at arithmetics :)
sequence = generator("Alice had 4 apples and Bob ate 2. Write an expression for Alice's apples:")

print(sequence)
# 1+3-2-4+5-7+8-6+9-6+4-2+3+5-1+1
# (8-2)
```

This was a very simple grammar, and you can use `outlines.generate.cfg` to generate syntactically valid Python, SQL, and much more than this. Any kind of structured text, really. All you have to do is search for "X EBNF grammar" on the web, and take a look at the [Outlines Grammars repository](https://github.com/outlines-dev/grammars).
@@ -288,9 +272,9 @@ import outlines
def add(a: int, b: int):
    return a + b

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("WizardLM/WizardMath-7B-V1.1")
generator = outlines.generate.json(model, add)
result = generator("Return two integers named a and b respectively. a is odd and b even.")
result = generator("Return json with two integers named a and b respectively. a is odd and b even.")

print(add(**result))
# 3
@@ -329,7 +313,7 @@ def labelling(to_label, examples):
{{ to_label }} //
"""

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
prompt = labelling("Just awesome", examples)
answer = outlines.generate.text(model)(prompt, max_tokens=100)
```
51 changes: 27 additions & 24 deletions docs/quickstart.md
@@ -15,7 +15,7 @@ The first step when writing a program with Outlines is to initialize a model. We
```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
```

### Text generator
@@ -27,7 +27,7 @@ Once the model is initialized you can build a text generator. This generator can
```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.text(model, max_tokens=100)

result = generator("What's 2+2?")
@@ -43,7 +43,7 @@ Once the model is initialized you can build a text generator. This generator can
```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.text(model, max_tokens=100)

stream = generator.stream("What's 2+2?")
@@ -64,7 +64,7 @@ Outlines allows you to do multi-label classification by guiding the model so it
```python
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.choice(model, ["Blue", "Red", "Yellow"])

color = generator("What is the closest color to Indigo? ")
@@ -96,7 +96,7 @@ Outlines can guide models so that they output valid JSON **100%** of the time. Y
    armor: Armor
    strength: conint(gt=1, lt=100)

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.json(model, Character)

character = generator(
@@ -131,7 +131,7 @@ Outlines can guide models so that they output valid JSON **100%** of the time. Y
"type": "object"
}"""

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = outlines.generate.json(model, schema)
character = generator(
"Generate a new character for my awesome game: "
@@ -175,12 +175,12 @@ arithmetic_grammar = """
%ignore WS_INLINE
"""

model = models.transformers("mistralai/Mistral-7B-v0.1")
model = models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = generate.cfg(model, arithmetic_grammar, max_tokens=100)

result = generator("Write a series of operations on integers that return the number 5")
result = generator("Question: How can you write 5*5 using addition?\nAnswer:")
print(result)
# 4*5*3*2*1/6*4*3*2*1/2*1*1*1/4*1*1*1/2*1*1*1/2*1*1/2*1*1*5*1/2*2*1*1/2*1*1*6*1*1/2*1*1*1*1*2*1*1*1*1
# 5+5+5+5+5
```
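To check by hand whether an output such as `5+5+5+5+5` conforms to an arithmetic grammar, a small recursive-descent recognizer is enough. The sketch below mirrors the `expression` / `term` / `factor` rules from the README hunk in this PR and uses only the standard library. It is not how Outlines or Lark process the grammar: `conforms` and `tokenize` are hypothetical helpers, and `NUMBER` is simplified to unsigned integers and decimals, which is narrower than Lark's `common.NUMBER`.

```python
import re

# Assumption: NUMBER is simplified to digits with an optional fractional part.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|[-+*/()]")


def tokenize(text):
    """Return the list of tokens, or None if an illegal character is found."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] in " \t":  # mirrors `%ignore WS_INLINE`
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            return None
        tokens.append(match.group())
        pos = match.end()
    return tokens


def conforms(text):
    """Return True when `text` is a complete expression under the grammar."""
    tokens = tokenize(text)
    if tokens is None:
        return False
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        # factor: NUMBER | "-" factor | "(" expression ")"
        nonlocal pos
        tok = peek()
        if tok is None:
            return False
        if tok == "-":
            pos += 1
            return factor()
        if tok == "(":
            pos += 1
            if not expression() or peek() != ")":
                return False
            pos += 1
            return True
        if tok[0].isdigit():
            pos += 1
            return True
        return False

    def term():
        # term: factor (("*" | "/") factor)*
        nonlocal pos
        if not factor():
            return False
        while peek() in ("*", "/"):
            pos += 1
            if not factor():
                return False
        return True

    def expression():
        # expression: term (("+" | "-") term)*
        nonlocal pos
        if not term():
            return False
        while peek() in ("+", "-"):
            pos += 1
            if not term():
                return False
        return True

    # The whole input must be consumed for the string to conform.
    return expression() and pos == len(tokens)


print(conforms("5+5+5+5+5"))  # True
print(conforms("(8-2)"))      # True
print(conforms("5+"))         # False
```

A recognizer like this only confirms that a string is syntactically valid. As the sample outputs in this diff show, a grammar constrains the shape of the answer, not whether the arithmetic is right.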


@@ -189,12 +189,12 @@ EBNF grammars can be cumbersome to write. This is why Outlines provides grammar
```python
from outlines import models, generate, grammars

model = models.transformers("mistralai/Mistral-7B-v0.1")
model = models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = generate.cfg(model, grammars.arithmetic, max_tokens=100)

result = generator("Write a series of operations on integers that return the number 5 ")
result = generator("Question: How can you write 5*5 using addition?\nAnswer:")
print(result)
# 100-2-75+50-18+27-501.
# 5+5+5+5+5
```

The available grammars are listed [here](https://github.com/outlines-dev/outlines/tree/main/outlines/grammars).
@@ -207,14 +207,14 @@ Slightly simpler, but no less useful, Outlines can generate text that is in the
```python
from outlines import models, generate

model = models.transformers("mistralai/Mistral-7B-v0.1")
model = models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

regex_str = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
generator = generate.regex(model, regex_str)

result = generator("What is the IP address of Google's DNS sever?")
result = generator("What is the IP address of localhost?\nIP: ")
print(result)
# 0.0.0.0
# 127.0.0.100
```
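As a quick sanity check that needs no model, the IPv4 pattern from the snippet above can be exercised with Python's standard `re` module. This is only a sketch: `is_ipv4` is a hypothetical helper and not part of Outlines.

```python
import re

# Same pattern as `regex_str` in the snippet above.
regex_str = r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)"
IPV4 = re.compile(regex_str)


def is_ipv4(text: str) -> bool:
    """Return True when the whole string matches the IPv4 pattern."""
    return IPV4.fullmatch(text) is not None


print(is_ipv4("127.0.0.100"))  # True
print(is_ipv4("0.0.0.0"))      # True
print(is_ipv4("256.1.1.1"))    # False
```

Both sample outputs in this diff (`0.0.0.0` and `127.0.0.100`) match the pattern. The regex guarantees the format of the answer, not that the answer is correct.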

### Generate a given Python type
@@ -224,7 +224,7 @@ We provide a shortcut to regex-guided generation for simple use cases. Pass a Py
```python
from outlines import models, generate

model = models.transformers("mistralai/Mistral-7B-v0.1")
model = models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = generate.format(model, int)

result = generator("What is 2+2?")
@@ -245,15 +245,15 @@ python -m outlines.serve.serve
This will by default start a server at `http://127.0.0.1:8000` (check what the console says, though) with the OPT-125M model. If you want to specify another model:

```python
python -m outlines.serve.serve --model="mistralai/Mistral-7B-v0.1"
python -m outlines.serve.serve --model="mistralai/Mistral-7B-Instruct-v0.2"
```

You can then query the model in shell by passing a prompt and a [JSON Schema][jsonschema]{:target="_blank"} specification for the structure of the output:

```bash
curl http://0.0.0.1:8000 \
curl http://127.0.0.1:8000/generate \
-d '{
"prompt": "What is the capital of France?",
"prompt": "Question: What is a language model? Answer:",
"schema": {"type": "string"}
}'
```
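The same request can be sketched in Python with only the standard library. The `/generate` endpoint and the `prompt` / `schema` payload keys come from the `curl` call above. The `build_generate_request` helper and the JSON `Content-Type` header are assumptions made for illustration, and the commented-out send step needs the server from the previous step to be running.

```python
import json
import urllib.request


def build_generate_request(prompt, schema, base_url="http://127.0.0.1:8000"):
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = json.dumps({"prompt": prompt, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=payload,
        # Assumed header: the curl call above does not set it explicitly.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request(
    "Question: What is a language model? Answer:",
    {"type": "string"},
)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```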
@@ -329,25 +329,28 @@ Once you are done experimenting with a prompt and an output structure, it is use

@outlines.prompt
def tell_a_joke(topic):
    """Tell me a joke about {{ topic }}."
    """Tell me a joke about {{ topic }}."""

class Joke(BaseModel):
    setup: str
    punchline: str

fn = outlines.Function(
generate_joke = outlines.Function(
    tell_a_joke,
    Joke,
    "mistralai/Mistral-7B-v0.1"
    "mistralai/Mistral-7B-Instruct-v0.2"
)
```

=== "Call a function"

```python
from .function import fn as joke
from .function import generate_joke

response = joke("baseball")
response = generate_joke("baseball")

# haha
# Joke(setup='Why was the baseball in a bad mood?', punchline='Because it got hit around a lot.')
```

=== "Call a function stored on GitHub"