
Commit

Exposing grammar as a request parameter in completion/chat with go-side grammar validation
richardanaya committed May 20, 2024
1 parent 105186a commit 80b46f7
Showing 6 changed files with 1,074 additions and 0 deletions.
6 changes: 6 additions & 0 deletions api/types.go
@@ -67,6 +67,9 @@ type GenerateRequest struct {
// Format specifies the format to return a response in.
Format string `json:"format"`

// Grammar specifies the GBNF grammar string to constrain generation output.
Grammar string `json:"grammar"`

// KeepAlive controls how long the model will stay loaded in memory following
// this request.
KeepAlive *Duration `json:"keep_alive,omitempty"`
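
For illustration, a minimal sketch of how the new `Grammar` field might be set from Go client code, assuming the `github.com/ollama/ollama/api` package (`ClientFromEnvironment`, `Client.Generate`) and a locally pulled `llama3` model; the prompt and yes/no grammar are placeholders:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/ollama/ollama/api"
)

func main() {
	// Client configured from OLLAMA_HOST (defaults to http://localhost:11434).
	client, err := api.ClientFromEnvironment()
	if err != nil {
		log.Fatal(err)
	}

	stream := false
	req := &api.GenerateRequest{
		Model:  "llama3", // assumed to be pulled locally
		Prompt: "Are llamas amazing?",
		// Grammar is the field added in this commit; the GBNF rule below
		// constrains the model to the literal output "yes" or "no".
		Grammar: `root ::= "yes" | "no"`,
		Stream:  &stream,
	}

	// With streaming disabled the callback fires once with the full response.
	err = client.Generate(context.Background(), req, func(resp api.GenerateResponse) error {
		fmt.Println(resp.Response)
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```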
@@ -94,6 +97,9 @@ type ChatRequest struct {
// Format is the format to return the response in (e.g. "json").
Format string `json:"format"`

// Grammar specifies the GBNF grammar string to constrain generation output.
Grammar string `json:"grammar"`

// KeepAlive controls how long the model will stay loaded into memory
// following the request.
KeepAlive *Duration `json:"keep_alive,omitempty"`
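
The `ChatRequest` field can be exercised the same way; a hedged sketch assuming the same `api` package (`Client.Chat`, `api.Message`), with the model and grammar again as placeholders:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/ollama/ollama/api"
)

func main() {
	client, err := api.ClientFromEnvironment()
	if err != nil {
		log.Fatal(err)
	}

	stream := false
	req := &api.ChatRequest{
		Model: "llama3", // assumed to be pulled locally
		Messages: []api.Message{
			{Role: "user", Content: "Are llamas amazing?"},
		},
		// Grammar is the ChatRequest field added in this commit.
		Grammar: `root ::= "yes" | "no"`,
		Stream:  &stream,
	}

	err = client.Chat(context.Background(), req, func(resp api.ChatResponse) error {
		fmt.Print(resp.Message.Content)
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```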
16 changes: 16 additions & 0 deletions docs/api.md
@@ -44,6 +44,7 @@ Generate a response for a given prompt with a provided model. This is a streamin
Advanced parameters (optional):

- `format`: the format to return a response in. Currently the only accepted value is `json`
- `grammar`: the [GBNF grammar](https://github.com/ggerganov/llama.cpp/tree/master/grammars) to constrain generated output to
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `system`: system message (overrides what is defined in the `Modelfile`)
- `template`: the prompt template to use (overrides what is defined in the `Modelfile`)
@@ -162,6 +163,21 @@ curl http://localhost:11434/api/generate -d '{
}'
```

#### Request (GBNF mode)

> When `grammar` is set to a [GBNF grammar](https://github.com/ggerganov/llama.cpp/tree/master/grammars), output is constrained to the grammar's rules. This method does not rely on the prompt describing the desired output format.

##### Request

```shell
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Are llamas amazing?",
"grammar": "root ::= \"yes\" | \"no\"",
"stream": false
}'
```

##### Response

