Add support for guided decoding (fixes #288) #2815

br3no · 2024-02-08T16:14:03Z

This pull request extends api_server.py with optional guided decoding via regex or JSON schema, resolving issue #288.

The API gains two optional request parameters, regex and schema. The feature is opt-in and must be enabled with the CLI parameter --guided-decoding-engine, which defaults to None and accepts the value outlines.
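
As a rough illustration of the opt-in behaviour, the flag could be declared along these lines. This is a minimal argparse sketch, not the code from this PR: the `build_parser` helper is hypothetical, and only the flag name, its default, and the accepted value are taken from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper: only the flag name, default and accepted value
    # come from the PR description.
    parser = argparse.ArgumentParser(description="api_server sketch")
    parser.add_argument(
        "--guided-decoding-engine",
        type=str,
        default=None,          # guided decoding stays off unless requested
        choices=["outlines"],  # the only engine mentioned in this PR
        help="Engine used for guided decoding (disabled when omitted).",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--guided-decoding-engine", "outlines"])
print(args.guided_decoding_engine)
```

Restricting `choices` makes an unsupported engine name fail at startup rather than at request time.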

The new module guided_decoding.py implements the integration of the outlines logits processors.

This pull request also changes the API of the vLLM LogitsProcessors to add a sequence id for each sequence, to support stateful logits processors.
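
To illustrate why a sequence id helps, here is a toy sketch of a stateful logits processor in plain Python. The call signature `(seq_id, token_ids, logits)`, the class name, and the transition table are assumptions made for illustration; they are not the signature introduced by this PR or the outlines implementation. The point is only that when one processor object serves several sequences (for example with n > 1), per-sequence state has to be keyed by something like a sequence id.

```python
import math
from typing import Dict, List


class ToyFSMLogitsProcessor:
    """Hypothetical processor that tracks one FSM state per sequence id."""

    def __init__(self, transitions: Dict[int, Dict[int, int]]):
        # transitions[state][token_id] -> next state
        self.transitions = transitions
        self.state_by_seq: Dict[int, int] = {}

    def __call__(self, seq_id: int, token_ids: List[int],
                 logits: List[float]) -> List[float]:
        if seq_id not in self.state_by_seq:
            # First call for this sequence: start in the initial state.
            state = 0
        else:
            # Advance the stored state with the most recently sampled token.
            state = self.transitions[self.state_by_seq[seq_id]][token_ids[-1]]
        self.state_by_seq[seq_id] = state
        allowed = self.transitions.get(state, {})
        # Mask every token the current state does not allow.
        return [value if token in allowed else -math.inf
                for token, value in enumerate(logits)]


# Toy vocabulary: 0 = "a", 1 = "b", 2 = "<eos>"; accepted output is "ab<eos>".
transitions = {0: {0: 1}, 1: {1: 2}, 2: {2: 2}}
proc = ToyFSMLogitsProcessor(transitions)
logits = [0.1, 0.2, 0.3]

first_7 = proc(seq_id=7, token_ids=[], logits=logits)
first_8 = proc(seq_id=8, token_ids=[], logits=logits)
second_7 = proc(seq_id=7, token_ids=[0], logits=logits)
print(first_7, first_8, second_7)
```

Sequence 7 advances to the next state while sequence 8 stays in the initial state, even though both go through the same processor object.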

Starting the api_server with the extra CLI arguments --guided-decoding-engine outlines allows one to issue these kinds of requests:

import requests

response = requests.post("http://hal9000:1984/generate", json={
    "prompt": "The best language for type-safe systems programming is ",
    # Raw string so the regex escape \+ is not read as a Python string escape.
    "regex": r"(Python|Java|C|C\+\+|C#|JavaScript|PHP|Swift|Go|Ruby|TypeScript|Kotlin|Rust)",
    "max_tokens": 10,
    "n": 2
})
response.json()

Producing

{'text': ['The best language for type-safe systems programming is C++',
          'The best language for type-safe systems programming is Go']}
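
A quick client-side sanity check for such a response could look like the sketch below. It assumes, as in the output above, that the server echoes the prompt before the completion, and it uses Python's `re` module, whose regex dialect may differ from the one used by the guided decoding engine. The helper name `completion_matches` is hypothetical.

```python
import re

PATTERN = re.compile(
    r"(Python|Java|C|C\+\+|C#|JavaScript|PHP|Swift|Go|Ruby|TypeScript|Kotlin|Rust)"
)
PROMPT = "The best language for type-safe systems programming is "


def completion_matches(text: str, prompt: str = PROMPT) -> bool:
    # The response echoes the prompt, so strip it before matching.
    if not text.startswith(prompt):
        return False
    return PATTERN.fullmatch(text[len(prompt):]) is not None


outputs = [
    "The best language for type-safe systems programming is C++",
    "The best language for type-safe systems programming is Go",
]
results = [completion_matches(t) for t in outputs]
print(results)
```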

response = requests.post("http://hal9000:1984/generate", json={
    "prompt" : """Return a json object for the following schema: {"type": "object", "properties": {"name": {"type": "string", "maxLength" : 20}, "age": {"type": "integer"}}}""",
    "schema" : {"type": "object", "properties": {"name": {"type": "string", "maxLength" : 20}, "age": {"type": "integer"}}}
})
response.json()

Producing

{'text': ['Return a json object for the following schema: {"type": "object", "properties": {"name": {"type": "string", "maxLength" : 20}, "age": {"type": "integer"}}}: {"name": "John", "age": 30}']}
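
The returned text can be checked on the client in a similar way. The sketch below strips the echoed prompt, decodes the first JSON object in the completion, and applies hand-written checks that approximate the example schema. The helper names are hypothetical, and a general-purpose JSON Schema validator would be the more robust choice in practice.

```python
import json

PROMPT = (
    'Return a json object for the following schema: {"type": "object", '
    '"properties": {"name": {"type": "string", "maxLength" : 20}, '
    '"age": {"type": "integer"}}}'
)


def extract_json(text: str, prompt: str = PROMPT) -> dict:
    # Drop the echoed prompt, then decode the first JSON object that follows.
    completion = text[len(prompt):] if text.startswith(prompt) else text
    start = completion.find("{")
    if start == -1:
        raise ValueError("no JSON object found in completion")
    obj, _ = json.JSONDecoder().raw_decode(completion[start:])
    return obj


def conforms(obj: object) -> bool:
    # Approximates the example schema; neither property is required by it.
    if not isinstance(obj, dict):
        return False
    if "name" in obj:
        name = obj["name"]
        if not isinstance(name, str) or len(name) > 20:
            return False
    if "age" in obj:
        age = obj["age"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(age, bool) or not isinstance(age, int):
            return False
    return True


text = PROMPT + ': {"name": "John", "age": 30}'
parsed = extract_json(text)
print(parsed, conforms(parsed))
```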

The changes here are partially inspired by the integration of vLLM in outlines (https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py).

Added support for guided decoding in `api_server` by integrating _outlines_ (https://github.com/outlines-dev/outlines).

esmeetu · 2024-02-09T04:53:22Z

Hi, @br3no
What's the difference between https://github.com/noamgat/lm-format-enforcer and outlines? AFAIK, both can produce JSON- and regex-structured output.

simon-mo · 2024-02-09T05:28:14Z

I think outlines has lower runtime overhead than lm-format-enforcer.

br3no · 2024-02-09T07:18:29Z

@esmeetu I actually have a branch with support for lm-format-enforcer. I chose not to add it to this PR because I couldn't yet find a way to reach reasonable speed. You can follow the discussion here: noamgat/lm-format-enforcer#65 (comment)

br3no · 2024-02-13T15:35:01Z

@simon-mo let me know if you feel this goes in the right direction and if there is anything I can do to help in the review process.

simon-mo · 2024-02-13T22:31:21Z

Hi @br3no, thank you sooo much for this PR. However, I'm bummed to say that we are not going to add more functionality to the simple API server; instead, we are focusing complex features on the OpenAI-compatible server. I think we will end up merging #2819 (which is based on your commit, and you are one of the co-authors!). Any review on that PR is appreciated.

br3no · 2024-02-14T08:20:08Z

Hi @simon-mo, no worries. I'll chime in on PR #2819.

br3no added 3 commits February 8, 2024 16:53

vllm-project#288 guided decoding

c1c8d39

Added support for guided decoding in `api_server` by integrating _outlines_ (https://github.com/outlines-dev/outlines).

Merge branch 'SUPPORT_GUIDED_DECODING-vllm-project#288' into main

04f1e19

Fixing ruff and yapf complaints

8b0395a

simon-mo self-assigned this Feb 8, 2024

simon-mo closed this Feb 13, 2024

felixzhu555 mentioned this pull request Feb 14, 2024

Add guided decoding for OpenAI API server #2819

Merged

kevinbu233 mentioned this pull request Apr 16, 2024

Added Support for guided decoding in offline interface #4130

Closed
