
When using json schema in vLLM/Aphrodite-engine, lmfe generates a lot of ":" as json properties #94

Closed
sgsdxzy opened this issue Apr 29, 2024 · 14 comments


sgsdxzy commented Apr 29, 2024

The JSON schema to send as "guided_json" in the request to the OpenAI-compatible API of vLLM/Aphrodite:

json_template = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "tool_name": {"type": "string"},
            "parameters": {
                "type": "object",
                "additionalProperties": {
                    "anyOf": [{"type": "string"}, {"type": "number"}, {"type": "boolean"}]
                },
            },
        },
        "required": ["tool_name", "parameters"],
        "additionalProperties": {},
    },
    "minItems": 1,
    "maxItems": 1,
}

Model output without any constraint:

[
    {
        "tool_name": "internet_search",
        "parameters": {
            "query": "biggest penguin species",
            "provider": "Google"
        }
    }
]

Model output with "guided_json" and "guided_decoding_backend": "lm-format-enforcer"

[
    {
        "tool_name": "internet_search",
        "parameters": {
           ":": "biggest penguin species world"
        }
    }
]

Model output with "guided_json" and "guided_decoding_backend": "outlines"

[
    {
        "tool_name": "internet_search",
        "parameters": {
            "query": "biggest penguin species",
            "provider": "Google"
        }
    }
]
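Both the degenerate lmfe output and the unconstrained/outlines output satisfy the schema above. A minimal hand-rolled check (a sketch, not part of the issue, using no schema library) makes this concrete:

```python
# Sketch: check the specific constraints of the schema above by hand --
# a one-item array, required keys present, parameter values scalar.
def conforms(doc):
    if not (isinstance(doc, list) and len(doc) == 1):
        return False  # minItems == maxItems == 1
    item = doc[0]
    if not isinstance(item, dict):
        return False
    if not isinstance(item.get("tool_name"), str):
        return False
    params = item.get("parameters")
    if not isinstance(params, dict):
        return False
    # additionalProperties: anyOf string / number / boolean
    return all(isinstance(v, (str, int, float, bool)) for v in params.values())

degenerate = [{"tool_name": "internet_search",
               "parameters": {":": "biggest penguin species world"}}]
expected = [{"tool_name": "internet_search",
             "parameters": {"query": "biggest penguin species",
                            "provider": "Google"}}]

print(conforms(degenerate), conforms(expected))  # True True
```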

The model: CohereForAI/c4ai-command-r-plus
The prompt: (adapted from the function calling example of c4ai-command-r-plus)


noamgat commented Apr 30, 2024 via email


sgsdxzy commented Apr 30, 2024

Yeah, I was using release 0.4.1; I'll try the latest, thanks. Could you point me to the related vLLM PR?


sgsdxzy commented Apr 30, 2024

I am using vllm main with lmfe==0.9.8, and during the same request I encountered:

ERROR:root:Unknown LMFormatEnforcer Problem. Prefix: '[
    {
        "tool_name": "internet_search",
        "parameters": {
           "hquery": "biggest penguin in the world",
           "hprovider": "Google"
        }
'
Terminating the parser. Please open an issue at
https://github.com/noamgat/lm-format-enforcer/issues with the prefix and CharacterLevelParser parameters
Traceback (most recent call last):
  File "/home/sgsdxzy/micromamba/envs/vllm/lib/python3.11/site-packages/lmformatenforcer/tokenenforcer.py", line 96, in _compute_allowed_tokens
    self._collect_allowed_tokens(state.parser, self.tokenizer_tree.root, allowed_tokens, shortcut_key)
  File "/home/sgsdxzy/micromamba/envs/vllm/lib/python3.11/site-packages/lmformatenforcer/tokenenforcer.py", line 144, in _collect_allowed_tokens
    self._collect_allowed_tokens(next_parser, next_tree_node, allowed_tokens, None)
  File "/home/sgsdxzy/micromamba/envs/vllm/lib/python3.11/site-packages/lmformatenforcer/tokenenforcer.py", line 144, in _collect_allowed_tokens
    self._collect_allowed_tokens(next_parser, next_tree_node, allowed_tokens, None)
  File "/home/sgsdxzy/micromamba/envs/vllm/lib/python3.11/site-packages/lmformatenforcer/tokenenforcer.py", line 142, in _collect_allowed_tokens
    next_parser = parser.add_character(character)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sgsdxzy/micromamba/envs/vllm/lib/python3.11/site-packages/lmformatenforcer/jsonschemaparser.py", line 63, in add_character
    while new_character not in self.object_stack[receiving_idx].get_allowed_characters():
                               ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
IndexError: list index out of range

Any idea why?


noamgat commented Apr 30, 2024 via email


sgsdxzy commented Apr 30, 2024

Yes, changing the outermost array to an object works fine.
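The workaround can be sketched like this (the wrapper property name "tool_calls" is hypothetical, not from the issue; the item schema is the one posted above):

```python
# Hedged sketch of the workaround: wrap the top-level array in an object.
# The property name "tool_calls" is an illustrative choice.
json_template_obj = {
    "type": "object",
    "properties": {
        "tool_calls": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "tool_name": {"type": "string"},
                    "parameters": {
                        "type": "object",
                        "additionalProperties": {
                            "anyOf": [
                                {"type": "string"},
                                {"type": "number"},
                                {"type": "boolean"},
                            ]
                        },
                    },
                },
                "required": ["tool_name", "parameters"],
            },
            "minItems": 1,
            "maxItems": 1,
        }
    },
    "required": ["tool_calls"],
}
```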


sgsdxzy commented Apr 30, 2024

I might have just gotten a lucky draw earlier. If I fix the random seed, I still get

[
    {
        "tool_name": "internet_search",
        "parameters": {
           ":": "biggest penguin species world"
        }
    }
]

as the response on lmfe 0.9.8, alongside the Unknown LMFormatEnforcer Problem error above.


noamgat commented May 1, 2024

Thanks! I hope to look at this in the coming days.


noamgat commented May 3, 2024

I just released 0.9.10 with a fix that should remove the Unknown LMFormatEnforcer Problem issue. However, I'm not sure it will solve your problem, as the last response you posted conforms to the JSON schema you posted (obviously it's not a good one, but that's not LMFE's job).


sgsdxzy commented May 4, 2024

I can confirm 0.9.10 fixed the ERROR:root:Unknown LMFormatEnforcer Problem.
However, I think lmfe gives the LLM wrong logits that prevent some valid responses from being generated. It makes conforming to the schema necessary but not sufficient: it does enforce the JSON schema, so all generations are valid, but not all valid generations are allowed.

[
    {
        "tool_name": "internet_search",
        "parameters": {
            "query": "biggest penguin species",
            "provider": "Google"
        }
    }
]

also conforms to the schema and should be a much more likely generation (the LLM produces it without any constraint, and outlines allows it), but it is somehow prevented by lmfe.


noamgat commented May 4, 2024


I now see the problem - this language model prefers spaces over tabs, with an indentation width of 4. This causes the completion to contain 13 consecutive whitespace characters (a newline plus 12 spaces). LMFE has a heuristic constant MAX_CONSECUTIVE_WHITESPACES=12 to avoid infinite whitespace loops (which are legal JSON, but probably unwanted). If you are able to test this, can you try increasing this constant and trying again?

One of the upcoming features I plan to add is allowing environment variables to modify some of these configurations, making it easier to change LMFE's heuristics in non-code environments (such as the vLLM OpenAI server).
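The indentation arithmetic can be checked directly; a sketch (not from the issue) that measures the longest whitespace run in a 4-space-indented completion:

```python
# Sketch: at nesting depth 3, a 4-space-indented JSON document puts a
# newline plus 12 spaces before the innermost keys -- 13 consecutive
# whitespace characters, one past a limit of 12.
import json
import re

doc = [{"tool_name": "internet_search",
        "parameters": {"query": "biggest penguin species"}}]
text = json.dumps(doc, indent=4)
longest = max(len(run) for run in re.findall(r"\s+", text))
print(longest)  # 13
```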


sgsdxzy commented May 4, 2024

I can confirm that setting MAX_CONSECUTIVE_WHITESPACES in consts.py to a larger value completely fixes this.
Yes, making it configurable through environment variables would be a better solution.


noamgat commented May 4, 2024

#97 - Coming very soon :)


noamgat commented May 4, 2024

Released in v0.10.1. Can you check if you can now solve the problem via Configuration Options?
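Raising the limit via an environment variable before the server process imports lm-format-enforcer might look like the sketch below. The variable name is an assumption based on the configuration-options feature described in this thread; consult the lm-format-enforcer README for the authoritative spelling.

```python
# Hedged sketch: LMFE_MAX_CONSECUTIVE_WHITESPACES is an assumed name,
# not confirmed by this thread. It must be set before the library reads
# its configuration (e.g. before launching the vLLM OpenAI server).
import os

os.environ["LMFE_MAX_CONSECUTIVE_WHITESPACES"] = "30"
```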


sgsdxzy commented May 4, 2024

v0.10.1 fixes this issue.

@sgsdxzy sgsdxzy closed this as completed May 4, 2024