Error when trying to run the grammar example with Mistral #580

Closed
dborowiec10 opened this issue Jan 24, 2024 · 5 comments · Fixed by #619
Labels
bug documentation Linked to documentation and examples grammar help wanted


@dborowiec10

Describe the issue as clearly as possible:

When trying to run the example from the README, it fails with a recursion depth exceeded error.
Running on a pretty straightforward Nvidia A100 80GB setup and environment as per the pyproject.toml.

Steps/code to reproduce the bug:

import outlines

arithmetic_grammar = """
    ?start: sum

    ?sum: product
        | sum "+" product   -> add
        | sum "-" product   -> sub

    ?product: atom
        | product "*" atom  -> mul
        | product "/" atom  -> div

    ?atom: NUMBER           -> number
         | "-" atom         -> neg
         | "(" sum ")"

    %import common.NUMBER
    %import common.WS_INLINE

    %ignore WS_INLINE
"""

model = outlines.models.transformers("mistralai/Mistral-7B-v0.1", device="cuda")
generator = outlines.generate.cfg(model, arithmetic_grammar)
sequence = generator("Write a formula that returns 5 using only additions and subtractions.")

# It looks like Mistral is not very good at arithmetics :)

print(sequence)
# 1+3-2-4+5-7+8-6+9-6+4-2+3+5-1+1

Expected result:

1+3-2-4+5-7+8-6+9-6+4-2+3+5-1+1

As per example in README

Error message:

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:06<00:06,  6.81s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00,  4.57s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00,  4.90s/it]
Traceback (most recent call last):
  File "/home/user/Work/bdnf_experiments/bdnf_experiments/main2.py", line 26, in <module>
    sequence = generator("Write a formula that returns 5 using only additions and subtractions.")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/outlines/generate/api.py", line 213, in __call__
    last_state = next(states)
                 ^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/outlines/generate/generator.py", line 82, in sequence_generator
    allowed_tokens = get_allowed_tokens(fsms, fsm_states)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/outlines/generate/generator.py", line 190, in get_allowed_tokens
    return [fsm.allowed_token_ids(state) for fsm, state in zip(fsms, fsm_states)]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/outlines/generate/generator.py", line 190, in <listcomp>
    return [fsm.allowed_token_ids(state) for fsm, state in zip(fsms, fsm_states)]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/outlines/fsm/fsm.py", line 304, in allowed_token_ids
    self._set_next_regex_fsm()
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/outlines/fsm/fsm.py", line 252, in _set_next_regex_fsm
    options = {self.terminal_regexps[x] for x in interactive.accepts()}
                                                 ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py", line 112, in accepts
    new_cursor = copy(self)
                 ^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 84, in copy
    return copier(x)
           ^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py", line 68, in __copy__
    copy(self.parser_state),
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 84, in copy
    return copier(x)
           ^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/parsers/lalr_parser_state.py", line 61, in __copy__
    deepcopy(self.value_stack),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 206, in _deepcopy_list
    append(deepcopy(a, memo))
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 153, in deepcopy
    y = copier(memo)
        ^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/tree.py", line 207, in __deepcopy__
    return type(self)(self.data, deepcopy(self.children, memo), meta=self._meta)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 206, in _deepcopy_list
    append(deepcopy(a, memo))
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 153, in deepcopy
    y = copier(memo)
        ^^^^^^^^^^^^


.... Repeated 100s of times



  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/tree.py", line 207, in __deepcopy__
    return type(self)(self.data, deepcopy(self.children, memo), meta=self._meta)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 206, in _deepcopy_list
    append(deepcopy(a, memo))
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 153, in deepcopy
    y = copier(memo)
        ^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/tree.py", line 207, in __deepcopy__
    return type(self)(self.data, deepcopy(self.children, memo), meta=self._meta)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 146, in deepcopy
    y = copier(x, memo)
        ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 206, in _deepcopy_list
    append(deepcopy(a, memo))
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/copy.py", line 153, in deepcopy
    y = copier(memo)
        ^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/lexer.py", line 263, in __deepcopy__
    return Token(self.type, self.value, self.start_pos, self.line, self.column)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/lexer.py", line 210, in __new__
    return cls._future_new(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/bdnf-experiments-4KtN4JnD-py3.11/lib/python3.11/site-packages/lark/lexer.py", line 215, in _future_new
    inst = super(Token, cls).__new__(cls, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded while calling a Python object

Outlines/Python version information:

[tool.poetry]
name = "bdnf_experiments"
version = "0.1.0"
description = REDACTED
authors = REDACTED
packages = [{include = "bdnf_experiments"}]

[tool.poetry.dependencies]
python = "^3.11"
outlines = "^0.0.24"
openai = "^1.9.0"
tiktoken = "^0.5.2"
transformers = "^4.37.1"
datasets = "^2.16.1"
parserllm = "^0.0.2"
lark = "^1.1.9"
accelerate = "^0.26.1"

Context for the issue:

No response

@lapp0
Collaborator

lapp0 commented Jan 25, 2024

Thanks for submitting an issue!

I was able to reproduce, both in 0.0.24 and main. Here is the value of self.generation which self.parser.parse_interactive(self.generation) is called with:

'5+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3+2+2+2+2+2+2+2+2-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2'

Here is a related issue in Lark, where excessive recursion raises the same error once the input is 242 terminals deep: lark-parser/lark#550

The output of the following is unsurprising given the linked issue:

>>> generation = '5+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-7-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3-3+2+2+2+2+2+2+2+2-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2'
>>> import re
>>> len(re.split(r'[+-]', generation))
242
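To illustrate why the term count matters, here is a minimal stdlib-only sketch. It assumes the parse tree of a left-recursive rule such as `sum: sum "+" product` nests one level deeper per term, and it stands in nested lists for Lark's `Tree` objects; `left_nested_tree` and `deepcopy_survives` are hypothetical helper names, not part of Lark or Outlines.

```python
import copy
import sys


def left_nested_tree(num_terms):
    """Build the left-deep shape a rule like `sum: sum "+" product` yields."""
    node = ["1"]
    for _ in range(num_terms - 1):
        node = [node, "+", "1"]  # each extra term adds one nesting level
    return node


def deepcopy_survives(num_terms):
    """Return True if copy.deepcopy finishes, False on RecursionError."""
    tree = left_nested_tree(num_terms)
    try:
        copy.deepcopy(tree)
    except RecursionError:
        return False
    return True


# A short formula copies fine; one with as many terms as the recursion
# limit cannot, because deepcopy needs two Python frames per list level.
print(deepcopy_survives(10))
print(deepcopy_survives(sys.getrecursionlimit()))
```

In the traceback above, each tree level appears to cost about four frames (`deepcopy`, `_deepcopy_list`, `deepcopy`, `Tree.__deepcopy__`), which would be consistent with hitting Python's default limit of 1000 at roughly 242 terms.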

The way I see it we have 3 options here:

    1. Update the grammar so it doesn't have this issue
    2. Update the parser so it doesn't have a recursion issue (the modified LALR parser from Add Grammars vllm-project/vllm#2105 would fix this issue I believe)
    3. Disable excessive grammar recursion.

I'm leaning towards 2. The first option is a one-off and theoretically doesn't always solve the issue.
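As a rough illustration of what option 2 could mean, the sketch below copies a deeply nested tree with an explicit stack instead of recursion. It is not the vLLM parser or Lark's code: nested lists stand in for parse trees, and `iterative_copy`, `leaves`, and `left_nested_tree` are hypothetical names used only for this example.

```python
def left_nested_tree(num_terms):
    """Left-deep nested lists, one level per term (stand-in for a parse tree)."""
    node = ["1"]
    for _ in range(num_terms - 1):
        node = [node, "+", "1"]
    return node


def iterative_copy(tree):
    """Copy a nested-list tree using an explicit stack instead of recursion."""
    root = []
    stack = [(tree, root)]  # (source node, destination node) pairs
    while stack:
        src, dst = stack.pop()
        for child in src:
            if isinstance(child, list):
                new_child = []
                dst.append(new_child)  # keep the child's position in order
                stack.append((child, new_child))  # fill it in later
            else:
                dst.append(child)  # leaves are immutable strings here
    return root


def leaves(tree):
    """Collect leaves left to right, also without recursion."""
    out, stack = [], [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            stack.extend(reversed(node))  # leftmost child is popped first
        else:
            out.append(node)
    return out


big = left_nested_tree(5000)  # far deeper than the default recursion limit
clone = iterative_copy(big)
print(clone is not big, leaves(clone) == leaves(big))
```

The copy's cost stays linear in the tree size, and its depth is bounded by heap memory rather than by the interpreter's recursion limit.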


@dborowiec10 my recommendation is to use mistralai/Mistral-7B-Instruct-v0.2 while this issue exists. The recursion issue is still a problem on our end, but mistralai/Mistral-7B-Instruct-v0.2 will give you much higher quality generations. In other words, out of all my tests, mistralai/Mistral-7B-v0.1 errored every time, while mistralai/Mistral-7B-Instruct-v0.2 never did.


@rlouf

Any thoughts as to why this is cropping up now? It seems to have trouble generating the eos_token. Did you observe these unending generations previously?

As a short term fix, should we use mistralai/Mistral-7B-Instruct-v0.2 in README.md and docs/quickstart.md?

@dborowiec10
Author

@lapp0 Thanks for the suggestion, will test with v0.2 on my end this week.

@rlouf
Member

rlouf commented Jan 25, 2024

I tested with a quantized version of Mistral-7B-v0.1 and never encountered this issue. We can change the model version, set the seed, and/or set a maximum number of tokens.

@lapp0
Collaborator

lapp0 commented Jan 28, 2024

The EOS token probability is consistently low during generation, leading to a lack of termination. I've found this to be the case for any regex or grammar. This is likely because the model aims to continue generating beyond the constrained format, often attempting to provide further explanation.
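A back-of-the-envelope sketch of why a low EOS probability leads to very long outputs. It assumes, purely for illustration, that EOS has the same independent probability `p_eos` at every step and that each number and operator is one token; the probabilities below are made-up inputs, not measurements from this issue, and `prob_still_running` is a hypothetical helper.

```python
def prob_still_running(p_eos, steps):
    """Chance that EOS has not been sampled in any of `steps` steps."""
    return (1.0 - p_eos) ** steps


# 242 terms plus 241 operators is 483 tokens under the one-token assumption.
for p_eos in (0.001, 0.01, 0.05):
    print(p_eos, prob_still_running(p_eos, 483))
```

Under these assumptions, the smaller the per-step EOS probability, the more likely a generation is to run past the depth at which the parser copy fails.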

I'm going to experiment with stopping at newlines for single line regexp, and wrapping grammars in codeblocks which terminate at the end of the codeblock for CFG.

@lapp0
Collaborator

lapp0 commented Jan 28, 2024

I'm having issues with mistralai/Mistral-7B-Instruct-v0.2 in outlines.generate.cfg repeating ad infinitum.

When generating with vLLM serve.py I don't run into this issue. #541

curl http://127.0.0.1:8000/generate \
    -d '{
        "prompt": "Write a formula that returns 5 using only additions and subtractions. ", 
        "grammar": "?start: expression\n\n?expression: term ((\"+\" | \"-\") term)*\n\n?term: factor ((\"*\" | \"/\") factor)*\n\n?factor: NUMBER\n       | \"-\" factor\n       | \"(\" expression \")\"\n\n%import common.NUMBER\n%import common.WS_INLINE\n\n%ignore WS_INLINE"
        }'
{"text":["Write a formula that returns 5 using only additions and subtractions. 5-3+1-1."]}

python -m outlines.serve.serve --model="mistralai/Mistral-7B-Instruct-v0.2" seems to end on a "." no matter what, so that may be the reason it's terminating.
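For readability, the sketch below decodes the JSON-escaped grammar from the curl payload into plain Lark text. The grammar itself is copied from the request above; the variable names `escaped` and `flat_grammar` are only illustrative. Note that it expresses sums with repetition (`term (("+" | "-") term)*`) rather than the left recursion used in the README grammar. Whether that difference alone avoids the RecursionError was not confirmed in this thread.

```python
import json

# JSON-escaped grammar copied from the curl payload in this comment.
escaped = (
    r'"?start: expression\n\n'
    r'?expression: term ((\"+\" | \"-\") term)*\n\n'
    r'?term: factor ((\"*\" | \"/\") factor)*\n\n'
    r'?factor: NUMBER\n'
    r'       | \"-\" factor\n'
    r'       | \"(\" expression \")\"\n\n'
    r'%import common.NUMBER\n'
    r'%import common.WS_INLINE\n\n'
    r'%ignore WS_INLINE"'
)

# json.loads turns the escaped string back into the multi-line grammar text.
flat_grammar = json.loads(escaped)
print(flat_grammar)
```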
