
The decoder returned a sequence that exceeds the provided max_len when using local:llama.cpp #356

Open
jzju opened this issue Jun 20, 2024 · 0 comments

When running the following, I get the error below, but it works fine with OpenAI (just remove `model=model`).

model = lmql.model(
    "local:llama.cpp:free.gguf",
    tokenizer="Orenguteng/Llama-3-8B-Lexi-Uncensored",
    n_threads=16,
    n_gpu_layers=128,
    n_ctx=8192,
)
query_string = """
    "Q: What is 1+1 \\n"
    "A: [ANS] \\n"
"""
lmql.run_sync(query_string, model=model).variables
File /usr/local/lib/python3.10/dist-packages/lmql/runtime/interpreter.py:1122, in PromptInterpreter.run(self, fct, *args, **kwargs)
   1120 else:
   1121     state = await self.advance(state)
-> 1122     assert len(s.input_ids) < decoder_args["max_len"], "The decoder returned a sequence that exceeds the provided max_len (max_len={}, sequence length={}). To increase the max_len, please provide a corresponding max_len argument to the decoder function.".format(decoder_args["max_len"], len(s.input_ids))
   1124     assert state.query_head.result is not None, "decoder designates sequence {} as finished but the underyling query program has not produced a result. This is likekly a decoder bug. Decoder in use {}".format(await s.text(), decoder_args["decoder"])
   1125     results.append(state.query_head.result)

AssertionError: The decoder returned a sequence that exceeds the provided max_len (max_len=2048, sequence length=2048). To increase the max_len, please provide a corresponding max_len argument to the decoder function.
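The assertion hints at a workaround: pass a larger `max_len` to the decoder. One way to do that in LMQL is via a decoder clause in the query itself; the sketch below assumes the `argmax(max_len=...)` clause syntax from the LMQL docs and is untested against this exact setup, so adjust for your version.

```python
# Workaround sketch (assumption: raising max_len to the model's n_ctx
# avoids the assertion; the decoder-clause syntax may differ by version).
query_string = """
    argmax(max_len=8192)
        "Q: What is 1+1 \\n"
        "A: [ANS] \\n"
"""
```

The default `max_len` appears to be 2048 regardless of the `n_ctx=8192` passed to `lmql.model(...)`, which is why the local llama.cpp backend hits the limit while the OpenAI backend does not.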