Output does not stream #11

@bvanslyke

Description

Rather than streaming, I see all of the output show up at the end, all at once.

But if I add a print statement right before each output item is yielded, I see the text generated line by line from my print().

for item in stream:
    # Each item looks like this:
    # {'id': 'cmpl-00...', 'object': 'text_completion', 'created': .., 'model': '/path', 'choices': [
    #   {'text': '\n', 'index': 0, 'logprobs': None, 'finish_reason': None}
    # ]}
+   print(item["choices"][0]["text"], end="")
    yield item["choices"][0]["text"]
  • Using llm 0.8 on macOS (M1), installed via pipx
  • And the latest llama-cpp-python, force-reinstalled with no pip cache and rebuilt with Metal support, following this repo's README
  • I see this running with default options, like llm -m llamacode "My prompt here", on models added both with and without the --llama2-chat option
  • When I switch to an OpenAI model like -m 4, streaming works.
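To check whether the buffering comes from llama-cpp-python itself or from llm, here is a minimal standalone sketch (the model path is a placeholder, and I'm assuming create_completion with stream=True, which yields chunks shaped like the items shown above). Run directly, it should show whether chunks arrive incrementally outside of llm:

from llama_cpp import Llama

model = Llama(model_path="/path/to/model")  # placeholder path

# stream=True yields completion chunks one at a time instead of a single response
for chunk in model.create_completion("My prompt here", max_tokens=64, stream=True):
    # Each chunk has the same shape as the items shown above
    print(chunk["choices"][0]["text"], end="", flush=True)
print()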
