
New Terminal Option: --no_live_response #1278

Closed
wants to merge 4 commits into from

Conversation

Steve235lab
Contributor

Describe the changes you have made:

Add a new terminal option that lets users configure whether responses are rendered live while chunks are being received (the classic, default behavior) or rendered once after all chunks have been received (the new behavior).

Performing a single render after all chunks have been received prevents duplicate lines in the terminal, and, especially when using OI over SSH, reduces bandwidth usage and flickering.
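The difference between the two behaviors can be sketched roughly as follows (the function names and the `write` callback are illustrative only, not the PR's actual implementation):

```python
def render_live(chunks, write):
    # Classic/default behavior: repaint the whole response on every chunk.
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        write(buffer)  # one full repaint per chunk received
    return buffer

def render_once(chunks, write):
    # --no_live_response behavior: buffer everything, then render one time.
    buffer = "".join(chunks)
    write(buffer)  # exactly one repaint, after the stream ends
    return buffer

# Compare how many times each strategy writes to the terminal.
chunks = ["Hello", ", ", "world", "!"]
live_writes, once_writes = [], []
render_live(chunks, live_writes.append)
render_once(chunks, once_writes.append)
```

With four chunks, the live strategy repaints four times while the one-time strategy repaints once, which is where the SSH bandwidth savings come from; both end up displaying the same final text.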

Reference any relevant issues (e.g. "Fixes #000"):

Temporarily fixes #1127

Pre-Submission Checklist (optional but appreciated):

  • I have included relevant documentation updates (stored in /docs)
  • I have read docs/CONTRIBUTING.md
  • I have read docs/ROADMAP.md

OS Tests (optional but appreciated):

  • Tested on Windows
  • Tested on MacOS
  • Tested on Linux

@Steve235lab
Contributor Author

This has annoyed me for a long time, since the very first time I used OI, and this is a not-perfect but working solution. Just pull it and give it a try; you'll know what I'm talking about.

@tyfiero
Collaborator

tyfiero commented May 23, 2024

This is so cool; it's been an issue for a while. Thanks @Steve235lab

@KillianLucas
Collaborator

Hi @Steve235lab, this is fantastic. I am annoyed by the original behavior as well! But I want to float two other solutions.

I think streaming is an important UX component of lots of modern AI systems, and I think we can fix the issue in two other ways:

  1. --plain — a flag that just removes Rich. It would merely print(chunk, end="") the chunks as plain text, more like Ollama. Would also work if someone wanted to pipe OI's output into something else. This should fix all problems, unless there's something deeper about the rate of streaming that's bad for SSH!
  2. Always printing the last 5 messages at the end of a message stream. This would fix the weird repeating behavior, I believe, because you'd scroll up, and it would be 5 solid messages printed at once. It wouldn't fix twinkling during streaming, and it wouldn't help with SSH bandwidth, but it would fix the repeating bug.
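Option 1 above could look something like this minimal sketch (the function name is hypothetical; the core of the idea is just plain writes with no Rich repainting):

```python
import io
import sys

def stream_plain(chunks, out=None):
    # --plain sketch: skip Rich entirely and append each chunk as plain text.
    # Each write is equivalent to print(chunk, end="", flush=True), so the
    # output is also safe to pipe into other programs.
    out = out or sys.stdout
    for chunk in chunks:
        out.write(chunk)
        out.flush()

# Capture the output in a buffer to show nothing is repainted or duplicated.
buf = io.StringIO()
stream_plain(["Hello", ", ", "world", "!"], out=buf)
```

Because each chunk is written exactly once and never re-rendered, scrollback stays clean and the bytes sent over SSH are just the response text itself.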

What do you think?

@Steve235lab
Contributor Author

--plain — a flag that just removes Rich. It would merely print(chunk, end="") the chunks as plain text, more like Ollama. Would also work if someone wanted to pipe OI's output into something else. This should fix all problems, unless there's something deeper about the rate of streaming that's bad for SSH!

This one is great; I will implement it later.

@Steve235lab
Contributor Author

Long time no see. It seems somebody has already implemented this, and it works well for me. Closed.

Development

Successfully merging this pull request may close these issues.

Repeating output
3 participants