Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: multimodal - fix misreported prompt and num prompt tokens #5896

Closed
wants to merge 3 commits into from

Conversation

cjpais
Copy link
Contributor

@cjpais cjpais commented Mar 6, 2024

Should address #5852, #5863

The issue is all normal token processing is skipped for multimodal so it outputs no tokens processed in the response. This is wrong.

To fix, we set the number of tokens processed to it's correct value in ingest_images where the prompt is tokenized for multimodal.

Additionally a fix for the prompt being set to the empty string for multimodal responses. Basically we iteratively rebuild the initial prompt since it was cleared.

I think it would be best to unify the token processing between multimodal and regular, but this PR does not aim to address that.

@chigkim
Copy link

chigkim commented Mar 6, 2024

Awesome! confirming it works! Thank you!

@ggerganov
Copy link
Owner

ggerganov commented Mar 6, 2024

I'm attempting to refactor most of the server code in #5882. The tentative plan is to remove multimodal functionality and potentially reintroduce it at a later stage. In the meantime llava-cli can server as a LLaVA example

Since it touches pretty much the entire code, I don't plan to merge server related PRs in the meantime to avoid resolving conflicts

@cjpais
Copy link
Contributor Author

cjpais commented Mar 6, 2024

Wasn't aware of the rewrite, but excited for it. Makes sense to me not to merge.

Happy to close this PR if it makes sense to you. I'll keep my branch up for those who want to patch or cherrypick.

@mathpopo
Copy link

Wasn't aware of the rewrite, but excited for it. Makes sense to me not to merge.

Happy to close this PR if it makes sense to you. I'll keep my branch up for those who want to patch or cherrypick.

can give me zhe branch name? i use latest version , server upload image , just error?

@crashr
Copy link
Contributor

crashr commented Mar 29, 2024

@mathpopo You can fetch from this PR and create a local branch. You could name it "multimodal".

git fetch origin pull/5896/head:multimodal
git checkout multimodal

@ggerganov
Copy link
Owner

This is outdated since we removed the multimodal functionality from server

@ggerganov ggerganov closed this May 10, 2024
@mofosyne mofosyne added the obsolete? Marker for potentially obsolete PR label May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
obsolete? Marker for potentially obsolete PR Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants