add support for moondream vision language model #6899

Merged (3 commits) on Apr 25, 2024

@vikhyat (Contributor) commented Apr 25, 2024

This required making the following changes to the CLIP model:

  1. Support for patch embedding bias.
  2. Make class embedding and pre-layernorm optional.
  3. Add support for post-layernorm.

I verified that the LLaVA model still works as expected after this change.
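
The three changes listed above can be illustrated with a minimal, self-contained sketch. This is not the code from `examples/llava/clip.cpp`: the types, the `has_*` flags, and the function names below are hypothetical stand-ins, plain `std::vector` replaces ggml tensors, and position embeddings and the transformer layers are omitted. It only shows the idea of treating each stage as optional, so that a model without a class token or pre-layernorm can share one code path with a model that has them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; NOT the real clip.cpp / ggml types.
using Vec = std::vector<float>;  // one embedding of size n_embd
using Seq = std::vector<Vec>;    // sequence of token embeddings

struct LayerNorm {
    Vec weight;
    Vec bias;
};

struct ToyVisionWeights {
    bool has_patch_bias      = false;  // change 1: patch embedding bias
    bool has_class_embedding = false;  // change 2: class token is optional
    bool has_pre_ln          = false;  // change 2: pre-layernorm is optional
    bool has_post_ln         = false;  // change 3: post-layernorm
    Vec       patch_bias;
    Vec       class_embedding;
    LayerNorm pre_ln;
    LayerNorm post_ln;
};

// Standard layer normalization over a single embedding vector.
inline Vec layer_norm(const Vec & x, const LayerNorm & ln, float eps = 1e-5f) {
    assert(x.size() == ln.weight.size() && x.size() == ln.bias.size());
    const float n = static_cast<float>(x.size());
    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= n;
    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= n;
    const float inv_std = 1.0f / std::sqrt(var + eps);
    Vec out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = (x[i] - mean) * inv_std * ln.weight[i] + ln.bias[i];
    }
    return out;
}

// Apply each optional stage only when the model provides it.
inline Seq embed_patches(const Seq & patches, const ToyVisionWeights & w) {
    Seq seq;
    if (w.has_class_embedding) {
        seq.push_back(w.class_embedding);  // prepend class token only if present
    }
    for (Vec p : patches) {                // copy each patch so it can be modified
        if (w.has_patch_bias) {
            assert(p.size() == w.patch_bias.size());
            for (std::size_t i = 0; i < p.size(); ++i) p[i] += w.patch_bias[i];
        }
        seq.push_back(std::move(p));
    }
    if (w.has_pre_ln) {
        for (Vec & t : seq) t = layer_norm(t, w.pre_ln);
    }
    // ... transformer layers would run here (omitted in this sketch) ...
    if (w.has_post_ln) {
        for (Vec & t : seq) t = layer_norm(t, w.post_ln);
    }
    return seq;
}
```

In this sketch the output length is the number of patches plus one only when a class embedding is present, so any position count derived from it has to account for the optional token.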

Review comment on examples/llava/clip.cpp (outdated, resolved)
@ggerganov ggerganov merged commit 46e12c4 into ggerganov:master Apr 25, 2024
26 of 31 checks passed
@CoderCowMoo commented

Would this provide general support for all SigLIP-based encoders, or only for moondream2?

@cjpais cjpais mentioned this pull request Apr 27, 2024
nopperl pushed a commit to nopperl/llama.cpp that referenced this pull request May 5, 2024
* add support for moondream vision language model

* Update examples/llava/clip.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
jart added a commit to Mozilla-Ocho/llamafile that referenced this pull request May 7, 2024
This broke the server's LLaVA support in a non-obvious way.

See ggerganov/llama.cpp#6899
See ggerganov/llama.cpp#7060
ggerganov added a commit that referenced this pull request May 8, 2024
abetlen added a commit to abetlen/llama.cpp that referenced this pull request May 9, 2024
@abetlen abetlen mentioned this pull request May 9, 2024
ggerganov pushed a commit that referenced this pull request May 10, 2024
* Revert "Revert "llava : add support for moondream vision language model (#6899)""

This reverts commit 9da243b.

* Fix num_positions and embeddings initialization