Added new smol 🤏 Multimodal RAG System with ColSmolVLM and SmolVLM recipe 🧑🍳️ #244
andimarafioti commented on 2024-12-05T12:45:51Z: Have you tried to also quantize the docs_retrieval_model?
very cool!
The library currently doesn't support ![](). It could be possible, but I think it would add a lot of complexity to the notebook without that native support.
stevhliu commented on 2024-12-06T17:37:26Z: Depending on how quickly the PR is merged, maybe we can wait until it is merged. That way we don't need to come back and update this, and it'd be simpler for users to just install the main version of byaldi.
sergiopaniego commented on 2024-12-09T17:40:02Z: Sure, however you prefer :)
stevhliu commented on 2024-12-06T17:37:27Z: Is this image being displayed twice here?
sergiopaniego commented on 2024-12-09T17:37:01Z: In the actual notebook, it's not shown twice: https://github.com/huggingface/cookbook/blob/12a99a3d8c4f5607bb382e1d8536e60392e653b1/notebooks/en/multimodal_rag_using_document_retrieval_and_smol_vlm.ipynb
stevhliu commented on 2024-12-06T17:37:27Z: It would also be nice to compare it to the memory consumption of a not-so-small system so users have some more context. Maybe add a table below?
sergiopaniego commented on 2024-12-09T17:37:40Z: Added! Comparison of these two systems and the other two multimodal RAG recipes :)
Super nice job!
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Updated @stevhliu @andimarafioti 🤗
Awesome! Once your byaldi PR is merged, we can merge this one as well 🤗
merveenoyan commented on 2024-12-10T16:26:07Z: We can just say Byaldi is a library which contains APIs to enable streamlined multimodal RAG pipelines based on multimodal retrievers and multimodal vision language models.
merveenoyan commented on 2024-12-10T16:26:08Z: "submitting a question and getting relevant documents that might contain the answer"* (and rephrasing the following sentence). Simpler phrasing ✨
merveenoyan commented on 2024-12-10T16:26:09Z: "Stay up to date with the latest advancements in open vision language models by checking out the OpenVLMLeaderboard here."
merveenoyan commented on 2024-12-10T16:26:09Z: SmolVLM*
left some nits! otherwise looks good :)
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Thanks for the suggestions @merveenoyan 😄 Applied!
What does this PR do?
Fixes #243
Who can review?
@merveenoyan and @stevhliu