sergiopaniego
Member

What does this PR do?

Fixes #243

Who can review?

@merveenoyan and @stevhliu

review-notebook-app bot commented Dec 5, 2024

andimarafioti commented on 2024-12-05T12:45:51Z
----------------------------------------------------------------

Have you tried to also quantize the docs_retrieval_model?


@andimarafioti
Member

very cool!

@sergiopaniego
Member Author

andimarafioti commented on 2024-12-05T12:45:51Z
----------------------------------------------------------------

Have you tried to also quantize the docs_retrieval_model?

The library currently doesn't support quantization_config as in transformers:

image

It might be possible, but I think it would add a lot of complexity to the notebook without that native support.

review-notebook-app bot commented Dec 6, 2024

stevhliu commented on 2024-12-06T17:37:26Z
----------------------------------------------------------------

Depending on how quickly the byaldi PR is merged, maybe we can wait until it lands. That way we don't need to come back and update this, and it'd be simpler for users to just install the main version of byaldi.


sergiopaniego commented on 2024-12-09T17:40:02Z
----------------------------------------------------------------

Sure, however you prefer :)

review-notebook-app bot commented Dec 6, 2024

stevhliu commented on 2024-12-06T17:37:27Z
----------------------------------------------------------------

Is this image being displayed twice here?


sergiopaniego commented on 2024-12-09T17:37:01Z
----------------------------------------------------------------

In the actual notebook, it's not shown twice: https://github.com/huggingface/cookbook/blob/12a99a3d8c4f5607bb382e1d8536e60392e653b1/notebooks/en/multimodal_rag_using_document_retrieval_and_smol_vlm.ipynb

review-notebook-app bot commented Dec 6, 2024

stevhliu commented on 2024-12-06T17:37:27Z
----------------------------------------------------------------

It would also be nice to compare it to the memory consumption of a larger system so users have more context. Maybe add a table below?


sergiopaniego commented on 2024-12-09T17:37:40Z
----------------------------------------------------------------

Added! Comparison of these two systems and the other two multimodal RAG recipes :)

Member

@stevhliu stevhliu left a comment

Super nice job!

sergiopaniego and others added 2 commits December 9, 2024 18:34
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

@sergiopaniego
Member Author

sergiopaniego commented Dec 9, 2024

Updated @stevhliu @andimarafioti 🤗

Member

@stevhliu stevhliu left a comment

Awesome! Once your byaldi PR is merged, we can merge this one as well 🤗

review-notebook-app bot commented Dec 10, 2024

merveenoyan commented on 2024-12-10T16:26:07Z
----------------------------------------------------------------

We can just say Byaldi is a library which contains APIs to enable streamlined multimodal RAG pipelines based on multimodal retrievers and multimodal vision language models


review-notebook-app bot commented Dec 10, 2024

merveenoyan commented on 2024-12-10T16:26:08Z
----------------------------------------------------------------

submitting a question and getting relevant documents that might contain the answer*

(and rephrasing the following sentence)

simpler phrasing ✨


review-notebook-app bot commented Dec 10, 2024

merveenoyan commented on 2024-12-10T16:26:09Z
----------------------------------------------------------------

"Stay up to date with the latest advancements in open vision language models by checking out the OpenVLMLeaderboard here."


review-notebook-app bot commented Dec 10, 2024

merveenoyan commented on 2024-12-10T16:26:09Z
----------------------------------------------------------------

SmolVLM*


Collaborator

@merveenoyan merveenoyan left a comment

left some nits! otherwise looks good :)

sergiopaniego and others added 2 commits December 11, 2024 15:27
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@sergiopaniego
Member Author

Thanks for the suggestions @merveenoyan 😄 Applied!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@stevhliu stevhliu merged commit 2b0e221 into huggingface:main Dec 11, 2024
1 check passed
@sergiopaniego sergiopaniego deleted the smol_vlm_rag branch December 11, 2024 21:00
Successfully merging this pull request may close these issues.

New smol 🤏 Multimodal RAG System with ColSmolVLM and SmolVLM recipe 🧑‍🍳️