Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add aya #36521

Merged
merged 34 commits into from
Mar 4, 2025
Merged

Add aya #36521

merged 34 commits into from
Mar 4, 2025

Conversation

ArthurZucker
Copy link
Collaborator

What does this PR do?

Adds aya model

@yonigozlan yonigozlan marked this pull request as ready for review March 3, 2025 21:20
Copy link
Collaborator Author

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yonigozlan todo refactor with modular!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeep

@saurabhdash2512
Copy link
Contributor

@ArthurZucker @yonigozlan Thank you so much for all your help! Y'all legends!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan yonigozlan merged commit 84f0186 into main Mar 4, 2025
24 checks passed
@yonigozlan yonigozlan deleted the add-aya branch March 4, 2025 11:24
garrett361 pushed a commit to garrett361/transformers that referenced this pull request Mar 4, 2025
* initial commit

* small fix

* move stuff to image processing file

* remove stuff in validate turn and fix return tensor

* remove liquid stuff

* in the process of addressing comments

* changes to get the right tokenization

* new __init__ works

* fixing defulat std and mean

* works

* small testing scipt -- to be deleted before merge

* remove redundant code

* addressing comments

* fix inits, add docs templates

* refactor processor, switch to gotocr image processor

* remove image proc from init

* refactor to working llava-style architecture

* Change AyaVisionModel to AyaVisionForConditionalGeneration

* add tests

* fixups

* update doc

* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model

* better variable names + remove code paths

* Updates to aya_vision.md

* address comments

* adding copied from

* make style and remove unused projector_hidden_act from config

* sort init

* include usage of fast image proc and proc on cuda in doc

* update checkpoint iin test processor

* update checkpoint in test processor 2

* remove test_model and update docstring

* skip failing tests

---------

Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
garrett361 pushed a commit to garrett361/transformers that referenced this pull request Mar 4, 2025
* initial commit

* small fix

* move stuff to image processing file

* remove stuff in validate turn and fix return tensor

* remove liquid stuff

* in the process of addressing comments

* changes to get the right tokenization

* new __init__ works

* fixing defulat std and mean

* works

* small testing scipt -- to be deleted before merge

* remove redundant code

* addressing comments

* fix inits, add docs templates

* refactor processor, switch to gotocr image processor

* remove image proc from init

* refactor to working llava-style architecture

* Change AyaVisionModel to AyaVisionForConditionalGeneration

* add tests

* fixups

* update doc

* Adding logits_to_keep explicitly in ayavision forward to enable compatibility with cohere model

* better variable names + remove code paths

* Updates to aya_vision.md

* address comments

* adding copied from

* make style and remove unused projector_hidden_act from config

* sort init

* include usage of fast image proc and proc on cuda in doc

* update checkpoint iin test processor

* update checkpoint in test processor 2

* remove test_model and update docstring

* skip failing tests

---------

Co-authored-by: Saurabh Dash <saurabh@cohere.com>
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants