feat: supports loading .safetensors params file #231
Conversation
Thanks for the PR, a couple minor comments :)
LGTM!
Co-authored-by: Jonatan Kłosko <jonatanklosko@gmail.com>
FWIW, I plan to explore supporting sharded safetensors params files in a follow-up PR.
Co-authored-by: Jonatan Kłosko <jonatanklosko@gmail.com>
Thanks a lot!
Mmm, correcting myself. I think with the changes in this PR one would already be able to load sharded safetensors params files. For example, https://huggingface.co/stabilityai/StableBeluga-7B/tree/main contains sharded .safetensors files together with an index file, so loading with the safetensors params filename (see the sketch below) should use the existing sharded params loading logic, look at the index file and "just work".
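A minimal sketch of what that could look like; the `:params_filename` option name and the fallback to the corresponding `.index.json` file for sharded checkpoints are assumptions based on this thread, not confirmed API:

```elixir
# Hedged sketch: assumes Bumblebee.load_model/2 accepts a :params_filename
# option and that, when the checkpoint is sharded, the loader consults the
# corresponding model.safetensors.index.json to resolve and load all shards.
{:ok, model_info} =
  Bumblebee.load_model({:hf, "stabilityai/StableBeluga-7B"},
    params_filename: "model.safetensors"
  )
```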
A good improvement might be adding some decent auto-selection of the preferred file format based on what's available in the model repo, without requiring the user to explicitly provide the file name.
Good call. So far most repos have had the pytorch file and optionally other formats, but as safetensors becomes more popular there may be cases where only safetensors is available. Currently we do fallbacks, that is, we request one file, and if it doesn't exist we request another, and so on. I checked and it looks like the HF API now allows listing files, so I will reevaluate later whether we can improve this :)
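For illustration only, a listing-based selection could look roughly like this (not the actual Bumblebee internals; the filenames and module are hypothetical):

```elixir
# Hedged sketch: given the list of filenames in a model repo, prefer the
# PyTorch params file and fall back to safetensors when that's all there is.
defmodule ParamsFormat do
  def preferred_filename(repo_files) do
    cond do
      "pytorch_model.bin" in repo_files -> "pytorch_model.bin"
      "model.safetensors" in repo_files -> "model.safetensors"
      true -> nil
    end
  end
end

ParamsFormat.preferred_filename(["config.json", "model.safetensors"])
#=> "model.safetensors"
```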
FTR, as of #256 we automatically detect the case where there are no parameters in the pytorch format but a safetensors file is available :)
closes #96
Opening this proof of concept as a draft while I continue working on some improvements and test coverage, and to gather any other feedback folks have :-)