-
Notifications
You must be signed in to change notification settings - Fork 13.9k
Fix convert_hf_to_gguf.py script on s390x #17431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Assume converted model data is originally little-endian. Byteswap data on s390x after reading it to put values in correct presentation for any transformation needed, like calculating weight tensors. Then byteswap data to little-endian before passing it to GGUFWriter while GGUFWriter will byteswap data back to big endian if big endian output is requested. byteswap(inplace=True) calls don't work with lazy tensor and array wrappers. Use byteswap with copying data to workaround this behaviour.
…-endian With this change if no byteswapping is actually needed, 2 excessive byteswaps can be omitted on s390x
de7b5ec to
ed94707
Compare
compilade
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Did it ever work before or was it broken by #15667?
| torch.uint64: np.uint64, | ||
| torch.int32: np.int32, | ||
| torch.uint32: np.uint32, | ||
| torch.int16: np.int16, | ||
| torch.uint16: np.uint16, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be relevant to uncomment the unsigned int types in _dtype_str_map as well (U16, U32, U64) if those are expected to exist.
They seem to be available since PyTorch 2.3.0, while the requirements.txt has version 2.6.0, so it should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mentioned those types for numpy in case they're ever encountered. It would be fine for me to get _dtype_str_map updated, but maybe it could be done separately?
I don't know because I didn't test it before. |
|
Is this change ok to merge with latest commit? If yes, how do I merge it? |
Yes, LGTM, I'll merge. |
Assume converted model data is originally little-endian.
Byteswap data on s390x after reading it to put values in correct presentation
for any transformation needed, like calculating weight tensors.
Then byteswap data to little-endian before passing it to GGUFWriter while
GGUFWriter will byteswap data back to big endian if big endian output is requested.
byteswap(inplace=True) calls don't work with lazy tensor and array wrappers.
Use byteswap with copying data to workaround this behaviour.
Make GGUFWriter accept tensors in native endianness instead of little-endian.
With this change if no byteswapping is actually needed, 2 excessive byteswaps can be omitted on s390x.