
Conversation

@AlekseiNikiforovIBM (Contributor)

Assume converted model data is originally little-endian.
On s390x, byteswap the data after reading it so that values are in the correct
representation for any transformation needed, such as calculating weight tensors.

Then byteswap the data back to little-endian before passing it to GGUFWriter;
GGUFWriter will byteswap it to big-endian again if big-endian output is requested.

byteswap(inplace=True) calls don't work with lazy tensor and array wrappers,
so byteswap with a data copy is used to work around this behaviour.
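A minimal numpy sketch of the copy-based workaround described above (illustrative only, not the actual conversion-script code):

```python
import numpy as np

arr = np.arange(4, dtype=np.uint16)

# byteswap(inplace=True) mutates the underlying buffer, which lazy
# tensor/array wrappers may not support. byteswap() without inplace
# returns a swapped copy instead; viewing that copy with the opposite
# byte-order flag keeps the logical values unchanged.
swapped = arr.byteswap().view(arr.dtype.newbyteorder())

assert np.array_equal(swapped, arr)  # same values, opposite byte order
```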

Make GGUFWriter accept tensors in native endianness instead of little-endian.

With this change, if no byteswapping is actually needed, two redundant byteswaps can be omitted on s390x.
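The idea behind accepting tensors in native endianness can be sketched with a hypothetical helper (not GGUFWriter's actual code): byteswap only when the array's byte order differs from the requested output order, so the no-op path on a matching host does zero copies.

```python
import sys
import numpy as np

def ensure_byteorder(arr: np.ndarray, target: str) -> np.ndarray:
    """Return arr in the target byte order ('<' or '>'), swapping only if needed.

    Hypothetical helper illustrating the idea; GGUFWriter's real logic differs.
    """
    order = arr.dtype.byteorder
    if order == '=':  # native: resolve to a concrete order
        order = '<' if sys.byteorder == 'little' else '>'
    if order == target or arr.dtype.itemsize == 1:
        return arr  # already correct (or single-byte): no swap needed
    return arr.byteswap().view(arr.dtype.newbyteorder(target))

# When host and output endianness match, no byteswap happens:
data = np.arange(3, dtype=np.float32)
native = '<' if sys.byteorder == 'little' else '>'
assert ensure_byteorder(data, native) is data  # zero-copy path taken
```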

@CISC requested a review from compilade on November 21, 2025 at 16:07
@compilade (Collaborator) left a comment

Thanks! Did it ever work before or was it broken by #15667?

Comment on lines +10049 to +10053
torch.uint64: np.uint64,
torch.int32: np.int32,
torch.uint32: np.uint32,
torch.int16: np.int16,
torch.uint16: np.uint16,
@compilade (Collaborator)

It might be relevant to also uncomment the unsigned int types in _dtype_str_map (U16, U32, U64) if those are expected to exist.

They seem to be available since PyTorch 2.3.0, and requirements.txt requires version 2.6.0, so it should be fine.

@AlekseiNikiforovIBM (Contributor, Author)

I mentioned those numpy types in case they are ever encountered. I'd be fine with updating _dtype_str_map as well, but perhaps that could be done separately?

@AlekseiNikiforovIBM (Contributor, Author)

> Thanks! Did it ever work before or was it broken by #15667?

I don't know because I didn't test it before.

@AlekseiNikiforovIBM (Contributor, Author)

Is this change ok to merge with latest commit? If yes, how do I merge it?

@CISC (Collaborator)

CISC commented Nov 25, 2025

> Is this change ok to merge with latest commit? If yes, how do I merge it?

Yes, LGTM, I'll merge.

@CISC CISC merged commit 05872ac into ggml-org:master Nov 25, 2025
6 of 7 checks passed

Labels: python (python script changes)

3 participants