Skip to content

[embind] Don't truncate unsigned long under wasm64 #24678

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 11, 2025

Conversation

RReverser
Copy link
Collaborator

This makes it consistent with other bindings like EM_ASM.

As part of this, I had to change vector and map sizes and indices to explicitly take unsigned int instead, so that users don't have to upgrade all their code to deal with unexpected bigints in e.g. vec.size().

Technically this means you can't pass around a C++ vector >4GB to JS and back, but I'm sure we can live with it.

This makes it consistent with other bindings like EM_ASM.

As part of this, I had to change vector and map sizes and indices to explicitly take `unsigned int` instead, so that users don't have to upgrade all their code to deal with unexpected bigints in e.g. `vec.size()`.

Technically this means you can't pass around a C++ vector >4GB to JS and back, but I'm sure we can live with it.
Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this is the wrong direction. size_t and void * should be the same type on the JS side whenever possible.

The fact that EM_ASM cannot distinguish size_t from unsigned long is a bug/limitation in EM_ASM.

No point dragging embind down to the level of EM_ASM.

@RReverser
Copy link
Collaborator Author

The fact that EM_ASM cannot distinguish size_t from unsigned long is a bug/limitation in EM_ASM.

But neither can Embind, as it's a limitation of C/C++ not specific bindings.

In #24611 you said:

(Without looking at this specific instance yet) I think the only times we want to use i53 numbers for 64-bit values is:

  1. When we know that thing is a pointer type (e.g. when p is used in signature of a JS function)
  2. When there is an explicit opt-in (i.e. __i53abi tag on JS functions).

Otherwise 64-bit values should be preserved either via bigint (or pair-of-numbers)

In this case size_t aka unsigned long is a 64-bit value, and 1) we don't know that it's a pointer type and 2) there is no explicit opt-in, so per conditions above we should preserve it as a bigint on wasm64?

@RReverser
Copy link
Collaborator Author

To clarify, SignatureCode<size_t> is a misnomer there.

It can look like it's only overriding signature for an explicit size_t, but because it's simply an type alias, what that line actually does is defines SignatureCode<unsigned long>.

So right now all unsigned long values - which are 64-bit integers on wasm64 - get truncated to i53 in Embind. Based on previous discussions, I assume this is not what we want, and unsigned long should be a bigint on wasm64 to preserve the full range instead.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 9, 2025

To clarify, SignatureCode<size_t> is a misnomer there.

It can look like it's only overriding signature for an explicit size_t, but because it's simply an type alias, what that line actually does is defines SignatureCode<unsigned long>.

So right now all unsigned long values - which are 64-bit integers on wasm64 - get truncated to i53 in Embind. Based on previous discussions, I assume this is not what we want, and unsigned long should be a bigint on wasm64 to preserve the full range instead.

Oh I see, so that real change here that we want is for unsigned long to not be truncated.

Perhaps the title should "Don't truncate 64-bit unsigned long values under wasm64" and the unfortunate side effect is that size_t cannot be truncated (even though we would like it to be, ideally).

@RReverser RReverser changed the title [embind] Map size_t to bigint on wasm64 [embind] Don't truncate unsigned long under wasm64 Jul 9, 2025
@RReverser
Copy link
Collaborator Author

Sgtm, changed the title.

@RReverser RReverser requested a review from sbc100 July 11, 2025 10:38
@RReverser RReverser enabled auto-merge (squash) July 11, 2025 10:38
@RReverser RReverser disabled auto-merge July 11, 2025 17:18
@RReverser RReverser enabled auto-merge (squash) July 11, 2025 19:31
@RReverser RReverser merged commit 4182f94 into emscripten-core:main Jul 11, 2025
30 checks passed
@RReverser RReverser deleted the embind-size-t branch July 11, 2025 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants