Skip to content

Conversation

@sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Aug 4, 2023

This change use makeGetValue helper macros to avoid direct heap access.

This does come with a 6 byte code size cost (I measured the size of the
JS code generated for wasm64.test_em_asm with -O2 going from 9275 to
9281).

@sbc100 sbc100 force-pushed the 4gb branch 3 times, most recently from c5fcc8d to 84c65a4 Compare August 4, 2023 16:31
@sbc100 sbc100 requested a review from kripken August 4, 2023 16:32
@sbc100 sbc100 force-pushed the 4gb branch 3 times, most recently from 8387b69 to 04529b3 Compare August 4, 2023 22:38
Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this uses makeGetValue to make sure to emit the right code, but what was breaking in the old code? We do support direct HEAPF64[..] etc. access in wasm64, don't we? And >>= should work on an int53?

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 7, 2023

I understand this uses makeGetValue to make sure to emit the right code, but what was breaking in the old code? We do support direct HEAPF64[..] etc. access in wasm64, don't we? And >>= should work on an int53?

I believe the problem with using buf >>= 2 to get to turn the byte address into an int32 address. This doesn't work if buf is largers than 2**32, right?

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 7, 2023

I understand this uses makeGetValue to make sure to emit the right code, but what was breaking in the old code? We do support direct HEAPF64[..] etc. access in wasm64, don't we? And >>= should work on an int53?

I believe the problem with using buf >>= 2 to get to turn the byte address into an int32 address. This doesn't work if buf is largers than 2**32, right?

Indeed any usage of bitshifting doesn't work for these larger addresses, right?

@kripken
Copy link
Member

kripken commented Aug 7, 2023

Ah, yes, that does make sense. We'd need >>>= for 2-4GB but nothing can work for more than that.

Another option might be /= 2. The JIT will make it fast anyhow, and it would work up to 53 bits. What do you think?

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 8, 2023

Ah, yes, that does make sense. We'd need >>>= for 2-4GB but nothing can work for more than that.

Another option might be /= 2. The JIT will make it fast anyhow, and it would work up to 53 bits. What do you think?

Whats wrong with the approach taken here which uses the getVale/setValue helpers to avoid this completely? In most cases I think its more consistent to always use these helpers rather than direct heap access, no?

1 similar comment
@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 8, 2023

Ah, yes, that does make sense. We'd need >>>= for 2-4GB but nothing can work for more than that.

Another option might be /= 2. The JIT will make it fast anyhow, and it would work up to 53 bits. What do you think?

Whats wrong with the approach taken here which uses the getVale/setValue helpers to avoid this completely? In most cases I think its more consistent to always use these helpers rather than direct heap access, no?

@kripken
Copy link
Member

kripken commented Aug 8, 2023

This approach is good. It's just the 11 byte code cost. Not a big deal but if this pattern happens in many places perhaps it adds up, that's what worries me. And I don't see obvious downsides to /=?

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 8, 2023

OK I switched to using division, which actually saves a few bytes. How confident are you the optimizing compiler can make these equivalent?

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 8, 2023

@juj has another solution that involves yet more macros: #19289

@sbc100 sbc100 requested a review from kripken August 8, 2023 18:29
@kripken
Copy link
Member

kripken commented Aug 8, 2023

I'm very confident VMs optimize this based on what I recall from years ago, but my information isn't recent, so we should probably verify to make sure.

If we verify, this PR lgtm, but good point about that other PR with the macros, perhaps that is worth considering as well - what do you think? I lean towards just doing / * actually, since it seems simpler (and even saves size compared to >> <<.

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 16, 2023

I reversed course again and decided to go with the helper macros here.. I think its cleaner and more consistent. And worth the 6 bytes of code size. It also unblocks a lot of followup changes to get >4gb tests passing.

@sbc100
Copy link
Collaborator Author

sbc100 commented Aug 16, 2023

We can look into the possibility of using division rather than shifting if/when we have some performance numbers, but using the macros means that we can update in single location, making these measurements easier.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. if we find this extra code size appears in more places we can maybe prioritize focusing on it more then. if it's just a few places then the simplicity and other benefits seem worth it.

Base automatically changed from em_asm to main August 18, 2023 00:58
sbc100 added a commit that referenced this pull request Aug 18, 2023
This change use makeGetValue helper macros to avoid direct heap access.

This does come with a 6 byte code size cost (I measured the size of the
JS code generated for wasm64.test_em_asm with -O2 going from 9275 to
9281).
@sbc100 sbc100 merged commit 2d72c4d into main Aug 18, 2023
@sbc100 sbc100 deleted the 4gb branch August 18, 2023 01:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants