perf(parser): pad `Token` to 16 bytes #2211

overlookmotel · 2024-01-29T19:49:28Z

Counter-intuitively, it seems that increasing the size of Token improves performance slightly.

This appears to be because when Token is 16 bytes, copying Token is a single 16-byte load/store. At present, it's 12 bytes which requires an 8-byte load/store + a 4-byte load/store.

https://godbolt.org/z/KPYsn3ab7

This suggests that either:

perf(parser): reduce Token size from 16 to 12 bytes #2010 could be reverted at no cost, and the overhead of the hash table removed.
or:
We need to get Token down to 8 bytes!

I have an idea how to maybe do (2), so I'd suggest leaving it as is for now until I've been able to research that.

NB I also tried putting #[repr(align(16))] on Token so that copying uses aligned loads/stores. That hurt the benchmarks very slightly, though it might produce a gain on architectures where unaligned loads are more expensive (ARM64 I think?). But I can't test that theory, so have left it out.

codspeed-hq · 2024-01-29T20:03:55Z

CodSpeed Performance Report

Merging #2211 will not alter performance

_{Comparing overlookmotel:lexer-pad-token (065d5fe) with main (872d751)}

Summary

✅ 17 untouched benchmarks

Boshen · 2024-01-30T03:47:21Z

Counter-intuitively, it seems that *increasing* the size of `Token` improves performance slightly. This appears to be because when `Token` is 16 bytes, copying `Token` is a single 16-byte load/store. At present, it's 12 bytes which requires an 8-byte load/store + a 4-byte load/store. https://godbolt.org/z/KPYsn3ab7 This suggests that either: 1. oxc-project#2010 could be reverted at no cost, and the overhead of the hash table removed. or: 2. We need to get `Token` down to 8 bytes! I have an idea how to *maybe* do (2), so I'd suggest leaving it as is for now until I've been able to research that. NB I also tried putting `#[repr(align(16))]` on `Token` so that copying uses aligned loads/stores. That [hurt the benchmarks very slightly](https://codspeed.io/overlookmotel/oxc/branches/lexer-pad-token), though it might produce a gain on architectures where unaligned loads are more expensive (ARM64 I think?). But I can't test that theory, so have left it out.

perf(parser): pad Token to 16 bytes

065d5fe

github-actions bot added the A-parser Area - Parser label Jan 29, 2024

Boshen merged commit 20679d1 into oxc-project:main Jan 30, 2024
20 checks passed

overlookmotel deleted the lexer-pad-token branch January 30, 2024 10:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(parser): pad `Token` to 16 bytes #2211

perf(parser): pad `Token` to 16 bytes #2211

overlookmotel commented Jan 29, 2024 •

edited

Loading

codspeed-hq bot commented Jan 29, 2024

Boshen commented Jan 30, 2024

perf(parser): pad Token to 16 bytes #2211

perf(parser): pad Token to 16 bytes #2211

Conversation

overlookmotel commented Jan 29, 2024 • edited Loading

codspeed-hq bot commented Jan 29, 2024

CodSpeed Performance Report

Merging #2211 will not alter performance

Summary

Boshen commented Jan 30, 2024

perf(parser): pad `Token` to 16 bytes #2211

perf(parser): pad `Token` to 16 bytes #2211

overlookmotel commented Jan 29, 2024 •

edited

Loading