perf(parser): pad `Token` to 16 bytes
#2211
Merged
Counter-intuitively, it seems that increasing the size of `Token` improves performance slightly.

This appears to be because when `Token` is 16 bytes, copying `Token` is a single 16-byte load/store. At present, it's 12 bytes, which requires an 8-byte load/store plus a 4-byte load/store.

https://godbolt.org/z/KPYsn3ab7
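To make the size difference concrete, here is a minimal sketch of the two layouts. The struct and field names (`Token12`, `Token16`, `start`, `end`, `kind`, and so on) are hypothetical and are not the parser's real definition; `repr(C)` is used only so that the field order, and therefore the sizes, are deterministic in this example. Whether a copy compiles to one 16-byte move or to an 8-byte plus a 4-byte move depends on the target and compiler, so treat the godbolt link above as the evidence for that part.

```rust
// Hypothetical 12-byte token. Field names are illustrative only.
// Fields occupy 11 bytes; alignment of 4 rounds the size up to 12.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Token12 {
    pub start: u32,
    pub end: u32,
    pub kind: u8,
    pub is_on_new_line: bool,
    pub escaped: bool,
}

// Same fields plus explicit padding, bringing the size to 16 bytes.
// Alignment is still 4, so a copy may use an unaligned 16-byte move.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Token16 {
    pub start: u32,
    pub end: u32,
    pub kind: u8,
    pub is_on_new_line: bool,
    pub escaped: bool,
    pub _padding: u32,
}

// Returns (size of the 12-byte variant, size of the padded variant).
pub fn token_sizes() -> (usize, usize) {
    (
        std::mem::size_of::<Token12>(),
        std::mem::size_of::<Token16>(),
    )
}

// Small copy helpers that can be inspected in a disassembler to
// compare the generated load/store sequences for each layout.
pub fn copy12(src: &Token12) -> Token12 {
    *src
}

pub fn copy16(src: &Token16) -> Token16 {
    *src
}
```

Compiling `copy12` and `copy16` with optimizations and comparing the assembly is one way to check the load/store claim on a given target.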
This suggests that either:

1. The reduction of `Token` size from 16 to 12 bytes (#2010) could be reverted at no cost, and the overhead of the hash table removed.

or:

2. Get `Token` down to 8 bytes!

I have an idea how to maybe do (2), so I'd suggest leaving it as is for now until I've been able to research that.
NB I also tried putting `#[repr(align(16))]` on `Token` so that copying uses aligned loads/stores. That hurt the benchmarks very slightly, though it might produce a gain on architectures where unaligned loads are more expensive (ARM64 I think?). But I can't test that theory, so have left it out.
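For reference, a sketch of what the alignment experiment looks like. As before, `AlignedToken` and its fields are hypothetical stand-ins rather than the real `Token`, and `repr(C)` is added only to keep the layout deterministic for the example. With `align(16)`, the compiler rounds the size up to a multiple of the alignment, so no explicit padding field is needed.

```rust
// Hypothetical token forced to 16-byte alignment.
// Fields occupy 11 bytes; align(16) rounds the size up to 16.
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct AlignedToken {
    pub start: u32,
    pub end: u32,
    pub kind: u8,
    pub is_on_new_line: bool,
    pub escaped: bool,
}

// Returns (size, alignment) of the aligned variant.
pub fn aligned_layout() -> (usize, usize) {
    (
        std::mem::size_of::<AlignedToken>(),
        std::mem::align_of::<AlignedToken>(),
    )
}
```

One possible cost of this variant is that any struct embedding the token also inherits the 16-byte alignment, which may add padding elsewhere. That is a general property of `repr(align)`, not a measured explanation for the benchmark result above.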