-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Strengthen type-based optimizations #5664
JIT: Strengthen type-based optimizations #5664
Conversation
CT Test Results 4 files 358 suites 44m 11s ⏱️ Results for commit 2362d1c. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts// Erlang/OTP Github Action Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not looked closely at the arm parts, but x86 looks good to me.
@@ -607,7 +607,11 @@ static int parse_type_chunk(BeamFile *beam, IFF_Chunk *chunk) { | |||
beamreader_init(chunk->data, chunk->size, &reader); | |||
|
|||
LoadAssert(beamreader_read_i32(&reader, &version)); | |||
LoadAssert(version == BEAM_TYPES_VERSION); | |||
if (version != BEAM_TYPES_VERSION) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we issue a run-time warning when this happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not by default. It could get very annoying for systems that intentionally mix BEAM files compiled by different versions of the compiler. But there should be some way or some tool so that one could get aware that type-based optimizations are no longer active.
if (need_div) { | ||
a.sal(x86::rax, imm(_TAG_IMMED1_SIZE)); | ||
} | ||
|
||
if (need_rem) { | ||
a.sal(x86::rdx, imm(_TAG_IMMED1_SIZE)); | ||
} | ||
|
||
if (need_div) { | ||
a.or_(x86::rax, imm(_TAG_IMMED1_SMALL)); | ||
} | ||
|
||
if (need_rem) { | ||
a.or_(x86::rdx, imm(_TAG_IMMED1_SMALL)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 4 ifs and not just 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have kept the original order of the instructions, thinking that perhaps it could enable more instruction-level parallelism. Not sure whether it makes any difference, or whether the CPU itself can do the same reordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCC and Clang seem to think that it does matter, so let us leave it like this.
i_rem_div Fail Live LHS RHS Remainder Quotient | ||
gc_bif2 Fail Live u$bif:erlang:rem/2 LHS1 RHS1 Remainder | \ | ||
gc_bif2 A B u$bif:erlang:intdiv/2 LHS2 RHS2 Quotient | \ | ||
equal(LHS1, LHS2) | \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the changes to the types, would it make sense to not allow matching on equal arguments by repeating the pattern and always require explicit equal
calls? This seems like something that would be easy to get wrong and might result in considerable performance issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have also had the same thought. It is a probably a good idea.
This is necessary if type information is added to instructions that have `list` operands (such as `bs_create_bin`).
The introduction of types broke fusing of `div` and `rem`.
Add ranges to the type for functions that returns a size for an Erlang term. Since any term is limited by the size of memory, it is possible to define an upper limit that can never be reached in practice in the foreseeable future. Also improve propagation of ranges through some more operations.
For expressions such as: Unknown band 42 the type would conservatively be assumed to be any integer. Be less conservative and assume that the type is an integer in the range 0 .. 42. That will give more opportunities for the JIT to eliminate type tests. Note that for: Unknown band 0 we will still infer the type to be any integer instead of the integer 0. The reason is that code such as: foo(Unknown) -> Unknown band 0. would be rewritten to: foo(Unknown) -> Unknown band 0, 0. if we were to assume that `Unknown band 0` is 0. This is a pessimization.
Find the type of a call to `erlang/2` where a literal tuple is used as a lookup table. Here is a truncated example from the `base64` module: element(X+1, {$A, $B, $C, $D, $E, $F, ...}) The type will be the join of the types of the elements.
Having type information for the `bs_create_bin` instruction will allow to omit some type checks.
Version 1 of the type information also include ranges for integers.
ac96abc
to
b054a03
Compare
Test that the JIT does not omit the overflow check when it would be unsafe.
2362d1c
to
d6907be
Compare
This pull requests strengthens the type-based optimizations in the JIT that were introduced in #5316.
The compiler has been enhanced to infer ranges for more operations. The ranges are now included in the type information stored in BEAM files.
With the enhanced type information, the JIT can now omit type checks and overflow checks for some arithmetic instructions, and can also omit some redundant tests when constructing binaries using the binary syntax.