Have RangeCheck::ComputeRange handle some TYP_LONG scenarios#128676
Have RangeCheck::ComputeRange handle some TYP_LONG scenarios#128676tannergooding wants to merge 6 commits into
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR extends CoreCLR JIT range analysis to recognize additional patterns that can help eliminate bounds checks, especially when intermediate nodes are TYP_LONG (e.g., certain shifts, casts, and bit-count intrinsics).
Changes:
- Added
BitOperations::RoundUpToPowerOf2helpers foruint32_t/uint64_t. - Enhanced
RangeCheck::ComputeRangeForBinOpto special-case someTYP_LONGshift-right scenarios by mapping them into existing 32-bit range logic. - Enhanced
RangeCheck::ComputeRangeto derive ranges for additional node kinds (someTYP_LONGVN constants that fit inint32,GT_CASTsizing logic, and select bit-count intrinsics/HW intrinsics).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/coreclr/jit/utils.h | Adds RoundUpToPowerOf2 overloads to BitOperations. |
| src/coreclr/jit/rangecheck.cpp | Expands range computation to cover more long/cast/intrinsic cases and adjusts binop handling for long shifts. |
16f795e to
b105749
Compare
|
Sorry but +200 LOC with a risky change to lift TYP_LONG limitation and that path for CAST in rangecheck is not worth 1 diff in windows-x64 (and 8 diffs on linux-x64) - this makes code harder to read/maintain/introduces risks for no benefit to me |
We will never get into a situation where we can properly handle As for the actual changes handled here, the operations being touched are fairly trivial to check for correctness given the limited set they can be. 60 lines is the 15 lines is the 30 lines is then improving the 25 lines is expanding the We then have 40 lines that are actually meaningful and handling the edge cases for The remaining 20-30 lines is just adding The other paths are handling cases such as |
|
Again, all I see is +200 LOC for a 1-2 diffs meaning almost no test coverage (but very much can break actual production from my experience). I understand that you want to check in a change you spent some time on, but all I see is an increased maintenance cost for people who work with this code full-time. I don't see how this moves us towards TYP_LONG support or if we even want to go there given all arrays/spans are TYP_INT typed. I think we should first prove that TYP_LONG range analysis is worth it. There will be a lot of pitfalls everywhere starting from enabling assertions (we have no room for any new ones) for TYP_LONG; this path will be the very least of our problems. |
|
So you basically removed a correcntess limitation for not being TYP_LONG for a big function and lack of diffs in my opinion do not help to say it's fine to just unconditionally enable it for all opcodes handled in that function |
| var_types op1Type = hwintrinsic->Op(1)->TypeGet(); | ||
|
|
||
| range.lLimit = Limit(Limit::keConstant, 0); | ||
| range.uLimit = Limit(Limit::keConstant, varTypeIsLong(op1Type) ? 64 : 32); |
There was a problem hiding this comment.
the reason why the original path was so simple because only "never negative" knowledge had an impact. Upper bound didn't matter.
There was a problem hiding this comment.
Yes, but there's also no reason to not be accurate and it can have impact in user code. Namely this allows indexing into a span/array with known length to skip bounds check and therefore allowing such code to avoid unsafe.
| return static_cast<uint32_t>(0x1'0000'0000ULL >> LeadingZeroCount(value - 1)); | ||
| } | ||
|
|
||
| static uint64_t RoundUpToPowerOf2(uint64_t value) |
There was a problem hiding this comment.
Yes, the PR was originally doing more and I dropped some of the changes to make it more iterative and easier to review and check for correctness.
I can drop it and add it back in the PR that actually uses it instead, but we do have a few such "round up" in the codebase already and so if we keep it, we can just submit a follow up PR to centralize them to this one instead (as we've done with log2 and other helpers that are in utils/BitOperations
That isn't actually what that means and is itself a view of survivorship bias. Rather because we know that
This isn't a consideration to me, I close PRs that I don't believe will pay off all the time. I keep them open if there are significant wins or I believe they are part of an iterative set of PRs that will pay off and I close them otherwise (and far more get closed or never even become PRs than stay open). This one is a case I believe will pay off long term if we are able to get through the set of smaller iterative PRs that move us towards being able to handle
It moves us a step in the right direction because it allows us to actually start processing The fact that arrays/spans have a We can also, from this, move forward with some other range based tracking around comparisons, which will in turn allow |
And to that regard, this PR should now have a large number of test failures on 32-bit and a larger number of diffs on those platforms because of #128769 and me switching to use This explicitly highlights that we do have good coverage here, that this work helps uncover places we are incorrectly handling |
|
|
||
| var_types rangeType; | ||
|
|
||
| if (genTypeSize(fromType) < genTypeSize(toType)) |
There was a problem hiding this comment.
I ran diffs with this change and there were 0 diffs. can we remove it? I really don't see what value it brings to a very buggy CompureRange in rangecheck that typically requires all changes to be repeated in DoesOverflow as well.
So it's just + 30 LOC with no value that makes me uncofortable knowing how fragile ComputeRange is (it's a completely separate thing vs GetRangeFromAssertions/VN)
There was a problem hiding this comment.
I can remove it, but could you elaborate on what is uncomfortable about the check?
We presumably want the same type of check in GetRangeFromAssertions which is also using the casts target type as the initial foundation.
A widening cast can never exceed the bounds of the input (fromType), so:
BYTE -> INT:[INT8_MIN, INT8_MAX]UBYTE -> INT:[0, UINT8_MAX]SHORT -> INT:[INT16_MIN, INT16_MAX]USHORT -> INT:[0, UINT16_MAX]INT -> LONG:[INT32_MIN, INT32_MAX]UINT -> LONG:[0, UINT32_MAX]
A cast between same sizes however only respects the target type (and even should've been elided).
UINT -> INT:[INT32_MIN, INT32_MAX]ULONG -> LONG:[INT64_MIN, INT64_MAX](i.e.keUnknown)
We then cannot have casts between the same type where the target is unsigned, because we always treat the destination as signed. And casts between same sized small types likewise simply become widening to INT.
A narrowing cast then is restricted to the bounds of the output (toType), so:
LONG -> INT:[INT32_MIN, INT32_MAX]ULONG -> INT:[INT32_MIN, INT32_MAX]INT -> BYTE:[INT8_MIN, INT8_MAX]INT -> UBYTE:[0, UINT8_MAX]INT -> SHORT:[INT16_MIN, INT16_MAX]INT -> USHORT:[0, UINT16_MAX]
There was a problem hiding this comment.
UINT -> LONG: [0, UINT32_MAX]
Range doesn't support unsigned upper bound so if your change does it - it's already a bug. Given no diffs it means it will fail in someones Production and not in out suite.
I think if code brings no obvious value it should be removed. In fact, I'd like us to avoid doing any changes to ComputeRange/DoesOverflow. This code particually looks like it should just call GetRangeFromAssertions for casts.
There was a problem hiding this comment.
That being said, bugs in branch optimizations are the worst - they always lead to silent codegen bug (as opposite to deterministic asserts/crashes) extremely difficult to investigate. I've spent enought time on them in these things to only accept changes that have clear impact/coverage
There was a problem hiding this comment.
Range doesn't support unsigned upper bound so if your change does it - it's already a bug
It does not do that, I was simply giving the full list. We use GetRangeForType and so TYP_INT becomes [INT32_MIN, INT32_MAX] and TYP_UINT, TYP_LONG, and TYP_ULONG all become keUnknown
There was a problem hiding this comment.
Anyway, if you really want to make this change, just call GetRangeFromAssertion on CAST's VN, it's literally one line change vs 30 LOC of untested code
There was a problem hiding this comment.
That being said, bugs in branch optimizations are the worst - they always lead to silent codegen bug (as opposite to deterministic asserts/crashes) extremely difficult to investigate. I've spent enought time on them in these things to only accept changes that have clear impact/coverage
Agreed, but lack of diffs is not representative of lack of coverage and I would argue in many cases such problems occur because we tried to be clever and not simply do the more obviously correct handling. -- i.e. I actually think the current logic is much more likely to be broken than the logic in this PR, because it is trying to be simple and only handle the case it thinks it cares about. While the new logic explicitly factors in widening vs narrowing vs same size with concrete reasoning as to why it is correct by definition.
I'm Still fine to remove this part if its actually giving zero diffs (will wait until after the IsVNIntegralConst fix goes in though), but I don't agree with the doing so because of the shape of the logic itself.
I don't see what this PR did for that, I've noticied this issue yesterday just reminded myself after I suggested it. |
We actually have in a few places things like |
Are you suggesting we extend this to support these cases instead and only rely on VN? |
I wasn't saying that this PR helped identify that issue. I was rather saying the fact that such an issue exists and this PR is now failing is evidence that the PR has good coverage (as it was passing before the switch to use the buggy API). Coverage is not just about number of diffs that show, its also about the failures that don't occur. Now whether no failures is meaningful or bad is dependent on how common a given pattern is. In this case the pattern is effectively The number of low diffs might then suggest this additional handling is not meaningful, but I'm arguing for it being an iterative step towards us having meaningful diffs without me dropping a giant PR/refactoring. This is the type of logic we will end up having in any world where |
|
Talked with @EgorBo on Discord and we came to more or less a consensus here. The general concern is that However, we'd like to get rid of So the plan is to fully remove some of the It is then fine for Going to close this and will get a different PR up that does that handling. (Egor, feel free to ping if I misstated anything here). |
No description provided.