RyuJIT bounds checks not eliminated for readonly arrays #11797
Comments
Might be relatively low risk and modest benefit. Keeping in 3.0 for now until we can better assess potential benefit. |
Could be as simple as a well placed |
Example knock on code change for this is in |
In particular this should do interesting things for ref typed readonly initialized statics, allowing reads to be hoisted out of loops, for instance. Addresses part of #21951. Note readonly instance fields cannot be given special treatment.
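As a rough illustration of the hoisting described above (a sketch only, not the JIT's actual transformation; `Table`, `SumDirect`, and `SumHoisted` are hypothetical names), the two methods below compute the same result. The second is what a user would write by hand, and is roughly the shape the JIT could reach on its own once it may treat an initialized static readonly reference as loop-invariant:

```csharp
using System;

static class HoistingSketch
{
    // Assigned exactly once, by the static initializer.
    private static readonly int[] Table = { 1, 2, 3, 4, 5, 6, 7, 8 };

    // Reads the static field on every iteration.
    public static int SumDirect()
    {
        int sum = 0;
        for (int i = 0; i < Table.Length; i++)
        {
            sum += Table[i];
        }
        return sum;
    }

    // Reads the static field once, then works with a local copy of the reference.
    public static int SumHoisted()
    {
        int[] table = Table;
        int sum = 0;
        for (int i = 0; i < table.Length; i++)
        {
            sum += table[i];
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumDirect());  // 36
        Console.WriteLine(SumHoisted()); // 36
    }
}
```

An instance `readonly` field can still be assigned inside a constructor, and a method called from that constructor can observe it before and after the assignment, which is one reason the same reasoning does not carry over to instance fields.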
Preview of jit changes here: master...AndyAyersMS:InitializedReadonlyStaticInvariance. Unfortunately this won't help But we can give special treatment to There seem to be a fair number of these static cases about, and initial diffs look promising (Note this used
Method with the biggest percent diff shows repeated access to a readonly static array which can now be optimized as if the array ref was hoisted to a local. Is it my imagination, or are there two identical copies of the |
@benaadams do you think this is still interesting, even if it can't catch the cases you'd hoped? |
Yes definitely; I actually thought |
Ok. Let me look into some of the regressions. Running the above over the jitutils\fx set augmented with ASP.Net assemblies shows some hits here and there in ASP.Net. Need to add some sort of filter option to |
Two examples from Kestrel would be:

    private static readonly Tuple<ulong, ulong, HttpMethod, int>[] _knownMethods =
        new Tuple<ulong, ulong, HttpMethod, int>[17];
    private static readonly string[] _methodNames = new string[9];

Though there are probably better examples; not sure they are used in a way that would skip bounds checking.
Filtered view ...
Note We don't yet drill into We don't know what to do with accesses to |
Very nice :) |
In
In the after version, r13 is only referred to in the preheader and right at the top of the loop. The preheader reference is just for a bounds check. If not spilled, r13 must stay live across the whole body, and the loop body is pretty large. So the lack of r13 causes a lot of turmoil. Not sure why LSRA doesn't spill r13 -- but as a guess, the allocator probably picks shorter lifetimes as spill victims. Also if we did spill, because there are quite a few

Chances are good that if we implement the follow-on optimization to propagate the array length as a jit-time constant then we'd fix this, as the array ref would only need to be live within a loop iteration, and similar may hold true of many of these "for loop over elements of a static readonly array" cases.

Something similar would happen if the user decided to manually CSE the static array by creating a local. So in addition to catching the explicit references it would be good if we could propagate info about the reference to any local copies so we could better optimize cases that users have already tried to optimize.
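To make the "array length as a jit-time constant" follow-on concrete, here is a minimal sketch (the type and member names are hypothetical, and whether a given JIT version actually removes the check is not something this example demonstrates). The loop bound is the literal `9`, so the per-element bounds check can only be proven redundant if the JIT also knows that the static readonly array was allocated with length 9:

```csharp
using System;

static class ConstLengthSketch
{
    // Length is fixed at 9 when the static constructor runs.
    private static readonly int[] Squares = new int[9];

    static ConstLengthSketch()
    {
        for (int i = 0; i < Squares.Length; i++)
        {
            Squares[i] = i * i;
        }
    }

    public static int SumFirstNine()
    {
        int sum = 0;
        // The bound is a constant rather than Squares.Length, so proving
        // Squares[i] in range requires knowing the array's length is 9.
        for (int i = 0; i < 9; i++)
        {
            sum += Squares[i];
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumFirstNine()); // 204
    }
}
```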
There are a few readonly static array accesses that are done via enum that I thought wouldn't be affected by this; however, if the length is a propagated constant, could the bounds checks potentially be skipped there too (e.g. by recognising the index is in range)?
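A small sketch of the enum-indexed case (hypothetical names; this shows the source pattern only, not what any particular JIT emits). An enum value is not restricted to its declared members, so the plain cast still needs a range check at run time. If the array length were known to be the constant 4, an index masked with `& 3` would be provably in range:

```csharp
using System;

enum Method { Get = 0, Put = 1, Post = 2, Delete = 3 }

static class EnumIndexSketch
{
    private static readonly string[] Names = { "GET", "PUT", "POST", "DELETE" };

    // (Method)42 is a legal value, so this index cannot be assumed in range.
    public static string NameOf(Method m) => Names[(int)m];

    // The masked index is always 0..3; with a known length of 4 it is in range.
    public static string NameOfMasked(Method m) => Names[(int)m & 3];

    public static void Main()
    {
        Console.WriteLine(NameOf(Method.Post));       // POST
        Console.WriteLine(NameOfMasked((Method)6));   // POST (6 & 3 == 2)
    }
}
```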
Presumably anything blocked in the first pass (when the JIT can't tell the .cctor has finished, e.g. crossgen or first encounter) will get picked up by the second pass at Tier1?
I would certainly hope so.
Right, at Tier1 there's a pretty good chance all the relevant |
As for the optimization itself -- I am going to add in the array length bit sometime soon; I think that plus the above should come out looking pretty solid. But we're also trying to wrap things up for the release. So am going to mark this as future. If the ongoing work looks encouraging, we can still consider it for 3.0. |
Just leaving some performance numbers from using AES-NI with .NET Core 3.0. The AES round keys are stored in a private readonly array field in my AES class. Accessing the array directly results in an encryption speed of 4.2 GB/s. Creating a local reference to the array outside the hot loop and then accessing the last element of the array before accessing any other elements removes the remaining bounds checks and results in an encryption speed of 5.7 GB/s. |
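The workaround described in that comment can be sketched as follows. This is not the commenter's AES code: `RoundKeySketch`, `Mix`, and the four-element key array are stand-ins chosen to keep the example short. The pattern is a local copy of the array reference followed by a read of the highest index used, so that any out-of-range failure happens on that first access and the later constant-index reads are known to be in range:

```csharp
using System;

sealed class RoundKeySketch
{
    private readonly uint[] _roundKeys;

    public RoundKeySketch(uint[] roundKeys) => _roundKeys = roundKeys;

    public uint Mix(uint state)
    {
        uint[] keys = _roundKeys; // local reference, taken outside the hot path
        uint last = keys[3];      // highest index first: throws here if the array is too short

        state ^= keys[0];
        state ^= keys[1];
        state ^= keys[2];
        state ^= last;
        return state;
    }

    public static void Main()
    {
        var sketch = new RoundKeySketch(new uint[] { 1, 2, 4, 8 });
        Console.WriteLine(sketch.Mix(0)); // 15
    }
}
```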
@EgorBo maybe your recent changes fixed this? |
Yes, we now fold nullchecks for |
Range analysis doesn't take readonly arrays into account, so you need to make a function-local reference to eliminate the range check.
https://github.com/dotnet/coreclr/issues/5371 regressed due to dotnet/coreclr#15756
It shouldn't need to be conservative for a readonly array?
/cc @briansull @AndyAyersMS @mikedn
category:cq
theme:bounds-checks
skill-level:expert
cost:medium
impact:medium