Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay slicing in list pattern until all other element patterns are matched #70615

Closed

Conversation

DoctorKrolic
Copy link
Contributor

Fixes: #60091

@DoctorKrolic DoctorKrolic requested a review from a team as a code owner October 28, 2023 16:05
@ghost ghost added the Community The pull request was submitted by a contributor who is not a Microsoft employee. label Oct 28, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Oct 28, 2023
@DoctorKrolic
Copy link
Contributor Author

Can I have a review here please?

@CyrusNajmabadi
Copy link
Member

We may want to run this by LDM and/or compat council to make sure everyone is ok with this.

Personally, I like this, and it fits with what we've said about patterns. But we should make sure everyone is on the same page here.

@jcouv would you want to drive this?

@DoctorKrolic
Copy link
Contributor Author

I guess, if there are arguments against the change as a whole we can at least limit this to only apply to arrays/strings since they are special cases and rely on GetSubArray and Substring methods which have known behavior and are invariant to evaluation order

@DoctorKrolic
Copy link
Contributor Author

And if this question will be considered by LDM, it would be better to broaden its scope and discuss the possibility of the compiler changing the order of evaluation in list patterns without being tied to any specific pattern type. So for instance a pattern on a string array like ["(", ..var between, { Length: > 2 }, ")"] can be avaluated in the following order:

  1. "("
  2. ")"
  3. { Length: > 2 }
  4. ..var between

This is the most efficient order of evaluation here since constant is cheaper to check than property access + greater than operator and slicing is the most expensive one which always evaluates to true. Right now compiler checks patterns in left-to-right order and this PR implements putting slicing to the end, but it would be better if language allowed compiler to be as flexible as possible to deliver the best performance in every particular case

@jcouv
Copy link
Member

jcouv commented Nov 8, 2023

@jcouv would you want to drive this?

Sure. Added the topic to LDM queue: https://github.com/dotnet/csharplang/blob/main/meetings/2023/README.md#schedule-asap
For background, here's the paper Neal was referencing regarding optimizations in pattern matching: https://www.cs.tufts.edu/~nr/cs257/archive/norman-ramsey/match.pdf

@jcouv jcouv added the Blocked label Nov 10, 2023
@jcouv jcouv removed the Blocked label Nov 27, 2023
@jcouv jcouv added this to the 17.9 milestone Nov 27, 2023
@jcouv
Copy link
Member

jcouv commented Nov 27, 2023

Unblocked the PR as LDM approved this design (delaying the slice). #Closed

IL_003a: ret
}
""");
}
Copy link
Member

@jcouv jcouv Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the added tests use pre-existing types (arrays, string, ROS) so the change is only observable through IL inspection.
Please add some tests with custom collection types where the different properties/methods are instrumented to print something, then we can directly verify effect on order of evaluation.
See SlicePattern_Subpattern for an example (Length property logs itself). #Closed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Consider adding a test with multiple "simple slice patterns". I don't think that scenario will get past initial binding, so shouldn't reach the modified code, but it may be good to check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first one is done.

I don't think that scenario will get past initial binding, so shouldn't reach the modified code, but it may be good to check

Correct, that's a binding problem. And I am 99.99% sure these cases are already tested in any meaningful way possible somewhere in the binding tests. I think adding them in this PR will just create unnecessary duplicates and bloat the size of code to review. So left this one untouched. If you really feal strong, that I am wrong and such tests are actually needed here, LMK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcouv I need your final word here

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with review pass (iteration 7)

// We faced a slice pattern with a declaration pattern of the same narrowed type inside, e.g. `[<pattern>, ..var between, <pattern>]`
// This pattern always evaluates to true, but causes slicing operation and even may allocate (e.g. when slicing an array or string)
// Thus it makes sense to hold it and "evaluate" after all other patterns in the list matched
if (slicePattern.Kind == BoundKind.DeclarationPattern && slicePattern.InputType.Equals(slicePattern.NarrowedType, TypeCompareKind.AllNullableIgnoreOptions))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this check really necessary? the narrowed type for the slice won't escape from within the element itself. Then you can inline the local function outside the loop.

Copy link
Contributor Author

@DoctorKrolic DoctorKrolic Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand this one, can you elaborate? The narrowed type check is needed to prevent order of evaluation change when types are different since in such case we cannot guarantee that the pattern always evaluates to true (see #69053 for instance) and thus we cannot tell, whether it is cheaper to perform slicing pattern check or check of the last N patterns in the list

Copy link
Member

@alrz alrz Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The narrowed type check is needed to prevent order of evaluation change when types are different since in such case we cannot guarantee that the pattern always evaluates to true

Matching the slice pattern last will change the ordering (of the Slice call) anyways? but that is obviously not a requirement - allowing this optimization. I don't think narrowed type of the slice pattern is at issue here because it won't affect any other patterns e.g. the remaining subpatterns in the list.

we cannot tell, whether it is cheaper to perform slicing pattern check or check of the last N patterns in the list

As you mentioned, Slice may actually allocate for non-span types so it's worth considering making it the last test to perform regardless of what subpattern we have. As for overall ordering of the tests based on cost, there's #29033.

Unless this complicates subsumption checking I think it should be ok to do this uniformly. I may have missed something from previous discussions, but @jcouv can weigh in on whether or not that's an appropriate change to make.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be okay to generalize the optimization.

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with review pass (iteration 11). Sorry for delay, I was on vacation. Happy New Year :-)

@DoctorKrolic DoctorKrolic requested review from alrz and jcouv March 25, 2024 15:35
@DoctorKrolic
Copy link
Contributor Author

@jcouv I've finally returned to this PR, would like a look from you)

@jcouv jcouv modified the milestones: 17.9, 17.10 Mar 25, 2024
""";

var comp = CreateCompilationWithIndexAndRangeAndSpan([source, TestSources.GetSubArray]);
comp.VerifyEmitDiagnostics();
Copy link
Member

@jcouv jcouv Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For tests that don't have errors, it would be good to verify execution.
I'd even recommend switching some of those tests away from arrays or ReadOnlySpans to a custom countable/indexable/sliceable type so that the order of evaluation can be made apparent. Verifying that the output is something like index[0] index[^1] slice will more effectively capture the intent and avoid unintentional regressions than a block of IL. (like SlicePattern_LastDeclarationPatternOfSameType_CustomTypeWithSideEffects you added below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For tests that don't have errors, it would be good to verify execution.

Good idea to verify that behavior stays the same for most cases, working on that

I'd even recommend switching some of those tests away from arrays or ReadOnlySpans to a custom countable/indexable/sliceable type so that the order of evaluation can be made apparent

There are 3 groups of tests added in this PR: when slice is in between other patterns, when it is the first and when it is the last. In each group there are 4 tests: 2 tests arrays with the same type and a narrowed type, test of a string, test of ROS and a test with custom side-effecting type. Each test has its own purpose: arrays and strings have special slicing behavior, so it is good to verify them separately, ROS represents a case when slicing is completely pure and test with side effect well... tests behavior change when there is a side effect (remember that this should represent a minority of real-world cases, so we shouldn't focus on behavior changes too much). Thus I wouldn't switch existing tests from one type to another, rather add additional tests on top of what we already have.

Verifying that the output is something like index[0] index[^1] slice

I didn't understand this one. Can you elaborate please? We already cover cases with side effects, what other types of side effects can we check for? Maybe we can modify existing side effect tests to verify more things at once? Can you please provide a prototype of a test you want to see

""";

var comp = CreateCompilationWithIndexAndRangeAndSpan([source, TestSources.GetSubArray]);
comp.VerifyEmitDiagnostics();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a good test to show side-effect. It's much harder to figure out from the IL block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what do you mean by "side-effect" here? We usually reffer to side effects as something, which is not a part of the main code flow or a potential unintentional behavior from user's perspective. The review is pointing to a test of an array, which by design should fully keep its original behavior with the only observable change being performance improvement in certain scenarios. So I wouldn't say there is a side effect of the optimization when dealing with arrays. Did you mean just to verify runtime behavior (which I'm gonna do anyway) or what?

// We faced a slice pattern with a declaration pattern of the same narrowed type inside, e.g. `[<pattern>, ..var between, <pattern>]`
// This pattern always evaluates to true, but causes slicing operation and even may allocate (e.g. when slicing an array or string)
// Thus it makes sense to hold it and "evaluate" after all other patterns in the list matched
if (slicePattern.Kind == BoundKind.DeclarationPattern && slicePattern.InputType.Equals(slicePattern.NarrowedType, TypeCompareKind.AllNullableIgnoreOptions))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeCompareKind.AllNullableIgnoreOptions

Do we have tests covering the comparison kind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to resolve #70615 (comment) before working on this one

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with review pass (iteration 17). Only test feedback

@jcouv
Copy link
Member

jcouv commented Mar 25, 2024

@dotnet/roslyn-compiler for another review. Thanks

@jcouv jcouv self-assigned this Mar 28, 2024
@jcouv
Copy link
Member

jcouv commented Mar 28, 2024

@dotnet/roslyn-compiler for second review. Thanks

@jaredpar jaredpar modified the milestones: 17.10, 17.11 Apr 1, 2024
@DoctorKrolic DoctorKrolic deleted the delay-list-pattern-slicing branch May 25, 2024 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Community The pull request was submitted by a contributor who is not a Microsoft employee. untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suboptimal codegen for the slice pattern in list pattern matching
5 participants