-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8367530: The exhaustiveness errors could be improved #27256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
👋 Welcome back jlahoda! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
|
|
Thanks for the comments so far. I think I've updated the code where needed based on them, and tried to respond. Thanks! |
| tree.isExhaustive = tree.hasUnconditionalPattern || | ||
| TreeInfo.isErrorEnumSwitch(tree.selector, tree.cases); | ||
| if (exhaustiveSwitch) { | ||
| tree.isExhaustive |= exhaustiveness.exhausts(tree.selector, tree.cases); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This |= turned into =. Is that correct? I think so. Now it is guarded by tree.isExhaustive itself right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is intentional. There is one more additional if (!tree.isExhaustive) test below, which replaces the use of the or (but is faster in case we know the switch is exhaustive).
| } | ||
|
|
||
| /* The stricness of determining the equivalent of patterns, used in | ||
| * nestedComponentsEquivalent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also in computeCoverage. Can you provide an example/sentence of two strictly equivalent patterns and two loosely equivalent?
| /* | ||
| * Based on {@code basePattern} generate new {@code RecordPattern}s such that all | ||
| * components instead of {@code replaceComponent}th component, which is replaced | ||
| * with values from {@code updatedNestedPatterns}. Resulting {@code RecordPatterns}s | ||
| * are sent to {@code target}. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/*
* Using {@code basePattern} as a starting point, generate new {@code
* RecordPattern}s, such that all corresponding components but one, are the
* same. The component described by the {@code replaceComponent} index is
* replaced with all {@code PatternDescription}s taken from {@code
* updatedNestedPatterns} and the resulting {@code RecordPatterns}s are sent
* to {@code target}.
*/
src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ExhaustivenessComputer.java
Outdated
Show resolved
Hide resolved
src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ExhaustivenessComputer.java
Outdated
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| //assert? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Address this todo?
src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ExhaustivenessComputer.java
Show resolved
Hide resolved
Co-authored-by: Aggelos Biboudis <biboudis@gmail.com>
|
|
||
| return currentMissingPatterns; | ||
| } else if ((bp.type.tsym.flags_field & Flags.RECORD) != 0 && | ||
| //only expand record types into record patterns if there's a chance it may change the outcome |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this choice
|
|
||
| removeUnnecessaryPatterns(selectorType, bp, basePatterns, inMissingPatterns, combinatorialPatterns); | ||
|
|
||
| CoverageResult coverageResult = computeCoverage(targetType, combinatorialPatterns, PatternEquivalence.LOOSE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can the expansion of a record pattern not cover? E.g. if we start with an exhaustive pattern R r and we expand R into all its components R(A1, B1, C1), R(A2, B2, C2) ... -- how can we get into a situation where we're no longer exhaustive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of the code here (and a few lines below) is to merge "unnecesarily" specific patterns inside the missing patterns into their more generic supertypes. Like if at a specific place in the record patterns, there are both A and B, permitted subtypes of Base, we could merge and replace with Base here. But I understand it is somewhat matter of opinion on what is better.
Note the "unnecessary" combinatorialPatterns (i.e. those that are covered by the user provided patterns) are already removed here, and computeCoverage is ran on combinatorialPatterns, so those may not cover the original type, because they no longer contain anything covered by the user-provided patterns.
One example where this changes the outcomes is:
| public void testComplex1(Path base) throws Exception { |
where the current outcome is:
test.Test.Root(test.Test.R2 _, test.Test.Base _, test.Test.Base _)
test.Test.Root(test.Test.R3 _, test.Test.Base _, test.Test.Base _)
but if I disable this merging, we would get:
test.Test.Root(test.Test.R3 _, test.Test.R1 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R1 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R1 _, test.Test.R3 _)
test.Test.Root(test.Test.R3 _, test.Test.R2 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R1 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R2 _, test.Test.R2 _)
test.Test.Root(test.Test.R3 _, test.Test.R1 _, test.Test.R3 _)
test.Test.Root(test.Test.R3 _, test.Test.R3 _, test.Test.R3 _)
test.Test.Root(test.Test.R2 _, test.Test.R2 _, test.Test.R3 _)
test.Test.Root(test.Test.R2 _, test.Test.R3 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R2 _, test.Test.R3 _)
test.Test.Root(test.Test.R3 _, test.Test.R3 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R3 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R1 _, test.Test.R1 _)
test.Test.Root(test.Test.R2 _, test.Test.R2 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R2 _, test.Test.R1 _)
test.Test.Root(test.Test.R2 _, test.Test.R3 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R3 _, test.Test.R3 _)
There may be a way to better control this merging (given the original user-provided pattern has _ on both the second and third component), but given how coverage works for record patterns, it is a bit tricky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think my question originated from a wrong assumption on how your code behaved -- after some debugging sessions I've seen cases where the set of patterns does not cover the target
| incompletePatterns, | ||
| Set.of(defaultPattern)); | ||
| } catch (TimeoutException ex) { | ||
| return ex.missingPatterns != null ? ex.missingPatterns : Set.of(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of a timeout, I wonder if you could instead cut the recursion at a specific threshold. It seems to me that recursing more will provide more precision at the nested level, so it's a trade off between when do we want to stop.
Overload resolution provides some kind of precedent:
error: incompatible types: String cannot be converted to int
m("Hello");
^
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
1 error
(We "compress" the diagnostic whenever we can somehow figure out if an overload is "better" than the others). Then if you provide the option, you get the full thing:
error: no suitable method found for m(String)
m("Hello");
^
method Test.m() is not applicable
(actual and formal argument lists differ in length)
method Test.m(int) is not applicable
(argument mismatch; String cannot be converted to int)
method Test.m(int,int) is not applicable
(actual and formal argument lists differ in length)
method Test.m(int,int,int) is not applicable
(actual and formal argument lists differ in length)
method Test.m(int,int,int,int) is not applicable
(actual and formal argument lists differ in length)
method Test.m(int,int,int,int,int) is not applicable
(actual and formal argument lists differ in length)
method Test.m(int,int,int,int,int,int) is not applicable
(actual and formal argument lists differ in length)
But, also, maybe putting an upper bound on the recursion, no matter what, might be a good idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I agree that having a timeout in weird/non-standard in javac, I believe it is the first time when we have a process that a) can run for a very long time; b) is not required for correctness. In all other cases I can recall, if some process is slow (including e.g. verifying exhaustiveness), then it is required for correctness. And so we cannot skip it based on criteria like time.
The timeout here provides a way to say how much real-world time is the user willing to invest to get the outcome - if more time is invested, more detailed missing patterns may possibly be computed. It is a somewhat weird approach for javac, but it is a limit that (I think) the user and javac can agree on.
We could introduce a limit on e.g. the depth to which we expand, but adding one level of nesting may affect performance significantly or almost not at all, depending on e.g. the record type form/shape.
E.g. having a record with many components, where each component is a sealed type with many permitted subtypes, one level of nesting may lead to a high number of newly generated patterns possibly taking a lot of time to go through them. But having a record that has a single component that is not a sealed type should only generate one pattern, and so have minimal impact.
| Symbol enumType = bp.type.tsym; | ||
| return enum2Constants.get(enumType).stream().map(c -> enumType.toString() + "." + c.name); | ||
| } else { | ||
| return Stream.of(pd.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, eager string generation will render the diagnostic arguments completely opaque to the formatter. This would mean that no where clauses will be generated, and unambiguous qualifiers will not be omitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, don't forget JCDiagnostic.MultilineDiagnostic, which we use in overloads to render repetitive fragments in tabular format.
| case Triple(_, A _, _) -> 0; | ||
| case Triple(_, _, A _) -> 0; | ||
| case Triple(A p, C(Nested _, NestedBaseA _), _) -> 0; | ||
| case Triple(A p, C(Nested _, NestedBaseB _), C(Underneath _, NestedBaseA _)) -> 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is Underneath defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a mistake, it should be Nested. Fixed here:
08fdb6d
Thanks!
| } | ||
| """, | ||
| "test.Test.Root(test.Test.R2 _, test.Test.R2(test.Test.Base _, test.Test.R2 _), test.Test.R2(test.Test.R2 _, test.Test.Base _))"); | ||
| //ideally, the result would be as follow, but it is difficult to split Base on two distinct places: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With recursion it should be possible -- but I guess the problem becomes, when do you stop?
e.g.
{ Base } -> { R1, R2 } -> { R1, R2(Base, Base) } -> ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps an idea here (not sure how crazy -- possibly a lot) would be to run an analysis on the original patterns as defined in the source code, and determine the max depth, and then expand "up to that depth".
So, in the above case, you would know that when you get to {R1, R2}, it's "good enough" for the level of depth that is present in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here for the current approach is that the current approach only expands one pattern at a time, but to get the optimal result here, both the test.Test.Base _ patterns would need to be expanded at the same time (as some combinations of these two are required and some are not).
I was thinking of doing a full expansion originally, but I was worried a bit that for fairly small inputs, the number of combinations could be too high. For example, a newly added:
| return switch (b) { |
a relatively naive expansion (expanding components that have record patterns in the original pattern) would lead to a few hundreds of patterns, unless I am mistaken. There may be a way to limit the expansions, though.
We can look into this deeper, surely.
mcimadamore
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is impressive work. In "normal" situations (e.g. switches not too big, or too nested) I can easily imagine the new diagnostics to be a life saver.
There's of course a lot of tinkering and followup work that might be possible, to improve the performance of the analysis, or to fine tune the expansion more to the shape of the code.
For now, I think I see two more general issues that stick out:
- the early flattening to string, which bypasses the diagnostic formatter -- but that should be easy to fix
- the timeout-based strategy. We don't have anything like that anywhere else in the compiler. I think it would be preferrable to have a "variable rate" of accuracy, and maybe limit how the analysis is ran, unless the user really wants to discover every detail. But I'm not sure it's always possible. Cutting on recursion might be a good way to put a ceiling on complexity. Another avenue might be to refuse to expand sealed types that have more than N ermitted subclasses.
| //if there is a binding pattern at a place where the original based patterns | ||
| //have a record pattern, try to expand the binding pattern into a record pattern | ||
| //create all possible combinations of record pattern components: | ||
| Type[] componentTypes = ((ClassSymbol) bp.type.tsym).getRecordComponents() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This same "pattern" of code seems repeated in makePatternDescription -- ideally there should be a way to compute the "instantiated" component types
|
The parent pull request that this pull request depends on has now been integrated and the target branch of this pull request has been updated. This means that changes from the dependent pull request can start to show up as belonging to this pull request, which may be confusing for reviewers. To remedy this situation, simply merge the latest changes from the new target branch into this pull request by running commands similar to these in the local repository for your personal fork: git checkout JDK-8367530
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# if there are conflicts, follow the instructions given by git merge
git commit -m "Merge master"
git push |
|
@lahodaj this pull request can not be integrated into git checkout JDK-8367530
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
|
WOAH Is this fabled "Example Generator" that was talked about in the past? Does this mean that we will be granted at least one example from now on when our Switch is non-exhaustive? If so, this is amazing news! Wow! |
…ternDescriptions through the diagnostics to format it at the end.
|
@lahodaj |
Consider code like:
This is missing a case for
Root(R2(R2 _), R2(R2 _)). javac will produce an error correctly, but the error is not very helpful:The goal of this PR is to improve the error, at least in some cases to something along these lines:
The (very simplified) way it works in a recursive (or induction) way:
This approach relies heavily on our ability to compute exhaustiveness, which is evaluated repeatedly in the process.
There are some cases where the algorithm does not produce ideal results (see the tests), but overall seems much better than what we have now.
Another significant limitation is the speed of the process. Evaluating exhaustiveness is not a fast process, and this algorithm evaluates exhaustiveness repeatedly, potentially for many combinations of patterns (esp. for record patterns). So part of the proposal here is to have a time deadline for the computation. The default is 5s, and can be changed by
-XDexhaustivityTimeout=<timeout-in-ms>.There's also an open possibility for select tools to delay the more detailed computation to some later time, although that would need to be tried and evaluated.
Progress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27256/head:pull/27256$ git checkout pull/27256Update a local copy of the PR:
$ git checkout pull/27256$ git pull https://git.openjdk.org/jdk.git pull/27256/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27256View PR using the GUI difftool:
$ git pr show -t 27256Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27256.diff
Using Webrev
Link to Webrev Comment