
Conversation

@lahodaj
Contributor

@lahodaj lahodaj commented Sep 12, 2025

Consider code like:

package test;
public class Test {
    private int test(Root r) {
        return switch (r) {
            case Root(R2(R1 _), R2(R1 _)) -> 0;
            case Root(R2(R1 _), R2(R2 _)) -> 0;
            case Root(R2(R2 _), R2(R1 _)) -> 0;
        };
    }
    sealed interface Base {}
    record R1() implements Base {}
    record R2(Base b1) implements Base {}
    record Root(R2 b2, R2 b3) {}
}

This is missing a case for Root(R2(R2 _), R2(R2 _)). javac correctly produces an error, but the error is not very helpful:

$ javac test/Test.java
.../test/Test.java:4: error: the switch expression does not cover all possible input values
        return switch (r) {
               ^
1 error
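For reference, here is a runnable sketch of the example with the previously missing case added. The class name TestFixed is made up, and named bindings stand in for the unnamed `_` patterns so the snippet compiles on any Java release with record patterns:

```java
// Sketch of the example above with the missing case added.
// TestFixed is a hypothetical name; bindings (a, b) stand in for `_`.
public class TestFixed {
    sealed interface Base permits R1, R2 {}
    record R1() implements Base {}
    record R2(Base b1) implements Base {}
    record Root(R2 b2, R2 b3) {}

    static int test(Root r) {
        return switch (r) {
            case Root(R2(R1 a), R2(R1 b)) -> 0;
            case Root(R2(R1 a), R2(R2 b)) -> 1;
            case Root(R2(R2 a), R2(R1 b)) -> 2;
            case Root(R2(R2 a), R2(R2 b)) -> 3; // the previously missing case
        };
    }
}
```

With this fourth case present, the switch covers every combination of R1/R2 in both components, so javac no longer reports an error.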

The goal of this PR is to improve the error, at least in some cases to something along these lines:

$ javac test/Test.java 
.../test/Test.java:4: error: the switch expression does not cover all possible input values
        return switch (r) {
               ^
  missing patterns:
      Root(R2(R2 _), R2(R2 _))
1 error

The (very simplified) algorithm works recursively (by induction):

  • start by defining the missing pattern as the binding pattern for the selector type; this would certainly exhaust the switch.
  • for the current missing pattern, try to refine it:
    • if the current type is a sealed type, try to expand it into its (direct) permitted subtypes. Remove those that are not needed.
    • if the current (binding pattern) type is a record type, expand it into a record pattern, generating all possible combinations of its component types based on sealed hierarchies. Remove those that are not needed.
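The sealed-type expansion step can be sketched with a toy model. Everything here is made up for illustration: types are plain strings, and a precomputed `covered` set is a coarse stand-in for the "remove those that are not needed" step, where real javac re-runs its exhaustiveness check (and also expands record patterns, omitted here):

```java
import java.util.*;

public class ExpandSketch {
    // Toy sealed hierarchy: Base permits R1 and R2 (leaf types here).
    static final Map<String, List<String>> PERMITTED =
            Map.of("Base", List.of("R1", "R2"));

    // Repeatedly expand sealed types into their permitted subtypes,
    // dropping subtypes that the user's cases already cover.
    static Set<String> expandMissing(String selectorType, Set<String> covered) {
        Set<String> current = new LinkedHashSet<>(List.of(selectorType));
        boolean changed = true;
        while (changed) {
            changed = false;
            Set<String> next = new LinkedHashSet<>();
            for (String type : current) {
                List<String> subtypes = PERMITTED.get(type);
                if (subtypes == null) {
                    next.add(type);       // leaf: keep the binding pattern
                } else {
                    changed = true;       // sealed: expand and filter
                    for (String s : subtypes) {
                        if (!covered.contains(s)) {
                            next.add(s);
                        }
                    }
                }
            }
            current = next;
        }
        return current;
    }
}
```

For example, if the user's cases cover R1, expanding Base leaves only R2 as the missing pattern.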

This approach relies heavily on our ability to compute exhaustiveness, which is evaluated repeatedly in the process.

There are some cases where the algorithm does not produce ideal results (see the tests), but overall it seems much better than what we have now.

Another significant limitation is the speed of the process. Evaluating exhaustiveness is not fast, and this algorithm evaluates it repeatedly, potentially for many combinations of patterns (especially for record patterns). So part of the proposal here is a time deadline for the computation. The default is 5 s, and it can be changed with -XDexhaustivityTimeout=<timeout-in-ms>.
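A minimal sketch of the deadline idea (all names here are hypothetical; javac's actual implementation differs): each "step" stands for one expansion attempt, and on timeout the best result computed so far is reported instead of aborting the diagnostic entirely.

```java
import java.util.*;

public class DeadlineSketch {
    // Hypothetical stand-in for the refinement loop: each step would be one
    // expansion attempt in the real algorithm. On timeout, the best result
    // computed so far is returned rather than nothing.
    static Set<String> refineWithDeadline(List<String> steps, long deadlineNanos) {
        Set<String> bestSoFar = new LinkedHashSet<>();
        for (String step : steps) {
            if (System.nanoTime() - deadlineNanos > 0) { // overflow-safe check
                return bestSoFar;
            }
            bestSoFar.add(step);
        }
        return bestSoFar;
    }
}
```

Comparing `System.nanoTime()` against the deadline via subtraction keeps the check correct even if the nanosecond counter wraps around.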

There's also an open possibility for select tools to delay the more detailed computation to some later time, although that would need to be tried and evaluated.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8367530: The exhaustiveness errors could be improved (Enhancement - P2)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27256/head:pull/27256
$ git checkout pull/27256

Update a local copy of the PR:
$ git checkout pull/27256
$ git pull https://git.openjdk.org/jdk.git pull/27256/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27256

View PR using the GUI difftool:
$ git pr show -t 27256

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27256.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Sep 12, 2025

👋 Welcome back jlahoda! A progress list of the required criteria for merging this PR into pr/27253 will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Sep 12, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk

openjdk bot commented Sep 12, 2025

⚠️ @lahodaj This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@openjdk

openjdk bot commented Sep 12, 2025

@lahodaj The following label will be automatically applied to this pull request:

  • compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the compiler compiler-dev@openjdk.org label Sep 12, 2025
@lahodaj
Contributor Author

lahodaj commented Nov 7, 2025

Thanks for the comments so far. I think I've updated the code where needed based on them, and tried to respond. Thanks!

tree.isExhaustive = tree.hasUnconditionalPattern ||
TreeInfo.isErrorEnumSwitch(tree.selector, tree.cases);
if (exhaustiveSwitch) {
tree.isExhaustive |= exhaustiveness.exhausts(tree.selector, tree.cases);
Member

@biboudis biboudis Nov 13, 2025


This |= turned into =. Is that correct? I think so — it is now guarded by tree.isExhaustive itself, right?

Contributor Author


Yes, this is intentional. There is one additional if (!tree.isExhaustive) test below, which replaces the use of the |= (and is faster when we already know the switch is exhaustive).
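The distinction being discussed can be sketched as follows (hypothetical names; the point is only that the guarded form skips evaluating the right-hand side when the flag is already set, whereas `|=` always evaluates it):

```java
public class GuardVsOrAssign {
    static int calls = 0;

    // Stands in for a costly call such as exhaustiveness checking.
    static boolean expensive() {
        calls++;
        return true;
    }

    // `x |= expensive()` always evaluates the right-hand side.
    static boolean orAssign(boolean x) {
        x |= expensive();
        return x;
    }

    // The guarded form skips the call when x is already known to be true.
    static boolean guarded(boolean x) {
        if (!x) {
            x = expensive();
        }
        return x;
    }
}
```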

}

/* The strictness of determining the equivalence of patterns, used in
* nestedComponentsEquivalent.
Member


Also in computeCoverage. Can you provide an example/sentence of two strictly equivalent patterns and two loosely equivalent?

Comment on lines 1174 to 1179
/*
* Based on {@code basePattern} generate new {@code RecordPattern}s such that all
* components instead of {@code replaceComponent}th component, which is replaced
* with values from {@code updatedNestedPatterns}. Resulting {@code RecordPatterns}s
* are sent to {@code target}.
*/
Member


/*
 * Using {@code basePattern} as a starting point, generate new {@code
 * RecordPattern}s, such that all corresponding components but one, are the
 * same. The component described by the {@code replaceComponent} index is
 * replaced with all {@code PatternDescription}s taken from {@code
 * updatedNestedPatterns} and the resulting {@code RecordPatterns}s are sent
 * to {@code target}.
 */

}
}

//assert?
Member


Address this todo?

lahodaj and others added 2 commits November 13, 2025 18:05
Co-authored-by: Aggelos Biboudis <biboudis@gmail.com>
@openjdk openjdk bot removed the rfr Pull request is ready for review label Nov 13, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 13, 2025

return currentMissingPatterns;
} else if ((bp.type.tsym.flags_field & Flags.RECORD) != 0 &&
//only expand record types into record patterns if there's a chance it may change the outcome
Contributor


I agree with this choice


removeUnnecessaryPatterns(selectorType, bp, basePatterns, inMissingPatterns, combinatorialPatterns);

CoverageResult coverageResult = computeCoverage(targetType, combinatorialPatterns, PatternEquivalence.LOOSE);
Contributor


How can the expansion of a record pattern not cover? E.g. if we start with an exhaustive pattern R r and we expand R into all its components R(A1, B1, C1), R(A2, B2, C2) ... -- how can we get into a situation where we're no longer exhaustive?

Contributor Author


The purpose of the code here (and a few lines below) is to merge "unnecessarily" specific patterns inside the missing patterns into their more generic supertypes. For example, if at a specific place in the record patterns there are both A and B, permitted subtypes of Base, we could merge them and replace them with Base. But I understand it is somewhat a matter of opinion which is better.

Note the "unnecessary" combinatorialPatterns (i.e. those that are covered by the user-provided patterns) are already removed here, and computeCoverage is run on combinatorialPatterns, so those may not cover the original type, because they no longer contain anything covered by the user-provided patterns.

One example where this changes the outcomes is:

public void testComplex1(Path base) throws Exception {

where the current outcome is:

test.Test.Root(test.Test.R2 _, test.Test.Base _, test.Test.Base _)
test.Test.Root(test.Test.R3 _, test.Test.Base _, test.Test.Base _)

but if I disable this merging, we would get:

test.Test.Root(test.Test.R3 _, test.Test.R1 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R1 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R1 _, test.Test.R3 _)
test.Test.Root(test.Test.R3 _, test.Test.R2 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R1 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R2 _, test.Test.R2 _)
test.Test.Root(test.Test.R3 _, test.Test.R1 _, test.Test.R3 _)
test.Test.Root(test.Test.R3 _, test.Test.R3 _, test.Test.R3 _)
test.Test.Root(test.Test.R2 _, test.Test.R2 _, test.Test.R3 _)
test.Test.Root(test.Test.R2 _, test.Test.R3 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R2 _, test.Test.R3 _)
test.Test.Root(test.Test.R3 _, test.Test.R3 _, test.Test.R1 _)
test.Test.Root(test.Test.R3 _, test.Test.R3 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R1 _, test.Test.R1 _)
test.Test.Root(test.Test.R2 _, test.Test.R2 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R2 _, test.Test.R1 _)
test.Test.Root(test.Test.R2 _, test.Test.R3 _, test.Test.R2 _)
test.Test.Root(test.Test.R2 _, test.Test.R3 _, test.Test.R3 _)

There may be a way to better control this merging (given the original user-provided pattern has _ on both the second and third component), but given how coverage works for record patterns, it is a bit tricky.
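A toy model of that merging step (all names hypothetical, and a coarse simplification of the actual code): when every permitted subtype of a sealed type appears in the missing set at the same position, they can be collapsed into the single supertype.

```java
import java.util.*;

public class MergeSketch {
    // If every permitted subtype of `sealedType` is present in `missing`,
    // replace them all with the single, more generic supertype.
    static Set<String> merge(Set<String> missing, String sealedType,
                             List<String> permittedSubtypes) {
        if (missing.containsAll(permittedSubtypes)) {
            Set<String> merged = new LinkedHashSet<>(missing);
            merged.removeAll(permittedSubtypes);
            merged.add(sealedType);
            return merged;
        }
        return missing;
    }
}
```

When only some subtypes are missing (as with R2 and R3 in the first component of the example above, where R1 is covered), no merge applies and the specific subtypes are reported individually.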

Contributor


Thanks, I think my question originated from a wrong assumption about how your code behaved -- after some debugging sessions I've seen cases where the set of patterns does not cover the target.

incompletePatterns,
Set.of(defaultPattern));
} catch (TimeoutException ex) {
return ex.missingPatterns != null ? ex.missingPatterns : Set.of();
Contributor


Instead of a timeout, I wonder if you could cut the recursion at a specific threshold. It seems to me that recursing more provides more precision at the nested level, so it's a trade-off as to when we want to stop.

Overload resolution provides some kind of precedent:

error: incompatible types: String cannot be converted to int
      m("Hello");
        ^
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
1 error

(We "compress" the diagnostic whenever we can somehow figure out if an overload is "better" than the others). Then if you provide the option, you get the full thing:

error: no suitable method found for m(String)
      m("Hello");
      ^
    method Test.m() is not applicable
      (actual and formal argument lists differ in length)
    method Test.m(int) is not applicable
      (argument mismatch; String cannot be converted to int)
    method Test.m(int,int) is not applicable
      (actual and formal argument lists differ in length)
    method Test.m(int,int,int) is not applicable
      (actual and formal argument lists differ in length)
    method Test.m(int,int,int,int) is not applicable
      (actual and formal argument lists differ in length)
    method Test.m(int,int,int,int,int) is not applicable
      (actual and formal argument lists differ in length)
    method Test.m(int,int,int,int,int,int) is not applicable
      (actual and formal argument lists differ in length)

But, also, maybe putting an upper bound on the recursion, no matter what, might be a good idea?

Contributor Author


While I agree that having a timeout is weird/non-standard in javac, I believe this is the first time we have a process that a) can run for a very long time and b) is not required for correctness. In all other cases I can recall, if some process is slow (including e.g. verifying exhaustiveness), it is required for correctness, and so we cannot skip it based on criteria like time.

The timeout here provides a way to say how much real-world time the user is willing to invest to get the outcome - if more time is invested, more detailed missing patterns may be computed. It is a somewhat weird approach for javac, but it is a limit that (I think) the user and javac can agree on.

We could introduce a limit on e.g. the depth to which we expand, but adding one level of nesting may affect performance significantly or almost not at all, depending e.g. on the form/shape of the record type.

E.g. for a record with many components, where each component is a sealed type with many permitted subtypes, one level of nesting may lead to a high number of newly generated patterns, possibly taking a lot of time to go through. But a record that has a single component that is not a sealed type should only generate one pattern, and so have minimal impact.
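The blow-up being described is just a product over the components; a back-of-the-envelope sketch (an assumed simplification: each component contributes a factor equal to the number of alternatives it expands into):

```java
public class ExpansionCount {
    // Number of record patterns generated when each component is expanded
    // into alternatives[i] possible nested patterns.
    static long combinations(int... alternatives) {
        long product = 1;
        for (int a : alternatives) {
            product *= a;
        }
        return product;
    }
}
```

So a record with three components, each a sealed type with three permitted subtypes, already yields 27 candidate patterns from one expansion, while a record with a single non-sealed component yields just one.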

Symbol enumType = bp.type.tsym;
return enum2Constants.get(enumType).stream().map(c -> enumType.toString() + "." + c.name);
} else {
return Stream.of(pd.toString());
Contributor


As discussed offline, eager string generation will render the diagnostic arguments completely opaque to the formatter. This would mean that no where clauses will be generated, and unambiguous qualifiers will not be omitted.

Contributor


Also, don't forget JCDiagnostic.MultilineDiagnostic, which we use in overloads to render repetitive fragments in tabular format.

case Triple(_, A _, _) -> 0;
case Triple(_, _, A _) -> 0;
case Triple(A p, C(Nested _, NestedBaseA _), _) -> 0;
case Triple(A p, C(Nested _, NestedBaseB _), C(Underneath _, NestedBaseA _)) -> 0;
Contributor


Where is Underneath defined?

Contributor Author


That's a mistake, it should be Nested. Fixed here:
08fdb6d
Thanks!

}
""",
"test.Test.Root(test.Test.R2 _, test.Test.R2(test.Test.Base _, test.Test.R2 _), test.Test.R2(test.Test.R2 _, test.Test.Base _))");
//ideally, the result would be as follow, but it is difficult to split Base on two distinct places:
Contributor

@mcimadamore mcimadamore Nov 14, 2025


With recursion it should be possible -- but I guess the problem becomes, when do you stop?

e.g.
{ Base } -> { R1, R2 } -> { R1, R2(Base, Base) } -> ...

Contributor


Perhaps an idea here (not sure how crazy -- possibly a lot) would be to run an analysis on the original patterns as defined in the source code, and determine the max depth, and then expand "up to that depth".

So, in the above case, you would know that when you get to {R1, R2}, it's "good enough" for the level of depth that is present in the code.
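The "max depth present in the source" idea could be sketched over a toy pattern model (all types here are made up for illustration, not javac's internal classes):

```java
import java.util.List;

public class DepthSketch {
    // Toy pattern model: a pattern is either a binding pattern or a record
    // pattern with nested component patterns.
    sealed interface Pat permits Bind, Rec {}
    record Bind(String type) implements Pat {}
    record Rec(String type, List<Pat> components) implements Pat {}

    // Maximum nesting depth of a pattern; the expansion could then be
    // cut off once the missing patterns reach this depth.
    static int depth(Pat p) {
        return switch (p) {
            case Bind b -> 1;
            case Rec r -> 1 + r.components().stream()
                                .mapToInt(DepthSketch::depth)
                                .max().orElse(0);
        };
    }
}
```

For instance, the pattern Root(R2 _, R2(R1 _)) has depth 3, so expansion beyond three levels of nesting would be "deeper than the code itself" and could be skipped.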

Contributor Author


The problem for the current approach is that it only expands one pattern at a time, but to get the optimal result here, both of the test.Test.Base _ patterns would need to be expanded at the same time (as some combinations of these two are required and some are not).

I was thinking of doing a full expansion originally, but I was worried a bit that for fairly small inputs, the number of combinations could be too high. For example, a newly added:

a relatively naive expansion (expanding components that have record patterns in the original pattern) would lead to a few hundred patterns, unless I am mistaken. There may be a way to limit the expansions, though.

We can look into this deeper, surely.

Contributor

@mcimadamore mcimadamore left a comment


I think this is impressive work. In "normal" situations (e.g. switches that are not too big or too nested) I can easily imagine the new diagnostics being a life saver.

There's of course a lot of tinkering and follow-up work that might be possible, to improve the performance of the analysis or to fine-tune the expansion to the shape of the code.

For now, I think I see two more general issues that stick out:

  • the early flattening to string, which bypasses the diagnostic formatter -- but that should be easy to fix
  • the timeout-based strategy. We don't have anything like that anywhere else in the compiler. I think it would be preferable to have a "variable rate" of accuracy, and maybe limit how the analysis is run, unless the user really wants to discover every detail. But I'm not sure it's always possible. Cutting on recursion might be a good way to put a ceiling on complexity. Another avenue might be to refuse to expand sealed types that have more than N permitted subclasses.

//if there is a binding pattern at a place where the original based patterns
//have a record pattern, try to expand the binding pattern into a record pattern
//create all possible combinations of record pattern components:
Type[] componentTypes = ((ClassSymbol) bp.type.tsym).getRecordComponents()
Contributor


This same "pattern" of code seems repeated in makePatternDescription -- ideally there should be a way to compute the "instantiated" component types

@openjdk-notifier openjdk-notifier bot changed the base branch from pr/27247 to master November 18, 2025 14:06
@openjdk-notifier

The parent pull request that this pull request depends on has now been integrated and the target branch of this pull request has been updated. This means that changes from the dependent pull request can start to show up as belonging to this pull request, which may be confusing for reviewers. To remedy this situation, simply merge the latest changes from the new target branch into this pull request by running commands similar to these in the local repository for your personal fork:

git checkout JDK-8367530
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# if there are conflicts, follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk

openjdk bot commented Nov 18, 2025

@lahodaj this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8367530
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Nov 18, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Nov 18, 2025
@davidalayachew
Contributor

WOAH

Is this the fabled "Example Generator" that was talked about in the past?

Does this mean that we will be granted at least one example from now on when our Switch is non-exhaustive?

If so, this is amazing news! Wow!

@openjdk openjdk bot added the build build-dev@openjdk.org label Nov 25, 2025
@openjdk

openjdk bot commented Nov 25, 2025

@lahodaj build has been added to this pull request based on files touched in new commit(s).
