8065554: MatchResult should provide values of named-capturing groups #10000
👋 Welcome back rgiulietti! A progress list of the required criteria for merging this PR into |
@rgiulietti The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
* this method to work. However, overriding this method directly
* might be preferable for other reasons.
*
* @since 20
Should the method declare that it throws UnsupportedOperationException? Because that is what will happen if namedGroups is not overridden/implemented.
Same comment for the other new methods.
Not sure.
If the convention is to document every single RuntimeException that methods invoked by this one could throw, then yes.
In other words, should RuntimeExceptions thrown deep in an invocation stack be documented in every caller method?
In principle, yes. In practice, I see that namedGroups doesn't have an @throws UnsupportedOperationException but has an @implSpec that says that the default implementation throws UnsupportedOperationException. This seems strange to me - maybe @stuart-marks or @jddarcy can comment.
What I was hinting at here however is that we might want to extend the @implSpec of the new methods to note that these methods will throw UnsupportedOperationException if namedGroups is not implemented (like the @implSpec of namedGroups does).
Ah, I see.
So what you mean is not adding another @throws clause but to either improve @implNote or, better, to add @implSpec.
I left some inline comments on the @implSpec. But I do think that these methods require @throws UnsupportedOperationException for the cases where they don't support named groups.
Addressed concerns about undocumented |
* @implSpec
* The default implementation of this method throws
* {@link UnsupportedOperationException} if {@link #namedGroups()} is not
* overridden.
The essential thing for @implSpec is to describe "self-use" of methods on this object. This is important for subclassers to know whether they can inherit the default implementation or whether they should override it. It looks like start(String) does the following:
- calls namedGroups() to obtain a mapping from group names to group numbers, propagating UOE if namedGroups() throws it
- if name is not present in the group map, throws IAE
- calls start(int) on the group number obtained from the map, and returns that value
I don't think we need to go to the level of detail about whether get or containsKey is called on the map, but I think the self-calls to namedGroups() and start(int) are important.
Similar comments apply to the @implSpec comments of end(String) and group(String).
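The self-use described in this comment can be sketched as follows. This is only an illustration under stated assumptions: `NamedGroupResult` and `FixedResult` are hypothetical names, not the JDK's `MatchResult` or the code in this PR, and the exception messages are made up.

```java
import java.util.Map;

// Hypothetical interface mirroring the shape under review; NOT java.util.regex.MatchResult.
interface NamedGroupResult {
    // Start index of the group with the given number.
    int start(int group);

    // Default: named groups are unsupported unless an implementation overrides this.
    default Map<String, Integer> namedGroups() {
        throw new UnsupportedOperationException("namedGroups() not implemented");
    }

    // Self-use: calls namedGroups(), then start(int).
    default int start(String name) {
        Map<String, Integer> groups = namedGroups(); // UOE propagates from here
        Integer number = groups.get(name);
        if (number == null) {
            throw new IllegalArgumentException("No group with name <" + name + ">");
        }
        return start(number);
    }
}

// Overrides only namedGroups() and inherits start(String) from the interface.
final class FixedResult implements NamedGroupResult {
    @Override
    public int start(int group) {
        return group * 10; // arbitrary values, just to make the delegation visible
    }

    @Override
    public Map<String, Integer> namedGroups() {
        return Map.of("year", 1, "month", 2);
    }
}
```

A subclasser reading an @implSpec that names both self-calls can tell that overriding namedGroups() alone is enough for start(String) to work.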
* The default implementation of this method makes use of the map returned
* by {@link #namedGroups()}. It is thus sufficient to override
* {@link #namedGroups()} for this method to work. However, overriding this
* method directly might be preferable for performance or for other reasons.
This @implNote text is repeated in three different methods. Consider moving this to the class specification. It might make it a bit easier for implementors to see a central overview instead of having this information in each method.
* {@link UnsupportedOperationException}.
*
* @apiNote
* This method must be overridden by an implementation.
This is a bit odd. It sounds like existing MatchResult implementations (outside the JDK) are now invalid. I think it really means something like, "This method must be overridden by an implementation in order to provide valid information about whether this MatchResult contains a match." I'm not sure whether saying this is necessary; it could be omitted.
Probably also needs an @throws UnsupportedOperationException in case the match information is unavailable.
r.end("noSuchGroup");
r.group("noSuchGroup");
} catch (IllegalArgumentException e) { // swallowing intended
}
If MatchResult is behaving correctly, the call to r.start("noSuchGroup") will always throw an exception and the subsequent calls to r.end and r.group will never be executed. This potentially will miss testing of those methods.
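One way to avoid the skipped calls is to check each method in its own try block. The sketch below is not the test in this PR: `NoSuchGroupChecks` and `expectIAE` are hypothetical names, and it exercises `Matcher`, whose name-based accessors already existed before this change.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NoSuchGroupChecks {
    // Hypothetical helper: passes only if the action throws IllegalArgumentException.
    static void expectIAE(Runnable action) {
        try {
            action.run();
        } catch (IllegalArgumentException expected) {
            return;
        }
        throw new AssertionError("expected IllegalArgumentException");
    }

    // Runs each check separately and returns how many were executed.
    static int checkAll() {
        Matcher m = Pattern.compile("(?<digits>\\d+)").matcher("abc123");
        if (!m.find()) {
            throw new AssertionError("pattern should match");
        }
        int checks = 0;
        expectIAE(() -> m.start("noSuchGroup"));
        checks++;
        expectIAE(() -> m.end("noSuchGroup"));
        checks++;
        expectIAE(() -> m.group("noSuchGroup"));
        checks++;
        return checks;
    }
}
```

Because every call gets its own helper invocation, an exception from start(String) no longer hides end(String) and group(String).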
result.end("noSuchGroup");
result.group("noSuchGroup");
} catch (IllegalArgumentException e) { // swallowing intended
}
Similar issue here as above.
Overall the specs, code, and tests look pretty good. I do think some areas of the spec need updating; sorry I didn't get to this before you created the CSR. The test is OK, but it's starting to get to the point where it would be profitable to use TestNG data providers to collapse some of the test cases. We have two implementations of MatchResult: one is Matcher itself, and the other is the internal implementation returned by toMatchResult(). The setup for them differs somewhat, but the assertions should all be the same. This is kind of hard to see with separate test methods for Matcher and MatchResult. The test is reasonable as it stands, but we'll see what it looks like after the cases for checking exceptions from start(String), end(String), and group(String) are expanded.
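The idea of sharing assertions across both implementations can be sketched in plain Java, with a list standing in for a TestNG data provider. `SharedAssertions` is a hypothetical name, and the checks use only group numbers so that the sketch does not depend on the methods this PR adds.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SharedAssertions {
    // The same checks, applied to any MatchResult implementation.
    static void check(MatchResult r) {
        if (r.groupCount() != 1) {
            throw new AssertionError("groupCount");
        }
        if (!"123".equals(r.group(1))) {
            throw new AssertionError("group(1)");
        }
        if (r.start(1) != 3 || r.end(1) != 6) {
            throw new AssertionError("start(1)/end(1)");
        }
    }

    // Feeds both implementations through the shared checks; returns how many were checked.
    static int runAll() {
        Matcher m = Pattern.compile("(?<digits>\\d+)").matcher("abc123");
        if (!m.find()) {
            throw new AssertionError("pattern should match");
        }
        // One entry is the Matcher itself, the other the snapshot from toMatchResult().
        List<MatchResult> results = List.of(m, m.toMatchResult());
        for (MatchResult r : results) {
            check(r);
        }
        return results.size();
    }
}
```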
Addressed concerns about spec details.
Good updates to the test, the |
If the most recent commit is OK in terms of Javadoc, I'll update the CSR accordingly
Spec looks good. Let me know when you're done updating the CSR so I can mark it reviewed.
testMatchResultStartEndGroup1();
testMatchResultStartEndGroup2();
testMatchResultStartEndGroup3();
}
The three numbered tests here are a little hard to follow. Looks like these tests are
- test existing group names, with no match
- test existing group names, with a successful match
- test nonexistent group names, with a successful match
On test names, sometimes people provide extremely verbose test names such as testThatExistingGroupNameWithMatchReturnsNegativeOrNull which I think is overkill, but having a name that's somewhat descriptive would be helpful.
It looks like a case is missing, which is a test for a nonexistent group name on a MatchResult that doesn't have a successful match. I'm not sure which is checked first; I think the implementation would throw IAE, because of the nonexistent name, regardless of whether or not the MatchResult has a match.
However, I don't think we've specified this, and in fact I don't think we want to. In general though, if multiple error conditions can arise in the same operation, the general style is not to constrain implementations to check for things in a particular order. Thus either IAE or ISE would be acceptable. Perhaps a test should be added for that. (Hm, might want to take another look at the specs regarding this issue.)
> (Hm, might want to take another look at the specs regarding this issue.)
Not sure who wants to take another look. If that's you, then I'll wait with the CSR.
I'll change the method names to something a bit more descriptive.
Sorry I should have been more specific. "Somebody" should take another look. :-) Well, anyway, I did, and the specification as written does not indicate which error condition is checked first. I think this is OK, so I don't think any changes are necessary. You might mention this in the text of the CSR; I know that Joe and I have discussed this issue previously, and he might have a recommendation.
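A test for the missing case (nonexistent group name, no successful match) could accept either exception, since the specification does not say which condition is checked first. The sketch below assumes hypothetical names `EitherExceptionCheck` and `expectIaeOrIse`, and uses `Matcher` rather than the new default methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EitherExceptionCheck {
    // Hypothetical helper: returns the simple name of the exception if it is IAE or ISE.
    static String expectIaeOrIse(Runnable action) {
        try {
            action.run();
        } catch (IllegalArgumentException | IllegalStateException expected) {
            return expected.getClass().getSimpleName();
        }
        throw new AssertionError("expected IAE or ISE");
    }

    // Nonexistent group name on a matcher that has no successful match.
    static String nonexistentNameWithoutMatch() {
        Matcher m = Pattern.compile("(?<digits>\\d+)").matcher("abc");
        if (m.find()) {
            throw new AssertionError("pattern should not match");
        }
        return expectIaeOrIse(() -> m.group("noSuchGroup"));
    }
}
```

Which of the two exceptions is reported may differ between implementations and releases, which is exactly why the check does not pin one down.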
testMatchResultHasMatch();

testMatchResultStartEndGroup();
}
I haven't gone through all the tests in great detail (yet), but it occurs to me that we potentially have THREE implementations of some of the logic, and strictly speaking we should test all the code paths. The three implementations are in:
- Matcher
- Matcher$ImmutableMatchResult
- MatchResult's default methods
I took a quick look and it looks like Matcher and Matcher$ImmutableMatchResult override the default methods, so the default methods themselves need to be tested. This is essentially testing the @implSpec. The typical way to do that is to have the test create its own MatchResult implementation(s). There might need to be implementations that do and do not implement namedGroups, in order to test UOE. They might also need some state to cover various cases of no-match, has-match with and without group names, etc.
An implementation that overrides namedGroups without overriding the other methods accepting group names is Matcher$ImmutableMatchResult, which is already exercised in the tests.
OK, as long as all the code paths are covered.
Added an implementation of |
CSR updated to current status.
* @return an unmodifiable map from capturing group names to group numbers
*
* @throws UnsupportedOperationException
* The default implementation of this method always throws
The @throws clause needs to state a general contract over all implementations, so it should say something like, "@throws UOE if the implementation does not support named groups". Then, the specification for the default implementation always throwing UOE should be moved to @implSpec.
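The split between the two tags might look roughly like this. It is a sketch of the layout only: `GroupNames` is a hypothetical interface, and the wording is not the final Javadoc of this PR.

```java
import java.util.Map;

// Hypothetical interface used only to illustrate the Javadoc layout.
interface GroupNames {
    /**
     * Returns a map from capturing group names to group numbers.
     *
     * @implSpec
     * The default implementation of this method always throws
     * {@link UnsupportedOperationException}.
     *
     * @return an unmodifiable map from capturing group names to group numbers
     * @throws UnsupportedOperationException if the implementation does not
     *         support named groups
     */
    default Map<String, Integer> namedGroups() {
        // Behaviour of the default only; the @throws clause above covers all implementations.
        throw new UnsupportedOperationException("named groups not supported");
    }
}
```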
Done
* @return whether {@code this} contains a valid match
*
* @throws UnsupportedOperationException
* The default implementation of this method always throws
Similar comment here as above, though it has nothing to do with named groups. The @throws clause should say it throws UOE if "the implementation cannot report whether or not it has a match" or some such. This is a bit odd, but the specification needs to be permissive enough so that it doesn't invalidate existing implementations outside the JDK.
As before, the "always throws" should be moved to @implSpec.
Done
OK I took another look at everything and I think this is good to integrate.
The tests seem adequate but it seems like they would benefit from some refactoring. It might be an interesting exercise to revisit them and try out the new JUnit 5 APIs.
@rgiulietti This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 401 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
I'll integrate now but agree to open another PR to make use of JUnit in the tests, perhaps later this week or next week.
/integrate
Going to push as commit ce85cac.
Your commit was automatically rebased without conflicts.
@rgiulietti Pushed as commit ce85cac. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Add support for named groups to java.util.regex.MatchResult
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10000/head:pull/10000
$ git checkout pull/10000
Update a local copy of the PR:
$ git checkout pull/10000
$ git pull https://git.openjdk.org/jdk pull/10000/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 10000
View PR using the GUI difftool:
$ git pr show -t 10000
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10000.diff