8316734: URLEncoder should specify that replacement bytes will be used in case of coding error #16709
👋 Welcome back dclarke! A progress list of the required criteria for merging this PR into
@DarraghClarke The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
 * @throws IllegalArgumentException if the implementation encounters illegal
 * characters
 * @throws IllegalArgumentException if the implementation encounters malformed
 * escape sequences
The method specifies that it throws IAE, the implNote seems to be saying the same thing, do I read this correctly? I'm wondering if the implNote can be removed.
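For readers following the @throws discussion, here is a minimal sketch of the cases where URLDecoder.decode raises IllegalArgumentException. The class name MalformedEscapeDemo, the helper rejects, and the sample inputs are illustrative assumptions and are not taken from this PR.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK or of this PR.
class MalformedEscapeDemo {
    // Returns true if decoding the input throws IllegalArgumentException.
    static boolean rejects(String input) {
        try {
            URLDecoder.decode(input, StandardCharsets.UTF_8);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects("%zz"));   // non-hex digits after '%'
        System.out.println(rejects("%4"));    // incomplete trailing escape
        System.out.println(rejects("a%20b")); // well-formed escape, decodes normally
    }
}
```

The first two inputs are malformed escape sequences, so the exception is part of the method's contract rather than an implementation detail, which is the argument for moving it from @implNote to @throws.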
That's a good point. I see that there's another decode(String, String) method above in this file that has the same old @implNote but not @throws. Maybe the implNote should be removed there too and the @throws added.
Not sure it's worth touching the first @Deprecated decode(String) method though. Opinions?
> That's a good point. I see that there's another decode(String, String) method above in this file that has the same old @implNote but not @throws. Maybe the implNote should be removed there too and the @throws added. Not sure it's worth touching the first @Deprecated decode(String) method though. Opinions?
Since we are editing these method descriptions, it's probably best to add the throws IAE to the other 2-arg decode method. I suppose the 1-arg/deprecated decode method should document the exception too, though that doesn't need to be done in this PR of course.
I'd be happy to change them all in this PR if there are no objections.
Yes - let's fix it all here.
 * <p>
 * If any consecutive well-formed escape sequences cannot
 * be decoded as a sequence of characters in the supplied {@code Charset}
 * {@linkplain java.nio.charset.CharsetDecoder##cae the replacement character} will be used.
I think it would be a bit clearer to say that erroneous bytes are replaced with the Charset's replacement value.
Thanks for the suggestion, I just wanted to make sure I was understanding you correctly before committing the change.
Would it be something like this?
* Erroneous bytes are replaced with the supplied {@code Charset}'s
* {@linkplain java.nio.charset.CharsetDecoder##cae replacement value}.
I am OK with new text on the condition it is moved inside the paragraph that talks about decoding (appended to lines 147-148 above):
* The supplied charset is used to determine
* what characters are represented by any consecutive escape sequences of
* the form "<i>{@code %xy}</i>". Erroneous bytes are replaced with the
* supplied {@code Charset}'s {@linkplain java.nio.charset.CharsetDecoder##cae
* replacement value}.
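To illustrate the behavior this wording describes, here is a minimal sketch assuming UTF-8 as the supplied charset. The class DecodeReplacementDemo and the helper decodeUtf8 are hypothetical names used only for this example; the byte 0xFF is chosen because it can never appear in well-formed UTF-8.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK or of this PR.
class DecodeReplacementDemo {
    static String decodeUtf8(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // %C3%A9 is the UTF-8 encoding of U+00E9, so it decodes cleanly.
        System.out.println(decodeUtf8("caf%C3%A9").equals("caf\u00e9"));

        // %FF is a well-formed escape, but the byte 0xFF is not valid UTF-8.
        // No exception is thrown; the byte is replaced with U+FFFD.
        System.out.println(decodeUtf8("a%FFb").equals("a\uFFFDb"));
    }
}
```

The escape sequence itself is syntactically valid in the second case, which is why the outcome is a replacement rather than an IllegalArgumentException.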
LGTM. Maybe wait until @AlanBateman has had a chance to re-review before integrating.
@@ -204,6 +204,9 @@ public static String encode(String s, String enc)
 * "http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars">
 * World Wide Web Consortium Recommendation</a> states that
 * UTF-8 should be used. Not doing so may introduce incompatibilities.</em>
 * <p>
 * If a character needs encoding but cannot be encoded, the
 * {@linkplain CharsetEncoder##cae replacement bytes} will be used.
I think this text will appear in the "Note" section of the method description. We are adding normative text, so I think it would be better if the new text went into the first paragraph, or we could introduce a new paragraph before the "Note". We could replace the "Note" heading with @apiNote if you want to clean this up.
As regards the text, I think it would be more correct to say that if the input string is malformed, or if the input cannot be mapped to a valid byte sequence in the given charset, then the erroneous input will be replaced with the charset's replacement value.
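A minimal sketch of the unmappable-input case, assuming US-ASCII as the charset because its replacement byte is '?' (0x3F). The class EncodeReplacementDemo and the helper encodeAscii are hypothetical names for illustration only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK or of this PR.
class EncodeReplacementDemo {
    static String encodeAscii(String s) {
        return URLEncoder.encode(s, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Characters that US-ASCII can represent are handled as usual.
        System.out.println(encodeAscii("a b").equals("a+b"));

        // U+00E9 has no US-ASCII mapping. No exception is thrown; the
        // charset's replacement byte '?' is emitted and percent-encoded.
        System.out.println(encodeAscii("caf\u00e9").equals("caf%3F"));
    }
}
```

Because the output is indistinguishable from an encoded literal question mark, callers cannot detect the substitution after the fact, which is a reason to state it in the normative part of the specification.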
Thanks for the input Alan, I pushed a commit that makes use of @apiNote and changed the wording of the text. Let me know if there is anything else that could be improved.
Thanks for taking the feedback on this, I think both classes look much better now.
@DarraghClarke This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 162 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the
/integrate
Going to push as commit 48960df.
Your commit was automatically rebased without conflicts.
@DarraghClarke Pushed as commit 48960df. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
Currently the descriptions of URLEncoder.encode and URLDecoder.decode don't specify their use of replacement bytes or replacement character when they cannot handle a character or sequence of bytes. This is longstanding behavior but needs to be documented.
Solution
Added a new line to the URLEncoder.encode API documentation to document that the charset's replacement bytes are used. Also changed the URLDecoder.decode API documentation to document its use of the charset's replacement character, and changed some wording.
Progress
Issues
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16709/head:pull/16709
$ git checkout pull/16709
Update a local copy of the PR:
$ git checkout pull/16709
$ git pull https://git.openjdk.org/jdk.git pull/16709/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16709
View PR using the GUI difftool:
$ git pr show -t 16709
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16709.diff