-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8251989: Hex formatting and parsing utility #482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/csr |
👋 Welcome back rriggs! A progress list of the required criteria for merging this PR into |
@RogerRiggs this pull request will not be integrated until the CSR request JDK-8251991 for issue JDK-8251989 has been approved. |
@RogerRiggs The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
To avoid spamming email lists, I'm removing the i18n, net, nio, and security labels since most developers already are subscribed to core-libs. /label remove i18n, net, nio, security |
@RogerRiggs The The The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class name already has Hex
inside, and methods names still contain it. Also, I found it unexpected that fromHexDigits
returns an integer. Can we use encode
, decode
, decodeInt
, decodeLong
etc?
Including 'Hex' in the method names reinforces the function, without it, more context is required to make the code readable. The decoding of hexadecimal digits to a binary number is at its core untyped and unsigned for all three fromHexDigits methods. The formatHex and parseHex methods operating on byte arrays are quite different than toHexDigits and fromHexDigits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really useful addition! Utility classes for formatting bytes to hex have been re-invented so often already that having this in the JDK would be really great.
Now that the OpenJDK project migrated to GitHub I took the opportunity and wrote this GitHub pull request review (though I am not an OpenJDK contributor). Most things are formatting or documentation related. I hope these comments are useful and are not too intrusive. Please let me know otherwise.
Also is it common practice to use System.out
in JDK tests? In my opinion it often does not add much value once the unit test implementation has been completed because the output is not checked during tests automatically anyways and might only clutter the console output.
The tests are using String.toUpperCase
, toLowerCase
and format
without Locale
(therefore using the default one). Does the test setup guarantee a constant default locale or would be better to include a Locale
to make sure the tests don't break for any unusual locale?
Has it been also considered to add support for parsing from a Reader
to an OutputStream
and from InputStream
to Appendable
to support arbitrary length input and output?
Maybe it would also be good to mention for the method parsing and formatting int
, long
, ... in which byte order the output is created.
And would it be worth supporting a delimiter period or frequency to only apply the delimiter every nth byte? This would be useful when the user want to write hex chars in groups of 4 or 8.
It has been useful to see the output generated and if there are failures there is additional information available.
Good point, the ROOT locale should be used. Though for the cases of the hexadecimal characters, they are consistently treated across all locales.
That can be considered separately, the Jira issue 8254708 will track that enhancement.
The big/little -endian terminology usually applies to binary representations. The natural reading order for numbers is left to right but it can be more explicit.
It is a tradeoff in complexity. The api focuses on the conversion of byte arrays to strings and back. |
…to match Character, remove unnecessary Class name qualifications, etc.
- misc javadoc markup fixes. - added checking of byte array sizes to generate useful exceptions if the arrays would be too large. - Small implementation cleanups
@dfuch I'll add a clarification.
|
… delimiter, or uppercase parameter.
…rsions Switched order of declaration of a couple of method to make the javadoc sequence easier to read
* @throws IllegalArgumentException if the string length is greater than eight (8) or | ||
* if any of the characters is not a hexadecimal character | ||
*/ | ||
public int fromHexDigits(CharSequence string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend this group of methods include an apinote explaining the differences in behavior of compared to parseInt(s, 16) and parseUnsignedInt(s, 16).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add:
* @apiNote
* {@link Integer#parseInt(String, int) Integer.parseInt(s, 16)} and
* {@link Integer#parseUnsignedInt(String, int) Integer.pareUnsignedInt(s, 16)}
* are similar but allow all Unicode hexadecimal digits allowed by
* {@link Character#digit(char, int) Character.digit(ch, 16)}.
* {@code HexFormat} uses only Latin1 hexadecimal characters "0-9, "A-F", and "a-f".
* {@link Integer#parseInt(String, int)} can parse signed hexadecimal strings.
And similar text for Long#parseLong and Long.parseUnsignedLong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay; however, I suggest saying more on the signed/unsigned behavior.
@RogerRiggs This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 4 new commits pushed to the
Please see this link for an up-to-date comparison between the source branch of this pull request and the ➡️ To integrate this PR with the above commit message to the |
/integrate |
@RogerRiggs Since your change was applied there have been 22 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit aa9c136. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
java.util.HexFormat utility:
like System.out are supported in addition to StringBuilder. (IOExceptions are converted to unchecked exceptions)
See the HexFormat javadoc for details.
Review comments and suggestions welcome.
Progress
Issue
Reviewers
Download
$ git fetch https://git.openjdk.java.net/jdk pull/482/head:pull/482
$ git checkout pull/482