8282047: Enhance StringDecode/Encode microbenchmarks #7516
cl4es wants to merge 7 commits into openjdk:master
Ping?
@State(Scope.Thread)
public class StringDecode {

    @Param({"US-ASCII", "ISO-8859-1", "UTF-8", "MS932", "ISO-8859-6", "ISO-2022-KR"})
What would you think of retaining the previous set of charset names as a comment -- as a suggestion for someone who wants additional coverage?
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public void decodeAsciiLong(Blackhole bh) throws Exception {
    bh.consume(new String(longAsciiString, charset));
    bh.consume(new String(longAsciiString, 0, 1024 + 31, charset));
I imagine the 1024+31 addition gets compiled down, and is not executed during the test, right?
Yes, adding two integer literals will be constant folded already by javac:
public static void main(String...args) {
int foo = 1024 + 31;
}
javap -v output:
0: sipush 1055
3: istore_1
4: return
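A small way to check the same point without reading bytecode, offered only as a sketch: Java requires `case` labels to be constant expressions, so a label written as `1024 + 31` compiles only because javac evaluates the sum itself. The class and method names below are hypothetical and are not part of this PR.

```java
// Hypothetical demo class; not part of the PR under review.
class ConstantFoldingDemo {
    // A compile-time constant: javac stores the value 1055, not an addition.
    static final int LENGTH = 1024 + 31;

    static boolean isFoldedLength(int n) {
        switch (n) {
            // A case label must be a constant expression, so this compiles
            // only because javac computes 1024 + 31 at compile time.
            case 1024 + 31:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(LENGTH);
        System.out.println(isFoldedLength(1055));
    }
}
```

The benchmark call `new String(longAsciiString, 0, 1024 + 31, charset)` therefore receives the literal 1055 as its length argument, and no addition happens in the measured code.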
@State(Scope.Thread)
public class StringEncode {

    @Param({"US-ASCII", "ISO-8859-1", "UTF-8", "MS932", "ISO-8859-6"})
Same, re: keeping list as a comment.
@@ -46,22 +36,22 @@
@State(Scope.Thread)
@Setup
public void setup() {
    charset = Charset.forName(charsetName);
    asciiString = LOREM.substring(0, 32).getBytes(charset);
This is problematic IMO in that it's missing short strings such as "Claes". Average Java strings are about 32 bytes long AFAICR, and people writing (vectorized) intrinsics have a nasty habit of optimizing for long strings, to the detriment of typical-length ones.
Whether we like it or not, people will optimize for benchmarks, so it's important that benchmark data is realistic. The shortest here is 15 bytes, as far as I can see. I'd certainly include a short string of just a few bytes so that intrinsics don't cause regressions in important cases.
All good points. I've added a number of such short variants to all(?) relevant microbenchmarks. The tests should now better cover a mix of input lengths and encodings.
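To illustrate what a mix of input lengths could look like, here is a minimal plain-Java sketch. It is not the code added in this PR and deliberately leaves out JMH so that it stays self-contained; the class name, the map keys, and the `LOREM` text are hypothetical stand-ins. Only the 32-character substring and the short "Claes" example come from the discussion above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the real benchmarks keep their inputs in JMH @State fields.
class MixedLengthInputs {
    // Stand-in for the benchmark's LOREM constant (ASCII-only text, assumed).
    static final String LOREM =
            "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
            + "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.";

    // Builds ASCII-only inputs of three sizes, encoded in the given charset.
    static Map<String, byte[]> asciiInputs(Charset charset) {
        StringBuilder longText = new StringBuilder();
        for (int i = 0; i < 16; i++) {
            longText.append(LOREM);
        }
        Map<String, byte[]> inputs = new LinkedHashMap<>();
        inputs.put("short", "Claes".getBytes(charset));                 // a few bytes
        inputs.put("medium", LOREM.substring(0, 32).getBytes(charset)); // typical length
        inputs.put("long", longText.toString().getBytes(charset));      // favors vectorized paths
        return inputs;
    }

    public static void main(String[] args) {
        Charset charset = StandardCharsets.UTF_8;
        for (Map.Entry<String, byte[]> e : asciiInputs(charset).entrySet()) {
            // Decode each input the way a decode benchmark body would.
            String decoded = new String(e.getValue(), charset);
            System.out.println(e.getKey() + ": " + e.getValue().length
                    + " bytes, " + decoded.length() + " chars");
        }
    }
}
```

Measuring each size separately, rather than only the long one, is what lets a regression on short strings show up when an intrinsic is tuned for long inputs.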
@cl4es This change now passes all automated pre-integration checks. This project also has non-automated pre-integration requirements; please see the file CONTRIBUTING.md for details. You can use pull request commands such as /summary, /contributor and /issue to adjust the commit message for the final commit as needed. At the time this comment was updated, 87 new commits had been pushed to the target branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
/integrate
Going to push as commit d4d12ad.
Your commit was automatically rebased without conflicts.
Splitting out these micro changes from #7231
?, so the test is effectively the same as testing ASCII-only.
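A minimal sketch of the effect this remark seems to describe, assuming it refers to characters that the target charset cannot encode: `String.getBytes(Charset)` substitutes the charset's replacement bytes for unmappable characters, which for US-ASCII is `?`. US-ASCII is used here only because the result is easy to see; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the PR.
class UnmappableReplacementDemo {
    // Encodes to US-ASCII and decodes again; unmappable characters come back as '?'.
    static String roundTripAscii(String s) {
        byte[] encoded = s.getBytes(StandardCharsets.US_ASCII);
        return new String(encoded, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u00ef is 'i' with diaeresis, which US-ASCII cannot represent.
        System.out.println(roundTripAscii("na\u00efve"));
    }
}
```

Once every unmappable character has been turned into `?`, the encoded bytes contain only ASCII, so a benchmark fed such data exercises the same path as a plain ASCII input.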
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7516/head:pull/7516
$ git checkout pull/7516
Update a local copy of the PR:
$ git checkout pull/7516
$ git pull https://git.openjdk.java.net/jdk pull/7516/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 7516
View PR using the GUI difftool:
$ git pr show -t 7516
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7516.diff