Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDK-8260030: Improve stringStream buffer handling #2160

Conversation

@tstuefe
Copy link
Member

@tstuefe tstuefe commented Jan 20, 2021

stringStream objects are used as temporary string buffers a lot in hotspot. When investigating JDK-8259710, I found that a large majority of the stringStreams created never use the backing buffer fully; about 70% of all streams write less than 32 characters.

stringStream creates an backing buffer, C-heap allocated, with a default size of 256 characters. Some things could be improved:

  • add a small in-object buffer for the first n characters to be written. This would avoid or at least delay allocation of the backing buffer from C-heap, at the expense of slightly increased object size and one memcpy when switching over to C-heap. Note that if the backing buffer is smaller than what now is the default size, the total footprint will go down despite the increased object size.

  • Delaying allocation of the backing buffer would be good for the many cases where stringStream objects are created as temporary objects without being actually used, see eg. compile.hpp class PrintInliningBuffer

  • Besides all that, the code could benefit from a little grooming (using modern style syntax etc).


This patch:

  • renames members of stringStream to conform to common syntax (eg leading underscores) and to be clearer
  • Uses initialization lists
  • Factors out buffer growth code from stringStream::write() to a new function stringStream::grow()
  • Introduces a new object-internal mini buffer, stringStream::_small_buffer, sized 32 bytes, which is initially used for all writes

This patch drastically reduces the number of malloc calls done from this class. The internal buffer size of 32byte seems a good cut-off. Running some unrelated test program (no tracing active), I see a reduction in the number of malloc calls from stringStream from ~211K malloc calls down to 53K (debug VM). In a release VM, it drops from ~85K down to about 1K. The reason is that stringStream is used in a ton of places as a temporary stack allocated object, to write very minimal text snippets to.

I also tweaked the associated gtest to test more thoroughly.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2160/head:pull/2160
$ git checkout pull/2160

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jan 20, 2021

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Jan 20, 2021

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot label Jan 20, 2021
@tstuefe tstuefe marked this pull request as ready for review Jan 20, 2021
@openjdk openjdk bot added the rfr label Jan 20, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jan 20, 2021

Webrevs

iklam
iklam approved these changes Jan 20, 2021
Copy link
Member

@iklam iklam left a comment

LGTM

@openjdk
Copy link

@openjdk openjdk bot commented Jan 20, 2021

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8260030: Improve stringStream buffer handling

Reviewed-by: iklam, kbarrett

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 35 new commits pushed to the master branch:

  • 1452280: 8164484: Unity, JTable cell editor, javax/swing/JComboBox/6559152/bug6559152.java
  • a70acf2: 8259928: compiler/jvmci tests fail with -Xint
  • ba38661: 8259882: Reduce the inclusion of perfData.hpp
  • 92c2f08: 8259869: [macOS] Remove desktop module dependencies on JNF Reference APIs
  • a7c2ebc: 8239894: Xserver crashes when the wrong high refresh rate is used
  • 2f47c39: 8259943: FileDescriptor.close0 does not handle EINTR
  • a8073ef: 8253478: (se) epoll Selector should use eventfd for wakeup instead of pipe
  • 34eb8b3: 8255765: Shenandoah: Isolate concurrent, degenerated and full GC
  • c3c6662: 8259954: gc/shenandoah/mxbeans tests fail with -Xcomp
  • 6ce0799: 8259851: Use boolean type for tasks in SubTasksDone
  • ... and 25 more: https://git.openjdk.java.net/jdk/compare/cf25383d19e5b42768cdd07385a89dcbafbe02bb...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jan 20, 2021
@tstuefe
Copy link
Member Author

@tstuefe tstuefe commented Jan 21, 2021

LGTM

Thanks, Ioi!

src/hotspot/share/utilities/ostream.cpp Outdated Show resolved Hide resolved
src/hotspot/share/utilities/ostream.cpp Outdated Show resolved Hide resolved
src/hotspot/share/utilities/ostream.cpp Outdated Show resolved Hide resolved
src/hotspot/share/utilities/ostream.hpp Show resolved Hide resolved
}

void stringStream::reset() {
buffer_pos = 0; _precount = 0; _position = 0;
_written = 0; _precount = 0; _position = 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a pre-existing bug that this doesn't also have _newlines = 0; ?

Copy link
Member Author

@tstuefe tstuefe Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking yes. But _newlines is really only used as a very circumvent way to trigger a flush in defaultStream::write() if the last write contained a newline. We could probably throw _newlines completely away at the cost of a strchr(s, '\n') in defaultStream::write(). I'll remove the outside accessor newlines() at least, since its not used, and will correct stringStream::reset(), so it is consistent for now. I leave the cleanup for another day.

size_t _written;
size_t _capacity;
const bool _is_fixed;
char _small_buffer[32];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little surprised that 32 bytes is that effective. Any idea how much better
(in malloc avoidance) a few more words would be? Since 32 bytes isn't really
all that much.

Copy link
Member Author

@tstuefe tstuefe Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised too. I did not want to blow up the stringStream object too much because I was afraid someone may use it on stack in a recursion. I even considered melting the buffer and the buffer pointer into a union to save space, since you only need either one or the other, but the coding got too complex and it felt overengineered.

I did some tests:

java -version

debug:    <32:1038 (86%), <48:111 (9%), <64:42 (3%), >=64:4 (0%)
release:  <32:449 (84%),  <48:54 (10%), <64:26 (4%), >=64:4 (0%)

Running some unrelated test program, no tracing, doing metaspace stuff

debug: 	  <32:151985 (52%), <48:83341 (28%), <64:46916 (16%), >=64:6445 (2%)                  
release:  <32:84123 (49%),  <48:80332 (47%), <64:871 (0%), >=64:3567 (2%) 

same test with -XX:+PrintOptoInlining, which creates many temporary stringStream objects with large buffers

debug:	  <32:160311 (52%), <48:83661 (27%), <64:47821 (15%), >=64:15345 (4%)   

So, depending on the scenario, increasing the buffer size to 48 may give us between 10% and almost 30% more cases. Beyond that the effect strongly diminishes. I will increase it to 48.

strncpy(copy, buffer, buffer_pos);
copy[buffer_pos] = 0; // terminating null
NEW_C_HEAP_ARRAY(char, _written + 1, mtInternal) : NEW_RESOURCE_ARRAY(char, _written + 1);
strncpy(copy, _buffer, _written);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing: Should this really be using strncpy rather than memcpy?
strncpy will stop at an embedded NUL.

Copy link
Member Author

@tstuefe tstuefe Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll switch it to memcpy. Its probably more efficient. The content should not contain embedded zeros, but since we do not test on write, they can slip in. Using memcpy would be at least consistent with write().

@tstuefe
Copy link
Member Author

@tstuefe tstuefe commented Jan 21, 2021

I rewrote stringStream::write somewhat:

  • I check and ignore len==0
  • I check len for unreasonably high values (>1G). I assert, in release I return: stringStream is used for tracing a lot and we never touch all tracing code in debug. I did not add a overflow check, though I could, since overflow would require very high values for _written (> SIZE_MAX - 1G) which is not possible.
  • I changed the guarantee to an assert, and added a similar assert to the start of the function

Copy link

@kimbarrett kimbarrett left a comment

Other than the minor nit about use of _is_fixed in the destructor, looks good.

@@ -404,7 +411,7 @@ char* stringStream::as_string(bool c_heap) const {
}

stringStream::~stringStream() {
if (_is_fixed == false && _buffer != NULL && _buffer != _small_buffer) {
if (_is_fixed == false && _buffer != _small_buffer) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still using _is_fixed == false rather than !_is_fixed.

Copy link
Member Author

@tstuefe tstuefe Jan 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed that, sorry. Will fix before integration.

@tstuefe
Copy link
Member Author

@tstuefe tstuefe commented Jan 22, 2021

Thanks Kim, Ioi.

/integrate

@openjdk openjdk bot closed this Jan 22, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jan 22, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jan 22, 2021

@tstuefe Since your change was applied there have been 35 commits pushed to the master branch:

  • 1452280: 8164484: Unity, JTable cell editor, javax/swing/JComboBox/6559152/bug6559152.java
  • a70acf2: 8259928: compiler/jvmci tests fail with -Xint
  • ba38661: 8259882: Reduce the inclusion of perfData.hpp
  • 92c2f08: 8259869: [macOS] Remove desktop module dependencies on JNF Reference APIs
  • a7c2ebc: 8239894: Xserver crashes when the wrong high refresh rate is used
  • 2f47c39: 8259943: FileDescriptor.close0 does not handle EINTR
  • a8073ef: 8253478: (se) epoll Selector should use eventfd for wakeup instead of pipe
  • 34eb8b3: 8255765: Shenandoah: Isolate concurrent, degenerated and full GC
  • c3c6662: 8259954: gc/shenandoah/mxbeans tests fail with -Xcomp
  • 6ce0799: 8259851: Use boolean type for tasks in SubTasksDone
  • ... and 25 more: https://git.openjdk.java.net/jdk/compare/cf25383d19e5b42768cdd07385a89dcbafbe02bb...master

Your commit was automatically rebased without conflicts.

Pushed as commit d066f2b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tstuefe tstuefe deleted the JDK-8260030-improve-stringStream-backing-buffer-handling branch Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants