8292989: Avoid dynamic memory in AsyncLogWriter #10092

Closed
wants to merge 12 commits into from

Conversation

navyxliu
Member

@navyxliu navyxliu commented Aug 31, 2022

The current implementation of AsyncLogWriter uses dynamic memory. There are two sources:

  1. Overhead of pointer-based linked-list.
  2. strdup of message contents

This implementation puts pressure on glibc malloc. If allocations made at log sites interleave with other allocations, it is hard to clean up all glibc arenas, which worsens the fragmentation issue.

In this PR, we replace the linked list with two pre-allocated raw buffers. AsyncLogWriter appends each AsyncLogMessage payload to the serving buffer and avoids all dynamic allocation. Please note that this effort does not eliminate the mutex lock. We use ping-pong buffers to guarantee that AsyncLogWriter remains non-blocking. Each buffer serves as a FIFO queue, as before.
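
The ping-pong scheme described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `RawBuffer`, `PingPongLog`, `enqueue` and `flush` are hypothetical names, messages are stored as plain NUL-terminated strings, and a single flushing thread is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical fixed-capacity byte buffer (not HotSpot's AsyncLogWriter::Buffer).
class RawBuffer {
 public:
  explicit RawBuffer(std::size_t capacity)
      : _buf(new char[capacity]), _capacity(capacity), _pos(0) {}

  // Copies msg (including its NUL) into the buffer; returns false if it does not fit.
  bool push_back(const char* msg) {
    assert(msg != nullptr);
    const std::size_t len = std::strlen(msg) + 1;
    if (_pos + len > _capacity) return false;  // caller treats this as a dropped message
    std::memcpy(_buf.get() + _pos, msg, len);
    _pos += len;
    return true;
  }

  // Visits the stored messages in FIFO order.
  template <typename F>
  void for_each(F f) const {
    for (std::size_t off = 0; off < _pos; off += std::strlen(_buf.get() + off) + 1) {
      f(static_cast<const char*>(_buf.get() + off));
    }
  }

  void reset() { _pos = 0; }

 private:
  std::unique_ptr<char[]> _buf;
  std::size_t _capacity;
  std::size_t _pos;
};

// Two buffers allocated once: producers fill one while the flusher drains the other.
class PingPongLog {
 public:
  explicit PingPongLog(std::size_t capacity_each)
      : _a(capacity_each), _b(capacity_each), _serving(&_a), _draining(&_b) {}

  // Producer side: short critical section, no allocation.
  bool enqueue(const char* msg) {
    std::lock_guard<std::mutex> guard(_lock);
    return _serving->push_back(msg);
  }

  // Flusher side (single thread assumed): swap under the lock,
  // then do the slow output work without holding it.
  template <typename Sink>
  void flush(Sink sink) {
    {
      std::lock_guard<std::mutex> guard(_lock);
      std::swap(_serving, _draining);
    }
    _draining->for_each(sink);
    _draining->reset();
  }

 private:
  std::mutex _lock;
  RawBuffer _a;
  RawBuffer _b;
  RawBuffer* _serving;
  RawBuffer* _draining;
};
```

In this sketch the lock is still taken on every `enqueue`, which matches the note that the mutex is not eliminated; what the swap buys is that producers never wait for output I/O.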

In addition, AsyncLogWriter no longer enqueues meta messages when it needs to report the number of discarded messages. This is achieved using a temporary hashtable called snapshot. AsyncLogWriter copies the working hashtable under the protection of the lock and then resets it. After writing all regular messages, AsyncLogWriter writes the meta messages from the snapshot.
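
The snapshot step can be pictured with the sketch below. It only illustrates the copy-and-reset locking pattern; `DropCounters`, `record_drop` and `take_snapshot` are hypothetical names, and `std::unordered_map` is used purely for brevity (it allocates dynamically, unlike the hashtable in this PR).

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the per-output statistics of discarded messages.
class DropCounters {
 public:
  using Table = std::unordered_map<std::string, std::uint32_t>;

  // Called when a message for `output` could not be enqueued.
  void record_drop(const std::string& output) {
    std::lock_guard<std::mutex> guard(_lock);
    ++_dropped[output];
  }

  // Copies the working table and resets it while holding the lock.
  // The caller reports from the returned snapshot after releasing the lock.
  Table take_snapshot() {
    std::lock_guard<std::mutex> guard(_lock);
    Table snapshot = _dropped;
    _dropped.clear();
    return snapshot;
  }

 private:
  std::mutex _lock;
  Table _dropped;
};
```

Because the counts leave the critical section as a private copy, the flusher can write one "N messages dropped" line per output without putting anything back into the message buffer.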


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10092/head:pull/10092
$ git checkout pull/10092

Update a local copy of the PR:
$ git checkout pull/10092
$ git pull https://git.openjdk.org/jdk pull/10092/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10092

View PR using the GUI difftool:
$ git pr show -t 10092

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10092.diff

Xin Liu added 4 commits August 30, 2022 00:23
1. replace iterator with C++ lambda and don't enqueue counters.
2. reserve space for flushing token, so push_back(token) is always
   successful.
@bridgekeeper

bridgekeeper bot commented Aug 31, 2022

👋 Welcome back xliu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Aug 31, 2022

@navyxliu The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot-runtime hotspot-runtime-dev@openjdk.org rfr Pull request is ready for review labels Aug 31, 2022
@mlbridge

mlbridge bot commented Aug 31, 2022

Contributor

@jdksjolen jdksjolen left a comment

Thank you for your PR. A few comments from me.

Member

@tstuefe tstuefe left a comment

Hi Xin,

This is good, and goes in the right direction.

General remark:

Since you allocate the buffers up front, you really don't have to handle allocation failures. Just consider:

  • In the vast majority of cases the user will not modify AsyncLogBufferSize, and if we fail to malloc 4 MB at VM start we have a much larger problem.
  • And if the user specifies AsyncLogBufferSize, but with such an outrageously large value that the buffer fails to allocate, it is also better to fail, because the user had a good reason to specify -Xlog:async and you don't want to quietly fall back to synchronous logging. Also, maybe the user just made a silly mistake in the argument; then again, you don't want to quietly swallow that error.

If you remove the allocation failure handling, you can simplify quite a bit, e.g. also get rid of Buffer::is_valid(), AsyncLogWriter::Buffer::~Buffer(), etc.
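
A fail-fast variant of the kind suggested here might look like the sketch below. It is only an illustration under assumptions: `UpfrontBuffer` is a hypothetical class, and plain `std::malloc` plus `std::abort` stand in for whatever allocation and VM-exit facilities the real code would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical buffer that is allocated once at startup.
// There is no is_valid(): construction either succeeds or terminates the process.
class UpfrontBuffer {
 public:
  explicit UpfrontBuffer(std::size_t capacity)
      : _buf(static_cast<char*>(std::malloc(capacity))), _capacity(capacity) {
    assert(capacity > 0);
    if (_buf == nullptr) {
      // Fail loudly instead of silently falling back to another logging mode.
      std::fprintf(stderr, "failed to allocate %zu bytes for the async log buffer\n",
                   capacity);
      std::abort();
    }
  }

  UpfrontBuffer(const UpfrontBuffer&) = delete;
  UpfrontBuffer& operator=(const UpfrontBuffer&) = delete;

  // Kept only so this standalone sketch does not leak; a buffer that lives
  // for the whole process lifetime would not need it.
  ~UpfrontBuffer() { std::free(_buf); }

  char* data() { return _buf; }
  std::size_t capacity() const { return _capacity; }

 private:
  char* _buf;
  std::size_t _capacity;
};
```

With the failure path reduced to a single exit, every later use of the buffer can assume it is valid, which is where the simplification comes from.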

More remarks inline.

Cheers, Thomas

Xin Liu added 2 commits September 8, 2022 00:09
This saves a pointer (8 bytes on LP64 machines) for each message.
Restore the default buffer size to 2M because -Xlog:all=debug
-Xlog:async doesn't drop any message.
MSVC reports logAsyncWriter.cpp(62): warning C4267: 'initializing': conversion from
'size_t' to 'int', possible loss of data
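
The first commit message above mentions saving one pointer per message. One way to get that effect is to store the text directly behind a small header instead of keeping a `char*` to it. The sketch below is an assumption about the general technique, not the layout used in this PR; `InlineMessage`, `size_for` and `emplace_message` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical header. The NUL-terminated text lives right after it in the
// same buffer, so no separate char* member is needed.
struct InlineMessage {
  const void* output;  // opaque handle to the destination log output

  const char* text() const { return reinterpret_cast<const char*>(this + 1); }

  // Bytes occupied by header plus text, rounded up so that the next
  // header placed behind this message is properly aligned.
  static std::size_t size_for(const char* msg) {
    const std::size_t raw = sizeof(InlineMessage) + std::strlen(msg) + 1;
    const std::size_t a = alignof(InlineMessage);
    return (raw + a - 1) / a * a;
  }
};

// Constructs a header at `where` and copies the text behind it.
// `where` must be aligned for InlineMessage and have size_for(msg) bytes free.
inline InlineMessage* emplace_message(void* where, const void* output, const char* msg) {
  InlineMessage* m = new (where) InlineMessage{output};
  std::memcpy(reinterpret_cast<char*>(m + 1), msg, std::strlen(msg) + 1);
  return m;
}
```

A reader walking the buffer would advance by `size_for(m->text())` to reach the next message.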
Member

@tstuefe tstuefe left a comment

Hi Xin,

getting closer. Remarks inline.

Cheers, Thomas

Xin Liu added 4 commits September 10, 2022 18:43
Also add a unit test for AsyncLogWriter::Buffer.
This patch changes the behavior of Buffer. Enqueuing a null
message is not allowed and will trigger an assertion in debug builds.
It's still a compile-time constant like TOKEN_SIZE. Put the
headroom logic in Buffer::push_back().
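
The commit messages mention reserving space for the flushing token so that pushing the token always succeeds, with the headroom check living in `push_back()`. A rough sketch of that idea follows. The encoding is an assumption made for illustration: here the token is simply an empty string occupying one byte, and `HeadroomBuffer` and `push_flush_token` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

class HeadroomBuffer {
 public:
  // Hypothetical encoding: the flush token is an empty string, i.e. one NUL byte.
  static constexpr std::size_t TOKEN_SIZE = 1;

  explicit HeadroomBuffer(std::size_t capacity)
      : _buf(new char[capacity]), _capacity(capacity), _pos(0) {
    assert(capacity >= TOKEN_SIZE);
  }

  // Regular messages must leave TOKEN_SIZE bytes free at the end of the buffer.
  bool push_back(const char* msg) {
    assert(msg != nullptr && msg[0] != '\0');  // empty text is reserved for the token
    return append(msg, TOKEN_SIZE);
  }

  // The token may use the reserved bytes, so the first token pushed into a
  // buffer always fits, however many regular messages came before it.
  bool push_flush_token() { return append("", 0); }

  std::size_t used() const { return _pos; }

 private:
  bool append(const char* s, std::size_t headroom) {
    const std::size_t len = std::strlen(s) + 1;
    if (_pos + len + headroom > _capacity) return false;
    std::memcpy(_buf.get() + _pos, s, len);
    _pos += len;
    return true;
  }

  std::unique_ptr<char[]> _buf;
  std::size_t _capacity;
  std::size_t _pos;
};
```

Keeping the headroom check inside `push_back()` means callers cannot accidentally fill the bytes the token depends on.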
@navyxliu
Member Author

Hi @tstuefe,

Thank you for reviewing this patch! I addressed most of the requests. I haven't integrated the 'len' hint yet. It's not that I don't intend to. On the contrary, I found that reducing the size of 'Message' is very useful for avoiding buffer overflow. I took your advice and saved the explicit c-str pointer. Another observation is that almost all Message::_output pointers are the same when we use just one LogOutput file, e.g. one may just save all GC-related activity using -Xlog:gc*=debug,file=gc.log. I think I can save the identical pointers together with the len hint.

Can we leave the len optimization to the next patch? This patch focuses on just one problem: dynamic memory allocation. I replaced the linked list with the pre-allocated buffers, and I used NMT to validate it. Memory usage is stable with the up-front memory chunks (2*M by default):

                   Logging (reserved=2048KB, committed=2048KB)
                            (malloc=2048KB #4) 

Member

@tstuefe tstuefe left a comment

Looks good to me. Thanks for taking my suggestions. I'm fine with delaying further refinements to future RFEs.

Thanks, Thomas

@openjdk

openjdk bot commented Sep 12, 2022

@navyxliu This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8292989: Avoid dynamic memory in AsyncLogWriter

Reviewed-by: jsjolen, stuefe

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 267 new commits pushed to the master branch:

  • eeb625e: 8290169: adlc: Improve child constraints for vector unary operations
  • 2057070: 8293815: P11PSSSignature.engineUpdate should not print debug messages during normal operation
  • 7376c55: 8293769: RISC-V: Add a second temporary register for BarrierSetAssembler::load_at
  • d191e47: 8293768: Add links to JLS 19 and 20 from SourceVersion enum constants
  • a75ddb8: 8293122: (fs) Use file cloning in macOS version of Files::copy method
  • 95c7c55: 8293402: hs-err file printer should reattempt stack trace printing if it fails
  • 211fab8: 8291669: [REDO] Fix array range check hoisting for some scaled loop iv
  • 7f3250d: 8293787: Linux aarch64 build fails after 8292591
  • 2a38791: 8292755: Non-default method in interface leads to a stack overflow in JShell
  • 8351b30: 8293771: runtime/handshake/SystemMembarHandshakeTransitionTest.java fails if MEMBARRIER_CMD_QUERY is unsupported
  • ... and 257 more: https://git.openjdk.org/jdk/compare/27af0144ea57e86d9b81c2b328fad66e4a046f61...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 12, 2022
@navyxliu navyxliu requested a review from jdksjolen September 13, 2022 17:00
Contributor

@jdksjolen jdksjolen left a comment

Thank you for your efforts, this LGTM now.

@jdksjolen (no known github.com user name / role)

I've forgotten to associate my OpenJDK account with my Github account, so my approval counts for nothing at the moment :-(.

@tstuefe
Member

tstuefe commented Sep 15, 2022

> Thank you for your efforts, this LGTM now.
>
> @jdksjolen (no known github.com user name / role)
>
> I've forgotten to associate my OpenJDK account with my Github account, so my approval counts for nothing at the moment :-(.

IIRC, in hotspot we need two reviews, one Reviewer, the other at least Contributor. Maybe one of your team should propose you as Contributor (@coleenp?), then your reviews count too. I think you have the required number of patches already.

@jdksjolen
Contributor

> IIRC, in hotspot we need two reviews, one Reviewer, the other at least Contributor. Maybe one of your team should propose you as Contributor (@coleenp?), then your reviews count too. I think you have the required number of patches already.

I've heard Reviewer + Author is fine, but I'll get a Contributor to look at this. I think I have the required number of patches, but not all of them are non-trivial so I'm waiting a bit to remove any doubt about the outcome of a vote.

@tstuefe
Member

tstuefe commented Sep 15, 2022

>> IIRC, in hotspot we need two reviews, one Reviewer, the other at least Contributor. Maybe one of your team should propose you as Contributor (@coleenp?), then your reviews count too. I think you have the required number of patches already.
>
> I've heard Reviewer + Author is fine, but I'll get a Contributor to look at this. I think I have the required number of patches, but not all of them are non-trivial so I'm waiting a bit to remove any doubt about the outcome of a vote.

Maybe you are right and Author is sufficient for the second vote. Whenever I look for the relevant wiki page I cannot find it :)

Up to @navyxliu of course, but since we are not stormed by willing reviewers, I'd just push.

@jdksjolen
Contributor

Just integrating sounds good to me.

@navyxliu
Member Author

/integrate

@openjdk

openjdk bot commented Sep 15, 2022

Going to push as commit bf79f99.
Since your change was applied there have been 276 commits pushed to the master branch:

  • 2028ec7: 8289608: Change com/sun/jdi tests to not use Thread.suspend/resume
  • ecb456a: 8293779: redundant checking in AESCrypt.makeSessionKey() method
  • 6fca9ae: 8288474: Move EventContinuationFreezeOld from try_freeze_fast to freeze_slow
  • fbd8b42: 8293591: Remove use of Thread.stop from jshell tests
  • aff5ff1: 8244681: Add a warning for possibly lossy conversion in compound assignments
  • 15cb1fb: 8256265: G1: Improve parallelism in regions that failed evacuation
  • b31a03c: 8293695: Implement isInfinite intrinsic for RISC-V
  • 8f3bbe9: 8293472: Incorrect container resource limit detection if manual cgroup fs mounts present
  • 1caba0f: 8292948: JEditorPane ignores font-size styles in external linked css-file
  • eeb625e: 8290169: adlc: Improve child constraints for vector unary operations
  • ... and 266 more: https://git.openjdk.org/jdk/compare/27af0144ea57e86d9b81c2b328fad66e4a046f61...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Sep 15, 2022
@openjdk openjdk bot closed this Sep 15, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Sep 15, 2022
@openjdk

openjdk bot commented Sep 15, 2022

@navyxliu Pushed as commit bf79f99.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@navyxliu
Member Author

Thanks @tstuefe and @jdksjolen for reviewing this!
--lx
