
8307517: Add VMErrorCallback infrastructure to extend hs_err dumping #13824

Closed
wants to merge 1 commit into from

Member

@stefank stefank commented May 5, 2023

Sometimes when we crash in the GC we'd like to get some more information about what was going on in the crashing thread. One example is when Generational ZGC crashes during store barrier flushing. From https://github.com/openjdk/zgc/blob/349cf9ae38664991879402a90c5e23e291f1c1c3/src/hotspot/share/gc/z/zStoreBarrierBuffer.cpp#L245:

class ZStoreBarrierBuffer::OnError : public VMErrorCallback {
private:
  ZStoreBarrierBuffer* _buffer;

public:
  OnError(ZStoreBarrierBuffer* buffer) :
      _buffer(buffer) {}

  virtual void call(outputStream* st) {
    _buffer->on_error(st);
  }
};

void ZStoreBarrierBuffer::on_error(outputStream* st) {
  st->print_cr("ZStoreBarrierBuffer: error when flushing");
  st->print_cr(" _last_processed_color: " PTR_FORMAT, _last_processed_color);
  st->print_cr(" _last_installed_color: " PTR_FORMAT, _last_installed_color);

  for (int i = current(); i < (int)_buffer_length; ++i) {
    st->print_cr(" [%2d]: base: " PTR_FORMAT " p: " PTR_FORMAT " prev: " PTR_FORMAT,
        i,
        untype(_base_pointers[i]),
        p2i(_buffer[i]._p),
        untype(_buffer[i]._prev));
  }
}

void ZStoreBarrierBuffer::flush() {
  if (!ZBufferStoreBarriers) {
    return;
  }

  OnError on_error(this);
  VMErrorCallbackMark mark(&on_error);

  for (int i = current(); i < (int)_buffer_length; ++i) {
    const ZStoreBarrierEntry& entry = _buffer[i];
    const zaddress addr = ZBarrier::make_load_good(entry._prev);
    ZBarrier::mark_and_remember(entry._p, addr);
  }

  clear();
}

If we crash in ZStoreBarrierBuffer::flush, we print the information above into the hs_err file.
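
For readers unfamiliar with the pattern, here is a minimal standalone sketch of how a callback-plus-mark mechanism like this can work. It is not the code in this PR: the thread-local list head `t_callbacks`, the `report_error` function, the use of `std::ostream` in place of `outputStream`, and the `FlushState` demo class are all assumptions made to keep the example self-contained.

```cpp
// Simplified standalone model of the idea; NOT the HotSpot sources.
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Callback interface: invoked by the error reporter on the crashing thread.
class VMErrorCallback {
  friend class VMErrorCallbackMark;
  friend void report_error(std::ostream&);
  VMErrorCallback* _next = nullptr;  // link to the previously registered callback
public:
  virtual void call(std::ostream& st) = 0;
  virtual ~VMErrorCallback() = default;
};

// Assumption: stand-in for a per-thread list head kept by the VM.
thread_local VMErrorCallback* t_callbacks = nullptr;

// RAII mark: links the callback on scope entry and unlinks it on scope exit.
class VMErrorCallbackMark {
  VMErrorCallback* _callback;
public:
  explicit VMErrorCallbackMark(VMErrorCallback* cb) : _callback(cb) {
    cb->_next = t_callbacks;
    t_callbacks = cb;
  }
  ~VMErrorCallbackMark() {
    assert(t_callbacks == _callback && "marks must be strictly nested");
    t_callbacks = _callback->_next;
  }
  VMErrorCallbackMark(const VMErrorCallbackMark&) = delete;
  VMErrorCallbackMark& operator=(const VMErrorCallbackMark&) = delete;
};

// Assumption: stand-in for the hs_err writer. Runs every callback that is
// currently registered on this thread, innermost scope first.
void report_error(std::ostream& st) {
  for (VMErrorCallback* cb = t_callbacks; cb != nullptr; cb = cb->_next) {
    cb->call(st);
  }
}

// Hypothetical callback that exposes some local state of a flush loop.
struct FlushState : VMErrorCallback {
  int index = 0;
  void call(std::ostream& st) override {
    st << "flush: crashed at index " << index << "\n";
  }
};

// Pretends that a crash is reported while the mark is in scope.
std::string simulate_crash_during_flush() {
  std::ostringstream hs_err;
  FlushState state;
  VMErrorCallbackMark mark(&state);
  state.index = 3;
  report_error(hs_err);
  return hs_err.str();
}
```

Once the mark goes out of scope the callback is unlinked again, so a later error report on the same thread no longer prints the flush state.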

We've found this information to be useful and would like to upstream the infrastructure separately from the much larger Generational ZGC PR.

Testing: this has been brewing and been used in the Generational ZGC repository for a long time.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8307517: Add VMErrorCallback infrastructure to extend hs_err dumping

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13824/head:pull/13824
$ git checkout pull/13824

Update a local copy of the PR:
$ git checkout pull/13824
$ git pull https://git.openjdk.org/jdk.git pull/13824/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 13824

View PR using the GUI difftool:
$ git pr show -t 13824

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13824.diff



bridgekeeper bot commented May 5, 2023

👋 Welcome back stefank! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Contributor

@fisk fisk left a comment

Looks good.


openjdk bot commented May 5, 2023

@stefank This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8307517: Add VMErrorCallback infrastructure to extend hs_err dumping

Reviewed-by: eosterlund, aboldtch, dholmes, stuefe

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 29 new commits pushed to the master branch:

  • ad0e5a9: 8304720: SuperWord::schedule should rebuild C2-graph from SuperWord dependency-graph
  • 495f268: 8306966: RISC-V: Support vector cast node for Vector API
  • 0dca573: 8301739: AArch64: Add optimized rules for vector compare with immediate for SVE
  • 3d3eaed: 8306941: Open source several datatransfer and dnd AWT tests
  • 1f57ce0: 8307446: RISC-V: Improve performance of floating point to integer conversion
  • 4e4828e: 8307553: Remove dead code MetaspaceClosure::push_method_entry
  • 7d58978: 8280031: Deprecate GTK2 for removal
  • b5922c3: 8305846: Support compilation in Proc test utility
  • 73ac710: 8307425: Socket input stream read burns CPU cycles with back-to-back poll(0) calls
  • e2b1013: 8306326: [BACKOUT] 8277573: VmObjectAlloc is not generated by intrinsics methods which allocate objects
  • ... and 19 more: https://git.openjdk.org/jdk/compare/1b143ba78712e7ac98ca9873c50989b3fba07394...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk bot added the ready and rfr labels May 5, 2023

openjdk bot commented May 5, 2023

@stefank The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk bot added the hotspot and ready labels and removed the ready label May 5, 2023
Member

@xmas92 xmas92 left a comment

lgtm.

I have also experimented with using this functionality to record only the last unloading events for a specific unloading cycle, and to have the callback print those events if a crash occurs during the cycle. I tried this because I found that, in release builds, the majority of the time is spent inside Events::log_class_unloading for do_unloading.
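
A rough standalone sketch of that idea, assuming a callback interface similar to the one in this PR. The names `ErrorCallback`, `RecentUnloadEvents`, and `dump_after_three_events` are hypothetical, and `std::string` is used only for brevity; a real VM would more likely use fixed-size buffers so that nothing allocates while recording or reporting.

```cpp
#include <array>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the callback interface.
struct ErrorCallback {
  virtual void call(std::ostream& st) = 0;
  virtual ~ErrorCallback() = default;
};

// Keeps only the N most recent event names of one unloading cycle.
// Recording is a plain array store; formatting happens only on a crash.
template <std::size_t N>
class RecentUnloadEvents : public ErrorCallback {
  static_assert(N > 0, "need room for at least one event");
  std::array<std::string, N> _names;
  std::size_t _count = 0;  // total number of events recorded in this cycle
public:
  void record(const std::string& name) {
    _names[_count++ % N] = name;  // overwrite the oldest slot when full
  }

  // Prints the retained events, oldest first.
  void call(std::ostream& st) override {
    const std::size_t first = _count > N ? _count - N : 0;
    for (std::size_t i = first; i < _count; ++i) {
      st << "unloaded: " << _names[i % N] << "\n";
    }
  }
};

// Records three events into a two-slot buffer and dumps what is left.
std::string dump_after_three_events() {
  RecentUnloadEvents<2> events;
  events.record("A");
  events.record("B");
  events.record("C");
  std::ostringstream os;
  events.call(os);
  return os.str();
}
```

With two slots and three recorded events, only the last two remain available to the crash report.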

openjdk bot added the ready label and removed the ready label May 5, 2023
Member

@dholmes-ora dholmes-ora left a comment

This looks quite neat but I'm not clear on the need for the VMErrorCallbackMark - can't the callback link/unlink itself at construction/destruction?

Member

@tstuefe tstuefe left a comment

Nice, I like it.

@@ -213,4 +213,27 @@ class VMError : public AllStatic {
  static int prepare_log_file(const char* pattern, const char* default_pattern, bool overwrite_existing, char* buf, size_t buflen);

};

Member

This is pre-existing, but can you please add a forward declaration for outputStream?

};

class VMErrorCallbackMark : public StackObj {
  Thread* _thread;
Member

Why would we need the thread here? Why not use Thread::current in dtor? This object is only used as stack object, right?

Member Author

I was treading in Runtime code and Coleen usually wants to use cached-away Thread pointers instead of calling Thread::current() repeatedly. I'm fine with either solution.

Member

Given the context it would have to be Thread::current_or_null_safe(). But yes we prefer not to re-materialize the current thread if we already have it at hand.

Member Author

Could you explain why it would have to be Thread::current_or_null_safe()? The constructor and destructor are run in "normal" JVM code and not in the error handler.

Member

Sorry, my mistake.
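
The trade-off discussed in this thread can be illustrated with a small standalone sketch. The `Thread` struct here is a hypothetical stand-in rather than HotSpot's class, and `ScopedMark`, `callback_depth`, and the `lookups` counter exist only for this illustration. The mark looks up the current thread once in its constructor and reuses the cached pointer in its destructor.

```cpp
// Hypothetical, minimal stand-in for a VM thread object.
struct Thread {
  int callback_depth = 0;  // number of marks currently active on this thread
  static int lookups;      // counts calls to current(), for the demo only

  static Thread* current() {
    ++lookups;
    static thread_local Thread self;
    return &self;
  }
};
int Thread::lookups = 0;

// Stack-only mark that caches the thread pointer it was created on.
class ScopedMark {
  Thread* _thread;  // cached once in the constructor
public:
  ScopedMark() : _thread(Thread::current()) { ++_thread->callback_depth; }
  ~ScopedMark() { --_thread->callback_depth; }  // no second lookup needed
  ScopedMark(const ScopedMark&) = delete;
  ScopedMark& operator=(const ScopedMark&) = delete;
};

// Returns how many thread lookups one mark scope costs.
int lookups_for_one_scope() {
  const int before = Thread::lookups;
  {
    ScopedMark mark;
  }
  return Thread::lookups - before;
}
```

Because the mark is a stack object, its constructor and destructor always run on the same thread, which is why the cached pointer is safe to reuse.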

Member

tstuefe commented May 5, 2023

> This looks quite neat but I'm not clear on the need for the VMErrorCallbackMark - can't the callback link/unlink itself at construction/destruction?

I like it better this way. Otherwise you dictate that the callback obj itself has to live on the stack. It may be large, or it may be shared between different threads.

Member Author

stefank commented May 5, 2023

> This looks quite neat but I'm not clear on the need for the VMErrorCallbackMark - can't the callback link/unlink itself at construction/destruction?

Yes it could.

It has a couple of drawbacks, but it's unclear to me if those are important:

  1. The linking of the callbacks happens before they have been fully constructed
  2. It makes a strong tie between the lifecycle of the callback and the linking/unlinking. For some callbacks that might not be preferable.

The main advantage is that there's one less class and the linking-site can become a one-liner.
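
To make the first drawback concrete, here is a standalone sketch of a self-linking callback. It is an illustration only, not the code on the alternative branch; `SelfLinkingCallback`, `t_head`, `g_trace`, and `FlushOnError` are hypothetical names. The trace shows that the base-class constructor publishes the object on the list before the derived constructor has finished.

```cpp
#include <string>
#include <vector>

std::vector<std::string> g_trace;  // records the order of events for the demo

class SelfLinkingCallback;
// Assumption: stand-in for a per-thread list head kept by the VM.
thread_local SelfLinkingCallback* t_head = nullptr;

// Variant where the callback links itself in its constructor and
// unlinks itself in its destructor, so no separate mark class is needed.
class SelfLinkingCallback {
  SelfLinkingCallback* _next;
public:
  SelfLinkingCallback() : _next(t_head) {
    t_head = this;  // visible to an error reporter from here on
    g_trace.push_back("linked");
  }
  virtual ~SelfLinkingCallback() {
    t_head = _next;
    g_trace.push_back("unlinked");
  }
  virtual void call(std::string& out) = 0;
};

class FlushOnError : public SelfLinkingCallback {
  int _buffer_length;
public:
  // The base constructor has already run, and linked us, before
  // _buffer_length is initialized and before this body executes.
  explicit FlushOnError(int len) : _buffer_length(len) {
    g_trace.push_back("fully constructed");
  }
  void call(std::string& out) override {
    out += "len=" + std::to_string(_buffer_length);
  }
};

// Creates and destroys one callback and returns the observed event order.
std::vector<std::string> trace_lifecycle() {
  g_trace.clear();
  {
    FlushOnError on_error(8);  // the linking site is a one-liner
  }
  return g_trace;
}
```

A crash in the window between "linked" and "fully constructed" would let the reporter reach a partially built object, which is the hazard a separate mark object avoids.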

I can go either way, so it would be good if the reviewers could chime in with their preference. This is what it would look like:
master...stefank:jdk:8307517_VMErrorCallback_2

Member

tstuefe commented May 5, 2023

> I can go either way, so it would be good if the reviewers could chime in with their preference. This is what it would look like:
>
> master...stefank:jdk:8307517_VMErrorCallback_2

I prefer the explicit RAII object, separate from the callback.

Member

@dholmes-ora dholmes-ora left a comment

I'm fine with the code as-is. Thanks.

Member Author

stefank commented May 8, 2023

Thanks for reviewing! In the interest of getting this pushed before the Generational ZGC, I'm going to integrate it now. FWIW, I'm not opposed to doing some follow-up style changes if we decide that this should be further tweaked.
/integrate


openjdk bot commented May 8, 2023

Going to push as commit 33245d6.
Since your change was applied there have been 33 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

openjdk bot added the integrated label May 8, 2023
openjdk bot closed this May 8, 2023
openjdk bot removed the ready and rfr labels May 8, 2023

openjdk bot commented May 8, 2023

@stefank Pushed as commit 33245d6.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot, integrated
5 participants