Skip to content

8337517: Redacted Heap Dumps #20409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from
Closed

Conversation

Henry-Lin-A
Copy link
Contributor

@Henry-Lin-A Henry-Lin-A commented Jul 31, 2024

Adds a command line option -redact to jcmd and -XX:+HeapDumpRedacted enabling redacted heap dumps. When enabled, the output binary heap dump has zeroes written out in place of the original primitive values in the object fields. There is a new jtreg test heapDumpRedactedTest.java that tests that the fields are properly redacted.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8337518 to be approved

Issues

  • JDK-8337517: Redacted Heap Dumps (Enhancement - P4)
  • JDK-8337518: Introduce options to jcmd and a JVM runtime flag to generate redacted heap dumps. (CSR)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20409/head:pull/20409
$ git checkout pull/20409

Update a local copy of the PR:
$ git checkout pull/20409
$ git pull https://git.openjdk.org/jdk.git pull/20409/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20409

View PR using the GUI difftool:
$ git pr show -t 20409

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20409.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jul 31, 2024

👋 Welcome back Henry-Lin-A! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jul 31, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Jul 31, 2024
@openjdk
Copy link

openjdk bot commented Jul 31, 2024

@Henry-Lin-A The following labels will be automatically applied to this pull request:

  • hotspot-runtime
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org labels Jul 31, 2024
@Henry-Lin-A Henry-Lin-A marked this pull request as ready for review July 31, 2024 21:33
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 31, 2024
@mlbridge
Copy link

mlbridge bot commented Jul 31, 2024

Webrevs

@@ -139,7 +139,7 @@ public void detach() throws IOException {
* Execute the given command in the target VM.
*/
InputStream execute(String cmd, Object ... args) throws AgentLoadException, IOException {
assert args.length <= 3; // includes null
assert args.length <= 4; // includes null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure we've been down this path before, and allowing 4 args will result in newer versions of the JVM not working with older versions of the tool. See JDK-8219721

@dholmes-ora
Copy link
Member

I must be missing something in the approach. The vast majority of confidential data will be in strings yet you focus on primitives that would rarely (if ever for boolean float/double) contain anything that could be recognised as such.

testPrimitiveArray("boolArray", targetObject, "false");

//static field
JavaThing staticInt = testClass.getStaticField("staticInt");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
JavaThing staticInt = testClass.getStaticField("staticInt");
JavaThing staticInt = testClass.getStaticField("staticInt");

@AlanBateman
Copy link
Contributor

The alternative is of course to not burden the VM and heap dumpers with this but instead provide tooling to process a HPROF heap dump to zero the contents of Strings and other fields.

Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks interesting, some suggestions:

} else {
writer->write_raw(array->int_at_addr(0), length_in_bytes);
if (redact) {
writer->write_raw(nullptr, length_in_bytes); //nullptr to write zeros
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be easier to just:

  if (redact) {
    writer->write_raw(nullptr, length_in_bytes); //nullptr to write zeros
    writer->end_sub_record();
    return;
  }

...and leave the rest of the code alone.

@@ -988,7 +996,7 @@ void DumperSupport::dump_double(AbstractDumpWriter* writer, jdouble d) {
}

// dumps the raw value of the given field
void DumperSupport::dump_field_value(AbstractDumpWriter* writer, char type, oop obj, int offset) {
void DumperSupport::dump_field_value(AbstractDumpWriter* writer, char type, oop obj, int offset, bool redact) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be cleaner to specialize this method into two: dump_field_value and dump_redacted_field_value, and dispatch to it from outside. Likely saves branches.

@shipilev
Copy link
Member

shipilev commented Aug 1, 2024

I must be missing something in the approach. The vast majority of confidential data will be in strings yet you focus on primitives that would rarely (if ever for boolean float/double) contain anything that could be recognised as such.

(moved the reply to bug, so that it is not buried in PR comments: https://bugs.openjdk.org/browse/JDK-8337517?focusedId=14695155#comment-14695155)

@plummercj
Copy link
Contributor

plummercj commented Aug 1, 2024

The alternative is of course to not burden the VM and heap dumpers with this but instead provide tooling to process a HPROF heap dump to zero the contents of Strings and other fields.

That approach also has the added benefit of addressing this issue with heap dumps produced by SA, which is not being covered by this PR.

@earthling-amzn
Copy link
Contributor

@plummercj - I'm sorry, can you expand this acronym for me?

heap dumps produced SA

This reverts commit e87b74e.
@plummercj
Copy link
Contributor

@plummercj - I'm sorry, can you expand this acronym for me?

heap dumps produced SA

Serviceability Agent. It's a set of tools used to debug jvm and application issues. It can be run against a live process or a core file. It does things like produce stack traces and heap dumps, but also is good at inspecting the JVM and heap state.

https://docs.oracle.com/en/java/javase/17/docs/specs/man/jhsdb.html

See jhsdb jmap -binaryheap.

The SA implementation for heap dumping is mostly in: src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/HeapHprofBinWriter.java

@plummercj
Copy link
Contributor

heap dumps produced SA

Sorry also about the typo. That should have been produced by SA. I fixed it in the original comment.

@kevinjwalls
Copy link
Contributor

Notes on built-in redaction vs. external tool:

As a system owner, I would want both kinds of dump?
The pure dump for myself, and the redacted dump to share with a 3rd party for analysis.

They may come back with information about relationships (a particular HttpRequest object #ID is involved, or observations on threads and locks etc...), and I can confirm in more detail in the original dump with the actual Strings.

That suggests a tool outside the JVM, a post-processor for HPROF dumps that performs redaction.

If I can use jcmd to switch redaction on and off, to get both kinds of dump: do I really want to perform the dump twice (if some people don't like suspending their app once), and getting slightly different (or maybe significantly different) state in the two dumps.

@plummercj
Copy link
Contributor

If I can use jcmd to switch redaction on and off, to get both kinds of dump: do I really want to perform the dump twice

Yep, and there are two issues with getting two dumps. The first is that sometimes the dump is triggered automatically, like when out of memory, so there may not even be an option for a 2nd dump. The other issue is that if you request two dumps with jcmd, the two dumps won't be identical.

@earthling-amzn
Copy link
Contributor

This review has gone quiet. I sense that there are no major technical objections to the change, but that the consensus prefers to leave this functionality to external tools. If there are no further comments by the end of the week, we will withdraw this pull request.

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 17, 2024

@Henry-Lin-A This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 16, 2024

@Henry-Lin-A This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
csr Pull request needs approved CSR before integration hotspot-runtime hotspot-runtime-dev@openjdk.org rfr Pull request is ready for review serviceability serviceability-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

8 participants