Skip to content

8307478: Implementation of Prepare to Restrict The Dynamic Loading of Agents #13899

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 20 commits into from

Conversation

AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented May 10, 2023

This is the implementation for JEP 451. There are two parts to this:

  1. A multi-line warning is printed when a JVM TI or Java agent is loaded into a running VM. For JVM TI, the message is printed to stderr from JvmtiAgent::load. For Java agents, it is printed to System.err (as that may be redirected) in the JPLIS (j.l.instrumentation) implementation. This part includes an update to the JVM TI spec and API docs to require the warning.

  2. If running with -Djdk.instrument.traceUsage or -Djdk.instrument.traceUsage=true, the calls to the Instrumentation API print a trace message and stack trace.

Testing: tier1-6


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8307479 to be approved

Issues

  • JDK-8307478: Implementation of Prepare to Restrict The Dynamic Loading of Agents
  • JDK-8307479: Implementation of Prepare to Restrict The Dynamic Loading of Agents (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13899/head:pull/13899
$ git checkout pull/13899

Update a local copy of the PR:
$ git checkout pull/13899
$ git pull https://git.openjdk.org/jdk.git pull/13899/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 13899

View PR using the GUI difftool:
$ git pr show -t 13899

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13899.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 10, 2023

👋 Welcome back alanb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label May 10, 2023
@openjdk
Copy link

openjdk bot commented May 10, 2023

@AlanBateman The following labels will be automatically applied to this pull request:

  • build
  • core-libs
  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org build build-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels May 10, 2023
@AlanBateman
Copy link
Contributor Author

/label remove build
/label remove core-libs

@openjdk openjdk bot removed the build build-dev@openjdk.org label May 10, 2023
@openjdk
Copy link

openjdk bot commented May 10, 2023

@AlanBateman
The build label was successfully removed.

@openjdk openjdk bot removed the core-libs core-libs-dev@openjdk.org label May 10, 2023
@openjdk
Copy link

openjdk bot commented May 10, 2023

@AlanBateman
The core-libs label was successfully removed.

@openjdk
Copy link

openjdk bot commented May 11, 2023

@AlanBateman Unknown command remove - for a list of valid commands use /help.

@openjdk
Copy link

openjdk bot commented May 11, 2023

@AlanBateman Unknown command add - for a list of valid commands use /help.

@AlanBateman
Copy link
Contributor Author

/label add hotspot-runtime
/label remove hotspot

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label May 11, 2023
@openjdk
Copy link

openjdk bot commented May 11, 2023

@AlanBateman
The hotspot-runtime label was successfully added.

@openjdk openjdk bot removed the hotspot hotspot-dev@openjdk.org label May 11, 2023
@openjdk
Copy link

openjdk bot commented May 11, 2023

@AlanBateman
The hotspot label was successfully removed.

@AlanBateman AlanBateman marked this pull request as ready for review May 11, 2023 12:55
@openjdk openjdk bot added the rfr Pull request is ready for review label May 11, 2023
@mlbridge
Copy link

mlbridge bot commented May 11, 2023

@mlbridge
Copy link

mlbridge bot commented May 11, 2023

Mailing list message from Remi Forax on serviceability-dev:

----- Original Message -----

From: "Alan Bateman" <alanb at openjdk.org>
To: hotspot-runtime-dev at openjdk.org, "serviceability-dev" <serviceability-dev at openjdk.org>
Sent: Thursday, May 11, 2023 3:04:16 PM
Subject: RFR: 8307478: Implementation of Prepare to Restrict The Dynamic Loading of Agents

This is the implementation for JEP 451. There are two parts to this:

1. A multi-line warning is printed when a JVM TI or Java agent is loaded into a
running VM. For JVM TI, the message is printed to stderr from JvmtiAgent::load.
For Java agents, it is printed to System.err (as that may be redirected) in the
JPLIS (j.l.instrumentation) implementation.

2. If running with -Djdk.instrument.traceUsage or
-Djdk.instrument.traceUsage=true, the calls to the Instrumentation API print a
trace message and stack trace.

I believe the flag should mention that what is traced is just the dynamic instrumentation, static instrumentation is not an issue,
perhaps "-Djdk.instrument.traceDynamicUsage=true"

R?mi

@mlbridge
Copy link

mlbridge bot commented May 11, 2023

Mailing list message from Alan Bateman on serviceability-dev:

On 11/05/2023 14:28, Remi Forax wrote:

:
I believe the flag should mention that what is traced is just the dynamic instrumentation, static instrumentation is not an issue,
perhaps "-Djdk.instrument.traceDynamicUsage=true"

There is static, load-time, and dynamic instrumentation. The
Instrumentation API supports load-time and dynamic. If there is static
instrumentation going on then it happens before the class bytes are
loaded. Is that what you mean?

Or maybe you mean how the agent is deployed? The opt-in tracing right
now is independent of how the agent is started so it works for
-javaagent, Launcher-Agent-Class, and agents that are dynamically loaded
via the attach mechanism. Maybe you are arguing that it should be
limited to agents that are dynamically loaded?

-Alan

@mlbridge
Copy link

mlbridge bot commented May 11, 2023

Mailing list message from forax at univ-mlv.fr on serviceability-dev:

----- Original Message -----

From: "Alan Bateman" <Alan.Bateman at oracle.com>
To: "Remi Forax" <forax at univ-mlv.fr>
Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.org>, "serviceability-dev" <serviceability-dev at openjdk.org>
Sent: Thursday, May 11, 2023 3:46:59 PM
Subject: Re: RFR: 8307478: Implementation of Prepare to Restrict The Dynamic Loading of Agents

On 11/05/2023 14:28, Remi Forax wrote:

:
I believe the flag should mention that what is traced is just the dynamic
instrumentation, static instrumentation is not an issue,
perhaps "-Djdk.instrument.traceDynamicUsage=true"

There is static, load-time, and dynamic instrumentation. The
Instrumentation API supports load-time and dynamic. If there is static
instrumentation going on then it happens before the class bytes are
loaded. Is that what you mean?

Or maybe you mean how the agent is deployed? The opt-in tracing right
now is independent of how the agent is started so it works for
-javaagent, Launcher-Agent-Class, and agents that are dynamically loaded
via the attach mechanism. Maybe you are arguing that it should be
limited to agents that are dynamically loaded?

yes !
I did not take a look to the patch, i was thinking that only dynamically loaded agents were traced.

-Alan

R?mi

@AlanBateman
Copy link
Contributor Author

AlanBateman commented May 27, 2023

When testing the proposed patch, I stumbled upon two (related) issues:

Thanks for bringing this up. The question is whether dynamically loading the same agent library N times is treated as one or N agents. Similarly, if an agent library is specified on the command line, then later the same agent is dynamically loaded and started in the running VM, is it the same agent or not? Right now, they are treated as different agents, they may have different agent options, they accumulate in the "agent list", each load/start generates a JFR event, and with the changes here, emit a warning. I think we'll need to think a bit more about that scenario as the notion of agent identity doesn't surface in the API docs (the library address or canonical path to the agent library or JAR file might do). I agree it could be annoying to have multiple warnings if the profiler is attaching and loading the same agent library many times.

@mlbridge
Copy link

mlbridge bot commented May 29, 2023

Mailing list message from Kirk Pepperdine on serviceability-dev:

Hi Andrei,

On May 26, 2023, at 5:00 PM, Andrei Pangin <apangin at openjdk.org> wrote:

As @pron mentioned, the presence of `-agentlib/agentpath` option serves as an explicit user consent to use the tool, and disallowing such agents (or issuing a warning about them) is a non-goal of the JEP.

With the current behavior, users will be tempted to add `-XX:+EnableDynamicAgentLoading` option instead of putting one particular agent on the command line. This again does not serve the purpose of strengthen the integrity.

The use cases I mentioned here are quite natural and supported by popular profilers: a user may want to use a tool some time after the JVM startup, or reconfigure it later at runtime similarly to how `jcmd JFR.configure` works.

Call me dumb.. but? I would have the say that the most puzzling piece of this entire JEP/proposal is that I?m failing to make the connect between how an agent is loaded and how that strengthens integrity. I keep re-reading this JEP looking for clues but I keep bumping my head again? "Giving free reign to tools would imply giving free reign to libraries, which is tantamount to giving up on integrity by default.?. IMO, this is a false equivalency that is not supported by any other point in the document. IOWs, there is nothing in this document that is giving me a clue as to how turning off dynamic attach improves integrity when I can achieve the same effects with a direct attach. What I do know is that turning off dynamic attach by default will cause grief to those that already having to cope with exceptionally complex deployment. I would argue that the complexity of these deployments have dramatically increased since 2017. Do we really want to add to that complexity or should we be refocusing on adding features that help to reduce that complexity.

Kind regards,
Kirk

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/serviceability-dev/attachments/20230528/c11be040/attachment-0001.htm>

@AlanBateman
Copy link
Contributor Author

AlanBateman commented May 29, 2023

I agree it could be annoying to have multiple warnings if the profiler is attaching and loading the same agent library many times.

I've updated the changes to allow it be implementation specific as to whether a warning is printed when attempting to start the same agent a second or subsequent time. I need to sync up with Ron/Alex on this point to see if the description in the JEP needs to say anything on this.

For now, attempting to start a JVM TI agent that was previously loaded will not print a warning. We could potentially do the same for Java agents.

@openjdk
Copy link

openjdk bot commented May 31, 2023

@AlanBateman This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8307478: Implementation of Prepare to Restrict The Dynamic Loading of Agents

Reviewed-by: sspitsyn, cjplummer

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 14 new commits pushed to the master branch:

  • 2bb1972: 8308954: [JVMCI] code installation increments decompile_count for call_site_target_value failures
  • 0ab0963: 8308469: [PPC64] Implement alternative fast-locking scheme
  • ec55539: 8309138: Fix container tests for jdks with symlinked conf dir
  • e827164: 8309146: extend JDI StackFrame.setValue() and JDWP StackFrame.setValues minimal support for virtual threads
  • be36096: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe
  • c6f20db: 8308232: nsk/jdb tests don't pass -verbose flag to the debuggee
  • d987176: 8307794: Test for HSS/LMS Signature Verification
  • 050425b: 8298127: HSS/LMS Signature Verification
  • a6109bf: 8308856: jdk.internal.classfile.impl.EntryMap::nextPowerOfTwo math problem
  • 6adc242: 8308943: jdk.internal.le build fails on AIX
  • ... and 4 more: https://git.openjdk.org/jdk/compare/6c7225f819a729b1ef6f8b2769da4b50d879455d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed csr Pull request needs approved CSR before integration labels May 31, 2023
@mlbridge
Copy link

mlbridge bot commented May 31, 2023

Mailing list message from Ron Pressler on serviceability-dev:

On 28 May 2023, at 17:25, Kirk Pepperdine <kirk at kodewerk.com> wrote:

Call me dumb.. but? I would have the say that the most puzzling piece of this entire JEP/proposal is that I?m failing to make the connect between how an agent is loaded and how that strengthens integrity. I keep re-reading this JEP looking for clues but I keep bumping my head again? "Giving free reign to tools would imply giving free reign to libraries, which is tantamount to giving up on integrity by default.?. IMO, this is a false equivalency that is not supported by any other point in the document. IOWs, there is nothing in this document that is giving me a clue as to how turning off dynamic attach improves integrity when I can achieve the same effects with a direct attach.

You can?t find the answer to your question in 451 because we factored out the motivation for 451 (and several other integrated and future JEPs) into the informational JEP, https://openjdk.org/jeps/8305968, which you must read to understand the motivation.

In short, the goal is integrity *by default*, which means that what we seek not an end to ?superpowers" but the explicit granting of superpowers on the command line. The problem is not superpowers, but superpowers that *are not explicitly acknowledged by the application*. Agents loaded at startup have always been explicitly acknowledged by the application, while those loaded after startup have not. The goal is to make them explicitly acknowledged, not ?turned off?, and then you get the same for both ways of loading agents: the application explicitly grants superpowers in both situations. We want the capabilities offered by agents, and we want the application to be able to track them.

What I do know is that turning off dynamic attach by default will cause grief to those that already having to cope with exceptionally complex deployment. I would argue that the complexity of these deployments have dramatically increased since 2017. Do we really want to add to that complexity or should we be refocusing on adding features that help to reduce that complexity.

As the informational JEP explains, not having integrity BY DEFAULT, causes grief too. Do or don?t, someone gets grief. Given that the vast majority of Java programs never require or want agents ? attached dynamically or otherwise ? given that many current uses of dynamically loaded agents are better served by agents loaded at startup, and given that we have adding a command line flag on the one hand vs not having integrity by default that makes it impossible for applications to know the ?map? of their codebase (see the informational JEP), we believe that, when integrated over the entire ecosystem, more grief is caused by not restricting dynamically loaded agents than by restricting it.

? Ron

@mlbridge
Copy link

mlbridge bot commented May 31, 2023

Mailing list message from Kirk Pepperdine on serviceability-dev:

Hi Ron,

Thank you for the response. I?d read the JEP you?ve linked also.From what I?ve been able to discern, this change will affect a small majority of our customers as it will change how they deploy observability and diagnostics. I?ve spoken to the some of the product owners and they feel that this represents a one-off pain point that they can adjust to. The principal CS that have to deal directly with helping customers deploy had a completely different view point on the subject. As I sift through the noise the filter I?m applying is that I?m generally more interested in easing customer concerns over engineering ones.

I?ve read the relevant JEPs several times and I believe I understand the arguments being forth. I?m sorry but I just don?t agree that on by default dynamic loading of an agent has any connection to better code integrity. As someone who had to sit through 2 days of meetings trying to get a call to System.gc() that was happening every 2 minutes resulting in system failures turned off by setting a flag I can tell you that getting an agent dynamically loaded into a runtime isn?t easy to being with. While this experience may seem extreme, it is more common than you?d imagine. As such it is my opinion that this JEP that seems to have more importance to engineers than end users/customers. More over, I feel that it is inflicting an unnecessary change to deployments that is adding to complexity rather than easing it.

Other than that, I don?t feel I have anything more of use to add to the conversation except thank you for answering my questions.

Kind regards,
Kirk

On May 31, 2023, at 4:59 AM, Ron Pressler <ron.pressler at oracle.com> wrote:

On 28 May 2023, at 17:25, Kirk Pepperdine <kirk at kodewerk.com> wrote:

Call me dumb.. but? I would have the say that the most puzzling piece of this entire JEP/proposal is that I?m failing to make the connect between how an agent is loaded and how that strengthens integrity. I keep re-reading this JEP looking for clues but I keep bumping my head again? "Giving free reign to tools would imply giving free reign to libraries, which is tantamount to giving up on integrity by default.?. IMO, this is a false equivalency that is not supported by any other point in the document. IOWs, there is nothing in this document that is giving me a clue as to how turning off dynamic attach improves integrity when I can achieve the same effects with a direct attach.

You can?t find the answer to your question in 451 because we factored out the motivation for 451 (and several other integrated and future JEPs) into the informational JEP, https://openjdk.org/jeps/8305968, which you must read to understand the motivation.

In short, the goal is integrity *by default*, which means that what we seek not an end to ?superpowers" but the explicit granting of superpowers on the command line. The problem is not superpowers, but superpowers that *are not explicitly acknowledged by the application*. Agents loaded at startup have always been explicitly acknowledged by the application, while those loaded after startup have not. The goal is to make them explicitly acknowledged, not ?turned off?, and then you get the same for both ways of loading agents: the application explicitly grants superpowers in both situations. We want the capabilities offered by agents, and we want the application to be able to track them.

What I do know is that turning off dynamic attach by default will cause grief to those that already having to cope with exceptionally complex deployment. I would argue that the complexity of these deployments have dramatically increased since 2017. Do we really want to add to that complexity or should we be refocusing on adding features that help to reduce that complexity.

As the informational JEP explains, not having integrity BY DEFAULT, causes grief too. Do or don?t, someone gets grief. Given that the vast majority of Java programs never require or want agents ? attached dynamically or otherwise ? given that many current uses of dynamically loaded agents are better served by agents loaded at startup, and given that we have adding a command line flag on the one hand vs not having integrity by default that makes it impossible for applications to know the ?map? of their codebase (see the informational JEP), we believe that, when integrated over the entire ecosystem, more grief is caused by not restricting dynamically loaded agents than by restricting it.

? Ron

Copy link
Contributor

@sspitsyn sspitsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The update looks good to me.
Posted one nit though.
Thanks,
Serguei

Copy link
Contributor

@sspitsyn sspitsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The update looks good.
Thanks,
Serguei

Copy link
Contributor

@plummercj plummercj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the possible addition of some comments regarding what is meant by "loaded", the changes look good.

@AlanBateman
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Jun 2, 2023

Going to push as commit 5bd2af2.
Since your change was applied there have been 16 commits pushed to the master branch:

  • 325940b: 8307105: JFileChooser InvalidPathException when selecting some system folders on Windows
  • 101bf22: 8308891: TestCDSVMCrash.java needs @requires vm.cds
  • 2bb1972: 8308954: [JVMCI] code installation increments decompile_count for call_site_target_value failures
  • 0ab0963: 8308469: [PPC64] Implement alternative fast-locking scheme
  • ec55539: 8309138: Fix container tests for jdks with symlinked conf dir
  • e827164: 8309146: extend JDI StackFrame.setValue() and JDWP StackFrame.setValues minimal support for virtual threads
  • be36096: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe
  • c6f20db: 8308232: nsk/jdb tests don't pass -verbose flag to the debuggee
  • d987176: 8307794: Test for HSS/LMS Signature Verification
  • 050425b: 8298127: HSS/LMS Signature Verification
  • ... and 6 more: https://git.openjdk.org/jdk/compare/6c7225f819a729b1ef6f8b2769da4b50d879455d...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 2, 2023
@openjdk openjdk bot closed this Jun 2, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 2, 2023
@openjdk
Copy link

openjdk bot commented Jun 2, 2023

@AlanBateman Pushed as commit 5bd2af2.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

5 participants