Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow application start with xray agent #70

Open
dan-lind opened this issue Dec 4, 2020 · 3 comments
Open

Slow application start with xray agent #70

dan-lind opened this issue Dec 4, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@dan-lind
Copy link

dan-lind commented Dec 4, 2020

I recently added aws-xray-java-agent 2.7.1 to an app running Spring boot 2.3.5.

Before I added the agent, startup typically took around 3-4 seconds.
Started Application in 3.927 seconds (JVM running for 4.282)

Using aws-xray-java-agent 2.7.1, the startup of my application takes anywhere between 5-10 times longer.

Basically put the disco jars in place and added this line to my gradle.build

bootRun {
    jvmArgs = ["-javaagent:disco/disco-java-agent.jar=pluginPath=disco/disco-plugins"]
}

Started Application in 21.268 seconds (JVM running for 23.917)

Does this slowdown seem reasonably, or is something going on with my setup?

@willarmiros
Copy link
Contributor

Hi @dan-lind,

Thanks for raising this issue. I experienced this while developing the agent as well and spent quite a long time investigating it without too many results. I achieved incremental results by reducing our usage of Jackson Object Mappers and lowering the timeout of some of the metadata endpoint requests our plugins make during initialization. But none of these seemed to address the root cause of the very long startup time. My latest findings were that the slowdown definitely involves the initial agent configuration (in XRaySDKConfiguation) and might have something to do with loading the AWS SDK.

I'll leave this open as a bug while I continue to investigate it.

@willarmiros willarmiros added the bug Something isn't working label Dec 4, 2020
@dan-lind
Copy link
Author

dan-lind commented Dec 7, 2020

I did some more digging on my side. It looks to me like the xray agent itself doesn't add that much time, maybe around 3-5 seconds or so in. Rather it seems to be the disco agent that adds most of the overhead. For example each SdkClient takes a second or so, with the first one taking 3-5 seconds.

[software.amazon.disco.agent.awsv2.AWSClientBuilderInterceptor] DiSCo(AWSv2) method interception of public final java.lang.Object software.amazon.awssdk.core.client.builder.SdkDefaultClientBuilder.build()
SsmClient: 3580 ms
[software.amazon.disco.agent.awsv2.AWSClientBuilderInterceptor] DiSCo(AWSv2) method interception of public final java.lang.Object software.amazon.awssdk.core.client.builder.SdkDefaultClientBuilder.build()
DDB Enhanced client: 807 ms
[software.amazon.disco.agent.awsv2.AWSClientBuilderInterceptor] DiSCo(AWSv2) method interception of public final java.lang.Object software.amazon.awssdk.core.client.builder.SdkDefaultClientBuilder.build()
SesClient: 904 ms
[software.amazon.disco.agent.awsv2.AWSClientBuilderInterceptor] DiSCo(AWSv2) method interception of public final java.lang.Object software.amazon.awssdk.core.client.builder.SdkDefaultClientBuilder.build()
CloudWatch client: 342 ms

Same init without the agent

SsmClient: 276 ms
DDB Enhanced client: 38 ms
SesClient: 41 ms
CloudWatch client: 15 ms

Also noticed that in my case, Disco lists the same DataAccessor 5 times as an "installable" for the web plugin. Not sure if it adds to the overhead or not.

[software.amazon.disco.agent.DiscoAgent] DiSCo(Agent) Plugin name: disco-java-agent-web-plugin.jar
        Bootstrap: no
        Installables: software.amazon.disco.agent.web.servlet.HttpServletServiceInterceptor,
software.amazon.disco.agent.web.apache.httpclient.ApacheHttpClientInterceptor,
software.amazon.disco.agent.interception.templates.DataAccessor,
software.amazon.disco.agent.interception.templates.DataAccessor,
software.amazon.disco.agent.interception.templates.DataAccessor,
software.amazon.disco.agent.interception.templates.DataAccessor,
software.amazon.disco.agent.interception.templates.DataAccessor

These numbers only account for around 10 of the ~25 seconds of slowdown to the startup that I see when running locally, so there is probably more overhead coming from the disco agent. So I guess that this slowdown is patially/mostly an issue for another repo.

@willarmiros
Copy link
Contributor

Hmm yeah this makes sense, since ByteBuddy does have to do some magic to change the bytecode of all those AWS SDK clients when their classes are loaded for auto-instrumentation to occur. The Disco team is working on removing the DataAccessor pattern altogether since it's somewhat of a legacy way of doing introspection anyway. They're also working on more general performance improvements. Feel free to raise this issue over with them as well: https://github.com/awslabs/disco

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants