Skip to content

Conversation

@felixbarny
Copy link
Member

@felixbarny felixbarny commented Mar 5, 2019

Enables to have multiple service names per JVM,
one per class loader.

closes #136

Checklist

  • Manually test with 7.0 stack
  • document this only takes effect when using a 7.0 APM Server
  • send service name in transaction.context.service.name as opposed to transaction.service.name
  • add instrumentation group names to make it possible to disable spring.application.name and display-name detection.

Enables to have multiple service names per JVM,
one per class loader.

closes elastic#136
Copy link
Contributor

@eyalkoren eyalkoren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. A few minor comments.
One concern: elastic/apm-server#1175 means that the service.name overrides the one from metadata, right? If so, this PR practically makes it impossible for Java agent user to manually set the general service name for the agent, right? If so, we need to come up with a good configuration that will allow that.

==== `transaction_sample_rate`

By default, the agent will sample every transaction (e.g. request to your service). To reduce overhead and storage requirements, you can set the sample rate to a value between 0.0 and 1.0. We still record overall time and the result for unsampled transactions, but no context information, labels, or spans.
By default, the agent will sample every transaction (e.g. request to your service). To reduce overhead and storage requirements, you can set the sample rate to a value between 0.0 and 1.0. We still record overall time and the result for unsampled transactions, but no context information, tags, or spans.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed this in my PR (ie change in CoreConfiguration.java), so no need to fix if you encounter this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just a byproduct of executing the tests 🙂 (ConfigurationExporterTest)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant I updated the origin (the java class) in my PR

@eyalkoren
Copy link
Contributor

One more thing- if the automatic service name discovery is on by default, it means that users upgrading to the new server version that support that should get a warning in advance that their data is going to be presented/aggregated differently. I am not sure whether it is something we should document or server, probably depends on how relevant this is for other agents 🤔 .
Maybe the best thing is to make service_name configuration optional, publish the automatic discovery capability well to make sure users know about it, and if someone actively sets the service_name- let it override the manual. This way, existing users will not get surprised and new users will be able to choose.

Set initiating class loader for public API, OT API and trace_methods
@felixbarny
Copy link
Member Author

this PR practically makes it impossible for Java agent user to manually set the general service name for the agent, right?

service_name would still set the name in case an application is deployed to the root context and the <display-name> is not configured in the web.xml. It also sets the service_name for the metrics. There can be multiple services per JVM and the metrics are not specific to one service but the whole JVM. When we add application-specific metrics, like metrics for response times, it gets a bit more tricky. We will then have to make it possible to override the service_name for metrics as well.

Maybe the best thing is to make service_name configuration optional

It currently is already optional

if someone actively sets the service_name- let it override the [automatic one]

But how can they then override just the default service_name which is relevant for the metrics? Also, it would not allow to have multiple custom service names. What about providing the ability to set a custom service_name per servlet context path?

We could also add an option to disable the automatic detection of the service name. Not sure if that would be useful though 🤔.

…@CaptureTransaction

Relying on `this` would not work for static methods
@felixbarny
Copy link
Member Author

@kuisathaverat seems Jenkins has problems to clone this PR. Do you have an idea how to fix that?

@felixbarny
Copy link
Member Author

It also sets the service_name for the metrics. There can be multiple services per JVM and the metrics are not specific to one service but the whole JVM.

Another idea would be to just report the metrics multiple times if there are multiple applications on one JVM - once for each distinct service name. Then we could actually make it so that if the service_name is explicitly configured, there will only be one and multiple service names based on the display-name etc. are disabled. If needed, we can then later add a config to manually set the name of different applications per their mapped context path.

@gregkalapos do you have similar considerations in the .NET agent?

@felixbarny felixbarny marked this pull request as ready for review March 11, 2019 14:33
@kuisathaverat
Copy link
Contributor

@felixbarny I have seen you resolved the merge conflict

[2019-03-11T14:25:26.364Z] hudson.plugins.git.GitException: Command "git merge f4a84b78dd08de89eadf4e5bbf36293dd0264c7d" returned status code 1:
[2019-03-11T14:25:26.364Z] stdout: Auto-merging apm-agent-plugins/apm-slf4j-plugin/src/test/java/co/elastic/apm/agent/slf4j/Slf4JMdcActivationListenerTest.java
[2019-03-11T14:25:26.364Z] CONFLICT (content): Merge conflict in apm-agent-plugins/apm-slf4j-plugin/src/test/java/co/elastic/apm/agent/slf4j/Slf4JMdcActivationListenerTest.java
[2019-03-11T14:25:26.364Z] Automatic merge failed; fix conflicts and then commit the result.

@eyalkoren
Copy link
Contributor

I believe our data model is missing something here, which makes it so difficult...
If we had a distinction between agent.id/agent.name and service.name that could be useful for this and other things as well. This is certainly tricky.
Maybe we can start with

make it so that if the service_name is explicitly configured, there will only be one and multiple service names based on the display-name etc. are disabled.

and see how we tackle the broader problem when we introduce app-specific metrics. Hopefully we will have something to identify the entire JVM/agent-instance by then

@codecov-io
Copy link

codecov-io commented Mar 12, 2019

Codecov Report

Merging #514 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #514   +/-   ##
=========================================
  Coverage     65.39%   65.39%           
  Complexity       68       68           
=========================================
  Files           175      175           
  Lines          6903     6903           
  Branches        835      835           
=========================================
  Hits           4514     4514           
  Misses         2136     2136           
  Partials        253      253

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 700b9e6...53fd452. Read the comment docs.

@felixbarny
Copy link
Member Author

felixbarny commented Mar 20, 2019

Going with @roncohen's call for now:

elastic/apm#65 (comment)

@felixbarny if we have people actively requesting the possibility to use multiple service names, i think it would be fine to release the PR you linked. Having that in will unblock all those folks. They might not be able to use metrics fully yet, but at least all the regular APM stuff will become available to them. We can fix metrics (whatever that ends up meaning) in a following release

Even when the APM Server adds support for specifying multiple service names, we will have to do a version check and only do that on APM Server versions which support that. If we would now send the metrics in multiple requests with different metadata, we could never get rid of that code as the agent might be used with an older APM Server version.

Will add docs for that caveat (metrics UI not working when) in the service_name option description. Explicitly setting the service_name serves as a escape hatch as sets the service name for the whole process and disables discovering web-application-specific service names. This also means that if users have already explicitly set the service_name, updating the agent does not change their service names.

@felixbarny
Copy link
Member Author

felixbarny commented Mar 20, 2019

This is how the docs would look like:

Edit: reflect changes in 58076a0

Screen Shot 2019-03-21 at 11 46 06

@felixbarny felixbarny requested a review from roncohen March 20, 2019 16:49
Copy link
Contributor

@eyalkoren eyalkoren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our changelog and release notes should emphasize this change in BOLD CAPITALS- anyone upgrading to 1.5.0+ and not specifying the service_name will get their data completely changed

# NOTE: When relying on auto-discovery of the service name in Servlet environments,
# there is currently a caveat related to metrics.
# +
# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilizationif the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilizationif the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.
# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilization if the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.

# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilizationif the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.
# The reason is that metrics are reported with the detected default service name,
# for example `tomcat-application`.
# +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a clarification that this means system metrics can still be viewed when service names are detected automatically, but through the fallback service name.

@felixbarny
Copy link
Member Author

Test observations: when running a spring boot application with an embedded server which has set spring.application.name=opbeans-java in application.properties, there are two services listed the UI - opbeans-java and OpbeansApplication (the name of the main class). Only the former contains traces while only the latter contains metrics. That might be somewhat confusing as in embedded Spring Boot applications, there can only be one service per JVM. However, you can also deploy spring boot applications to a servlet container which may contain multiple applications.

It's tempting to only report metrics after there has been at least one recorded transaction. But reporting metrics without instrumenting the application is a real use case so we can't really do that.

We could disable spring.application.name detection as this makes the situation for spring boot applications with embedded servers arguably worse because the service list then contains two application names. However, it makes the situation worse for spring applications deployed to a servlet container which are mapped to the root servlet context (/) and don't have a <display-name> specified in their web.xml.

Even when we do have support specifying multiple service names for metrics, we'd still have the situation that there is a service in the service list, representing a servlet container, which may never have actual traces.

Removing services which can never have traces and allowing to track metrics without collecting traces are, however, contradicting goals. We could introduce a filter in the services list UI to only show services with actual traces but that would also hide applications which have just been started and no traces have yet been collected. Not showing those in the services list has lead to confusion for users and several discuss topics. That's why we now also show services without traces but with metrics.

@eyalkoren
Copy link
Contributor

It's tempting to only report metrics after there has been at least one recorded transaction. But reporting metrics without instrumenting the application is a real use case so we can't really do that.

I was also tempted to suggest using the first transaction to decide on the metrics service, but didn't because it means no consistency is guaranteed, so you could get the metrics under different service names in different restarts...

We could disable spring.application.name detection as this makes the situation for spring boot applications with embedded servers arguably worse because the service list then contains two application names.

I see what you're saying, but I don't think we need to go down this path (which is not limited to spring apps). We decided to provide this capability for those that see the lack of it blocking them from using us, with the decision we will make it better in the future. It can be imperfect now as long as we provide the escape from it through the service_name config.

However, it makes the situation worse for spring applications deployed to a servlet container which are mapped to the root servlet context (/) and don't have a specified in their web.xml.

Again, don't forget we still have the manual configuration to deal with this kind of cases in the first place

Even when we do have support specifying multiple service names for metrics, we'd still have the situation that there is a service in the service list, representing a servlet container, which may never have actual traces.

If we don't include the fallback service name in the list sent within the metrics document, and only rely on the discovered service name cache, we will make it a bit better. However, we may get metrics reported for this service name every restart before the first transaction occurs, so this service will show up for long-enough time ranges... This is a more general problem I think- the fact that we rely on discovery means that service names may be accumulated in metrics documents over time after each restart.

@felixbarny felixbarny merged commit 9379f5f into elastic:master Mar 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support multiple service names

5 participants