Use display-name or context path as the default service name #514

felixbarny · 2019-03-05T16:53:13Z

Enables to have multiple service names per JVM,
one per class loader.

closes #136

Checklist

Manually test with 7.0 stack
document this only takes effect when using a 7.0 APM Server
send service name in transaction.context.service.name as opposed to transaction.service.name
add instrumentation group names to make it possible to disable spring.application.name and display-name detection.

Enables to have multiple service names per JVM, one per class loader. closes elastic#136

… class loader test context path based service name

eyalkoren

LGTM. A few minor comments.
One concern: elastic/apm-server#1175 means that the service.name overrides the one from metadata, right? If so, this PR practically makes it impossible for Java agent user to manually set the general service name for the agent, right? If so, we need to come up with a good configuration that will allow that.

...pm-servlet-plugin/src/test/java/co/elastic/apm/agent/servlet/ServletInstrumentationTest.java

eyalkoren · 2019-03-07T17:22:55Z

docs/configuration.asciidoc

 ==== `transaction_sample_rate`

-By default, the agent will sample every transaction (e.g. request to your service). To reduce overhead and storage requirements, you can set the sample rate to a value between 0.0 and 1.0. We still record overall time and the result for unsampled transactions, but no context information, labels, or spans.
+By default, the agent will sample every transaction (e.g. request to your service). To reduce overhead and storage requirements, you can set the sample rate to a value between 0.0 and 1.0. We still record overall time and the result for unsampled transactions, but no context information, tags, or spans.


I fixed this in my PR (ie change in CoreConfiguration.java), so no need to fix if you encounter this

This was just a byproduct of executing the tests 🙂 (ConfigurationExporterTest)

Yes, I meant I updated the origin (the java class) in my PR

...tion-tests/src/test/java/co/elastic/apm/servlet/AbstractServletContainerIntegrationTest.java

eyalkoren · 2019-03-07T18:02:05Z

One more thing- if the automatic service name discovery is on by default, it means that users upgrading to the new server version that support that should get a warning in advance that their data is going to be presented/aggregated differently. I am not sure whether it is something we should document or server, probably depends on how relevant this is for other agents 🤔 .
Maybe the best thing is to make service_name configuration optional, publish the automatic discovery capability well to make sure users know about it, and if someone actively sets the service_name- let it override the manual. This way, existing users will not get surprised and new users will be able to choose.

Set initiating class loader for public API, OT API and trace_methods

felixbarny · 2019-03-11T09:49:07Z

this PR practically makes it impossible for Java agent user to manually set the general service name for the agent, right?

service_name would still set the name in case an application is deployed to the root context and the <display-name> is not configured in the web.xml. It also sets the service_name for the metrics. There can be multiple services per JVM and the metrics are not specific to one service but the whole JVM. When we add application-specific metrics, like metrics for response times, it gets a bit more tricky. We will then have to make it possible to override the service_name for metrics as well.

Maybe the best thing is to make service_name configuration optional

It currently is already optional

if someone actively sets the service_name- let it override the [automatic one]

But how can they then override just the default service_name which is relevant for the metrics? Also, it would not allow to have multiple custom service names. What about providing the ability to set a custom service_name per servlet context path?

We could also add an option to disable the automatic detection of the service name. Not sure if that would be useful though 🤔.

…@CaptureTransaction Relying on `this` would not work for static methods

felixbarny · 2019-03-11T14:21:53Z

@kuisathaverat seems Jenkins has problems to clone this PR. Do you have an idea how to fix that?

felixbarny · 2019-03-11T14:29:47Z

It also sets the service_name for the metrics. There can be multiple services per JVM and the metrics are not specific to one service but the whole JVM.

Another idea would be to just report the metrics multiple times if there are multiple applications on one JVM - once for each distinct service name. Then we could actually make it so that if the service_name is explicitly configured, there will only be one and multiple service names based on the display-name etc. are disabled. If needed, we can then later add a config to manually set the name of different applications per their mapped context path.

@gregkalapos do you have similar considerations in the .NET agent?

kuisathaverat · 2019-03-11T15:01:20Z

@felixbarny I have seen you resolved the merge conflict

[2019-03-11T14:25:26.364Z] hudson.plugins.git.GitException: Command "git merge f4a84b78dd08de89eadf4e5bbf36293dd0264c7d" returned status code 1:
[2019-03-11T14:25:26.364Z] stdout: Auto-merging apm-agent-plugins/apm-slf4j-plugin/src/test/java/co/elastic/apm/agent/slf4j/Slf4JMdcActivationListenerTest.java
[2019-03-11T14:25:26.364Z] CONFLICT (content): Merge conflict in apm-agent-plugins/apm-slf4j-plugin/src/test/java/co/elastic/apm/agent/slf4j/Slf4JMdcActivationListenerTest.java
[2019-03-11T14:25:26.364Z] Automatic merge failed; fix conflicts and then commit the result.

eyalkoren · 2019-03-12T04:14:41Z

I believe our data model is missing something here, which makes it so difficult...
If we had a distinction between agent.id/agent.name and service.name that could be useful for this and other things as well. This is certainly tricky.
Maybe we can start with

make it so that if the service_name is explicitly configured, there will only be one and multiple service names based on the display-name etc. are disabled.

and see how we tackle the broader problem when we introduce app-specific metrics. Hopefully we will have something to identify the entire JVM/agent-instance by then

…ead of null

codecov-io · 2019-03-12T17:29:02Z

Codecov Report

Merging #514 into master will not change coverage.
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master     #514   +/-   ##
=========================================
  Coverage     65.39%   65.39%           
  Complexity       68       68           
=========================================
  Files           175      175           
  Lines          6903     6903           
  Branches        835      835           
=========================================
  Hits           4514     4514           
  Misses         2136     2136           
  Partials        253      253

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 700b9e6...53fd452. Read the comment docs.

felixbarny · 2019-03-20T16:18:29Z

Going with @roncohen's call for now:

elastic/apm#65 (comment)

@felixbarny if we have people actively requesting the possibility to use multiple service names, i think it would be fine to release the PR you linked. Having that in will unblock all those folks. They might not be able to use metrics fully yet, but at least all the regular APM stuff will become available to them. We can fix metrics (whatever that ends up meaning) in a following release

Even when the APM Server adds support for specifying multiple service names, we will have to do a version check and only do that on APM Server versions which support that. If we would now send the metrics in multiple requests with different metadata, we could never get rid of that code as the agent might be used with an older APM Server version.

Will add docs for that caveat (metrics UI not working when) in the service_name option description. Explicitly setting the service_name serves as a escape hatch as sets the service name for the whole process and disables discovering web-application-specific service names. This also means that if users have already explicitly set the service_name, updating the agent does not change their service names.

felixbarny · 2019-03-20T16:47:33Z

This is how the docs would look like:

Edit: reflect changes in 58076a0

eyalkoren

Our changelog and release notes should emphasize this change in BOLD CAPITALS- anyone upgrading to 1.5.0+ and not specifying the service_name will get their data completely changed

eyalkoren · 2019-03-21T04:53:27Z

docs/configuration.asciidoc

+# NOTE: When relying on auto-discovery of the service name in Servlet environments,
+# there is currently a caveat related to metrics.
+#  +
+# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilizationif the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.


Suggested change

# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilizationif the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.

# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilization if the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.

eyalkoren · 2019-03-21T05:01:04Z

docs/configuration.asciidoc

+# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilizationif the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.
+# The reason is that metrics are reported with the detected default service name,
+# for example `tomcat-application`.
+#  +


Maybe add a clarification that this means system metrics can still be viewed when service names are detected automatically, but through the fallback service name.

…ce.name

felixbarny · 2019-03-21T12:42:41Z

Test observations: when running a spring boot application with an embedded server which has set spring.application.name=opbeans-java in application.properties, there are two services listed the UI - opbeans-java and OpbeansApplication (the name of the main class). Only the former contains traces while only the latter contains metrics. That might be somewhat confusing as in embedded Spring Boot applications, there can only be one service per JVM. However, you can also deploy spring boot applications to a servlet container which may contain multiple applications.

It's tempting to only report metrics after there has been at least one recorded transaction. But reporting metrics without instrumenting the application is a real use case so we can't really do that.

We could disable spring.application.name detection as this makes the situation for spring boot applications with embedded servers arguably worse because the service list then contains two application names. However, it makes the situation worse for spring applications deployed to a servlet container which are mapped to the root servlet context (/) and don't have a <display-name> specified in their web.xml.

Even when we do have support specifying multiple service names for metrics, we'd still have the situation that there is a service in the service list, representing a servlet container, which may never have actual traces.

Removing services which can never have traces and allowing to track metrics without collecting traces are, however, contradicting goals. We could introduce a filter in the services list UI to only show services with actual traces but that would also hide applications which have just been started and no traces have yet been collected. Not showing those in the services list has lead to confusion for users and several discuss topics. That's why we now also show services without traces but with metrics.

eyalkoren · 2019-03-21T13:57:28Z

It's tempting to only report metrics after there has been at least one recorded transaction. But reporting metrics without instrumenting the application is a real use case so we can't really do that.

I was also tempted to suggest using the first transaction to decide on the metrics service, but didn't because it means no consistency is guaranteed, so you could get the metrics under different service names in different restarts...

We could disable spring.application.name detection as this makes the situation for spring boot applications with embedded servers arguably worse because the service list then contains two application names.

I see what you're saying, but I don't think we need to go down this path (which is not limited to spring apps). We decided to provide this capability for those that see the lack of it blocking them from using us, with the decision we will make it better in the future. It can be imperfect now as long as we provide the escape from it through the service_name config.

However, it makes the situation worse for spring applications deployed to a servlet container which are mapped to the root servlet context (/) and don't have a specified in their web.xml.

Again, don't forget we still have the manual configuration to deal with this kind of cases in the first place

Even when we do have support specifying multiple service names for metrics, we'd still have the situation that there is a service in the service list, representing a servlet container, which may never have actual traces.

If we don't include the fallback service name in the list sent within the metrics document, and only rely on the discovered service name cache, we will make it a bit better. However, we may get metrics reported for this service name every restart before the first transaction occurs, so this service will show up for long-enough time ranges... This is a more general problem I think- the fact that we rely on discovery means that service names may be accumulated in metrics documents over time after each restart.

Use display-name or context path as the default service name

ecc2233

Enables to have multiple service names per JVM, one per class loader. closes elastic#136

felixbarny assigned felixbarny and unassigned felixbarny Mar 5, 2019

felixbarny added the [zube]: In Progress label Mar 5, 2019

felixbarny self-assigned this Mar 5, 2019

felixbarny added 2 commits March 7, 2019 15:01

Service name based on spring.application.name

3b02600

Override service name based on spring.application.name on the correct…

d547882

… class loader test context path based service name

eyalkoren reviewed Mar 7, 2019

View reviewed changes

Require to set initiating class loader when starting transactions

a90d725

Set initiating class loader for public API, OT API and trace_methods

felixbarny added 2 commits March 11, 2019 11:32

Move serviceName to TraceContext

5c791ec

Use class instead of this to determine initiating class loader for …

90f9dfc

…@CaptureTransaction Relying on `this` would not work for static methods

Remove unnecessary class.getClass() call

c0ad235

felixbarny marked this pull request as ready for review March 11, 2019 14:33

Merge branch 'master' into servlet-service-name

688a9c1

felixbarny added 2 commits March 11, 2019 16:18

Fix compile error due to merge

7e40364

Fix servlet test

728b577

felixbarny added 4 commits March 12, 2019 08:20

Make sure to init agent before spring-boot

c3883bf

Reset agent in @afterclass

b469fe9

Guard against servletContext.getServletContextName() being empty inst…

cd29b80

…ead of null

Don't rely on calling init manually

1af9569

felixbarny added 2 commits March 13, 2019 07:59

Don't override the service name if set explicitly

a8c524d

Fix tests, small docs change

a4f3372

felixbarny mentioned this pull request Mar 13, 2019

Multiple service names per process and process-global metrics elastic/apm#65

Closed

felixbarny added the blocked label Mar 13, 2019

eyalkoren approved these changes Mar 14, 2019

View reviewed changes

Merge remote-tracking branch 'origin/master' into servlet-service-name

6c9c4f9

felixbarny added 2 commits March 20, 2019 17:46

Document caveats related to metrics

ab0efd9

Merge remote-tracking branch 'origin/master' into servlet-service-name

c501d5c

felixbarny requested a review from roncohen March 20, 2019 16:49

eyalkoren approved these changes Mar 21, 2019

View reviewed changes

felixbarny added 2 commits March 21, 2019 11:47

Improve docs, add changelog, add instrumentation group names

58076a0

Serialize overridden name in context.service.name as opposed to servi…

4aa5cff

…ce.name

Merge branch 'master' into servlet-service-name

53fd452

felixbarny removed the blocked label Mar 21, 2019

felixbarny merged commit 9379f5f into elastic:master Mar 21, 2019

zube bot added [zube]: Done and removed [zube]: In Progress labels Mar 21, 2019

alvarolobato removed the [zube]: Done label Mar 25, 2019

	# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilizationif the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.
	# The consequence is that the 'Metrics' tab of a service does not show process-global metrics like CPU utilization if the corresponding service name has been auto-discovered based on the `web.xml`'s `display-name` or the servlet context path.

Use display-name or context path as the default service name #514

Use display-name or context path as the default service name #514

Uh oh!

Conversation

felixbarny commented Mar 5, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eyalkoren left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

eyalkoren Mar 7, 2019

Choose a reason for hiding this comment

Uh oh!

felixbarny Mar 11, 2019

Choose a reason for hiding this comment

Uh oh!

eyalkoren Mar 12, 2019

Choose a reason for hiding this comment

Uh oh!

Uh oh!

eyalkoren commented Mar 7, 2019

Uh oh!

felixbarny commented Mar 11, 2019

Uh oh!

felixbarny commented Mar 11, 2019

Uh oh!

felixbarny commented Mar 11, 2019

Uh oh!

kuisathaverat commented Mar 11, 2019

Uh oh!

eyalkoren commented Mar 12, 2019

Uh oh!

codecov-io commented Mar 12, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

felixbarny commented Mar 20, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

felixbarny commented Mar 20, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eyalkoren left a comment

Choose a reason for hiding this comment

Uh oh!

eyalkoren Mar 21, 2019

Choose a reason for hiding this comment

Uh oh!

eyalkoren Mar 21, 2019

Choose a reason for hiding this comment

Uh oh!

felixbarny commented Mar 21, 2019

Uh oh!

eyalkoren commented Mar 21, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

felixbarny commented Mar 5, 2019 •

edited

Loading

codecov-io commented Mar 12, 2019 •

edited

Loading

felixbarny commented Mar 20, 2019 •

edited

Loading

felixbarny commented Mar 20, 2019 •

edited

Loading