Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better service name and version detection #1327

Closed
7 tasks done
felixbarny opened this issue Aug 6, 2020 · 21 comments
Closed
7 tasks done

Better service name and version detection #1327

felixbarny opened this issue Aug 6, 2020 · 21 comments

Comments

@felixbarny
Copy link
Member

felixbarny commented Aug 6, 2020

Currently, the only way to take advantage of the deployment tracking is to use the service_version configuration option.

We could get the version from the MANIFEST.MF Implementation-Version entries.

While doing that, we might also want to read the Implementation-Title entry, if available to determine the service.name.

Spring Boot seems to set both of these entries.

Tasks

@samilAyoub
Copy link

@felixbarny Could you give me a few pointers to deal with that?

@SylvainJuge SylvainJuge added the new-feature New feature label Nov 26, 2020
@SylvainJuge SylvainJuge added this to the 7.11 milestone Nov 26, 2020
@AlexanderWert AlexanderWert added this to Planned in APM-Agents (OLD) via automation Mar 1, 2021
@AlexanderWert AlexanderWert modified the milestones: 7.11, 7.12 Mar 1, 2021
@AlexanderWert AlexanderWert modified the milestones: 7.12, 7.13, 7.14 Mar 24, 2021
@felixbarny
Copy link
Member Author

I gave a bit of thought after reading #1725 (comment)

I think we can make it more generic than just doing it specifically for the Servlet API.
Already today, each transaction is started in the context of a specific class loader. When co.elastic.apm.agent.impl.ElasticApmTracer#overrideServiceNameForClassLoader has been called before with that specific class loader, the overridden service name will be used.

What we could do on the first invocation of co.elastic.apm.agent.impl.ElasticApmTracer#getServiceName for a given classloader is trying to load the META-INF/MANIFEST.MF for that classloader and to override the service name and service version based on that.

That would probably work generically for any framework by looking up the service name based on the manifest entry when the first transaction of any given service starts. It can also help to determine the service name early enough in the startup process for the automatic ECS log reconfiguration (#1261) to add the correct service.name to the logs.

It needs validation but it seems worth trying it out.

@tobiasstadler
Copy link
Contributor

I played with it a little bit here: https://github.com/tobiasstadler/apm-agent-java/tree/poc-service-name-and-version. It lacks tests and documentation, but It my it (mostly) works for me.

@felixbarny
Copy link
Member Author

This looks great! It has one issue, though. It assumes that the agent is attached at startup via -javaagent. However, we also support runtime attachment. At the point in time when the agent is runtime-attached, all the initializers might have already run. Is there a particular reason why you preferred resolving the service name and version so early in the startup process, as opposed to resolving them when the first transaction is started? The latter seems easier, wouldn't require technology-specific instrumentations, and works with runtime attachment. But maybe I'm missing something why it's important to resolve the service name/version early.

@felixbarny
Copy link
Member Author

Another thing to consider is that a single JVM may have multiple service names, for example when multiple web applications (e.g. war files) are deployed to a single Tomcat instance. I'm referring specifically to the void overrideServiceName(@Nullable String serviceName) method In your POC.

@tobiasstadler
Copy link
Contributor

I implemented early service name/version detection because in case of Application Servers transactions might already be created before the first request hits the application. E.g. during deployment using startup EJBs or CDI events or after deployment using scheduled EJBs. Also the are applications which might not have a http interface because they only process JMD/Kafka/... messages. In this cases the transactions will have the „wrong“ service name/version, if more than one application is deployed on the same application server.

@tobiasstadler
Copy link
Contributor

The service name set via overrideServiceName(String serviceName) is only used if no service name is set via overrideServiceNameForClassLoader(ClassLoader classLoader, String serviceName) for the application classloader. So deploying multiple wars in one application server is not a problem. Infact, As I am actually deploying multiple war in one application server, it is important to me that this works ;-)

@tobiasstadler
Copy link
Contributor

Yes my approach won‘t work with runtime attachment. But as I only use the javaagent and I have intention to change that, I haven‘t put much effort in supporting runtime attachment yet.

@felixbarny
Copy link
Member Author

I implemented early service name/version detection because in case of Application Servers transactions might already be created before the first request hits the application.

That would also be handled with my suggestion: on the first transaction, no matter if it's a servlet request or a CDI event, the service name would be determined using the initiating class loader that's a required parameter when creating a transaction. That works with both runtime and startup attachment and doesn't require implementing technology-specific startup instrumentations.

@tobiasstadler
Copy link
Contributor

I implemented early service name/version detection because in case of Application Servers transactions might already be created before the first request hits the application.

That would also be handled with my suggestion: on the first transaction, no matter if it's a servlet request or a CDI event, the service name would be determined using the initiating class loader that's a required parameter when creating a transaction. That works with both runtime and startup attachment and doesn't require implementing technology-specific startup instrumentations.

I am fine with that. But one still has to implement service name/version detection for the none servlet case.

@felixbarny
Copy link
Member Author

But one still has to implement service name/version detection for the none servlet case.

But no matter if servlet or not, the service name detection would work the same way, by looking up the META-INF/MANIFEST.MF file of the initiating class loader, right? Or am I missing something?

@tobiasstadler
Copy link
Contributor

But one still has to implement service name/version detection for the none servlet case.

But no matter if servlet or not, the service name detection would work the same way, by looking up the META-INF/MANIFEST.MF file of the initiating class loader, right? Or am I missing something?

There might be more than one META-INF/MANIFEST.MF on the class/module path. How would you like to determine which one is the "correct" one?

@tobiasstadler
Copy link
Contributor

As the order of resources is more or less class loader dependent, I think it is quite hard to finde the application META-INF/MANIFEST.MF

@felixbarny
Copy link
Member Author

@tobiasstadler
Copy link
Contributor

I think it is safe to use ServletContext.getResource/getResourceAsStream here, because the Servlet specification states: "These methods will first search the root of the web application context for the requested resource before looking at any of the JAR files in the WEB-INF/lib directory" => The war itself is scanned before any dependency.

The specification also states: "The getResource and getResourceAsStream methods take a String with a leading “/” as an argument that gives the path of the resource relative to the root of the context or relative to the META-INF/resources directory of a JAR file inside the web application’s WEB-INF/lib directory". So in my opinion ServletContext.getResource("META-INF/MANIFEST") should not return a manifest file from a dependency. But I am not really sure if I read hat correctly.

On the other side ClassLoader.getResource only states "This method will first search the parent class loader for the resource; if the parent is null the path of the class loader built-in to the virtual machine is searched. That failing, this method will invoke findResource(String) to find the resource.". So I don't think one can trust ClassLoader.getResource("META-INF/MANIFEST") returning the application' manifest file

@tobiasstadler
Copy link
Contributor

tobiasstadler commented Apr 19, 2021

@felixbarny I find you approach very interesting, better und more maintainable then what I did. I just don't know how to find the right manifest file.

@felixbarny
Copy link
Member Author

That's a very valid point and I haven't considered it. I'm also not sure how we could reliably find the right manifest file. Maybe we can get the resources as URLs and check whether they're inside WEB-INF/lib vs META-INF/MANIFEST.MF for web applications. Not quite sure how to handle it more generically for other types of applications, though.

@felixbarny
Copy link
Member Author

Some observations using Spring Boot 2.0 and embedded Tomcat:

  • servletContext.getResource("/META-INF/MANIFEST.MF") returns null
  • Collections.list(servletContext.getClassLoader().getResources("META-INF/MANIFEST.MF")) returns 88 URLs, including
0 = {URL@16447} "jar:file:/Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home/jre/lib/ext/jfxrt.jar!/META-INF/MANIFEST.MF"
1 = {URL@16448} "jar:file:/Users/felixbarnsteiner/projects/github/spring-projects/spring-petclinic/target/spring-petclinic-2.0.0-SNAPSHOT.jar!/META-INF/MANIFEST.MF"
2 = {URL@16449} "jar:file:/private/var/folders/q5/c7npddds22scd3dk78sx73lr0000gn/T/elastic-apm-agent-df58ce3247b8ec69683fc08e68fe9f23-8dd5d80d86bec4d88129f73444422a82.jar!/META-INF/MANIFEST.MF"
3 = {URL@16450} "jar:file:/Users/felixbarnsteiner/projects/github/spring-projects/spring-petclinic/target/spring-petclinic-2.0.0-SNAPSHOT.jar!/BOOT-INF/lib/apm-agent-api-1.17.1-SNAPSHOT.jar!/META-INF/MANIFEST.MF"
... other dependency's manifest files
  • Collections.list(servletContext.getClassLoader().getResources("/META-INF/MANIFEST.MF")) (with a leading slash) returns 85 URLs, the same as without the slash, minus the first three ones.

Not sure if that's something we can generalize, such as always the second one or always the one before the agent's manifest file.

@tobiasstadler
Copy link
Contributor

@felixbarny How should we proceed? I would like to get #1726 or sth. similar committed, so I can use the undertow plugin of my poc (as an external plugin)

@felixbarny
Copy link
Member Author

Thanks to @tobiasstadler we made good progress on this. I've updated the issue description to track the different tasks.

Seems the attempt to generically get the manifest based just on the class loader is not fruitful. But as we can now get the manifest of standalone jars, we just need to focus on getting the manifest for different web applications. The ServletContext::getResource method seems to be a promising way to do that.

I implemented early service name/version detection because in case of Application Servers transactions might already be created before the first request hits the application.

Can we just instrument Filter::init and/or Servlet::init and get the ServletContext from there? This seems to be more generic rather than relying on container-specific hooks. In addition to that, we also need to get the ServletContext from the first request in case the agent has been attached at runtime rather than on startup.

Some observations using Spring Boot 2.0 and embedded Tomcat:

  • servletContext.getResource("/META-INF/MANIFEST.MF") returns null

That's not an issue because we can now get the manifest from the standalone jar directly, without relying on the ServletContext.

@felixbarny felixbarny changed the title Better service version detection Better service name and version detection Jan 27, 2022
@felixbarny
Copy link
Member Author

I think we can close this now as done. We have excellent service name and version detection now.

We might look into instrumenting container-specific hooks in the future to be able to detect the service name earlier. But that would just be an additional improvement on top. See also #2443 (comment)

APM-Agents (OLD) automation moved this from Planned to Done Feb 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

No branches or pull requests

5 participants