
Potential memory leak in WebMvcLinkBuilder #1701

Closed
dpasek-senacor opened this issue Nov 3, 2021 · 7 comments
Labels: in: core (Core parts of the project), stack: webmvc, type: bug


dpasek-senacor commented Nov 3, 2021

The usage of <T> T methodOn(Class<T> controller, Object... parameters) causes a memory leak when used with varying parameters.

Root cause:
methodOn uses DummyInvocationUtils to generate a proxy that records a method call. The proxy is then cached for later use, keyed by the unique combination of the controller class and the provided parameters.
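A minimal, hypothetical sketch of this caching behavior (class and field names are illustrative, not the actual Spring HATEOAS internals): because the cache key combines the controller class with the parameter values, every distinct parameter value creates a new entry, so varying parameters grow the cache without bound.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only - not the real DummyInvocationUtils code.
public class CacheKeySketch {

    // The key combines the controller class and the provided parameters.
    record CacheKey(Class<?> controller, List<Object> parameters) {}

    static final Map<CacheKey, Object> CACHE = new HashMap<>();

    static Object proxyFor(Class<?> controller, Object... parameters) {
        return CACHE.computeIfAbsent(
                new CacheKey(controller, Arrays.asList(parameters)),
                key -> new Object()); // stands in for the recorded-invocation proxy
    }

    public static void main(String[] args) {
        // Same controller class, varying parameter -> one entry per distinct value.
        for (int customerId = 0; customerId < 10_000; customerId++) {
            proxyFor(Object.class, String.valueOf(customerId));
        }
        System.out.println(CACHE.size()); // 10000 - grows without bound
    }
}
```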

According to the documentation of methodOn() the parameters are used to define template variables on the type level.

@param parameters parameters to extend template variables in the type level mapping.

Example:

@RestController
@RequestMapping("/api/resource/{resourceId}")
public class Controller {
    @GetMapping
    public ResponseEntity<Resource> getResource(@PathVariable String resourceId) {
        Resource resource = new Resource();
        resource.setResourceId(resourceId);
        return ResponseEntity.ok(resource);
    }
}

With a controller like this it is possible to create the link to the controller method with two approaches:

Link link1 = linkTo(methodOn(Controller.class, resourceId).getResource(null)).withSelfRel();
Link link2 = linkTo(methodOn(Controller.class).getResource(resourceId)).withSelfRel();

A developer might be tempted to use the first variant as the template variable is defined on the type level and not on the method level. The first option causes a memory leak, the second option is totally safe. The generated link is exactly the same in both options.

Test case:
spring-hateoas-memory-leak.zip

Workaround: never use methodOn() with parameters.

Solution:

  • Improve the documentation to clarify the usage of the parameters on the type/method level.
  • Improve the caching behavior in general: the cache (a ThreadLocal bound to HTTP execution threads) can cause problems, as consecutive requests for the same resource are executed on different threads. This multiplies the memory footprint of the cached proxies without improving efficiency, due to cache misses.
    The cache should be limited in size using an LRU strategy and potentially shared between threads to increase the hit ratio and reduce the memory footprint.
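The suggested size limit could be sketched with the JDK's LinkedHashMap in access order (a hypothetical stand-in; the capacity of 256 is an illustrative pick, and Spring also ships org.springframework.util.ConcurrentLruCache for this purpose):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a size-limited LRU cache using only the JDK.
public class LruCacheSketch {

    static <K, V> Map<K, V> lruCache(int maxEntries) {
        // access-order LinkedHashMap evicts the least recently used entry
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Object> cache = lruCache(256);
        for (int i = 0; i < 100_000; i++) {
            cache.put("key-" + i, new Object());
        }
        System.out.println(cache.size()); // capped at 256
    }
}
```

Note that LinkedHashMap is not thread-safe; a cache shared between threads would additionally need synchronization or a concurrent implementation.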

metters commented Nov 3, 2021

FYI: The headline is supposed to be "Potential memory leak in WebMvcLinkBuilder and WebFluxLinkBuilder"

@dpasek-senacor dpasek-senacor changed the title Potential memory link in WebMvcLinkBuilder and WebFluxLinkBuilder Potential memory leak in WebMvcLinkBuilder and WebFluxLinkBuilder Nov 3, 2021
@reda-alaoui (Contributor)

In addition, DummyInvocationUtils.CACHE ThreadLocal leaks the application classloader in a Tomcat container.

@odrotbohm (Member)

@reda-alaoui – Would you mind opening a separate ticket for that problem? It looks like we could just switch from the type itself to the fully qualified name of the class in the cache key.

@odrotbohm (Member)

I'll look into the memory leak ASAP. Regarding the comments on the caching and the ThreadLocal usage in general: we've had several rounds of optimizations in that area already. A shared cache, e.g. via a static ConcurrentHashMap, has performed significantly worse than the current approach in JMH benchmarks (see the commit introducing the cache for details). We ended up with the current solution primarily to optimize for the creation of links pointing to the same resource during a single request, as it doesn't make much sense to compute those links multiple times, while at the same time not accumulating cached instances across requests.

I am happy to tweak things further or in a different direction but only if the suggestions made are backed by improving numbers of a JMH benchmark for review as it's easy to over-optimize for a very particular use case and – by doing that – significantly impact other, likely also very common use cases.

@odrotbohm odrotbohm changed the title Potential memory leak in WebMvcLinkBuilder and WebFluxLinkBuilder Potential memory leak in WebMvcLinkBuilder and WebFluxLinkBuilder Nov 23, 2021
@odrotbohm odrotbohm self-assigned this Nov 23, 2021
@odrotbohm odrotbohm added in: core Core parts of the project stack: webmvc type: bug labels Nov 23, 2021
@odrotbohm odrotbohm added this to the 1.5 M1 milestone Nov 23, 2021
@odrotbohm odrotbohm changed the title Potential memory leak in WebMvcLinkBuilder and WebFluxLinkBuilder Potential memory leak in WebMvcLinkBuilder Nov 23, 2021
odrotbohm added a commit that referenced this issue Nov 23, 2021
We now constrain the cache of controller proxy instances to 256 elements using Spring's ConcurrentLruCache to avoid the unbounded accumulation of instances created via DummyInvocationUtils.methodOn(Class<?>, Object…). The parameters are part of the cache key and are used to expand the type-level mappings. If those vary for each call and a request creates a lot of links (>100000), the memory consumption grows significantly and, first and foremost, indefinitely.

Using the ThreadLocal still makes sure that the cache is local to the current request, so the proxies can actually be reused; the method invocations used to record the mappings would otherwise interfere for concurrent requests.

Removed obsolete generic parameter on the CacheKey type.
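The shape of the fix described above, per-thread scoping plus a hard size limit, might be sketched as follows. This is a hypothetical stand-in: a LinkedHashMap-based LRU replaces Spring's actual ConcurrentLruCache so the sketch stays dependency-free, and the names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: per-thread, size-limited proxy cache (not the actual Spring HATEOAS code).
public class ThreadLocalLruSketch {

    static final int SIZE_LIMIT = 256; // the limit chosen in the referenced commit

    static final ThreadLocal<Map<String, Object>> CACHE =
            ThreadLocal.withInitial(() -> new LinkedHashMap<String, Object>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                    return size() > SIZE_LIMIT;
                }
            });

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 1_000; i++) {
            CACHE.get().put("key-" + i, new Object());
        }
        System.out.println(CACHE.get().size()); // bounded at 256 on this thread

        // A different thread starts with its own, empty cache.
        Thread other = new Thread(() -> System.out.println(CACHE.get().size()));
        other.start();
        other.join();
    }
}
```

The per-thread scoping keeps the recording proxies safe from concurrent invocations, while the LRU bound prevents the indefinite growth reported in this issue.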
odrotbohm added a commit that referenced this issue Nov 23, 2021
The commits for #467 significantly degraded performance as CachingMappingDiscoverer.getParams(Method) doesn't properly cache the result of the call which causes quite expensive, unnecessarily repeated annotation lookups for the very same method. We also avoid the creation of an Optional instance for the sole purpose of a simple null check.

Introduce FormatterFactory to potentially cache the to-String formatting functions and thus avoid repeated evaluation and Function object creation.

We now also avoid the creation of ParamRequestCondition instances if no @RequestParams(params = …) values could be found in the first place.
@odrotbohm (Member)

I've limited the number of instances held in the cache using a ConcurrentLruCache. We need to keep the ThreadLocal around to shield the proxies from concurrent invocations, as they only keep the last method invocation around.

The provided sample keeps a constant memory load now. That said, it's a very atypical scenario, as requests usually do not produce tens of thousands of links per request, and the memory pressure should be resolved due to the ThreadLocal in use. I'd be interested to learn how the issue manifested in production.

I also took the chance to run some benchmarks, and it turned out that the fix for #467 introduced a severe performance regression that I've now fixed by adding proper caching to the expensive (and superfluous) annotation lookups. Backported to 1.4.x via #1723, too.

@dpasek-senacor (Author)

Hi @odrotbohm,

I'd be interested to learn how the issue manifested in production.

The problem manifested as a classical out-of-memory error after days/weeks of our servers running. By analysing the heap dump I could see that each HTTPS execution thread of Tomcat (in Spring Boot) had a ThreadLocal instance of the cache hash map containing tens of thousands of proxy instances. Each cache entry accounted for about 1-2KB of memory. In total we had 300-400 MB of heap occupied by the caches. Since our JVMs are limited to 1GB, this slowly ate up the free memory, causing additional GCs, especially full GCs (G1 Old). These GCs do not free up memory, as the cache is not built on weak references, so the problem did not go away. Finally, GC CPU load reached 50% and OOMs happened.
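A back-of-envelope check of these figures (all inputs are the approximate values from the comment above, not measured constants):

```java
// Rough estimate of the heap occupied by the per-thread caches.
public class HeapEstimate {
    public static void main(String[] args) {
        long entriesPerThread = 20_000; // "tens of thousands of proxy instances"
        long bytesPerEntry = 1_500;     // "about 1-2KB of memory" per cache entry
        long threads = 12;              // pool minimum of "10-15" execution threads
        long totalBytes = entriesPerThread * bytesPerEntry * threads;
        System.out.println(totalBytes / (1024 * 1024)); // prints 343 (MB)
    }
}
```

Roughly 343 MB, which is in line with the observed 300-400 MB of heap held by the caches.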

The reason for this is quite simple. We have a customer base of a few million customers, and a few endpoints built their links using the customerId in the methodOn() part, as it is defined in the type-level annotation.

@RestController
@RequestMapping("/api/customers/{customerId}")
public class Controller {
}
...
Link link = linkTo(methodOn(Controller.class, customerId).getResource(null)).withSelfRel();

Since customer requests are not pinned to a certain HTTPS execution thread, each customer interacting with our API caused entries in more or less every thread's ThreadLocal cache. After days/weeks tens of thousands of customers had done the same, filling up the caches.
We could fix the problem by changing the code to

Link link1 = linkTo(methodOn(Controller.class).getResource(customerId)).withSelfRel();

But we still have at least one endpoint which uses the customerId in the type-level request mapping and where the customer ID is not part of the controller method, being just part of the URI: /api/customers/{customerId}/addresses/{addressId}.
The addressId is a UUID, so the controller method does not need the customerId as a method argument; therefore we have to define the customerId on the class level when building links.

@dpasek-senacor (Author) commented Nov 24, 2021

@odrotbohm
You wrote on your commit:

The parameters are part of the cache key and used to expand the type-level mappings. If those vary for each call and a request creates a lot of links (>100000) the memory consumption grows significantly, first and foremost indefinitely.

The problem did not occur because we generated a lot of links in a single request; that is not the case, as explained in my previous post. The ThreadLocals persist for the life cycle of the HTTP(S) execution threads in Tomcat, which come from a thread pool, i.e. these threads live for a long time (e.g. we have a minimum pool size of 10-15 per service). So these threads, and therefore the ThreadLocals, are never finalized.

In that regard, the cache size could in my opinion be bigger than the 256 entries used, as users usually interact with the API for a longer time in their user sessions, e.g. in a customer self-service portal. Allowing caching across multiple requests might be beneficial for longer user sessions.
