Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Many threads blocked in LRUMap (with ScalaObjectMapper) -- increase type cache size? #428

Open
gmethvin opened this issue Nov 2, 2018 · 15 comments

Comments

@gmethvin
Copy link

gmethvin commented Nov 2, 2018

(NOTE: moved from jackson-databind on 11-Sep-2019 by @cowtowncoder )

Version: Jackson 2.8.11, probably also 2.9.7

We have an API server that receives requests then makes ElasticSearch requests using the elastic4s library. The JSON parsing is done using Jackson under the hood. We’ve noticed a lot of threads blocked on the synchronized blocks in the LRUMap data structure, which seems to be contributing to slow performance on our API server.

Several threads are blocked on puts to the map, for example:

   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.fasterxml.jackson.databind.util.LRUMap.putIfAbsent(LRUMap.java:58)
	- waiting to lock <0x0000000629be2270> (a com.fasterxml.jackson.databind.util.LRUMap)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1287)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory.constructMapLikeType(TypeFactory.java:807)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType(ScalaObjectMapper.scala:68)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType$(ScalaObjectMapper.scala:56)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.constructType(JacksonSupport.scala:11)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.$anonfun$constructType$1(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper$$Lambda$2573/1098099028.apply(Unknown Source)
	at scala.collection.immutable.List.map(List.scala:286)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType$(ScalaObjectMapper.scala:56)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.constructType(JacksonSupport.scala:11)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.$anonfun$constructType$1(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper$$Lambda$2573/1098099028.apply(Unknown Source)
	at scala.collection.immutable.List.map(List.scala:286)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType$(ScalaObjectMapper.scala:56)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.constructType(JacksonSupport.scala:11)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.$anonfun$constructType$1(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper$$Lambda$2573/1098099028.apply(Unknown Source)
	at scala.collection.immutable.List.map(List.scala:286)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType$(ScalaObjectMapper.scala:56)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.constructType(JacksonSupport.scala:11)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.readValue(ScalaObjectMapper.scala:190)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.readValue$(ScalaObjectMapper.scala:189)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.readValue(JacksonSupport.scala:11)
	at com.sksamuel.elastic4s.http.ResponseHandler$.fromEntity(ResponseHandler.scala:34)
	at models.es.ESHttpClientBackend$GetMappingHttpExecutable$$anon$1.handle(ESHttpClientBackend.scala:575)
	at com.sksamuel.elastic4s.http.HttpClient.$anonfun$execute$1(HttpClient.scala:62)
	at com.sksamuel.elastic4s.http.HttpClient$$Lambda$2539/1265612311.apply(Unknown Source)
	at scala.util.Success.$anonfun$map$1(Try.scala:251)
	at scala.util.Success.map(Try.scala:209)
	at scala.concurrent.Future.$anonfun$map$1(Future.scala:288)
	at scala.concurrent.Future$$Lambda$1146/1816269091.apply(Unknown Source)
	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:29)
	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
	at scala.concurrent.impl.Promise$$Lambda$609/1310865036.apply(Unknown Source)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

and another thread blocked on clear():

   java.lang.Thread.State: BLOCKED (on object monitor)
	at java.util.concurrent.ConcurrentHashMap.clear(ConcurrentHashMap.java:1193)
	- locked <0x0000000625710000> (a java.util.concurrent.ConcurrentHashMap$TreeBin)
	at com.fasterxml.jackson.databind.util.LRUMap.clear(LRUMap.java:70)
	at com.fasterxml.jackson.databind.util.LRUMap.putIfAbsent(LRUMap.java:60)
	- locked <0x0000000629be2270> (a com.fasterxml.jackson.databind.util.LRUMap)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1287)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1396)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromParamType(TypeFactory.java:1400)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromAny(TypeFactory.java:1165)
	at com.fasterxml.jackson.databind.type.TypeFactory._resolveSuperInterfaces(TypeFactory.java:1311)
	at com.fasterxml.jackson.databind.type.TypeFactory._fromClass(TypeFactory.java:1254)
	at com.fasterxml.jackson.databind.type.TypeFactory.constructMapLikeType(TypeFactory.java:807)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType(ScalaObjectMapper.scala:68)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType$(ScalaObjectMapper.scala:56)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.constructType(JacksonSupport.scala:11)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.$anonfun$constructType$1(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper$$Lambda$2573/1098099028.apply(Unknown Source)
	at scala.collection.immutable.List.map(List.scala:286)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType(ScalaObjectMapper.scala:64)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.constructType$(ScalaObjectMapper.scala:56)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.constructType(JacksonSupport.scala:11)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.readValue(ScalaObjectMapper.scala:190)
	at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper.readValue$(ScalaObjectMapper.scala:189)
	at com.sksamuel.elastic4s.json.JacksonSupport$$anon$1.readValue(JacksonSupport.scala:11)
	at com.sksamuel.elastic4s.http.ResponseHandler$.fromEntity(ResponseHandler.scala:34)
	at models.es.ESHttpClientBackend$GetMappingHttpExecutable$$anon$1.handle(ESHttpClientBackend.scala:575)
	at com.sksamuel.elastic4s.http.HttpClient.$anonfun$execute$1(HttpClient.scala:62)
	at com.sksamuel.elastic4s.http.HttpClient$$Lambda$2539/1265612311.apply(Unknown Source)
	at scala.util.Success.$anonfun$map$1(Try.scala:251)
	at scala.util.Success.map(Try.scala:209)
	at scala.concurrent.Future.$anonfun$map$1(Future.scala:288)
	at scala.concurrent.Future$$Lambda$1146/1816269091.apply(Unknown Source)
	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:29)
	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
	at scala.concurrent.impl.Promise$$Lambda$609/1310865036.apply(Unknown Source)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Based on heap dumps, we see that the _typeCache map in TypeFactory is getting filled up quickly with type information for the JSON objects we’re parsing, then the map is getting cleared, then the process is repeated quickly. Since the clear() operation is relatively expensive and inside a synchronized block, it blocks all other threads from making progress, and this can happen repeatedly in a short period of time.

We could replace the ConcurrentHashMap with a synchronized LinkedHashMap with an LRU eviction policy, and/or increase the size of the cache. Either way, clearing the cache seems like an inefficient solution, especially when many threads are trying to use the cache.

@cowtowncoder
Copy link
Member

TL;DNR: I am open to increasing cache size, but starting with Scala module since I think this is likely to be due to Scala or Scala module's usage (and possibly sheer number of throw-away classes).

Aside from that, if at all possible, I would suggest you to see if it would be possible to pre-resolve types (Class, TypeReference) into JavaType, and reuse those -- JavaType instances are immutable and safe to hold on to. My experience on trying to work around cache concurrency issues is that it tends to be losing battle at this level, and that in situations where bottleneck exists here, something else is likely the root cause.

So: before considering potential for improving cache handling in TypeFactory, I would want to know why there is a bottleneck there. It is possible that some Scala generic types are expensive (and if so, maybe scala module's sub-class of TypeFactory could/should be improved to optimize for those cases). And/or maybe there is just a huge number of generated throw-away classes from all advanced features.

Specifically as to clear(): what that tells is simply that working set of types to resolve is bigger than max cache size. I do not think eviction mechanism would help a lot. Disabling caching would probably reduce contention, but in the big picture that would just increase time spent on type resolution -- sure, threads would not be blocked, but now they would be using CPU for same amount of time.
The reason "clear if full" was chosen is actually simple: this allows avoiding minimizing locking for (assumed) common case of finding entry from cache, and only locking on write (modification).
Earlier we found out the hard way that LinkedHashMap is not thread-safe for reads (when it has to maintain the lru aspect).

So... first things first. I think increasing cache size for Scala module probably makes sense. This does not require change to jackson-databind, and I have not yet seen evidence that this would be common problem on Java JVM (although maybe it could become, with Java 8+ closures?).
I am not convinced yet of benefit of going back to LinkedHashMap, although I am not totally opposed (for 2.10 or 3.0; no go for 2.9.x patches).

@gmethvin
Copy link
Author

gmethvin commented Nov 3, 2018

I suspect the problem is more likely to happen in Scala because Scala code tends to have more types, particularly with the Scala collection hierarchy, but I don't think it's fair to characterize this as a Scala problem. I can easily imagine a Java app having over 200 types in this cache, especially when using lots of nested types for domain models. To me it makes sense to change the default cache size to something much higher, maybe 1000 or so.

Changing the cache implementation is a more involved discussion. If you're concerned about needing a write lock in LinkedHashMap, then you can evict based on insertion order rather than access order. That way reads don't require writing to the map. That would be a lot less expensive than clear() and I think would satisfy your other requirements.

@cowtowncoder cowtowncoder changed the title Many threads blocked in LRUMap Many threads blocked in LRUMap (with ScalaObjectMapper) Sep 11, 2019
@cowtowncoder
Copy link
Member

@pjfanning On Scala side, do you think it'd be easy to change ScalaObjectMapper to construct differently configure TypeFactory? Instead of using no-args constructor, it can pass pre-constructed LRUMap, using bigger size (like 800 max size instead of 200).

@pjfanning
Copy link
Member

pjfanning commented Sep 11, 2019

@cowtowncoder ScalaObjectMapper is a mixin for ObjectMapper (as opposed to being a subclass).

val mapper = new ObjectMapper() with ScalaObjectMapper

ScalaObjectMapper gets the TypeFactory from the ObjectMapper.

You can create a TypeFactory with a custom LRUMap by using something like val factory = TypeFactory.defaultInstance().withCache(cache: LRUMap[Object, JavaType]).

You can then call mapper.setTypeFactory(factory).

@cowtowncoder
Copy link
Member

@pjfanning Ok. This is why I asked, as I can't even read Scala well. :)

Anyway, how about I transfer this issue to jackson-module-scala since I think it has not been reported for Java or Kotlin.

@cowtowncoder cowtowncoder changed the title Many threads blocked in LRUMap (with ScalaObjectMapper) Many threads blocked in LRUMap (with ScalaObjectMapper) -- increase type cache size? Sep 12, 2019
@cowtowncoder cowtowncoder transferred this issue from FasterXML/jackson-databind Sep 12, 2019
@pjfanning
Copy link
Member

This code can be used to create an ObjectMapper with a larger typeCache (1000 max entries instead of 200).

  val typeCache = new LRUMap[Any, JavaType](16, 1000)
  val typeFactory = TypeFactory.defaultInstance().withCache(typeCache)
  val mapper = new ObjectMapper() with ScalaObjectMapper
  mapper.setTypeFactory(typeFactory)
  mapper.registerModule(DefaultScalaModule)

@pjfanning
Copy link
Member

@cowtowncoder there is no easy solution to making scala ObjectMappers have a larger default size for the LRUMap. Would it be ok to support a system property that allows users to override the default max of 200?

@cowtowncoder
Copy link
Member

I did not mean to change default size for LRUMap, but rather replacing TypeFactory used with one constructed with LRUMap created with explicit larger size. Changing defaults would affect many places where current defaults are already big enough.

@pjfanning
Copy link
Member

@cowtowncoder my suggestion is to leave the LRUMap with a default max of 200 but that if a system property is set then this value will override the default. #428 (comment) -- provides a workaround that affected users can use.

@cowtowncoder
Copy link
Member

Ok: unfortunately I am pretty strongly against use of system properties -- they are stateful global singletons, and (in my opinion) something that libraries should use rarely if ever. One of design goals with Jackson mappers is to keep them isolated, with their own configuration, and global state settings go against that goal. So I don't want to add support that way in jackson-databind.

Then again, I am not sure if this is problem any more: or if it is, whether cache size change would help significantly. Maybe it is more a symptom of the way Scala mapper overuses type resolution.

@cowtowncoder
Copy link
Member

@pjfanning Would the recent changes to 2.12 help here?

@pjfanning
Copy link
Member

@cowtowncoder You still need to explicity set a new type cache - eg #428 (comment)

@cowtowncoder
Copy link
Member

@pjfanning Ah. Hmmh. Yes, of course, it was the global cache, not specific to module.

I think we went over same ideas, including bigger changes to allow users to configure caching aspects more generally.
But that is not something that could quite fit in 2.12 timeframe...

I think I'll add a note somewhere on this, as "maybe in 2.13" kind of thing.

@cowtowncoder
Copy link
Member

Older me thought of this back then already :)

FasterXML/jackson-databind#2502

@cowtowncoder
Copy link
Member

While this issue is not necessarily completely solved, it may be significantly helped by:

FasterXML/jackson-databind#3530

which will be in Jackson (databind) 2.14. Specifically it should help by using actual LRU implementation so that if the working set fits in the maximum size there won't be "clear all" events which may be what is happening currently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants