We've recently been profiling one of our applications that uses re2j quite heavily, and we're seeing a fair amount of thread contention coming from RE2 during concurrent pattern matching (i.e. multiple threads using the same Pattern instance).
Both RE2#get and RE2#put synchronize on the monitor, whether to obtain a cached Machine instance or to instantiate a new one when none is cached.
Here's a small example that, when run, demonstrates the blocking thread behavior:
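To make the contention point concrete, here is a minimal sketch of a monitor-guarded cache of the kind described above. It is a simplified model, not re2j's actual source: `Machine` is a hypothetical stand-in, and `SynchronizedMachinePool` and `createdCount` are names invented for illustration.

```java
import java.util.ArrayDeque;

// Hypothetical stand-in for per-match state; NOT the real re2j Machine class.
class Machine {
    final int id;
    Machine(int id) { this.id = id; }
}

// Simplified model (an assumption, not re2j's code) of a cache guarded by one monitor.
class SynchronizedMachinePool {
    private final ArrayDeque<Machine> cache = new ArrayDeque<Machine>();
    private int created = 0;

    // Every caller takes the same monitor, so concurrent matchers queue up here.
    synchronized Machine get() {
        Machine m = cache.pollFirst();
        if (m == null) {
            m = new Machine(created++); // nothing cached: build a new one
        }
        return m;
    }

    // Returning a machine to the cache takes the same monitor again.
    synchronized void put(Machine m) {
        cache.addFirst(m);
    }

    synchronized int createdCount() {
        return created;
    }
}
```

With many threads sharing one pool, each match pays for two monitor acquisitions (one `get`, one `put`), which is where blocked threads would show up in a profiler.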
The thread profile (via YourKit) looks as follows:
I did some JMH profiling with various other concurrent data structures, but all of them seem to perform slightly worse than the current implementation at lower thread counts (within error bounds anyway). A ConcurrentLinkedQueue or Deque seems to be slightly more performant at higher thread counts. However, even if they were better, many of these more "exotic" classes are not available in GWT anyway or they're at a higher JDK language level (re2j uses 1.6 features now), so I think that precludes them from use.
That said, I was wondering if there were any thoughts on how this synchronization bottleneck could be removed, or at least improved a little, in the case that there are many threads accessing the same Pattern instance?
Our current approach is to just use ThreadLocal Patterns in our application code (as we have bounded thread pools). I see that the ThreadLocal approach was adopted in re2j too, but it led to some memory leaks, so it was reverted.
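For anyone wanting to try the same workaround, a minimal sketch of a per-thread holder might look like the following. `PerThread` and `create` are hypothetical names, not part of re2j; the sketch sticks to Java 6 syntax and is generic so it compiles without the library on the classpath.

```java
// Hypothetical helper (not part of re2j): keeps one instance of T per thread.
abstract class PerThread<T> {
    private final ThreadLocal<T> local = new ThreadLocal<T>() {
        @Override
        protected T initialValue() {
            // Called once per thread, on that thread's first get().
            return create();
        }
    };

    // Builds the per-thread instance, e.g. by compiling a pattern.
    protected abstract T create();

    T get() {
        return local.get();
    }

    // Drops the calling thread's instance so it can be garbage collected.
    void remove() {
        local.remove();
    }
}
```

In application code, `create()` would return something like `com.google.re2j.Pattern.compile(regex)`, so no two threads ever share a Pattern. This only stays bounded when the thread pool is bounded, and it costs one compiled pattern per thread.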
This definitely isn't a dealbreaker for us, and it's not really a bug either, just food for thought.
Re2j has been great! Thanks!
I was looking at that. While it would be possible to use a non-locking structure to keep track of the cache, it's quite complicated and error-prone. It might be easier to let people get a non-thread-safe instance that avoids the complication of managing a pool of Machines, if they are prepared to have one per thread.
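As a rough illustration of what a non-locking cache could look like, here is a sketch of a Treiber-style stack built on `AtomicReference`. `LockFreePool`, `poll`, and `offer` are hypothetical names, this is not what re2j does, and whether `AtomicReference` is usable under GWT is not checked here.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical lock-free pool (Treiber stack); a sketch, not re2j code.
class LockFreePool<T> {
    // Immutable node; a fresh node is allocated on every offer().
    private static final class Node<T> {
        final T item;
        final Node<T> next;
        Node(T item, Node<T> next) {
            this.item = item;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<Node<T>>();

    // Returns a cached item, or null if the pool is empty
    // (the caller would then construct a new instance itself).
    T poll() {
        while (true) {
            Node<T> current = head.get();
            if (current == null) {
                return null;
            }
            // Retry if another thread changed the head in the meantime.
            if (head.compareAndSet(current, current.next)) {
                return current.item;
            }
        }
    }

    // Pushes an item back onto the pool without taking a lock.
    void offer(T item) {
        while (true) {
            Node<T> current = head.get();
            Node<T> node = new Node<T>(item, current);
            if (head.compareAndSet(current, node)) {
                return;
            }
        }
    }
}
```

Threads never block here, but under heavy contention they retry the compare-and-set instead, so this would still need benchmarking against the synchronized version before drawing conclusions.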