Avoid locking for put methods for RE2. Fixes #46 #121

charlesmunger · 2020-10-28T18:39:32Z

The locking could be avoided entirely by allocating wrapper objects for the stack nodes, but I'm not sure if that's desirable. I've also provided another approach that is more complex but offers lock-freedom when the cache is empty.

alandonovan · 2020-10-28T18:48:08Z

A commit message for a change to introduce a lock-free concurrent data structure needs to say a lot more about background, problem, alternative solutions, technical explanation, and benchmarks than the commit message for this change does.

charlesmunger · 2020-10-28T20:28:47Z

I've updated the commit description. I am not sure how to resolve the GWT failure.

alandonovan · 2020-10-28T21:25:42Z

java/com/google/re2j/RE2.java

@@ -213,25 +214,43 @@ int numberOfCapturingGroups() {
  // get() returns a machine to use for matching |this|.  It uses |this|'s
  // machine cache if possible, to avoid unnecessary allocation.
  Machine get() {
+    // Treiber stack (if reusing nodes) suffers from ABA problem on pop. Avoid by unlinking the
+    // entire stack, and stashing it in a pop-only stack guarded by a lock. This also reduces
+    // contention on the AtomicReference between putters and the getter.

> reduces contention

How is that? Calls to get must incur a semaphore (exclusive lock).

Having the two stacks and batch moves between them means that there's reduced CAS traffic on the AtomicReference, because getters either do a read (empty stack) or pop from their own stack, or import a whole batch, which amortizes the CAS on get.
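
To make the two-stack scheme described here concrete, the following is a minimal sketch, not the code from this change. `PooledMachine` and `TwoStackPool` are hypothetical names standing in for `Machine` and the pool inside `RE2`. Putters push onto a shared Treiber stack with a CAS; a getter holds the lock, pops from a private stack, and imports the whole shared stack in one atomic swap when the private one is empty.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for re2j's Machine: the pooled object is its own stack node.
final class PooledMachine {
  PooledMachine next; // intrusive link, so put() allocates nothing
  final int id;

  PooledMachine(int id) {
    this.id = id;
  }
}

// Sketch of a pool with a lock-free put() and a lock-guarded get().
final class TwoStackPool {
  // Shared stack: putters push here with CAS; single nodes are never popped from it.
  private final AtomicReference<PooledMachine> pushed = new AtomicReference<>();
  // Private pop-only stack: read and written only while holding the lock.
  private PooledMachine local;

  // Lock-free Treiber push. Reusing nodes is safe here because the new node's
  // link is set to whatever the head is at the moment the CAS succeeds.
  void put(PooledMachine m) {
    PooledMachine head;
    do {
      head = pushed.get();
      m.next = head;
    } while (!pushed.compareAndSet(head, m));
  }

  // Returns a pooled machine, or null if the pool is empty (the caller then allocates).
  synchronized PooledMachine get() {
    if (local == null) {
      if (pushed.get() == null) {
        return null; // empty pool: a plain read, no CAS
      }
      // Unlink the entire shared stack at once; this avoids the ABA-prone
      // compareAndSet(head, head.next) and amortizes the atomic operation.
      local = pushed.getAndSet(null);
      if (local == null) {
        return null;
      }
    }
    PooledMachine m = local;
    local = m.next;
    m.next = null;
    return m;
  }
}
```

In this sketch a `get()` that finds machines on the private stack performs no atomic operation on the `AtomicReference` at all, which is the reduction in CAS traffic being claimed; it still pays for entering the `synchronized` block.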

Right, but before reaching the read, they must first enter a synchronized block, whose semaphore requires an atomic decrement. How does the performance compare to a single stack?

I can't use a single stack without allocating a new object per put(). That would make the whole system lock-free, but would also mean allocating on every call (once there's more than one concurrent call ever).
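
For comparison, a single fully lock-free stack along the lines described above could look like the sketch below. `WrapperNodePool` is a hypothetical name, and this is an illustration of the trade-off rather than code proposed in this change: every `put()` allocates a fresh immutable wrapper node, which is what makes the CAS-based pop safe in a garbage-collected language.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a Treiber stack that allocates one wrapper node per put().
final class WrapperNodePool<T> {
  // Immutable node: once published, its next link never changes, and the
  // garbage collector will not recycle it while any thread still holds it.
  private static final class Node<T> {
    final T item;
    final Node<T> next;

    Node(T item, Node<T> next) {
      this.item = item;
      this.next = next;
    }
  }

  private final AtomicReference<Node<T>> head = new AtomicReference<>();

  // Lock-free push; costs one allocation per call (plus one per CAS retry).
  void put(T item) {
    Node<T> oldHead;
    Node<T> newHead;
    do {
      oldHead = head.get();
      newHead = new Node<>(item, oldHead);
    } while (!head.compareAndSet(oldHead, newHead));
  }

  // Lock-free pop; returns null when the pool is empty.
  T get() {
    Node<T> oldHead;
    do {
      oldHead = head.get();
      if (oldHead == null) {
        return null;
      }
    } while (!head.compareAndSet(oldHead, oldHead.next));
    return oldHead.item;
  }
}
```

Because a node is never pushed twice, the same reference cannot reappear at the head with a different successor, so the pop needs no lock. The cost is the per-put allocation that the pool was meant to avoid.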

Or I could use a single stack and guard the pop section with a lock - but this would mean locking and atomic operations on every get() call. You would, however, be able to use a double-checked-locking approach to avoid synchronization if the pool is empty:

    Machine head = machine.get();
    if (head != null) {
      // Lock necessary here because we're reusing nodes - otherwise suffer from ABA problem
      synchronized (this) {
        while (true) {
          head = machine.get();
          if (head == null) {
            break;
          }
          if (machine.compareAndSet(head, head.next)) {
            head.next = null;
            return head;
          }
          head = machine.get();
        }
      }
    }

That approach felt a bit more complicated and seemed like more of a departure from the existing code, but if you prefer it I can use it instead. Actual performance cost depends on target architecture, amount of contention, etc. The checked-in benchmarks for RE2 don't exercise threads, and even if there were concurrent benchmarks it'd be hard to generalize them across platforms.

charlesmunger · 2020-10-29T22:27:18Z

With my change:

Benchmark                           (impl)  (regex)  (repeats)  Mode  Cnt      Score      Error  Units
BenchmarkBacktrack.matched             JDK      N/A          5  avgt    5      0.546 ±    0.018  us/op
BenchmarkBacktrack.matched             JDK      N/A         10  avgt    5     18.374 ±    2.611  us/op
BenchmarkBacktrack.matched             JDK      N/A         15  avgt    5    699.405 ±   26.860  us/op
BenchmarkBacktrack.matched             JDK      N/A         20  avgt    5  27318.340 ± 3808.455  us/op
BenchmarkBacktrack.matched            RE2J      N/A          5  avgt    5      0.926 ±    0.049  us/op
BenchmarkBacktrack.matched            RE2J      N/A         10  avgt    5      2.876 ±    0.234  us/op
BenchmarkBacktrack.matched            RE2J      N/A         15  avgt    5      5.837 ±    0.156  us/op
BenchmarkBacktrack.matched            RE2J      N/A         20  avgt    5      9.333 ±    0.839  us/op
BenchmarkCompile.compile               JDK     DATE        N/A  avgt    5    657.609 ±   30.343  ns/op
BenchmarkCompile.compile               JDK    EMAIL        N/A  avgt    5    209.188 ±    4.231  ns/op
BenchmarkCompile.compile               JDK    PHONE        N/A  avgt    5    320.981 ±   13.061  ns/op
BenchmarkCompile.compile               JDK   RANDOM        N/A  avgt    5   1018.124 ±   18.428  ns/op
BenchmarkCompile.compile               JDK   SOCIAL        N/A  avgt    5    283.678 ±   12.677  ns/op
BenchmarkCompile.compile               JDK   STATES        N/A  avgt    5   1321.436 ±  141.171  ns/op
BenchmarkCompile.compile              RE2J     DATE        N/A  avgt    5   2649.448 ±   32.502  ns/op
BenchmarkCompile.compile              RE2J    EMAIL        N/A  avgt    5    959.094 ±   24.034  ns/op
BenchmarkCompile.compile              RE2J    PHONE        N/A  avgt    5   1513.050 ±  100.937  ns/op
BenchmarkCompile.compile              RE2J   RANDOM        N/A  avgt    5   5404.207 ±  118.983  ns/op
BenchmarkCompile.compile              RE2J   SOCIAL        N/A  avgt    5   1067.002 ±   11.441  ns/op
BenchmarkCompile.compile              RE2J   STATES        N/A  avgt    5   7525.129 ±  638.113  ns/op
BenchmarkFullMatch.matched             JDK      N/A        N/A  avgt    5    102.320 ±    7.576  ns/op
BenchmarkFullMatch.matched            RE2J      N/A        N/A  avgt    5    486.554 ±   22.257  ns/op
BenchmarkFullMatch.notMatched          JDK      N/A        N/A  avgt    5     61.846 ±    4.451  ns/op
BenchmarkFullMatch.notMatched         RE2J      N/A        N/A  avgt    5    437.695 ±   21.006  ns/op
BenchmarkSubMatch.findPhoneNumbers     JDK      N/A        N/A  avgt    5      3.030 ±    0.152  ms/op
BenchmarkSubMatch.findPhoneNumbers    RE2J      N/A        N/A  avgt    5     13.751 ±    1.253  ms/op

Without:

Benchmark                           (impl)  (regex)  (repeats)  Mode  Cnt      Score      Error  Units
BenchmarkBacktrack.matched             JDK      N/A          5  avgt    5      0.563 ±    0.023  us/op
BenchmarkBacktrack.matched             JDK      N/A         10  avgt    5     19.759 ±    0.452  us/op
BenchmarkBacktrack.matched             JDK      N/A         15  avgt    5    717.583 ±   20.566  us/op
BenchmarkBacktrack.matched             JDK      N/A         20  avgt    5  30465.382 ± 1059.643  us/op
BenchmarkBacktrack.matched            RE2J      N/A          5  avgt    5      1.052 ±    0.015  us/op
BenchmarkBacktrack.matched            RE2J      N/A         10  avgt    5      3.245 ±    0.226  us/op
BenchmarkBacktrack.matched            RE2J      N/A         15  avgt    5      6.825 ±    0.196  us/op
BenchmarkBacktrack.matched            RE2J      N/A         20  avgt    5     11.549 ±    1.129  us/op
BenchmarkCompile.compile               JDK     DATE        N/A  avgt    5    785.358 ±   20.005  ns/op
BenchmarkCompile.compile               JDK    EMAIL        N/A  avgt    5    255.705 ±   13.109  ns/op
BenchmarkCompile.compile               JDK    PHONE        N/A  avgt    5    385.573 ±   15.514  ns/op
BenchmarkCompile.compile               JDK   RANDOM        N/A  avgt    5   1195.869 ±  106.540  ns/op
BenchmarkCompile.compile               JDK   SOCIAL        N/A  avgt    5    343.272 ±   12.794  ns/op
BenchmarkCompile.compile               JDK   STATES        N/A  avgt    5   1569.633 ±  164.334  ns/op
BenchmarkCompile.compile              RE2J     DATE        N/A  avgt    5   3208.025 ±  225.822  ns/op
BenchmarkCompile.compile              RE2J    EMAIL        N/A  avgt    5   1136.255 ±   71.899  ns/op
BenchmarkCompile.compile              RE2J    PHONE        N/A  avgt    5   1803.507 ±  106.871  ns/op
BenchmarkCompile.compile              RE2J   RANDOM        N/A  avgt    5   6472.174 ±  236.211  ns/op
BenchmarkCompile.compile              RE2J   SOCIAL        N/A  avgt    5   1273.778 ±   17.144  ns/op
BenchmarkCompile.compile              RE2J   STATES        N/A  avgt    5   9010.883 ±  734.132  ns/op
BenchmarkFullMatch.matched             JDK      N/A        N/A  avgt    5    119.059 ±    6.297  ns/op
BenchmarkFullMatch.matched            RE2J      N/A        N/A  avgt    5    573.888 ±   15.052  ns/op
BenchmarkFullMatch.notMatched          JDK      N/A        N/A  avgt    5     69.659 ±    2.206  ns/op
BenchmarkFullMatch.notMatched         RE2J      N/A        N/A  avgt    5    515.374 ±   25.949  ns/op
BenchmarkSubMatch.findPhoneNumbers     JDK      N/A        N/A  avgt    5      3.509 ±    0.640  ms/op
BenchmarkSubMatch.findPhoneNumbers    RE2J      N/A        N/A  avgt    5     15.647 ±    2.617  ms/op

Looks like an overall improvement, even in the single-threaded case. Avoiding the lock and the ArrayDeque bookkeeping probably helps.

alandonovan · 2020-10-29T23:00:36Z

Nice; looks like about a 10% improvement across the board in the single threaded case. That probably paid for your time in CPUs already, never mind contention. : )

I don't see the latest changes we discussed yet. Ping me when they're ready.

alandonovan

Looks good; thanks for the optimization, and for your patience.

charlesmunger · 2021-02-17T00:45:49Z

Any updates on blockers for this CL?

sjamesr · 2021-03-11T19:58:26Z

This change should work against gwt 2.9.0, which you can specify in the build.gradle file.

Google-Wide Profiling indicated that this was a significant source of Java lock contention. This new approach uses a Treiber stack to make adding an operation back into the pool a lock-free operation. It uses the existing objects as nodes in the linked stack. The Treiber stack suffers from an ABA problem when popping if nodes are reused, so removing an item from the pool is done by moving the whole stack to a simple linked stack guarded by the existing lock. The locking could be avoided entirely by allocating wrapper objects for the stack nodes, but I'm not sure if that's desirable, since the goal of the pool was to avoid allocation.
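
The ABA problem mentioned in this description can be replayed deterministically. The sketch below is an illustration only: `ReusedNode` and `AbaDemo` are hypothetical names, and the two "threads" are simulated by running their steps in a fixed order on one thread, with thread 2's pops and push written as plain `set` calls for brevity.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical pooled object that is reused as its own stack node.
final class ReusedNode {
  final String name;
  ReusedNode next;

  ReusedNode(String name) {
    this.name = name;
  }
}

final class AbaDemo {
  // Replays one bad interleaving and returns the name of the node left at the head.
  static String run() {
    AtomicReference<ReusedNode> head = new AtomicReference<>();
    ReusedNode a = new ReusedNode("A");
    ReusedNode b = new ReusedNode("B");
    a.next = b;
    head.set(a); // stack: A -> B

    // Thread 1 begins a pop: it reads the head and the head's successor, then stalls.
    ReusedNode seenHead = head.get(); // A
    ReusedNode seenNext = seenHead.next; // B

    // Thread 2 runs in the meantime.
    head.set(b); // pops A
    a.next = null;
    head.set(null); // pops B; B is now in use by thread 2
    head.set(a); // finishes with A and puts it back; stack: A

    // Thread 1 resumes. The head is A again, so the CAS succeeds even though
    // the stack changed underneath it, and the stale successor B is installed.
    boolean swapped = head.compareAndSet(seenHead, seenNext);
    return swapped ? head.get().name : "CAS failed";
  }
}
```

After this interleaving the pool's head is B, a node that thread 2 still holds, so the next pop would hand the same object to a second caller. Moving the whole stack under a lock, or never reusing nodes, rules this interleaving out.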

charlesmunger · 2021-03-11T20:26:02Z

Updated, and tests now pass.

google-cla bot added the cla: yes label Oct 28, 2020

charlesmunger force-pushed the doublestack branch from dd7bfda to a9da081 Compare October 28, 2020 20:27

alandonovan reviewed Oct 28, 2020

alandonovan reviewed Oct 30, 2020

charlesmunger changed the title ~~Avoid locking for put methods for RE2.~~ Avoid locking for put methods for RE2. Fixes #46 Mar 9, 2021

charlesmunger mentioned this pull request Mar 9, 2021

Thoughts on synchronized access on RE2 instances #46

Closed

Charles Munger added 6 commits March 11, 2021 12:15

Address comments

3be45c8

Address comments

f78feab

fix commnt

a032fe7

Address comments

74a6014

Update GWT dependency to support AtomicReference

82b62cd

charlesmunger force-pushed the doublestack branch from 69a614f to 82b62cd Compare March 11, 2021 20:24

sjamesr merged commit 2bf9ce9 into google:master Mar 11, 2021

gauransh-tandon mentioned this pull request Apr 8, 2024

Infinite loop in Pattern#compile on certain case-insensitive patterns #168

Open
