Fix thread safety issue and add cache to EmptySegmentPruner#7828
Jackie-Jiang merged 2 commits into apache:master
Why not use a |
Codecov Report
@@ Coverage Diff @@
## master #7828 +/- ##
============================================
- Coverage 71.67% 71.64% -0.03%
Complexity 4089 4089
============================================
Files 1579 1579
Lines 80843 80851 +8
Branches 12010 12017 +7
============================================
- Hits 57944 57929 -15
- Misses 18995 19018 +23
Partials 3904 3904
@richardstartin I actually considered that, and decided to go with the snapshot approach because we want to optimize the query runtime method as much as possible.
I don’t think this is likely to be an optimisation unless the removeAll call is made frequently and the number of empty segments is large. I think when an optimisation makes the code much harder to read we need 4 bits of data:
I raise this only because the code is much harder to read than if it used a threadsafe Set. If it were as readable I would be inclined to assume that the optimisation is effective.
// Return the cached result when the input is the same reference
ResultCache resultCache = _resultCache;
if (resultCache != null && resultCache._inputSegments == segments) {
use resultCache != null && resultCache._inputSegments.equals(segments) here?
This is part of the copy-on-write (COW) mechanism: it checks that the result cache wasn't built by some other racing thread. It has to be the same reference, and the contents are irrelevant. This is why I challenge whether the performance justifies the complexity, because it's fairly subtle.
We don't want to compare the actual content because it can be expensive
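
To make the reference check in this thread concrete, here is a minimal sketch of a result cache keyed on the identity of the input set. It is not the PR's code: the class name `ReferenceCacheSketch`, the `_outputSegments` field, the `_computeCount` counter, and the `empty_` prefix filter are all assumptions made for illustration. Only `_resultCache`, `_inputSegments`, and the `==` comparison come from the snippet under review.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names other than _resultCache and _inputSegments are assumptions.
class ReferenceCacheSketch {

  // Immutable pair: the input set reference and the result computed from it.
  static final class ResultCache {
    final Set<String> _inputSegments;
    final Set<String> _outputSegments;

    ResultCache(Set<String> inputSegments, Set<String> outputSegments) {
      _inputSegments = inputSegments;
      _outputSegments = outputSegments;
    }
  }

  // volatile so that a fully built ResultCache is visible to other threads.
  private volatile ResultCache _resultCache;

  // Counts recomputations; demo-only and not thread-safe.
  int _computeCount;

  Set<String> prune(Set<String> segments) {
    // Read the volatile field once so the null check and the comparison see the same object.
    ResultCache resultCache = _resultCache;
    // Identity comparison: constant time, no traversal of the set contents.
    if (resultCache != null && resultCache._inputSegments == segments) {
      return resultCache._outputSegments;
    }
    _computeCount++;
    Set<String> result = new HashSet<>(segments);
    // Stand-in for the real emptiness check (assumption for illustration).
    result.removeIf(segment -> segment.startsWith("empty_"));
    _resultCache = new ResultCache(segments, result);
    return result;
  }

  public static void main(String[] args) {
    ReferenceCacheSketch pruner = new ReferenceCacheSketch();
    Set<String> input = new HashSet<>();
    input.add("seg_0");
    input.add("empty_seg_1");

    Set<String> first = pruner.prune(input);
    Set<String> second = pruner.prune(input);               // same reference: cache hit
    Set<String> third = pruner.prune(new HashSet<>(input)); // equal contents, new reference: miss

    System.out.println(first == second);      // true
    System.out.println(pruner._computeCount); // 2
    System.out.println(third);                // [seg_0]
  }
}
```

Using `equals` instead would also hit the cache for the third call, but at the cost of comparing the sets element by element on every query. The identity check is only sound if callers never mutate a set in place after passing it in, which is part of what makes the approach subtle.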
@richardstartin I hadn't considered it from the readability perspective. I agree there is no clear evidence for the optimization. Changed it to use the concurrent set.
richardstartin left a comment
Even I can read it now 👍
Currently, EmptySegmentPruner might have a race condition where _emptySegments can be modified and accessed at the same time. Fix it by using a concurrent set for _emptySegments. Also add a cache to the pruner so that there is no need to recalculate the result given the same input segments.
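
As a rough illustration of the concurrent-set fix described above, the sketch below backs `_emptySegments` with `ConcurrentHashMap.newKeySet()`, so one thread can update it while another reads it during pruning. The class name `ConcurrentEmptySegmentsSketch` and the methods `markEmpty`, `markNonEmpty`, and `prune` are hypothetical and are not taken from the Pinot code base. Only the `_emptySegments` field name comes from the description.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; method names are assumptions, not the actual EmptySegmentPruner API.
class ConcurrentEmptySegmentsSketch {

  // Thread-safe set view backed by a ConcurrentHashMap.
  private final Set<String> _emptySegments = ConcurrentHashMap.newKeySet();

  // Called by the thread that tracks segment state changes.
  void markEmpty(String segment) {
    _emptySegments.add(segment);
  }

  void markNonEmpty(String segment) {
    _emptySegments.remove(segment);
  }

  // Called on the query path; safe to run while the methods above are executing.
  Set<String> prune(Set<String> segments) {
    Set<String> selected = new HashSet<>(segments);
    // Iteration over the concurrent set is weakly consistent and
    // does not throw ConcurrentModificationException.
    selected.removeAll(_emptySegments);
    return selected;
  }

  public static void main(String[] args) throws InterruptedException {
    ConcurrentEmptySegmentsSketch pruner = new ConcurrentEmptySegmentsSketch();
    Set<String> segments = new HashSet<>();
    for (int i = 0; i < 100; i++) {
      segments.add("seg_" + i);
    }

    // Writer marks every segment empty while the main thread keeps pruning.
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 100; i++) {
        pruner.markEmpty("seg_" + i);
      }
    });
    writer.start();
    for (int i = 0; i < 1000; i++) {
      pruner.prune(segments);
    }
    writer.join();

    // After the writer finishes, every segment is marked empty.
    System.out.println(pruner.prune(segments).size()); // 0
  }
}
```

A prune call that overlaps with an update may or may not see that update, which is usually acceptable when the set is only used to skip segments known to be empty.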