Fix JSONPath cache inefficient issue #7409

Ferrari6 · 2021-09-08T10:17:40Z

Description

This commit fixes #7403.

Background

When we used the JsonPath transformation functions, we saw delays in consumption and very high CPU usage. Analysis with jstack showed that the consumption threads were waiting on the lock of the LRUCache in Jayway JsonPath. Further analysis of CPU usage and lock contention confirmed that this inefficient LRUCache is the consumption performance bottleneck.

stack trace

**Flamegraphs can be found in the description of issue #7403.**

Fix

A new JSON path cache is implemented using ConcurrentHashMap, and the cache threshold is set at the same time. When the maximum is exceeded, the JSON path will not be cached anymore.

In Pinot, the number of JSON paths is bounded by the size of the transformation config.
Even if the number of paths exceeds the maximum cache size, not caching a JSON path may be better than frequently swapping entries in and out of an LRU cache.
Compiling a JSON path without caching it also consumes very little CPU.

"transformConfigs": [
  {
    "columnName": "id",
    "transformFunction": "jsonPathString(report,'$.identifiers.id','')"
  },
  {
    "columnName": "name",
    "transformFunction": "jsonPathString(report,'$.identifiers.name','')"
  },
  ...
]
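
The size-capped map cache described in the Fix section can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `BoundedMapCache` and the generic value type are hypothetical stand-ins (in the PR the values would be compiled `JsonPath` objects), and the cap is a soft bound because the size check and the insert are not atomic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of a size-capped, lock-free-read cache.
// V stands in for a compiled JSON path object.
final class BoundedMapCache<V> {
  private final ConcurrentMap<String, V> map = new ConcurrentHashMap<>();
  private final int maxSize;

  BoundedMapCache(int maxSize) {
    this.maxSize = maxSize;
  }

  // Reads never take a global lock and never update recency bookkeeping.
  V get(String key) {
    return map.get(key);
  }

  // Once the cap is reached, new entries are simply not cached (no eviction).
  // The check and the insert are not atomic, so concurrent writers may
  // overshoot the cap slightly; the bound is a safety net, not a hard limit.
  void put(String key, V value) {
    if (map.size() < maxSize) {
      map.putIfAbsent(key, value);
    }
  }

  int size() {
    return map.size();
  }
}
```

With a cap of 2, a third distinct key is not stored, so a caller would fall back to compiling that path on every use.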

Pinot Server Flamegraphs when using ConcurrentHashMap cache (28vcpu)

JsonPath CPU usage is low and there is no lock contention.

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

New configuration options
Deprecation of configurations
Signature changes to public methods/interfaces
New plugins added or old plugins removed

Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

Documentation

codecov-commenter · 2021-09-08T11:12:22Z

Codecov Report

Merging #7409 (fd7aed0) into master (421645d) will decrease coverage by 0.02%.
The diff coverage is 85.71%.

@@             Coverage Diff              @@
##             master    #7409      +/-   ##
============================================
- Coverage     71.54%   71.51%   -0.03%     
+ Complexity     4036     4033       -3     
============================================
  Files          1579     1580       +1     
  Lines         80390    80386       -4     
  Branches      11945    11944       -1     
============================================
- Hits          57512    57489      -23     
- Misses        18996    19012      +16     
- Partials       3882     3885       +3

Flag	Coverage Δ
integration1	`29.21% <74.28%> (-0.03%)`	⬇️
integration2	`27.67% <48.57%> (-0.09%)`	⬇️
unittests1	`68.61% <80.00%> (-0.05%)`	⬇️
unittests2	`14.55% <0.00%> (-0.05%)`	⬇️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
...m/function/JsonExtractScalarTransformFunction.java	`49.52% <60.00%> (-0.48%)`	⬇️
...form/function/JsonExtractKeyTransformFunction.java	`78.26% <92.30%> (+4.18%)`	⬆️
...rg/apache/pinot/common/function/JsonPathCache.java	`100.00% <100.00%> (ø)`
...he/pinot/common/function/scalar/JsonFunctions.java	`80.35% <100.00%> (-1.00%)`	⬇️
...a/manager/realtime/RealtimeSegmentDataManager.java	`50.00% <0.00%> (-25.00%)`	⬇️
...nt/local/startree/v2/store/StarTreeDataSource.java	`40.00% <0.00%> (-13.34%)`	⬇️
...n/java/org/apache/pinot/common/utils/URIUtils.java	`66.66% <0.00%> (-7.41%)`	⬇️
.../common/request/context/predicate/EqPredicate.java	`66.66% <0.00%> (-6.67%)`	⬇️
...mmon/request/context/predicate/NotEqPredicate.java	`66.66% <0.00%> (-6.67%)`	⬇️
...elix/core/periodictask/ControllerPeriodicTask.java	`76.00% <0.00%> (-6.00%)`	⬇️
... and 21 more

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 421645d...fd7aed0.

richardstartin · 2021-09-08T16:39:29Z

pinot-common/src/test/java/org/apache/pinot/common/function/JsonPathMapCacheTest.java

+  public void testSimpleJsonPathMapCacheWorks() throws JsonProcessingException {
+    String path = "$.contact.email";
+    assertEquals(JsonFunctions.jsonPathString(_jsonString, path), "test@example.com");
+    JsonPathMapCache cache = (JsonPathMapCache) CacheProvider.getCache();
+
+    // verify json path has been cached
+    LinkedList<Predicate> filterStack = new LinkedList<>(Collections.emptyList());
+    String cacheKey = Utils.concat(path, filterStack.toString());
+    JsonPath jsonPath = cache.get(cacheKey);
+    assertNotNull(jsonPath);
+  }


This depends on undocumented library behaviour: it assumes the format of the cache keys. I have actually submitted a PR to the library to change this behaviour, which would break this test: json-path/JsonPath#750

I suggest iterating over the map instead (requires exposing the keys on JsonPathMapCache).

Or just use a Mockito spy and verify put was called, I guess.

Good suggestions.
I always prefer to use Mockito for unit testing, but all method calls related to JsonFunctions are static, so I now check the map size instead.

richardstartin · 2021-09-09T09:43:14Z

pinot-common/src/main/java/org/apache/pinot/common/function/JsonPathMapCache.java

+ * of it.
+ */
+public class JsonPathMapCache implements Cache {
+  private final ConcurrentMap<String, JsonPath> _pathCache = new ConcurrentHashMap<>(128, 0.75f, 1);


I don't think we need to explicitly set the concurrency level or load factor; these are the default values anyway.

I will change it. This is the habit of my previous team!

richardstartin · 2021-09-09T09:45:02Z

pinot-common/src/main/java/org/apache/pinot/common/function/FunctionConstants.java

+   * the number of JSON paths is bounded by the size of the transformation config,
+   * add this threshold for protection from the extreme cases.
+   */
+  public static final int MAX_JSON_PATH_CACHE_SIZE = 1024 * 16;


This class should be removed and the constant inlined into JsonPathMapCache where it is relevant.

richardstartin

Not an approver but this looks good to me, and solves a big problem 👍🏻

Jackie-Jiang

LGTM otherwise

Jackie-Jiang · 2021-09-09T19:00:39Z

pinot-common/src/main/java/org/apache/pinot/common/function/JsonPathMapCache.java

+ * a lot of unnecessary lock waits during high concurrent data ingestion,
+ * and LRU mechanism is inappropriate for Pinot bounded size of the
+ * transformation config, so we should use this simple Map cache instead
+ * of it.


JsonPath can also be used at query time (see JsonExtractScalarTransformFunction). Let's add a TODO here to add an eviction policy in the future.

I think in the future this can be handled better by just precompiling JsonPath objects and bypassing the library's cache entirely. For transformConfig, it would require changes to the way ScalarFunction is defined as well as changes to InbuiltFunctionEvaluator.planExecution (this would also permit validation and cleansing of config, which is currently impossible). For JsonExtractScalarTransformFunction the change is trivial.
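
The precompilation idea above can be illustrated with a small sketch. It is not Pinot code: `PrecompiledPath` is a hypothetical holder, and the `compiler` function stands in for whatever compiles a path expression (for example a call into the JsonPath library). The point is only that the expression is compiled once, when the function or config is constructed, so the per-row code never consults a cache.

```java
import java.util.function.Function;

// Hypothetical holder that compiles an expression once, at construction time.
// P stands in for the compiled path type.
final class PrecompiledPath<P> {
  private final P compiled;

  // The compiler runs exactly once here, e.g. when a transform config is parsed,
  // which is also a natural place to validate the expression.
  PrecompiledPath(String expression, Function<String, P> compiler) {
    this.compiled = compiler.apply(expression);
  }

  // Per-row evaluation reuses the compiled object; no cache lookup is involved.
  <R> R evaluate(Function<P, R> evaluator) {
    return evaluator.apply(compiled);
  }
}
```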

I think in the meantime using a guava Cache instead of a ConcurrentHashMap as suggested on slack would alleviate this concern.

I think Guava Cache is a safer cache solution because the JsonPath cache will also be applied at query time.
For one of my table ingestion cases, the read QPS is close to 100,000 (10w) and the JsonPath transformations are configured on dozens of fields. This puts some pressure on GC and CPU because the LRU algorithm needs to record every get. Of course, this should not become a system bottleneck, but the implementation is a bit too heavy.

Guava Cache uses recencyQueue to record cache accesses. A large number of CAS enqueue operations generated by hotkeys will cause CPU consumption, and this synchronous recording will also slow down cache reading.

I think maybe Caffeine Cache is a good choice. Its Striped-RingBuffer design is more efficient for GC and CPU usage. And its API is very similar to Guava Cache.

Whether using Guava's cache would result in high CPU consumption needs measurement. I agree that Caffeine is a better cache implementation, but Guava is already on the classpath. Is this particular use case important enough to add a 1MB dependency?

I think we should use the default value of the concurrencyLevel of Guava cache. In this scenario, there will be very few cache writes and no need to update the cache. My concern is the way Guava records cache reads, which is not very efficient.

The performance of cache reads improves with the concurrency level because it reduces the likelihood of contending on the same CLQ instance. The benchmarks use a Zipf distribution to simulate a hotspot.

You are right.

Thanks for the insights @ben-manes. FWIW I have been using Caffeine in commercial projects for years; the only question here is whether this particular use case is worth adding a relatively large (in bytes) dependency.

It's not a problem. I co-authored Guava's so either way my code is used. 😄

@Ferrari6 is correct in what your preference and default should be, though. Prior to my involvement, the Guava team bet on reference caching (MapMaker) and spent their complexity budget by forking the hash table as an optimization. This was a mistake because soft references can cause GC death spirals of stop the world events and unpredictable evictions, but looked fine in a naive benchmark. This blew their complexity budget, so porting size eviction from CLHM favored simplicity over performance.

The longer-term problem you'll face is that no one maintains CacheBuilder. The last big change was adding Map.compute, but it was riddled with major bugs and done inefficiently. I fixed some of those problems, but there are show stoppers. If you keep to the historic functionality then Guava's can be made acceptable in most cases. Caffeine does have jar bloat by code generating per-configuration entry classes to minimize the memory footprint. In those cases where disk is a premium some projects embed CLHM into their code base, e.g. msjdbc and groovy. You'll probably have many caches throughout Pinot making the Caffeine dependency worthwhile even if you can get by in a case-by-case basis.

snleee · 2021-09-15T21:03:39Z

@Ferrari6 Can you rebase on the master branch and retrigger the GitHub Actions tests? We recently introduced flakiness into one of our integration tests and fixed it in #7432.

Ferrari6 · 2021-09-15T22:42:59Z

Yes, sure. However, the setCache method of CacheProvider in Jayway can only be called once, and once getCache has been called, setCache cannot be called anymore. So I need to make some code changes.

Ferrari6 · 2021-09-15T22:49:04Z

Both JsonFunctions and JsonExtractScalarTransformFunction use JsonPath, so Jayway's CacheProvider needs to be set before these two functions are initialized, and it can only be set once. At the same time, I am also running some performance tests on Guava Cache and will submit more code soon.
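
The "set once, before first use" constraint described above can be modelled with a short sketch. This is a simplified, hypothetical model of that behaviour, not the Jayway source: `OnceProvider` and its exception type are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Simplified, hypothetical model of a provider that can be configured
// at most once, and only before the first read.
final class OnceProvider<C> {
  private final AtomicReference<C> ref = new AtomicReference<>();
  private final Supplier<C> defaultSupplier;

  OnceProvider(Supplier<C> defaultSupplier) {
    this.defaultSupplier = defaultSupplier;
  }

  // Succeeds only if nothing has been set or lazily created yet.
  void set(C cache) {
    if (!ref.compareAndSet(null, cache)) {
      throw new IllegalStateException("cache already configured or already accessed");
    }
  }

  // The first read installs the default, after which set() always fails.
  C get() {
    C current = ref.get();
    if (current == null) {
      ref.compareAndSet(null, defaultSupplier.get());
      current = ref.get();
    }
    return current;
  }
}
```

Under this model, any code path that reads the cache before the custom one is registered silently locks in the default, which is why the registration has to run before either function class is initialized.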

mayankshriv · 2021-09-29T05:48:19Z

Some tests failed, rerunning to see if intermittent issue due to timeouts.

Ferrari6 · 2021-09-29T12:53:34Z

Thanks @mayankshriv. I have fixed the failed tests.

mayankshriv · 2021-09-29T16:33:22Z

@Ferrari6 seems like there are still failing tests.

Addressed comments to improve unit test
Addressed review comments
- Remove Constants class and configure inlined into JsonPathMapCache
- Remove useless ConcurrentHashMap parameters
Addressed review comments
Co-authored-by: Xiaotian (Jackie) Jiang <17555551+Jackie-Jiang@users.noreply.github.com>
Change cache maximum size to reduce memory usage
use Guava cache for json path
add json path cache integration test

Fix the issue of setting different default configurations Do not access the JsonPath cache in the transform function

This reverts commit 14c377d.

Also fix the issue of setting different default configurations

richardstartin reviewed Sep 8, 2021

richardstartin reviewed Sep 9, 2021

richardstartin reviewed Sep 9, 2021

richardstartin approved these changes Sep 9, 2021

Jackie-Jiang approved these changes Sep 9, 2021

Ferrari6 force-pushed the jsonpath-cache-improve branch from 8dd48c4 to 3166fae Compare September 20, 2021 11:02

Ferrari6 force-pushed the jsonpath-cache-improve branch from 3166fae to 2be13d1 Compare September 29, 2021 12:49

Jackie-Jiang force-pushed the jsonpath-cache-improve branch 3 times, most recently from 740572f to e78ac2c Compare November 2, 2021 19:32

Jackie-Jiang force-pushed the jsonpath-cache-improve branch from e78ac2c to c03156c Compare November 3, 2021 00:38

Rebase, cleanup and fix test

fd7aed0

Fix the issue of setting different default configurations Do not access the JsonPath cache in the transform function

Jackie-Jiang force-pushed the jsonpath-cache-improve branch from c03156c to fd7aed0 Compare November 3, 2021 04:32

Jackie-Jiang merged commit 14c377d into apache:master Nov 3, 2021

richardstartin added a commit to richardstartin/pinot that referenced this pull request Nov 3, 2021

Revert "Fix JSONPath cache inefficient issue (apache#7409)"

897b35a

This reverts commit 14c377d.

richardstartin mentioned this pull request Nov 3, 2021

[DO NOT MERGE] Revert "Fix JSONPath cache inefficient issue" #7685

Closed

walterddr pushed a commit to walterddr/pinot that referenced this pull request Nov 3, 2021

Revert "Fix JSONPath cache inefficient issue (apache#7409)"

fb10152

This reverts commit 14c377d.

kriti-sc pushed a commit to kriti-sc/incubator-pinot that referenced this pull request Dec 12, 2021

Fix JSONPath cache inefficient issue (apache#7409)

2eb6845

Also fix the issue of setting different default configurations

richardstartin Sep 8, 2021

richardstartin Sep 8, 2021

Ferrari6 Sep 9, 2021

richardstartin Sep 9, 2021

Ferrari6 Sep 9, 2021

richardstartin Sep 9, 2021

richardstartin left a comment

Jackie-Jiang left a comment

Jackie-Jiang Sep 9, 2021

richardstartin Sep 9, 2021

Ferrari6 Sep 10, 2021

Ferrari6 Sep 10, 2021

richardstartin Sep 10, 2021

Ferrari6 Sep 17, 2021

ben-manes Sep 17, 2021

Ferrari6 Sep 17, 2021

richardstartin Sep 17, 2021

ben-manes Sep 17, 2021
