SOLR-16615: Reinstate Jersey app-per-configset #1314

gerlowskija · 2023-01-25T17:30:13Z

https://issues.apache.org/jira/browse/SOLR-16615

Description

Currently, our Jersey integration involves creating an 'ApplicationHandler' class for each SolrCore. These classes are expensive and their creation can have a noticeable cumulative effect at startup time on nodes hosting many many cores. (See SOLR-16531 for more details.)

Solution

This PR addresses this problem by allowing ApplicationHandler's to be shared across multiple SolrCores, as long as each of those cores use the same configset. To do this we introduce JerseyAppHandlerCache, which takes over ownership of all core-level ApplicationHandler's. AH's are ref-counted so that they can be removed from the cache and closed when no longer needed by any SolrCore's.

A previous attempt at this functionality was committed but then reverted due to test failures. This new PR reinstates the functionality after fixing a few bugs and addressing some additional review feedback.

Tests

See diff for new JUnit tests.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
I have developed this patch against the main branch.
I have run ./gradlew check.
I have added tests for my changes.

Previously reverted due to a large bug that crept in while resolving some merge conflicts, this PR adds the app-per-configset code back. Still TODO - address David's addtl review feedback on the original PR

gerlowskija · 2023-01-25T17:50:17Z

(Credit to Kevin for pointing this out). Anyone familiar with the previous PR see what has changed since then with this link

dsmiley · 2023-01-25T20:57:46Z

I don't think the ApplicationHandler needs to be RefCount'ed because there is no intrinsic closing work to do (unlike SolrIndexSearcher & SolrCore, say). RefCount'ing is no fun -- it's awkward; not an API I reach for unless there's no better option. We just don't want to retain them in memory; that's all. There are already utilities for this -- CaffeineCache with weak references.

Still need to address a minor race condition around concurrent inc/decref calls for JerseyAppHandlerCache entries.

gerlowskija · 2023-01-25T21:38:10Z

I don't think the ApplicationHandler needs to be RefCount'ed because there is no intrinsic closing work to do (unlike SolrIndexSearcher & SolrCore, say). RefCount'ing is no fun -- it's awkward; not an API I reach for unless there's no better option. We just don't want to retain them in memory; that's all. There are already utilities for this -- CaffeineCache with weak references.

I'm not super familiar with CaffeineCache, but I'll look into it. Thanks for the suggestion @dsmiley

Two initial questions I'd have:

Does it bring along any sort of overhead? I see it's tracking a bunch of metrics and historical data for cache entries, computing RAM usage estimates, etc. Does that stuff have any non-negligible cost? Seemed reasonable to ask as performance is the driving motivation for this PR...
I understand the drive of switching from RefCounted to WeakReference. But doesn't that get us what we need right there? What does switching the enclosing data structure from ConcurrentHashMap to CaffeineCache get us - we don't need (or want?) cache-eviction or any of the other stuff that CC would bring, unless I'm missing some reason we'd need to evict? Feel like I'm missing some aspect of what you're suggesting...

dsmiley · 2023-01-25T22:27:10Z

I believe its overhead is minimal; we use it already in Solr for many of our caches on the query path. We layer on our own metrics :-)
CHM doesn't have weak values natively -- Caffeine supports this option. It's an internal detail, not something we wrap ourselves, so is similar to understand/review.

Allows a JerseyAppHandlerCache to present a much cleaner interface by eliminating the need to RefCount entirely. By removing RefCounting, also fixes a race condition that David S pointed out between when items were being fetched from a ConcurrentMap and when they were inc/decref'd.

gerlowskija · 2023-01-26T15:20:33Z

Alright - perfect. This ended up being much much nicer I think!

I ended up using Caffeine.newBuilder directly instead of Solr's CaffeineCache wrapper. (CaffeineCache doesn't expose Caffeine's native support for weak-references afaict, and comes with a good bit of other baggage that we don't need here.). But it allows us to get rid of the ref-counting and all that by using "weak reference eviction"

In eliminating ref-counting, it also clears up the race condition that David mentioned here, so a win-win all around.

gerlowskija · 2023-01-26T20:38:30Z

Thanks for the remaining fixes Houston. It LGTM, and I think addresses all of David's outstanding comments from the previous PR?

Running check locally now - would love to get this merged before the weekend!

dsmiley · 2023-01-26T20:27:57Z

solr/core/src/java/org/apache/solr/core/SolrConfig.java

+  public String effectiveId(String configSetName) {
+    return configSetName + "-" + getName() + "-" + znodeVersion + "-" + rootDataHashCode;
+  }


This feels off. "Generates a String ID to represent the ConfigSet". No... this is on SolrConfig; it should be an ID to represent the ID of this SolrConfig. I'm skeptical configSetName even needs to be a part of this. Sure in practice it will will be by-configSet but as a by-product of configSets having different configs. If there is a reason ApplicationHandler needs to be by ConfigSet even if there may be identical solrconfig's (which I could imagine being so) then I think it's the caller of this method to concatenate the configSet name. Or build the configSet's identity into an existing field to reference so that it's not passed in (I less prefer).

Yeah the documentation was meant to say SolrConfig, not ConfigSet.

I'm skeptical configSetName even needs to be a part of this.

I'm also skeptical, but saw no harm in adding it.

then I think it's the caller of this method to concatenate the configSet name.

I did this originally then added the configSetName here. In general if we don't see a use case of sharing SolrConfigs between configSets, then I don't really see a need to make that a possibility. @gerlowskija can weigh in, but I don't really want to open that rats nest.

I'm skeptical configSetName even needs to be a part of this.

I think it should be a part of the ID.

we don't see a use case of sharing SolrConfigs between configSets
agree

I'm good with it being as a part of the ID so long as configSet is a field in the SolrConfig so that it's more clearly a part of its identity. Adding as a parameter on this method is suspicious.

I'm skeptical configSetName even needs to be a part of this

I didn't have a single specific reason in mind. It just seemed prudent from an "abundance of caution" perspective. Maybe some bug will crop up where we'd want to eliminate AH sharing, maybe a paranoid admin running a multi-tenant service would want to ensure sharing doesn't happen, etc.

That said I don't feel strongly about where the configset name gets added. I'll move it closer to the actual caching as suggested.

solr/core/src/java/org/apache/solr/core/SolrCore.java

solr/core/src/java/org/apache/solr/jersey/InjectionFactories.java

dsmiley · 2023-01-26T20:44:07Z

solr/core/src/java/org/apache/solr/jersey/JerseyAppHandlerCache.java

+ * <p>ApplicationHandler creation is expensive; caching these objects allows them to be shared by
+ * multiple cores with the same configuration.
+ */
+public class JerseyAppHandlerCache {


This class now seems useless -- and that's a good thing :-). Less to name & document -- can be done so more succinctly in-context.

@gerlowskija WDYT?

noblepaul · 2023-01-27T01:22:11Z

solr/core/src/java/org/apache/solr/core/ConfigOverlay.java

@@ -34,15 +34,15 @@
 * performed on tbhis gives a new copy of the object with the changed value
 */
 public class ConfigOverlay implements MapSerializable {
-  private final int znodeVersion;


is there any reason why this rename was necessary ?

Because its no longer necessarily the znode version, it could be the file "version" for a file-system based configSet.

If a znode with version 0 is deleted and recreated quickly then it comes back with the same version and this missed updates before

solr/core/src/java/org/apache/solr/core/CoreContainer.java

noblepaul · 2023-01-27T01:26:24Z

solr/core/src/java/org/apache/solr/core/SolrConfig.java

@@ -178,8 +179,12 @@ private class ResourceProvider implements Function<String, InputStream> {
        ZkSolrResourceLoader.ZkByteArrayInputStream zkin =
            (ZkSolrResourceLoader.ZkByteArrayInputStream) in;
        zkVersion = zkin.getStat().getVersion();
-        hash = Objects.hash(zkin.getStat().getCtime(), zkVersion, overlay.getZnodeVersion());
+        hash = Objects.hash(zkin.getStat().getCtime(), zkVersion, overlay.getVersion());


why is zkin.getStat().getCtime() used to compute hash? isn't zkVersion not enough?

No idea, but that was the behavior before the PR. I'm unsure whether the hash computed here is even used.

Note that a re-created file (removed then added) will have its zkVersion reset to 0. So incorporating CTime guards against this.

noblepaul · 2023-01-27T01:30:39Z

solr/core/src/java/org/apache/solr/core/SolrConfig.java

+  public String effectiveId(String configSetName) {
+    return configSetName + "-" + getName() + "-" + znodeVersion + "-" + rootDataHashCode;
+  }


I'm skeptical configSetName even needs to be a part of this.

I think it should be a part of the ID.

we don't see a use case of sharing SolrConfigs between configSets
agree

noblepaul · 2023-01-27T01:32:00Z

solr/core/src/java/org/apache/solr/core/SolrCore.java

+                .computeIfAbsent(
+                    effectiveSolrConfigId,
+                    () -> {
+                      log.debug(


add log.isDebugEnabled()`

Why? Note we have checks that can tell when it's helpful and when it isn't. It isn't helpful in this circumstance because the interpolated arguments are plain references -- not method calls that might do expensive stuff. The log.isDebugEnabled is verbose IMO for too little gain.

gerlowskija · 2023-01-27T15:56:07Z

Alright, I think I addressed everyone's comments? Lmk what you think when you guys get a chance...

HoustonPutman · 2023-01-27T17:15:05Z

Wait this definitely deserves a CHANGES.txt entry if it doesn't already have one

gerlowskija · 2023-01-27T20:39:42Z

I'm kindof on the fence about a CHANGES.txt entry. Why would a user ever care about this? What would they do with the information? (i.e. Yes this fixes a perf regression, but it's one that was never released.)

I'll draft up an entry just mentioning that the Jersey AppHandlers are now shared where possible - I'll push that up shortly and then merge this so that Jenkins can test away over the weekend. If folks have thoughts about the CHANGES.txt entry or additional review, happy to address that all.

Previously reverted due to a large bug that crept in while resolving some merge conflicts, this PR adds the app-per-configset code back. Jersey 'ApplicationHandlers' are now shared by cores in the same JVM with the same configset.

SOLR-16615: Reinstate Jersey app-per-configset

a02af97

Previously reverted due to a large bug that crept in while resolving some merge conflicts, this PR adds the app-per-configset code back. Still TODO - address David's addtl review feedback on the original PR

gerlowskija mentioned this pull request Jan 25, 2023

SOLR-16615: Reuse Jersey apps for cores with the same configset #1286

Merged

7 tasks

Tidy

7d2bbc0

Remove temporary logging

be84d41

Address review comments, rd 1

b23592c

Still need to address a minor race condition around concurrent inc/decref calls for JerseyAppHandlerCache entries.

HoustonPutman added 3 commits January 26, 2023 13:50

Remove hashCode(), refactor a bit

d7c0bc8

Cleanup SolrFileInputStream

7214777

Fixes

4457cd7

dsmiley reviewed Jan 26, 2023

View reviewed changes

noblepaul reviewed Jan 27, 2023

View reviewed changes

Address review comments, rd 2

322cf35

HoustonPutman approved these changes Jan 27, 2023

View reviewed changes

CHANGES.txt entry

bdb56c1

dsmiley approved these changes Jan 27, 2023

View reviewed changes

gerlowskija merged commit 73c938a into apache:main Jan 30, 2023

gerlowskija deleted the SOLR-16615-reinstate-app-per-configset2 branch January 30, 2023 10:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SOLR-16615: Reinstate Jersey app-per-configset #1314

SOLR-16615: Reinstate Jersey app-per-configset #1314

gerlowskija commented Jan 25, 2023 •

edited

gerlowskija commented Jan 25, 2023

dsmiley commented Jan 25, 2023

gerlowskija commented Jan 25, 2023

dsmiley commented Jan 25, 2023

gerlowskija commented Jan 26, 2023

gerlowskija commented Jan 26, 2023 •

edited

dsmiley Jan 26, 2023

HoustonPutman Jan 26, 2023

noblepaul Jan 27, 2023

dsmiley Jan 27, 2023

gerlowskija Jan 27, 2023 •

edited

dsmiley Jan 26, 2023

dsmiley Jan 27, 2023

noblepaul Jan 27, 2023

HoustonPutman Jan 27, 2023

madrob Jan 27, 2023

noblepaul Jan 27, 2023

HoustonPutman Jan 27, 2023

dsmiley Jan 27, 2023

noblepaul Jan 27, 2023

noblepaul Jan 27, 2023

dsmiley Jan 27, 2023

gerlowskija commented Jan 27, 2023

HoustonPutman commented Jan 27, 2023

gerlowskija commented Jan 27, 2023

SOLR-16615: Reinstate Jersey app-per-configset #1314

SOLR-16615: Reinstate Jersey app-per-configset #1314

Conversation

gerlowskija commented Jan 25, 2023 • edited

Description

Solution

Tests

Checklist

gerlowskija commented Jan 25, 2023

dsmiley commented Jan 25, 2023

gerlowskija commented Jan 25, 2023

dsmiley commented Jan 25, 2023

gerlowskija commented Jan 26, 2023

gerlowskija commented Jan 26, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gerlowskija Jan 27, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gerlowskija commented Jan 27, 2023

HoustonPutman commented Jan 27, 2023

gerlowskija commented Jan 27, 2023

gerlowskija commented Jan 25, 2023 •

edited

gerlowskija commented Jan 26, 2023 •

edited

gerlowskija Jan 27, 2023 •

edited