Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17096: Cluster Singleton plugin support in solr.xml #2126

Merged
merged 11 commits into from Jan 17, 2024

Conversation

pjmcarthur
Copy link
Contributor

@pjmcarthur pjmcarthur commented Dec 7, 2023

https://issues.apache.org/jira/browse/SOLR-17096

Description

It should be possible to install and manage the configuration of a cluster singleton plugin in an immutable deployment of SolrCloud. Currently, the only method to manage cluster singleton plugin configs is by using the ContainerPluginApi edit APIs, and this is not always possible, especially in the case of immutable SolrCloud deployments.

Solution

This PR changes the ContainerPluginRegistry to use a pluggable source for plugin configs. The pluggable source is determined by a system property, solr.clusterPluginsSource, or by a declaration in solr.xml.

The default source is ZkContainerPluginsSource, which allows plugins to be added, removed or updated at runtime using the ContainerPluginApi. This matches the existing behavior.

A new OOTB plugin implementation is NodeConfigClusterPluginsSource. This plugin supports managing cluster singleton plugins via declarations in solr.xml, and does not support the edit APIs of ContainerPluginApi; these APIs are not registered in the case that node config is used as the plugin source.

An example of solr.xml that supports cluster singleton declarations would be

<clusterPluginsSource class="org.apache.solr.api.NodeConfigClusterPluginsSource"/>
<clusterSingleton name="my.plugin.name" class="my.plugin.class">
  <config>...</config>
</clusterSingleton>

Tests

New tests are added in NodeConfigContainerPluginsSourceTest that spin up clusters and Core containers using the node config plugin source, and verify that container plugin cluster singleton declarations in solr.xml are added correctly to the ContainerPluginsRegistry.

Node configuration tests have been added to TestSolrXml

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember to update TestSolrXml as well.

Overall, I like this!
I think there is a big issue with naming. In JIRA, I advocated for using "Node" nomenclature instead of "Container", at least for anything the user sees (users don't know about CoreContainer; that's an internal thing and a name we regret). But we can punt on renaming until the very end of this PR with respect to class names, to ease review.

@dsmiley
Copy link
Contributor

dsmiley commented Dec 9, 2023

Oh I totally missed an important concern that I raised with @pvcnt 's previous PR #2016. The configuration is very generic (the specific plugins are effectively untyped in solr.xml). I took this back to the parent JIRA with my comment here: https://issues.apache.org/jira/browse/SOLR-15782?focusedCommentId=17782415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17782415

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test, by the way. I'm really glad you tested configurability. These plugins are configured internally very unusually compared to the rest of Solr.

@pvcnt could you also take a look at this too? You feared this approach might be too much work, I recall.

@noblepaul this is the start of a move to support these plugins in solr.xml

There is a decision to make on naming... "ClusterPlugins" vs "ContainerPlugins" vs "NodePlugins". And where to document such plugins when they can be configured in two places. I tend to think solr.xml documentation should be primary and only minimally document in the cluster plugin API (where these are currently documented).

CoreContainer cc, SolrResourceLoader loader, PluginInfo info) {
ClusterPluginsSource clusterPluginsSource = newInstance(loader, resolveClassName(info), cc);
if (info != null && info.isEnabled() && info.initArgs != null) {
Map<String, Object> config = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these two lines can be one; toMap returns its arg.

@pvcnt
Copy link
Contributor

pvcnt commented Dec 24, 2023

@dsmiley I’ll have a more detailed look after my holidays. But one thing we should consider is that this PR follows a different design than what I did for replica placement plugins. For the latter, what goes inside solr.xml is applied as a first layer, and then cluster plugins as overrides. Here it’s one or the other, if my understanding is correct.

I believe this PR is heading towards a promising direction, but we should make sure that cluster singletons and replica placement plugins are at least configured the same.

@pjmcarthur
Copy link
Contributor Author

@pvcnt @dsmiley I think there should be a follow-up PR that addresses the replica placement plugin. My understanding is that from a user's perspective, the configuration of solr.xml wrt. the placement plugin would remain as it is, but the internal mechanics of instantiating the RPP factory would change to align with this PR. I would be able to commit to doing this follow-up PR if it is deemed the right path forward.

@pvcnt
Copy link
Contributor

pvcnt commented Jan 3, 2024

I don't know when Solr 9.5 is going to be released, but IMHO we should make sure that it is not released with two different ways of configuring a cluster plugin in solr.xml, or even not released with a new way of configuring a cluster plugin in solr.xml that would be changing in Solr 9.6 (hence breaking backwards compatibility, etc.). I would hence be in favour of either doing both in a single PR, or starting by refactoring replica placement plugins. It would be a good exercice to validate that the approach can be extended to other cluster plugins.

The more philosophical question I have is about the behaviour. Do we want to be able to choose between solr.xml and plugin API (this PR) or to layer the latter on top of the former (my previous PR). I don't have a strong opinion on this, but that is a structuring question I believe.

@dsmiley
Copy link
Contributor

dsmiley commented Jan 4, 2024

Agreed that 9.5 should not be released without reconciling the two approaches.

I think I would prefer to revert SOLR-17079 as it exists today, then merge this PR SOLR-17096 (after whatever changes in the review process), then do a quick (?) follow-up PR for SOLR-17079 to return the essential functionality building on what's here. Do you think that hypothetical PR would be straight-forward? You raised concerns a while back that what Paul is doing here would maybe be too much work, if I recall.

I don't think it's important/necessary to layer them as you did for replica placement. That adds greater support requirements (testing that both can be used coherently) and nobody has asked for that.

@pvcnt
Copy link
Contributor

pvcnt commented Jan 4, 2024

I believe it would still be interesting to have at least two different plugin types to validate that the generic approach is really generic, and what it takes to add a new type of plugin.

Another remark, it seems that there is no validation that a <clusterSingleton/> tag does indeed defines a cluster singleton. I believe the following would work and define a replica placement plugin:

<clusterSingleton name=".placement-plugin" class="org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory"/>

This is why in SOLR-17079 we decided to go with an ad-hoc solution (i.e., one for every possible cluster plugin) instead of the generic solution. Otherwise, we are back to my initial proposal which is to have an untype <plugin/> tag in solr.xml, which is essentially what this PR does.

@dsmiley
Copy link
Contributor

dsmiley commented Jan 4, 2024

I believe it would still be interesting to have at least two different plugin types to validate that the generic approach is really generic, and what it takes to add a new type of plugin.

I don't think it'll be totally generic; a given plugin type might best have some adjustments to create the PluginInfo. For example, the ".placement-plugin" name should be added to the PluginInfo when the XML element is <placementPlugin>. While that could be done in SolrXmlConfig (reading solr.xml), such an idiosyncrasy feels specific to the nature of the cluster edit API/JSON that maybe could be in another spot (unsure).

it seems that there is no validation that a tag does indeed defines a cluster singleton.

I don't recall ever making a mistake of that nature, probably also because the class name is nearly always indicative of the type, so it'd look wrong if it was wrong as well. Validation would be nice but not a requirement. I leave the choice to validate or not up to Paul.

@pvcnt
Copy link
Contributor

pvcnt commented Jan 4, 2024

The point I am raising about validation has actually already been raised by Paul there: https://github.com/apache/solr/pull/2126/files#r1424690508

I think it would be a mistake to not enforce that a <clusterSingleton/> actually defines a cluster singleton and not a replica placement plugin. Otherwise, what is the point of having a typed tag instead of a generic tag for all plugins, besides providing some documentation? Again, with the current implementation, you can perfectly declare a replica placement plugin (or any kind of cluster plugin really) inside a <clusterSingleton/> and it will just work.

@pjmcarthur
Copy link
Contributor Author

I will add the validation.

Regarding piggybacking on the existing disable.configEdit property, I like the idea, but I have one concern.
If there are any Solr deployments out there that set disable.configEdit to true and also have cluster plugins installed, they will have issues after this change. The property value of true would cause the node config plugin source to be used instead of the ZK plugin source, and any plugins that were configured would not be loaded.

Maybe this isn't a real problem, and its unlikely that anyone would configure Solr in this way. You mention the property is not documented, so that could benefit us. I'm not sure how confident we can be about how safe it really is though. I think for that reason I slightly prefer the implicit selection of the pluigin source based on the presence of clusterSingleton declarations in solr.xml, but I'm ok to use the disable.configEdit property if we think its safe.

@dsmiley
Copy link
Contributor

dsmiley commented Jan 7, 2024

Let's imagine someone out there has set disable.configEdit and is also posting to /api/cluster/plugin. After the proposed change here, a call to that API is going to get an error (or should) telling them it's disabled. So they know, and can act. They aren't stuck; they have options. Like don't set disable.configEdit (easiest) or don't POST to /api/cluster/plugin (use solr.xml instead).

solr/core/src/java/org/apache/solr/core/NodeConfig.java Outdated Show resolved Hide resolved
solr/core/src/java/org/apache/solr/core/NodeConfig.java Outdated Show resolved Hide resolved
@@ -104,9 +105,6 @@
public class SolrConfigHandler extends RequestHandlerBase
implements SolrCoreAware, PermissionNameProvider {
private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
public static final String CONFIGSET_EDITING_DISABLED_ARG = "disable.configEdit";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janhoy if you didn't notice, we're elevating this very internal undocumented sys prop to node level to increase the scope beyond config editing of a core live to also include disabling cluster config edit. Obviously the name of this sys prop is poor and your new efforts to standardize suggest we should take the opportunity to have a nicer name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the back & forth here... I had thought let's use disable.configEdit but I don't want to shine a light (document) a badly named config, especially at a time when a config naming standardization effort is underway.

Absent further feedback, I think we should use a new property, like solr.cluster.plugin.mutable using EnvUtils. It's scoped to just the /cluster/plugin API and thus doesn't have a debatable scope (too broad, like all config editing, which should get more agreement/care). Such a narrow scope also suggests moving the setting to ContainerPluginsRegistry or ZkClusterPluginsSource.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jan 10, 2024
Comment on lines 76 to 77
Arrays.stream(clusterSingletons)
.forEach(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Streams can be cool when their power can be leveraged, but if you're simply going to iterate, than really, for(p : clusterSingletons) is best

@@ -159,16 +160,15 @@ public static NodeConfig fromConfig(
configBuilder.setSolrResourceLoader(loader);
configBuilder.setUpdateShardHandlerConfig(updateConfig);
configBuilder.setShardHandlerFactoryConfig(getPluginInfo(root.get("shardHandlerFactory")));
configBuilder.setReplicaPlacementFactoryConfig(
getPluginInfo(root.get("replicaPlacementFactory")));
// configBuilder.setReplicaPlacementFactoryConfig(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a "nocommit" I suppose. BTW that's a magic keyword used by the Lucene & Solr projects that will show up in WIP that is not intended to be committed/merged.

solr/CHANGES.txt Show resolved Hide resolved
* implementation is {@link ZkClusterPluginsSource}, but can be overridden by a system property, or
* by declaring the implementation class in solr.xml
*/
public class ClusterPluginsSourceConfigurator implements NamedListInitializedPlugin {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extending NamedListInitializedPlugin means we have an init() method. But what becomes of those args?

@@ -104,9 +105,6 @@
public class SolrConfigHandler extends RequestHandlerBase
implements SolrCoreAware, PermissionNameProvider {
private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
public static final String CONFIGSET_EDITING_DISABLED_ARG = "disable.configEdit";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the back & forth here... I had thought let's use disable.configEdit but I don't want to shine a light (document) a badly named config, especially at a time when a config naming standardization effort is underway.

Absent further feedback, I think we should use a new property, like solr.cluster.plugin.mutable using EnvUtils. It's scoped to just the /cluster/plugin API and thus doesn't have a debatable scope (too broad, like all config editing, which should get more agreement/care). Such a narrow scope also suggests moving the setting to ContainerPluginsRegistry or ZkClusterPluginsSource.

@pjmcarthur
Copy link
Contributor Author

@pvcnt FYI - This PR has been updated to handle the replicaPlacementFactory declared in solr.xml in the same way as cluster singleton plugins.

Copy link
Contributor

@pvcnt pvcnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach, and feel much more comfortable now that replica placement plugins have been integrated.

@@ -211,6 +215,17 @@ private NodeConfig(
if (null == this.solrHome) throw new NullPointerException("solrHome");
if (null == this.loader) throw new NullPointerException("loader");

if (this.clusterPlugins != null
&& this.clusterPlugins.length > 0
&& !ClusterPluginsSourceConfigurator.resolveClassName()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done later in CoreContainer, and rely on a method on the ClusterPluginsSource interface rather than a comparison of the class name?

&& this.clusterPlugins.length > 0
&& !ClusterPluginsSourceConfigurator.resolveClassName()
.equals(NodeConfigClusterPluginsSource.class.getName())) {
throw new SolrException(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it can be a warning instead of an exception? Not entirely sure, but maybe people will have those in solr.xml, change the source later and want to keep them. Or maybe they could be injected into the cluster API as defaults at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'd be doing users any favors by allowing/encouraging ambiguity in configs, and ignoring some of it based on settings. IMO the exception is correct here.

.map(n -> new PluginInfo(n, n.name(), true, true))
.collect(Collectors.toList());

// Cluster plugin names must be unique
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't plugin names supposed to be unique globally, instead of per plugin type? If so, this should be moved outside of this method specific to ClusterSingleton. We should also verify for reserved names, i.e., that a name does not start with a dot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it seems to be that Container/Cluster plugins have names in one namespace, but that's weird -- Solr doesn't work that way anywhere else. I'd rather fix that limitation in a follow-up -- ContainerPluginsRegistry ought to have a a map keyed by type.

}
}

return pluginInfo.name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can return null, and likely generate an NPE later on (?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be possible to return null here. The name attribute for cluster plugins plugin is mandatory and validated in SolrXmlConfig.
There are some exceptions for built in plugins, like the replica placement plugin, for which it isn't necessary to specify a name in solr.xml because the name is implied. For those cases only, the name is set here.

@@ -124,25 +124,6 @@ public void testConfigurationInSystemProps() {
cc.shutdown();
}

@Test
public void testConfigurationInSolrXml() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this test be moved elsewhere? It is the only end-to-end (well, kindof) verification that the placement plugin configured in solr.xml is indeed taking into account in CoreContainer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see NodeConfigPlacementPluginTest. The test requires a ZK aware SolrCloud now because it uses the ContainerPluginRegistry.

@dsmiley dsmiley changed the title SOLR-17096: Cluster Singleton plugin support for immutable deployments SOLR-17096: Cluster Singleton plugin support in solr.xml Jan 13, 2024
Comment on lines 163 to 164
// configBuilder.setReplicaPlacementFactoryConfig(
// getPluginInfo(root.get("replicaPlacementFactory")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be removed?

* implementation is {@link ZkClusterPluginsSource}, but can be overridden by teh {@link
* ContainerPluginsRegistry#MUTABLE_CLUSTER_PLUGINS} property
*/
public class ClusterPluginsSourceConfigurator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a couple static methods related to ClusterPluginsSource. Shouldn't these be on ClusterPluginsSource instead of having this class to hold them?

.map(n -> new PluginInfo(n, n.name(), true, true))
.collect(Collectors.toList());

// Cluster plugin names must be unique
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it seems to be that Container/Cluster plugins have names in one namespace, but that's weird -- Solr doesn't work that way anywhere else. I'd rather fix that limitation in a follow-up -- ContainerPluginsRegistry ought to have a a map keyed by type.

@@ -75,6 +73,8 @@
public class ContainerPluginsRegistry implements ClusterPropertiesListener, MapWriter, Closeable {
private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());

public static final String MUTABLE_CLUSTER_PLUGINS = "solr.cluster.plugin.mutable";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fresh information: https://cwiki.apache.org/confluence/display/SOLR/System+property+naming+structure
Based on this audit and proposals (not yet formalized, granted), I think "solr.cluster.plugin.edit.enabled" would be the right property name here... ehh @janhoy ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, one can argue how far the naming standard should go. I was mostly concerned with the lowercase dot separated aspect, with a controlled set of top-level categories. Guess it is nice-to-have to also agree on "enabled/disabled", "edit/mutable", "allowlist/includelist" etc

@dsmiley
Copy link
Contributor

dsmiley commented Jan 17, 2024

There seems to be a test failure. Once that's worked out, I plan to merge!

@dsmiley
Copy link
Contributor

dsmiley commented Jan 17, 2024

The failure is unrelated; see #2204 for a fix.

@dsmiley dsmiley merged commit 1de6d08 into apache:main Jan 17, 2024
2 of 3 checks passed
dsmiley pushed a commit that referenced this pull request Jan 18, 2024
* Generalize to all "cluster plugins", albeit only supporting <clusterSingleton> and <replicaPlacementFactory> at this time.
* Introduced new `solr.cluster.plugin.edit.enabled` boolean setting to disable /cluster/plugin mutability, required for solr.xml's use of those plugins.

Co-authored-by: Paul McArthur <pmcarthur-apache@proton.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
5 participants