Skip to content

SOLR-12963: add support for node-level uninvertible support configuration#1488

Draft
magibney wants to merge 1 commit intoapache:mainfrom
cowpaths:SOLR-12963-uninvertible-config
Draft

SOLR-12963: add support for node-level uninvertible support configuration#1488
magibney wants to merge 1 commit intoapache:mainfrom
cowpaths:SOLR-12963-uninvertible-config

Conversation

@magibney
Copy link
Contributor

See: SOLR-12963

We missed the boat on changing the uninvertible default for solr 9.

To avoid having to wait until Solr 10, this PR (initially a draft/proposal, no tests) makes it possible to configure "uninvertible" behavior and defaults at the node level. Rather than simply changing the default, this supports 4 modes, configured via solr.xml:

  1. FORBID: any attempt to configure uninvertible="true" at the field level will throw an error and node startup.
  2. DISABLE: tolerant of configurations with uninvertible="true", but ignores such configurations, logging an info message to that effect (this option is best for testing/convenience)
  3. DEFAULT_FALSE: by default, fields are not uninvertible, but this may be overridden at the fieldType or field level.
  4. DEFAULT_TRUE: by default, fields are uninvertible, but this may be overridden at the fieldType or field level

The current implicit default is DEFAULT_TRUE. There are a couple of benefits to this approach over simply changing the default:

  1. Allows users to easily configure the safety of uninvertible="false" at the node level, without needing to wait for the next major version
  2. Providing the ability to forbid/disable uninversion at the node level allows SolrIndexSearcher to avoid UninvertibleReader wrapping entirely.

Copy link
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pursuing this!

Not sure if we need a FORBID vs DISABLE distinction.

Comment on lines +119 to +127
/**
* Implicit default for `uninvertible` FieldType property, if not specified by `fieldType` or
* `field`. This implicit default may be overridden by a node-level default specified in
* `solr.xml`. The implicit defaults is <code>true</code> for historical reasons, but users are
* strongly encouraged to set this to <code>false</code> for stability and use <code>
* docValues="true"</code> or <code>uninvertible="true"</code> as needed.
*/
public static final UninvertingReader.Support IMPLICIT_DEFAULT_UNINVERTIBLE =
UninvertingReader.Support.DEFAULT_TRUE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we could quickly follow-up with a master-only change (v10) to DEFAULT_FALSE.

public static final UninvertingReader.Support IMPLICIT_DEFAULT_UNINVERTIBLE =
UninvertingReader.Support.DEFAULT_TRUE;

private static UninvertingReader.Support uninvertibleSupport = IMPLICIT_DEFAULT_UNINVERTIBLE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static non-final variables concern me, as they should anyone. It has to be handled carefully in tests to ensure it's reset. If we can avoid this, we should. Did you investigate passing this into the IndexSchema constructor, ultimately held in the NodeConfig?

Copy link
Contributor Author

@magibney magibney Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this (I think this ties into a number of your other comments as well). I was taking an approach analogous to the precedent set by maxBooleanClauses config here, but I agree if this can be done another way, that would be preferable.

return ExitableDirectoryReader.wrap(
UninvertingReader.wrap(reader, core.getLatestSchema().getUninversionMapper()),
SolrQueryTimeoutImpl.getInstance());
switch (IndexSchema.getUninvertibleSupport()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like it should be an instance method on the IndexSchema, not a global lookup. Even if it may be toggle-able at a global level.

switch (IndexSchema.getUninvertibleSupport()) {
case DEFAULT_FALSE:
case DEFAULT_TRUE:
reader = UninvertingReader.wrap(reader, core.getLatestSchema().getUninversionMapper());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the schema has no field types declared UNINVERTABLE (wether it occurs explicitly or via a default), we could skip this wrap here. Not strictly necessary but it removes a thin layer of indirection. Such logic could go here or be in the wrap method if, say, we have the uninversionMapper return null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll look into this. Maybe worth doing if there's an easy way to track this, but if not I'm not sure it'd be worth the effort (the cost of wrapping presumably being relatively low -- I just did it here because it was easy 🙂).

@magibney
Copy link
Contributor Author

Not sure if we need a FORBID vs DISABLE distinction.

That's fair. I was thinking of DISABLE as offering a lazy way for people to "try this out" without overhauling their schema. Probably dropping DISABLE and only leaving FORBID would be the way to go, if choosing one or the other?

@justinrsweeney
Copy link
Contributor

Do you think this should instead live in the solrconfig.xml for a collection instead of solr.xml for a node? I think of this as being a per-collection related configuration as opposed to a per-node setting.

@magibney
Copy link
Contributor Author

hmmm yes; along the lines of what @justinrsweeney is suggesting: the patch that @hossman uploaded on SOLR-12963 left support for uninversion always enabled, but flipped the default behavior according to new schema.xml schema version (proposed version: 1.7).

I'm now seeing the schema-version-based approach as basically equivalent to what I proposed in this PR -- and arguably more straightforward. Especially assuming that @dsmiley's suggestion to dynamically avoid wrapping in UninvertingReader pans out, then iiuc the only thing that this PR offers that the schema-version-based approach does not is the ability to forbid uninversion, and the ability to configure uninversion behavior independent of schema version (e.g., if someone wanted to forbid uninversion and use schema version 1.5?). I'm leaning toward thinking that those differences are not enough to justify the extra complexity of the approach taken here.

Are there any other reasons to prefer configuration via solr.xml? Any reasons to be wary of introducing a new schema version for this purpose?

@dsmiley
Copy link
Contributor

dsmiley commented Mar 29, 2023

Definitely a schema (thus configSet) thing and not a node thing. Ideally just about anything one could specify in a configSet could have some general means of configuring globally but (a) I wouldn't want each feature having to "do something" to enable that (no custom solr.xml handling), (b) it's semi-possible already albeit with sys-prop substitution, and (c) and really it's a bigger issue than this specific one.

BTW I have thoughts on "schema version" https://issues.apache.org/jira/browse/SOLR-12420 but don't let that stop you if you prefer to add yet another schema version.

@noblepaul
Copy link
Contributor

I too feel that this should not be a node level property . I agree with @dsmiley @justinrsweeney

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants