
Solr 8127 backport distributed luke #4472

Open
kotman12 wants to merge 3 commits into apache:branch_9_11 from kotman12:SOLR-8127-backport-distributed-luke


@kotman12
Contributor

Description

Backport #4149

Tests

Aside from the automated tests already included, I built this manually and sanity-checked some requests against a locally built, multi-shard collection. I tested the narrowing logic by hand by tweaking the request headers and requesting wt=xml, whose output carries the int/long type in each element tag.

kotman12 and others added 2 commits May 27, 2026 16:09
* Fans out to one replica per shard by default when in Solr Cloud mode as well as with `shards` explicitly specified in non-Cloud mode
* Any index information that can't be aggregated, i.e. directory, version, indexCommit, etc., will be placed for every *responding* shard in a new shards response field. This only gets returned when shards.info=true
* docs and docCount were widened to long as they can now overflow. For javabin codec compatibility the server will narrow these to int for old calling SolrClients (when it is safe to do so)
* Previously show=doc mode would error if it couldn't find a matching doc but now returns an empty response and a 200 status code
* show=doc in distributed mode works only with the Solr document ID, not with the Lucene docId, i.e. "id=..." works but "docId=..." does not.
* When in distributed mode Luke handler will validate index and schema flags of each field for consistency and error with an informative message in case of any mismatch.
* You can go back to the old, non-distributed behavior in Cloud mode by specifying distrib=false
* For single-sharded Solr Clouds there is no behavior change (this is a special case).

Co-authored-by: David Smiley <dsmiley@apache.org>
(cherry picked from commit cac69ae)
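
The "narrow to int when it is safe to do so" rule from the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name, method name, and the `clientNeedsInt` flag are hypothetical, and "safe" is assumed to mean that the value fits in a 32-bit int.

```java
public class CountNarrowing {

    /**
     * Hypothetical helper: returns an Integer when the caller is an old client
     * and the count fits in an int; otherwise returns the Long unchanged.
     */
    static Number narrowIfSafe(long count, boolean clientNeedsInt) {
        if (clientNeedsInt && count >= Integer.MIN_VALUE && count <= Integer.MAX_VALUE) {
            return (int) count; // boxed as Integer
        }
        return count; // boxed as Long
    }

    public static void main(String[] args) {
        // Small count, old client: narrowed to Integer.
        System.out.println(narrowIfSafe(10_000L, true).getClass().getSimpleName());
        // Count beyond int range: stays Long even for an old client.
        System.out.println(narrowIfSafe(3_000_000_000L, true).getClass().getSimpleName());
        // New client: always Long.
        System.out.println(narrowIfSafe(10_000L, false).getClass().getSimpleName());
    }
}
```

An if/return pair is used instead of a ternary on purpose: in Java, `cond ? (int) x : x` with a long operand promotes both branches to long, so the narrowing would be lost.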
…che#4455)

* Pooling assertions in fewer tests
* Using SolrCloudTestCase whenever possible

(cherry picked from commit 691053e)

# Conflicts:
#	solr/core/src/test/org/apache/solr/handler/admin/LukeRequestHandlerDistribTest.java
@github-actions github-actions Bot added documentation Improvements or additions to documentation test-framework client:solrj tests labels May 27, 2026
HttpSolrCall call = req.getHttpSolrCall();
if (call == null) return false;
SolrVersion clientVersion = call.getUserAgentSolrVersion();
return clientVersion != null && clientVersion.lessThan(DISTRIB_LONG_COUNTS_MIN_VERSION);
Contributor Author


One issue here is that this only works for javabin clients, but a client requesting wt=xml may still get back an unexpected type, i.e. <long name="docs">10000</long> instead of <int name="docs">10000</int>. You can of course go back to the old, undistributed behavior via distrib=false. Btw I found it amusing that you can hijack the solrj-version headers for the xml wt so it respects the narrowing while writing out to xml instead of javabin, i.e.:

curl -s -H 'User-Agent: Solr[org.apache.solr.client.solrj.impl.Http2SolrClient] 9.10.0' \
  'http://localhost:8983/solr/luke_test/admin/luke?wt=xml&shards.info=true'

vs

curl -s -H 'User-Agent: Solr[org.apache.solr.client.solrj.impl.Http2SolrClient] 9.11.0' \
  'http://localhost:8983/solr/luke_test/admin/luke?wt=xml&shards.info=true'

But obviously that is an odd thing to recommend to users so I'd stick to recommending reverting to old behavior via distrib=false as is documented. Perhaps it is worth rewriting this more explicitly:

To revert to old, pre-distributed behavior just pass distrib=false

I just worry about backwards compatibility and about breaking people who upgrade from the popular and stable 9.10 release.
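
For readers following the User-Agent discussion, here is a rough, self-contained sketch of a version check of this kind. It does not use Solr's `SolrVersion` or `HttpSolrCall` classes. The regex, the class and method names, and the 9.11.0 threshold are all assumptions made for illustration, inferred only from the header shape in the two curl commands.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserAgentVersionCheck {

    // Assumed header shape: "Solr[<client class>] <major>.<minor>.<patch>"
    private static final Pattern SOLRJ_UA =
        Pattern.compile("^Solr\\[[^\\]]*\\]\\s+(\\d+)\\.(\\d+)\\.(\\d+)");

    // Assumed threshold; the real constant's value is not shown in this thread.
    private static final int[] ASSUMED_MIN_VERSION = {9, 11, 0};

    /** Returns {major, minor, patch}, or null if the header is not a SolrJ-style User-Agent. */
    static int[] parseVersion(String userAgent) {
        if (userAgent == null) return null;
        Matcher m = SOLRJ_UA.matcher(userAgent);
        if (!m.find()) return null;
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** Numeric, component-wise comparison, so that 9.9.0 sorts before 9.11.0. */
    static boolean lessThan(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) return a[i] < b[i];
        }
        return false;
    }

    /** True only for a recognized SolrJ client older than the assumed threshold. */
    static boolean needsNarrowing(String userAgent) {
        int[] v = parseVersion(userAgent);
        return v != null && lessThan(v, ASSUMED_MIN_VERSION);
    }

    public static void main(String[] args) {
        String prefix = "Solr[org.apache.solr.client.solrj.impl.Http2SolrClient] ";
        System.out.println(needsNarrowing(prefix + "9.10.0")); // true
        System.out.println(needsNarrowing(prefix + "9.11.0")); // false
        System.out.println(needsNarrowing("curl/8.5.0"));      // false
    }
}
```

The last case mirrors the concern raised here: a non-SolrJ client carries no recognizable version, so it is never narrowed, whatever `wt` it requests.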

Contributor


It's not clear to me why the response writer (wt) would foil our attempts to return a compatible response.

Contributor Author


I didn't mean it was an issue with the wt itself. To test the narrowing behavior for old clients, I hardcoded the Java client version string into the header of a regular curl request with wt=xml, because I realized that the XML writer includes the long/int datatype in each field's XML tag. But this can of course break callers who request XML from a non-Java client and expect docs to be in an int tag rather than a long tag. So I'm just calling this out as a backwards-incompatible flow.
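
One way a non-Java XML caller could protect itself against the int-to-long change is to accept either tag. The sketch below uses only the JDK's DOM parser. The class and method names are hypothetical, and the XML in `main` is a simplified stand-in for a real Luke response, not captured output.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class LukeXmlCounts {

    /**
     * Reads the first <int name="..."> or <long name="..."> element with the
     * given name and returns its value as a long, whichever tag the server used.
     */
    static long readCount(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        for (String tag : new String[] {"int", "long"}) {
            NodeList nodes = doc.getElementsByTagName(tag);
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                if (name.equals(e.getAttribute("name"))) {
                    return Long.parseLong(e.getTextContent().trim());
                }
            }
        }
        throw new IllegalArgumentException("no int or long element named " + name);
    }

    public static void main(String[] args) throws Exception {
        // Simplified, assumed response envelopes for illustration only.
        String oldStyle = "<response><lst name=\"index\"><int name=\"docs\">10000</int></lst></response>";
        String newStyle = "<response><lst name=\"index\"><long name=\"docs\">10000</long></lst></response>";
        System.out.println(readCount(oldStyle, "docs"));
        System.out.println(readCount(newStyle, "docs"));
    }
}
```

Returning a long on the client side regardless of the tag keeps the caller working against both 9.10-style and distributed responses.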

== Distributed Mode (multiple shards)

When running in SolrCloud, the Luke handler automatically distributes requests across all shards in the collection, the same as search requests.
To inspect only the receiving shard's index set `distrib=false`.
Contributor Author


As mentioned previously, perhaps it is worth rewriting this more explicitly:

To revert to old, pre-distributed behavior just pass distrib=false

Contributor

@dsmiley dsmiley May 27, 2026


I should say, to support older clients without updating them we could also recommend a Solr deployment configure luke in solrconfig.xml with a default of distrib=false. But I don't think that belongs on this page; that's an upgrade page matter.
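
A hedged sketch of what such a default might look like in solrconfig.xml. It assumes the implicit /admin/luke handler picks up defaults declared through `initParams`. The path and placement are assumptions for illustration and are not taken from this PR.

```xml
<!-- Assumed configuration: default the Luke handler to non-distributed mode -->
<initParams path="/admin/luke">
  <lst name="defaults">
    <str name="distrib">false</str>
  </lst>
</initParams>
```

Because this is a default rather than an invariant, a caller that does want the distributed view could still pass distrib=true explicitly on the request.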

Contributor Author


This is not a bad idea, I will look to mention this on the upgrade page as part of this change.

Contributor Author

@kotman12 kotman12 May 28, 2026


@dsmiley I added a section to the 9.11 major changes to incorporate this idea. I apologize for the force pushes; I pushed the wrong branch by accident.

@kotman12 kotman12 force-pushed the SOLR-8127-backport-distributed-luke branch from b767b6e to d72a3f8 Compare May 28, 2026 15:44
@kotman12 kotman12 force-pushed the SOLR-8127-backport-distributed-luke branch from d72a3f8 to f94f4a2 Compare May 28, 2026 15:47
