Contributor
please don't add to the changelog on documentation matters

Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# See https://github.com/apache/solr/blob/main/dev-docs/changelog.adoc
title: Improve combined query documentation
type: other
authors:
- name: Ilaria Petreti
- name: Alessandro Benedetti
links:
- name: SOLR-18100
url: https://issues.apache.org/jira/browse/SOLR-18100
Original file line number Diff line number Diff line change
@@ -25,6 +25,43 @@ It is extending JSON Query DSL ultimately enabling Hybrid Search.
This feature is currently unsupported for grouping and Cursors.
====

[IMPORTANT]
====
This feature works in both Standalone and SolrCloud modes and always performs distributed search execution.
In Standalone (user-managed) mode, shard URLs must be explicitly allow-listed using the *allowUrls* parameter; otherwise, Solr returns HTTP 403. For example:
Contributor
I haven’t encountered this issue before, as it might vary depending on the infrastructure (?)


```
--jvm-opts "-Dsolr.security.allow.urls=http://localhost:8983/solr/"
Contributor
Hmm; okay, the need for this is something Solr should improve (to automatically add localhost same-port). If this was improved, I suspect you wouldn't have even written this entire paragraph.

Contributor
--jvm-opts -- surely we can just pass -Dsolr.....

```

In SolrCloud mode, this is managed automatically via ZooKeeper.
Contributor
I would do away with this sentence, as redundant with the prior paragraph. And the ZK reference is a needless implementation detail reference.

====

== Configuration Requirements

The Combined Query feature has a separate handler, implemented by the class `solr.CombinedQuerySearchHandler`, which can be configured as follows:

```
<requestHandler name="/search" class="solr.CombinedQuerySearchHandler">
.....
</requestHandler>
```

In addition, the `QueryComponent` has been extended to create a new `CombinedQueryComponent`, which must be declared as a search component:
```
<searchComponent class="solr.CombinedQueryComponent" name="combined_query">
<int name="maxCombinerQueries">2</int>
</searchComponent>
```


The search component also accepts the following parameter:

`maxCombinerQueries`::
This parameter can be set to put upper limit check on the maximum number of queries can be executed defined in `combiner.query`.
Contributor
Suggested change
- This parameter can be set to put upper limit check on the maximum number of queries can be executed defined in `combiner.query`.
+ This parameter can be set to enforce an upper limit on the number of queries defined in `combiner.query`.

It defaults to `5` if not set.


== Query DSL Structure
The query structure is similar to JSON Query DSL except for how multiple queries are defined along with their parameters.

@@ -71,37 +108,41 @@ Below is a sample JSON query payload:
}
```

== Search Handler Configuration

Combined Query Feature has a separate handler with class `solr.CombinedQuerySearchHandler` which can be configured as below:

```
<requestHandler name="/search" class="solr.CombinedQuerySearchHandler">
.....
</requestHandler>
```
== Combiner Algorithm Plugin

The Search Handler also accepts parameters as below:
As mentioned xref:json-combined-query-dsl.adoc#query-dsl-structure[above], custom algorithms can be configured to combine the results across multiple queries using a https://solr.apache.org/guide/solr/latest/configuration-guide/solr-plugins.html[Solr plugin].

`maxCombinerQueries`::
This parameter can be set to put upper limit check on the maximum number of queries can be executed defined in `combiner.query`.
It defaults to `5` if not set.
The class implementing the custom logic must extend `QueryAndResponseCombiner`, an abstract base class that provides a framework for algorithms that merge ranked lists and shard documents.
Contributor
Add the full reference of the class 'QueryAndResponseCombiner'

Contributor
+1, having the full class reference here would be nice.
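
To make "merging ranked lists" concrete, here is a minimal, standalone Python sketch of Reciprocal Rank Fusion (RRF), the algorithm whose parameters this documentation refers to later. It does not use or reproduce the `QueryAndResponseCombiner` API, whose method signatures are not shown in this diff. The function name, the document ids, and the constant `k=60` are assumptions chosen for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document ids into one scored ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. This is a generic RRF sketch, not Solr's code.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id so the
    # output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists, e.g. one from a lexical query and one from a vector query.
lexical = ["doc1", "doc2", "doc3"]
vector = ["doc3", "doc1", "doc4"]

fused = reciprocal_rank_fusion([lexical, vector])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

A custom combiner would apply its own scoring rule at the point where this sketch adds `1 / (k + rank)`; how that rule is wired into Solr depends on the abstract class, which should be checked in the Solr source.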


=== Combiner Algorithm Plugin
The custom class must be implemented in a Java project built against the Solr version that includes this feature (declared as a dependency in the build configuration), and the compiled JAR must then be deployed to the Solr libraries directory `../server/solr-webapp/webapp/WEB-INF/lib`.
Contributor
Rather than say these specific things, please instead refer to solr-plugins.adoc which discusses several ways to install them (IMO least preferred is WEB-INF/lib). If you want to say more about java development, I think that page would be where we might want to say such things. But not this page here.

Contributor
+1 to this


As mentioned xref:json-combined-query-dsl.adoc#query-dsl-structure[above], custom algorithms can be configured to combine the results across multiple queries.
The Combined Query Search Handler definition takes parameter `combiners` where a custom class can be used to define the algorithm by giving a name and the parameters required.
The Combined Query Component definition takes the `combiners` parameter, where the custom class can be declared by specifying a name and the custom parameters required by the custom algorithm.

Example of the Search Handler as below:
Example of the Search Component as below:
```
<searchComponent class="solr.CombinedQueryComponent" name="combined_query">
<int name="maxCombinerQueries">2</int>
<int name="maxCombinerQueries">2</int>
<lst name="combiners">
<lst name="customAlgorithm">
<str name="class">org.apache.solr.search.combine.CustomCombiner</str>
<int name="var1">35</int>
<str name="var2">customValue</str>
<lst name="customAlgorithm">
<str name="class">org.apache.solr.handler.component.combine.CustomCombiner</str>
<int name="customParam1">35</int>
<str name="customParam2">customValue</str>
</lst>
</lst>
</searchComponent>
</searchComponent>
```

Then, when executing the combined query, the only thing that changes in the JSON query payload is the value specified in the `combiner.algorithm` parameter:
Contributor
I appreciate you breaking this down.


```
...
"params": {
"combiner": true,
"combiner.query": ["lexical1", "vector"],
"combiner.algorithm": "customAlgorithm"
}
...
```

In this case, `combiner.algorithm` is set to `customAlgorithm`, the name defined in the configuration; the RRF-specific parameters do not need to be provided.
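
As a rough illustration of how a client might assemble such a request, the Python sketch below builds a JSON body with the three `combiner` parameters shown in the fragment. The full sample payload is elided in this diff, so the `queries` section follows the general JSON Query DSL shape and is an assumption, as are the helper name `build_combined_query`, the field names, and the query values. The length check only mirrors the `maxCombinerQueries` setting on the client side; how Solr itself responds when the limit is exceeded is not described here.

```python
import json


def build_combined_query(queries, combiner_queries, algorithm, max_combiner_queries=5):
    """Build a JSON request body for a combined query (illustrative only)."""
    # Client-side sanity checks (hypothetical helper behaviour, not Solr's).
    unknown = [name for name in combiner_queries if name not in queries]
    if unknown:
        raise ValueError(f"combiner.query references undefined queries: {unknown}")
    if len(combiner_queries) > max_combiner_queries:
        raise ValueError(
            f"{len(combiner_queries)} queries exceed maxCombinerQueries={max_combiner_queries}"
        )
    return {
        "queries": queries,
        "params": {
            "combiner": True,
            "combiner.query": list(combiner_queries),
            # Name of the combiner as declared under `combiners` in the configuration.
            "combiner.algorithm": algorithm,
        },
    }


# Hypothetical sub-queries: field names and values are made up for this sketch.
queries = {
    "lexical1": {"lucene": {"query": "title:sales"}},
    "vector": {"knn": {"f": "vector", "topK": 5, "query": "[0.1,-0.34,0.89,0.02]"}},
}

payload = build_combined_query(
    queries, ["lexical1", "vector"], "customAlgorithm", max_combiner_queries=2
)
print(json.dumps(payload, indent=2))
```

The resulting body could then be sent as a POST request to the handler configured with `solr.CombinedQuerySearchHandler` (`/search` in the configuration example).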

Contributor
you can add links to our blog posts, we have two on the topic, you can ask Lisa for the URLs (they will be published before this documentation will)