Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-15715: Dedicated query coordinator nodes in the solr cluster #996

Merged
merged 35 commits into from Sep 13, 2022

Conversation

noblepaul
Copy link
Contributor

No description provided.

private static void addReplica(String syntheticCollectionName, CoreContainer cores) {
SolrQueryResponse rsp = new SolrQueryResponse();
try {
cores.getCollectionsHandler().handleRequestBody(new LocalSolrQueryRequest(null,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RESOURCE_LEAK: resource of type org.apache.solr.request.LocalSolrQueryRequest acquired by call to LocalSolrQueryRequest(...) at line 117 is not released after line 114.
Note: potential exception at line 114


ℹ️ Learn about @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.


Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]

.setCreateNodeSet(cores.getZkController().getNodeName()).getParams();
log.info("sending collection admin command : {}", Utils.toJSONString(params));
cores.getCollectionsHandler()
.handleRequestBody(new LocalSolrQueryRequest(null,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RESOURCE_LEAK: resource of type org.apache.solr.request.LocalSolrQueryRequest acquired by call to LocalSolrQueryRequest(...) at line 137 is not released after line 138.
Note: potential exception at line 138


ℹ️ Learn about @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.


Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]

};
}

public static class Factory implements SolrDispatchFilter.HttpSolrCallFactory {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs docs

* large heap utilization
* frequent GC pauses

In such cases, a few dedicated nodes can be started with a *`coordinator`* role and queries can be sent to that node and avoid intermittent and unpredictable load in data nodes. The coordinator node is stateless and does not host any data. So, we can create and destroy coordinator nodes without any data loass or down time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which client needs to be used to send queries to coordinator nodes? Do I need to set a routing property on my request? A shards.preference policy? How do I take advantage of this new node type? Especially if the set of coordinator nodes is dynamic, as suggested below, do I have to look up the live nodes in zookeeper?

Some of these questions would be better answered in solrcloud-distributed-requests.adoc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which client needs to be used to send queries to coordinator nodes?

Client support is not yet added as part of this PR.

Comment on lines +1104 to +1106
LBSolrClient.Req req = new LBSolrClient.Req(request, urlList);
LBSolrClient.Rsp rsp = getLbClient().request(req);
return rsp.getResponse();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this style of multiple return points from a method, can we combine this with the same duplicated code at the end of the method?

==== The work-flow in a `coordinator` node

1. A request for *`coll-A`* that uses configset *`configset-A`* comes to coordinator node
2. It checks if there is a core that uses the configset *`configset-A`* is present. If yes, that core acts as a replica of *`coll-A`* and performs a distributed request to all shards of *`coll-A`* and sends back a response
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies that if collection-1 and collection-2 are both using the same configset then a single synthetic core would be able to service requests for the both of them. I don't think that is accurate from my reading of the code in CoordinatorHttpSolrCall.getCore - we should either be more precise with the docs or improve the code to handle this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies that if collection-1 and collection-2 are both using the same configset then a single synthetic core would be able to service requests for the both of them

yes. why do you think otherwise ?

Comment on lines +54 to +56
if (action == SolrDispatchFilter.Action.PROCESS && core != null) {
solrReq = CoordinatorHttpSolrCall.wrappedReq(solrReq, collectionName, this);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this logic do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if mean , the if condition?

we want to ensure that the SolrQueryRequest object is custom and it behaves as if it is a request for the right collection

@madrob
Copy link
Contributor

madrob commented Sep 6, 2022

Left some comments. I tried to go in with fresh eyes, so I don't know how many are repeated from the last PR. In general, when you close and start new PRs we lose all of the discussion history and previous review feedback, so there can be content that is missed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants