SOLR-17289: OrderedNodePlacementPlugin: optimize don't loop collections #2459

dsmiley · 2024-05-14T02:51:42Z

https://issues.apache.org/jira/browse/SOLR-17289

Not sure if the first version here is right; maybe there are untested issues where this won't work? And maybe withCollection/withShards can be made scalable too, e.g. by initializing the replicas of only those collections (not the whole cluster!).

dsmiley · 2024-05-14T14:23:00Z

Separately, I wonder if we make it too easy to loop the state of every collection. I'm looking at Cluster added by @murblanc which contains not only an Iterator<SolrCollection> iterator(); method, but also Iterable<SolrCollection> collections(); to make it that much easier. Instead, let's just have a method to list collection names. Then if the caller is hell bent on looping everything in the cluster, it's going to be that much more obvious to that code that it's looking up collection info for each and every one.

murblanc · 2024-05-17T09:18:36Z

> Separately, I wonder if we make it too easy to loop the state of every collection. I'm looking at Cluster added by @murblanc which contains not only an Iterator<SolrCollection> iterator(); method, but also Iterable<SolrCollection> collections(); to make it that much easier. Instead, let's just have a method to list collection names. Then if the caller is hell bent on looping everything in the cluster, it's going to be that much more obvious to that code that it's looking up collection info for each and every one.

org.apache.solr.cluster.Cluster is made to present the internal cluster abstraction to plugin writers in order to decouple plugins from the internal implementation (so we can change the abstractions without breaking plugins).

The existing internal cluster abstraction does allow listing all collections, as does the Collection API BTW. Instead of shooting the messenger (org.apache.solr.cluster.Cluster) we could reconsider if listing all collections in general makes sense. Obviously listing all collections does not scale, but most SolrCloud deployments do not have such scaling issues.

If we do think that listing all collections is useful (which you seem to agree to given the proposal to return all names), I'd rather have the API we offer to plugin writers be easy to use. Returning names and forcing the caller to go fetch the collection one by one is not convenient.

aparnasuresh85 · 2024-05-17T18:14:09Z

We discovered a severe issue during QA for this change, where although the fix placed replicas faster by 90% on avg, replicas were consistently placed on only a few nodes. In our case, they were always placed on the same two nodes, likely due to the replication factor.

dsmiley · 2024-05-17T19:19:21Z

Yeah the results were disappointing from a placement diversity standpoint, which is a total deal-breaker. Perhaps a bit of randomness layered onto the placement would help with placement diversity? But I confess this really is just a draft PR; I didn't try to deeply understand why all replicas get weighted. I was encouraged to see all tests pass, so clearly there's a test gap that would allow this change to go in yet be quite flawed. I found no tests specific to SimplePlacementPlugin, the one we used with the change here.
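One way to read "a bit of randomness layered onto the placement" is random tie-breaking among equally weighted nodes. The sketch below is only an illustration of that idea: `RandomTieBreak`, `pickNode`, and the integer weights are hypothetical and are not taken from OrderedNodePlacementPlugin or SimplePlacementPlugin, and nothing here has been shown to fix the skew reported above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not part of Solr: picks a node for the next replica.
final class RandomTieBreak {
  private RandomTieBreak() {}

  // Chooses uniformly at random among the nodes that share the lowest weight,
  // instead of always taking the first such node in iteration order.
  static String pickNode(Map<String, Integer> weightByNode, Random random) {
    if (weightByNode.isEmpty()) {
      throw new IllegalArgumentException("no candidate nodes");
    }
    int min = Integer.MAX_VALUE;
    for (int weight : weightByNode.values()) {
      min = Math.min(min, weight);
    }
    List<String> candidates = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : weightByNode.entrySet()) {
      if (entry.getValue() == min) {
        candidates.add(entry.getKey());
      }
    }
    return candidates.get(random.nextInt(candidates.size()));
  }
}
```

If weights are never tied in practice, tie-breaking alone changes nothing; in that case the randomness would have to be applied to the weights themselves, which this sketch does not attempt.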

We're going a different direction that does not use an OrderedNodePlacementPlugin foundation; we will not return to this matter to fix it, unfortunately.

RE Collection listing: IMO it should definitely continue to be supported. My objection is producing java.util.Collection or Iterable or Map of basically any aggregate of the state of a collection (e.g. SolrCollection, DocCollection, etc.). List collections by name, then force the caller to resolve a name to a state if it must. A bit of API friction can be a good thing where we know there are performance issues. I suppose we shall agree to disagree as usual, Ilan.
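To make the "list names, then resolve" shape concrete, here is a minimal sketch. `CollectionLookup`, `CollectionState`, and `CountingLookup` are hypothetical names invented for illustration; none of them exists in org.apache.solr.cluster or elsewhere in Solr. The counter only serves to show that each resolution is an explicit, visible step in the calling code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a collection's state; not a Solr class.
final class CollectionState {
  final String name;
  final int replicaCount;

  CollectionState(String name, int replicaCount) {
    this.name = name;
    this.replicaCount = replicaCount;
  }
}

// Hypothetical API shape: listing yields names only; state is resolved per name.
interface CollectionLookup {
  List<String> collectionNames();

  Optional<CollectionState> resolve(String name);
}

// In-memory implementation that counts resolutions to make their cost visible.
final class CountingLookup implements CollectionLookup {
  private final Map<String, CollectionState> states = new LinkedHashMap<>();
  private int resolveCalls = 0;

  CountingLookup(List<CollectionState> initial) {
    for (CollectionState state : initial) {
      states.put(state.name, state);
    }
  }

  @Override
  public List<String> collectionNames() {
    return new ArrayList<>(states.keySet()); // names only, no state loaded
  }

  @Override
  public Optional<CollectionState> resolve(String name) {
    resolveCalls++; // stands for a potentially expensive state fetch
    return Optional.ofNullable(states.get(name));
  }

  int resolveCalls() {
    return resolveCalls;
  }
}
```

A caller that only needs one collection pays for one `resolve` call; a caller that loops over every name pays once per collection, and that loop is plain to see in its own code. The convenience cost raised earlier in the thread is equally visible: the caller has to write that loop itself.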

epugh · 2024-05-22T21:00:58Z

One thought I had the other day is that I've seen plenty of APIs that take a range when you list things. Without the range or size parameter, you get X, but you can control that by specifying some other counter. Would that let folks with just a small number of collections keep the simplicity, while those with thousands use a range to work their way through? Kind of like rows and start ;-)
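A rows/start style listing could look roughly like the sketch below. `Paging`, `page`, and `DEFAULT_ROWS` are hypothetical names and the default of 10 is an arbitrary assumption; this is not an existing Solr API, just an illustration of "a default page unless the caller asks for a range".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: offset/limit paging over a list of names.
final class Paging {
  static final int DEFAULT_ROWS = 10; // assumed default page size

  private Paging() {}

  // No range given: return the first DEFAULT_ROWS items.
  static <T> List<T> page(List<T> all) {
    return page(all, 0, DEFAULT_ROWS);
  }

  // Returns up to `rows` items starting at offset `start`.
  // A start beyond the end of the list yields an empty page.
  static <T> List<T> page(List<T> all, int start, int rows) {
    if (start < 0 || rows < 0) {
      throw new IllegalArgumentException("start and rows must be non-negative");
    }
    int from = Math.min(start, all.size());
    int to = (int) Math.min((long) from + rows, (long) all.size());
    return new ArrayList<>(all.subList(from, to));
  }
}
```

With a handful of collections the one-argument form returns everything in one call; with thousands, the caller advances `start` by `rows` until an empty page comes back.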

SOLR-17289: OrderedNodePlacementPlugin: optimize don't loop collections

ee1286e

dsmiley requested a review from HoustonPutman May 14, 2024 02:51

github-actions bot added the cat:cloud label May 14, 2024
