Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-14680: Provide simple interfaces to our cloud classes (only API) #1694

Merged
merged 29 commits into from Aug 11, 2020

Conversation

noblepaul
Copy link
Contributor

@noblepaul noblepaul commented Jul 26, 2020

A few notes before anyone who starts reviewing this

  • This was created after I saw similar attempt as a part SOLR-14613: strongly typed placement plugin interface and implementation #1684 . I believe this has to receive a more wider input and review irrespective of whether devs are interested in autoscaling or not
  • This is a WIP PR
  • The concrete implementations are for demo purposes. Can be omitted, if required. Anything outside the o.a.s.cluster.api package is optional and will be removed
  • The interfaces are designed to be minimal to avoid overload. We can and will add more methods later. Let's not add a lot

@murblanc
Copy link
Member

Can we start simple by defining the external interface without the required changes to internal classes first? Will make a smaller PR easier to discuss.

@murblanc
Copy link
Member

I also feel this PR has a lot of overlap with the one I started to define an API for Autoscaling plugins (#1684).

Unless we want to pursue two distinct options for these interfaces and decide later (perfectly OK if that’s the idea) I suggest we use the other PR that has also examples of how these interfaces are used (makes it easier to reason about them and decide what goes where).

@gus-asf
Copy link
Contributor

gus-asf commented Jul 26, 2020

These interfaces look like a good way to begin. Thoughts: This is an excellent point at which to consider and then stick to a preferred set of terminology. My favorites are:

  • Machines are the operating system level container or hardware and have
    • Nodes which are running solr processes on machines which have portions of
      • Collections which are logical indexes associated with a particular schema and have
        • Shards which split a collection on a routing value to allow scaling and have
          • Replicas which are duplicate copies implementing a shard often hosted on separate nodes.

We should strive to apply only ONE word for each level throughout the interfaces to make them as simple and easy to understand as possible.

These 5 terms all have nice clear english words that communicate their function to some extent. Core however only really means "central thing" and I have always thought it was a very confusing word to use here since it has almost no memnonic value. Slice is also good for the shard level, but I think shard is no worse and more commonly used in documentation. Although core has been used many times in the past, I don't think anyone is going to have trouble finding what they want via the terms I'm suggesting.

The Cluster level concept usually ignores Machines and speaks only of nodes... I could be convinced either way on whether or not that's a good thing.

For fun I drew a near worst case diagaram (omitting back references and convenience rollups) :)
cluster-stuff

Copy link
Contributor

@gus-asf gus-asf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that this specifies a read-only view. Made a comment regarding naming conventions elsewhere.


String baseUrl(boolean isV2);

SimpleMap<ShardReplica> cores();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly good to also answer the question of "what collections are participating in this node" and what shards of Collection X are on this node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say, that is something that can be easily computed from the cores() method and can be provided as a helper method elsewhere

*/
public interface SolrCluster {
/** collections in the cluster */
SimpleMap<SolrCollection> collections();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also aliases... and the Alias class returned should list the collections provided and routing info if routed. (also law of Demeter etc...)

Copy link
Contributor Author

@noblepaul noblepaul Jul 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I wanted to include aliases. Where do you propose it to be there?

I would say , There should be no Alias class/interface.

It should be as simple as
SolrCluster#collections#get(String name , boolean includeAlias)

and

SolrCollection#aliases() returns List<String>

import org.apache.solr.common.util.SimpleMap;

/** Represents a collection in Solr */
public interface SolrCollection {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I've definitely had cases where I wanted a list of nodes where this collection is hosted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be a utility method outside

/**
* Represents a Solr cluster
*/
public interface SolrCluster {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have our own package namespace, prepending solr isn't really needed unless we think we might also model non-solr clusters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know. Imagine this will also be used outside of Solr as a part of SolrJ. So if you read some client code,

Cluster cluster;

is less readable compared to

SolrCluster solrCluster;

Eventually, I would wish to replace a lot of SolrJ code/API with a standard set of interfaces.

* Solr node
*/
String thisNode();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've wondered sometimes if clusters should have name or id of some sort but that's probably another topic.


import java.util.LinkedHashMap;

public class LinkedSimpleHashMap<T> extends LinkedHashMap<CharSequence, T> implements SimpleMap<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this class appears unused...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this was to demonstrate how to implement a default impl of SimpleMap

@@ -38,7 +43,7 @@
* {@link ZkStateReader#getClusterState()}.
* @lucene.experimental
*/
public class ClusterState implements JSONWriter.Writable {
public class ClusterState implements JSONWriter.Writable , SolrCluster {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little worried that this design allows for casting of the SolrCluster reference...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have multiple implementations of SolrCluster .The idea of existing classes implementing these interfaces is to have readily available implementations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename Solr luster to Cluster.

@murblanc
Copy link
Member

murblanc commented Jul 26, 2020

Snitches have a target (at least node or replica, but one could imagine Cluster, collection or shard if such snitches do not already exist).
Would be helpful for all these classes to implement a common interface so they can be passed when a notch target is needed in Autoscaling plugins.

(See in https://github.com/apache/lucene-solr/pull/1684/files#diff-fc8bc5eb94b9e48a24c2ec768733e72b)

@chatman
Copy link
Contributor

chatman commented Jul 26, 2020

As I suggested in the other PR on autoscaling, I feel we should change the package name:
#1684 (comment)

Also, WDYT about marking the concrete classes DocCollection, Replica etc. with @lucene.internal to make it clear that they shouldn't be used directly from the plugins? Alternatively, we can mark these new interfaces in this PR with a new annotation (e.g. @solr.external) that signals to plugin writers that these will be stable for their use across versions.

@gus-asf
Copy link
Contributor

gus-asf commented Jul 27, 2020

While marking things internal and external is good, a stronger possibility is to code more defensively and make "new plugins" get instantiated/injected with implementations of a facade that return instances of the interfaces. If those implementations can't just be cast to internal types we then have a much higher barrier to abuse (reflection).

As for package names I agree we need to think carefully about them. It is a question what the goal is too. If we're reworking the internals, sdk won't sound right at all. If we are building something only meant to be consumed by an outer layer of plugins to insulate the plugins from internal refactoring, something in that vein sounds fine (or maybe o.a.s.cloud.plugin.api?) Settling on what our goals are will help us decide what name is best.

@noblepaul
Copy link
Contributor Author

noblepaul commented Jul 27, 2020

Can we start simple by defining the external interface without the required changes to internal classes first? Will make a smaller PR easier to discuss.

Feel free to ignore the implementing classes. That was to demonstrate one way of implementing these interfaces. I'm happy to focus on the interfaces inside the o.a.s.cluster.api package

I have updated the description. Only the SDK is important. Implementation can/will be removed

@noblepaul
Copy link
Contributor Author

I also feel this PR has a lot of overlap with the one I started to define an API for Autoscaling plugins #1684

Yes, totally.That ticket is trying to do something more than the scope of that ticket. I want to limit the scope of that ticket and have wider discussion on how a simple set of interfaces to represent the existing Solr cluster/cloud

@noblepaul
Copy link
Contributor Author

noblepaul commented Jul 27, 2020

These interfaces look like a good way to begin. Thoughts: This is an excellent point at which to consider and then stick to a preferred set of terminology. My favorites are:

There is no way to have a "machine" abstraction in Solr. We can start with a Node as an atomic unit. I wanted to carefully avoid the class names already taken up so that existing code can be left as it is

@gus-asf
Copy link
Contributor

gus-asf commented Jul 27, 2020

These interfaces look like a good way to begin. Thoughts: This is an excellent point at which to consider and then stick to a preferred set of terminology. My favorites are:

There is no way to have a "machine" abstraction in Solr. We can start of a Node as an atomic unit. I wanted to carefully avoid the class names already taken up so that existing code can be left as it is

Yeah machine is a difficult notion as it's a real world thing entirely outside java. It probably has to be optional and provided by sysprop or env var. and even so might be reducible to a property on node. Certainly detecting it would be fraught with peril. But machine designations (and rack designations) might be important to some folks.

@murblanc
Copy link
Member

Many methods returning other objects are returning names of other abstractions being defined here (Shard, Collection, Node etc.).
I’d think an instance of the abstraction should be returned instead, and that instance would have a getName() method (i.e have SolrCollection getCollection() rather than String getCollection() on a Shard for example).
What’s the rationale for returning names rather than objects?

@murblanc
Copy link
Member

I see two cases where we need interfaces such as defined here:

  • internal code. Coding to interfaces rather than the actual implementation makes for better structured code usually with less implementation leaks,
  • external “plugins”.

What’s the intention? If only the later, then there’s really no need to have the internal classes implement these interfaces, a wrapper is ok.
I believe addressing both points with a single interface is complicated (and to a point counterproductive as it ties internal and external views).

@chatman
Copy link
Contributor

chatman commented Jul 28, 2020

+1 to o.a.s.cluster.api package name. +1 to getting rid of "cloud" here.


String name();

/**set of files in the config */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General : Please format Javadocs properly with text starting on line 2 and capital letter, even if it takes more lines.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A one-line variant is acceptable: https://google.github.io/styleguide/javaguide.html#s7.1.1-javadoc-multi-line
The only problem I see with the javadoc here is that it's missing a leading space, and it should capitalize the first letter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, agree


import org.apache.solr.common.SolrException;

public interface Config {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ConfigSet instead? Config is soooo overloaded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like both names , Config & ConfigSet

configset suggests it is a set of configs? in reality it's a single configuration which has multiple files

@murblanc
Copy link
Member

Machine vs Node: environment variables (rack, availability zone etc) are "machine" abstractions but can be considered at the node level. Free disk is clearly a machine abstraction (two nodes on same machine might be sharing the same free disk and skewing decisions based on it) but I don't see how this can be solved (even a single node might be sharing space with other processes).
Therefore I don't see how a machine abstraction helps.

@murblanc
Copy link
Member

Would love to see sample plug-in code accessing cluster config. How the instances are obtained and how the cluster is explored (nodes, collections, shards, replicas).

@noblepaul
Copy link
Contributor Author

@murblanc I shall try to provide some sample implementation in another 2-3 days

@noblepaul
Copy link
Contributor Author

Opening another PR

@noblepaul noblepaul changed the title SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes (only API) Aug 10, 2020
@noblepaul noblepaul changed the title SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes (only API) SOLR-14680: Provide simple interfaces to our cloud classes (only API) Aug 10, 2020
@noblepaul
Copy link
Contributor Author

I intend to merge this soon

@noblepaul noblepaul merged commit 15ae014 into apache:master Aug 11, 2020
gus-asf pushed a commit to gus-asf/lucene-solr that referenced this pull request Sep 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants