
Remote node CRD #75

Closed

zlobober opened this issue Dec 13, 2023 · 4 comments

@zlobober (Contributor)
We'd like to have an opportunity to describe a group of nodes (data/exec/tablet) as a separate CRD, providing the endpoints of a remote cluster. Such a group of nodes (called remote nodes) will work the same as if they were described within the ytsaurus CRD, but they can belong to a different k8s cluster, which may be useful.

I'd suggest having a CRD RemoteNodeGroup, which looks exactly the same as dataNodes/execNodes/tabletNodes (and, ideally, reuses as much of the existing code as possible).

The main difference is that there should be a field "remoteClusterSpec", which must be a reference to a resource (CRD) of type RemoteYtsaurusSpec.

RemoteYtsaurusSpec must describe the information necessary for connecting a node to a remote cluster. It is a structure, currently containing one field:

MasterAddresses: []string

In the future it will probably be extended with various other fields.
When ytsaurus/ytsaurus#248 is completed:

RemoteClusterUrl: string // allows fetching the cluster connection via an HTTP call to RemoteClusterUrl + "/cluster_connection".

When authentication in the native protocol is employed:

Credentials: SomeKindOfCredentials
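
For concreteness, here is a minimal Go sketch of the proposed types, assuming kubebuilder-style CRD definitions. The RemoteNodeGroup/RemoteYtsaurusSpec names and the MasterAddresses field come from the text above; the package layout, JSON tags, and the LocalObjectReference shape of the reference are illustrative assumptions, not a settled design.

```go
package v1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// RemoteYtsaurusSpec carries the information a node needs to connect
// to a remote cluster. MasterAddresses is the only field for now;
// RemoteClusterUrl and Credentials may be added later, as discussed above.
type RemoteYtsaurusSpec struct {
	MasterAddresses []string `json:"masterAddresses"`
}

// RemoteYtsaurus is the CRD wrapping RemoteYtsaurusSpec.
type RemoteYtsaurus struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec RemoteYtsaurusSpec `json:"spec,omitempty"`
}

// RemoteNodeGroupSpec mirrors the existing dataNodes/execNodes/tabletNodes
// specs, plus a reference to the RemoteYtsaurus resource to connect to.
type RemoteNodeGroupSpec struct {
	// RemoteClusterSpec names the RemoteYtsaurus object in the same
	// namespace (a plain object reference; hypothetical shape).
	RemoteClusterSpec corev1.LocalObjectReference `json:"remoteClusterSpec"`

	// ...node-group fields reused from the existing specs...
}
```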
@psushin (Contributor) commented Dec 15, 2023

Should RemoteYtsaurusSpec be a separate CRD, or just be described in a ConfigMap?
This CRD doesn't have any behavior, as far as I can see, so there is no particular controller, reconciliation, etc.

@zlobober (Contributor, Author)

We discussed this matter with k8s-experienced colleagues, and it seems that having a CRD for such a case is common practice. A ConfigMap is an unschematized object, while a CRD is schematized, which is much less error-prone. In general, a CRD does not necessarily require a designated controller; it may be part of a more complex CRD, as in our case.
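
As a hypothetical illustration of the schematization point: kubebuilder validation markers generate an OpenAPI schema that the API server enforces at admission time, whereas a ConfigMap would silently accept a malformed payload.

```go
// With a ConfigMap, a typo like "masterAdresses" or an empty address
// list would be accepted silently; here the API server rejects the
// object at admission time. (Marker shown for illustration only.)
type RemoteYtsaurusSpec struct {
	// +kubebuilder:validation:MinItems=1
	MasterAddresses []string `json:"masterAddresses"`
}
```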

robot-piglet pushed a commit that referenced this issue Jan 5, 2024
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

First PR for #75

Currently we have test coverage for the YTsaurus config-generating code only for the master and scheduler. Here I added tests for the other components, so the config generator can be easily refactored if needed.

Changes are only in test code, and the behaviour of existing tests is not changed.

---

Pull Request resolved: #83
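
As a sketch of the golden-file approach such config-generator tests typically use: the helper below is generic and self-contained, while the generator call and the testdata path in the usage comment are illustrative assumptions, not the repo's actual API.

```go
package ytconfig_test

import (
	"bytes"
	"os"
	"testing"
)

// requireGolden fails the test when got differs from the checked-in
// golden file, so any refactoring that changes generator output is
// caught immediately.
func requireGolden(t *testing.T, got []byte, goldenPath string) {
	t.Helper()
	want, err := os.ReadFile(goldenPath)
	if err != nil {
		t.Fatalf("reading golden file %s: %v", goldenPath, err)
	}
	if !bytes.Equal(got, want) {
		t.Errorf("output differs from golden file %s", goldenPath)
	}
}

// Usage (hypothetical generator call):
//
//	cfg, _ := generator.GetDataNodeConfig(spec)
//	requireGolden(t, cfg, "testdata/data-node.yson")
```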
@l0kix2 (Collaborator) commented Jan 5, 2024

This change will be split into a chain of PRs to ease code review.

  1. (merged) Config generator tests for all components #83, in which we increase test coverage of the config generator so we don't break things in the following refactoring;
  2. [remote nodes] 1. Refactor config generator #84, in which the config generator is refactored to be able to work with remote nodes in the future;
  3. [remote nodes] 2. Refactor statefulset & server #87: server/statefulset refactoring so they can work with only a master connection instead of the full ytsaurus spec;
  4. [remote nodes] 3. Remote CRDs and controller #111: new CRDs for remote ytsaurus & remote exec nodes (we've decided to start with exec nodes and do the other node types later), plus the reconciler;
  5. [remote nodes] 4. Remote nodes controller #112: remote exec nodes controller implementation;
  6. [remote nodes] 5. e2e test #113: an e2e test.

@l0kix2 (Collaborator) commented Feb 1, 2024

> This CRD doesn't have any behavior, as far as I can see, so there is no particular controller, reconciliation, etc.

> In general, a CRD does not necessarily require a designated controller; it may be part of a more complex CRD, as in our case.

To be clear, the RemoteYtsaurus CRD doesn't require a reconciliation controller, but if the master hosts in RemoteYtsaurus are changed, the RemoteNodes controller should update the nodes' configs. Therefore the RemoteNodes controller needs to watch the connected RemoteYtsaurus for changes.
Implementation-wise it would be simpler to inline (or yaml-anchor-reference from the same file) masterHosts + cellTag in the RemoteNodes spec, but I've found an example of watching an external resource from a controller; it requires a bit of boilerplate code (sketched below), and it seems to work, so we can proceed with the separate-CRD option.
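
That watch boilerplate looks roughly like the following sketch, assuming controller-runtime v0.15+ (where the map function takes a context). The RemoteExecNodes/RemoteYtsaurus type names follow the PR chain above; the import path, the RemoteClusterSpec field shape, and the method names are assumptions for illustration.

```go
package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	ytv1 "example.com/ytsaurus-operator/api/v1" // illustrative import path
)

// RemoteExecNodesReconciler is sketched only far enough to show the
// watch wiring; it embeds client.Client for the List call below.
type RemoteExecNodesReconciler struct {
	client.Client
}

// Reconcile is a stub; the actual node-config reconciliation is elided.
func (r *RemoteExecNodesReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	return ctrl.Result{}, nil
}

// SetupWithManager registers the controller and additionally watches
// RemoteYtsaurus objects, so that a change to master hosts re-triggers
// reconciliation of the node groups referencing that object.
func (r *RemoteExecNodesReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&ytv1.RemoteExecNodes{}).
		Watches(
			&ytv1.RemoteYtsaurus{},
			handler.EnqueueRequestsFromMapFunc(r.remoteYtsaurusToRequests),
		).
		Complete(r)
}

// remoteYtsaurusToRequests maps a changed RemoteYtsaurus to reconcile
// requests for every RemoteExecNodes that references it by name.
func (r *RemoteExecNodesReconciler) remoteYtsaurusToRequests(
	ctx context.Context, obj client.Object,
) []reconcile.Request {
	var nodes ytv1.RemoteExecNodesList
	if err := r.List(ctx, &nodes); err != nil {
		return nil
	}
	var reqs []reconcile.Request
	for _, n := range nodes.Items {
		if n.Spec.RemoteClusterSpec.Name == obj.GetName() {
			reqs = append(reqs, reconcile.Request{NamespacedName: types.NamespacedName{
				Namespace: n.Namespace,
				Name:      n.Name,
			}})
		}
	}
	return reqs
}
```

The map function here trades a cluster-wide List on every event for simplicity; registering a field index on the reference name would avoid listing all node groups each time.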
