Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split startup logic out of ScyllaDB container #1934

Merged
merged 1 commit into from
Aug 16, 2024

Conversation

tnozicka
Copy link
Member

@tnozicka tnozicka commented May 22, 2024

Description of your changes:
Startup logic should run in a separate container. See #1940 for more context.

This PR adds the scylla-operator run-ignition command, that runs in its own container and signals ScyllaDB container by creating a file that unblocks its start up. Unfortunately, we can't do this with an init container because we need ScyllaDB container running so tuning can be run and its result can unblock the startup.

Contrary to the old logic that was running in the same container where scylla container ID didn't change (it was restarted with it), the new logic has to deal with all the externalities like living through a scylla container restart and its container id changing.

(It also raises log level for checking the tuning ConfigMap so it's clearer what's being waited on.)

Which issue is resolved by this Pull Request:
Resolves #1941

Requires

@tnozicka tnozicka added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels May 22, 2024
Copy link
Contributor

@tnozicka: GitHub didn't allow me to request PR reviews from the following users: tnozicka.

Note that only scylladb members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Description of your changes:
TODO

Which issue is resolved by this Pull Request:
Resolves TODO

Requires

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@scylla-operator-bot scylla-operator-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 22, 2024
@tnozicka tnozicka changed the title [WIP] Split startup logic from ScyllaDB container [WIP] Split startup logic out of ScyllaDB container May 24, 2024
@scylla-operator-bot scylla-operator-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 11, 2024
@scylla-operator-bot scylla-operator-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 15, 2024
@tnozicka tnozicka force-pushed the split-starter branch 4 times, most recently from 81acb6e to 3f15229 Compare July 18, 2024 12:01
@tnozicka tnozicka changed the title [WIP] Split startup logic out of ScyllaDB container Split startup logic out of ScyllaDB container Jul 18, 2024
@scylla-operator-bot scylla-operator-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 18, 2024
@tnozicka
Copy link
Member Author

/cc @zimnx @rzetelskik

@tnozicka
Copy link
Member Author

ping @rzetelskik @zimnx

@rzetelskik
Copy link
Member

ping @rzetelskik @zimnx

Sorry, this one slipped away. I'll send a review today.

pkg/cmd/operator/ignition.go Outdated Show resolved Hide resolved
pkg/cmd/operator/ignition.go Show resolved Hide resolved
pkg/cmd/operator/sidecar.go Show resolved Hide resolved
pkg/cmd/operator/ignition.go Outdated Show resolved Hide resolved
pkg/cmd/operator/ignition.go Outdated Show resolved Hide resolved
pkg/cmd/operator/ignition.go Outdated Show resolved Hide resolved
pkg/cmd/operator/ignition.go Outdated Show resolved Hide resolved
pkg/cmd/operator/ignition.go Outdated Show resolved Hide resolved
pkg/cmd/operator/ignition.go Outdated Show resolved Hide resolved
pkg/cmd/operator/ignition.go Show resolved Hide resolved
@tnozicka tnozicka changed the title Split startup logic out of ScyllaDB container [WIP] Split startup logic out of ScyllaDB container Jul 30, 2024
@scylla-operator-bot scylla-operator-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 30, 2024
@tnozicka
Copy link
Member Author

@scylla-operator-bot scylla-operator-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. area/dependency Issues or PRs related to dependency changes size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 6, 2024
@tnozicka tnozicka force-pushed the split-starter branch 6 times, most recently from 4c4f8de to d9b34c6 Compare August 13, 2024 16:30
@tnozicka tnozicka changed the title [WIP] Split startup logic out of ScyllaDB container Split startup logic out of ScyllaDB container Aug 13, 2024
@tnozicka
Copy link
Member Author

@zimnx @rzetelskik please take another look. I had to add an extra controller instead of the serial (and non-repeated) check to evaluate the ignition / tuning conditions, like it used to, because a scylla container can now restart independently and tuning was getting stuck on the old containerID not being tuned. (The independent restart always happens in our NodeConfig should correctly project state for each scylla pod test because a startup probe times out when the tuning is blocked for 10m30s. The container restart, combined with preStopHook and cri-o bug has also uncovered the other issues around contexts and retries with NodeConfigs handled in the other PRs like #2058 #2073)

Copy link
Member

@rzetelskik rzetelskik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple of naming comments
rest lgtm

/assign zimnx

pkg/controller/ignition/controller.go Outdated Show resolved Hide resolved
pkg/controller/ignition/controller.go Show resolved Hide resolved
pkg/controller/ignition/controller.go Outdated Show resolved Hide resolved
pkg/controller/ignition/controller.go Outdated Show resolved Hide resolved
pkg/controller/scyllacluster/resource.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@zimnx zimnx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
/assign rzetelskik

Copy link
Member

@rzetelskik rzetelskik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@scylla-operator-bot scylla-operator-bot bot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2024
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rzetelskik, tnozicka, zimnx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rzetelskik
Copy link
Member

@tnozicka there's still a WIP label, is this on purpose?

@tnozicka tnozicka removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 16, 2024
@tnozicka
Copy link
Member Author

@tnozicka there's still a WIP label, is this on purpose?

nope, probably some race, thx

@scylla-operator-bot scylla-operator-bot bot merged commit b3da0a4 into scylladb:master Aug 16, 2024
12 checks passed
@tnozicka tnozicka deleted the split-starter branch August 16, 2024 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/dependency Issues or PRs related to dependency changes kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split startup logic out of ScyllaDB container
3 participants