-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cephadm: add basic init containers support #52178
Conversation
98afaf2
to
8f48ab8
Compare
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
8f48ab8
to
e3df865
Compare
As discussed in recent ceph orch weekly and standup meetings, I experimented with the behavior of the systemd unit and long running init containers. I can confirm that a single init container or a cumulative running for more than the TimeoutStartSec value will cause the entire unit to time out (and then restart). Short of major changes to the systemd unit files I think we'll need to live with this restriction for now. When I add docs to this PR I can add a note covering this fact. When I work on planned changes to add sidecar containers, which will probably require changes to systemd unit files, I think then we can revisit this and either put a longer timeout, or even possibly allow custom timeout values. This will be discussed in an upcoming orch weekly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, as long as all the teuthology tests still pass with the refactoring done here . The actual changes for the init container support itself all seem good.
ic_params = params['init_containers'] = [] | ||
for ic in ics: | ||
ic_meta.append(ic.to_json()) | ||
ic_params.append(ic.to_json(flatten_args=True)) | ||
return ic_meta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused why we're building this ic_params
here when we don't end up returning it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an alias ic_params = params['init_containers'] = []
that side effects the params dict passed in. This is similar to the behavior of _setup_extra_deployment_args as I was intentionally mimicking that function. Do you think the aliasing is too confusing?
@@ -1297,6 +1298,7 @@ async def _create_daemon(self, | |||
extra_entrypoint_args=ArgumentSpec.map_json( | |||
extra_entrypoint_args, | |||
), | |||
init_containers=init_containers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we also want to set these up like extra_container_args
and extra_entrypoint_args
where if they're modified we'll redeploy the daemon automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point thanks, I'll look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this in the weekly orch meeting and agreed to add it in a follow up PR.
e3df865
to
752e02d
Compare
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
752e02d
to
f91429d
Compare
a5e3d13
to
301f5df
Compare
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
301f5df
to
c263144
Compare
2849c6a
to
f2422f5
Compare
Rename argument to "container" from "c" because a one character variable at a large function's scope (or anywhere outside of a comprehension) is a understanding/readability issue (IMO). Signed-off-by: John Mulligan <jmulligan@redhat.com>
Refactor the _write_container_cmd_to_bash such that it now uses a new _bash_cmd helper function and tries to avoid non-DRY repetition of the same pattern of shell command expression. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Break up deploy_daemon_units which had a lot of various "sub-tasks" implemented within the main body of the function. Splitting up the function makes each function's scope smaller, more readily testable and more readable. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add a BasicContainer class that will act as a base class for CephContainer and any future XyzContainer classes. The class is currently a stub that provides no functionality. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Move a bunch of logic out of CephContainer and into BasicContainer. BasicContainer needs to be more general than CephContainer so it adds a few parameters that don't exist in CephContainer and CephContainer will provide defaults for those values. Signed-off-by: John Mulligan <jmulligan@redhat.com>
It's an extremely common pattern in cephadm to need a triple of (fsid, daemon_type, daemon_id). These triples are used to generate various names and paths etc. Add a new type, DaemonIdentity, that can represent these values in a more convient package. It has an additional optional field "subcomponent" that can represent a part of a daemon/service. It is expected to be used in an immutable manner like a namedtuple, but I didn't end using a namedtuple because cephadm doesn't make much use of them and the syntax to combine namedtuple and type hints is awkward. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Use the new DaemonIdentity class to help identify a container. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add an optional identity field based on the new DaemonIdentity type to help name/identify the CephContainers. I would have preferred to make this field mandatory but there are so many places in the code that call CephContainer now, I didn't want to have to touch them all at once. CephContainer objects created using the for_daemon classmethod *will* all have DaemonIdentity set. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add a new InitContainer class that is similar to CephContainer but will not assume certain defaults and is expected to run for a "short" period before exiting. These init containers will be used to preform tasks before a long running container is started. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add functions that write commands to clean up and execute init containers. These init containers will be executed before the main container as part of unit.run files (for now). Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add get_deployment_init_containers to extract any init containers from the deploy configuration context. Currently, all init containers are to be explicitly defined, not inferred from the service type. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Complete the deployment functions so that any custom containers defined with "init_containers" will add said init containers to the commands to be executed. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add a test case that covers the act of setting init_containers in the deploy config on a custom container instance. This test executes custom containers that both specify custom volume_mounts and not, which are expected to "inherit" the volume_mounts of the primary container's config. Signed-off-by: John Mulligan <jmulligan@redhat.com>
The InitContainerSpec type will be used to define explicit "init containers" - containers that are expected to run and then exit and are executed prior to running a primary container. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Use the "black" style of multi-line import with parens. Signed-off-by: John Mulligan <jmulligan@redhat.com>
This allows custom containers to run init containers before the primary container starts. Signed-off-by: John Mulligan <jmulligan@redhat.com>
This will permit serializing init_container information for sending to cephadm binary. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
This will permit the transfer of init_containers from a spec to the structure that actually generates the deploy JSON. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Use the parenthesized format with each import on a line for readability. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Signed-off-by: John Mulligan <jmulligan@redhat.com>
…ctrs This adds a test case to explicitly check custom containers with init containers. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add documentation covering the new init_containers parameter and the parameters within each init container's configuration. Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add a 2nd custom container yaml to the test_extra_daemon_features.yaml file that uses two init containers. Add verification that the init containers (and primary) ran successfully. Signed-off-by: John Mulligan <jmulligan@redhat.com>
34e797b
to
921f474
Compare
3 failures, 4 jobs that will time out (still "running" as of me writing this
Overall, no failures that wouldn't be expected on a main branch run so nothing to block merging |
Depends on #51863
Fixes: https://tracker.ceph.com/issues/62170
Add init_containers support to the cephadm binary and cephadm orchestrator module via the Custom Container service type.
init_containers supports a more formalized method of executing a container that runs to completion and exits prior to a long running deaemon container. These containers can be used to assert certain preconditions or configure the environment for the later daemon container or perform other setup actions.
init containers currently support fields:
init containers are currently part of the script that starts daemon containers (unit.run) and is limited by the systemd services start-up timeout of 200 seconds. In later work we will be expanding cephadm's use of systemd services and plan to seperate init_containers into a separate service such that limits can be placed on the init containers (or not) independently of the service and vice versa.
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows