ceph: fix monitor pvc storage on ebs #3594
Conversation
@@ -283,7 +304,6 @@ func (c *Cluster) makeMonDaemonContainer(monConfig *monConfig) v1.Container {
 			k8sutil.PodIPEnvVar(podIPEnvVar),
 		),
 		Resources: cephv1.GetMonResources(c.spec.Resources),
-		Lifecycle: opspec.PodLifeCycle(""),
Do we need to consider this change for other daemons? For now, maybe we should leave this line (but commented out) and add a comment about why it's not needed. Otherwise, someone might think it is missing since all the other daemons have it and put it back.
Why would we only be seeing this now for mons on PVs? That would affect the data dir, but remind me why the log dir is affected as well?
There are two issues here. According to the docs, using a postStart lifecycle hook for anything that the entrypoint depends on is a race. Assuming that's a correct reading and the chown is in fact a dependency, that's the primary reason I left this patch in. In that context, it seems like all usages are incorrect if the intention is to set permissions for the main container.
The second issue is the problem we saw with the `proc/<pid>/ipc` file not being found when chown was being run in the lifecycle hook. I don't have a root cause for that, but it doesn't seem to show up when the chown is performed in an init container.
There is a similar error here https://bugzilla.redhat.com/show_bug.cgi?id=1701326#c6 and someone mentions a race between CRI-O and kubelet here cri-o/cri-o#1927 (comment)
Presumably postStart hooks are intended to observe containers that are fully initialized, but if anything would exacerbate an existing startup race, I think PVC attachments qualify (even if the mount related to the race is hostpath).
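The racy pattern under discussion looks roughly like the following pod-spec fragment (the image, container name, and paths here are illustrative, not the exact values Rook generates):

```yaml
containers:
- name: mon
  image: ceph/ceph:v14            # illustrative
  command: ["ceph-mon", "--foreground"]
  lifecycle:
    postStart:
      exec:
        # Kubernetes sends the postStart event immediately after the
        # container is created, with no guarantee it runs before the
        # entrypoint above; the daemon can therefore race the chown.
        command: ["chown", "--recursive", "ceph:ceph", "/var/log/ceph"]
```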
added a note on the lifecycle hook linking back here for the discussion
the lifecycle postStart hook is used to chown the ceph daemon log directory. however, hooks are not guaranteed to run before a container entry point. "Kubernetes sends the postStart event immediately after the Container is created. There is no guarantee, however, that the postStart handler is called before the Container's entrypoint is called." Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
when the root of a filesystem mode pvc is mounted as the base data directory for the monitor it will not be empty: `lost+found` from ext4 will be present. this causes the mon mkfs to fail because it expects an empty directory. the solution is to mount under a directory in the source volume. Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
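The subdirectory mount described in the commit above corresponds to using a volumeMount `subPath`; a minimal sketch, with illustrative volume, claim, and path names:

```yaml
containers:
- name: mon
  volumeMounts:
  - name: ceph-daemon-data
    # Mount a subdirectory of the PV rather than its root so that
    # filesystem artifacts such as ext4's lost+found are not visible
    # to the mon mkfs, which requires an empty data directory.
    mountPath: /var/lib/ceph/mon/ceph-a
    subPath: data
volumes:
- name: ceph-daemon-data
  persistentVolumeClaim:
    claimName: rook-ceph-mon-a    # illustrative
```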
Following the Ceph OSD rework, my test environment regularly failed to start OSDs, as they would start before the postStart lifecycle hook which was chown'ing the log and data directories could finish. Pods failed to start for an arbitrarily long time before starting successfully. This was a manifestation of a race condition identified in Ceph monitors in rook#3594 (comment). Create a Ceph operator-level method to create an init container that will chown the necessary directories for all Ceph daemons (any daemon which runs as the ceph:ceph user-and-group). Also add this container to expectations for Ceph pod templates in unit tests. Signed-off-by: Blaine Gardner <blaine.gardner@suse.com>
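Unlike a postStart hook, an init container is guaranteed to run to completion before any daemon container starts, which removes the race. A minimal sketch of the approach (container name, image, and paths are illustrative):

```yaml
initContainers:
- name: chown-container-data-dir
  image: ceph/ceph:v14            # illustrative
  # Runs and exits before the mon container is started, so the
  # daemon never observes unowned log or data directories.
  command: ["chown", "--verbose", "--recursive", "ceph:ceph",
            "/var/log/ceph", "/var/lib/ceph/mon"]
  volumeMounts:
  - name: ceph-daemon-data
    mountPath: /var/lib/ceph/mon
  - name: ceph-log
    mountPath: /var/log/ceph
```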
Description of your changes:
There is a build here for testing: `noahdesu/rook-ceph-master-no-lifecycle-e`
Initializes ownership on files using an init container rather than a lifecycle hook; see the commit message for an explanation.
Make sure that the PVC filesystem mount uses an empty directory for daemon data:
ext4 adds extra entries (such as `lost+found`) in the filesystem root, so this PR mounts a subdirectory instead.
Which issue is resolved by this Pull Request:
Resolves #3591
Checklist:
`make codegen` has been run to update object specifications, if necessary.