
Ceph MDS deployment not updated/created in case of MDS_ALL_DOWN #5846

Closed
stephan2012 opened this issue Jul 17, 2020 · 5 comments
Labels: bug, ceph

Comments

@stephan2012

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:
When Ceph reports MDS_ALL_DOWN, the Rook Ceph operator does not create/update the MDS deployments:

2020-07-17 13:40:34.597233 I | ceph-spec: ceph-block-pool-controller: CephCluster "rook-ceph" found but skipping reconcile since ceph health is &{"HEALTH_ERR" map["FS_DEGRADED":{"HEALTH_WARN" "1 filesystem is degraded"} "FS_WITH_FAILED_MDS":{"HEALTH_WARN" "1 filesystem has a failed mds daemon"} "MDS_ALL_DOWN":{"HEALTH_ERR" "1 filesystem is offline"} "MDS_INSUFFICIENT_STANDBY":{"HEALTH_WARN" "insufficient standby MDS daemons available"}] "2020-07-17T13:40:00Z" "2020-07-17T13:24:14Z" "HEALTH_WARN"}

I got myself into this situation by setting resource limits that triggered OOMKilled for both MDS daemons. After raising the limit in the cephfilesystem CRD, I noticed that the operator does not update the MDS deployments. While I could have adjusted the deployments manually, I simply deleted them, assuming the operator would restore them. It does not.

(Side note: a 4 GiB limit causing OOMKilled for a nearly empty CephFS looks a bit strange to me. Depending on further analysis, I may file a separate issue for it.)
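
For reference, a minimal Go sketch of the fields involved when raising the MDS memory limit, assuming the cephv1 API types from the Rook tree (CephFilesystem, FilesystemSpec, MetadataServerSpec.Resources; names may differ slightly between releases). In practice the same fields are usually set in the CephFilesystem YAML manifest rather than in Go:

```go
package main

import (
	"fmt"

	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Hypothetical example: a CephFilesystem with a raised MDS memory limit.
	// Pool specs are omitted for brevity; only the MDS resource fields are shown.
	fs := cephv1.CephFilesystem{
		ObjectMeta: metav1.ObjectMeta{Name: "myfs", Namespace: "rook-ceph"},
		Spec: cephv1.FilesystemSpec{
			MetadataServer: cephv1.MetadataServerSpec{
				ActiveCount:   1,
				ActiveStandby: true,
				Resources: corev1.ResourceRequirements{
					Requests: corev1.ResourceList{
						corev1.ResourceMemory: resource.MustParse("4Gi"),
					},
					Limits: corev1.ResourceList{
						// Raised from 4Gi after the MDS pods were OOMKilled.
						corev1.ResourceMemory: resource.MustParse("8Gi"),
					},
				},
			},
		},
	}
	fmt.Printf("%s/%s MDS memory limit: %s\n",
		fs.Namespace, fs.Name,
		fs.Spec.MetadataServer.Resources.Limits.Memory().String())
}
```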

Expected behavior:

The Rook Ceph operator should create/update the MDS deployments in case of MDS_ALL_DOWN. If this is considered too dangerous in general, an option similar to skipUpgradeChecks would help.

Also, it might be worth considering a revert to the previous deployment if the Pods do not start after a parameter change.

How to reproduce it (minimal and precise):

  • Create a CephFS and make the MDS consume more than 4 GiB of memory
  • Set memory resource limit via cephfilesystem CRD

or

  • Delete both MDS deployments, restart operator

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 18.04.4 LTS
  • Kernel (e.g. uname -a): Linux n0201 5.3.0-62-generic #56~18.04.1-Ubuntu SMP Wed Jun 24 16:17:03 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Cloud provider or hardware configuration: Bare metal
  • Rook version (use rook version inside of a Rook Pod): rook: v1.3.8
  • Storage backend version (e.g. for ceph do ceph -v): ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)
  • Kubernetes version (use kubectl version): 1.15.4
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): kubeadm
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_ERR 1 filesystem is degraded; 1 filesystem has a failed mds daemon; 1 filesystem is offline; insufficient standby MDS daemons available
@travisn added the ceph label Jul 17, 2020
@travisn
Member

travisn commented Jul 17, 2020

@leseb How about having the upgrade check for specific HEALTH_ERR codes and continue only if all codes present come from a list of known codes that are safe for the upgrade?

  • In this case, it makes sense to continue the upgrade if MDS_ALL_DOWN is the only error.
  • When upgrading other daemons, they should still be upgraded if the only errors come from the MDS.

An alternative idea is that we shouldn't even block the reconcile of MDS deployments based on the ceph health. The HEALTH_ERR check for upgrades really only makes sense for mons and osds IMO.
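
A rough sketch of that allow-list idea in Go (hypothetical types and function names, not actual Rook code): reconcile proceeds only when every HEALTH_ERR code present is on a known-safe list, such as MDS_ALL_DOWN from the log above.

```go
package main

import "fmt"

// healthCheck mirrors the shape of the entries in the health detail shown in
// the operator log above: a code plus its severity.
type healthCheck struct {
	Code     string
	Severity string // "HEALTH_OK", "HEALTH_WARN", "HEALTH_ERR"
}

// Hypothetical allow-list: HEALTH_ERR codes that should not block an
// upgrade/reconcile, because reconciling is exactly what fixes them.
var safeErrCodes = map[string]bool{
	"MDS_ALL_DOWN": true,
}

// canProceed returns true if there is no HEALTH_ERR at all, or if every
// HEALTH_ERR code present is on the allow-list.
func canProceed(checks []healthCheck) bool {
	for _, c := range checks {
		if c.Severity == "HEALTH_ERR" && !safeErrCodes[c.Code] {
			return false
		}
	}
	return true
}

func main() {
	// The health detail from the issue report: only the MDS code is ERR,
	// so the reconcile would be allowed to continue.
	checks := []healthCheck{
		{"FS_DEGRADED", "HEALTH_WARN"},
		{"FS_WITH_FAILED_MDS", "HEALTH_WARN"},
		{"MDS_ALL_DOWN", "HEALTH_ERR"},
		{"MDS_INSUFFICIENT_STANDBY", "HEALTH_WARN"},
	}
	fmt.Println(canProceed(checks)) // true
}
```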

@stephan2012 stephan2012 changed the title Ceph MDS deployment not updated/create in case of MDS_ALL_DOWN Ceph MDS deployment not updated/created in case of MDS_ALL_DOWN Jul 20, 2020
@leseb
Member

leseb commented Jul 20, 2020

The list of codes is interesting, but this makes me nervous at the same time...
We need better isolation of the error codes; hopefully Ceph errors are somewhat consistent between components :)

@bitfactory-henno-schooljan

What is the procedure for restarting the MDS in this case?

@LalitMaganti
Contributor

This should now be marked as fixed.

@travisn
Member

travisn commented Oct 30, 2020

Yes, resolved with #6494

@travisn travisn closed this as completed Oct 30, 2020
mergify bot pushed a commit that referenced this issue Oct 30, 2020
This resolves the catch-22 situation where the filesystem cannot be
reconciled because there is no MDS, but there is no MDS because the
operator has not reconciled the filesystem and brought up the MDS pods.

Closes #5967, #5846

Signed-off-by: Lalit Maganti <lalitm@google.com>
(cherry picked from commit 88f16e4)
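
For context, a conceptual sketch of the change's effect (not the actual #6494 diff): the filesystem reconcile is no longer gated on overall cluster health, so the MDS deployments can be (re)created even while Ceph reports MDS_ALL_DOWN.

```go
package main

import "fmt"

// shouldReconcile is a conceptual sketch only, not code from #6494.
// The idea: the filesystem controller skips the cluster-health gate, so MDS
// deployments can be brought up even under HEALTH_ERR, breaking the catch-22
// described in the commit message above. Other controllers keep the gate.
func shouldReconcile(controller, healthStatus string) bool {
	if controller == "filesystem" {
		return true // always reconcile; this is what brings the MDS back
	}
	return healthStatus != "HEALTH_ERR"
}

func main() {
	fmt.Println(shouldReconcile("filesystem", "HEALTH_ERR")) // true
	fmt.Println(shouldReconcile("blockpool", "HEALTH_ERR"))  // false
}
```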