Commit
[BACKPORT 2.4] [Platform] parsing of df output is fragile and may fail in case of "safe" error in df #7402

Summary:
In cluster_health.py, check_disk_utilization() now suppresses all errors reported by `df`, including problems with NFS mounts (such as `df: ‘/mnt’: Input/output error` or `df: ‘/mnt’: Stale file handle`).

Later, as part of "[Platform] Update cluster_health script to screen out non-local volumes #5246", I'm going to exclude network resources from the output of the `df` command. I'm not doing it in this diff because I want to add protection against other possible errors that are not related to network resources but also appear as 'df: ...' messages.

Original diff: https://phabricator.dev.yugabyte.com/D10787

Test Plan:
Jenkins: rebase: 2.4

Test scenario:
1. Create a universe with three nodes. We will configure an NFS resource on node 1 and mount it on node 2.
2. Connect to node 1. Create the NFS resource:
   ```
   yum install -y nfs-utils
   mkdir -p /nfs/share
   echo '/nfs/share *(rw)' >> /etc/exports
   systemctl start nfs
   ```
3. Connect to node 2. Mount the created NFS resource:
   ```
   mount -o soft,timeo=10 AAA.BBB.CCC.DDD:/nfs/share /mnt
   ```
   where AAA.BBB.CCC.DDD is the IP address of node 1. Check that the mounted resource appears in the output of `df -h`.
4. Wait until the health check completes for the universe. Verify that the mounted resource is present in the report.
5. Connect to node 1. Turn off the NFS service:
   ```
   systemctl stop nfs
   ```
6. Connect to node 2. Execute the command `df -h`. Verify that the result contains the error string: `df: ‘/mnt’: Input/output error`
7. Wait for the next health check. Verify that it is OK and that the mounted resource has disappeared from the report.

Important: It is better to do step 5 closer to the next health-check time (otherwise we will get

Reviewers: daniel

Reviewed By: daniel

Subscribers: jenkins-bot, yugaware

Differential Revision: https://phabricator.dev.yugabyte.com/D10846
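
For illustration only, here is a minimal sketch of the kind of tolerant `df` parsing described in the summary. It is not the code from cluster_health.py: the function names `parse_df_output` and `read_disk_utilization` are hypothetical, and the sketch assumes the default six-column `df -h` layout (Filesystem, Size, Used, Avail, Use%, Mounted on).

```python
import subprocess


def parse_df_output(output):
    """Return (mount_point, used_percent) pairs, skipping unparseable lines.

    Hypothetical helper: assumes the six-column layout of `df -h`.
    """
    entries = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header row, and diagnostics such as
        # "df: '/mnt': Stale file handle".
        if not line or line.startswith("df:") or line.startswith("Filesystem"):
            continue
        fields = line.split()
        # A usable row has at least six fields and a "NN%" usage column.
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        try:
            used_percent = int(fields[4].rstrip("%"))
        except ValueError:
            continue
        # Mount points may contain spaces, so rejoin the trailing fields.
        entries.append((" ".join(fields[5:]), used_percent))
    return entries


def read_disk_utilization():
    """Run `df -h` and parse whatever usable rows it printed."""
    # df exits non-zero when any single mount fails, so the exit status
    # is deliberately not checked. stderr is merged into stdout so the
    # parser sees, and drops, the "df: ..." lines.
    result = subprocess.run(
        ["df", "-h"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_df_output(result.stdout)
```

With this approach a broken mount simply contributes no row, which matches the behavior expected in step 7 of the test scenario: the health check stays OK and the unreachable mount drops out of the report.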