PWX-28826: Handle pre-flight check for DMthin #1014
Signed-off-by: Harsh Desai <hadesai@purestorage.com>
Signed-off-by: Jose Rivera <jose@portworx.com>
* Updating CSV to use 23.3.1 released image
* Update for 23.3.1 release
* Controller gen vendor
  Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
* PWX-29389 Add CRD for portworx diags collection
  Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
* PWX-29409: Ignore zones with no nodes (#1008)
  In disaggregated mode, some zones may contain no storage nodes. Such a zone would make the maxSNPZ value 0. Change the behavior to ignore zones with 0 nodes in the maxSNPZ calculation.
  Signed-off-by: Naveen Revanna <nrevanna@purestorage.com>
---------
Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
Signed-off-by: Naveen Revanna <nrevanna@purestorage.com>
Co-authored-by: CNBU Jenkins <cnbu-jenkins@purestorage.com>
Co-authored-by: Jiafeng Liao <jliao@purestorage.com>
Co-authored-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
Co-authored-by: Naveen Revanna <83608369+nrevanna@users.noreply.github.com>
…emove 'wait' code. Signed-off-by: Jose Rivera <jose@portworx.com>
drivers/storage/portworx/component/securitycontextconstraints.go
pkg/preflight/utils.go
@@ -11,8 +11,9 @@ func IsEKS() bool {

 // RequiresCheck returns whether a preflight check is needed based on the platform
 func RequiresCheck() bool {
 	return true
I thought we wanted to enable this only on EKS, right?
No, the way Harsh and Prabir thought this should go is that we always run the pre-flight, and the EKS-specific checks live inside IsEKS() condition blocks.
I am assuming we will fix this later and run it only on cloud or if the metadata device is already provided. Correct?
Yeah I opened up https://portworx.atlassian.net/browse/PWX-30418 to make sure.
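The two positions in this thread can be sketched side by side. This is only an illustration: the `Platform` struct and both function names are hypothetical and not part of `pkg/preflight`. The first function mirrors the diff under review (always run the pre-flight), the second sketches the follow-up tracked in PWX-30418 (run only on cloud, or when a metadata device is already provided).

```go
package main

import "fmt"

// Platform is a hypothetical summary of the environment the operator runs in.
// Neither field exists in the operator; they stand in for whatever detection
// (for example IsEKS()) and spec inspection the real code would perform.
type Platform struct {
	IsCloud                bool // assumption: set by some cloud-detection helper
	MetadataDeviceProvided bool // assumption: the user already supplied a metadata device
}

// requiresCheckAlways mirrors the reviewed diff: the pre-flight always runs,
// and platform-specific work is guarded elsewhere.
func requiresCheckAlways(_ Platform) bool {
	return true
}

// requiresCheckGated sketches the follow-up raised in review: run the
// pre-flight only on cloud, or when a metadata device is already provided.
func requiresCheckGated(p Platform) bool {
	return p.IsCloud || p.MetadataDeviceProvided
}

func main() {
	onPrem := Platform{}
	cloud := Platform{IsCloud: true}
	fmt.Println("on-prem:", requiresCheckAlways(onPrem), requiresCheckGated(onPrem))
	fmt.Println("cloud:  ", requiresCheckAlways(cloud), requiresCheckGated(cloud))
}
```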
…that don't work since Validate was removed from the controller.validate() func. PWX-30373 to try and fix later. Signed-off-by: Jose Rivera <jose@portworx.com>
… check failure to trigger the needed workflow. Signed-off-by: Jose Rivera <jose@portworx.com>
…space. Signed-off-by: Jose Rivera <jose@portworx.com>
…running CBT namespace. Signed-off-by: Jose Rivera <jose@portworx.com>
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## master #1014 +/- ##
========================================
Coverage 77.73% 77.74%
========================================
Files 60 61 +1
Lines 16362 16678 +316
========================================
+ Hits 12719 12966 +247
- Misses 2772 2817 +45
- Partials 871 895 +24
... and 1 file with indirect coverage changes
…ctionality correctly. Signed-off-by: Jose Rivera <jose@portworx.com>
drivers/storage/portworx/portworx.go
 // Add five minute timeout. If we do reconcile loop check we will need a different way.
 cnt++
 if cnt == 100 { // 3s * 100 = 300s (5 mins)
This may not be enough because the first time we would be pulling the px-enterprise image as well. Should we set it to 10mins?
Increased the timeout for now and logged https://portworx.atlassian.net/browse/PWX-30419 to fix this properly.
… exists. Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826 Boilerplate
  Signed-off-by: Harsh Desai <hadesai@purestorage.com>
* more boilerplate
  Signed-off-by: Harsh Desai <hadesai@purestorage.com>
* PWX-28826: Pre-flight check for DMthin.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Add comments and move StorageNode cleanup.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* Passed checks should be Info events.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* Passed checks should be Info events. (#1010)
  Signed-off-by: Jose Rivera <jose@portworx.com>
* Pwx 28826 (#1011)
* Pwx 28826 (#1012)
* PWX-28826: Update with the latest master changes. (#1013)
  * Updating CSV to use 23.3.1 released image
  * Update for 23.3.1 release
  * Controller gen vendor
    Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
  * PWX-29389 Add CRD for portworx diags collection
    Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
  * PWX-29409: Ignore zones with no nodes (#1008)
    In disaggregated mode, some zones may contain no storage nodes. Such a zone would make the maxSNPZ value 0. Change the behavior to ignore zones with 0 nodes in the maxSNPZ calculation.
    Signed-off-by: Naveen Revanna <nrevanna@purestorage.com>
  ---------
  Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
  Signed-off-by: Naveen Revanna <nrevanna@purestorage.com>
  Co-authored-by: CNBU Jenkins <cnbu-jenkins@purestorage.com>
  Co-authored-by: Jiafeng Liao <jliao@purestorage.com>
  Co-authored-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
  Co-authored-by: Naveen Revanna <83608369+nrevanna@users.noreply.github.com>
* Add PassPreFlight event tag and logging
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Check status of portworx container in pre-flight pod and remove 'wait' code.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Fix unit test.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Fix unit test.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: PR review changes and fix portworx_test.go UTs
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: fix gomock Validate calls. Also comment out the two tests that don't work since Validate was removed from the controller.validate() func. PWX-30373 to try and fix later.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-30373: Re-add the commented out tests and add K8s version check failure to trigger the needed workflow.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Exit pre-check wait if running CBT namespace.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Add 5 min timeout to pre-flight status check.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Exit GetPreFlightStatus() with success if running CBT namespace.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Don't automatically enable dmthin via pre-flight check if running CBT namespace.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-30373: Revert UT and integration test hacks. Need to mock the functionality correctly.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: Increase pre-flight daemonset ready wait to 10mins.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826: fix 'TestValidate' UT. Don't error if pre-flight daemonset exists.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* Only run preflight if AWS.
  Signed-off-by: Jose Rivera <jose@portworx.com>
---------
Signed-off-by: Harsh Desai <hadesai@purestorage.com>
Signed-off-by: Jose Rivera <jose@portworx.com>
Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
Signed-off-by: Naveen Revanna <nrevanna@purestorage.com>
Co-authored-by: Harsh Desai <hadesai@purestorage.com>
Co-authored-by: CNBU Jenkins <cnbu-jenkins@purestorage.com>
Co-authored-by: Jiafeng Liao <jliao@purestorage.com>
Co-authored-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
Co-authored-by: Naveen Revanna <83608369+nrevanna@users.noreply.github.com>
* PWX-29973 Add check result to StorageNode CRD (#992)
  Signed-off-by: Harsh Desai <hadesai@purestorage.com>
* [PWX-25353][PWX-25355] Auto detect eks cloud environment and check cloud permissions
* [PWX-27619] Update cloudops vendor for EKS dry run
* [PWX-27619] Use dry run for eks cloud permission check
* [PWX-27621] Fix cluster status issue after running preflight
* [PWX-25354] Set default cloud storage spec on EKS
* PWX-27656: auto-ssl support (resync w/ DaemonSet) (#826)
  Syncs DaemonSet changes done for auto-ssl/tls support:
  * adding events/update RBAC permission
  * adding certificatesigningrequests RBAC permissions
  * adding containerdvardir (/var/lib/containerd) mount
  Signed-off-by: Zoran Rajic <zrajic@purestorage.com>
* [PWX-27588] Add support for EKS cloud storage capacity based configuration
* [PWX-27622] Use provided AWS credentials to run permission check on EKS
* [PWX-28664] Unset AWS credential env vars after client creation
* PWX-29973 Add check result to StorageNode CRD (#992)
  Signed-off-by: Harsh Desai <hadesai@purestorage.com>
* [PWX-27765] StorageCluster status redesign to show more details
* PWX-28826: Handle pre-flight check for DMthin (#1014)
* PWX-30496: if '-T dmthin' exists in stc before preflight is run and preflight fails don't start. (#1019)
  * PWX-30496: If the user intended to use dmthin, the '-T dmthin' will exist in the stc before preflight is run. If preflight fails in this case, don't start.
    Signed-off-by: Jose Rivera <jose@portworx.com>
  * PWX-30496: If preflight enables DMthin add a 64G metadata drive.
    Signed-off-by: Jose Rivera <jose@portworx.com>
  * PWX-30496: Review fixes.
    Signed-off-by: Jose Rivera <jose@portworx.com>
  * PWX-30496: add the metadata device in both cases where the user has passed -T dmthin or we added it for them.
    Signed-off-by: Jose Rivera <jose@portworx.com>
* Fix test broken when merged in pre-flight code.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* PWX-28826 & PWX-30496: Add ClusterCondition Source to test runs.
  Signed-off-by: Jose Rivera <jose@portworx.com>
* [PWX-27765] Fix migration status update issue
---------
Signed-off-by: Harsh Desai <hadesai@purestorage.com>
Signed-off-by: Zoran Rajic <zrajic@purestorage.com>
Signed-off-by: Jose Rivera <jose@portworx.com>
Signed-off-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
Signed-off-by: Naveen Revanna <nrevanna@purestorage.com>
Co-authored-by: Harsh Desai <hadesai@purestorage.com>
Co-authored-by: Jiafeng Liao <jliao@purestorage.com>
Co-authored-by: Zoran Rajic <zox@portworx.com>
Co-authored-by: CNBU Jenkins <cnbu-jenkins@purestorage.com>
Co-authored-by: Piyush Nimbalkar <pnimbalkar@purestorage.com>
Co-authored-by: Naveen Revanna <83608369+nrevanna@users.noreply.github.com>
Modify the operator to run a pre-flight check on each node in the cluster. The check itself is executed by oci-monitor and px-runc installed on the host, and it determines whether DMthin can be enabled on the cluster.

The operator launches a daemonset that uses the storage config with an extra pre-flight option. This causes oci-monitor to run in pre-flight mode, which executes the checks via px-runc on each node. The check results are passed to oci-monitor via a JSON file, read in, added to the StorageNode object, and passed to the operator. Once the checks are done, the operator deletes the pre-flight daemonset.

The operator then processes the checks returned in the StorageNode object to determine whether the existing storage config can be modified to enable DMthin (by adding the -T dmthin param). If the checks fail, the storage config is not updated and the cluster starts with the original stc.
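The final decision step described above can be sketched as follows. This is an illustration under stated assumptions, not the operator's code: `CheckResult`, `allPassed`, `hasDMthin`, and `maybeEnableDMthin` are hypothetical names, the storage config's parameters are modelled as a plain string slice, and treating an empty result set as "not passed" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// CheckResult is a hypothetical, simplified view of one pre-flight check
// reported for a node through its StorageNode object.
type CheckResult struct {
	Node    string
	Check   string
	Success bool
}

// allPassed reports whether every check succeeded. An empty result set is
// treated as "not passed" (assumption: no evidence means no change).
func allPassed(results []CheckResult) bool {
	if len(results) == 0 {
		return false
	}
	for _, r := range results {
		if !r.Success {
			return false
		}
	}
	return true
}

// hasDMthin reports whether the args already contain "-T dmthin".
func hasDMthin(args []string) bool {
	for i := 0; i+1 < len(args); i++ {
		if args[i] == "-T" && args[i+1] == "dmthin" {
			return true
		}
	}
	return false
}

// maybeEnableDMthin returns the args with "-T dmthin" appended when all
// checks passed; otherwise it returns the original args unchanged, so the
// cluster starts with the original config.
func maybeEnableDMthin(args []string, results []CheckResult) []string {
	if !allPassed(results) || hasDMthin(args) {
		return args
	}
	out := append([]string{}, args...) // copy so the caller's slice is untouched
	return append(out, "-T", "dmthin")
}

func main() {
	args := []string{"-x", "kubernetes"} // placeholder args for illustration
	passed := []CheckResult{{Node: "node-1", Check: "example", Success: true}}
	failed := []CheckResult{{Node: "node-1", Check: "example", Success: false}}

	fmt.Println(strings.Join(maybeEnableDMthin(args, passed), " "))
	fmt.Println(strings.Join(maybeEnableDMthin(args, failed), " "))
}
```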
Which issue(s) this PR fixes (optional)
Closes #
PWX-28826
Special notes for your reviewer:
px-runc side is in this PR: https://github.com/portworx/porx/pull/11018
px-installer side is in this PR: https://github.com/portworx/px-installer/pull/1679