Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PWX-32580: Do not create pre-flight storage node objects for nodes wh… #1222

Merged
merged 7 commits into from
Aug 26, 2023

Conversation

jrivera-px
Copy link
Contributor

…ich have label px/enabled=false.

What this PR does / why we need it:
If pre-flight storage node exists it should have status results provided by the pre-flight pod run. If not we will assume pre-flight failed to run on that node and we will fail pre-flight. We create storage nodes objects for all nodes except for master. However when label px/enabled=false is supplied on a node, pre-flight will not run on this node. This resulted in that node not having any results and we fail pre-flight. So don't create storage node object for nodes with PX disabled label."

Which issue(s) this PR fixes (optional)
Closes #
PWX-32580
Special notes for your reviewer:

…ich have label px/enabled=false.

Signed-off-by: Jose Rivera <jose@portworx.com>
@codecov
Copy link

codecov bot commented Aug 18, 2023

Codecov Report

Patch coverage: 60.60% and project coverage change: +0.04% 🎉

Comparison is base (0146c87) 75.77% compared to head (5b92f56) 75.82%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1222      +/-   ##
==========================================
+ Coverage   75.77%   75.82%   +0.04%     
==========================================
  Files          65       65              
  Lines       18410    18422      +12     
==========================================
+ Hits        13951    13968      +17     
+ Misses       3464     3455       -9     
- Partials      995      999       +4     
Files Changed Coverage Δ
pkg/controller/storagecluster/storagecluster.go 77.50% <60.60%> (+0.88%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jrivera-px
Copy link
Contributor Author

jrivera-px commented Aug 21, 2023

Latest runs on OCP worked

time="21-08-2023 07:30:24" level=info msg="Reconciling StorageNode" file="storagenode.go:204" storagenode=ocp-dev-3-hts86-worker-0-j5kl8
time="21-08-2023 07:30:24" level=info msg="Create pre-flight storage node entry for node: ocp-dev-3-hts86-worker-0-z87ps" file="storagecluster.go:452"
time="21-08-2023 07:30:24" level=info msg="Reconciling StorageNode" file="storagenode.go:204" storagenode=ocp-dev-3-hts86-worker-0-j5kl8
time="21-08-2023 07:30:24" level=info msg="Reconciling StorageNode" file="storagenode.go:204" storagenode=ocp-dev-3-hts86-worker-0-z87ps
time="21-08-2023 07:30:24" level=debug msg="pod does not fit" file="pod.go:207" fitsNodeAffinity=false fitsNodeName=true nodeName=ocp-dev-3-hts86-worker-0-ztckc
time="21-08-2023 07:30:24" level=info msg="Skipping pre-flight storage node entry for node: ocp-dev-3-hts86-worker-0-ztckc" file="storagecluster.go:455"
time="21-08-2023 07:30:24" level=debug msg="pod does not fit" file="pod.go:207" fitsNodeAffinity=false fitsNodeName=true nodeName=ocp-dev-3-hts86-master-0
time="21-08-2023 07:30:24" level=info msg="Skipping pre-flight storage node entry for node: ocp-dev-3-hts86-master-0" file="storagecluster.go:455"
time="21-08-2023 07:30:24" level=debug msg="pod does not fit" file="pod.go:207" fitsNodeAffinity=false fitsNodeName=true nodeName=ocp-dev-3-hts86-master-1
time="21-08-2023 07:30:24" level=info msg="Skipping pre-flight storage node entry for node: ocp-dev-3-hts86-master-1" file="storagecluster.go:455"
time="21-08-2023 07:30:24" level=debug msg="pod does not fit" file="pod.go:207" fitsNodeAffinity=false fitsNodeName=true nodeName=ocp-dev-3-hts86-master-2
time="21-08-2023 07:30:24" level=info msg="Skipping pre-flight storage node entry for node: ocp-dev-3-hts86-master-2" file="storagecluster.go:455"
time="21-08-2023 07:30:24" level=info msg="Create pre-flight storage node entry for node: ocp-dev-3-hts86-worker-0-hps77" file="storagecluster.go:452"
time="21-08-2023 07:30:24" level=info msg="Reconciling StorageNode" file="storagenode.go:204" storagenode=ocp-dev-3-hts86-worker-0-z87ps
time="21-08-2023 07:30:24" level=info msg="Reconciling StorageNode" file="storagenode.go:204" storagenode=ocp-dev-3-hts86-worker-0-hps77

...
...

time="21-08-2023 07:30:49" level=info msg="pre-flight: process pre-flight results..." file="preflight.go:388"
time="21-08-2023 07:30:49" level=info msg="storageNode[ocp-dev-3-hts86-worker-0-j5kl8]: []v1.CheckResult{v1.CheckResult{Type:\"status\", Reason:\"oci-mon: pre-flight completed\", Success:true}} " file="preflight.go:317"
time="21-08-2023 07:30:49" level=info msg="storageNode[ocp-dev-3-hts86-worker-0-z87ps]: []v1.CheckResult{v1.CheckResult{Type:\"status\", Reason:\"oci-mon: pre-flight completed\", Success:true}} " file="preflight.go:317"
time="21-08-2023 07:30:49" level=info msg="storageNode[ocp-dev-3-hts86-worker-0-hps77]: []v1.CheckResult{v1.CheckResult{Type:\"status\", Reason:\"oci-mon: pre-flight completed\", Success:true}} " file="preflight.go:317"
time="21-08-2023 07:30:49" level=info msg="Enabling PX-StoreV2" file="events.go:16"
time="21-08-2023 07:30:49" level=info msg="pre-flight: done..." file="portworx.go:207"

Comment on lines 449 to 455
shouldRun, _, err := c.nodeShouldRunStoragePod(&node, toUpdate)
if err == nil {
if shouldRun {
logrus.Infof("Create pre-flight storage node entry for node: %s", node.Name)
c.createStorageNode(toUpdate, node.Name)
} else {
logrus.Infof("Skipping pre-flight storage node entry for node: %s", node.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add UTs to cover the diff here.

Copy link
Contributor

@nrevanna nrevanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrivera-px does this also take care of infra nodes in OCP scenario?

Copy link
Contributor

@harsh-px harsh-px left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the UTs!

for _, node := range k8sNodeList.Items {
shouldRun, _, err := c.nodeShouldRunStoragePod(&node, toUpdate)
if err != nil {
logrus.Infof("Skipping pre-flight storage node entry for %s node. Error: %v", node.Name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Infof should be an warnf instead as this is not expected. Not error since we aren't failing the operator due to this.

@harsh-px harsh-px dismissed their stale review August 23, 2023 19:42

Dismissing as UTs are now added

Signed-off-by: Jose Rivera <jose@portworx.com>
@jrivera-px
Copy link
Contributor Author

@nrevanna sorry didn't answer this "does this also take care of infra nodes in OCP scenario?".... Yes it does.

Copy link
Contributor

@nrevanna nrevanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jrivera-px jrivera-px merged commit 723462a into master Aug 26, 2023
7 of 8 checks passed
jrivera-px added a commit that referenced this pull request Aug 26, 2023
#1222)

* PWX-32580: Do not create pre-flight storage node objects for nodes which have label px/enabled=false.

Signed-off-by: Jose Rivera <jose@portworx.com>

* PWX-32580: Use correct api to check if a node will run px. Create pre-flight  storage node based on this api check.

Signed-off-by: Jose Rivera <jose@portworx.com>

* PWX-32580: break out pre-flight storage node create/delete funcs.  Add unit test.

Signed-off-by: Jose Rivera <jose@portworx.com>

* PWX-32580: PR suggestion change infof to warnf on skips.

Signed-off-by: Jose Rivera <jose@portworx.com>

---------

Signed-off-by: Jose Rivera <jose@portworx.com>
jrivera-px added a commit that referenced this pull request Aug 28, 2023
#1222) (#1233)

* PWX-32580: Do not create pre-flight storage node objects for nodes which have label px/enabled=false.



* PWX-32580: Use correct api to check if a node will run px. Create pre-flight  storage node based on this api check.



* PWX-32580: break out pre-flight storage node create/delete funcs.  Add unit test.



* PWX-32580: PR suggestion change infof to warnf on skips.



---------

Signed-off-by: Jose Rivera <jose@portworx.com>
@piyush-nimbalkar piyush-nimbalkar deleted the PWX-32580-preflight-skip-px-disabled-nodes branch September 15, 2023 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants