staticpod: improve operator messages #913

Closed
wants to merge 1 commit into from


@mfojtik (Member) commented Oct 7, 2020

  • Do proper pluralization in messages
  • Show total node count (so we can determine if all expected control plane nodes are there)
  • Unify on terminology ("achieve" vs. "is on")
  • Add messages to unit test to make sure we have them correct
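The pluralization and total-node-count goals listed above could be sketched roughly as follows. `pluralizedNode` is a name that appears later in the diff, but its body is not shown in this conversation, so the implementation here is an assumption; `revisionMessage` and its wording are purely hypothetical.

```go
package main

import "fmt"

// pluralizedNode returns "node" for a count of 1 and "nodes" otherwise.
// The name is taken from the diff under review; this body is an assumption.
func pluralizedNode(count int) string {
	if count == 1 {
		return "node"
	}
	return "nodes"
}

// revisionMessage is a hypothetical helper that reports how many of the
// total nodes are at a given revision, with verb agreement.
func revisionMessage(count, total int, revision int32) string {
	verb := "are"
	if count == 1 {
		verb = "is"
	}
	return fmt.Sprintf("%d of %d %s %s at revision %d",
		count, total, pluralizedNode(total), verb, revision)
}

func main() {
	fmt.Println(revisionMessage(1, 3, 5))
	fmt.Println(revisionMessage(3, 3, 5))
}
```

Including the total in every message is one way to make a missing control plane node visible without a separate condition.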

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mfojtik
To complete the pull request process, please assign smarterclayton after the PR has been reviewed.
You can assign the PR to them by writing /assign @smarterclayton in a comment when ready.


- count := counts[currentRevision]
- revisionStrings = append(revisionStrings, fmt.Sprintf("%d nodes are at revision %d", count, currentRevision))
+ if count := counts[currentRevision]; count == 1 {
+     revisionStrings = append(revisionStrings, fmt.Sprintf("1 node achieved revision %d", currentRevision))
Contributor

Is "achieved" the right word?

Member Author

We should pick one and be consistent: "are at" vs. "achieved". I don't care much which one, though I liked "achieved" better.

}
// if we are progressing and no nodes have achieved that level, we should indicate
if numProgressing > 0 && counts[newStatus.LatestAvailableRevision] == 0 {
- revisionStrings = append(revisionStrings, fmt.Sprintf("%d nodes have achieved new revision %d", 0, newStatus.LatestAvailableRevision))
+ revisionStrings = append(revisionStrings, fmt.Sprintf("none of the nodes have achieved new revision %d", newStatus.LatestAvailableRevision))
Contributor

All these strings are a lot of text. Shouldn't we try to be more brief? For example:

some nodes are on old revisions (34, 35), 1 is on latest.

Member Author

yeah, I think it will read better than 3 semicolons
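A briefer, single-clause message of the kind suggested above could be assembled along these lines. This is only a sketch: `briefRevisionSummary`, its `counts` parameter (revision to node count), and the exact wording are assumptions, not code from the PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// briefRevisionSummary is a hypothetical helper. counts maps a revision to
// the number of nodes currently at it; latest is the newest revision.
func briefRevisionSummary(counts map[int32]int, latest int32) string {
	// Collect every non-latest revision that still has nodes on it.
	old := []int32{}
	for rev, n := range counts {
		if rev != latest && n > 0 {
			old = append(old, rev)
		}
	}
	// Map iteration order is random, so sort for a stable message.
	sort.Slice(old, func(i, j int) bool { return old[i] < old[j] })

	parts := []string{}
	if len(old) > 0 {
		revs := make([]string, 0, len(old))
		for _, rev := range old {
			revs = append(revs, fmt.Sprint(rev))
		}
		parts = append(parts,
			fmt.Sprintf("some nodes are on old revisions (%s)", strings.Join(revs, ", ")))
	}
	parts = append(parts, fmt.Sprintf("%d on latest revision %d", counts[latest], latest))
	return strings.Join(parts, ", ")
}

func main() {
	fmt.Println(briefRevisionSummary(map[int32]int{34: 1, 35: 1, 36: 1}, 36))
}
```

Sorting the old revisions keeps the message stable between syncs, which matters if the string ends up in a status condition that is compared for changes.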

}
revisionDescription := strings.Join(revisionStrings, "; ")

if numAvailable > 0 {
message := fmt.Sprintf("%d nodes are active; %s", numAvailable, revisionDescription)
Contributor

what does active mean?

Member Author

Good question. I think "active" means "available", which in turn means "they achieved the desired levels and are running"?

latestCount++
}
if c != 0 {
availableCount++
Contributor

why does this mean anything about availability?

Member Author

In the existing code, "numAvailable" is determined by CurrentRevision != 0. So I guess CurrentRevision == 0 means the node is not available or not actively making progress? I'm a bit confused here as well.

Contributor

This has nothing to do with availability. It only says that this revision was the last one that eventually became ready.

Member Author

lastReadyRevision ?

Contributor

call it nonZeroCount

Contributor

> lastReadyRevision

It's a count.
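The naming point in this thread can be illustrated with a small counting loop. This is a sketch only: `nodeStatus` is a simplified stand-in for `operatorv1.NodeStatus`, and `countRevisions` is a hypothetical function. The second counter is named `nonZeroCount` because, as noted above, a non-zero `CurrentRevision` says nothing about availability.

```go
package main

import "fmt"

// nodeStatus is a simplified stand-in for operatorv1.NodeStatus,
// keeping only the fields this sketch needs.
type nodeStatus struct {
	NodeName        string
	CurrentRevision int32
}

// countRevisions is a hypothetical helper. It returns how many nodes are at
// the latest revision and how many have any non-zero current revision.
func countRevisions(nodes []nodeStatus, latest int32) (latestCount, nonZeroCount int) {
	for _, n := range nodes {
		if n.CurrentRevision == latest {
			latestCount++
		}
		// Non-zero only means some revision became ready on this node at
		// some point; it does not mean the node is available right now.
		if n.CurrentRevision != 0 {
			nonZeroCount++
		}
	}
	return latestCount, nonZeroCount
}

func main() {
	nodes := []nodeStatus{
		{NodeName: "master-0", CurrentRevision: 0},
		{NodeName: "master-1", CurrentRevision: 5},
		{NodeName: "master-2", CurrentRevision: 6},
	}
	latestCount, nonZeroCount := countRevisions(nodes, 6)
	fmt.Println(latestCount, nonZeroCount)
}
```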


if !currentRevisions.Has(r.latestRevision) {
if r.progressing {
return fmt.Sprintf("%snodes progressing towards revision %d", notAvailable, r.latestRevision)
Contributor

I am confused about notAvailable (e.g. "none of the nodes are available") being plugged into the first %s.

Member Author

I tried to incorporate that into the message so we don't lose it but also get rid of the ";". I think it will read as "none of the nodes are available and nodes progressing towards revision N".

Contributor

@sttts sttts Nov 30, 2020

!currentRevisions.Has(r.latestRevision) does not match the message. Better to check that there is at least one node that is not at latestRevision.

Contributor

oh. you mean "all nodes progressing ..."
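The two conditions being contrasted in this thread differ as sketched below. Both function names are hypothetical, and a plain slice of per-node revisions stands in for the `currentRevisions` set used in the PR.

```go
package main

import "fmt"

// noneAtLatest mirrors what !currentRevisions.Has(latest) expresses:
// no node at all has reached the latest revision yet.
func noneAtLatest(current []int32, latest int32) bool {
	for _, rev := range current {
		if rev == latest {
			return false
		}
	}
	return true
}

// someBehindLatest is the alternative check suggested in review:
// at least one node is not at the latest revision.
func someBehindLatest(current []int32, latest int32) bool {
	for _, rev := range current {
		if rev != latest {
			return true
		}
	}
	return false
}

func main() {
	current := []int32{5, 6, 6} // one revision per node
	fmt.Println(noneAtLatest(current, 6), someBehindLatest(current, 6))
}
```

With one node still on revision 5, the first check is false while the second is true, so only the first would justify wording such as "all nodes progressing".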

@@ -422,15 +422,77 @@ func setNodeStatusFn(status *operatorv1.NodeStatus) v1helpers.UpdateStaticPodSta
}
}

type revisionDescriptionPrinter struct {
Contributor

This needs a doc comment.

Contributor

Why is this an object in the first place and not just a simple func?

}
}

notAvailable := fmt.Sprintf("%d %s are not available and ", availableCount, pluralizedNode(availableCount))
Contributor

I cannot follow this. It must be len(nodes)-availableCount, no?

Also, better to change the wording to: %d %s have never been available and
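The correction requested here (count the nodes that are not available, and adjust the wording) might look like the sketch below. `neverAvailablePrefix` is a hypothetical name, the `pluralizedNode` body is an assumption since the PR's version is not shown, and the singular "has" is an addition for verb agreement that the review comment did not ask for.

```go
package main

import "fmt"

// pluralizedNode is assumed to behave like the helper of the same name
// referenced in the diff; its real body is not shown in the conversation.
func pluralizedNode(count int) string {
	if count == 1 {
		return "node"
	}
	return "nodes"
}

// neverAvailablePrefix is a hypothetical helper. It reports the nodes that
// are NOT counted as available, i.e. totalNodes - availableCount, and
// returns an empty prefix when every node is accounted for.
func neverAvailablePrefix(totalNodes, availableCount int) string {
	missing := totalNodes - availableCount
	if missing <= 0 {
		return ""
	}
	verb := "have"
	if missing == 1 {
		verb = "has"
	}
	return fmt.Sprintf("%d %s %s never been available and ",
		missing, pluralizedNode(missing), verb)
}

func main() {
	fmt.Println(neverAvailablePrefix(3, 1) + "nodes progressing towards revision 7")
}
```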

notAvailable = ""
}
if availableCount == 0 {
return "none of the nodes are available"
Contributor

none of the nodes was ever available.

}
if currentRevisions.Has(r.latestRevision) {
oldRevisionStringList := []string{}
for _, i := range currentRevisions.Difference(sets.NewInt32(r.latestRevision)).List() {
Contributor

loop over currentRevisions and exclude latestRevision. No need for set arithmetic.
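The simplification suggested here could look like the following sketch. To keep it self-contained, a sorted `[]int32` stands in for the result of `currentRevisions.List()`; `oldRevisionStrings` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// oldRevisionStrings is a hypothetical helper. It walks the sorted list of
// current revisions once and skips the latest one, instead of building a
// second set and computing a set difference.
func oldRevisionStrings(sortedRevisions []int32, latest int32) []string {
	old := []string{}
	for _, rev := range sortedRevisions {
		if rev == latest {
			continue
		}
		old = append(old, fmt.Sprint(rev))
	}
	return old
}

func main() {
	old := oldRevisionStrings([]int32{34, 35, 36}, 36)
	fmt.Println(strings.Join(old, ", "))
}
```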

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 28, 2021
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 30, 2021
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci
Contributor

openshift-ci bot commented Apr 29, 2021

@mfojtik: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 29, 2021
@openshift-ci-robot

@openshift-bot: Closed this PR.


Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
4 participants