report template controller's status in ControllerConfig #339
0eb682b
to
cc394d7
Compare
Is this working towards a fix for #338?
/test e2e-aws
6224fb0
to
802ff4a
Compare
@cgwalters it is ready for review, updated the PR message too.
Failures from e2e-aws:
[sig-storage] Subpath [Volume type: hostPathSymlink] should fail for new directories when readOnly specified in the volumeSource [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPathSymlink] should support existing directories when readOnly specified in the volumeSource [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPathSymlink] should support existing directory [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPathSymlink] should support existing single file [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPathSymlink] should support file as subpath [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPathSymlink] should support non-existent path [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPathSymlink] should support readOnly directory specified in the volumeMount [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPathSymlink] should support readOnly file specified in the volumeMount [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: hostPath] should support existing directory [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: nfsPVC] should support existing directories when readOnly specified in the volumeSource [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: nfs] should support existing directory [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Subpath [Volume type: nfs] should support existing single file [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] Volumes iSCSI [Feature:Volumes] should be mountable [Suite:openshift/conformance/parallel] [Suite:k8s]
/retest
Pulling this down now, I should expect to see status in
/retest
Yes
ControllerConfig is currently used to configure the `template controller`. All other controllers report progress on some object, for example:
- `render controller` reports status on `machine config pools`
- `node controller` reports status on `machine config pools`
Status updates for the `template controller`, however, have been missing. This adds a `status` field to `ControllerConfig` to allow reporting various conditions. Currently 3 conditions have been added for `template controller`, namely `completed`, `running` and `failing`.
The operator needs to know when the internal machineconfigs in `template controller` have been synced to the cluster. For the operator, `done` means that `template controller` reports `completed`, not `failing` and not `running`, for the current generation of `ControllerConfig`.
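To make the `done` rule above concrete, here is a minimal sketch in Go. The `Condition` and `ControllerConfig` structs and the `checkCompleted` helper are simplified, hypothetical stand-ins, not the types or functions from this PR; only the condition type strings are taken from the `oc get` output quoted in the review.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition is a hypothetical, trimmed-down status condition.
type Condition struct {
	Type   string
	Status string // "True" or "False"
}

// ControllerConfig is a hypothetical stand-in that keeps only the fields
// needed to decide whether the template controller is done.
type ControllerConfig struct {
	Generation         int64
	ObservedGeneration int64
	Conditions         []Condition
}

// conditionTrue reports whether the named condition exists with status "True".
func conditionTrue(cc ControllerConfig, condType string) bool {
	for _, c := range cc.Conditions {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

// checkCompleted returns nil only when the status describes the current
// generation and reports completed, not failing and not running.
func checkCompleted(cc ControllerConfig) error {
	if cc.ObservedGeneration != cc.Generation {
		return fmt.Errorf("status reports generation %d, want %d",
			cc.ObservedGeneration, cc.Generation)
	}
	switch {
	case conditionTrue(cc, "TemplateContollerFailing"):
		return errors.New("template controller is failing")
	case conditionTrue(cc, "TemplateContollerRunning"):
		return errors.New("template controller is still running")
	case !conditionTrue(cc, "TemplateContollerCompleted"):
		return errors.New("template controller has not reported completed")
	}
	return nil
}

func main() {
	cc := ControllerConfig{
		Generation:         2,
		ObservedGeneration: 2,
		Conditions: []Condition{
			{Type: "TemplateContollerCompleted", Status: "True"},
			{Type: "TemplateContollerRunning", Status: "False"},
			{Type: "TemplateContollerFailing", Status: "False"},
		},
	}
	fmt.Println("done:", checkCompleted(cc) == nil)
}
```

Checking the generation first means a `completed` condition left over from an older generation is never mistaken for the current one.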
802ff4a
to
f19f48d
Compare
Updated images in cluster to use this PR and can confirm that I now see:
status:
  conditions:
  - lastTransitionTime: 2019-01-25T22:59:17Z
    reason: sync completed towards (2) generation using controller version 3.11.0-507-gf19f48d4
    status: "True"
    type: TemplateContollerCompleted
  - lastTransitionTime: 2019-01-25T22:59:17Z
    status: "False"
    type: TemplateContollerRunning
  - lastTransitionTime: 2019-01-25T22:59:17Z
    status: "False"
    type: TemplateContollerFailing
  observedGeneration: 2
when I run: oc get -o yaml controllerconfig
Updated review: Can we take a look at this logging to see if we can improve the UX?
return wait.Poll(controllerConfigCompletedInterval, controllerConfigCompletedTimeout, func() (bool, error) {
	if err := isControllerConfigCompleted(resource, optr.ccLister.Get); err != nil {
		glog.Errorf("controllerconfig is not completed: %v", err)
		return false, nil
Should this return `false, err`?
We want to keep polling until the timeout so that we don't exit on interruptions / transient errors. The poll timeout is 1 minute, which doesn't seem too long.
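For readers following the `return false, nil` discussion, here is a stdlib-only sketch of the trade-off. The `poll` helper is a simplified, hypothetical stand-in for a polling helper such as `wait.Poll` (it checks immediately rather than after the first interval), and the interval and timeout are shortened for illustration; the names `waitForCompleted`, `errNotYet` and `errTimedOut` are assumptions, not code from this PR.

```go
package main

import (
	"errors"
	"log"
	"time"
)

var (
	// Shortened, illustrative values; the thread mentions a 1 minute timeout.
	pollInterval = 10 * time.Millisecond
	pollTimeout  = 100 * time.Millisecond

	errTimedOut = errors.New("timed out waiting for the condition")
	errNotYet   = errors.New("template controller still running")
)

// poll calls cond until it reports done, returns an error, or the timeout
// expires. A non-nil error from cond stops the loop at once.
func poll(interval, timeout time.Duration, cond func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		done, err := cond()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimedOut
		}
		time.Sleep(interval)
	}
}

// waitForCompleted logs a failed check and keeps polling: returning
// (false, nil) swallows the error, so only the timeout ends the wait.
func waitForCompleted(check func() error) error {
	return poll(pollInterval, pollTimeout, func() (bool, error) {
		if err := check(); err != nil {
			log.Printf("controllerconfig is not completed: %v", err)
			return false, nil
		}
		return true, nil
	})
}

func main() {
	attempts := 0
	err := waitForCompleted(func() error {
		attempts++
		if attempts < 3 {
			return errNotYet // transient: retried on the next tick
		}
		return nil
	})
	log.Printf("attempts=%d err=%v", attempts, err)
}
```

Returning `false, err` from the condition instead would abort on the first failed check, which is the behavior the reply above argues against.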
return optr.waitForDeploymentRollout(mcc)
var waitErrs []error
waitErrs = append(waitErrs, optr.waitForDeploymentRollout(mcc))
waitErrs = append(waitErrs, optr.waitForControllerConfigToBeCompleted(cc))
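The hunk above does not show how `waitErrs` becomes the function's return value. As a rough sketch of the collect-then-aggregate pattern, the example below uses `errors.Join` from the Go standard library (Go 1.20+), which drops nil entries and returns nil when every entry is nil. That choice is an assumption, not necessarily what the PR uses, and `waitAll`, `errRollout` and the step functions are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errRollout is a hypothetical error used only for this illustration.
var errRollout = errors.New("deployment rollout did not finish")

// waitAll runs every wait step, even if an earlier one failed, and
// returns the combined error (nil when all steps succeeded).
func waitAll(steps ...func() error) error {
	var waitErrs []error
	for _, step := range steps {
		waitErrs = append(waitErrs, step())
	}
	// errors.Join discards nil values and returns nil if none remain.
	return errors.Join(waitErrs...)
}

func main() {
	ok := func() error { return nil }
	bad := func() error { return errRollout }

	fmt.Println("all ok:", waitAll(ok, ok))
	fmt.Println("one failed:", waitAll(bad, ok))
}
```

Running both waits before returning means a rollout failure does not hide a ControllerConfig failure, or the other way round.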
Interesting. OK, I think I see how this approach should solve the problem.
And so then it should obsolete this outstanding change: c8109db
f19f48d
to
4d1fffa
Compare
Double checked the logging tweak that I requested and it's resolved 👍
Will let @cgwalters loop back in to give his final review.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: abhinavdahiya, cgwalters
/test e2e-aws
/test e2e-aws
/retest
/test e2e-aws
/retest
Please review the full test history for this PR and help us cut down flakes.
Waiting for the `template controller` to report `completed` makes sure that we install the server and daemon only after the controller has finished installing its operands; installing them at the same time has been causing races like #338.
/cc @cgwalters @ashcrow