Remove pkg/controller/ready #733
Conversation
deploy/500-controller.yaml
Outdated
livenessProbe:
  exec:
    command:
      - stat
      - /tmp/shipwright-build-ready
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  exec:
    command:
      - stat
      - /tmp/shipwright-build-ready
  initialDelaySeconds: 5
  periodSeconds: 10
Here is an alternative using the metrics endpoint:
ports:
  - containerPort: 8383
    name: metrics-port
livenessProbe:
  httpGet:
    path: /metrics
    port: metrics-port
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /metrics
    port: metrics-port
  initialDelaySeconds: 5
  periodSeconds: 10
We will likely need to use this downstream as we require a probe. It is probably as good as the Tekton solution.
Do you require both a liveness probe and a readiness probe?
I do not know, I would need to check. But you are right, we should probably only define the liveness probe here because there is no service defined. Downstream we have our own deployment YAML anyway, so if we had to specify a readiness probe, we could always deviate from what is defined upstream.
If you have your own downstream deployment anyway, I think I'd prefer to remove both probes here for now, until we decide there's real value in adding them. Does that sound okay to you?
I think I am okay with that. The risk then is that we use something downstream (a probe based on the metrics endpoint) that is not "tested" upstream. But if it stopped working (for whatever reason) and could not be repaired, I think we could fairly quickly introduce a new basic liveness endpoint like Tekton does.
We need both probes to satisfy some required standards/policies downstream. I would prefer to keep both upstream so that we can continuously test them; if any downstream vendor does not require them, they can remove them. The PR looks good to me.
This package wrote a file to report that the controller was live and ready, which was unnecessary. Instead, we'll use the /metrics endpoint to determine liveness and readiness. This change also removes some unused code in pkg/controller/controller.go
Hi @imjasonh, maybe just for your info: I am not sure whether liveness/readiness probes work well for non-leader pods when we use the metrics endpoint for these two checks. The Tekton controller uses the metrics endpoint for liveness/readiness checks and it works normally, because Tekton uses an active/active HA mode for leader election. That means each Tekton controller pod can be a leader; different pods just hold different buckets at the same time. However, on the build side, we are using
@xiujuan95 in our downstream dev environment, the build controller has been running with the httpGet-based probes since yesterday and there has not been an alert yet, so the probes must be working. You may also check the metrics endpoint of non-leading pods in our environment. They are present (they just do not contain build-related metrics, which is expected).
@xiujuan95 it's a good point, I was also wondering why it works now. I think something changed in the controller-runtime package that allows us to serve the /metrics endpoint in both the leader and the passive pods.
/lgtm
@imjasonh @SaschaSchwarze0 what's the state of this PR?
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: SaschaSchwarze0. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing
Fixes #732
/kind cleanup
Submitter Checklist
See the contributor guide for details on coding conventions, GitHub and Prow interactions, and the code review process.
Release Notes
/assign @SaschaSchwarze0