Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow startup probes on user containers #12565

Closed
wants to merge 2 commits into from

Conversation

psschwei
Copy link
Contributor

@psschwei psschwei commented Jan 27, 2022

Fixes #10037

Proposed Changes

  • Adds startupProbe to the list of allowed fields for containers, updates the schemas, and adds some tests.

Release Note

Users are able to declare a `startupProbe` on their applications. To learn more about startup probes, see the [Kubernetes documentation](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-startup-probe).

/assign @dprotaso @julz

@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/API API objects and controllers area/test-and-release It flags unit/e2e/conformance/perf test issues for product features approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 27, 2022
@psschwei
Copy link
Contributor Author

/cancel-approve
(auto-approved because of release lead superpowers)

@psschwei
Copy link
Contributor Author

/remove-approve
(use the right prow command)

@knative-prow-robot knative-prow-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 27, 2022
@codecov
Copy link

codecov bot commented Jan 27, 2022

Codecov Report

Merging #12565 (c306c1e) into main (ff30afc) will increase coverage by 0.00%.
The diff coverage is 33.33%.

@@           Coverage Diff           @@
##             main   #12565   +/-   ##
=======================================
  Coverage   87.48%   87.49%           
=======================================
  Files         195      195           
  Lines        9718     9721    +3     
=======================================
+ Hits         8502     8505    +3     
  Misses        931      931           
  Partials      285      285           
Impacted Files Coverage Δ
pkg/activator/net/revision_backends.go 91.81% <0.00%> (-0.80%) ⬇️
pkg/apis/serving/fieldmask.go 95.15% <100.00%> (+0.01%) ⬆️
pkg/reconciler/configuration/configuration.go 86.15% <0.00%> (+1.53%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ff30afc...c306c1e. Read the comment docs.

Copy link
Member

@julz julz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we allow the user to add a StartupProbe, don't we need to ensure that activator (via the optimisation where it checks QP readiness directly) doesn't start routing to the user container before it's passing?

@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from dprotaso after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@psschwei
Copy link
Contributor Author

/test pull-knative-serving-istio-latest-no-mesh

@psschwei
Copy link
Contributor Author

psschwei commented Jan 28, 2022

Test error:

Error: error creating the clusters: error creating clusters: error creating cluster: exit status 1, output: "Default change: VPC-native is the default mode during cluster creation for versions greater than 1.21.0-gke.1500. To create advanced routes based clusters, please pass the --no-enable-ip-alias flag\nNote: Your Pod address range (--cluster-ipv4-cidr) can accommodate at most 1008 node(s).\nERROR: (gcloud.beta.container.clusters.create) ResponseError: code=403, message=Insufficient regional quota to satisfy request: resource "CPUS": request requires '48.0' and is short '28.0'. project has a quota of '200.0' with '20.0' available. View and manage quotas at https://console.cloud.google.com/iam-admin/quotas?usage=USED&project=knative-boskos-55.\n"

@psschwei
Copy link
Contributor Author

/test pull-knative-serving-upgrade-tests

@psschwei
Copy link
Contributor Author

/test pull-knative-serving-istio-latest-no-mesh

@dprotaso
Copy link
Member

dprotaso commented Feb 9, 2022

I think introducing this attribute is a bit more complex and we probably need to document all the scenarios to help us understand the implications.

For example - we rewrite the user container's readiness probe to go via the queue proxy. I believe we even default to a TCP probe when one is not set. Thus the queue proxy is aware of the application's 'readiness'. But this readiness probe should occur after the startupProbe which currently the queue proxy isn't aware of.

There's probably other edge cases/scenarios that I haven't thought of

@dprotaso
Copy link
Member

dprotaso commented Feb 9, 2022

/hold

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 9, 2022
@psschwei
Copy link
Contributor Author

psschwei commented Jun 7, 2022

Going to close this for now as it's not a trivial update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/API API objects and controllers area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support startupProbe in webhook
4 participants