New probe handling in Queue-Proxy & Activator #5159

Merged
merged 1 commit into from
Aug 16, 2019

Conversation

JRBANCEL
Contributor

Part 1 & 2 & 3 of #5156.

Proposed Changes

I am not 100% sure why, but today Activator and Queue-Proxy care whether a probe request was sent to them specifically. For #5156, we need to be able to probe Envoy through the real data path (eventually reaching either Activator or Queue-Proxy) without distinction. This change introduces a new probe handler at the head of the handler pipeline that does just that, as well as a new header that will be populated by Envoy and used to find out which version of the VirtualService/Gateway Envoy is using.
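
For readers following along, here is a minimal sketch of what such a handler could look like. It is not the code from this PR: the header names and values (`X-Example-Probe`, `X-Example-Hash`, `probe`) and the helpers `demoHandler` and `send` are hypothetical, and it assumes the handler answers a probe by echoing the version header back to the caller.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical header names and value, chosen for illustration only.
const (
	probeHeaderName  = "X-Example-Probe"
	probeHeaderValue = "probe"
	hashHeaderName   = "X-Example-Hash"
)

// newProbeHandler wraps next. Probe requests are answered here and never
// reach next; all other requests pass through untouched.
func newProbeHandler(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get(probeHeaderName) != probeHeaderValue {
			next.ServeHTTP(w, r)
			return
		}
		hash := r.Header.Get(hashHeaderName)
		if hash == "" {
			http.Error(w, "probe request is missing the hash header", http.StatusBadRequest)
			return
		}
		// Assumption: the prober learns the version from the echoed header.
		w.Header().Set(hashHeaderName, hash)
		w.WriteHeader(http.StatusOK)
	})
}

// demoHandler puts the probe handler in front of a stand-in application handler.
func demoHandler() http.Handler {
	app := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusAccepted)
	})
	return newProbeHandler(app)
}

// send issues an in-memory GET with the given headers and returns the
// response status and the echoed hash header (empty if absent).
func send(h http.Handler, headers map[string]string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get(hashHeaderName)
}

func main() {
	h := demoHandler()
	fmt.Println(send(h, map[string]string{probeHeaderName: probeHeaderValue, hashHeaderName: "abc123"}))
	fmt.Println(send(h, nil))
}
```

Because the handler does not check who the probe was addressed to, the same wrapper can sit in front of either component.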

Next

Maybe take a holistic look at probing in all components and consolidate/refactor.

@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Aug 14, 2019
@knative-prow-robot knative-prow-robot added area/networking size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 14, 2019
Contributor

@knative-prow-robot knative-prow-robot left a comment

@JRBANCEL: 0 warnings.

@JRBANCEL
Contributor Author

/assign @vagababov

@@ -242,6 +242,7 @@ func main() {
ah = &activatorhandler.HealthHandler{HealthCheck: statSink.Status, NextHandler: ah}
// NOTE: MetricHandler is being used as the outermost handler for the purpose of measuring the request latency.
ah = activatorhandler.NewMetricHandler(revisionInformer.Lister(), reporter, logger, ah)
ah = network.NewProbeHandler(ah)
Contributor

Is this one going to replace activatorhandler.ProbeHandler in future?

And please make the metric handler the outermost one.

Contributor Author

No, the two kinds of probing are different.
See my other comment about the metrics handler.

@@ -388,6 +388,7 @@ func main() {
composedHandler = pushRequestMetricHandler(composedHandler, requestCountM, responseTimeInMsecM, env)
}
composedHandler = tracing.HTTPSpanMiddleware(composedHandler)
composedHandler = network.NewProbeHandler(composedHandler)
Contributor

Is this one going to replace the probe-request check in func handler in the future?

And the same goes for the handler order. I don't know why the metric handler is currently not the outermost one, but it should be, so that the measured latency includes the time added by all handlers.

Contributor Author

Queue-Proxy probing is different than this probing, so I don't think they'll ever be merged.

Regarding the ordering, you can argue that tracing also needs to be the outermost.
But all these handlers (metrics, tracing, probing) are not all doing sub-millisecond work, plus the metrics handler already filters out probing requests, so it doesn't matter here.
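
To make the ordering question concrete, here is a small sketch of how this style of wrapping determines which handler runs first. The `tag` middleware and `callOrder` helper are hypothetical stand-ins, not Knative code; the chain only mirrors the wrapping pattern shown in the diff above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// tag returns a hypothetical middleware that records its name in order
// and then calls next.
func tag(name string, order *[]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		*order = append(*order, name)
		next.ServeHTTP(w, r)
	})
}

// callOrder builds a chain by repeated wrapping and returns the order in
// which the handlers saw a single request.
func callOrder() []string {
	var order []string
	var h http.Handler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		order = append(order, "app")
	})
	h = tag("metrics", &order, h)
	h = tag("tracing", &order, h)
	h = tag("probe", &order, h) // wrapped last, so it is outermost and runs first
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return order
}

func main() {
	fmt.Println(callOrder()) // prints "[probe tracing metrics app]"
}
```

In this arrangement a request answered by the outermost handler never reaches the handlers it wraps, so they do not observe it at all.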

@vagababov
Contributor

As I described earlier: originally the k8s service that was used was the same, and this was checking for network re-programming. Currently we actually use a private service, so this might be less of a concern. Please make sure that scale from zero works nicely (run a scale-from-0 performance test).

@vagababov
Contributor

Also what is the problem with systems verifying that the probe was targeted to them specifically?

@JRBANCEL
Contributor Author

Also what is the problem with systems verifying that the probe was targeted to them specifically?

Because here we want to probe through Envoy (to know which version Envoy is using), and there is no way to know where Envoy will send the request, Queue-Proxy or Activator, and it doesn't matter.
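
A rough sketch of that idea from the prober's side, under stated assumptions: `fakeProxy` is a hypothetical stand-in for Envoy that stamps the configuration version it is running into a header, `echo` is a stand-in for whichever backend receives the probe, and the header names are invented for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical header names, for illustration only.
const (
	probeHeader = "X-Example-Probe"
	hashHeader  = "X-Example-Hash"
)

// echo stands in for Activator or Queue-Proxy: it answers a probe by
// echoing the hash header, without checking who the probe was meant for.
func echo() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get(probeHeader) == "" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set(hashHeader, r.Header.Get(hashHeader))
		w.WriteHeader(http.StatusOK)
	})
}

// fakeProxy stands in for Envoy: it stamps the config version it is
// running and forwards to one of the backends in round-robin order.
// The counter is not safe for concurrent use; this is a sketch only.
func fakeProxy(version string, backends ...http.Handler) http.Handler {
	n := 0
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Header.Set(hashHeader, version)
		backends[n%len(backends)].ServeHTTP(w, r)
		n++
	})
}

// probeVersion sends one probe through the proxy and returns the version
// reported back, regardless of which backend answered.
func probeVersion(proxy http.Handler) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set(probeHeader, "probe")
	rec := httptest.NewRecorder()
	proxy.ServeHTTP(rec, req)
	return rec.Header().Get(hashHeader)
}

func main() {
	proxy := fakeProxy("v2", echo(), echo())
	fmt.Println(probeVersion(proxy), probeVersion(proxy))
}
```

Both probes report the same version even though they land on different backends, which is the property the prober relies on.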

@tcnghia
Contributor

tcnghia commented Aug 15, 2019

Can we change the existing probe handlers to be more relaxed about headers (perhaps accept "*") and also echo back some request headers? That would mean fewer handlers and more code sharing.

@JRBANCEL
Contributor Author

Can we change the existing probe handlers to be more relaxed about headers (perhaps accept "*") and also echo back some request headers? That would mean fewer handlers and more code sharing.

While Activator probe handler is somewhat similar, Queue-Proxy probe handler is a different beast (it probes user-container).

As I said in the Next section, I will revisit this and eventually merge what makes sense, but right now it is not obvious. Small handlers are cleaner than more generic, complex handlers.

Contributor

@vagababov vagababov left a comment

Just a suggestion for better readability.

@googlebot

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: no Indicates the PR's author has not signed the CLA. and removed cla: yes Indicates the PR's author has signed the CLA. labels Aug 16, 2019
Contributor

@mattmoor-sockpuppet mattmoor-sockpuppet left a comment

Found go linting violations, please merge: JRBANCEL#3

@googlebot

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@googlebot googlebot added cla: yes Indicates the PR's author has signed the CLA. and removed cla: no Indicates the PR's author has not signed the CLA. labels Aug 16, 2019
@knative-test-reporter-robot

The following tests are currently flaky. Running them again to verify...

Test name Retries
pull-knative-serving-integration-tests 1/3

Automatically retrying...
/test pull-knative-serving-integration-tests

Contributor

@vagababov vagababov left a comment

/lgtm
/approve

/hold
if you want to deal with the nits.

@knative-prow-robot knative-prow-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Aug 16, 2019
@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JRBANCEL, vagababov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2019
@knative-prow-robot knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2019
@vagababov
Contributor

/hold cancel

@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 16, 2019
@vagababov
Contributor

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2019
@knative-prow-robot knative-prow-robot merged commit 32dcc15 into knative:master Aug 16, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/autoscale area/networking cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
8 participants