Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Probe Envoy pods to determine when a ClusterIngress is actually deployed #4734

Merged
merged 2 commits into from
Jul 29, 2019

Conversation

JRBANCEL
Copy link
Contributor

@JRBANCEL JRBANCEL commented Jul 12, 2019

Fixes #3312 & #1582

Proposed Changes

See the design doc for details.

TODO

  • Working implementation
  • Fix existing tests
  • Add tests

Next

  • Remove the special handling of 404s in E2E tests (in a subsequent PR)

@knative-prow-robot knative-prow-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 12, 2019
@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Jul 12, 2019
Copy link
Contributor

@knative-prow-robot knative-prow-robot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JRBANCEL: 0 warnings.

In response to this:

Fixes #3312 & #1582

Proposed Changes

See the design doc for details.

TODO

  • Working implementation
  • Fix existing tests
  • Add tests
  • Remove the special handling of 404s in E2E tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-metrics-robot
Copy link

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/networking/v1alpha1/ingress_lifecycle.go 91.7% 73.3% -18.3
pkg/network/prober/prober.go 97.7% 91.3% -6.4
pkg/reconciler/ingress/ingress.go Do not exist 0.0%
pkg/reconciler/ingress/resources/virtual_service.go 97.6% 95.9% -1.7
pkg/reconciler/ingress/status.go Do not exist 0.0%

Copy link
Contributor

@vagababov vagababov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly superficial items.

pkg/apis/networking/v1alpha1/ingress_lifecycle.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/resources/virtual_service.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
@@ -140,6 +176,37 @@ func makeVirtualServiceSpec(ia v1alpha1.IngressAccessor, gateways map[v1alpha1.I
return &spec
}

func makeProbeRoute(host string) v1alpha3.HTTPRoute {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to add a dummy route? Couldn't we also use our probe? It'll either return activator or queue-proxy but we'd know it works without side-effects I think? Is this to keep the layering strict?

Copy link
Contributor Author

@JRBANCEL JRBANCEL Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to add a dummy route?

It is an invalid Spec without it.

Couldn't we also use our probe? It'll either return activator or queue-proxy but we'd know it works without side-effects I think? Is this to keep the layering strict?

We could. It is simply because it is simpler, has lower latency and more reliable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it guaranteed that the fake route is propageted on the same paths as the actual route? As in: Can we guarantee the actual route is ready and propagated to envoys if we probe the fake route?

Copy link
Contributor Author

@JRBANCEL JRBANCEL Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can double check the Istio implementation.
I am assuming that yes, since both routes are bound to the same Istio Gateway, they will be both applied to the same Envoy pods.

An interesting related question is how does Istio apply the config changes?
If it is atomic, good. If not, it is possible that the probe route is applied before the actual route.

@knative-prow-robot knative-prow-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 16, 2019
Copy link
Contributor

@knative-prow-robot knative-prow-robot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JRBANCEL: 1 warning.

In response to this:

/lint
/test pull-knative-serving-go-coverage

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pkg/reconciler/ingress/status.go Show resolved Hide resolved
Copy link
Contributor

@knative-prow-robot knative-prow-robot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JRBANCEL: 1 warning.

In response to this:

/lint
/test pull-knative-serving-go-coverage

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pkg/reconciler/ingress/status.go Outdated Show resolved Hide resolved
Copy link
Contributor

@knative-prow-robot knative-prow-robot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JRBANCEL: 0 warnings.

In response to this:

/lint
/test pull-knative-serving-go-coverage

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@JRBANCEL JRBANCEL changed the title [WIP] Probe Envoy pods to determine when a ClusterIngress is actually deployed Probe Envoy pods to determine when a ClusterIngress is actually deployed Jul 16, 2019
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 16, 2019
@JRBANCEL
Copy link
Contributor Author

/test pull-knative-serving-integration-tests

Copy link
Contributor

@markusthoemmes markusthoemmes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the retries on 404s to assert that this really works as intended?

See

if resp.StatusCode == http.StatusNotFound {
return false, nil
}
and the respective file in v1beta1.

if you choose to do that: Please leave the helper there even though it doesn't do anything currently. We can use it as a convenient hook until we have baked the networking flakiness some more and are sure we can take the extra wheels off.

@JRBANCEL
Copy link
Contributor Author

JRBANCEL commented Jul 17, 2019

Should we remove the retries on 404s to assert that this really works as intended?

Yes, it is planned. See last item in the TODO.
I'll most likely do it in another PR because it is a bit more complex than just removing the retries for 404.

@knative-prow-robot knative-prow-robot added area/test-and-release It flags unit/e2e/conformance/perf test issues for product features do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jul 17, 2019
@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 17, 2019
@JRBANCEL JRBANCEL force-pushed the cluster-ingress-status branch 2 times, most recently from 5467969 to be182f5 Compare July 25, 2019 22:43
@knative-prow-robot knative-prow-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 25, 2019
@mattmoor
Copy link
Member

Needs a rebase.

Copy link
Member

@mattmoor mattmoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this still has a bunch of debugging stuff in it?

pkg/apis/serving/v1alpha1/service_lifecycle_test.go Outdated Show resolved Hide resolved
pkg/apis/serving/v1beta1/service_defaults_test.go Outdated Show resolved Hide resolved
resyncIngressOnVirtualServiceReady := func(vs *v1alpha3.VirtualService) {
// Reconcile when a VirtualService becomes ready
impl.EnqueueLabelOfClusterScopedResource(networking.ClusterIngressLabelKey)(vs)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this just be:

resyncIngressOnVirtualServiceReady := impl.EnqueueLabelOfClusterScopedResource(networking.ClusterIngressLabelKey)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, EnqueueLabelOfClusterScopedResource is func(obj interface{}), resyncIngressOnVirtualServiceReady expects a func(*v1alpha3.VirtualService)

@knative-prow-robot knative-prow-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jul 29, 2019
@JRBANCEL
Copy link
Contributor Author

@tcnghia, @mattmoor it is ready to be approved/lgtm 🚀

/test pull-knative-serving-istio-1.1-mesh
/test pull-knative-serving-istio-1.1-no-mesh
/test pull-knative-serving-istio-1.2-mesh
/test pull-knative-serving-istio-1.2-no-mesh

Copy link
Contributor

@tcnghia tcnghia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

There is one left over from merge, but otherwise /lgtm

pkg/reconciler/ingress/resources/virtual_service.go Outdated Show resolved Hide resolved
@JRBANCEL
Copy link
Contributor Author

/test pull-knative-serving-istio-1.1-mesh
/test pull-knative-serving-istio-1.1-no-mesh
/test pull-knative-serving-istio-1.2-mesh
/test pull-knative-serving-istio-1.2-no-mesh

@mattmoor
Copy link
Member

/lgtm
/approve

Let's get it baking.

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 29, 2019
@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JRBANCEL, mattmoor, tcnghia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 29, 2019
@JRBANCEL
Copy link
Contributor Author

/hold cancel 🚀

@tcnghia
Copy link
Contributor

tcnghia commented Jul 29, 2019

/hold cancel

@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 29, 2019
@knative-prow-robot knative-prow-robot merged commit cbe9951 into knative:master Jul 29, 2019
@vagababov
Copy link
Contributor

🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Brief 404 errors after creating a route.
8 participants