Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for TLS (conformance test passes) #316

Merged
merged 12 commits into from
Jul 12, 2022

Conversation

evankanderson
Copy link
Contributor

@evankanderson evankanderson commented Jun 28, 2022

Changes

  • 🐛 Add support for TLS by filling out Gateway's Spec.Listener
  • 🧹 Fix a few typo'ed import aliases

Still TODO:

  • Clean up Spec.Listener on delete (separate PR)
  • More testing in ingress_test.go (I got one done, but I want to test the "listener already exists" scenario, at least)

/kind enhancement

Fixes #199
Fixes #319

Release Note

Adds support for Knative-configured TLS ("auto-TLS")

Docs

This increases parity with other implementations, but should not require its own
documentation, as no configuration options were added.

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2022
@knative-prow knative-prow bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 28, 2022
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2022
@codecov
Copy link

codecov bot commented Jun 28, 2022

Codecov Report

Merging #316 (ad027c7) into main (c82e90c) will increase coverage by 3.63%.
The diff coverage is 68.58%.

@@            Coverage Diff             @@
##             main     #316      +/-   ##
==========================================
+ Coverage   12.91%   16.55%   +3.63%     
==========================================
  Files          20       21       +1     
  Lines        2958     3172     +214     
==========================================
+ Hits          382      525     +143     
- Misses       2554     2616      +62     
- Partials       22       31       +9     
Impacted Files Coverage Δ
...kg/reconciler/ingress/resources/reference_grant.go 0.00% <0.00%> (ø)
pkg/reconciler/ingress/reconcile_resources.go 67.67% <77.21%> (+35.96%) ⬆️
pkg/reconciler/ingress/ingress.go 74.19% <87.50%> (+2.19%) ⬆️
pkg/reconciler/ingress/controller.go 94.02% <100.00%> (+0.27%) ⬆️
pkg/reconciler/ingress/resources/httproute.go 94.77% <100.00%> (+0.16%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c82e90c...ad027c7. Read the comment docs.

secret := metav1.PartialObjectMetadata{
TypeMeta: metav1.TypeMeta{
Kind: "Secret",
APIVersion: "v1", //metav1.GroupVersion.String(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: corev1.Secret not metav1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done

Copy link
Contributor

@dprotaso dprotaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops accidentally hit the wrong button earlier

change looks good

tl;dr

  • I think one of the lint warnings is a legit bug
  • We should have a follow up where we create multiple gateways vs updating a single one

Comment on lines +231 to +232
_, err := c.gwapiclient.GatewayV1alpha2().Gateways(update.Namespace).Update(
ctx, update, metav1.UpdateOptions{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to open up a new issue about creating multiple gateways vs. updating a single one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_, err := c.gwapiclient.GatewayV1alpha2().Gateways(update.Namespace).Update(
ctx, update, metav1.UpdateOptions{})
if err != nil {
recorder.Eventf(ing, corev1.EventTypeWarning, "GatewayUpdateFailed", "Failed to update Gateway %s: %v", gwName, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that these controllers have a default threadiness of 2 - I'd expect many update conflicts since we're updating a single shared resource - this can be addressed by using multiple gateways at a future date.

I'd probably skip recording the event since that might trigger client side throttling since we'll be hitting the API frequently

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we shouldn't run this controller in HA with the number of buckets being greater than 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you're suggesting. I'll add o TODO and a comment here.

I'll also bring this up with the Gateway API maintainers -- it feels like generating hundreds of Gateways that are supposed to collapse onto the same proxy is probably not right (and doesn't seem to work for at least Contour today), but the one-Gateway model doesn't scale, either.

Filed #318

if len(listeners) > 0 {
// For now, we only reconcile the external visibility, because there's
// no way to provide TLS for internal listeners.
err := c.reconcileGatewayListeners(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious if we should update the gateway listeners first before updating the httproute

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure -- how would you decide which to do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could imagine a poor implementation (of the gateway api) that doesn't trigger reconciliation of existing HTTP routes when ReferenceGrants for certain namespaces are create.

Thus I think I'd prefer to create the listener & grant and update the gateway prior to creating an HTTPRoute.

I'm ok if that's done in a followup PR

for _, h := range tls.Hosts {
listener := gatewayv1alpha2.Listener{
Name: gatewayv1alpha2.SectionName(h),
Hostname: (*gatewayv1alpha2.Hostname)(&h),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lint warning seems legit -probably should make a copy of h otherwise this points to the address space that's updated by the loop iteration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I don't think I've ever had more than one entry in tls.Hosts, good catch!

for _, l := range listeners {
lmap[string(l.Name)] = l
}
// TODO: how do we track and remove listeners if they are removed from the KIngress spec?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@evankanderson
Copy link
Contributor Author

Okay, I've added a couple more test cases, going over Dave's comments now.

I think my next steps are going to be to figure out what the Contour maintainers expect from AppPrortocol to enable Webhook and TLS, and see whether or not I can reasonably hack that into Contour to get the websocket and grpc tests passing.

@nak3
Copy link
Contributor

nak3 commented Jun 29, 2022

This repo has two conformance tests:

  1. the test with net-gateway-api controller. (test/kind-e2e-contour.sh)
  2. the test without net-gateway-api controller (test/kind-conformance-contour.sh)

This PR has 1, but we still need to add a test for 2. I think we can do it in another PR but just wanted to mention about it.

Actually 2 still needs to add the test code into test/conformance/ingressv2 and enable it in run.go.
The procedure is described in DEVELOPMENT.md.

@@ -28,6 +28,7 @@ import (
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/types"
clientgotesting "k8s.io/client-go/testing"
"k8s.io/utils/pointer"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: knative/pkg/ptr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why prefer knative.dev packages to the upstream k8s ones?

@evankanderson
Copy link
Contributor Author

/hold

After discussion and testing, it turns out that we can create new Gateway resources in both contour and istio implementations which share the same IP address, so I'm going to re-work this to create new Gateway resources for each KIngress with TLS, rather than adding onto an existing Gateway.

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 29, 2022
@evankanderson
Copy link
Contributor Author

/hold cancel

Gateway merging doesn't actually work for at least Contour, and if you have two Gateways, there's no clear way to determine which Gateway the ones without addresses merge to.

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 30, 2022
@evankanderson
Copy link
Contributor Author

evankanderson commented Jul 1, 2022

/hold

@carlisia asked that #315 go in first.

I'm going to see whether I can get listener cleanup in while she's working.

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 1, 2022
Added some (defaulted) fields to HTTPRoute to avoid controller looping.
@evankanderson
Copy link
Contributor Author

I added some more code, we now have distinctly-named listeners of the form kni-$INGRESS_UID, and a finalizer to clean them up.

/assign @carlisia

@dprotaso @nak3 not sure if you want to take (another?) look.

@nak3
Copy link
Contributor

nak3 commented Jul 6, 2022

/test integration-tests_net-gateway-api_main

LGTM. Thank you!

@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 11, 2022
Copy link
Contributor Author

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed one comment earlier.

Also, this is now merged with #315, and unit tests should be passing. I'm checking e2e now, but this is ready for review.

@carlisia @dprotaso @nak3

I wouldn't mind this getting in before nightly testing / as a candidate for 1.6...

@@ -28,6 +28,7 @@ import (
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/types"
clientgotesting "k8s.io/client-go/testing"
"k8s.io/utils/pointer"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why prefer knative.dev packages to the upstream k8s ones?

@evankanderson
Copy link
Contributor Author

Ugh, I've somehow screwed up just the kind e2e tests...

@evankanderson
Copy link
Contributor Author

/hold

(sorting out the timeout...)

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 11, 2022
@evankanderson
Copy link
Contributor Author

I missed the rename ReferencePolicy -> ReferenceGrant in the ClusteRole for the controller, so it couldn't manage the resources it thought it needed to create.

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 11, 2022
@evankanderson
Copy link
Contributor Author

Tests run locally seem to pass now... (I had a cleanup issue I was debugging when I pushed before the /hold, so I figured I'd get the CI tests in parallel).

@evankanderson
Copy link
Contributor Author

Hmm, it looks like the conformance tests in the e2e tests are unhappy; I can repro locally with Istio, but not with Contour. Debugging...

@carlisia
Copy link
Member

The PR changes lgtm. I'll standby...

@@ -75,6 +75,8 @@ func makeHTTPRouteSpec(
namespacedNameGateway := gatewayConfig.Gateways[rule.Visibility].Gateway

gatewayRef := gatewayv1alpha2.ParentReference{
Group: (*gatewayv1alpha2.Group)(&gatewayv1alpha2.GroupVersion.Group),
Kind: (*gatewayv1alpha2.Kind)(pointer.String("Gateway")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the Port not need to be added as well?

@carlisia
Copy link
Member

carlisia commented Jul 11, 2022

Whoa side note: I was wondering where and how ISTIO_VERSION was being used since I couldn't find it anywhere in the code base. It is used by the Istio script for its installation (https://raw.githubusercontent.com/istio/istio/master/release/downloadIstioCtl.sh).

Here are the changes to the latest version, v1.14.1: Istio / Announcing Istio 1.14.1. I wonder if this update would help with the tls stuff.

PS: I'll document the info about the Istio version.


Since the Istio version being currently used (1.13.2), this other update also adds a TLS related fix:
Istio / Announcing Istio 1.13.5

@carlisia
Copy link
Member

Also: Istio has not had a release since upgrading their code to use v0.5.0-rc1 (gateway-api: upgrade to v0.5.0-rc1 by howardjohn · Pull Request #39506 · istio/istio).

@carlisia
Copy link
Member

…sn't support ReferenceGrant yet, and there's no converter.
@evankanderson
Copy link
Contributor Author

I foolishly attempted to use the new ReferenceGrant API immediately; I've since mended my ways and switched back to using ReferencePolicy.

Pro tip when doing a merge like this:

  1. Make sure the e2e test passes before upgrade
  2. Upgrade the code and run the e2e test on the pre-merge controller with post-merge tests.
  3. Redeploy the controller and verify the e2e tests continue to pass.

Once I did this, triage of the tests was a lot easier, because it was clear that it was something in the controller code, not the merge, that had broken the tests.

@evankanderson
Copy link
Contributor Author

/hold cascel

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 12, 2022
@evankanderson
Copy link
Contributor Author

Tests, including TLS tests, now pass locally.

@carlisia
Copy link
Member

This is great, thank you! 🎉

I will add an issue to switch to ReferenceGrant once the providers are updated to support it. And I'll add your tips to the dev documentation.

Thanks for creating all the issues for followup work. Wanted to surface this one comment, I think it requires an issue as well? #316

/lgtm
/approve

🚀

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jul 12, 2022
@carlisia
Copy link
Member

/approve

@knative-prow
Copy link

knative-prow bot commented Jul 12, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carlisia, evankanderson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [carlisia,evankanderson]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@carlisia
Copy link
Member

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 12, 2022
@knative-prow knative-prow bot merged commit 4e987da into knative-extensions:main Jul 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/enhancement lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clean up Gateway Spec.Listeners Add TestIngressTLS test
5 participants