GEP-1651: Gateway Routability #1653

dprotaso · 2023-01-16T21:46:27Z

/kind gep

What this PR does / why we need it:
Support different types of routable Gateways

Which issue(s) this PR fixes:

Supports (but does not resolve) #1651

k8s-ci-robot · 2023-01-16T21:46:35Z

Hi @dprotaso. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

dprotaso · 2023-01-16T21:49:05Z

Going to leave this as draft - until I propose an implementation/API - @evankanderson had a simple proposal in the discussion but wanted to dwell on it a bit before committing it to this GEP

dprotaso · 2023-01-16T23:38:52Z

/ok-to-test

robscott

Thanks for starting this @dprotaso! Have some questions about scoping here, especially around some of the edge cases about what "external" means in this context.

site-src/geps/gep-1651.md

robscott · 2023-02-13T20:06:02Z

@dprotaso added this to agenda for community meeting today, want to try to figure out where this fits between a theoretical future ClusterIP GatewayClass, and the broader goal of providing shared config for all in-cluster implementations that are currently mapping a Gateway to a Service type=LB.

howardjohn · 2023-02-13T23:30:49Z

Discussed on Zoom: this seems like a specialized case of a more general problem: how do we customize the generated resources (commonly Deployment/Service, maybe others) for in-cluster deployments? See also #1713 and #1355.

Currently I think most implementations do this in different ways. It may make sense to have a unified approach here rather than tackling one specific use case.

robscott · 2023-02-14T00:19:52Z

Another follow up from community meeting. I think we have 3 questions to answer:

1. Should we group this together with broader efforts to configure how Gateways are translated to Services (and maybe Deployments)? That's really @howardjohn's question above.
2. Where should this config live? Is this something that should be per-Gateway or per-GatewayClass? In the case of cloud LBs, you're often talking about entirely different types of LB depending on if they're internal or external.
3. Should we differentiate between same-cluster and same-network? For example, many in-cluster implementations will only be able to implement same-cluster, while many cloud providers will only be able to implement same-network.

bowei · 2023-02-14T00:43:26Z

+1 on seeing if this fits into a category of standardizing the experience for the user of Gateway proxies/infrastructure that is built and deployed using standard K8s.

Regarding whether GatewayClass or Gateway -- this really rests on which role you think should be determining these parameters. In our most typical use case, the person owning the lifecycle of the GW would not also be configuring # of replicas etc. They may give hints as to the capacity they need, but would not be the one fixing the actual resource values.

robscott · 2023-02-14T00:44:16Z

Here's an attempt to answer the questions from the meeting I commented above:

Another follow up from community meeting. I think we have 3 questions to answer:

Should we group this together with broader efforts to configure how Gateways are translated to Services (and maybe Deployments)? That's really @howardjohn's question above.

At a minimum, we should try to list out what users commonly want to configure here. Determining what those features are and how often they're likely to vary would help us decide if that's a separate effort or directly linked with this.

Where should this config live? Is this something that should be per-Gateway or per-GatewayClass? In the case of cloud LBs, you're often talking about entirely different types of LB depending on if they're internal or external.

I'm biased because GKE already has different GatewayClasses for internal and external load balancers. If we were to include this on GWC, we could add something like a scope field that had the following options: External|SameNetwork|SameCluster.
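As a rough illustration of the suggestion above, a class-level scope might look something like the sketch below. The `scope` field is hypothetical and not part of the Gateway API; the class name and controller name are also made up for the example.

```yaml
# Hypothetical sketch: `scope` is NOT an existing GatewayClass field.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: GatewayClass
metadata:
  name: internal-lb                                 # assumed class name
spec:
  controllerName: example.com/gateway-controller    # assumed controller
  scope: SameNetwork    # hypothetical: External | SameNetwork | SameCluster
---
# A Gateway would pick its reachability by choosing the class.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: prod-internal
spec:
  gatewayClassName: internal-lb
  listeners:
  - name: http
    protocol: HTTP
    port: 80
```

Under this shape, switching a Gateway between internal and external reachability would mean pointing it at a different GatewayClass rather than flipping a field on the Gateway.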

While I can see how this could also fit on the Gateway itself, to me this feels more fundamentally about the type/class of Gateway that is being used than a switch on the Gateway.

I'm still not really sure how this would interact with the potential of a future "ClusterIP" GatewayClass, but I think this may be a bit less likely to conflict with that.

Should we differentiate between same-cluster and same-network? For example, many in-cluster implementations will only be able to implement same-cluster, while many cloud providers will only be able to implement same-network.

I think these probably should be different values. It's possible that in-cluster implementations would be able to configure internal LBs on some providers that effectively provided "SameNetwork" behavior.

site-src/geps/gep-1651.md

howardjohn · 2023-02-14T16:06:10Z

Regarding whether GatewayClass or Gateway -- this really rests on which role you think should be determining these parameters. In our most typical use case, the person owning the lifecycle of the GW would not also be configuring # of replicas etc. They may give hints as to the capacity they need, but would not be the one fixing the actual resource values.

FWIW this is not how many users use our product, at least. See https://istio.io/latest/docs/setup/additional-setup/gateway/#dedicated-application-gateway and #567.

I think my views on GatewayClass haven't necessarily been congruent with the project's position, but in my mind we can draw an analogy to Deployments (which is not a stretch, we literally make a Deployment :-) ):

GatewayClass: deploy the meta-infra. i.e. Deployment controller in controller-manager.
Gateway: similar to Deployment itself, configures a deployment; all configuration for a given instance is here

I know there is "params" on GatewayClass, but I think this is an anti-pattern for most use cases. This would be akin to adding a cluster-scoped "ResourceClass" resource that holds cpu/memory requests and having Deployments reference it instead of inlining them. This isn't a terrible idea, but it feels more like a higher-level abstraction that a PaaS may offer, while I would expect Kubernetes to be a bit lower level. Similarly, having the ability to configure things at a higher level seems nice for Gateway, but being able to configure them on a per-Gateway basis remains important.

I would also like to point out that the "1-2 Gateways per cluster" is a common case, but not the only one. We expect to have 100s of Gateways in a cluster which all may have bespoke config. Making each of these require a GatewayClass is problematic.

pleshakov · 2023-02-14T23:30:03Z

Could introducing parametersRef in the Gateway help with per-Gateway implementation-specific configuration?
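One way to picture this suggestion is the sketch below, which assumes the field shape of the parametersRef later added under Gateway.spec.infrastructure (#2924, referenced in this PR's timeline). The class name, the ConfigMap name, and the idea of using a ConfigMap at all are assumptions for illustration; what an implementation accepts as parameters is implementation-specific.

```yaml
# Sketch only: names are hypothetical; the referenced object and its
# contents would be defined by the implementation, not by Gateway API.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: prod-web
spec:
  gatewayClassName: acme-lb
  infrastructure:
    parametersRef:
      group: ""                    # core API group
      kind: ConfigMap
      name: prod-web-gateway-config    # assumed name
  listeners:
  - name: http
    protocol: HTTP
    port: 80
```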

dprotaso · 2023-02-21T17:21:04Z

Expanding scope of the GEP

Should we group this together with broader efforts to configure how Gateways are translated to Services (and maybe Deployments)? That's really @howardjohn's question above.

I think it's premature to suggest this use case warrants an expansion in scope to cover the definition of the Gateway's deployment. It feels like this locks us into a 1:1 mapping between a Gateway and its underlying k8s resources.

I worry this would interfere with my proposal to define how multiple gateways can be merged. See #1713

Empirically, Knative's deployment of Istio makes use of a single deployment to handle cluster local traffic AND traffic from the internet. This is accomplished by using multiple K8s Services (one type: LoadBalancer and the other type: ClusterIP) each pointing to different ports on Istio's K8s deployment.
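The two-Service arrangement described above could look roughly like this. All names, labels, and port numbers are assumptions for illustration and are not taken from Knative's or Istio's actual manifests.

```yaml
# Sketch: two Services selecting the same gateway pods on different ports.
apiVersion: v1
kind: Service
metadata:
  name: gateway-external           # assumed name
spec:
  type: LoadBalancer               # reachable from outside the cluster
  selector:
    app: shared-gateway            # assumed pod label
  ports:
  - name: http
    port: 80
    targetPort: 8080               # assumed port serving external traffic
---
apiVersion: v1
kind: Service
metadata:
  name: gateway-cluster-local      # assumed name
spec:
  type: ClusterIP                  # reachable only inside the cluster
  selector:
    app: shared-gateway            # same pods as above
  ports:
  - name: http
    port: 80
    targetPort: 8081               # assumed port serving cluster-local traffic
```

Because both Services select the same pods, a single Deployment backs both the cluster-local and the internet-facing entry points, which is why a strict 1:1 Gateway-to-Deployment mapping does not fit this setup.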

If a 1:1 mapping between Gateway => (Service & Deployment) is required we would have to push scope to be defined at the listener level and then add a corresponding scope property on the GatewayStatus.Addresses to figure out the address.

i.e.

# Bad example
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: prod-web
spec:
  gatewayClassName: acme-lb
  listeners:  
  - protocol: HTTP
    port: 80
    scope: ClusterLocal
    name: prod-internal-gw
  - protocol: HTTP
    port: 80
    scope: Public
    name: prod-web-gw
status:
  addresses: 
  - type: IPAddress
    scope: Public
    value: 53.457.48.54
  - type: IPAddress
    value: 10.0.0.13
    scope: ClusterLocal

GatewayClass or Gateway

Where should this config live? Is this something that should be per-Gateway or per-GatewayClass? In the case of cloud LBs, you're often talking about entirely different types of LB depending on if they're internal or external.

GatewayClass seems to exist solely for providing implementation-specific configuration, given that it's so underspecified. Gateway is advertised as 1:1 with the lifecycle of the infrastructure's configuration, and since the proposed property influences the infrastructure being configured, I lean toward it living on the Gateway.

From howardjohn

Similarly, having the ability to configure things at a higher level seems nice for Gateway, but being able to configure them on a per-Gateway basis remains important.

Yeah +100

From howardjohn

I would also like to point out that the "1-2 Gateways per cluster" is a common case, but not the only one. We expect to have 100s of Gateways in a cluster which all may have bespoke config. Making each of these require a GatewayClass is problematic.

Knative could easily have 1000s of Gateway resources that are merged (because of unique certs issued by HTTP01).

Same Cluster or Same Network

Should we differentiate between same-cluster and same-network? For example, many in-cluster implementations will only be able to implement same-cluster, while many cloud providers will only be able to implement same-network.

@robscott Can you elaborate on "while many cloud providers will only be able to implement same-network"? What is considered same-network, and what are the boundaries?

Given that I even had to ask that question, I think SameCluster/ClusterIP is a very straightforward definition with clear boundaries for users. I would encourage cloud providers to support this.

dprotaso · 2023-06-16T19:19:10Z

sorry folks - I rebased to resolve the nav conflict

dprotaso · 2023-06-16T19:30:41Z

3 ERROR: failed to copy: httpReadSeeker: failed open: content at https://mirror.gcr.io/v2/library/golang/manifests/sha256:6fb612aac0ae076bd4f6a76e48c4c8e59a4bae89dc5201252ec2b4eb8a2ae2a0?ns=docker.io not found: not found

weird
/retest

dprotaso · 2023-06-19T14:20:07Z

gcr transient error

/retest

robscott · 2023-06-20T20:30:06Z

Thanks @dprotaso! This LGTM, but leaving for @youngnick to give final LGTM.

/approve

k8s-ci-robot · 2023-06-20T20:30:19Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, howardjohn, robscott, shaneutt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [robscott,shaneutt]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

youngnick · 2023-06-26T14:54:41Z

After one last pass, yes, this LGTM

/lgtm

robscott · 2023-06-26T18:04:49Z

/hold cancel

dprotaso · 2023-06-26T21:22:28Z

wow no way - thanks everyone!

k8s-ci-robot added the kind/gep (PRs related to Gateway Enhancement Proposal (GEP)) and cncf-cla: yes (the PR's author has signed the CNCF CLA) labels Jan 16, 2023

k8s-ci-robot added the needs-ok-to-test (a PR that requires an org member to verify it is safe to test) and size/M (a PR that changes 30-99 lines, ignoring generated files) labels Jan 16, 2023

k8s-ci-robot requested review from bowei and shaneutt January 16, 2023 21:46

dprotaso marked this pull request as draft January 16, 2023 21:47

k8s-ci-robot added the do-not-merge/work-in-progress label (the PR should not merge because it is a work in progress) Jan 16, 2023

dprotaso changed the title ~~Initial GEP draft with goals/non-goals, intro and references~~ GEP-1651: Cluster local Gateways Jan 16, 2023

k8s-ci-robot added the ok-to-test label (a non-member PR verified by an org member as safe to test) and removed the needs-ok-to-test label Jan 16, 2023

robscott reviewed Jan 18, 2023

site-src/geps/gep-1651.md (outdated, resolved)

dprotaso mentioned this pull request Jan 19, 2023

[Upstream] Gateway API supports cluster-local Gateway knative-extensions/net-gateway-api#369

Open

k8s-ci-robot added the size/L label (a PR that changes 100-499 lines, ignoring generated files) and removed the size/M label Feb 13, 2023

dprotaso marked this pull request as ready for review February 13, 2023 16:04

k8s-ci-robot removed the do-not-merge/work-in-progress label Feb 13, 2023

k8s-ci-robot requested a review from keithmattix February 13, 2023 16:04

arkodg reviewed Feb 14, 2023

site-src/geps/gep-1651.md (outdated, resolved)

dprotaso added 9 commits June 16, 2023 15:17

7418f43 if mutation can't be prevented then incompatible routability should set conditions to false

d6a6eec Change Invalid reason to UnsupportedRoutability

b28c5f6 Use a reason UnsupportedRoutability if the Gateway doesn't support it

669a9f2 add a section about in-cluster gateway not support private routability

00062b5 fix indentation

bc8909a Annotate types with godoc

f58ed6c fix formatting/linking

45ab2d4 drop Ready condition and allow old gateways to be present if mutability fails

08cbe55 add to nav

dprotaso force-pushed the gep-1651 branch from 7dbcdd7 to 08cbe55 Compare June 16, 2023 19:18

k8s-ci-robot removed the needs-rebase label (the PR could not be merged because it had merge conflicts with HEAD) Jun 16, 2023

dprotaso mentioned this pull request Jun 24, 2023

Gateway::Status::Addresses has a new unique type []GatewayStatusAddress #2144

Merged

k8s-ci-robot assigned youngnick Jun 26, 2023

k8s-ci-robot added the lgtm label ("Looks good to me"; the PR is ready to be merged) Jun 26, 2023

k8s-ci-robot removed the do-not-merge/hold label (set when someone has issued a /hold command) Jun 26, 2023

k8s-ci-robot merged commit d03200b into kubernetes-sigs:main Jun 26, 2023
9 checks passed

dprotaso deleted the gep-1651 branch June 26, 2023 21:22

dprotaso mentioned this pull request Jul 11, 2023

[GEP 1651]: Routability Conformance #2171

Merged

shaneutt mentioned this pull request Jul 24, 2023

shaneutt/1822 shaneutt/gateway-api#3

Closed

shaneutt mentioned this pull request Aug 7, 2023

update owner aliases shaneutt/gateway-api#4

Closed

shaneutt mentioned this pull request Sep 18, 2023

shaneutt/conformance tests for gateway addresses shaneutt/gateway-api#5

Closed

mikemorris mentioned this pull request Apr 3, 2024

Add ParametersRef to Gateway.spec.infrastructure #2924

Merged
