Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding GEP-1748: Gateway API Interaction with Multi-Cluster Services #1843

Merged
merged 1 commit into from
Apr 6, 2023

Conversation

robscott
Copy link
Member

@robscott robscott commented Mar 17, 2023

Recommend reviewing deploy preview so examples are inlined: https://deploy-preview-1843--kubernetes-sigs-gateway-api.netlify.app/geps/gep-1748/.

What type of PR is this?
/kind gep

What this PR does / why we need it:
This formally defines the relationship between Gateway API and Multi-Cluster Services as part of #1748.

Does this PR introduce a user-facing change?:
I think the changelog notes should be saved for actual implementation PR, but open to ideas:

NONE

/hold for consensus
Want at least one LGTM each from SIG-MC, GW API maintainers, and GAMMA leads before removing hold. Feedback from everyone (including people not directly mentioned below) is very appreciated.

SIG-MC:
/cc @lauralorenz @JeremyOT @pmorie

GW API:
/cc @bowei @shaneutt @youngnick

GAMMA
/cc @howardjohn @keithmattix @mikemorris

@k8s-ci-robot k8s-ci-robot requested a review from bowei March 17, 2023 23:24
@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 17, 2023
@k8s-ci-robot k8s-ci-robot added the kind/gep PRs related to Gateway Enhancement Proposal(GEP) label Mar 17, 2023
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 17, 2023
@robscott robscott force-pushed the gep-1748 branch 3 times, most recently from c64f10a to 2cc1153 Compare March 17, 2023 23:59
@shaneutt shaneutt added this to the v1.0.0 milestone Mar 20, 2023
geps/gep-1748.md Show resolved Hide resolved

### Services vs ServiceImports

It is important that all implementations provide a consistent experience. That
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the crux of the issue for me, IMO. Everything else is (to me) obviously correct.

However, this is saying that if you use Gateway API (which, at this point, is the ~entire community), you MUST use MCS for multicluster semantics, and MCS is Alpha and has a some problems blocking adoption.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@howardjohn Could you expand a bit more on "problems blocking adoption"?

From Consul's perspective we do want a broader scope to enable routing to off-cluster destinations without a guarantee of "sameness" (sometimes described as "cross-clusterset"), but we're willing to start with an impl-specific approach for that and work towards expanding the MCS spec to hopefully support that in the future.

geps/gep-1748.md Outdated Show resolved Hide resolved
geps/gep-1748.md Outdated Show resolved Hide resolved
geps/gep-1748.md Outdated Show resolved Hide resolved
@youngnick
Copy link
Contributor

I will have comments on this, but I haven't had the bandwidth to review properly yet (requires me spending some more time on Cilium's existing Clustermesh to be able to talk about what it can do more intelligibly). Could we give me some time on this one please?

Copy link

@lauralorenz lauralorenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little nits suggested.

Representing SIG-Multicluster:
/lgtm

geps/gep-1748.md Outdated Show resolved Hide resolved
examples/standard/multicluster/httproute-gamma.yaml Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2023
@lauralorenz
Copy link

Sorry I did the github-native approve thing and it seems to mean something nowadays 😅 but the hold stands!

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 5, 2023
Copy link
Member

@shaneutt shaneutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On first review I don't have any specific immediate concerns with the contents of the GEP overall.

What I am concerned about is the launch straight to experimental, considering that we consider experimental to be something more like "next", and we don't even have like, conformance tests in mind yet?

geps/gep-1748.md Show resolved Hide resolved
@liwenwu-amazon
Copy link
Contributor

The GEP looks good!. In AWS VPC Lattice controller gateway API/multicluster implementation, VPC Lattice is the federation.

For serviceexport, the lattice controller register all of its endpoints in local cluster to VPC lattice

To reference a serviceimport as a backendRef of a HTTPRoute, the lattice controller will configure VPC lattice resource accordingly. So that the traffic will get routed to endpoints in that remote cluster(VPC) by VPC lattice dataplane.

In addition to K8S endpoints, serviceimport can references targets like EC2, lambda function and ECS container(s).

@youngnick
Copy link
Contributor

Because this is extended, this looks fine to me. I don't think we'll be doing this in Cilium for a while though.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 5, 2023
endpoints for the local Service within the same cluster.

```yaml
{% include 'standard/multicluster/httproute-gamma.yaml' %}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wonder if it's more suitable to define the serviceImport or service as the parentRef in the tcpRoute or udpRoute. To me, both of them are more like l4 resource, instead of l7.

Copy link
Contributor

@mikemorris mikemorris Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Service as a parentRef for TCPRoute is a viable use case, and is included in GEP-1426.

Supporting HTTPRoute is important for some more advanced configuration, such as redirecting only a specific path (like /admin) to a different Service backendRef when doing something like splitting up a monolith application.

### Mesh: ServiceImport as Parent

In some cases, you may want to override traffic destined for a Multi-Cluster
Service with a mesh. As part of the broader GAMMA initiative, ServiceImport can

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have any doc to describe how mesh integrates with gateway api?

Copy link
Contributor

@mikemorris mikemorris Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a short high-level summary is at https://gateway-api.sigs.k8s.io/contributing/gamma/, with more detail in GEP-1324 (describing the intent for expanding Gateway API to E/W and service mesh use cases) and GEP-1426 (describing a proposed concrete implementation).

Copy link
Contributor

@mikemorris mikemorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposed two alternative suggestions for GAMMA examples, but generally supportive of this with the change to

Services SHOULD always be interpreted as references to endpoints within the same cluster

Further comments on specific behavior can be found in #1843 (comment)

/lgtm

geps/gep-1748.md Outdated Show resolved Hide resolved
Comment on lines 7 to 16
spec:
parentRefs:
- group: multicluster.x-k8s.io
kind: ServiceImport
name: store
rules:
- backendRefs:
- kind: Service
name: store
port: 8080
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
spec:
parentRefs:
- group: multicluster.x-k8s.io
kind: ServiceImport
name: store
rules:
- backendRefs:
- kind: Service
name: store
port: 8080
spec:
parentRefs:
- group: multicluster.x-k8s.io
kind: ServiceImport
name: store
rules:
- matches:
- path:
value: "/cart"
backendRefs:
- group: multicluster.x-k8s.io
kind: ServiceImport
name: cart

After reading through the updated content, I think splitting off one path from a ServiceImport to a different ServiceImport would be a better example for illustrating ServiceImport as a parentRef than redirecting ClusterSetIP traffic to stay cluster-local.

Configuring traffic to a ServiceImport ClusterSetIP to instead stay cluster-local feels a bit odd (and possibly more natural to configure by either not exporting the service in the first place or blocking store.svc.clusterset.local with egress policy).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we additionally want to add an example for redirecting ClusterIP traffic to a Multi-Cluster Service, or should we wait on explicitly endorsing that?

  parentRefs:
  - kind: Service
    name: store
  rules:
  - backendRefs:
    - group: multicluster.x-k8s.io
      kind: ServiceImport
      name: store

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks! Opted to leave out the additional example since this is about to merge, and don't want to increase scope for others that have already approved. Probably worth a follow up though.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 6, 2023
@robscott
Copy link
Member Author

robscott commented Apr 6, 2023

Thanks for all the feedback everyone. I think we've got consensus here, going to remove the hold so the next LGTM merges.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 6, 2023
Copy link
Member

@shaneutt shaneutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 6, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: keithmattix, lauralorenz, mikemorris, robscott, shaneutt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shaneutt shaneutt modified the milestones: v1.0.0, v0.7.0 Apr 6, 2023
@k8s-ci-robot k8s-ci-robot merged commit b1db26f into kubernetes-sigs:main Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/gep PRs related to Gateway Enhancement Proposal(GEP) lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

10 participants