Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add redirect loop protection to HTTPRequestRedirect #1185

Open
rainest opened this issue Jun 6, 2022 · 15 comments
Open

Add redirect loop protection to HTTPRequestRedirect #1185

rainest opened this issue Jun 6, 2022 · 15 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@rainest
Copy link
Contributor

rainest commented Jun 6, 2022

What would you like to be added:
I want the HTTPRequestRedirect to include language indicating how implementations handle redirects that result in loops, where the request URL received is equal to the request URL returned by the filter.

I propose that the spec be amended to include something like

If the result of applying an HTTPRequestRedirect would be a response whose Location header value matches the original request URL, implementations MUST NOT return the redirect request, and MUST instead proxy the request upstream.

Is this behavior desirable, and can implementations honor it easily?

Why this is needed:
https://gateway-api.sigs.k8s.io/v1alpha2/references/spec/#gateway.networking.k8s.io/v1alpha2.HTTPRequestRedirectFilter defines an HTTPRoute Filter for issuing HTTP redirects in response that match a set of criteria. My strictest interpretation of the spec is that implementations must honor these absolutely: there are no matching criteria other than the HTTPRoute itself, and if a request matches that HTTPRoute, the implementation must issue the redirect.

Because the redirect filter applies to any matching request, you can easily configure HTTPRoutes that will result in a redirect loop: clients will send a request with some URL, receive a redirect response directing them to the same URL, send a new request to that URL, and receive another redirect to the same URL ad infinitum. For example, this HTTPRoute performs an HTTP->HTTPS redirect for any request to the redirect.example domain:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: http-filter-redirect
spec:
  hostnames:
    - redirect.example
  rules:
    - filters:
      - type: RequestRedirect
        requestRedirect:
          scheme: https
          statusCode: 301

With the naive implementation above, attaching this HTTPRoute to a Gateway with both HTTP and HTTPS Listeners would result in a loop: http://example.redirect/ will be redirected to https://example.redirect and https://example.redirect will also be redirected, back to itself, indefinitely.

AFAIK to avoid this you'd need to create separate HTTPRoutes, one with the redirect and one without, and either bind them to separate Gateways or use a combination of allowedRoutes configuration and namespaced backendRefs to bind them to only one Listener on a single Gateway. I may have overlooked something here, but I don't see another way to ensure they don't collide. I am not aware of any reason users would want to configure a redirect loop, since HTTP-compliant clients will never receive a useful response from a looping URL, so I believe the strict interpretation of the current spec only needlessly complicates configuration.

While HTTP->HTTPS redirects are probably the most common case where sensible configuration will result in a loop, this problem is generic to any redirect configuration: a regex path match that redirects /foo/* to /foo/example will also loop indefinitely, since /foo/example matches the expression. The proposed change avoids loops regardless of the redirect type.

@rainest rainest added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 6, 2022
@rainest
Copy link
Contributor Author

rainest commented Jun 6, 2022

From our (Kong) perspective, we are in favor of avoiding loops, and iffy on the implementation. We support HTTP->HTTPS redirects as a standard part of our routing configuration, and configuring these redirects on dual-stack routes does indeed proxy HTTPS requests upstream.

We do not have standard support for generic redirects, however, and the simplest means we have of implementing non-protocol redirects would be functionality that is not easily able to insert dynamic data from the request (to reuse parts of the original request not changed by the filter). I do not think implementing a proper generic and dynamic redirect functionality will be difficult for us however, it's just not something that's immediately available--we'd need to implement something new even without this change to the spec.

If I'm right that there's no way to avoid this other than separate Gateways or fun allowedRoutes tricks, I that we'd probably actually need an updated HTTPRequestRedirect API to make excludes explicit, something like:

requestRedirect:
  scheme: https
  statusCode: 301
  exclude:
    scheme: https

@rainest rainest changed the title Add redirect loop protection to URLRewriteFilter Add redirect loop protection to HTTPRequestRedirect Jun 6, 2022
@youngnick
Copy link
Contributor

I think that the "if the request is the same, the redirect does nothing" rule is a pretty good guideline, that can easily be extended if we add extra functionality in the future.

Additionally, in the case that we're making the redirect exclusions explicit, aren't they always going to be specifying a thing to change about the request (in that example, the scheme), and then a thing to exclude (the scheme again)? Why would you ever want to redirect to a new path, and then exclude the scheme, or redirect the scheme and exclude a path? I agree that in general we should be avoiding implicit config and preferring explicit, but I'm really struggling to see cases where the "if the request is the same, the redirect does nothing" rule doesn't do what we want, and isn't easier to write.

In the community call today, @bowei mentioned the use case of having a redirect also add a header (often used with a CDN). I'd contend that there are two ways we could address that use case using the "same request = no action" rule:

  • We promise that we'll never add anything other than things that can fit in a URI (scheme, hostname, port, path). This would make sense and kind of fit, given that the 3xx series only allow the Location field, which contains only a URI. In this case, setting headers would need to be done by a header modification filter before the redirect filter.
  • We add header modification to the redirect filter, and then ensure that "same request" includes headers. I don't like this idea, because of header addition not being very standard on 3xx responses.

rainest added a commit to rainest/gateway-api that referenced this issue Jun 7, 2022
Remove the paragraph describing redirect loop handling. The spec
behavior is still under discussion and remains undefined for now.

See kubernetes-sigs#1185
rainest added a commit to rainest/gateway-api that referenced this issue Jun 7, 2022
Remove the paragraph describing redirect loop handling. The spec
behavior is still under discussion and remains undefined for now.

See kubernetes-sigs#1185
rainest added a commit to rainest/gateway-api that referenced this issue Jun 7, 2022
Remove the paragraph describing redirect loop handling. The spec
behavior is still under discussion and remains undefined for now.

See kubernetes-sigs#1185
@pleshakov
Copy link
Contributor

I wonder if the HTTPS redirect use case is a suitable example for the redirect loop problem.

For example, what if an HTTPRoute has many rules. This means the user will need to configure the redirect filter for every rule, right?:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: foo-route
spec:
  parentRefs:
  - name: example-gateway # binds both to an HTTP and HTTPS listeners
  hostnames:
  - "foo.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /path1
    filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
    backendRefs:
    - name: svc-1
      port: 8080
  - matches:
    - path:
        type: PathPrefix
        value: /path2
    filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
    backendRefs:
    - name: svc-2
      port: 8080
  - matches:
    - path:
        type: PathPrefix
        value: /path3
    filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301
    backendRefs:
    - name: svc-3
      port: 8080

This looks more verbose than having two HTTPRoutes (one that explicitly references an HTTP listener of example-gateway with a redirect for path / and the other that explicitly references an HTTPS listener of example-gateway with all the paths defined), especially when an HTTPRoute has many many paths.

AFAIK to avoid this you'd need to create separate HTTPRoutes, one with the redirect and one without, and either bind them to separate Gateways or use a combination of allowedRoutes configuration and namespaced backendRefs to bind them to only one Listener on a single Gateway.

I wonder why configuring allowedRoutes and namespaced backendRefs is necessary. Could we just choose the right listener in parentRef of the HTTPRoute?

@rainest
Copy link
Contributor Author

rainest commented Jun 8, 2022

Could we just choose the right listener in parentRef of the HTTPRoute?

Yes--reviewing again while less tired, I glazed over sectionName because it wasn't named listener, but that will allow selecting the desired Listener directly.

This means the user will need to configure the redirect filter for every rule, right?

Yes, that seems inherent to having the filters specified at the rule level. In the case where you have many paths you probably would want to bind the path-differentiated HTTPRoute to your HTTPS Listener and then have a separate HTTP listener HTTPRoute without the paths, only the hostname and a blanket any path rule (or a shared prefix) with the redirect filter.

We do still allow binding to both HTTP and HTTPS Listeners, so the question is whether you have to create that split configuration to avoid loops or if you have the option to create a single HTTPRoute if you don't have many matches and/or aren't aware of the need to split.

we're making the redirect exclusions explicit, aren't they always going to be specifying a thing to change about the request (in that example, the scheme), and then a thing to exclude (the scheme again)?

My expectation is yes: if we have explicit exemption configurations, almost all would use the proposed implicit config. It'd technically allow you to do something silly like exempt only some subset of requests from redirects, but I can't imagine why you would. With that in mind, if we have explicit configuration for this, it makes more sense to add a general avoidLoops: true/false.

We promise that we'll never add anything other than things that can fit in a URI (scheme, hostname, port, path). This would make sense and kind of fit, given that the 3xx series only allow the Location field

This is my expectation as well (and the current behavior): the redirect filter only lets you control the contents of Location since that's all that https://datatracker.ietf.org/doc/html/rfc7231#section-6.4.2 specifies. @bowei do you have examples of what you're referring to? Offhand I don't know of any de facto standard where a 3XX response indicates the original URL in Location but also includes some header instructing the client to change something other than the URL in subsequent requests. If that exists, but is uncommon, would it possibly make sense to use ExtensionRef pointing to something very similar to HTTPRequestRedirectFilter, but with configuration for the header injection and implementation logic to allow loops for that custom filter while avoiding them in the standard HTTPRequestRedirectFilter?

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 6, 2022
@mikemorris
Copy link
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 16, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 15, 2022
@robscott
Copy link
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 16, 2022
@shaneutt shaneutt added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Dec 16, 2022
@shaneutt shaneutt added this to the v0.7.0 milestone Dec 16, 2022
@shaneutt shaneutt modified the milestones: v0.7.0, v1.0.0 Feb 21, 2023
@shaneutt shaneutt added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Feb 21, 2023
@Xunzhuo
Copy link
Member

Xunzhuo commented Mar 8, 2023

/assign

@nak3
Copy link
Member

nak3 commented Sep 5, 2023

Hello 👋 Is there any news about this?

@HummingMind
Copy link

HummingMind commented Nov 29, 2023

What a mess. I just spent a full day tryign to figure out how to redirect http://.../something to https://.../something.
There seems to be no way to do this with HTTPRoute without getting stuck in a redirect loop. Why would a request that is already HTTPS readirect again to HTTPS (if the host and the path are the same, but only the scheme needs to change)?

Makes no sense. 🤦🏻

@robscott
Copy link
Member

@HummingMind that definitely sounds frustrating. Redirect protection is significantly more complicated than it sounds, but our docs should make it more clear that you currently need a separate route for the redirect logic (https://gateway-api.sigs.k8s.io/guides/http-redirect-rewrite/#redirects).

@HummingMind
Copy link

@robscott I've read that section in the documentation and it did not clarify much.
I found an answer here: cilium/cilium#28796 (will be testing it shortly), where they talk about specifying a sectionName in the parentRefs, to attach to a specific listener for a Gateway, instead of the all the listeners. This, in turn, will require a separate HTTPRoute.

@dprotaso
Copy link
Contributor

dprotaso commented Feb 24, 2024

Curious what folks think about potentially matching on the scheme so that you wouldn't need two HTTPRoutes for redirects eg.

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: http-filter-redirect
spec:
  hostnames:
    - redirect.example
  rules:
    - matches:
      - scheme: http 
      filters:
      - type: RequestRedirect
        requestRedirect:
          scheme: https
          statusCode: 301

Here the new bit is

  - matches:
    - scheme: http 

Ideally we wouldn't need this but 🤷 - this seems explicity and not to difficult in my mind

@dprotaso
Copy link
Contributor

dprotaso commented Feb 24, 2024

Seems like there's an issue already for matching scheme via pseudo headers - #2455 - though that might only work for HTTP/2 ?

One thing I'm curious about though is it possible to spoof a scheme?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
None yet
Development

No branches or pull requests