Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: support requests forwarding #918

Merged
merged 2 commits into from Jan 28, 2019
Merged

feature: support requests forwarding #918

merged 2 commits into from Jan 28, 2019

Conversation

Traumeel
Copy link
Contributor

Introduces requests forwarding with a new backend <forward>

Example usage:

  • * -> <forward>
  • * -> setRequestHeader("X-Passed-Skipper", "true") -> <forward>;

Fixes #908

TCP tunnelling is not covered.

@hjacobs
Copy link
Contributor

hjacobs commented Jan 17, 2019

Thanks for the PR!

@hjacobs hjacobs requested a review from szuecs January 17, 2019 20:03
@szuecs
Copy link
Member

szuecs commented Jan 18, 2019

@Traumeel can you also add user documentation in docs/ and can you push the PR into the non fork skipper repository to a branch for example `feature/request-forward-proxy, please?

proxy/doc.go Outdated Show resolved Hide resolved
Copy link
Member

@szuecs szuecs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR: code looks good, tests added, developer docs, too

Thanks for the nice PR. I would like to test it before merge.

proxy/proxy.go Show resolved Hide resolved
@grassator
Copy link
Contributor

This extends eskip language with a completely new keyword. In my view if we are changing the language a more robust strategy for the future would be to allow filters to act as backends. I remember discussing this with @aryszka briefly. The change required would be much bigger of course, but would help generalize the setup and also allow for nice features such as dynamic backends.

@aryszka
Copy link
Contributor

aryszka commented Jan 18, 2019

thanks a lot for the nice PR!

@aryszka
Copy link
Contributor

aryszka commented Jan 18, 2019

👍

@aryszka
Copy link
Contributor

aryszka commented Jan 18, 2019

@grassator right, dynamic backends would be a nice addition. Though I would still distinguish the backends from the filters in the syntax. Anyway, a POC is always welcome 😄

@aryszka
Copy link
Contributor

aryszka commented Jan 18, 2019

@grassator this discussion #556 is also related to the dynamic backends.

@Traumeel
Copy link
Contributor Author

@aryszka sorry, updated PR. Trying to handle missing scheme case.

@szuecs
Copy link
Member

szuecs commented Jan 18, 2019

@Traumeel I tried to make https work, but it does not work. It is doing a CONNECT instead of GET, which is not implemented right now. If you are fine with http backend only we can merge it, if not there is more to do.
For CONNECT handling it might make sense to do it similar to https://github.com/zalando/skipper/blob/master/proxy/proxy.go#L678 and basically connect the socket and connect the sockets similar to https://github.com/zalando/skipper/blob/master/proxy/upgrade.go

@Traumeel
Copy link
Contributor Author

@szuecs let's merge, covers my use case

@UltraNemesis
Copy link
Contributor

Why does the scheme for forwarding need to be set separately at a global level?

As I see it, this is a very basic forward proxy scenario being baked into what has so far been a reverse proxy. The fundamental difference with the request meant for a forward proxy is that the RequestURI will contain the full target URI including scheme and host while in normal cases, only path will be present. So, both target host and scheme should be obtained from RequestURI and used to make the forward request.

Also, I don't see the utility of handling "CONNECT" to enable tunneling even in future. The main USP of skipper is being able to conditionally modify request/responses in the process of routing. With tunneling, that ability will be removed as all traffic passing through will be encrypted with skipper only maintaining a tunnel.

@szuecs
Copy link
Member

szuecs commented Jan 18, 2019

@UltraNemesis I agree that skipper’s main purpose and power is reverse proxy, but we also support upgrade headers, such that you can use websocket or spdy (kubectl exec/attach/log with audit logging).
I think we should implement CONNECT headers if we have <forward> as special backend. We already have <shunt> to shortcut response without backend and <loopback> to apply routing with the modified http.Request object.
The current implementation is good enough for the current use case we have and does not hurt anyone. We can also later silently update to handle CONNECT without much impact on anyone. I would create an issue to not forget the CONNECT feature and postpone it for later, when I have more time to implement this or someone take it.

Do you see something wrong or missing?

@UltraNemesis
Copy link
Contributor

@szuecs What you say about CONNECT makes perfect sense from that perspective. In fact, if skipper is going be usable as a forward proxy, I think you should also think about having an MITM mode for this as an option so that requests can be intercepted and predicates/filters can be applied before forwarding even in the HTTPS mode.

What I don't still understand is the need for the default scheme flag. The RequestURI will have full URI including scheme and in any case, without CONNECT handling, all requests will have to be HTTP.

Also, with regard to my own use case, my scenario about was about being able to select the target host from a filter based on headers or body while the requests are still made with skipper as a reverse proxy. I had setup something like this at my work using an F5 BIG IP iRule and evaluating whether I can replace it with skipper and a custom filter.

The addition of the "forward" backend type makes this easier, but what we will still need to have is a method like SetTargetHost in the filter context that can be used to set a target host for a request and skipper should make the request to that host.

@szuecs
Copy link
Member

szuecs commented Jan 19, 2019

@UltraNemesis The following should work with forward in your case:

r: * -> setRequestHeader("Host", "foo.example.org") -> <forward>

With the use of setRequestHeader filter and instead of *, Predicates you can also build more complex routes as you wish, the question is if you need <forward> in this case, because you have to already know in advance what kind of Host header you need.

Answering inline:

@szuecs What you say about CONNECT makes perfect sense from that perspective. In fact, if skipper is going be usable as a forward proxy, I think you should also think about having an MITM mode for this as an option so that requests can be intercepted and predicates/filters can be applied before forwarding even in the HTTPS mode.

Sounds good to me, CONNECT does not need to be only TCP pass through.

What I don't still understand is the need for the default scheme flag. The RequestURI will have full URI including scheme and in any case, without CONNECT handling, all requests will have to be HTTP.

The problem is that the RequestURI only contains path and query, there is no Scheme if you get a Request object passed.
See this example:

% cat >main.go
package main

import (
        "fmt"
        "io/ioutil"
        "log"
        "net/http"
)

func runServer() {

        http.HandleFunc("/foo", func(w http.ResponseWriter, r *http.Request) {
                fmt.Fprintf(w, "Hello, '%q' '%s'", r.URL.Scheme, r.URL)
        })

        log.Fatal(http.ListenAndServe(":8081", nil))
}

func main() {
        go runServer()

        resp, err := http.Get("http://127.0.0.1:8081/foo?q=hello")

        if err != nil {
                log.Fatalf("Failed to Get: %v", err)
        }
        defer resp.Body.Close()
        body, err := ioutil.ReadAll(resp.Body)
        if err != nil {
                log.Fatalf("Failed to Get: %v", err)
        }
        fmt.Printf("%s\n", body)
}
[sszuecs@uho:/tmp/foo]% go run main.go 
Hello, '""' '/foo?q=hello'

Also, with regard to my own use case, my scenario about was about being able to select the target host from a filter based on headers or body while the requests are still made with skipper as a reverse proxy. I had setup something like this at my work using an F5 BIG IP iRule and evaluating whether I can replace it with skipper and a custom filter.

Can you explain your use case in a separate issue?
We can discuss there how this could be done.

The addition of the "forward" backend type makes this easier, but what we will still need to have is a method like SetTargetHost in the filter context that can be used to set a target host for a request and skipper should make the request to that host.

-> setRequestHeader("myheader", "myValue")

@UltraNemesis
Copy link
Contributor

@szuecs : The problem in the code you posted is that you are not making a forward proxy compatible HTTP request. Try setting the server as forward proxy

	proxyURL, _ := url.Parse("http://127.0.0.1:8081")
	client := &http.Client{}

	if proxyURL != nil {
		client.Transport = &http.Transport{
			Proxy: http.ProxyURL(proxyURL),
		}
	}
        resp, err := client.Get("http://127.0.0.1:8080/foo?q=hello")

I recommend reading through this

https://parsiya.net/blog/2016-07-28-thick-client-proxying---part-6-how-https-proxies-work/

When you make the request to a forward proxy, the specs require that the full Request URI including the scheme is sent instead of just relative URI like in normal cases.

Here is what the dump looks like

Direct Request

GET /foo?q=hello HTTP/1.1
Host: 127.0.0.1:8080
Accept-Encoding: gzip
User-Agent: Go-http-client/1.1

Proxy Request

GET http://127.0.0.1:8080/foo?q=hello HTTP/1.1
Accept-Encoding: gzip
User-Agent: Go-http-client/1.1

If skipper is going to act as a forward proxy with this implementation, it should go with the established standards.

@UltraNemesis
Copy link
Contributor

Here is code including a fully functional forward proxy implementation

package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"log"
	"net"
	"net/http"
	"net/http/httputil"
	"net/url"
	"time"
)

type targetHandler struct{}

type proxyHandler struct{}

func main() {
	log.Println("Starting Proxy Server")
	go runProxyServer()

	log.Println("Starting Target Server")
	go runServer()

	proxyURL, _ := url.Parse("http://127.0.0.1:8888")

	log.Println("Make Direct HTTP Request")
	makeRequest(nil)

	log.Println("Make Proxy HTTP Request")
	makeRequest(proxyURL)
}

func makeRequest(proxyURL *url.URL) {
	client := &http.Client{}

	if proxyURL != nil {
		client.Transport = &http.Transport{
			Proxy: http.ProxyURL(proxyURL),
		}
	}

	resp, err := client.Get("http://127.0.0.1:8080/foo?q=hello")

	if err != nil {
		log.Fatalf("Failed to Get: %v", err)
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("Failed to Get: %v", err)
	}
	log.Printf("%s\n", body)
}

func runServer() {
	log.Fatal(http.ListenAndServe(":8080", &targetHandler{}))
}

func runProxyServer() {

	http.ListenAndServe(":8888", &proxyHandler{})
}

func (h *targetHandler) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
	bytes, _ := httputil.DumpRequest(req, true)

	fmt.Println(string(bytes))

	fmt.Fprintf(rw, "Hello, '%q' '%s'", req.URL.Scheme, req.URL)
}

func (h *proxyHandler) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
	if req.Method == http.MethodConnect {
		handleTunneling(rw, req)
	} else {
		handleHTTP(rw, req)
	}

}

func handleHTTP(rw http.ResponseWriter, req *http.Request) {
	log.Printf("Request URI : %s", req.RequestURI)

	bytes, _ := httputil.DumpRequest(req, true)

	fmt.Println(string(bytes))

	newReq, _ := http.NewRequest(req.Method, req.RequestURI, req.Body)

	copyHeader(newReq.Header, req.Header)

	resp, err := http.DefaultClient.Do(newReq)

	if err != nil {
		log.Println(err)
	} else {
		copyHeader(rw.Header(), resp.Header)

		io.Copy(rw, resp.Body)

	}

}

func handleTunneling(rw http.ResponseWriter, req *http.Request) {
	target, err := net.DialTimeout("tcp", req.Host, 10*time.Second)

	if err != nil {
		http.Error(rw, err.Error(), http.StatusServiceUnavailable)
		return
	}

	rw.WriteHeader(http.StatusOK)

	hijacker, ok := rw.(http.Hijacker)

	if !ok {
		http.Error(rw, "Hijacking not supported", http.StatusInternalServerError)
		return
	}

	source, _, err := hijacker.Hijack()

	if err != nil {
		http.Error(rw, err.Error(), http.StatusServiceUnavailable)
	}
	go pipe(target, source)
	go pipe(source, target)
}

func pipe(target io.WriteCloser, source io.ReadCloser) {
	defer target.Close()
	defer source.Close()
	io.Copy(target, source)
}

func copyHeader(dst, src http.Header) {
	for k, vv := range src {
		for _, v := range vv {
			dst.Add(k, v)
		}
	}
}

@Traumeel
Copy link
Contributor Author

@UltraNemesis thx for the nice read.

https://parsiya.net/blog/2016-07-28-thick-client-proxying---part-6-how-https-proxies-work/

When you make the request to a forward proxy, the specs require that the full Request URI including the scheme is sent instead of just relative URI like in normal cases.

So this should hold true for a proxy aware client, in case it is not URIs will be relative. If it is the case we need to reconstruct it.

Going to change the PR to handle it

@UltraNemesis
Copy link
Contributor

@Traumeel Yes, a properly implemented forward proxy aware client will always send the request in that format. You can count on it being that way.

To be honest, you don't really need to read the scheme at all. Any forward proxy aware client will send a CONNECT request to establish a tunnel if the target URL scheme is HTTPS . So, if you get anything other than a CONNECT, it will be HTTP.

@Traumeel
Copy link
Contributor Author

So, by default forwarding will work for HTTP only.

To access HTTPS URLs forward-scheme-https flag can be used, it will upgrade all requests with <forward> backends to HTTPS. This basically mimics the current behaviour of HTTP endpoint backend, where you can set the scheme explicitly. Proxy receives HTTP request from a client, adds headers (optionally) and sends HTTPS request to the target.

@UltraNemesis
Copy link
Contributor

Then, it is no longer a proper forward or reverse proxy scenario.

If you configure skipper as a forward proxy, A forward proxy aware client will automatically send a CONNECT request for HTTPS URLs. So, if you have HTTPS endpoint, you will have to manually change the request to HTTP first at the client side before sending it to skipper just so that skipper will accept it then change it HTTPS based on the flag. Further more, you have no control as its a global scheme.

It is no good as a reverse proxy case either as you would need to construct a custom HTTP request with Host header set to actual target rather than skipper and then send it to skipper. Such an implementation would screw up any chance to support SNI and Virtual Host based TLS in skipper in future.

So, I highly recommend sticking to the specs for out of the box behavior be it as a forward proxy ore reverse proxy. If any custom/overriding behavior like this is required, provision for that should be made as part of the filters framework.

In fact, I too have a pretty similar custom requirement that would fit on top of this. In my case, I want to make a request to skipper and I want to decide based on headers or body content which destination I want to send it to. Only difference from your case is that you just want to modify the scheme in the URL rather than whole URI. This is best fit into the filter system. Have a method in filter context which allows setting the Target Backend for the request.

@szuecs
Copy link
Member

szuecs commented Jan 22, 2019

@UltraNemesis as far as I get @Traumeel, the use case is to have a "malicious" DNS server that responds to all lookups to point to skipper, so clients are not aware to have a proxy MITM.
I think <forward> should be more the backend for the CONNECT case and we should have some other name for it. The Problem with the Scheme in the request is that skipper acting as MITM proxy does not know which backend protocol to use.

There are multiple options here:

  1. We could have a filter that copies a header to the statebag and the backend <mitm> could restore the value from the statebag.
  2. We could have a filter that copies a header to the statebag and a filter statebag to header and the backend <mitm> could restore the value from a hardcoded header, which could enable similar use cases as for example linkerd has with the overwrite route from a request (in our case we would have to set the backend to enable this behavior). This would be similar to the lbDecide() filter and LBMember() that do loadbalancing decision and the dropRequestHeader("X-Load-Balancer-Member") to clean up the internal decision.

@aryszka
Copy link
Contributor

aryszka commented Jan 22, 2019

I tend to agree with @UltraNemesis that following the standards as the default is more future proof.

As I understand, @Traumeel 's use case, it would be required to accept HTTP requests and forward them to HTTPS. Is this correct?

For the backend host, I would consider something like this:

  1. try to use the RequestURI
  2. if it RequestURI doesn't contain the host, then fallback to the Host header

For the scheme, I would avoid using a global default scheme, instead I would create a filter that overrides the scheme of the incoming request:

* -> setForwardScheme("https") -> <forward>

@Traumeel would this approach work for your use case?

@Traumeel
Copy link
Contributor Author

Traumeel commented Jan 22, 2019

@aryszka

As I understand, @Traumeel 's use case, it would be required to accept HTTP requests and forward them to HTTPS. Is this correct?

Correct.
My current setup works in the following way:

  • Hosts configured with bind9 which routes request to Skipper
  • Skipper is behind ELB (HTTPS is terminated, ELB writes X-Forwarded-Proto header and can get scheme from there)
  • Skipper injects headers and forwards

The setups works nicely for a predefined set of hosts, but it is cumbersome to update it with new hosts.

For the scheme, I would avoid using a global default scheme, instead I would create a filter that overrides the scheme of the incoming request:

* -> setForwardScheme("https") -> <forward>

@Traumeel would this approach work for your use case?

If I understand this correctly setForwardScheme would be a builtin filter. In general I think this approach might work, but I need to propagate this setting to request mapping somehow. As far as my understanding goes there are two possibilities: extend FilterContext interface or use StateBag. In both cases it would be possible to reuse this functionality from a custom filter.

Thus I was thinking to go with extending FilterContext interface to provide a possibility to redefine both host and scheme from a filter. Which will probably work in the same way SetOutgoingHost works. In this way filters will get full control of the final destination. Not sure if there are any implications which I don't currently see with this approach or if there is a technical blocker somewhere, but looks good to me at the moment.

@aryszka @szuecs What do you think? This will solve my use case and @UltraNemesis

The functionality can optionally be enabled for only a special backend type e.g. <filter-defined-backend>

@szuecs
Copy link
Member

szuecs commented Jan 22, 2019

After an offline discussion with @Traumeel I have the following idea:
We restore all request data from headers of the http.Request.
Like this everyone can use filters to modify the backend target, if the backend is type <filter-defined-host> (or maybe better <request-defined-backend> ?).

What do we need to do the call to do the http.Request similar to the call the original client?

  • http.Request has all data + data from a proxy in between (X-Forwarded-*), but it has no http.Request.Scheme defined which is OSI layer below HTTP so the parser of the http data does not get data ti reconstruct http.Request.Scheme.
  • http.Request.Scheme can be restored by the value of X-Forwarded-Proto header rfc link, if there is a http-proxy in between. For other cases you can use plain filter to modify the http.Request.Headers to achieve the same goal.
  • The backend to call we can get from the Host header.

Wit this implementation you can even write your custom filter and lookup a database to change the final request being made to an arbitrary chosen backend. You also do not need setForwardScheme() filter, because you can use setRequestHeader("X-Forwarded-Proto", "https").

Any drawbacks with this you see @aryszka @UltraNemesis ?

@aryszka
Copy link
Contributor

aryszka commented Jan 22, 2019

@szuecs where would be the backend host taken from in this case? From the Host header?

docs/tutorials/basics.md Outdated Show resolved Hide resolved
eskip/doc.go Outdated

A network endpoint address example:
HTTP endpoint:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not change this, because it highlights more the "network", that maps in the code to NetworkBackend for example and this document is meant for library users that want to understand the code.

eskip/eskip.go Outdated Show resolved Hide resolved
filters/filters.go Outdated Show resolved Hide resolved
filters/filters.go Outdated Show resolved Hide resolved
"github.com/opentracing/opentracing-go"
"github.com/zalando/skipper/filters"
"net/http"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please drop this change

proxy/doc.go Outdated Show resolved Hide resolved
proxy/proxy.go Show resolved Hide resolved
docs/operation/operation.md Outdated Show resolved Hide resolved
@szuecs
Copy link
Member

szuecs commented Jan 25, 2019

@Traumeel the filters you implemented should also be registered in filters/builtin/builtin.go

Signed-off-by: Maksym Aryefyev <maksym.aryefyev@zalando.de>
@Traumeel
Copy link
Contributor Author

@szuecs new filters are registered in filters/builtin/builtin.go. Fixed comments.

@szuecs
Copy link
Member

szuecs commented Jan 25, 2019

@UltraNemesis maybe you want to check this again, if it's ok for you?

@aryszka
Copy link
Contributor

aryszka commented Jan 26, 2019

👍

@aryszka
Copy link
Contributor

aryszka commented Jan 26, 2019

@Traumeel thanks a lot for the PR

Copy link
Member

@szuecs szuecs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some small changes needed

filters/builtin/dynamicbackendfilter.go Show resolved Hide resolved
}
}

func (f *dynamicBackendFilter) Response(ctx filters.FilterContext) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need the names only the types in an empty function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to define so many state bag keys? I would say that we should have only one key for setting Backend Url and all other filters should just set this value accordingly.

To set scheme, all you need to do is take current backend url, change the scheme and set the new value back to backend url.

To take backend url form host, you read its value and set the value to Backend url key.

eskip/eskip.go Show resolved Hide resolved
@UltraNemesis
Copy link
Contributor

Do we need to define so many state bag keys? I would say that we should have only one key for setting Backend Url and all other filters should just map and set this value accordingly.

To set scheme, all you need to do is take current backend url, change the scheme and set the new value back to backend url.

To take backend url from host, you read its value and set the value to Backend url key.

Every thing can be handled through a single key.

@Traumeel
Copy link
Contributor Author

@aryszka you are welcome :) It almost got through

@aryszka
Copy link
Contributor

aryszka commented Jan 28, 2019

about the two statebag keys: @Traumeel @UltraNemesis , I don't have a strong opinion on it, I find both ways fine.

@szuecs
Copy link
Member

szuecs commented Jan 28, 2019

👍

@szuecs szuecs merged commit e40612e into zalando:master Jan 28, 2019
@szuecs
Copy link
Member

szuecs commented Jan 28, 2019

Thanks to all reviewers and @Traumeel for this PR !

AlexanderYastrebov added a commit that referenced this pull request Sep 21, 2021
The #1860 pointed out to a bug
golang/go#48362 that leads to false
positive testable example due to trailing comment.

Moreover #918 changed test
input without changing the output and that went unnoticed due to this bug.

This change removes trailing comment to fix the example.
It also removes the test input instead of adding correct output because
the input does not match the final implemenation and thus is misleading.

Fixes #1860

Signed-off-by: Alexander Yastrebov <alexander.yastrebov@zalando.de>
AlexanderYastrebov added a commit that referenced this pull request Sep 21, 2021
The #1860 pointed out a bug
golang/go#48362 that leads to the false
positive testable example caused by a trailing comment.

Moreover #918 changed the test
input without changing the output and that went unnoticed due to this bug.

This change removes trailing comment to fix the example.
It also removes the test input instead of adding a correct output because
the input syntax is incorrect therefore misleading.

Fixes #1860

Signed-off-by: Alexander Yastrebov <alexander.yastrebov@zalando.de>
AlexanderYastrebov added a commit that referenced this pull request Sep 23, 2021
The #1860 pointed out a bug
golang/go#48362 that leads to the false
positive testable example caused by a trailing comment.

Moreover #918 changed the test
input without changing the output and that went unnoticed due to this bug.

This change removes trailing comment to fix the example.
It also removes the test input instead of adding a correct output because
the input syntax is incorrect therefore misleading.

Fixes #1860

Signed-off-by: Alexander Yastrebov <alexander.yastrebov@zalando.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants