Regex rewriting of upstream/downstream headers in proxy #2144
Hi @comp500 |
It could be used for modifying redirects for a mirrored site. For example, Wikipedia uses absolute redirects to en.wikipedia.org for the homepage and search. The directive could be used like this:
This would ensure that all redirects are for the correct host, rather than just the main page as in Matt's tweet. |
This is cool. Hopefully I or someone else gets a chance to review this soon! |
Is there any news regarding this pull request? Can I help somehow (e.g. by giving it a try in my environments and sharing the results)? |
{"proxy / localhost:8080 {\n transparent \nheader_upstream X-Test Tester \nheader_upstream X-Test Test Host \n}"},

// Test #3: transparent preset with multiple params
{"proxy / localhost:8080 {\n transparent \nheader_upstream X-Test Tester \nheader_upstream X-Test Test Host \nheader_upstream X-Test er ing \n}"},
@comp500
As I understand it, the first argument is the name of a header, the second is the regular expression matching what we want to replace, and the last is the replacement value.
Will it work if the last argument contains whitespace?
You are correct. It will work if any argument contains white spaces, because you can surround each argument in quotes, so the white spaces within are part of the argument.
e.g. header_downstream Location https://en.wikipedia.org "http://local host"
@lukenowak it would be great if you could test in your environment. |
UPD I'm trying to run it locally, but I get an error: This is what I did:
My Caddyfile
Did I miss something? Update: I verified this functionality and it works very well. |
Your Caddyfile seems to work fine for me. Try checking out the branch in the ~/go/src/github.com/mholt/caddy folder instead, as the build script might be using the wrong package. |
@tobya I was able to remove specific cookies that the browser sends to the upstream by using:
It is super cool that one header can be rewritten many times. So for me this feature is super useful and flexible. |
Not released yet functionality for regular expression cookie rewriting is available: caddyserver/caddy#2144
Thanks, this is definitely a useful change. I've only made a brief pass, but it's looking mostly good so far.
Add the extra parameter to header_upstream and header_downstream, and describe it. Give examples of it being used to change the Location header, possibly referencing proxy_redirect.
Can you write up the specific docs for this? This is a complex feature and it needs to be documented correctly. For example, are placeholders (captures) supported? To what extent can people use Go's regular expressions here?
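For context on the captures question, this is how Go's standard `regexp` package itself treats capture references in a replacement string. Whether the directive exposes this behaviour depends on which replace function the PR calls, which is not stated here; the function names below are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// langHost captures the language subdomain of a Wikipedia URL.
var langHost = regexp.MustCompile(`^https://([a-z]+)\.wikipedia\.org`)

// withCaptures uses ReplaceAllString, which expands ${1} to the first group.
func withCaptures(loc string) string {
	return langHost.ReplaceAllString(loc, "http://localhost/${1}")
}

// withoutCaptures uses ReplaceAllLiteralString, which inserts the
// replacement text verbatim.
func withoutCaptures(loc string) string {
	return langHost.ReplaceAllLiteralString(loc, "http://localhost/${1}")
}

func main() {
	loc := "https://en.wikipedia.org/wiki/Go"
	fmt.Println(withCaptures(loc))    // prints "http://localhost/en/wiki/Go"
	fmt.Println(withoutCaptures(loc)) // prints "http://localhost/${1}/wiki/Go"
}
```

Go's regexp syntax is RE2, so lookaheads and backreferences inside the pattern are not available regardless of how the replacement is handled.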
I've written some docs in this gist, to be added to https://caddyserver.com/docs/proxy. If anything is unclear in it, please comment on the gist. |
@mholt, this PR would be very useful for us and seems almost complete. Is there any way I can contribute to getting this merged? Thanks! |
@LaurensBosscher You could help by testing the changes. Check out the latest master and merge this PR locally, compiling from source. You can also review the code and point out any potentially questionable behaviour if any. |
Not released yet functionality for regular expression cookie rewriting is available: caddyserver/caddy#2144 Caddy v0.11.1 fixes QUIC issues. Zenexer fixes for regression in v0.11.1 Note: Since Caddy release v0.11.0 certificates have to match sites served by Caddy. This will result in lack of response for sites served on HTTPS for which certificate names (subject COMMON_NAME, subjectAltName DNS or IP) do not match site name.
Caddy v0.11.1 fixes QUIC issues. Not released yet functionality for regular expression cookie rewriting is available: caddyserver/caddy#2144 Not released yet functionality for ca_certifices in proxy: caddyserver/caddy#2380 Zenexer fixes for regression in v0.11.1 Note: Since Caddy release v0.11.0 certificates have to match sites served by Caddy. This will result in lack of response for sites served on HTTPS for which certificate names (subject COMMON_NAME, subjectAltName DNS or IP) do not match site name.
Very much interested in this PR 👍 |
@Ramshackles Any chance you would be able to compile locally and test this change? |
@comp500 are you willing to continue working on this? Rebase on top of master (trivial change)? @tobya I tried to run the test suite after rebasing this work on master, but my go environment seems "damaged" ( |
@lukenowak Based on that I bet you just need to update to Go 1.12 to fix the error. |
Not sure how I would go about doing it, but guess I could try if needed. |
@lukenowak I am certainly willing to continue working on this. I've rebased my PR on top of master, although the AppVeyor build seems to still be using Go 1.11 for some reason. |
Yeah, just ignore the AppVeyor CI failures for now. AppVeyor still doesn't support Go 1.12 appveyor/ci#2875 |
Thanks for the work. :) Let's give this a shot.
Any chance to ship this one into a release soon? I found out that many back-ends behave badly and don't honor the |
It'll go out in our next release, which should be 1.0beta1. |
This reverts commit 6d2019b, as new caddy has issues with tls certificate configuration: caddyserver/caddy#2588 About nxd-v0.11.5-4-g9d3151db: * not released yet functionality for regular expression cookie rewriting is available: caddyserver/caddy#2144 * not released yet functionality for ca_certifices in proxy: caddyserver/caddy#2380 * support for builtin log rotation disabling
@comp500 said:
Oh my goodness, more than one year later I discover that this in fact fixes a problem that I reported just recently on Wikimedia Phabricator (T232213) and here (#1011). This is so awesome! It's heartwarming knowing that there is somebody out there fixing my bugs even without me knowing it! I have already pushed it on vikiansiklopedi.org and wikiproxy (formerly wikimirror, because, well, it's a proxy, not a mirror) |
FWIW, I just got this implemented into Caddy 2: https://github.com/caddyserver/caddy/wiki/v2:-Documentation#httphandlersreverse_proxy It reuses the existing |
1. What does this change do, exactly?
Adds an extra parameter to header_upstream and header_downstream that allows regex-based replacement of headers. They are in the format
header_upstream [header] [regex] [replacement]
This allows unwanted values from the server and client (e.g. redirects, cookies) to be modified by Caddy. This therefore allows search (and mobile) to be fixed on wikipedia.matt.life, and wikimirror.
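As a rough sketch of the distinction described above, the parser below treats two arguments as a plain header set and three arguments as a regex replacement. The `headerRule` type and `parseHeaderArgs` function are hypothetical and deliberately simplified (other argument forms are ignored); they do not reproduce the PR's parsing code.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// headerRule is a hypothetical parsed form of one header_upstream or
// header_downstream line.
type headerRule struct {
	Name  string
	Value string         // literal value (2 args) or replacement (3 args)
	Regex *regexp.Regexp // nil unless the three-argument form was used
}

// parseHeaderArgs distinguishes the existing [header] [value] form from
// the new [header] [regex] [replacement] form by argument count.
func parseHeaderArgs(args []string) (headerRule, error) {
	switch len(args) {
	case 2:
		return headerRule{Name: args[0], Value: args[1]}, nil
	case 3:
		re, err := regexp.Compile(args[1])
		if err != nil {
			return headerRule{}, fmt.Errorf("invalid regex %q: %w", args[1], err)
		}
		return headerRule{Name: args[0], Regex: re, Value: args[2]}, nil
	default:
		return headerRule{}, errors.New("expected 2 or 3 arguments")
	}
}

func main() {
	r, err := parseHeaderArgs([]string{"Location", `^https://en\.wikipedia\.org`, "http://localhost"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(r.Name, r.Regex != nil, r.Value)
}
```

Compiling the pattern at parse time means a malformed regex is reported when the configuration is loaded rather than on the first proxied request.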
2. Please link to the relevant issues.
#442 - An existing (but not merged) implementation of Location rewriting
#606 - Nginx proxy_redirect feature request
This change may be obsoleted by #1639, the proxy middleware rewrite.
3. Which documentation changes (if any) need to be made because of this PR?
Add the extra parameter to header_upstream and header_downstream, and describe it. Give examples of it being used to change the Location header, possibly referencing proxy_redirect.
4. Checklist