Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly set PATH_INFO when X-outside-url header with path is used #942

Closed
wants to merge 1 commit into from
Closed

Properly set PATH_INFO when X-outside-url header with path is used #942

wants to merge 1 commit into from

Conversation

daniel-k
Copy link
Contributor

When the X-outside-url header is passed in a request to devpi-server, the path portion is set as the wsgi SCRIPT_NAME variable so it can run in a subpath. However, the wsgi spec [1] states that SCRIPT_NAME and PATH_INFO together constitute the complete URL but PATH_INFO isn't modified. This commit adds stripping of the SCRIPT_NAME from the beginning of PATH_INFO.

[1] https://wsgi.readthedocs.io/en/latest/definitions.html#standard-environ-keys


For reference, here are two minimal configs for traefik and nginx to to serve devpi on the sub path /devpi

nginx.conf

server {
    location /devpi {
        proxy_set_header X-outside-url $scheme://$http_host/devpi;
        proxy_pass http://localhost:3141;
    }
}

traefik labels (for running in docker-compose)

labels:
  - "traefik.enable=true"
  - "traefik.http.routers.devpi.rule=PathPrefix(`/devpi`)"
  - "traefik.http.routers.devpi.middlewares=devpi-outside-url"
  - "traefik.http.middlewares.devpi-outside-url.headers.customrequestheaders.X-outside-url=http://localhost/devpi"

When the `X-outside-url` header is passed in a request to devpi-server,
the path portion is set as the wsgi `SCRIPT_NAME` variable so it can
run in a subpath. However, the wsgi spec [1] states that `SCRIPT_NAME`
and `PATH_INFO` together constitute the complete URL but `PATH_INFO` isn't
modified. This commit adds stripping of the `SCRIPT_NAME` from the
beginning of `PATH_INFO`.

[1] https://wsgi.readthedocs.io/en/latest/definitions.html#standard-environ-keys
@fschulze
Copy link
Contributor

Does this solve an actual bug? Are there any wrongly generated URLs? If so, please try to add a test that shows that. If you need help with that, please let me know.

@daniel-k
Copy link
Contributor Author

Sorry if it wasn't clear if or what bug it solves, I was a bit in a rush yesterday. My goal was to serve devpi-server via a reverse proxy in a sub path like http://localhost/devpi. I first tried stripping the URL prefix (/devpi) from the request via the reverse proxy, but this only works to a certain degree because any links that devpi generates will be wrong. Then I tried X-outside-url, but always ended up with redirection loops. After some digging I found that the issue is with OutsideURLMiddleware, because it manipulates SCRIPT_NAME unappropriately according to the spec, i.e. it also has to adapt PATH_INFO accordingly. When running the setup with a reverse proxy that sets X-outside-url (see configs above) and looking at the devpi-server logs, you'd see that a request to http://localhost/devpi would be logged as GET /devpi/devpi. Reason here being that /devpi is added to SCRIPT_NAME but not stripped from PATH_INFO.

I just had a look at the tests and managed to run the test suite successfully in its current state, but I'm a bit lost in all the layers of the testing harness around the app object. In order to debug and reproduce the redirection loop I tried to testapp.get('/') the root page, but just get some JSON back.

For reference, here are the relevant code section of waitress and webtest for treating SCRIPT_NAME and PATH_INFO:

@fschulze
Copy link
Contributor

I guess that it fixes #934 then. Now that I know what this is about, I can do the rest, as a test for it is certainly not trivial without knowing the suite quite well.

@daniel-k
Copy link
Contributor Author

daniel-k commented Dec 22, 2022

Yes, it should solve #934 👍

There is one important question that was really hard for me to answer when trying to understand how to use a setup with X-outside-url, which is if the requests to devpi should have the prefix stripped from the request URL (e.g. using StripPrefix when using traefik) or not. It would be great if we could add some documentation on that matter as well. If you want me to add some, please point me to the appropriate location.

Reagrding path stripping, this PR actually (unintentionally) allows both behaviours because it only strips the prefix from incoming requests conditionally, i.e. if it matches. I think it's the same behaviour like in waitress.

@fschulze
Copy link
Contributor

I tried writing a test case, but couldn't create anything that fails without this fix. It would be super useful to get the request headers devpi-server receives for the redirect loop case, or maybe you can point to the code that generates the wrong redirect when PATH_INFO isn't properly adjusted?

@fschulze
Copy link
Contributor

To be clear, I agree that your change looks correct according to the specs and your waitress, webtest links, but a failing test would be really nice.

@fschulze
Copy link
Contributor

Found my mistake. I have a test which shows this fix works as intended.

@daniel-k
Copy link
Contributor Author

Found my mistake. I have a test which shows this fix works as intended.

Awesome 🎉 Feel free to add the test to my PR, you should be able to push to my branch.

image

@fschulze
Copy link
Contributor

Merged by hand.

@fschulze fschulze closed this Feb 22, 2023
@daniel-k daniel-k deleted the fix/outside-url-setting-path-info branch February 22, 2023 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants