-
Notifications
You must be signed in to change notification settings - Fork 2
fix: process links w/o the prefix #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failing test looks great and does indeed capture a failure that the middleware change resolves.
However, the core problem had me scratching my head a bit.
It wasn't immediately obvious to me how we would arrive in a situation of the failure in the first place. Thinking back, I believe that the idea of the Process Links transformation was that we would make the following assumptions:
- the upstream STAC API would format its links with an origin matching the proxy's host, thanks to the forwarded headers we send:
stac-auth-proxy/src/stac_auth_proxy/handlers/reverse_proxy.py
Lines 44 to 52 in 2961a48
headers.setdefault( "Forwarded", f"for={proxy_client};host={proxy_host};proto={proxy_proto};path={proxy_path}", ) if self.legacy_forwarded_headers: headers.setdefault("X-Forwarded-For", proxy_client) headers.setdefault("X-Forwarded-Host", proxy_host) headers.setdefault("X-Forwarded-Path", proxy_path) headers.setdefault("X-Forwarded-Proto", proxy_proto) - The links would have a prefix matching the path of the upstream URL's server. This is what we are removing here:
# Remove the upstream_url path from the link if it exists if urlparse(self.upstream_url).path != "/": parsed_link = parsed_link._replace( path=parsed_link.path[len(urlparse(self.upstream_url).path) :] )
This fix addresses situations where assumption no.2 isn't correct. I believe this is likely caused by a situation where an upstream STAC API is being served with a root prefix (e.g. /prod
) but the server isn't properly configured with a root_path
argument (at least in FastAPI situations). So I think there could be an argument made that this is actually a misconfiguration on the upstream server's part.
Nonetheless, I don't know if there's actually anything wrong with us trying to smooth over possibly misconfigured upstream server links.
The only thing that I don't love about this current solution is that this code could replace multiple instances.
str(parsed_link.path).replace(
str(urlparse(self.upstream_url).path), ""
)
For example, if the upstream was http://upstream.example.com/collections/
(god forbid), then a link like http://proxy.example.com/collections/collections
would be rewritten to http://proxy.example.com/collections
. This might be overly cautious, but 🤷
How do you feel about the change made in d634039?
Makes sense to me!
Yeah, I think that's true. I poked around the project we were using this on, and it wasn't immediately obvious to me how to quickly feed the deployed API gateway url back down into the FastAPI app (I'm sure there's a way), so I reached for this solution. |
🤖 I have created a release *beep* *boop* --- ## [0.7.1](v0.7.0...v0.7.1) (2025-07-31) ### Bug Fixes * ensure OPTIONS requests are sent upstream without auth check ([#76](#76)) ([855183a](855183a)), closes [#75](#75) * process links w/o the prefix ([#70](#70)) ([8a09873](8a09873)) ### Documentation * update middleware descriptions ([d3d3769](d3d3769)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: ds-release-bot[bot] <116609932+ds-release-bot[bot]@users.noreply.github.com>
There's times when the underlying STAC API might not have a path on its root URL, but stac-auth-proxy has a path on its
upstream_url
; one such case is if there's an API Gateway in the way. This patch uses string replacement for link rewriting, rather than slicing, to handle these cases while preserving previous behavior.