some www.galaxyzoo.org urls contain '+' chars which don't resolve via s3 server #82
Comments
Have you got the URL for that subject on the old and new sites? I'm interested to see if the browser encodes the URL by default. If not, we can explicitly encode it when we parse subject locations, assuming that doesn't break subject URLs for any other projects.
So this subject is one of them: https://talk.galaxyzoo.org/subjects/AGZ000atp8/ from this collection: https://talk.galaxyzoo.org/collections/CGZS0003tq/. There the URL is encoded correctly.
However, if we use the rewritten non-s3 URL www.galaxyzoo.org/subjects/decals/thumbnail/J211326.08%2B005811.6_thumbnail.jpeg we get a redirect via the nginx proxy to an s3 URL which has the path decoded in the rewritten location: https://s3.amazonaws.com/www.galaxyzoo.org/subjects/decals/thumbnail/J211326.08+005811.6_thumbnail.jpeg That's not great. It seems the issue here is the nginx static proxy and the rewrite rule. We may have to proxy pass these URLs (serve them directly) via NGINX instead of redirecting them to avoid this issue.
This is getting more interesting... After testing a local version of the static nginx proxy, our static proxy seems to be preserving the encoded URLs correctly. Local test of static proxy (only debug headers added):
This is a https redirect, note the encoding is preserved
However when we hit the https url we lose the encoding :(
Note the response Location header above is where we lose the encoding; the request from the client is still encoded. The nginx logs in k8s show the decoded URL, so it appears that the nginx ingress is rewriting the URL before it hits the static proxy pod.
This looks relevant, kubernetes/ingress-nginx#1615 (comment)
Looking at it on my phone, the image is broken in this discussion about that subject. I rebuilt the discussion pages this morning. https://talk.galaxyzoo.org/boards/BGZ0000004/discussions/DGZ0001krf/ So that would be the redirect breaking the location? The subject and collections pages are built from master, but the discussion page is built from #81.
resolved by zooniverse/static#176
related to #81 and #64
A GZ subject thumbnail URL like www.galaxyzoo.org/subjects/decals/thumbnail/J211326.08+005811.6_thumbnail.jpeg
will redirect to an s3 URL via the nginx static rewrite at https://github.com/zooniverse/static/blob/fe42d006be275b5e59e6e584e67fbeff500f426a/sites/www.galaxyzoo.org.conf#L10
E.g. the above subject URL redirects to an s3 URL with the literal '+'
this doesn't work
https://s3.amazonaws.com/www.galaxyzoo.org/subjects/decals/thumbnail/J211326.08+005811.6_thumbnail.jpeg
this works
https://s3.amazonaws.com/www.galaxyzoo.org/subjects/decals/thumbnail/J211326.08%2B005811.6_thumbnail.jpeg
I believe this will be the same in azure land (needs testing)
https://docs.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-containers--blobs--and-metadata#blob-names
I haven't found a decent way to encode the URL in nginx (which strikes me as very strange) and I need to test how these '+' symbols in URLs work in azure as well.
We may need to encode these URLs explicitly before publishing them to ensure they work as we expect. TBD
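For illustration only, here is a rough sketch of what "encode explicitly before publishing" could look like, written in Python with the standard library. The function name `encode_subject_url` is hypothetical and not part of any Zooniverse codebase. It decodes the path first and then re-encodes it, so a URL that already contains `%2B` is not double-encoded. The trade-off is that a filename containing a literal `%` followed by two hex digits would be misread as an escape.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def encode_subject_url(url: str) -> str:
    """Percent-encode the path of a subject URL (hypothetical helper).

    '+' is not in quote()'s safe set, so it becomes '%2B'. The path is
    unquoted first so that already-encoded URLs pass through unchanged.
    """
    parts = urlsplit(url)
    # unquote() leaves a literal '+' alone (unlike unquote_plus()),
    # so decoding then encoding normalises both input forms.
    path = quote(unquote(parts.path), safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


raw = "https://www.galaxyzoo.org/subjects/decals/thumbnail/J211326.08+005811.6_thumbnail.jpeg"
encoded = encode_subject_url(raw)
print(encoded)
# Running the helper again on its own output should give the same URL.
print(encode_subject_url(encoded) == encoded)
```

This only addresses the published URL. It would not help if something in between (such as the ingress discussed in the comments) decodes the path again before the redirect is issued.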