Problem pulling from behind a proxy #391

Open
TannerW-STX opened this issue Feb 2, 2022 · 13 comments

Comments

@TannerW-STX

I have been beating my head against this issue for some time now. This registry is installed on a compute node on an HPC cluster. All http(s) is proxied through the frontend of the cluster.

From the host machine, pushes work fine, but when I try to pull from the host I get:

singularity --debug pull library://tanner/maxquant:hello
DEBUG   [U=1043,P=26495]   persistentPreRun()            Singularity version: 3.9.2-bionic
DEBUG   [U=1043,P=26495]   persistentPreRun()            Parsing configuration file /etc/singularity/singularity.conf
DEBUG   [U=1043,P=26495]   handleConfDir()               /home/tanner.wilkerson/.singularity already exists. Not creating.
DEBUG   [U=1043,P=26495]   handleRemoteConf()            Ensuring file permission of 0600 on /home/tanner.wilkerson/.singularity/remote.yaml
DEBUG   [U=1043,P=26495]   getCacheParentDir()           environment variable SINGULARITY_CACHEDIR not set, using default image cache
DEBUG   [U=1043,P=26495]   apiGet()                      apiGet calling v1/images/tanner/maxquant:hello?arch=amd64
INFO    [U=1043,P=26495]   downloadWrapper()             Downloading library image
DEBUG   [U=1043,P=26495]   ConcurrentDownloadImage()     This library does not support architecture specific tags
DEBUG   [U=1043,P=26495]   ConcurrentDownloadImage()     The image returned may not be the requested architecture
DEBUG   [U=1043,P=26495]   ConcurrentDownloadImage()     Pulling from URL: v1/imagefile/tanner/maxquant:hello
DEBUG   [U=1043,P=26495]   DownloadImage()               Cleaning up incomplete download: /home/tanner.wilkerson/.singularity/cache/library/tmp_3969322992
FATAL   [U=1043,P=26495]   pullRun()                     While pulling library image: error fetching image: unable to download image: error downloading image: unexpected HTTP status 302: <nil>
sregistry_retry-uwsgi-1  | [pid: 67|app: 0|req: 34/158] 10.10.10.44 () {36 vars in 555 bytes} [Tue Feb  1 18:37:01 2022] GET /v1/imagefile/tanner/maxquant:hello?arch=amd64 => generated 0 bytes in 13 msecs (HTTP/1.1 302) 6 headers in 548 bytes (1 switches on core 0)

It seems like the redirect url is not pointing to the correct location for download?

I have recreated the exact install steps on a local desktop and everything worked out of the box. So this nil redirection issue is specific to the cluster installation of this registry.

Has anyone seen this issue before? I'd imagine it is a uwsgi/proxy setting that I need to set somewhere?

@vsoch
Member

vsoch commented Feb 2, 2022

Did you deploy in the docker-compose containers or directly on the node?

@TannerW-STX
Author

Docker-compose

@vsoch
Member

vsoch commented Feb 2, 2022

So there are probably a few options. Arguably if Singularity honors the redirect (302) this would work out of the box, so you could open an issue on the Singularity repo and ask if that might be possible. Otherwise, you will need some kind of proxy between the container and client to say "this is actually being served from this url." I have never deployed in this context so I don't have a recipe for you, but my suggestion would be to tweak the nginx.conf to see if you can create the proxy for your specific setup. https://www.digitalocean.com/community/tutorials/understanding-nginx-http-proxying-load-balancing-buffering-and-caching. If I were you I'd go about doing this by:

  1. doing a bunch of Googling for this use case and finding something that feels similar
  2. changing the nginx.conf and restarting the container
  3. testing the pull

rinse and repeat until you get something working! And if you do, please share here so we can update docs with a recipe for the next person.

@TannerW-STX
Author

Sadly steps 1 - 3 are exactly what I have been doing for about a week now. I'll keep throwing stuff at the wall. I will keep this issue open until I finally admit defeat.
Thank you @vsoch

@vsoch
Member

vsoch commented Feb 2, 2022

Well, let's poke in other communities then to ask for help! This indeed must be something others have faced. I'll poke around some of my discussion channels - there must be an nginx.conf that will do the trick, and we just have to find it!

@vsoch
Member

vsoch commented Feb 2, 2022

Huh, if you see that the uwsgi gets a hit, that does suggest that it's hitting the server - the question is where are you redirecting to? Are you able to get more detailed output from the Singularity client, perhaps to print the headers that are returned (e.g., there might be a Location header that shows the redirect)? Django sometimes has this thing where it will redirect http to https, and that could be happening here, in which case you'd want to deploy with https. But maybe if you can figure out where it's redirecting, that will get us one step closer to understanding what's going on, and whether it's an issue with the registry configuration or something else.
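
Once the Location value is known, comparing it with the URL that was requested can show whether the scheme or host changed along the way. The following is a minimal sketch using only the Python standard library; the function name `diagnose_redirect` and the hostnames in the usage note are hypothetical and are not part of the registry or of Singularity.

```python
from urllib.parse import urljoin, urlsplit


def diagnose_redirect(request_url, location):
    """Describe how a redirect target differs from the URL that was requested."""
    if not location:
        return ["no Location header: the client has nowhere to go"]
    # Location may be relative, so resolve it against the requested URL first.
    target = urljoin(request_url, location)
    src, dst = urlsplit(request_url), urlsplit(target)
    notes = []
    if src.scheme != dst.scheme:
        notes.append(f"scheme changed: {src.scheme} -> {dst.scheme}")
    if src.netloc != dst.netloc:
        notes.append(f"host changed: {src.netloc} -> {dst.netloc}")
    if src.path != dst.path:
        notes.append(f"path changed: {src.path} -> {dst.path}")
    # Query-string differences are ignored in this sketch.
    return notes or ["same scheme, host and path: possible redirect loop"]
```

For example, calling it with a requested URL on `https://registry.example.com` and a Location on `http://uwsgi:3031` would report both a scheme change and a host change, which would point at the proxy headers rather than at the client.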

@TannerW-STX
Author

I have https deployed.
I tried to capture headers with tcpdump for a brief moment and didn't have much luck at first. I will give tcpdump's output a deeper look and figure out the redirection.

@vsoch
Member

vsoch commented Feb 2, 2022

Are you using the https nginx.conf and are all the redirects also https? I ask because if Django is set to redirect to https, you would get a redirect (this has happened to me before). I would check out this section https://docs.djangoproject.com/en/4.0/topics/security/#ssl-https of the docs because there are perhaps some settings.py options you could try to set. If we can just see what the header says, it would help us see if it's trying to do something around that!

@TannerW-STX
Author

TannerW-STX commented Feb 2, 2022

I am using the https nginx.conf, but the only redirect I have edited is in shub/settings/config.py

I can confirm that all traffic is going through https though. Which is expected because don't singularity operations require ssl? Sadly headers are impossible(?) to get with tcpdump because of ssl. I can try wireshark.

Glancing at the docs you just sent over now

@vsoch
Member

vsoch commented Feb 2, 2022

I think an option was added recently to not require it - I always used to edit and then compile the source code to disable it for local development! But if somehow it's going through https -> http -> https (django), if that's even possible via the proxy you have set up, I think you could see that error.

I would also check out this particular setting - I've never used it before but it seems to be specific for a proxy! https://docs.djangoproject.com/en/4.0/ref/settings/#std:setting-SECURE_PROXY_SSL_HEADER
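
To illustrate why a proxy can make the application build a redirect that points somewhere the client cannot reach, here is a deliberately simplified model in plain Python. It is not Django's actual implementation; the function name `effective_url` and the example hosts in the usage note are hypothetical. It only mimics the general idea that an application behind a proxy sees the proxy's scheme and host unless it is told to trust forwarded headers.

```python
def effective_url(environ, path, use_x_forwarded_host=False,
                  secure_proxy_ssl_header=None):
    """Simplified model: rebuild an absolute URL from a WSGI-style environ."""
    # Without any proxy settings, the app only sees how the proxy reached it.
    scheme = environ.get("wsgi.url_scheme", "http")
    if secure_proxy_ssl_header is not None:
        header, secure_value = secure_proxy_ssl_header
        if header in environ:
            scheme = "https" if environ[header] == secure_value else "http"

    host = environ.get("HTTP_HOST", "localhost")
    if use_x_forwarded_host and "HTTP_X_FORWARDED_HOST" in environ:
        host = environ["HTTP_X_FORWARDED_HOST"]

    return f"{scheme}://{host}{path}"
```

With an environ that carries `HTTP_HOST` set to an internal address and forwarded headers describing the public one, the default call yields an internal `http://` URL, while enabling both options yields the public `https://` URL. That difference is the kind of thing a Location header would reveal.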

@TannerW-STX
Author

Dumb question (I don't mess with Django much)

where should I set these settings? I am assuming somewhere in shub/settings, but can config.py handle generic django settings?

@vsoch
Member

vsoch commented Feb 2, 2022

where should I set these settings? I am assuming somewhere in shub/settings, but can config.py handle generic django settings?

config.py works! And you'll need to restart the containers each time, assuming you are using the docker-compose that binds everything to /code. It normally is just a single settings.py file, and for some reason I thought it would be better to have the settings module split out over a bunch of files. I think it does make it easier to say "just edit config.py!" but it is also probably more confusing to not find settings.py.

I'm poking more around Django docs - I would also try:

USE_X_FORWARDED_HOST = True

See the last section here (the main article is for Apache but the concept is similar): https://ubuntu.com/blog/django-behind-a-proxy-fixing-absolute-urls. And maybe if the tools you are using don't work, just try a requests.get() (in Python) for a basic server endpoint to see what it returns? E.g., when you try to browse to the UI do you see it?

Another blog about it with a TLDR here: https://medium.com/@rui.jorge.rei/today-i-learned-nginx-reverse-proxying-for-django-projects-3ab17ad707f6
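
Putting the two Django settings mentioned in this thread together, a fragment for shub/settings/config.py might look like the sketch below. Both setting names come from the Django documentation linked above. Whether they resolve this particular 302 is untested, and the sketch assumes the proxy in front of the application actually sets the X-Forwarded-Proto and X-Forwarded-Host headers.

```python
# Sketch for shub/settings/config.py (untested against this registry).

# Trust the Host value forwarded by the proxy when building absolute URLs.
# Assumption: the proxy sets X-Forwarded-Host on every request.
USE_X_FORWARDED_HOST = True

# Treat a request as HTTPS when the proxy reports that the client used HTTPS.
# Assumption: the proxy sets X-Forwarded-Proto itself and overwrites any value
# sent by the client; otherwise this setting is a security risk.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")
```

As noted earlier in the thread, the containers need a restart after editing config.py for the change to take effect.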

@ikirker

ikirker commented Feb 3, 2022

As far as I'm aware, Wireshark won't help you if tcpdump doesn't: you'll need to actually engage with HTTPS to get the HTTP request headers.

I'd suggest pointing curl directly at the URL from the uwsgi log, to manually get a look at the headers and request progress, e.g.:

curl --head -vvv https://your-server.example.com/v1/imagefile/tanner/maxquant:hello?arch=amd64
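
If curl is not available on the node, roughly the same check can be sketched with the Python standard library. The function name `fetch_redirect` is hypothetical. Note that `http.client` connects directly and does not read proxy environment variables, and that it verifies TLS certificates by default, so a self-signed certificate would raise an error rather than return headers.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(url, timeout=10):
    """Send one GET without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    try:
        # Rebuild the request target, keeping the query string if present.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        # Location is None when the server did not send that header.
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

# Usage (placeholder hostname, as in the curl example above):
# status, location = fetch_redirect(
#     "https://your-server.example.com/v1/imagefile/tanner/maxquant:hello?arch=amd64")
# print(status, location)
```

A 302 with a Location that uses a different scheme or an internal hostname would suggest the redirect is being built from the proxy's view of the request rather than the client's.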
