[WIP] Add/minio backend multipart #298
Conversation
Signed-off-by: vsoch <vsochat@stanford.edu>
…t error output for scs-library-client... Signed-off-by: vsoch <vsochat@stanford.edu>
…ludes the host header Signed-off-by: vsoch <vsochat@stanford.edu>
okay here is an update after working today. I was able to at least get a signature (and the associated headers) into the signed URL by adding it like this:

```python
from botocore.client import Config
...
s3_external = session.client(
    "s3",
    use_ssl=MINIO_SSL,
    region_name=MINIO_REGION,
    endpoint_url=MINIO_HTTP_PREFIX + MINIO_EXTERNAL_SERVER,
    verify=False,
    config=Config(signature_version='s3v4'),
)
```

From this comment I think the (still present) signature mismatch is because s3v4 uses the host to validate, and the host is different inside the container (minio:9000) than outside the container (127.0.0.1:9000).
So my thinking is that one of the following needs to be done:
Signed-off-by: vsoch <vsochat@stanford.edu>
And here are full logs from minio:

The first ones at the top are from when the sregistry uwsgi container is started up / restarted; it just creates a MinioClient (which is successfully used for a PUT pre-signed URL without the multipart bit). For multipart we need to use s3 clients, namely s3 and s3_external in minio.py, and those make the following requests. You can see the upload request being created by the s3 client, and then the PUT URL being generated from s3_external.
s3v4 requires preserving the host header, i.e. using the host when generating the pre-signed signature; it is there to avoid MITM attacks on the payload.
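The host-header point can be illustrated with a toy sketch (this is not MinIO's real signing code, just a stripped-down stand-in; the bucket, key, and secret are placeholders). In Signature V4 the Host header is part of the canonical request, so a URL signed for one host cannot validate on another:

```python
import hashlib
import hmac

def sign_for_host(host: str, secret: str = "minio-secret-key") -> str:
    # Simplified canonical request: the signed headers include Host,
    # so changing the host changes the signature.
    canonical_request = "\n".join([
        "PUT",                  # HTTP method
        "/sregistry/demo.sif",  # canonical URI (placeholder bucket/key)
        "",                     # canonical query string (omitted here)
        f"host:{host}\n",       # canonical headers: Host is always signed
        "host",                 # list of signed header names
        "UNSIGNED-PAYLOAD",     # payload hash placeholder
    ])
    string_to_sign = hashlib.sha256(canonical_request.encode()).hexdigest()
    return hmac.new(secret.encode(), string_to_sign.encode(), hashlib.sha256).hexdigest()

internal = sign_for_host("minio:9000")      # host seen inside the container
external = sign_for_host("127.0.0.1:9000")  # host seen by the outside client
print(internal != external)  # True: the two signatures cannot match
```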
@harshavardhana what would you suggest for generating the signed URLs inside the docker container (minio:9000), where the hostname is different from the external client's (outside the container, 127.0.0.1:9000)? I tried adding the hostname "minio" pointing to 127.0.0.1 in /etc/hosts so I could use the s3 client internally for all calls, but I got the same error. Could it be something else, or is there another header or setting that I'm missing? Do you work at Minio?

Is there an equivalent function to generate signed URLs, but for a multipart upload? I noticed that the put_object request will use multipart above some size threshold, but that would be called directly from the client making the request. We'd need to generate the URLs in the same fashion as the single PUT request.
@vsoch it depends on who is going to access the URL to upload/download.

To use rotating credentials you can use the AssumeRole mechanism, which allows using the AWS SDKs directly without generating pre-signed URLs and without sharing the static credentials. Your server component should generate these temporary credentials and pass them to the client: https://github.com/minio/minio/blob/master/docs/sts/assume-role.md
@harshavardhana it's an external client, and I do create separate clients (s3 for making the internal request to start the multipart upload, and s3_external to generate the pre-signed URLs). See here for creating the clients and here for the requests.

I could pass this to the client, but I have no control over the client implementation; specifically it's using the scs-library-client, so I have no way of changing those calls. Are you saying there is some way of wrapping the generation of pre-signed URLs to just change how the authentication is done?
No, there is no such thing: you either use pre-signed URLs or AssumeRole; you have to choose.

If it's an external client then you need to use the external IP. Keeping separate constructors for both is the correct approach; you can, of course, share the connection pooling by passing in the same urllib3 pool manager.
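The shared-pool suggestion could be sketched like this (the `http_client` argument is minio-py's constructor parameter for a custom pool; the endpoints and credentials are placeholders, and the client wiring is only shown as comments):

```python
from urllib3 import PoolManager
from urllib3.util.retry import Retry

# One urllib3 pool manager shared by both clients so they reuse connections.
pool = PoolManager(
    num_pools=4,
    maxsize=10,
    retries=Retry(total=3, backoff_factor=0.2),
)

# Hypothetical wiring with minio-py (placeholder endpoints/credentials):
# internal = Minio("minio:9000", access_key="...", secret_key="...",
#                  secure=False, http_client=pool)
# external = Minio("127.0.0.1:9000", access_key="...", secret_key="...",
#                  secure=False, http_client=pool)

print(pool.connection_pool_kw["maxsize"])  # 10
```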
okay, I'll need to run more details by you tomorrow then, because (per our conversation now) I've done everything right thus far, yet there is still a signature mismatch! I think it might be related to the (internal) client making the start request: the HostID returned in the metadata is empty. I don't know if the UploadID passed on to the s3_external client then creates a request that tries to match the requesting HostID to the one that started the request (and then gets an empty string vs. what looks like an md5 of something). The trick might be figuring out how to define the HostID for the start multipart request, so that it matches the s3_external signed URL request that gets checked against the start request via the UploadID.
Another thing I'll try tomorrow is to see what the "correct" headers look like for a working (single) PUT request here, and then maybe I can use those same signing functions (with the correct headers) to update the signature generated for my current (not working) pre-signed URLs.
It is not just about the correct presigned URL. Let's say you generated a presigned URL for the external client: the internal client cannot use it. Likewise, the internal client's URL cannot be used by the external client; both will see a signature mismatch due to the host header mismatch. HostID in the response is not useful, we can ignore that for now; it's just some unique ID and largely not that meaningful in MinIO.
I am generating it with the external client (but from the internal host), and of course using it from the external client. As a sanity check, the same strategy with a minioClient and minioExternalClient generated with minio-py for a single pre-signed PUT URL works like a charm! It's the multipart support for pre-signed URLs that is missing. This is why I thought it might be useful to print out the headers generated by minio-py for the working single PUT, in case I need to add them for signature generation.
hey @harshavardhana! I joined the Minio Slack and posted a bunch of information; let me know if/when you and/or your team are around so we can debug together! I'll keep trying things in the meantime :)
…all error Signed-off-by: vsoch <vsochat@stanford.edu>
Signed-off-by: vsoch <vsochat@stanford.edu>
okay I've just pushed the (now working!) solution for pre-signed multipart uploads. The issue was that the presign_v4 function was using an unsigned hash string, so I needed to copy the function and pass in the sha256 that is sent from my calling client. Finally, I've added the CompleteMultipartUpload view to finalize everything.
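A toy sketch of why the unsigned hash caused the mismatch (this is not minio-py's actual presign_v4, just a stripped-down stand-in with placeholder values; the real canonical request has more fields). The payload hash is folded into the canonical request, so signing "UNSIGNED-PAYLOAD" while the client sends a real x-amz-content-sha256 yields a signature the server rejects:

```python
import hashlib
import hmac

def sign_with_payload_hash(payload_hash: str, secret: str = "minio-secret-key") -> str:
    # Simplified canonical request with the payload hash as its last line.
    canonical_request = "\n".join([
        "PUT",
        "/sregistry/demo.sif",            # placeholder bucket/key
        "partNumber=1&uploadId=example",  # placeholder multipart query
        "host:127.0.0.1:9000\n",
        "host",
        payload_hash,
    ])
    string_to_sign = hashlib.sha256(canonical_request.encode()).hexdigest()
    return hmac.new(secret.encode(), string_to_sign.encode(), hashlib.sha256).hexdigest()

part = b"bytes of part 1"
client_hash = hashlib.sha256(part).hexdigest()

unsigned = sign_with_payload_hash("UNSIGNED-PAYLOAD")  # the original behavior
fixed = sign_with_payload_hash(client_hash)            # pass the client's sha256
print(unsigned != fixed)  # True: only the fixed signature matches the request
```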
Signed-off-by: vsoch <vsochat@stanford.edu>
Force-pushed from d82b1a4 to d742018
Signed-off-by: vsoch <vsochat@stanford.edu>
@RonaldEnsing @Aneoshun @lmcdasm @asmi10 you've all had recent issues with uploading large images or scaling, and I want to invite you to review this PR if you have the chance! Specifically, I've added a minio storage backend, which can serve the Sylabs library endpoints and also support multipart upload. I suspect that this would also allow for a custom setup where a center deploys a local Minio setup and then is able to use the same endpoints for the same registry!

I would like to have a few testers give this a spin before merging. I'm fairly confident that this is a much better implementation than the current one, but I want others to look as a sanity check. For those of you that have existing registries: do not test on top of your existing registry! Since minio is a new storage endpoint, I was careful to not have it use the previous folder bound to the host (images), but I still wouldn't want to take any risk of containers being recreated, at least for the time being.

Looking forward to hearing what you think! I had a great time working on this... Minio is awesome, and there are so many ways you can further configure it! -> https://github.com/minio/minio/tree/master/docs/config
Hi @vsoch

My sincere apologies for the delay; the last few weeks have been more than hectic. It is only now that I finally have time to test your incredible PR.

In short: it solves all my issues.

The upload/download speed seems better. The memory footprint during upload is marginal, regardless of the size of the image (I can upload a 3GB image on an SRegistry server with only 2GB of memory; that was simply impossible before). I have screenshots of before/after memory usage, and the difference is obvious.

I have tested with two containers (400MB and 3GB). Note that I have not tested the SSL connections.
Alright, now a few things that I have noticed:

- During the build of the container I see this error, maybe it is fixable:

  ```
  ERROR: awscli 1.18.66 has requirement rsa<=3.5.0,>=3.1.2, but you'll have rsa 4.0 which is incompatible.
  ```

- It seems that the download button for the uploaded containers has disappeared from the web UI (I can only see the "freeze" button). I don't know if it is like this on purpose or not.
- For some reason, after uploading 1 container to an empty collection, the collection list (on the web UI) tells me that I have 3 builds. When I click on the collection to see the container list, I only see one container (the one I just uploaded).
- After reading the documentation, I am a bit confused between the usages of the `images` and `minio-images` folders. It looks like I cannot start the docker-compose if both of these folders are created. However, only `minio-images` seems to be used (as explained in the doc), while `images` is just filled with empty folders named from `./0` to `./9`. I don't know if this is normal.
Thanks a lot @vsoch for your work. I am really looking forward to merging this PR and using it daily.
No worries! Times certainly aren't normal right now.
This is so great to hear! I was hoping this would be the case.
It looks like there is heated discussion about this and it's relatively new, so we should stay on the lookout for an update. I've subscribed to the upstream issue, which is probably the most I can do; if you try it out and there is an issue with https then we can downgrade. I opened an issue here: #300.
Since we now use minio, which requires a more robust auth flow, I thought it was best to remove this.
I'm not able to reproduce this. Could it be that you have a postgres container that was used previously and not updated (e.g., you pulled the repo down onto a previous install and didn't remove and re-create the database)?
This exists solely to provide some sort of fallback support. In case a registry somehow needs to exist with both, I decided to use a different name so there isn't a conflict. If you use minio, you can safely ignore the images folder; it isn't doing any harm being there.
Definitely! I'll push some of the documentation changes soon for you to take a look at. And to debug your 3-builds-per-collection issue: maybe shell into the server and do a `collection.containers.count()`? And try to remove and re-create the postgres image?
Signed-off-by: vsoch <vsochat@stanford.edu>
Edits -> 2ebed73
I cloned this PR into a new folder, stopped my running SRegistry instance, built the new SRegistry containers, and started everything from scratch from there. My previous instance probably has hundreds of containers, so if it was a DB collision, I would expect to see a lot more ghost containers.

Could you walk me through this a bit more? Which container of the compose setup shall I shell into?
Definitely! It's actually just the uwsgi container that you want. So if you are currently stopped, start the containers.

Now we want to shell into the uwsgi container. The worker would give us equivalent access, but I typically use uwsgi:

```
$ docker exec -it sregistry_uwsgi_1 bash
```

This puts us in the root of the container, /code, which is actually mounted from your repository. Be careful making changes from within the container, because it will change permissions on your host, which you typically don't want.

Next, django has a lovely interface via the manage.py module right there. You can easily get different kinds of shells, show urls, and interact with a lot of custom functions exposed by modules installed in the registry. Just try doing:

```
python manage.py --help
```

to check it out. Of course we want an interactive Python shell into the application, so we can do:

```
python manage.py shell
```

Once we are in python, you can import your models and query them:

```python
from shub.apps.main.models import *

Container.objects.count()
# 1
Collection.objects.count()
# 1
for collection in Collection.objects.all():
    print(collection)
# collection:1
Collection.objects.filter(name="collection")
# <QuerySet [<Collection: collection:1>]>
```

And of course each collection is related to its containers:

```python
collection.containers.count()
# 1
```

and you can get attributes:

```python
collection.private
# False
collection.name
# 'collection'
collection.owners.first()
# <User: vsoch>
```

So we would be interested in the count of containers in the collection, and if you see 3, in looping over them and printing the names, tags, etc.
Thanks! So count() outputs
As you can see, I have 2 Dummy containers. I don't know why. They appear in the counts in the UI, but are not listed with the other containers.
Ahh! I might know what happened. After cloning the repo, I copied the config.py file from my original registry (to have all my settings), which obviously led to an error because of the missing MinIO configuration. Then I had a second issue because I forgot to set the URL of the server here. Because of those two errors, I made two failed attempts at pushing a container. Could it be that the Dummy containers are created at this point, but not removed?
That’s what the tag suggests! :)
There is actually a task that handles cleaning these up; I didn't find a way to do it easily at the moment it happens: https://github.com/singularityhub/sregistry/blob/a6b0c0d9d9ac67d240ebc295ef610af96d3e84fc/shub/apps/api/management/commands/cleanup_dummy.py. You should be able to run this with manage.py too:

```
python manage.py cleanup_dummy
```

Does that resolve all the issues then? Ready for merge?
Yes, cleanup_dummy removed this. Thanks a lot! I think it is ready to merge! |
This is an extension of #297 that includes multipart upload, based on the example here. I've successfully been able to:
However, some error is triggered by the scs-library-client in this function, so that it calls the _multipart_abort endpoint and the request does not finish. I have a feeling it has something to do with a missing request parameter; for example, here are the differences between a working PUT request (non-multipart) and a not-working single part:
This URL works (a single PUT request for minio, generated with presigned_put_url):
Note that the only way I am able to do this is by instantiating two minio clients, one with an internal URL (inside the container the server is minio:9000), and one with an external URL (from the client the server is 127.0.0.1:9000). That was tricky, but it seemed to work (the work in #297).
This next URL is generated based on the example here, and I again needed to create an s3 client to handle an internal request to start the multipart upload:
And then a separate client (s3_external) to generate URLs with the upload id generated from above:
The above generates a URL that looks like this and does not work, as the Singularity client aborts and calls _multiupload_abort:
I was just able to enable debug output for Singularity, and then use the mc client for minio to determine the next issue to address:
Likely I need to look into how to make sure the correct signature version/method is being used; I'm not sure what the default is, but I don't think it's the same one minio expects.