
[WIP] Add/minio backend multipart #298

Merged
merged 11 commits into from
May 23, 2020

Conversation


@vsoch vsoch commented Apr 4, 2020

This is an extension of #297 that adds multipart upload, based on the example here. I've successfully been able to:

  • instantiate an s3 client for the internal and external minio
  • start a multipart upload
  • generate signed urls
  • return them to Singularity

However, some error is triggered by the scs-library-client in this function, such that it calls the _multipart_abort endpoint and the request does not finish. I have a feeling it has something to do with a parameter missing from the request. For example, here are the differences between a working PUT request (non-multipart) and a not-working single part:

This URL works (a single PUT request for minio, generated with presigned_put_url):

http://127.0.0.1:9000/sregistry/test/busybox%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=mama%2F20200404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200404T212826Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=d557a51a7607c0a4b50f4cd9b8dbb962efe5de795f6a22e99c0a50abcb5760a9

And note that the only way I am able to do this is by instantiating two minio clients: one with an internal url (inside the container the server is minio:9000), and one with an external url (from the client the server is 127.0.0.1:9000). That was tricky, but it seemed to work (the work in #297).
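For context, the two-client pattern looks roughly like this with minio-py 5.x (a minimal sketch; the credentials and the object key are placeholders, not the PR's exact code):

from datetime import timedelta
from minio import Minio

# internal client: talks to the server over the docker network (bucket ops)
minioClient = Minio(
    "minio:9000", access_key="minio", secret_key="minio123", secure=False
)

# external client: used only to presign URLs that the outside client will call
minioExternalClient = Minio(
    "127.0.0.1:9000", access_key="minio", secret_key="minio123", secure=False
)

storage = "test/busybox:sha256.92278..."  # hypothetical object key
url = minioExternalClient.presigned_put_object(
    "sregistry", storage, expires=timedelta(minutes=5)
)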

This next URL is generated based on the example here, and I again needed to create an s3 client to handle the internal request that starts the multipart upload:

# https://github.com/boto/boto3/blob/develop/boto3/session.py#L185
s3 = session.client(
    "s3",
    use_ssl=MINIO_SSL,
    endpoint_url=MINIO_HTTP_PREFIX + MINIO_SERVER,
    region_name=MINIO_REGION,
)

...

s3.create_multipart_upload(Bucket=MINIO_BUCKET, Key=storage)

And then a separate client (s3_external) to generate URLs with the upload id generated from above:

s3_external = session.client(
    "s3",
    use_ssl=MINIO_SSL,
    region_name=MINIO_REGION,
    endpoint_url=MINIO_HTTP_PREFIX + MINIO_EXTERNAL_SERVER,
    verify=False,
)

...

# looping through part ids:
signed_url = s3_external.generate_presigned_url(
    ClientMethod="upload_part",
    Params={
        "Bucket": MINIO_BUCKET,
        "Key": storage,
        "UploadId": upload_id,
        "PartNumber": part_number,
    },
    ExpiresIn=MINIO_SIGNED_URL_EXPIRE_MINUTES,
)

The above generates a url that looks like this, and it does not work; the Singularity client aborts and calls _multipart_abort:

http://127.0.0.1:9000/sregistry/test/busybox%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874?partNumber=1&uploadId=8ae1018b-a3f4-422a-8c0e-8aaff4569f8b&Signature=e1hqn6V8NxMw2TV3ctY1SkCUfJo%3D&AWSAccessKeyId=mama&Expires=1586035466

I was just able to enable debug output for Singularity, and then use the mc client for minio to determine the next issue to address:

127.0.0.1 [REQUEST s3.PutObjectPart] 22:59:38.025
127.0.0.1 PUT /sregistry/test/big:sha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874?uploadId=a1071852-3407-4c2b-9444-6790cfafae51&partNumber=1&Expires=1586041182&Signature=2dN0tY%2F0esPKVAAA%2B%2F1584I0qqQ%3D&AWSAccessKeyId=minio
127.0.0.1 Host: 127.0.0.1:9000
127.0.0.1 Content-Length: 928
127.0.0.1 User-Agent: Go-http-client/1.1
127.0.0.1 X-Amz-Content-Sha256: 2fc597b42f249400d24a12904033454931eb3624e8c048fe825c360d9c1e61bf
127.0.0.1 Accept-Encoding: gzip
127.0.0.1 <BODY>
127.0.0.1 [RESPONSE] [22:59:38.025] [ Duration 670µs  ↑ 68 B  ↓ 806 B ]
127.0.0.1 403 Forbidden
127.0.0.1 X-Xss-Protection: 1; mode=block
127.0.0.1 Accept-Ranges: bytes
127.0.0.1 Content-Length: 549
127.0.0.1 Content-Security-Policy: block-all-mixed-content
127.0.0.1 Content-Type: application/xml
127.0.0.1 Server: MinIO/RELEASE.2020-04-02T21-34-49Z
127.0.0.1 Vary: Origin
127.0.0.1 X-Amz-Request-Id: 1602C00C5749AF1F
127.0.0.1 <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><Key>test/big:sha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Key><BucketName>sregistry</BucketName><Resource>/sregistry/test/big:sha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Resource><RequestId>1602C00C5749AF1F</RequestId><HostId>e9ba6dec-55a9-4add-a56b-dd42a817edf2</HostId></Error>
127.0.0.1 
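For reference, enabling that debug output looked roughly like this (hedged: "myminio" stands in for whatever alias mc was configured with, and the push target is illustrative):

$ singularity --debug push -U big.sif library://test/big:latest
$ mc admin trace -v myminio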

Likely I need to look into making sure the correct signature version / method is being used; I'm not sure what the default is, but I don't think it's the one minio expects.


vsoch commented Apr 5, 2020

Okay, here is an update after working on this today. I was able to at least get a signature (and the associated headers) into the signed url by adding a config like this:

from botocore.client import Config
...
s3_external = session.client(
    "s3",
    use_ssl=MINIO_SSL,
    region_name=MINIO_REGION,
    endpoint_url=MINIO_HTTP_PREFIX + MINIO_EXTERNAL_SERVER,
    verify=False,
    config=Config(signature_version='s3v4'),
)

and from this comment I think the (still present) signature mismatch is because s3v4 uses the host to validate, which is different inside the container (minio:9000) than outside it (127.0.0.1:9000).

  • this page shows usage of the generate_presigned_url function, but I don't think this is the issue, unless we can somehow customize the Host header that is used?
  • this shows inputs for generate_presigned_url, but I don't see anything about setting the host. Interestingly, the returned metadata does include a "HostID", which is an empty string.
  • these are the s3 authentication types, and per the Minio documentation I think it just supports s3v4 and s3v2.
  • this shows the signer that generates the URL; I'm wondering if we need to interact with it directly, with some customization of the input params, to get around not being able to set the correct hostname?
  • this gist has a good example, and I posted asking for help (no response yet).

So my thinking is that one of the following needs to be done:

  • the request to create the signed URL is modified to use the external host header - I had thought that creating the s3_external client would accomplish this, but I think the issue is that the request needs to be made with the regular s3 client, to ping the minio server from inside the container.
  • the signature version is modified to not use the host as a variable (and this would also require modifying the minio server to use a version other than s3v4).
  • some change to how minio is deployed so it's accessed from inside and outside the container via the same hostname (if anyone has ideas, this would probably be best to try first - see the sketch after this list).
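A quick sketch of that third option, for anyone who wants to try it (assuming the compose service is named minio and its port 9000 is published to the host):

# on the docker host, make the internal hostname resolve locally too
$ echo "127.0.0.1 minio" | sudo tee -a /etc/hosts

Then a single client constructed against minio:9000 could be used for both the internal requests and the presigned URLs, and the signed host header would match.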


vsoch commented Apr 5, 2020

And here are full logs from minio:

minio 
minio [REQUEST s3.HeadBucket] 22:23:56.991
minio HEAD /sregistry/
minio Host: minio:9000
minio X-Amz-Date: 20200405T222356Z
minio Accept-Encoding: identity
minio Authorization: AWS4-HMAC-SHA256 Credential=minio/20200405/us-east-1/s3/aws4_request, SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date, Signature=0ea18b5f4f1a98901dd99c33cc38a164e0018c4282e78499d7d5b04e56b9d2cd
minio Content-Length: 0
minio User-Agent: MinIO (Linux; x86_64) minio-py/5.0.8
minio X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
minio 
minio [RESPONSE] [22:23:56.991] [ Duration 171µs  ↑ 93 B  ↓ 218 B ]
minio 200 OK
minio Vary: Origin
minio X-Amz-Request-Id: 16030CAE690AF085
minio X-Xss-Protection: 1; mode=block
minio Accept-Ranges: bytes
minio Content-Length: 0
minio Content-Security-Policy: block-all-mixed-content
minio Server: MinIO/RELEASE.2020-04-04T05-39-31Z
minio 
minio 
minio [REQUEST s3.NewMultipartUpload] 22:23:57.186
minio POST /sregistry/test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874?uploads
minio Host: minio:9000
minio X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
minio X-Amz-Date: 20200405T222357Z
minio Accept-Encoding: identity
minio Authorization: AWS4-HMAC-SHA256 Credential=minio/20200405/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=3481893c75d10f99a20c1f4b181971ff845c59a9235076d9576ed71c2a88ea92
minio Content-Length: 0
minio User-Agent: Boto3/1.12.36 Python/3.5.7 Linux/5.3.0-42-generic Botocore/1.15.36
minio 
minio [RESPONSE] [22:23:57.186] [ Duration 490µs  ↑ 93 B  ↓ 580 B ]
minio 200 OK
minio Server: MinIO/RELEASE.2020-04-04T05-39-31Z
minio Vary: Origin
minio X-Amz-Request-Id: 16030CAE74AF4D9B
minio X-Xss-Protection: 1; mode=block
minio Accept-Ranges: bytes
minio Content-Length: 330
minio Content-Security-Policy: block-all-mixed-content
minio Content-Type: application/xml
minio <?xml version="1.0" encoding="UTF-8"?>
<InitiateMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>sregistry</Bucket><Key>test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Key><UploadId>b6d2946e-c2d3-400d-b2a4-897d421de5f6</UploadId></InitiateMultipartUploadResult>
minio 
127.0.0.1 [REQUEST s3.PutObjectPart] 22:23:57.809
127.0.0.1 PUT /sregistry/test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874?partNumber=1&uploadId=b6d2946e-c2d3-400d-b2a4-897d421de5f6&X-Amz-Date=20200405T222357Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-SignedHeaders=host&X-Amz-Credential=minio%2F20200405%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Expires=5&X-Amz-Signature=95e243bd0bb9d0103f3cfecd8d60285544289dbfd4afd4ea7fb3e18e3605fede
127.0.0.1 Host: 127.0.0.1:9000
127.0.0.1 Accept-Encoding: gzip
127.0.0.1 Content-Length: 928
127.0.0.1 User-Agent: Go-http-client/1.1
127.0.0.1 X-Amz-Content-Sha256: 2fc597b42f249400d24a12904033454931eb3624e8c048fe825c360d9c1e61bf
127.0.0.1 <BODY>
127.0.0.1 [RESPONSE] [22:23:57.809] [ Duration 2.014ms  ↑ 68 B  ↓ 818 B ]
127.0.0.1 403 Forbidden
127.0.0.1 Server: MinIO/RELEASE.2020-04-04T05-39-31Z
127.0.0.1 Vary: Origin
127.0.0.1 X-Amz-Request-Id: 16030CAE99BC2C2E
127.0.0.1 X-Xss-Protection: 1; mode=block
127.0.0.1 Accept-Ranges: bytes
127.0.0.1 Content-Length: 561
127.0.0.1 Content-Security-Policy: block-all-mixed-content
127.0.0.1 Content-Type: application/xml
127.0.0.1 <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><Key>test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Key><BucketName>sregistry</BucketName><Resource>/sregistry/test/chonker%3Asha256.92278b7c046c0acf0952b3e1663b8abb819c260e8a96705bad90833d87ca0874</Resource><RequestId>16030CAE99BC2C2E</RequestId><HostId>7f978b0a-2482-4918-94e4-a7ed69e11a59</HostId></Error>
127.0.0.1 

The first ones at the top are from when the sregistry uwsgi container is started up / restarted; it just creates a MinioClient (which is successfully used for a PUT pre-signed URL, without the multipart bit). Then for multipart we need to use the s3 clients (those are s3 and s3_external in minio.py), and those are the following requests. You can see the upload request being created by the s3 client, and then the PUT URL being generated by s3_external.

@harshavardhana

s3v4 requires preserving the host header, i.e. using the host when generating the pre-signed signature - it is there to avoid MITM attacks on the payload.


vsoch commented Apr 6, 2020

@harshavardhana what would you suggest for generating the signed URLs inside the docker container (minio:9000), where the hostname is different from the external client (outside the container, 127.0.0.1:9000). I tried adding the hostname "minio" to point to 127.0.0.1 in /etc/hosts so I could use the s3 client internally for all calls, but I got the same error. Could it be something else, or is there another header or setting that I'm missing?

Do you work at Minio? Is there an equivalent function to generate signed urls, but for a multipart upload? I noticed that the put_object request will use multipart above some size threshold, but that would be called directly from the client making the request. We'd need to generate the URLs in the same fashion as the single PUT request.
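For context, in boto3 that client-side threshold is exposed via TransferConfig; a hedged sketch, noting this happens in the uploading client itself, which is exactly what we can't control here:

from boto3.s3.transfer import TransferConfig

# illustrative threshold; uploads above it are split into parts automatically
config = TransferConfig(multipart_threshold=64 * 1024 * 1024)
s3.upload_file("big.sif", MINIO_BUCKET, storage, Config=config)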

@harshavardhana

@vsoch depends on who is going to access the URL to upload/download

  • if it's an external client which only has visibility of the external IP of the setup, then it should be the external IP.
  • if it's an internal client on the same network, then you can use the internal IP.

To use rotating credentials you can use the AssumeRole mechanism, which allows using the AWS SDKs directly, without generating pre-signed URLs and without sharing the static credentials.

Your server component should generate these temporary credentials and pass them to the client:

https://github.com/minio/minio/blob/master/docs/sts/assume-role.md
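For reference, with boto3 against MinIO that flow looks roughly like this (a sketch based on the linked docs; the endpoint and credentials are placeholders, and the docs show a placeholder RoleArn since MinIO does not map it to a real IAM role):

import boto3

# server side: exchange the static credentials for temporary ones
sts = boto3.client(
    "sts",
    endpoint_url="http://minio:9000",
    aws_access_key_id="minio",
    aws_secret_access_key="minio123",
)
response = sts.assume_role(
    RoleArn="arn:xxx:xxx:xxx:xxxx",
    RoleSessionName="sregistry-push",
    DurationSeconds=3600,
)
creds = response["Credentials"]

# the temporary credentials (including the session token) are what the
# server would hand to the client
s3_temp = boto3.client(
    "s3",
    endpoint_url="http://127.0.0.1:9000",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)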


vsoch commented Apr 6, 2020

if it's an external client which only has visibility of the external IP of the setup, then it should be the external IP.

@harshavardhana it's an external client, and I do create separate clients (s3 for making the internal request to start the multipart upload, and s3_external to generate the pre signed urls). See here for creating the clients and here for the requests.

Your server component should generate these temporary credentials and pass them to the client

I could pass this to the client, but I have no control over the client implementation - specifically it's using the scs-library-client so I have no way of changing those calls. Are you saying there is some way of wrapping this with the generation of pre-signed urls to just change how the authentication is done?

@harshavardhana

I could pass this to the client, but I have no control over the client implementation - specifically it's using the scs-library-client so I have no way of changing those calls. Are you saying there is some way of wrapping this with the generation of pre-signed urls to just change how the authentication is done?

No, there is no such thing; either you use pre-signed URLs or AssumeRole, you have to choose.

@harshavardhana it's an external client, and I do create separate clients (s3 for making the internal request to start the multipart upload, and s3_external to generate the pre signed urls). See here for creating the clients and here for the requests.

If it's an external client then you need to use the external IP. Keeping separate constructors for both is the correct approach; you can, of course, share the connection pooling by passing in the same urllib3 pool manager.


vsoch commented Apr 6, 2020

Okay, I'll need to run more details by you tomorrow then, because (per our conversation now) I've done everything right thus far, but there is still a signature mismatch! I think it might be related to the (internal) client making the start request: the HostID returned in the metadata is empty. I don't know if the UploadID passed on to the s3_external client then creates a request that tries to match the requesting HostID to the one that started the request (and then gets an empty string vs. what looks like an md5 of something). I think the trick might be figuring out how to define the HostID for the start multipart request, so that it matches for the s3_external signed url request that gets checked against the start request with the UploadID?


vsoch commented Apr 6, 2020

Another thing I'll try tomorrow is to see what the "correct" headers look like for a working (single) PUT request here, and then maybe I can use those same signing functions (with the correct headers) to update the signature generated for my current (not working) pre-signed urls.

@harshavardhana

Okay, I'll need to run more details by you tomorrow then, because (per our conversation now) I've done everything right thus far, but there is still a signature mismatch! I think it might be related to the (internal) client making the start request: the HostID returned in the metadata is empty. I don't know if the UploadID passed on to the s3_external client then creates a request that tries to match the requesting HostID to the one that started the request (and then gets an empty string vs. what looks like an md5 of something). I think the trick might be figuring out how to define the HostID for the start multipart request, so that it matches for the s3_external signed url request that gets checked against the start request with the UploadID?

It is not just about generating the correct presigned URL. Let's say you generated a presigned URL for an external client: an internal client cannot use it. Likewise, an internal client's URL cannot be used by an external client. Both will see a signature mismatch due to the host header mismatch.

HostID is not useful in the response; we can ignore that for now. That's just some unique ID, and it is largely not that meaningful in MinIO.


vsoch commented Apr 6, 2020

I am generating it with the external client (but from the internal host), and of course using it from the external client. As a sanity check, the same strategy with a minioClient and minioExternalClient generated with minio-py works like a charm for a single presigned PUT URL! It's the multipart support for presigned URLs that is missing. This is why I thought it might be useful to print out the headers that are generated by minio-py for the working single PUT, in case I need to add them for signature generation.


vsoch commented Apr 6, 2020

hey @harshavardhana! I joined the Minio slack and posted a bunch of information - let me know if / when you and/or your team are around so we can debug together! I'll keep trying things in the meantime :)


vsoch commented Apr 6, 2020

Okay, I've just pushed the (now working!) solution for pre-signed multipart uploads. The issue was that the presign_v4 function was using an unsigned payload hash string, so I needed to copy the function and pass in the sha256 that is sent by my calling client. Finally, I've added the CompleteMultipartUpload view to finalize everything.
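For anyone following along, here is a self-contained sketch of the idea (not the exact PR code, which copies and adapts minio-py's presign_v4; all names below are illustrative):

import hashlib
import hmac
from datetime import datetime
from urllib.parse import quote

def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def presign_v4_with_payload_hash(method, host, path, params, access_key,
                                 secret_key, region, payload_sha256,
                                 expires=300):
    # Sketch of SigV4 query-string presigning. The one functional change
    # versus the stock presigner: payload_sha256 (the hash the calling
    # client will send as X-Amz-Content-Sha256) is signed, rather than
    # the "UNSIGNED-PAYLOAD" constant.
    amz_date = datetime.utcnow().strftime("%Y%m%dT%H%M%SZ")
    scope = "/".join([amz_date[:8], region, "s3", "aws4_request"])
    params = dict(params)
    params.update({
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": access_key + "/" + scope,
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    })
    # canonical query string: keys sorted, keys and values URL-encoded
    query = "&".join(quote(k, safe="") + "=" + quote(str(v), safe="")
                     for k, v in sorted(params.items()))
    canonical_request = "\n".join([
        method, quote(path), query,
        "host:" + host + "\n",   # canonical headers (host only)
        "host",                  # signed headers
        payload_sha256,          # <-- the fix: a real hash, not UNSIGNED-PAYLOAD
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    signing_key = _hmac(_hmac(_hmac(_hmac(
        ("AWS4" + secret_key).encode("utf-8"),
        amz_date[:8]), region), "s3"), "aws4_request")
    signature = hmac.new(signing_key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    # http is assumed here for the local setup
    return "http://{0}{1}?{2}&X-Amz-Signature={3}".format(
        host, quote(path), query, signature)

And finalizing is then a single call from the new CompleteMultipartUpload view (again a sketch; parts would hold the ETags the client reports back for each part number):

s3.complete_multipart_upload(
    Bucket=MINIO_BUCKET,
    Key=storage,
    UploadId=upload_id,
    MultipartUpload={
        "Parts": [
            {"ETag": etag, "PartNumber": number} for number, etag in parts
        ]
    },
)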

@vsoch force-pushed the add/minio-backend-multipart branch from d82b1a4 to d742018 on April 6, 2020 23:13

vsoch commented Apr 7, 2020

@RonaldEnsing @Aneoshun @lmcdasm @asmi10 you've all had recent issues with uploading large images or scaling, and I want to invite you to review this PR if you have the chance! Specifically, I've added a minio storage backend, which can serve the Sylabs library endpoints and also supports multipart upload. I suspect that this would also allow for a custom setup where a center deploys a local Minio instance and is then able to use the same endpoints for the same registry! I would like to have a few testers give this a spin before merging - I'm fairly confident that this is a much better implementation than the current one, but I want others to look for a sanity check.

For those of you that have existing registries: do not test on top of your existing registry! Since minio is a new storage endpoint, I was careful not to have it use the previous folder bound to the host (images), but I still wouldn't want to take any risk of containers being recreated, at least for the time being. Looking forward to hearing what you think! I had a great time working on this... Minio is awesome, and there are so many ways you can further configure it! -> https://github.com/minio/minio/tree/master/docs/config


@Aneoshun Aneoshun left a comment


Hi @vsoch

My sincere apologies for the delay; the last few weeks have been more than hectic. It is only now that I finally have time to test your incredible PR.

In short: it solves all my issues.
The upload/download speed seems better, and the memory footprint during upload is marginal, regardless of the size of the image (I can upload a 3GB image on an SRegistry server with only 2GB of memory; that was simply impossible before). I have screenshots of before/after memory usage, and the difference is obvious.

I have tested with two containers (400MB and 3GB). Note that I have not tested SSL connections.

Alright, now a few things that I have noticed:

  1. During the build of the container I see this error; maybe it is fixable:
YAML retrying sregistry uwsgi coreschema simplejson openapi-codec user-agents backcall xmlsec spython googleapis-common-protos wrapt
ERROR: awscli 1.18.66 has requirement rsa<=3.5.0,>=3.1.2, but you'll have rsa 4.0 which is incompatible.
  2. It seems that the download button for the uploaded containers has disappeared from the web UI (I can only see the "freeze" button). I don't know if it is like this on purpose or not.

  3. For some reason, after uploading 1 container into an empty collection, the collection list (on the web UI) tells me that I have 3 builds. When I click on the collection to see the container list, I only see one container (the one I just uploaded).

  4. After reading the documentation, I am a bit confused about the usage of the images and minio-images folders. It looks like I cannot start docker-compose if both of these folders exist. However, only minio-images seems to be used (as explained in the docs), while images is just filled with empty folders named ./0 to ./9. I don't know if this is normal.

Thanks a lot @vsoch for your work. I am really looking forward to merging this PR and using it daily.


vsoch commented May 23, 2020

My sincere apologies for the delay; the last few weeks have been more than hectic. It is only now that I finally have time to test your incredible PR.

No worries! Times certainly aren't normal right now.

In short: it solves all my issues. The upload/download speed seems better, and the memory footprint during upload is marginal, regardless of the size of the image (I can upload a 3GB image on an SRegistry server with only 2GB of memory; that was simply impossible before). I have screenshots of before/after memory usage, and the difference is obvious.

This is so great to hear! I was hoping this would be the case.

During the build of the container I see this error; maybe it is fixable:

YAML retrying sregistry uwsgi coreschema simplejson openapi-codec user-agents backcall xmlsec spython googleapis-common-protos wrapt
ERROR: awscli 1.18.66 has requirement rsa<=3.5.0,>=3.1.2, but you'll have rsa 4.0 which is incompatible.

It looks like there is heated discussion about this upstream, and it's relatively new, so we should stay on the lookout for an update. I've subscribed to that issue and opened #300 here to track it; I think that's probably the most I can do. If you try it out and there is an issue with https, then we can downgrade.
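(And if a downgrade does turn out to be necessary, it should just be a matter of pinning the version awscli asks for in the build, e.g. pip install "rsa>=3.1.2,<=3.5.0".)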

It seems that the download button for the uploaded containers has disappeared from the web UI (I can only see the "freeze" button). I don't know if it is like this on purpose or not.

Since we now use minio, which requires a more robust auth flow, I thought it was best to remove this.

For some reason, after uploading 1 container into an empty collection, the collection list (on the web UI) tells me that I have 3 builds. When I click on the collection to see the container list, I only see one container (the one I just uploaded).

I'm not able to reproduce this. Could it be that you have a postgres container that was used previously and not updated (e.g., you pulled this branch into a previous install and didn't remove and re-create the database)?

After reading the documentation, I am a bit confused about the usage of the images and minio-images folders. It looks like I cannot start docker-compose if both of these folders exist. However, only minio-images seems to be used (as explained in the docs), while images is just filled with empty folders named ./0 to ./9. I don't know if this is normal.

This exists purely to still provide some sort of fallback support. If a registry somehow needs to exist with both, I decided to use a different name so there isn't a conflict. If you use minio, you can safely ignore the images folder; it isn't doing any harm being there.

Thanks a lot @vsoch for your work. I am really looking forward to merging this PR and using it daily.

Definitely! I'll push some of the documentation changes soon for you to take a look at. And to debug your 3-containers-per-collection issue: maybe shell into the server and do a collection.containers.count()? And try removing and re-creating the postgres image?

Signed-off-by: vsoch <vsochat@stanford.edu>

vsoch commented May 23, 2020

Edits -> 2ebed73

@Aneoshun

I'm not able to reproduce this. Could it be that you have a postgres container that was used previously and not updated (e.g., you pulled this branch into a previous install and didn't remove and re-create the database)?

I cloned this PR into a new folder, stopped my running SRegistry instance, built the new SRegistry containers, and started everything from scratch from there. My previous instance has probably hundreds of containers, so if it was a DB collision, then I would expect to see a lot more ghost containers.

Definitely! I'll push some of the documentation changes soon for you to take a look at. And to debug your 3-containers-per-collection issue: maybe shell into the server and do a collection.containers.count()? And try removing and re-creating the postgres image?

Could you walk me through this a bit more? Which container of the compose setup shall I shell into? db? From there, in a bash shell, where shall I invoke collection.containers.count()?


vsoch commented May 23, 2020

Could you walk me through this a bit more? Which container of the compose setup shall I shell into? db? From there, in a bash shell, where shall I invoke collection.containers.count()?

Definitely! It's actually just the uwsgi container that you want. So if you are currently stopped, start the containers:

$ docker-compose up -d
$ docker-compose ps
        Name                       Command               State              Ports           
--------------------------------------------------------------------------------------------
sregistry_db_1          docker-entrypoint.sh postgres    Up      5432/tcp                   
sregistry_minio_1       /usr/bin/docker-entrypoint ...   Up      0.0.0.0:9000->9000/tcp     
sregistry_nginx_1       nginx -g daemon off;             Up      443/tcp, 0.0.0.0:80->80/tcp
sregistry_redis_1       docker-entrypoint.sh redis ...   Up      6379/tcp                   
sregistry_scheduler_1   python /code/manage.py rqs ...   Up      3031/tcp                   
sregistry_uwsgi_1       /bin/sh -c /code/run_uwsgi.sh    Up      3031/tcp                   
sregistry_worker_1      python /code/manage.py rqw ...   Up      3031/tcp  

Now we want to shell into the uwsgi container. The worker would give us equivalent access, but I typically use uwsgi.

$ docker exec -it sregistry_uwsgi_1 bash

This puts us in the root of the container, /code, which is actually mounted from your repository. Be careful making changes from within the container, because it will change permissions on your host, which you typically don't want.

Next, django has a lovely interface via the manage.py module right there. You can easily get different kinds of shells, show urls, and interact with a lot of custom functions exposed by modules installed in the registry. Just try doing:

python manage.py --help

to check it out. Of course we want an interactive Python shell into the application so we can do:

python manage.py shell

Once we are in python, you can import your models, and query them!

from shub.apps.main.models import *

Container.objects.count()
# 1

Collection.objects.count()
# 1

for collection in Collection.objects.all():
    print(collection)
# collection:1

Collection.objects.filter(name="collection")
# <QuerySet [<Collection: collection:1>]>

And of course each collection is related to its containers:

collection.containers.count()
# 1

and you can get attributes:

collection.private
# False

collection.name
# 'collection'

collection.owners.first()
# <User: vsoch>

So we would be interested in the count of containers in the collection; if you see 3, loop over them and print the names, tags, etc.

@Aneoshun

Thanks! So count() outputs 4, while I only have 2 containers. Here are the names of the containers:

<QuerySet [<Container: gitlab-ci/airl_env:DUMMY-2f3832c0-5c9d-4b08-9c65-d769db45f9ba>, <Container: gitlab-ci/airl_env:DUMMY-304812db-c526-40af-bc05-d61368aba5c3>, <Container: gitlab-ci/airl_env:pytorch>, <Container: gitlab-ci/airl_env:base_ci>]>

As you can see, I have 2 DUMMY containers. I don't know why. They appear in the counts in the UI, but they are not listed with the other containers.

@Aneoshun

Ahh! I might know what happened. After cloning the repo, I copied the config.py file from my original registry (to keep all my settings), which obviously led to an error because of the missing MinIO configuration. Then I had a second issue because I forgot to set the URL of the server here. Because of those two errors, I made two failed attempts at pushing a container. Could it be that the DUMMY containers are created at that point, but not removed?


vsoch commented May 23, 2020

That’s what the tag suggests! :)


vsoch commented May 23, 2020

There is actually a task that handles cleaning these up; I didn't find a way to do it easily at the moment it happens: https://github.com/singularityhub/sregistry/blob/a6b0c0d9d9ac67d240ebc295ef610af96d3e84fc/shub/apps/api/management/commands/cleanup_dummy.py. You should be able to run this with manage.py too:

python manage.py cleanup_dummy

Does that resolve all the issues then? Ready for merge?

@Aneoshun

Yes, cleanup_dummy removed them. Thanks a lot! I think it is ready to merge!

@vsoch vsoch merged commit 401c1b8 into master May 23, 2020
@vsoch vsoch deleted the add/minio-backend-multipart branch May 26, 2020 16:12