Getting credentials passed via IAM roles fails on beefy instance types #16
Comments
Can't reproduce the issue. s3am already works on toil-box. Will try on a toil cluster next.
Works on a cluster too. It appears that s3am already works with credentials coming from IAM roles. This is actually a Boto feature and as such should apply to Toil, too. @jvivian Why do you need a .boto on toil nodes? Please reply and assign back to me.
I'll try to reproduce this tomorrow.
Here is how I reproduced this error:

1. Spawned a toil cluster via cgcloud (key_config contains master.key and config.txt):
2. Applied the bind-mount fix for Docker:
3. rsynced the toil and launch scripts over to /home/mesosbox/ on the master:
4. Prayed to the dark lord Cthulhu and launched the pipeline.

When I got to the S3AM_upload step in the pipeline, it failed with the following series of errors: Copied over my .boto file to /home/mesosbox/ and /home/ubuntu/ and reran the pipeline with: Ran
It might be the same problem as in https://groups.google.com/forum/#!topic/boto-users/bq0tMxNbjCg, which describes an intermittent problem. From the traceback in the pastebin I can tell that the authentication must have succeeded during the initial stages of s3am, since the failure is happening during a part upload. @jvivian, I think you are running too many instances of s3am, or that each instance has too many children. s3am already parallelizes transfers using as many children as there are cores. With IAM roles, each child needs to obtain the credentials by requesting the EC2 metadata endpoint via HTTP. If you run many s3am instances with many children each, that might simply overload the metadata endpoint or hit some throttling. Reduce the number of children using s3am's
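For context, each child resolves role credentials from the metadata endpoint roughly like this (a minimal sketch, not s3am's or boto's actual code; the injectable `fetch` hook and the retry/backoff policy are assumptions added for illustration, though the `169.254.169.254` metadata URL is the real one):

```python
import json
import time
import urllib.request

# Real EC2 instance-metadata path for IAM role credentials.
METADATA_URL = ("http://169.254.169.254/latest/meta-data/"
                "iam/security-credentials/")


def fetch_role_credentials(fetch=None, retries=5, delay=0.1):
    """Fetch temporary credentials for the instance's IAM role.

    Retries with exponential backoff, since the metadata endpoint
    may throttle when many child processes query it concurrently.
    `fetch` is injectable so the logic can be exercised off-EC2.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=2) as response:
                return response.read().decode()
    for attempt in range(retries):
        try:
            role = fetch(METADATA_URL).strip()          # role name
            return json.loads(fetch(METADATA_URL + role))  # creds JSON
        except OSError:
            time.sleep(delay * 2 ** attempt)
    raise RuntimeError("metadata endpoint unavailable after retries")
```

With 32 children per instance, every part upload multiplies these requests, which is consistent with the throttling hypothesis above.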
S3AM was called only once via subprocess, and this was a single master/slave setup, so there were no other parallel calls being made.
I see. I will try to repro on a c3.8xlarge, where s3am will use 32 concurrent part uploads, each incurring a metadata endpoint request. Maybe 32 is enough to cause problems. If that's the case, I may want to extract the credentials from the parent process' boto connection and somehow inject them into the child processes.
This would allow s3am to be run on EC2 without a .boto.
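One way to do that injection (a hypothetical sketch, not s3am's eventual fix) is to hand the parent's already-resolved credentials to each child through the standard AWS environment variables, so children never touch the metadata endpoint themselves:

```python
import os
import subprocess


def run_child_with_creds(cmd, access_key, secret_key, token=None):
    """Launch a child process with the parent's resolved credentials.

    The child inherits them via the conventional AWS environment
    variables instead of querying the EC2 metadata endpoint itself.
    The AWS_SECURITY_TOKEN name for the session token is what boto 2
    reads; newer SDKs use AWS_SESSION_TOKEN.
    """
    env = dict(os.environ,
               AWS_ACCESS_KEY_ID=access_key,
               AWS_SECRET_ACCESS_KEY=secret_key)
    if token is not None:
        env["AWS_SECURITY_TOKEN"] = token
    return subprocess.run(cmd, env=env)
```

One caveat: IAM role credentials are temporary, so long-running children would lose the parent's automatic refresh and could fail once the session expires, which may be why simply copying a .boto file around was the workaround tried above.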