Support Requester Pays S3 buckets #346
Comments
Sorry, I have no experience with requester pays buckets. It seems like a flag would work. A config key might be a little more dangerous as users could potentially make large numbers of requests while forgetting about the profile setting.
Some more details about my ideas for how to do this, and the sort of existing practice I'm looking for… Looking at their source code, it seems that e.g. Tim Kay's aws has a However because samtools, htslib, and htslib's libcurl plugin are all separate, it would be inconvenient to have this as a samtools/bcftools/other-client-software command-line option that would need to be implemented separately for each client program and communicated to htslib and thence to the plugin. What would be more workable for us would be to encode this flag into the URL (which is of course already the main parameter in the interface to the plugin) and/or into the S3 configuration files that the plugin consults. There's a few places in an S3 pseudo-URL where such a flag could be bolted on:
We already have schemes like It appears that aws/s3cmd also add the requester-pays header based on their configuration files: if ~/.awsrc contains We also read ~/.aws/credentials and (when I get around to writing the documentation!) will recommend this as the best config file for storing this stuff. So it would be good if there were a key for setting requester-pays in this standard config file. However I don't know if there is such a key, and I don't know where ~/.aws/credentials is documented other than this AWS security blog posting. As @DonFreed noted, putting this in your configuration file is a little dangerous, but we could recommend a setup like the following:
With that, So what we're looking for is:
* EDITED*
OK, I still think this sort of thing is out of scope for htslib, but I am having trouble (surprise) compiling in the libcurl bindings to get HTTPS support. Does anyone have good and complete instructions for OS X with Homebrew?
OK, was able to compile htslib with good libcurl support. Confirmed that it can take a presigned URL to view files:
import boto3
client = boto3.client('s3')
url = client.generate_presigned_url("get_object", Params={"Bucket": "angel-reqpay", "Key": "test.cram", "RequestPayer": "requester"})
print("./htsfile -h '{0}'".format(url))
Assuming that a local directory version of
$ python ~/Desktop/s3-presigned-samtools.py
./htsfile -h 'https://angel-reqpay.s3.amazonaws.com/test.cram?AWSAccessKeyId=XXXXXXXXXXXXXXXXX&x-amz-request-payer=requester&Expires=1458309035&Signature=XXXXXXXXXXXXX'
$ ./htsfile -h 'https://angel-reqpay.s3.amazonaws.com/test.cram?AWSAccessKeyId=XXXXXXXXXXXXXXXXX&x-amz-request-payer=requester&Expires=1458309035&Signature=XXXXXXXXXXXXX' | head -4
@HD VN:1.4 GO:none SO:coordinate
@SQ SN:chr1 LN:248956422 AS:GRCh38 M5:6aef897c3d6ff0c78aff06ac189178dd UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa SP:Human
@SQ SN:chr2 LN:242193529 AS:GRCh38 M5:f98db672eb0993dcfdabafe2a882905c UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa SP:Human
@SQ SN:chr3 LN:198295559 AS:GRCh38 M5:76635a41ea913a405ded820447d067b0 UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa SP:Human
Since this is my bucket, I can't be 100% sure that my signature is disabling the requester pays option, so I would appreciate it if someone else confirms that the above works.
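A quick sanity check on the generated URL can be sketched as follows. It only confirms that the `x-amz-request-payer` query parameter is present in the presigned URL shown above; it says nothing about who actually gets billed. The helper name `request_payer_param` is made up for this illustration.

```python
from urllib.parse import urlsplit, parse_qs

def request_payer_param(url):
    """Return the value of the x-amz-request-payer query parameter, or None."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Compare parameter names case-insensitively as a precaution.
    for name, values in query.items():
        if name.lower() == "x-amz-request-payer":
            return values[0]
    return None

# The presigned URL from the comment above, with the redacted fields kept as-is.
url = ("https://angel-reqpay.s3.amazonaws.com/test.cram"
       "?AWSAccessKeyId=XXXXXXXXXXXXXXXXX&x-amz-request-payer=requester"
       "&Expires=1458309035&Signature=XXXXXXXXXXXXX")
print(request_payer_param(url))
```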
Also the |
I gave the script generated by that URL a try and it seemed to work? I'm not 100% sure where I would check that I got billed rather than you though?
You would get billed, but it would be like $0.002 cents at most.
Okay, the htsfile version of the command works. However I've realised there's a problem with using this in practice. When you do for example:
The .crai is going to be a separate file and thus another URL. Am I correct in assuming that there is either no test.crai in your bucket or that the .crai being a different URL would require its own generated pre-signed URL?
So samtools will look for a local index file. You can use boto to download The point is that any system is going to have weird protocols. Perhaps that |
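Since the index is a separate object, one possible workaround is to presign both objects in one go. The sketch below assumes a boto3 S3 client (or anything with a compatible `generate_presigned_url` method); the function name `presign_with_index` and the default index key (`key + ".crai"`) are assumptions for illustration, not something htslib or samtools defines.

```python
def presign_with_index(client, bucket, key, index_key=None, expires_in=3600):
    """Return presigned GET URLs for a data file and its index.

    `client` is expected to behave like boto3.client("s3").
    """
    if index_key is None:
        # Assumption: the index sits next to the data file as "<key>.crai".
        index_key = key + ".crai"
    urls = {}
    for name, object_key in (("data", key), ("index", index_key)):
        urls[name] = client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": object_key,
                    "RequestPayer": "requester"},
            ExpiresIn=expires_in,
        )
    return urls

# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   urls = presign_with_index(boto3.client("s3"), "angel-reqpay", "test.cram")
#   print(urls["data"], urls["index"])
```

Each URL carries its own signature, so the index URL cannot be derived from the data URL by editing the path. Whether and how a given samtools version accepts a separate index URL is not covered here.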
Requester Pays buckets need an extra
X-Amz-Request-Payer: requester
header. Clearly it would not be appropriate for htslib/samtools to set it all the time (as it represents explicit acknowledgement from the user that they will be charged). So we could set it if some flag was present in the URL or perhaps via an extra config file key on the profile used. @DonFreed or anyone else: are you aware of any existing practice in this area?
[As reported at biostars.org.]
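To make the config-key idea concrete, here is a minimal sketch of the opt-in logic. The key name `requester_pays` and the profile name `paying` are hypothetical; I'm not aware of an established key for this, which is exactly the question above. The header is only emitted when the selected profile explicitly asks for it.

```python
import configparser

REQUEST_PAYER_HEADER = "x-amz-request-payer"

def extra_headers(config_text, profile, key="requester_pays"):
    """Return the requester-pays header only if the profile opts in.

    `key` is a hypothetical config key, not one known to be
    recognised by existing AWS tools.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if parser.has_section(profile) and parser.getboolean(profile, key, fallback=False):
        return {REQUEST_PAYER_HEADER: "requester"}
    return {}

# Sample credentials-style file; values are placeholders.
sample = """
[default]
aws_access_key_id = EXAMPLEKEY

[paying]
aws_access_key_id = EXAMPLEKEY
requester_pays = true
"""

print(extra_headers(sample, "default"))
print(extra_headers(sample, "paying"))
```

Keeping the key out of the default profile means the user has to select the paying profile deliberately, which is one way to address the concern about forgetting that the setting is on.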