[sda-download] Implement random access in encrypted files #696
Comments
Assuming we want to avoid reencrypting possibly large amounts of data, this should use the support intended for this in the crypt4gh file format. In short, each file/data stream is split into 64 KiB (65536-byte) blocks that are encrypted separately. A block is also the smallest unit for decryption, since MACs are created per block. This means that to send logical bytes 65535-65536 (zero-based), one would need to send the reencrypted header and the first two data blocks (65536 bytes each, plus the extra bytes crypt4gh adds per block). As the receiver only wants those two bytes, there would also need to be a data edit list in the header instructing it to throw away bytes 0-65534 and 65537-131071. So the header reencryption service needs to be able to accept a data edit list to put in the header. Currently, I think there's only the chacha20_ietf_poly1305 cipher, so a fixed encrypted block size of 65564 bytes can be used, but it might make sense to have a function in the crypt4gh library that takes a header and responds with the block size (or similar).
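To make the arithmetic above concrete, here is a minimal sketch of mapping a requested plaintext range onto encrypted blocks and a data edit list. It assumes the chacha20_ietf_poly1305 layout mentioned above (65536 plaintext bytes per block, 65564 bytes once encrypted) and assumes the edit list is a sequence of alternating discard/keep lengths starting with a discard. The function name `plan_range` and the returned keys are hypothetical, not part of any existing library.

```python
SEGMENT_SIZE = 65536            # plaintext bytes per block
ENCRYPTED_SEGMENT_SIZE = 65564  # encrypted block size quoted above


def plan_range(start: int, end: int) -> dict:
    """Plan a partial download for plaintext bytes start..end (inclusive, zero-based).

    Hypothetical helper: returns the blocks to send, the byte range of those
    blocks within the encrypted data section (offsets exclude the header),
    and an edit list of alternating discard/keep lengths.
    """
    if start < 0 or end < start:
        raise ValueError("invalid range")

    first_block = start // SEGMENT_SIZE
    last_block = end // SEGMENT_SIZE

    # Whole encrypted blocks must be sent, since the MAC covers a full block.
    body_start = first_block * ENCRYPTED_SEGMENT_SIZE
    body_end = (last_block + 1) * ENCRYPTED_SEGMENT_SIZE - 1  # inclusive

    # Offsets in the edit list are relative to the plaintext of the blocks sent.
    discard = start - first_block * SEGMENT_SIZE
    keep = end - start + 1

    return {
        "first_block": first_block,
        "last_block": last_block,
        "body_range": (body_start, body_end),
        "edit_list": [discard, keep],  # anything after the kept bytes is dropped
    }


plan = plan_range(65535, 65536)
print(plan)
```

For the example in the comment (bytes 65535-65536), this selects blocks 0 and 1 and an edit list that discards 65535 bytes and keeps 2. A real implementation would also need to clamp `body_range` to the actual size of the archived object, because the last block of a file is usually shorter.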
For both the unencrypted and the encrypted output case, there is also a performance motive: rather than requesting the entire object from the archive and returning only the wanted part, we should request only the range actually needed. For the encrypted case, this is fairly simple - the s3 download client could pass a … The question would be whether we would prefer unified handling for the unencrypted and encrypted cases. For the unencrypted case, it might make sense to have a reader that maps calls to …
When decrypting a partial file, the resulting file size should be exactly what was originally asked for, not more. I.e., the extra data passed along to reach the next data block boundary should be removed. Use …
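As an illustration of the trimming described above, the following sketch cuts the decrypted plaintext of the covering blocks down to exactly the requested bytes. It assumes 65536-byte plaintext blocks and that `decrypted` starts at the beginning of the block containing `start`; the name `trim_to_request` is hypothetical and not an existing API.

```python
SEGMENT_SIZE = 65536  # plaintext bytes per block


def trim_to_request(decrypted: bytes, start: int, end: int) -> bytes:
    """Return only plaintext bytes start..end (inclusive, zero-based).

    `decrypted` is assumed to hold the plaintext of every block covering the
    range, beginning at the block that contains `start`.
    """
    if start < 0 or end < start:
        raise ValueError("invalid range")

    first_block = start // SEGMENT_SIZE
    offset = start - first_block * SEGMENT_SIZE  # padding before the range
    length = end - start + 1                     # size the client asked for

    result = decrypted[offset:offset + length]
    if len(result) != length:
        raise ValueError("decrypted data does not cover the requested range")
    return result


# Two full blocks of synthetic plaintext, then the two bytes from the example.
plaintext = bytes(i % 251 for i in range(2 * SEGMENT_SIZE))
part = trim_to_request(plaintext, 65535, 65536)
print(len(part))
```

The length check is a design choice for the sketch: it fails loudly instead of silently returning a short result when too few blocks were fetched.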
As an sda-user
I want to be able to download specific parts of an encrypted file
In order to be able to get only the region I am interested in
The service currently allows downloading specific byte ranges of unencrypted files, but for encrypted files that is only possible for byte ranges that start at the beginning of the file. We need to support arbitrary byte ranges of encrypted files in order to support the htsget case.
A/C