Adding encryption to sync command #12

Closed · wants to merge 4 commits

3 participants
@firstclown

I added encryption to the sync command by storing extra metadata for encrypted files: the original file's MD5 and the original file's size. A HEAD call is now needed on every resource during a sync, but it shouldn't happen on a straight get or put.
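The comparison described above could be sketched roughly as follows. This is an illustration, not s3cmd's actual code: the `x-amz-meta-s3tools-orig_size` key and both function names are my own; only the general idea (HEAD the encrypted object, compare its stored plaintext MD5/size against the local file) comes from the description.

```python
import hashlib
import os


def local_fingerprint(path):
    """MD5 digest and size of the local plaintext file."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    return md5.hexdigest(), os.path.getsize(path)


def needs_sync(local_md5, local_size, remote_headers):
    """Decide whether to re-upload, given the headers a HEAD request on
    the encrypted remote object returned (metadata key names assumed)."""
    remote_md5 = remote_headers.get("x-amz-meta-s3tools-orig_md5")
    remote_size = remote_headers.get("x-amz-meta-s3tools-orig_size")
    if remote_md5 is None or remote_size is None:
        return True  # no plaintext metadata stored: re-upload to be safe
    return local_md5 != remote_md5 or local_size != int(remote_size)
```

Because the encrypted object's own ETag is the MD5 of the ciphertext, not the plaintext, some extra metadata like this is the only way to compare without downloading and decrypting.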

@mludvig

mludvig Nov 15, 2011

Contributor

Hi, thanks for your work. I'm keen to merge such functionality, however the overhead of calling HEAD every time seems too high. Many people run s3cmd on buckets with millions of files, and some others (me) run it on a remote South Pacific island with high latency to the S3 datacentres.

I would much prefer to store the attributes locally, for example in a .s3cmd.info file in every directory, which could be a Python pickle file, an sqlite3 database, or something like that. Alternatively, store the attributes in xattr on filesystems that support it (most current Linux/Unix filesystems do). That way s3cmd would need to make the HEAD calls only if the required attributes couldn't be found locally.
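The per-directory cache idea could be sketched like this. It is a minimal illustration under my own assumptions: the .s3cmd.info pickle format, the function names, and the `fetch_remote` callback are all hypothetical, not part of s3cmd.

```python
import os
import pickle

INFO_FILE = ".s3cmd.info"  # hypothetical per-directory attribute cache


def load_cache(directory):
    """Load the cached {filename: (orig_md5, orig_size)} map, if present."""
    try:
        with open(os.path.join(directory, INFO_FILE), "rb") as f:
            return pickle.load(f)
    except (OSError, pickle.UnpicklingError, EOFError):
        return {}


def save_cache(directory, cache):
    with open(os.path.join(directory, INFO_FILE), "wb") as f:
        pickle.dump(cache, f)


def get_attrs(directory, filename, fetch_remote):
    """Return cached attributes for a file; fall back to a HEAD request
    (the fetch_remote callback) only on a cache miss, then remember it."""
    cache = load_cache(directory)
    if filename not in cache:
        cache[filename] = fetch_remote(filename)
        save_cache(directory, cache)
    return cache[filename]
```

With a cache like this, a sync over a million unchanged files issues zero HEAD requests after the first run, which is the whole point of the suggestion.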

How does that sound? Are you ok to implement that?

Thanks!


@firstclown


firstclown Nov 15, 2011

I'll look at the idea. I haven't done much Python, but I'll dig into what would be needed for this.

I didn't like the HEAD call solution either, but I couldn't come up with a quick way of doing it otherwise. I like your ideas, though, and will look at putting them in. Feel free to not accept this solution, since it will cause the problems you mentioned.


@firstclown


firstclown Nov 15, 2011

I'm going to close this request and work on the new approach. I also shouldn't have been working in master anyway, so I'm going to refactor my git branches so this won't cause problems in the future. I'll re-request a pull when I'm finished.

@firstclown firstclown closed this Nov 15, 2011

@vsespb


vsespb Jun 9, 2013

I know this ticket is closed now.

But it looks like implementing it this way was not a good idea: the MD5 of the original, unencrypted file is stored in the x-amz-meta-s3tools-orig_md5 header.
IMHO it's a security problem.

http://stackoverflow.com/questions/2845986/does-having-an-unencrypted-sha-224-checksum-create-a-vulnerability

http://www.cs.jhu.edu/~astubble/dss/winzip.pdf
«Due to a security flaw in AE-1 (CRC of plaintext is included in unencrypted format in the output), it was replaced by AE-2 in WinZip 9.0 Beta 3.»
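The standard mitigation for this class of problem (a sketch of the general technique, not anything s3cmd implements) is to store a keyed MAC of the plaintext instead of a bare digest: an observer without the key learns nothing from the stored value, yet the key holder can still compare it against a local file for sync purposes.

```python
import hashlib
import hmac


def keyed_fingerprint(key, data):
    """HMAC-SHA256 of the plaintext. Unlike a plain MD5, the result is
    useless for confirming guesses about the file's content unless the
    attacker also holds the key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()
```

A plain digest of a low-entropy file (say, a short password list) can be brute-forced by hashing candidate plaintexts, which is exactly the concern the linked StackOverflow thread and the WinZip AE-1 flaw describe; the keyed variant closes that avenue.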

@vsespb vsespb referenced this pull request in vsespb/mt-aws-glacier Sep 27, 2013

Open

Encryption #43
