Different behaviors of s3cmd sync with both remotes #1001

Open
icy opened this issue Sep 14, 2018 · 5 comments

Comments

icy commented Sep 14, 2018

I'm using s3cmd version 2.0.0 and have noticed inconsistent behavior in s3cmd sync when both the source and the destination are remote. Consider the following command:

s3cmd sync s3://foo/bucket/some/file s3://bar/bucket/some/file

I expect this command to copy the file s3://foo/bucket/some/file to s3://bar/bucket/some/file. However:

  1. If the destination file doesn't exist, a new file s3://bar/bucket/some/file is created, as expected.
  2. If the destination file already exists, a completely new file s3://bar/bucket/some/filefile is created instead (note: filefile).

The cp command doesn't have this problem.

Also note that neither s3cmd cp nor s3cmd sync compares checksums when the destination is given as a file: the file is always copied, which is expensive.

Is this documented somewhere, or am I missing something?

Thanks a lot.

Author

icy commented Sep 14, 2018

NB: s3cmd sync s3://foo/bucket/some/file s3://bar/bucket/some/ (with a trailing slash on the destination) works correctly: the file is not copied if the checksums match.

Contributor

fviard commented Sep 14, 2018 via email

Author

icy commented Sep 14, 2018

Dear @fviard, I have installed s3cmd from the master branch (by running python setup.py install), and I think the problem is still there:

# s3cmd --version
s3cmd version 2.0.2

# s3cmd ls s3://sync0--example/
2018-09-14 08:38          717  s3://sync0--example/Makefile

# s3cmd sync s3://example/Makefile s3://sync0--example/Makefile

Summary: 1 source files to copy, 0 files at destination to delete
remote copy: 's3://example/Makefile' -> 's3://sync0--example/MakefileMakefile'
Done. Copied 1 files in 1.0 seconds, 1.00 files/s.

# s3cmd ls s3://sync0--example/
2018-09-14 08:38          717  s3://sync0--example/Makefile
2018-09-14 08:51          717  s3://sync0--example/MakefileMakefile

Contributor

fviard commented Sep 14, 2018

Hmm, I see. I realize now that this is a remote-to-remote case.
Your issue is probably related to this one:
#850

I think the root cause is that, when preparing the job, we don't know whether either side is a file or a folder.

Author

icy commented Sep 14, 2018

I see. I think this comes from a quirk of S3, which allows a file and a folder to have the same name. On S3 all three of the following can exist at the same time:

s3://sync0--example/Makefile  # A file
s3://sync0--example/Makefile/ # A folder
s3://sync0--example/Makefile/Makefile # A file

This is very different from the traditional *nix file system.
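
To illustrate the point above, here is a minimal sketch that models a bucket as a flat map from key strings to contents. The `bucket` dict and the `list_prefix` helper are made up for this example and are not part of s3cmd or of any S3 client library; the only assumption is that S3 keys are plain strings and that "/" has no special meaning to the store itself.

```python
# Toy model of a bucket: a flat mapping from key to object body.
# Nothing here talks to S3; it only shows why the three names can coexist.
bucket = {
    "Makefile": b"contents of the file",
    "Makefile/": b"",  # the zero-byte "folder" placeholder
    "Makefile/Makefile": b"contents of another file",
}


def list_prefix(store, prefix):
    """Return all keys starting with prefix, sorted (hypothetical helper)."""
    return sorted(key for key in store if key.startswith(prefix))


# All three keys are distinct entries, so a name alone cannot tell a client
# whether "Makefile" is meant as a file or as a folder.
print(list_prefix(bucket, "Makefile"))
print(list_prefix(bucket, "Makefile/"))
```

On a *nix file system the first two entries could not exist side by side, which is why the same name is unambiguous there.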

In #850 it's suggested to use a trailing slash to specify a folder. If we can't know the remote properties (that's impossible, isn't it?), I think we could infer the intent from the client arguments instead.
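
A rough sketch of what I mean, under the assumption that a destination ending in "/" (or an empty key, i.e. the bucket root) means "folder" and anything else means "exact file name". `resolve_dest_key` is a hypothetical helper written for this comment; it is not s3cmd's actual code and I haven't checked how s3cmd builds the destination key internally.

```python
import posixpath


def resolve_dest_key(src_key: str, dst_key: str) -> str:
    """Pick the destination key from the arguments alone (hypothetical).

    A trailing slash, or an empty key, is read as "copy into this folder";
    any other destination is read as the exact target key.
    """
    if dst_key == "" or dst_key.endswith("/"):
        # Folder-like destination: keep the source file name.
        return dst_key + posixpath.basename(src_key)
    # File-like destination: use it verbatim, never append the source name.
    return dst_key


print(resolve_dest_key("bucket/some/file", "bucket/some/file"))
print(resolve_dest_key("bucket/some/file", "bucket/some/"))
```

With this rule both forms of the command from my first comment would end up at the same key, and no remote lookup is needed to decide.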
