aws s3 sync --exclude doesn't work consistently #541

Closed
sajee opened this Issue Dec 9, 2013 · 9 comments

sajee commented Dec 9, 2013

On OSX 10.8.5, I'm trying to exclude .DS_Store but --exclude doesn't seem to work as expected.

$  aws --version
aws-cli/1.2.7 Python/2.7.2 Darwin/12.5.0

$ ls -Ra
./         ../        .DS_Store  a/         test.txt

./a:
./         ../        .DS_Store

$ aws s3 sync  ./  s3://sajee-sync/  --exclude "*.DS_Store"   --dryrun
(dryrun) upload: ./test.txt to s3://sajee-sync/test.txt

$ cd a
$ aws s3 sync  ../  s3://sajee-sync/  --exclude "*.DS_Store"   --dryrun
(dryrun) upload: ../.DS_Store to s3://sajee-sync/.DS_Store
(dryrun) upload: ../test.txt to s3://sajee-sync/test.txt

Why is ../.DS_Store getting uploaded?

Instead of --exclude "*.DS_Store", I also tried --exclude .DS_Store, --exclude *.DS_Store, and --exclude ".DS_Store", but all produce the same results.


sellers commented Dec 11, 2013

==EDITED by author===

I believe that --include does not work: files end up included even when they do not match the filter. I have not read the spec yet, but if you are trying to sync from S3 down to a local file system and use --include, it does not behave as expected. Notice in the examples below how the "did not match" case is identified, but "should_include:" is still set to True.

ex:

2013-12-11 18:32:15,643 - awscli.customizations.s3.filters - DEBUG - atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131119T061701 did not match include filter: atlas-config.ec2.arbor.net/20131210T
2013-12-11 18:32:15,643 - awscli.customizations.s3.filters - DEBUG - =atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131119T061701 final filtered status, should_include: True
2013-12-11 18:32:15,644 - awscli.customizations.s3.filters - DEBUG - /var/tmp/feed-web-1/access-20131119T061701 did not match include filter: /var/tmp/20131210T
2013-12-11 18:32:15,644 - awscli.customizations.s3.filters - DEBUG - =/var/tmp/feed-web-1/access-20131119T061701 final filtered status, should_include: True

2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131122T071702 did not match include filter: atlas-config.ec2.arbor.net/20131210T
2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - =atlas-config.ec2.arbor.net/logs/feed-web-1/access-20131122T071702 final filtered status, should_include: True
2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - /var/tmp/feed-web-1/access-20131122T081702 did not match include filter: /var/tmp/20131210T
2013-12-11 18:49:08,615 - awscli.customizations.s3.filters - DEBUG - =/var/tmp/feed-web-1/access-20131122T081702 final filtered status, should_include: True

For the good of others: this is actually by design. The conditional in filters.py only changes a file's status when a pattern matches: a matching --include sets it to True, a matching --exclude sets it to False, and the default is True. As a result, you MUST use --exclude="*" --include="*pattern*" in order to match just the files you want.

Otherwise, filters.py would have to be forked/modified to handle the explicit use case of --include w/o --exclude.

As to why --exclude does not work in the above, what happens if you put --debug after sync in your line? It will show the printout of what filters.py is doing.
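
The evaluation order described above can be sketched in a few lines of Python. This is a simplified model, not the actual filters.py: the helper name `should_include`, the list of (type, pattern) pairs, the wildcard pattern, and the second example path are all assumptions made for illustration, and only the standard-library fnmatch module is used.

```python
import fnmatch


def should_include(path, patterns):
    """Hypothetical helper modelling the behaviour described in this thread.

    `patterns` is a list of ("include" | "exclude", glob) pairs in
    command-line order.
    """
    included = True  # default: every file is included
    for kind, pattern in patterns:
        if fnmatch.fnmatchcase(path, pattern):
            # Only a match changes the status, so the last match wins.
            included = (kind == "include")
    return included


log_path = "/var/tmp/feed-web-1/access-20131119T061701"
wanted_path = "/var/tmp/20131210T-example"  # made-up path for illustration
include_glob = "/var/tmp/20131210T*"        # illustrative pattern

# --include alone: the pattern does not match, so the default (True) stands.
include_only = should_include(log_path, [("include", include_glob)])

# --exclude "*" first, then --include: only matching files survive.
both = [("exclude", "*"), ("include", include_glob)]
filtered_out = should_include(log_path, both)
kept = should_include(wanted_path, both)

print(include_only, filtered_out, kept)
```

Under this model, an --include that fails to match never flips anything to False, which is consistent with the "did not match include filter ... should_include: True" lines in the log.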


sajee commented Dec 12, 2013

Here's the problem:

2013-12-12 15:52:14,482 - awscli.customizations.s3.filters - DEBUG - /Temp/s3-sync-test/.DS_Store did not match exclude filter: /Temp/s3-sync-test/a/.DS_Store
2013-12-12 15:52:14,482 - awscli.customizations.s3.filters - DEBUG - =/Temp/s3-sync-test/.DS_Store final filtered status, should_include: True
2013-12-12 15:52:14,482 - botocore.service - DEBUG - Creating operation objects for: Service(s3)

A specific path to .DS_Store is being used so any .DS_Store that doesn't match that path will be included. How do I exclude any files that match .DS_Store?
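
The mismatch in the debug lines above can be reproduced with Python's standard fnmatch module, used here as a stand-in for the comparison being logged (an assumption; the snippet does not call any awscli code). It is a sketch of why a pattern anchored at the a/ subdirectory cannot match a file one level up, while a wildcard pattern anchored at the sync source would. The `source_root` anchoring is a hypothetical alternative, not a description of what any release does.

```python
import fnmatch
import posixpath

file_path = "/Temp/s3-sync-test/.DS_Store"

# Pattern exactly as it appears in the debug log: anchored at a/.
logged_pattern = "/Temp/s3-sync-test/a/.DS_Store"
matches_logged = fnmatch.fnmatchcase(file_path, logged_pattern)

# Hypothetical alternative: anchor the user's pattern at the sync
# source directory (the parent of a/) instead.
source_root = "/Temp/s3-sync-test"
anchored_pattern = posixpath.join(source_root, "*.DS_Store")
matches_anchored = fnmatch.fnmatchcase(file_path, anchored_pattern)
matches_nested = fnmatch.fnmatchcase(
    "/Temp/s3-sync-test/a/.DS_Store", anchored_pattern
)

print(matches_logged, matches_anchored, matches_nested)
```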


sellers commented Dec 12, 2013

Yes, pathing is important; a prefix of * may be needed if the file is not in the root of the bucket/container.

aws s3 sync ./ s3://sajee-sync/ --debug --exclude "*/.DS_Store" --dryrun

I'm not sure why *.DS_Store didn't work for you, but I'd like to see what the above shows. Also, can you run aws --version for us too?


sajee commented Dec 13, 2013

Same result. "*/.DS_Store" didn't make a difference.

The aws --version is up top in my original post.

jamesls (Member) commented Dec 16, 2013

Looks like you're running into this issue: #548
which is fixed in #554 and will go out in the next release.

jamesls closed this Dec 16, 2013


thedukeness commented Oct 7, 2015

I'm still receiving this error as well
aws-cli/1.8.9 Python/2.6.6 Linux/2.6.32-573.7.1.el6.x86_64


monsur commented Dec 31, 2015

I am still experiencing this issue as well. --exclude patterns that are exact matches (e.g. '.DS_Store') do not work. I must have a wildcard in order for the --exclude to work.


gregsadetsky commented Jun 4, 2016

Same here. I've had to specifically write --exclude "*.DS_Store*" for the matching to work. Thanks
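
One plausible reading of why a bare name behaves differently from a wildcard pattern, sketched with Python's fnmatch module. It assumes the pattern is joined to the source directory before matching, as the debug output earlier in this thread suggests; the directory, file names, and the helper `excluded` are made up for illustration.

```python
import fnmatch

# Hypothetical file listing under a sync source directory.
root = "/data/site"
files = [
    "/data/site/.DS_Store",
    "/data/site/img/.DS_Store",
    "/data/site/index.html",
]


def excluded(pattern):
    # Assumption: the pattern is joined to the source root before matching.
    full_pattern = root + "/" + pattern
    return [f for f in files if fnmatch.fnmatchcase(f, full_pattern)]


exact = excluded(".DS_Store")      # only the file directly under root
wild = excluded("*.DS_Store")      # fnmatch's * also matches "/"
wider = excluded("*.DS_Store*")    # same result for this listing

print(exact)
print(wild)
print(wider)
```

In this model the exact name only covers the top-level file, while the leading * lets the pattern reach into subdirectories because fnmatch does not treat "/" specially.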


ericpeters0n commented Jul 13, 2016

This either does not seem to behave logically, or is not well-documented... I can't tell which.
