
aws s3 sync skips many files with error "File does not exist." #3514

Closed
Typel opened this issue Aug 21, 2018 · 19 comments
Labels
guidance Question that needs advice or information.

Comments

@Typel

Typel commented Aug 21, 2018

C:\Users\Typel>aws --version
aws-cli/1.15.77 Python/2.7.9 Windows/7 botocore/1.10.76

The aws s3 sync command skips many files with the warning "File does not exist." even though the files do exist (I checked).

I thought it might be caused by unconventional or long filenames, but even counting the full path they still come in well under the 1024 character limit imposed by S3 (longest was around 300 characters, though the filename itself was only 80). Furthermore, none of the skipped files have any strange characters in their names; they are basic alphanumeric. Some filenames did have spaces, some had a combination of dashes and/or underscores, and all had a 3-letter extension (such as .pdf).

I also compared file security settings and ownership between files that did and did not transfer from within the same folder - they are identical.

Steps to Reproduce:

  1. Have a file repository locally and an S3 bucket.
  2. aws s3 sync C:\resources\ s3://backup --delete --debug
  3. Wait for it... most files transfer just fine, but at the end...
  4. Get a whole bunch of warnings like this: warning: Skipping file C:\resources\director\National Data\Activity and Expense Report\2011\November\ATA-R Report 11-12-2011\ATA A-E Report 11-12-11 Parking Fees.pdf. File does not exist.

More Debug Info

Below is some of the debug messaging produced just before and just after the errors in question:

2018-08-21 17:17:19,565 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjects: calling handler <botocore.retryhandler.RetryHandler object at 0x00000000044F3B70>
2018-08-21 17:17:19,565 - MainThread - botocore.retryhandler - DEBUG - No retry needed.
2018-08-21 17:17:19,565 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjects: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x00000000044F3BA8>>
2018-08-21 17:17:19,565 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <function decode_list_object at 0x0000000003294588>
2018-08-21 17:17:19,565 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <function enhance_error_msg at 0x0000000003F6C588>
warning: Skipping file C:\resources\director\National Data\Activity and Expense Report\2011\November\ATA-R Report 11-12-2011\ATA A-E Report 11-12-11 Parking Fees.pdf. File does not exist.
...
2018-08-21 17:17:21,690 - Thread-1 - awscli.customizations.s3.results - DEBUG - Shutdown request received in result processing thread, shutting down result thread.

Please let me know if there is anything else I can/should provide. I would have pasted more of the log in here but Windows has a pretty short command line buffer. Is there maybe a way to redirect the --debug output to a file?

@justnance justnance self-assigned this Aug 30, 2018
@Ayase2e

Ayase2e commented Dec 12, 2018

Yes, same issue on Windows Server 2012 R2 with CLI version 1.16.

PS C:\> aws --version
aws-cli/1.16.72 Python/3.6.0 Windows/2012ServerR2 botocore/1.12.62

Tried sync file to s3 and got below error.

PS C:\> aws s3 sync test.conf s3://XXXX
warning: Skipping file C:\test.conf. File does not exist.

The cp command works, whereas the sync command says the file does not exist.

@justnance

@Typel - Thanks for reaching out. I am investigating this issue and trying to reproduce the same results.

Is there maybe a way to redirect the --debug output to a file?

Currently, the best way to capture a debug log to a file is to redirect stderr to a file (#642):

aws s3 sync C:\resources\ s3://backup --delete --debug 2> debug.log

This issue appears to be related to #1082. We have also seen the "File does not exist" warning when there are permission restrictions. Please reply and let us know if this is still an issue.

@justnance justnance added guidance Question that needs advice or information. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Jan 28, 2019
@Typel
Author

Typel commented Feb 7, 2019

Hello @justnance thanks for taking the time to look into this issue. I just went ahead and updated to the latest aws-cli version and ran the command you specified and then searched the debug log for the same error. In short, it is still occurring quite often. I'm thinking maybe it has to do with long folder/filenames since the affected files seem to be those with long names. Unfortunately, I can't just rename the files since we are using this for backing up an existing system as-is.

C:\Users\Typel>aws --version
aws-cli/1.16.96 Python/3.6.0 Windows/7 botocore/1.12.86

Some more examples of the error from the resulting log:

C:\Users\Typel>aws s3 sync C:\resources\ s3://backup --delete --debug 2> debug.log

...
warning: Skipping file C:\resources\safe\test-website-files\2019.01.15\public-html\wp-content\uploads\2018\10\benchmarking-and-market-leader-concept-manager-businessman-coach-leadership-draw-graph-with-three-lines-one-of-them-represent-the-best-company-in-competition-100x100.jpg. File does not exist.
warning: Skipping file C:\resources\safe\test-website-files\2019.01.15\public-html\wp-content\uploads\2018\10\benchmarking-and-market-leader-concept-manager-businessman-coach-leadership-draw-graph-with-three-lines-one-of-them-represent-the-best-company-in-competition-1024x575.jpg. File does not exist.
warning: Skipping file C:\resources\safe\test-website-files\2019.01.15\public-html\wp-content\uploads\2018\10\benchmarking-and-market-leader-concept-manager-businessman-coach-leadership-draw-graph-with-three-lines-one-of-them-represent-the-best-company-in-competition-150x150.jpg. File does not exist.
...
warning: Skipping file C:\resources\safe\test-website-files\2019.01.23\public-html\wp-content\uploads\2019\01\businessman-in-suit-with-two-hands-in-position-to-protect-something-focus-on-hand-blur-out-the-suit-it-indicates-many-aspects-such-as-car-insurance-coverage-support-assurance-reliability-768x512.jpg. File does not exist.
warning: Skipping file C:\resources\safe\test-website-files\2019.01.23\public-html\wp-content\uploads\2019\01\businessman-in-suit-with-two-hands-in-position-to-protect-something-focus-on-hand-blur-out-the-suit-it-indicates-many-aspects-such-as-car-insurance-coverage-support-assurance-reliability-848x300.jpg. File does not exist.
warning: Skipping file C:\resources\safe\test-website-files\2019.01.23\public-html\wp-content\uploads\2019\01\businessman-in-suit-with-two-hands-in-position-to-protect-something-focus-on-hand-blur-out-the-suit-it-indicates-many-aspects-such-as-car-insurance-coverage-support-assurance-reliability.jpg. File does not exist.
...

@justnance

@Typel - Thanks for the update and additional information. I'm looking into this one again and will post another update soon.

@justnance justnance added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Feb 18, 2019
@jdefontes

I've been getting the same error message with aws s3 cp, and I was able to get the error to go away by shortening the filenames in question. (Not sure if it was the length of the absolute path, or just the filename portion.)

@mdavis77

The same thing is happening to me. The sync was working yesterday, and now today every file is getting skipped. I copied and pasted the local path into windows explorer and the file is definitely there and opens. I had left the sync running during the night, and at some point it just stopped working and started skipping all the files.
It looks like it got about 60 subfolders in the main root folder before it stopped.

@mdavis77

Note: when I go to a different computer and run it, it works! I closed all command prompt windows and tried again; it still skips all files on this server. I cannot really reboot the server, since there are a lot of other things running on it. Is there any process I can kill? It almost seems like the AWS CLI has left something in memory.

@adrianrocamora

Hi, I'm getting the same error on one of my servers at the moment. I'll investigate further and come back

@ulver2812

ulver2812 commented Mar 10, 2019

Same problem for me: multiple 'File does not exist.' warnings on my Windows Server 2012 R2, plus a few 'File has an invalid timestamp. Passing epoch time as timestamp.' warnings.

UPDATE: had another warning, this time "File/Directory is not readable."

@mdavis77

I was able to reboot the server and the problem was resolved. Before the reboot, I also tried Cloudberry Explorer and it worked for a time, then started throwing Out of Memory errors instantly when I tried to copy a file. There were some registry changes suggested, perhaps some of those helped and didn't take effect until the reboot. All I know is, I did a sync after the reboot and it finished eventually.

@justnance justnance removed guidance Question that needs advice or information. investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Mar 11, 2019
@3ggaurav

Solution: the same 'File does not exist' warning was occurring for me as well.
aws s3 sync synchronizes folders only, not individual files; it expects both the source and the target to be folders (local folders or S3 URIs). If you put your file into a folder and then sync that folder to an S3 folder, it works fine. Thank you.
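Both workarounds can be sketched as follows (the bucket name `mybucket` is a placeholder, and `test.conf` stands in for the single file from the earlier comment):

```shell
# cp handles individual files directly:
aws s3 cp test.conf s3://mybucket/test.conf

# sync expects folders on both sides; to limit it to one file,
# exclude everything and then include only the file you want:
aws s3 sync . s3://mybucket/ --exclude "*" --include "test.conf"
```

Note that sync evaluates `--exclude`/`--include` filters in the order given, with later filters taking precedence, so the `--exclude "*"` must come first.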

@queglay

queglay commented Jun 15, 2019

I also have something like this problem. It appears the filter isn't strict enough, since I can't explain why it would even try to sync the files it complains about at all; it shouldn't.

aws s3 sync 's3://man.test.com/prod/tst/s3sync/upload/houdini/' '/aws_sydney_prod/tst/s3sync/upload/houdini/' --exclude '*' --include 'tst_s3sync_upload_wrk_uploadtest_v017.000_ag_test_sync.hip'
warning: Skipping file /aws_sydney_prod/tst/s3sync/upload/houdini/backup/tst_s3sync_upload_wrk_uploadtest_v013.000_ag_startover_bak5.hip. File does not exist.
warning: Skipping file /aws_sydney_prod/tst/s3sync/upload/houdini/backup/tst_s3sync_upload_wrk_uploadtest_v013.000_ag_startover_bak4.hip. File does not exist.
warning: Skipping file /aws_sydney_prod/tst/s3sync/upload/houdini/render/tst_s3sync_upload_wrk_uploadtest_v013.000_ag_startover.mantra_rop_1.1081.exr. File does not exist.

Above you can see it is complaining about files that shouldn't match the --include condition. I'd love any pointers on this. My goal is to use sync instead of cp because sync will only copy when there are changes, and usually in my case there will be many files, e.g. numbered 0001-9999.
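When filters behave unexpectedly, one low-risk first step (sketched below using the same paths as the command above) is to preview what sync would do without transferring anything:

```shell
# --dryrun prints the operations sync would perform without executing them,
# which makes it easier to see which files pass the --exclude/--include filters
aws s3 sync 's3://man.test.com/prod/tst/s3sync/upload/houdini/' \
    '/aws_sydney_prod/tst/s3sync/upload/houdini/' \
    --exclude '*' \
    --include 'tst_s3sync_upload_wrk_uploadtest_v017.000_ag_test_sync.hip' \
    --dryrun
```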

@Typel
Author

Typel commented Jun 19, 2019

It does appear that this (the original issue here) was caused by a 260-character path limitation imposed by Windows. Microsoft only recently lifted this limitation in modern versions, although it requires a registry hack to unlock.

Basically, as aws-cli syncs up to S3, if any of the local files are in folders with full paths longer than 260 characters, Windows acts as if the file just isn't there. Note that this limitation was being imposed locally since the S3 max path is 1024 characters.

For anyone still looking for a workaround - you will need to either upgrade the client machine to an operating system that can handle long paths (or unlock long paths in Windows 10), or rename folders so that all paths being synced are < 260 characters long. If my suspicion is correct and the issue is based on a limitation of the operating system, I think an outright fix is most likely outside the scope of this project.
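For reference, on Windows 10 version 1607 and later the long-path limit can be lifted via the documented LongPathsEnabled registry value; a sketch, to be run from an elevated command prompt (note that individual applications must still opt in to long-path support, so this alone may not change the CLI's behavior):

```shell
:: enable NTFS long-path support (Windows 10 1607+, run as Administrator)
reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" ^
    /v LongPathsEnabled /t REG_DWORD /d 1 /f
```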

@justnance justnance removed their assignment Aug 9, 2019
@ppulusu

ppulusu commented Nov 4, 2019

I have the same issue when executing the command on a Ubuntu machine.

@Typel
Author

Typel commented Dec 5, 2019

@ppulusu I believe the max filename length is 255 characters for most filesystems in Ubuntu, and a max path length of 4096 characters. S3 itself imposes a 1024-character limit (although I'm not certain whether that is path-inclusive or just for the filename itself). If you run a sync in debug mode you should be able to examine the resulting log and see if the offending files fall outside any of those limitations.
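As a rough way to pre-screen for this (the root directory and the 260-character threshold below are placeholders; substitute your actual sync source and whichever limit applies to your setup):

```shell
# print every file under the sync root whose full path exceeds 260 characters
find /apps/data -type f | awk 'length($0) > 260'
```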

@drbeka

drbeka commented May 28, 2020

Hi,
I have the same issue. Using

aws-cli/2.0.13 Python/3.7.3 Linux/2.6.32-754.28.1.el6.x86_64 botocore/2.0.0dev17

I get many warnings like

warning: Skipping file /apps/jenkins/tools/terraform. File does not exist.

The path names are not long and the folders exist, so I have no idea why I get this warning.

@kdaily kdaily added the needs-triage This issue or PR still needs to be triaged. label Aug 31, 2020
@sultonachmad

I have the same problem, and it is a Windows limitation: paths longer than about 260 characters (MAX_PATH) cannot be read.

@kdaily
Member

kdaily commented Sep 24, 2020

Hi all, given that the original error was related to filename length limits, this issue will be closed.

@drbeka, if you are still encountering this issue, can you please open a new issue and provide some more details?

Thanks!

@kdaily kdaily closed this as completed Sep 24, 2020
@kdaily kdaily added guidance Question that needs advice or information. and removed needs-triage This issue or PR still needs to be triaged. labels Sep 24, 2020
@duhaime

duhaime commented Sep 30, 2020

I hit this error because I was invoking the sync command incorrectly. I was using:

aws s3 sync ./bundle-11500000.tar.gz s3://duhaime/ip-data-lab/bundle-11500000.tar.gz

I moved all my .tar.gz files to ./bundles and uploaded them with:

aws s3 sync bundles s3://duhaime/ip-data-lab/bundles/
