
Add --no-overwrite option to aws s3 cp/mv #2874

Open
alexjurkiewicz opened this issue Oct 6, 2017 · 64 comments
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue s3

Comments

@alexjurkiewicz

It would be nice to have a convenience flag, --no-overwrite, for the aws s3 cp/mv commands, which would check that the destination key doesn't already exist before putting a file into an S3 bucket.

Of course this logic couldn't be guaranteed by the AWS API (afaik...) and is vulnerable to race conditions, etc. But it would be helpful to prevent unintentional mistakes!

@kyleknap
Contributor

kyleknap commented Oct 6, 2017

Marking as a feature request. The tricky part of doing this in cp or mv is that the CLI would have to query S3 to see if the object exists before trying to upload it. So it may make more sense to add it to sync, which already does that.
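The HEAD-then-PUT check described here can be sketched roughly as follows. This is a hypothetical illustration using a boto3-style client, not how the CLI implements anything; the client is passed in so the logic can be exercised without AWS access.

```python
def copy_if_absent(client, bucket, key, body):
    """Upload `body` to s3://bucket/key only if the key doesn't exist yet.

    Returns True if the upload happened, False if the key was present.
    There is a race window between the HEAD and the PUT: another writer
    can create the key in between, which is why a client-side check
    alone cannot guarantee no-overwrite semantics.
    """
    try:
        client.head_object(Bucket=bucket, Key=key)
        return False  # key already exists; skip the upload
    except Exception:
        # A real implementation would catch botocore.exceptions.ClientError
        # and treat only a 404 as "not found"; this sketch is simplified.
        pass
    client.put_object(Bucket=bucket, Key=key, Body=body)
    return True
```

With a real client from `boto3.client("s3")` the same function works unchanged, modulo the simplified error handling.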

@kyleknap kyleknap added feature-request A feature should be added or improved. s3 labels Oct 6, 2017
@sgrimm-sg

I'd like to see this in cp and/or mv as well.

The reason I don't use sync for this right now is that sync has major performance problems if the destination bucket has a lot of existing files under the target directory.

When you run aws s3 cp --recursive newdir s3://bucket/parentdir/, it only visits each of the files it's actually copying.

When you run aws s3 sync newdir s3://bucket/parentdir/, it visits the files it's copying, but also walks the entire list of files in s3://bucket/parentdir (which may already contain thousands or millions of files) and gets metadata for each existing file.

On a sufficiently large destination bucket, aws s3 cp --recursive can take seconds and aws s3 sync can take hours to copy the same data.

Obviously fixing sync would be nice, but if adding a "check to see if the file already exists" query to cp is a more tractable problem than revamping the sync code to make it fast, it might make sense to do that instead.

@smaslennikov

I'm also very interested in this feature. An optional interactive prompt for overwriting files would also be nice to have.

@shabeebk

Yes @sgrimm-sg, that makes sense. I'm also interested in seeing a cp command in the CLI that can handle these conditions.

@jhoblitt

It would be extremely useful for this to be an option on aws s3 sync. rsync has this functionality available as --ignore-existing. My preference would be to use the same option names as rsync, since I suspect a lot of folks are already familiar with it.

@ASayre
Contributor

ASayre commented Feb 6, 2018

Good Morning!

We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI.

This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.

As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.

We’ve imported existing feature requests from GitHub - Search for this issue there!

And don't worry, this issue will still exist on GitHub for posterity's sake. As it’s a text-only import of the original post into UserVoice, we’ll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.

GitHub will remain the channel for reporting bugs.

Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface

-The AWS SDKs & Tools Team

This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168406-add-no-overwrite-option-to-aws-s3-cp-mv

@kenorb

kenorb commented Feb 13, 2018

Related:

@jamesls
Member

jamesls commented Apr 6, 2018

Based on community feedback, we have decided to return feature requests to GitHub issues.

@guyisra

guyisra commented Oct 21, 2018

@jamesls that's great! Can you please respond to the suggestion at hand? --no-overwrite would be a great addition, and it would avoid having to wrap the calls in scripts.

@evanstucker-hates-2fa

+1 to this issue. I propose -n, --no-clobber to match existing Linux cp command options.

@CaptainPalapa

Has there been any implementation of this request? For Windows batch files that back up local files to S3, a simple --no-overwrite or similar flag would be the easiest method.

@adiii717

Any update regarding this feature?

@julio75012

Any update regarding this feature ? Thanks

@noelnamai

Any update regarding this feature?

@avanier

avanier commented May 2, 2019

Any update regarding this feature?

@pgolebiowski

sev3, +1

@southpaw5271

Really need this feature added as S3 sync does not seem to upload every file.

@mehmetfazil

Any updates or workarounds?

@southpaw5271

Any updates or workarounds?

I had to write a python script to load all of the items in the bucket into an array (list), then load all the items from the directory I want to sync, then compare the arrays and upload the local items not in the S3 array.
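A rough reconstruction of the workaround described above; all names are hypothetical, and the remote listing (which would come from boto3's list_objects_v2 paginator) is elided so the core comparison stays self-contained:

```python
def keys_to_upload(local_keys, remote_keys):
    """Return the local keys not yet present in the bucket, in order.

    local_keys: relative paths of files under the directory to sync,
    e.g. gathered with os.walk(). remote_keys: keys already in the
    bucket, e.g. collected via client.get_paginator("list_objects_v2").
    """
    remote = set(remote_keys)  # set membership makes the comparison O(1) per key
    return [key for key in local_keys if key not in remote]
```

Each returned key would then be uploaded individually, which reproduces no-overwrite behavior at the cost of one full bucket listing up front.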

@kevb

kevb commented Jul 4, 2019

> I had to write a python script to load all of the items in the bucket into an array (list), then load all the items from the directory I want to sync, then compare the arrays and upload the local items not in the S3 array.

@southpaw5271 - care to share your script and save me some time ? ; )

@southpaw5271

> I had to write a python script to load all of the items in the bucket into an array (list), then load all the items from the directory I want to sync, then compare the arrays and upload the local items not in the S3 array.
>
> @southpaw5271 - care to share your script and save me some time ? ; )

I don't seem to have it anymore :( Sorry!

@mpdude

mpdude commented Aug 28, 2019

This flag would also be valuable for the cp command, since sync does not allow copying a file under a different destination name:

aws s3 cp --no-overwrite ./somefile s3://bucket/othername

@RobElEmYew

We also need the --no-overwrite option from s3 to local. We've been burned by accidental overwrites from well-meaning individuals, and this would be a very much appreciated way to put up a "guardrail" for them. Thanks!

@EralpB

EralpB commented Jan 13, 2020

any update?

@jfstephe

@kdaily - any news on this?

@kdaily kdaily self-assigned this Nov 11, 2021
@kdaily kdaily added the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Nov 11, 2021
@kdaily
Member

kdaily commented Nov 17, 2021

@jfstephe, unfortunately not. I'll keep checking!

@kdaily kdaily removed the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Nov 17, 2021
@kdaily kdaily linked a pull request Nov 17, 2021 that will close this issue
@mmmvvvppp

@kdaily Any news on this MR? Thanks

@jfstephe

jfstephe commented Oct 17, 2022

FYI, I developed this with a colleague. It's an S3-only way of ensuring that only one person will be updating an S3 file at any one time. It's JavaScript-based. I'm not suggesting this is better than AWS supporting the requested feature, or better than using a database, but for us it was the best option for now, and it may prove useful for others.

https://github.com/jfstephe/aws-s3-lock

@tim-finnigan tim-finnigan added the p2 This is a standard priority issue label Nov 14, 2022
@MaheshB0ngani

Really a needed feature. Still waiting for a solution to this.

@Xyncgas

Xyncgas commented Dec 23, 2022

Another year. Object storage doesn't support mutexes, browsers don't support PWAs, WebAssembly has no 2.0; you can be convinced by now that they are xenophobic.

@laxika

laxika commented Feb 19, 2023

Is this really something that cannot be developed in under 5 years?

@shqear93

it's been 5 years, and no progress

@marty1885

Not to bother people, but I'll provide a use case for this.

There was a timing bug in our build service, which ended up making our servers build the latest code into packages while believing it was the previous version. The worst part is that it was a release build. That glitch caused an impostor version to be uploaded to our S3 storage, and people downloaded it. A --no-overwrite flag would have avoided the catastrophe.

Please add the flag.

@Xyncgas

Xyncgas commented Apr 11, 2023

It costs Amazon money to support this, because no-overwrite means there is going to be a single-threaded bottleneck checking whether the object exists.

Essentially, the request requires the server to determine whether something exists without requests from other origins causing race conditions.

But they should eat this cost instead of having us route all the requests through our own server/worker to implement this check manually; for once the company should offer this tiny feature.

S3 is object storage, and its selling point is infinite scalability; Amazon's refusal to offer this shows a rather poor posture from the tech giant.

Meanwhile, Amazon's pricing really says a lot about making things cheaper and fairer for both players, although other platforms might be catching up with their unlimited-request and egress features too.

@laxika

laxika commented Apr 12, 2023

@Xyncgas

> Essentially, the request itself requires the server to compute whether something exists, without requests from other origin causing racing problem

This should be clearly stated in the docs. I think most of the time this is not an issue (at least for a good number of the use cases).

> But they should eat this cost, instead of having us routing all the request first to our own server / worker to implement this check manually, for once the company should offer this tiny feature

I would happily eat the cost myself. The dev time spent implementing this manually is 1000 times more than I'll ever spend on the request costs.

@askdesigners

absolutely pitch-perfect AWS

@mplattu

mplattu commented Apr 25, 2023

Has anyone come up with a x-ish one-liner to overcome this?

@jfstephe

See comment above re: https://github.com/jfstephe/aws-s3-lock . May help?

@bradisbell

This can be done with the If-None-Match: * HTTP request header.

@four43

four43 commented Feb 29, 2024

This can be done with the If-None-Match: * HTTP request header.

That's a great idea. This seems like it would be trivial to implement with a flag if that actually works...
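For reference, the If-None-Match approach mentioned above can be sketched like this. It's a hedged illustration, assuming a boto3 version recent enough to expose the IfNoneMatch parameter on put_object; the error handling is deliberately simplified, and the client is injected so the logic can be checked without AWS access. Unlike a HEAD-then-PUT sequence, the server itself rejects the write if the key exists, so there is no race window.

```python
def put_if_absent(client, bucket, key, body):
    """Atomically create the object, failing if the key already exists.

    S3 returns 412 Precondition Failed when If-None-Match: * matches an
    existing object. Returns True on create, False if the key existed.
    """
    try:
        client.put_object(Bucket=bucket, Key=key, Body=body, IfNoneMatch="*")
        return True
    except Exception as exc:
        # With botocore this would be a ClientError carrying the error
        # code "PreconditionFailed"; the string check below keeps this
        # sketch independent of botocore.
        if "PreconditionFailed" in str(exc) or "412" in str(exc):
            return False
        raise
```

This is the server-side guarantee the original feature request couldn't assume existed; a --no-overwrite flag built on it would be safe even under concurrent writers.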

Projects
Status: Ready for Review
Development

Successfully merging a pull request may close this issue.