
Skip existing files in file_copy(overwrite = FALSE) #213

Open
kstierhoff opened this issue Aug 12, 2019 · 7 comments
Labels
feature a feature request or enhancement

Comments

@kstierhoff

Is there a reason why file_copy(overwrite = FALSE) must throw an error when a file already exists? Is it possible to allow the function to only copy files that don't exist, similar to file.copy(overwrite = FALSE)? Thanks for considering.
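For anyone comparing the two behaviours, a minimal sketch of the difference (paths are illustrative, and file_copy's exact error message may vary by version):

```r
library(fs)

file_create(c("a.txt", "b.txt"))

# Base R: quietly skips the copy and returns FALSE
file.copy("a.txt", "b.txt", overwrite = FALSE)

# fs: throws an error because b.txt already exists
try(file_copy("a.txt", "b.txt", overwrite = FALSE))
```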

@jimhester
Member

Allowing this would complicate the API for limited benefit.

file.copy() not failing when files exist has caused numerous unintentional bugs; avoiding this is one of the reasons the fs package exists.

However, we could possibly add a fail argument, like the one that now exists for dir_ls(), which would issue a warning instead of an error when a file exists.
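In the meantime, the skip-if-exists behaviour is straightforward to layer on in user code; a minimal sketch (file_copy_if_missing is a hypothetical helper, not part of fs):

```r
library(fs)

# Hypothetical wrapper: copy only the files whose destination is missing.
# file_exists() and file_copy() are both vectorised over paths.
file_copy_if_missing <- function(path, new_path) {
  missing <- !file_exists(new_path)
  if (any(missing)) {
    file_copy(path[missing], new_path[missing], overwrite = FALSE)
  }
  invisible(new_path)
}
```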

@jimhester jimhester added the feature a feature request or enhancement label Aug 12, 2019
@kstierhoff
Author

kstierhoff commented Aug 12, 2019 via email

@orgadish

I think this functionality would be really useful, though I agree it should not be the default.

Either a fail argument, or, for example, allow overwrite to take TRUE, FALSE, or "skip".

Is there a reason this was never implemented?

@gaborcsardi
Member

FWIW, macOS cp has a -n option that skips existing files. I guess we could implement this, should we have time for it in the future.

It does seem like an anti-pattern, because after running

cp -r -n source target

you cannot be sure that source and target are the same, which makes reasoning about the result hard.

If the goal is to make target the same as source, then you probably also do not win much time with -n, because copying a file within the same file system is practically always a cheap operation.
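For reference, a minimal shell sketch of that -n ("no clobber") behaviour. Note that GNU coreutils 9.2 and later exit with nonzero status when -n causes a file to be skipped, hence the `|| true` guard:

```shell
mkdir -p src dst
printf 'new\n' > src/a.txt
printf 'old\n' > dst/a.txt

# -n skips destination files that already exist;
# newer GNU coreutils exit nonzero when something is skipped
cp -R -n src/. dst/ || true

cat dst/a.txt   # prints "old": the existing file was left untouched
```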

@orgadish

My primary need for the skipping is precisely because I'm copying from a very slow filesystem to my local machine.

I don't think that the risk of files not being the same means it shouldn't be an option to skip, especially if it's not the default. I think it's ok to put some burden on the end-user. This could be even more explicit if there were a couple of options, e.g.:

* "skip_path" skips if the path exists.
* "skip_info" skips if the file metadata is the same (e.g. from `dir_info`)
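A skip_info-style check can be approximated today by comparing file_info() fields; a rough sketch (the "same metadata" test here is just size plus modification time, an assumption, and copy_unless_same is a hypothetical helper):

```r
library(fs)

# Hypothetical: treat files as "the same" when size and mtime match.
# file_info() returns NA rows for missing paths, and FALSE & NA is
# FALSE in R, so a missing destination never counts as "same".
same_meta <- function(a, b) {
  ia <- file_info(a)
  ib <- file_info(b)
  file_exists(b) &
    ia$size == ib$size &
    ia$modification_time == ib$modification_time
}

copy_unless_same <- function(path, new_path) {
  todo <- !same_meta(path, new_path)
  file_copy(path[todo], new_path[todo], overwrite = TRUE)
  invisible(new_path)
}
```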

@vorpalvorpal

> My primary need for the skipping is precisely because I'm copying from a very slow filesystem to my local machine.
>
> I don't think that the risk of files not being the same means it shouldn't be an option to skip, especially if it's not the default. I think it's ok to put some burden on the end-user. This could be even more explicit if there were a couple of options, e.g.:
>
> * "skip_path" skips if the path exists.
> * "skip_info" skips if the file metadata is the same (e.g. from `dir_info`)

skip_info would be a great option!

@orgadish

@vorpalvorpal Instead of copying the files locally to avoid dealing with the slow server, I ended up writing a cached_read function (https://github.com/orgadish/cachedread) which reads from the slow server once and writes a file locally. It has an equivalent option to skip_info that checks if the metadata is the same as before to determine whether to read from the local cache file or the server file.

I still think it would be useful to enable a "mirror tree" functionality, though. Perhaps it just warrants a new fs::mirror_dir function, rather than complicating the file_copy API, as @jimhester warned.
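Such a function could be a thin layer over dir_ls(); a minimal sketch (mirror_dir is hypothetical, not part of fs):

```r
library(fs)

# Hypothetical: replicate a directory tree, copying only missing files.
mirror_dir <- function(src, dst) {
  files   <- dir_ls(src, recurse = TRUE, type = "file")
  targets <- path(dst, path_rel(files, start = src))

  # dir_create() is idempotent, so existing directories are fine
  dir_create(unique(path_dir(targets)))

  todo <- !file_exists(targets)
  file_copy(files[todo], targets[todo])
  invisible(targets)
}
```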
