Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File renaming should be the last operation of an atomic write #16245

Merged
merged 1 commit into from Aug 20, 2015

Conversation

@byroot
Copy link
Member

byroot commented Jul 21, 2014

That way there is just no possible race conditions.

cc @wisq

@arthurnn
arthurnn reviewed Jul 21, 2014
View changes
activesupport/lib/active_support/core_ext/file/atomic.rb Outdated
@@ -16,32 +16,32 @@ class File
# end
def self.atomic_write(file_name, temp_dir = Dir.tmpdir)
require 'tempfile' unless defined?(Tempfile)
require 'fileutils' unless defined?(FileUtils)

This comment has been minimized.

Copy link
@arthurnn

arthurnn Jul 21, 2014

Member

Why remove this require?

This comment has been minimized.

Copy link
@byroot

byroot Jul 21, 2014

Author Member

First because it's already required at the top of the filed. And second because I now use File.rename instead of FileUtils.mv

@wisq
wisq reviewed Jul 21, 2014
View changes
activesupport/lib/active_support/core_ext/file/atomic.rb Outdated
rescue Errno::EPERM
# Changing file ownership failed, moving on.
# Overwrite original file with temp file
File.rename(temp_file.path, file_name)

This comment has been minimized.

Copy link
@wisq

wisq Jul 21, 2014

If you're going to use File.rename (which you should, because FileUtils.mv is not atomic) you'll need to ensure that temp_dir is somewhere on the same filesystem. (Generally, the best place is the same directory you're creating the file in.)

This comment has been minimized.

Copy link
@byroot

byroot Jul 21, 2014

Author Member

According to the documentation it's up to the caller to ensure that. But that's a good point.

This comment has been minimized.

Copy link
@wisq

wisq Jul 21, 2014

Yeah, but the problem is that I'm sure many, many people have been using this method without specifying a directory, and it has Just Worked™ because of FileUtils.mv, even though they were getting a very non-atomic atomic_write.

If we change this without adjusting the temp_dir default, we a) will break a lot of code, and b) will make it a pain to use, since who wants to do File.atomic_write(some_path, File.dirname(some_path)) all the time instead of just File.atomic_write(some_path) ?

This comment has been minimized.

Copy link
@byroot

byroot Jul 21, 2014

Author Member

I just agreed that defaulting to the original file directory was the best idea ;)

This comment has been minimized.

Copy link
@wisq

wisq Jul 21, 2014

Yeah, sorry, diff comments show up separately from PR comments.

@wisq

This comment has been minimized.

Copy link

wisq commented Jul 21, 2014

The main problem here is backwards compatibility. If people are explicitly specifying a temp_dir that is not on the same filesystem as the target file, or are not specifying temp_dir at all and /tmp is on a different FS (which is very, very common), then this will break their code because File.rename does not work cross-device.

You could default temp_dir to be the directory of the target file. That would still break in cases where people explicitly specify a target directory that's on a different FS — but at least in the default no-directory-specified case, you would be using a directory on the same FS.

Then there's the risk that people start complaining that we're putting temporary files in their target directories rather than in /tmp. Not sure how much of a problem this may end up being, but it's definitely a behaviour change and might be of a concern.

All this is to say, backwards compatibility makes the switch from FileUtils.mv to File.rename difficult, even though it definitely is something that should happen, because FileUtils.mv is not atomic unless it's on the same filesystem (in which case it's just doing the atomic File.rename anyway).

(I think these problems are the reason I didn't contribute the code upstream when I rewrote this method in Shopify two years ago. Didn't have the time to figure out how to deal with the fact that my choices were "break existing code" or "continue having a non-atomic atomic_write".)

@byroot

This comment has been minimized.

Copy link
Member Author

byroot commented Jul 21, 2014

You could default temp_dir to be the directory of the target file

I think this is the best thing to do. Most unix utilities that does atomic writes like that tend to suffix the original file name with .temp or .partial.

That would still break in cases where people explicitly specify a target directory that's on a different FS

Their problem. If it's a limitation that we cannot overcome, I prefer to break than having an atomic_write method, that silently fallback to a non atomic write in some cases.

@wisq

This comment has been minimized.

Copy link

wisq commented Jul 21, 2014

Right, I would agree with making temp_dir default to File.dirname(file_name).

@byroot

This comment has been minimized.

Copy link
Member Author

byroot commented Jul 21, 2014

@wisq How about now? (and also is there any lame english spelling mistake in my documentation change?)

@wisq

This comment has been minimized.

Copy link

wisq commented Jul 21, 2014

s/need/needs/ but otherwise 👍 from me.

@byroot byroot force-pushed the byroot:more-atomic-write branch 2 times, most recently from 97b617e to 2f0dd4d Aug 19, 2015
@byroot

This comment has been minimized.

Copy link
Member Author

byroot commented Aug 20, 2015

@wisq

This comment has been minimized.

Copy link

wisq commented Aug 20, 2015

BTW, if backwards compatibility is a concern, one option would be to specifically catch the "invalid cross-device link" error (Errno::EXDEV) and fall back to FileUtils.mv, but raise a (deprecation?) warning about how it's non-atomic because they specified a temp_dir that's incompatible with the target.

@byroot byroot force-pushed the byroot:more-atomic-write branch from 2f0dd4d Aug 20, 2015
@byroot byroot force-pushed the byroot:more-atomic-write branch to 5d3b3cb Aug 20, 2015
matthewd added a commit that referenced this pull request Aug 20, 2015
File renaming should be the last operation of an atomic write
@matthewd matthewd merged commit eb73110 into rails:master Aug 20, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.