New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Atomic writes #167
Atomic writes #167
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 12 of 14 files at r1.
Reviewable status: 12 of 14 files reviewed, 19 unresolved discussions (waiting on @lehins)
rio/cbits/rio.c, line 34 at r1 (raw file):
return S_IWUSR; }
Calling into C is not necessary, all of this can be done at compile time with a .hsc
file.
It would probably be easier there.
If we want to keep calling into C for some reasons, we should probably add haskell_rio
into the function names or so to ensure those symbols are there only once, in case some other C library linked into the final project also decides to call their functions __o_tmpfile
.
rio/src/RIO/File.hs, line 49 at r1 (raw file):
functionality and the one that is done atomically. Even if the orignal file is removed while it is being modified, due to atomicity, it will be restored with all modifications, if any. The important consequence of this fact is that
I recommend to insert the reason here, so that the user understands without doubt what's going on: This is because the file is atomically linked into the file system at the end.
rio/src/RIO/File.hs, line 135 at r1 (raw file):
-- After here, we have our own imports foreign import ccall unsafe "rio.c __o_tmpfile" o_TMPFILE :: CInt
I find it somewhat confusing that o_TMPFILE
can also be 0
.
That should have a comment here.
rio/src/RIO/File.hs, line 142 at r1 (raw file):
foreign import ccall safe "fcntl.h openat" c_safe_openat :: Fd -> CFilePath -> CInt -> CMode -> IO CInt
If I remember correctly, we intentionally used only C*
types in foreign function declarations.
I'm always a bit on the fence with this.
The key thing is that there is no type checking for foreign imports; one can write arbitrary stuff in there and nothing checks it. That's why I always try to make foreign import type signatures look as similar to the man page's types as possible, so that it looks "obviously correct" and nobody has to look up in the docs whether Fd
really is CInt
.
rio/src/RIO/File.hs, line 151 at r1 (raw file):
foreign import ccall safe "unistd.h linkat" c_safe_linkat :: CInt -> CFilePath -> CInt -> CFilePath -> CInt -> IO CInt
If we really wanted to go for Fd
, then c_safe_linkat
should probably use it too.
rio/src/RIO/File.hs, line 208 at r1 (raw file):
-- | Optionally set file permissions, call @fsync@ on the file handle and then -- close it.
The comment does not align with the name of this function. If it closes the file as the haddocks say, it should be in the name. But I don't see it closing the file, so maybe the comment is outdated. In the first commit in this series, it did close the file. I think it makes sense that it doesn't close the file.
rio/src/RIO/File.hs, line 215 at r1 (raw file):
-- | Call @fsync@ on the opened directory file descriptor fsyncDirectoryFd :: String -> Fd -> IO () fsyncDirectoryFd fname = fsyncFileDescriptor (fname ++ "/Directory")
This is confusing, as there can be a file called Directory
.
How about using /../
instead?
Then the error message will show e.g. path/to/file/../
, which makes clear that we're talking about the directory called file
(you can even call stat
on it and get the expected result).
rio/src/RIO/File.hs, line 228 at r1 (raw file):
-- openFileFromDir :: MonadIO m => Fd -> FilePath -> IOMode -> m Handle openFileFromDir dirFd (dropDirectoryIfRelative -> fp) iomode =
Personally I'm no fan of view patterns in public type signatures.
They make them harder to parse visually, make that now there isn't actually a variable that corresponds to the actual FilePath
that was passed into the function (so you can't trace
it easily for debugging or, use ghci's debugger to print the variable), and add more fancy Haskell stuff one has to learn to understand the code for little added benefit.
I'd just let
-bind it.
rio/src/RIO/File.hs, line 228 at r1 (raw file):
-- openFileFromDir :: MonadIO m => Fd -> FilePath -> IOMode -> m Handle openFileFromDir dirFd (dropDirectoryIfRelative -> fp) iomode =
Why do we drop relative parent directories here?
I think that is unnecessary, and it would make sense for the function to support them, just like openat()
does.
I would find it very surprising to discover that this function drops part of what I gave it.
rio/src/RIO/File.hs, line 257 at r1 (raw file):
dropDirectoryIfRelative fp | isRelative fp = takeFileName fp | otherwise = fp
I think this function is unnecessary, see comments on its use sites.
rio/src/RIO/File.hs, line 259 at r1 (raw file):
| otherwise = fp -- | Open an anonymous temporary file
... in the given directory
? Should the function name also end with FromDir
, like the function above, for consistency? What about the other comments on the function above, do they apply here too / should they be copied or referred to?
This function should probably have parameter haddocks for extra clarity.
Especially the first argument, eDir
. It should say that the Fd or file path should point to a directory. If it's a path, does it need to exist before?
What happens when an Fd or path pointing to a file, instead of dir, is given?
rio/src/RIO/File.hs, line 262 at r1 (raw file):
openAnonymousTempFile :: MonadIO m => Either Fd FilePath -> FilePath -> IOMode -> m Handle openAnonymousTempFile eDir filePath iomode =
What's the filePath
argument good for? Can it be removed?
rio/src/RIO/File.hs, line 280 at r1 (raw file):
(do fileFd <- throwErrnoIfMinus1Retry "openAnonymousTempFile" $ fopen (o_TMPFILE .|. ioModeToTmpFlags iomode) (s_IRUSR .|. s_IWUSR)
We must handle the situation that O_TMPFILE
is not supported on the filesystem.
This means implementing a fallback for the case that EOPNOTSUPP
is returned.
rio/src/RIO/File.hs, line 319 at r1 (raw file):
-- -- It is important to note, that whenever a file descriptor for the containing -- directory is supplied, renaming and linking will be done in it's context,
it's
-> its
rio/src/RIO/File.hs, line 323 at r1 (raw file):
atomicTempFileCreate :: Maybe Fd -- ^ Possible handle for the directory where the target file is located
is located
sounds like it must already exist. shall end up in
?
rio/src/RIO/File.hs, line 331 at r1 (raw file):
-- `o_TMPFILE` -> FilePath -- ^ File path for the target file.
Relative, absolute, and what happens in each case and for either case of what the mDirFd
is?
rio/src/RIO/File.hs, line 376 at r1 (raw file):
-- to atomic rename. -> Either Handle FilePath -- ^ Temporary file's path
It's not clear from the docs that this file must already exist, or the function will fail.
Let's add that to top-level haddocks of the function. Those haddocks should also say in which situations this function is to be used.
rio/src/RIO/File.hs, line 390 at r1 (raw file):
Just dirFd -> withFilePath (dropDirectoryIfRelative filePath) $ \cToFilePath -> withFilePath (dropDirectoryIfRelative tmpFilePath) $ \cTmpFilePath ->
I think both uses of dropDirectoryIfRelative
are unnecessary and should be dropped.
rio/src/RIO/File.hs, line 435 at r1 (raw file):
-- of @open@ -> FilePath -- ^ Source file path. The file may exist or may not.
It's not clear from this that if a directory Fd
is given, filePath
is completely ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 14 files reviewed, 20 unresolved discussions (waiting on @lehins)
rio/src/RIO/File.hs, line 335 at r1 (raw file):
atomicTempFileCreate mDirFd mFileMode tmpFileHandle filePath = withHandleFd tmpFileHandle $ \fd@(Fd cFd) -> withFilePath ("/proc/self/fd/" ++ show cFd) $ \cFromFilePath ->
/proc
does not exist on the BSDs and OSX.
Please add a stating how it ensured that this path cannot be taken, and also mention it in the haddocks that this function is Linux only (or /proc
-only).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 14 files reviewed, 20 unresolved discussions (waiting on @lehins)
rio/src/RIO/File.hs, line 335 at r1 (raw file):
a stating
I mean a comment stating
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 14 files reviewed, 20 unresolved discussions (waiting on @lehins and @nh2)
rio/src/RIO/File.hs, line 135 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
I find it somewhat confusing that
o_TMPFILE
can also be0
.
That should have a comment here.
I'll add a comment about it
rio/src/RIO/File.hs, line 142 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
If I remember correctly, we intentionally used only
C*
types in foreign function declarations.I'm always a bit on the fence with this.
The key thing is that there is no type checking for foreign imports; one can write arbitrary stuff in there and nothing checks it. That's why I always try to make foreign import type signatures look as similar to the man page's types as possible, so that it looks "obviously correct" and nobody has to look up in the docs whether
Fd
really isCInt
.
If I can add at least a little bit of type safety to it I always will. I don't care if any body has to lookup the fact that Fd
is a newtype wrapper around CInt
, as long as it is. If I could, I would make every single argument in this function a separate Haskell type. Most important reason why I feel strongly about it is that within this module it is impossible to mix up the arguments and supply some flag instead of a file descriptor.
To keep us both happy I'll add a function synonym for each FFI with proper types
rio/src/RIO/File.hs, line 151 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
If we really wanted to go for
Fd
, thenc_safe_linkat
should probably use it too.
Not quite. Since we use special flag AT_FDCWD
in place of Fd
in some cases, but in other we use Fd
rio/src/RIO/File.hs, line 208 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
The comment does not align with the name of this function. If it closes the file as the haddocks say, it should be in the name. But I don't see it closing the file, so maybe the comment is outdated. In the first commit in this series, it did close the file. I think it makes sense that it doesn't close the file.
Yes, the comment is old. handle is closed elsewhere
rio/src/RIO/File.hs, line 215 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
This is confusing, as there can be a file called
Directory
.How about using
/../
instead?Then the error message will show e.g.
path/to/file/../
, which makes clear that we're talking about the directory calledfile
(you can even callstat
on it and get the expected result).
That is not the filepath. It is fname
as in function name. Used for error reporting. Eg.
fsyncDirectoryFd "atomicDurableTempFileCreate" dirFd
usage in atomicDurableTempFileCreate
function
rio/src/RIO/File.hs, line 228 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Personally I'm no fan of view patterns in public type signatures.
They make them harder to parse visually, make that now there isn't actually a variable that corresponds to the actual
FilePath
that was passed into the function (so you can'ttrace
it easily for debugging or, use ghci's debugger to print the variable), and add more fancy Haskell stuff one has to learn to understand the code for little added benefit.I'd just
let
-bind it.
That's a pretty subjective opinion :P
In my personal opinion this makes it a lot clearer what is going on. Nice and concise, you do not need to follow extra bindings to see what fp
really means. And sure you can trace it:
openFileFromDir dirFd fullPath@(dropDirectoryIfRelative -> fp) iomode = trace fullPath $
In any case, it does not affect semantics, so I'll leave it as is.
rio/src/RIO/File.hs, line 228 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Why do we drop relative parent directories here?
I think that is unnecessary, and it would make sense for the function to support them, just like
openat()
does.I would find it very surprising to discover that this function drops part of what I gave it.
We do that to fix: #160 Checkout man openat
and what it says about relative directories.
rio/src/RIO/File.hs, line 257 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
I think this function is unnecessary, see comments on its use sites.
You are thinking wrong ;) See my comments at use sites. Also if you don't believe me, try removing it and running the test suite.
rio/src/RIO/File.hs, line 259 at r1 (raw file):
Sure, I renamed the function and added a link. Added arguments haddock too
... in the given directory? Should the function name also end with FromDir, like the function above, for consistency? What about the other comments on the function above, do they apply here too / should they be copied or referred to?
I have no idea. It is not designed to be called on the file Fd, only directory Fd. Glad you brought it up. We can add more types for this.
What happens when an Fd or path pointing to a file, instead of dir, is given?
rio/src/RIO/File.hs, line 262 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
What's the
filePath
argument good for? Can it be removed?
It is used for giving an arbitrary name to the file descriptor
rio/src/RIO/File.hs, line 280 at r1 (raw file):
Cool. I'll need to do a bit of work on that. Fallback functionality is already available, so it is a matter of catching an exception and retrying it.
EOPNOTSUPP
rio/src/RIO/File.hs, line 319 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
it's
->its
Done.
rio/src/RIO/File.hs, line 323 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
is located
sounds like it must already exist.shall end up in
?
Well, the file is already in that directory, just without a name. In other words it had to be opened with openAnonymousTempFileFromDir
rio/src/RIO/File.hs, line 390 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
I think both uses of
dropDirectoryIfRelative
are unnecessary and should be dropped.
They are indeed required
rio/src/RIO/File.hs, line 435 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
It's not clear from this that if a directory
Fd
is given,filePath
is completely ignored.
What do you mean ignored? it is also further passed down into openAnonymousTempFile
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 14 files at r1.
Reviewable status: 13 of 14 files reviewed, 35 unresolved discussions (waiting on @lehins and @nh2)
rio/src/RIO/File.hs, line 342 at r1 (raw file):
throwErrnoIfMinus1Retry ("atomicFileCreate - c_safe_linkat - " ++ which) $ c_safe_linkat at_FDCWD cFromFilePath cDirFd to at_SYMLINK_FOLLOW
This needs a direct reference to the man page, perhaps even a quote, to show that this is the standard way of doing it and not something fancy we came up with.
rio/src/RIO/File.hs, line 347 at r1 (raw file):
safeLink "anonymous" cToFilePath case eExc of Right _ -> pure ()
What are we ignoring here? Looks like ()
. Let's give safeLink
a type signature with :: ... ()
, and then match ()
here to remove the question.
rio/src/RIO/File.hs, line 367 at r1 (raw file):
case mDirFd of Nothing -> (at_FDCWD, filePath) Just (Fd cDirFd') -> (cDirFd', takeFileName filePath)
Like for dropDirectoryIfRelative
, I think that this takeFileName
is probably not good.
rio/src/RIO/File.hs, line 407 at r1 (raw file):
-- | Create a temporary file for a matching possibly exiting target file that -- will be replaced in the future. Temporary file -- is meant to be renamed afterwards, thus it is only delted upon error.
delted
-> deleted
Should also comment exactly on how the created file is called.
rio/src/RIO/File.hs, line 410 at r1 (raw file):
-- -- __Important__: Temporary file is not removed and file handle is not closed if -- there was no exception thrown by the supplied action.
I'd put that behaviour right into the function name.
rio/src/RIO/File.hs, line 428 at r1 (raw file):
tmpFileName = "." <> fileName <> ".tmp" withAnonymousBinaryTempFileFor ::
Haddocks on this one and the following with recommendations on when which one is or should be used.
rio/src/RIO/File.hs, line 446 at r1 (raw file):
dir = maybe (Right (takeDirectory filePath)) Left mDirFd withSimpleBinaryTempFileFor ::
Simple
doesn't tell me much here, perhaps this should be withImmediatelyDeletedBinaryTempFileFor
or withNonAnonymous...
to more obviously contrast with the anonymous
approach?
rio/src/RIO/File.hs, line 453 at r1 (raw file):
-- of @open@ -> FilePath -- ^ Source file path. The file may exist or may not.
Source file path
doesn't mean anything to me in context of what I'd expect the function to do. Can we clarify this somehow? (Same for function above.)
rio/src/RIO/File.hs, line 522 at r1 (raw file):
case iomode of ReadMode -- We do not need to consider an durable operations when we are in a
an durable operations
-> durable durable operations
rio/src/RIO/File.hs, line 528 at r1 (raw file):
-> withDirectory (takeDirectory filePath) $ \dirFd -> withFileInDirectory dirFd filePath iomode $ \tmpFileHandle -> do
This looks like the place to me where we should takeFileName
: Right next to where we do takeDirectory
, as opposed to inside the function.
rio/src/RIO/File.hs, line 548 at r1 (raw file):
-> withDirectory (takeDirectory filePath) $ \dirFd -> if o_TMPFILE == 0
Comment on what this means (the platform doesn't support O_TMPFILE
)
rio/src/RIO/File.hs, line 598 at r1 (raw file):
Let's add here something like
If possible, use one of the other functions in this module that does not perform re-opening of already closed files.
rio/src/RIO/File.hs, line 685 at r1 (raw file):
Let's add here
If you do not need durability but only atomicity, use
withBinaryFileAtomic
instead, which is faster as it does not perform @fsync()@.
rio/src/RIO/File.hs, line 721 at r1 (raw file):
-- -- It is similar to `withBinaryFileDurableAtomic`, but without the durability -- part. What it means is that all modification can still disappear after it has
What it means is that
-> It means that
rio/src/RIO/File.hs, line 725 at r1 (raw file):
Let's add a Performance considerations
section here like in the function above, saying something like:
The same performance caveats for non-truncating writes apply as for
withBinaryFileDurableAtomic
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 14 files at r1.
Reviewable status: all files reviewed, 37 unresolved discussions (waiting on @lehins and @nh2)
rio/test/RIO/FileSpec.hs, line 69 at r1 (raw file):
withSystemTempDirectory "rio" $ \dir -> do let fp = dir </> fname ++ "-read" withHelloFileTestable fp ReadWriteMode (`BS.hGetNonBlocking` BS.length hello) `shouldReturn`
Is this correct? As far as I can tell, hGetNonBlocking
might as well return "Hell"
if you're unlucky. I think hGet
would be correct.
rio/test/RIO/FileSpec.hs, line 121 at r1 (raw file):
it "exception - Does not corrupt files" $ bool expectFailure id atomic $ -- should fail for non-atomic forAll (elements [WriteMode, ReadWriteMode, AppendMode]) $ \iomode ->
What is the purpose of using a QuickCheck
Gen
here, why not just run all 3 cases deterministically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 34 unresolved discussions (waiting on @lehins and @nh2)
rio/src/RIO/File.hs, line 215 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
That is not the filepath. It is
fname
as in function name. Used for error reporting. Eg.
fsyncDirectoryFd "atomicDurableTempFileCreate" dirFd
usage inatomicDurableTempFileCreate
function
Ah right.
rio/src/RIO/File.hs, line 228 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
We do that to fix: #160 Checkout
man openat
and what it says about relative directories.
If the pathname given in pathname is relative, then it is interpreted
relative to the directory referred to by the file descriptor dirfd
(rather than relative to the current working directory of the calling
process, as is done by open() for a relative pathname).
If pathname is relative and dirfd is the special value AT_FDCWD, then
pathname is interpreted relative to the current working directory of
the calling process (like open()).
None of these two seem to forbid to do openat(myDirFd, "path/to/file/from/there")
.
So what I mean is that this function should support that use case, and that at the top-level place where it's called (I just added a comment in one such location), the split should be done accordingly, like
openat(openDirFd(takeDirName path), takeFileName path)
as opposed to implementing the takeFileName
missing up there all the way down here.
rio/src/RIO/File.hs, line 262 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
It is used for giving an arbitrary name to the file descriptor
Could we derive it from the eDir
argument (that one's fdName
if it's a Left
, and takeFileName
if it's a Right
) instead?
It seems a bit redundant to pass in that extra info to name something that is anonymous, when we already have information about the thing that it's for, and removing an argument would make the function easier to understand.
rio/src/RIO/File.hs, line 323 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
Well, the file is already in that directory, just without a name. In other words it had to be opened with
openAnonymousTempFileFromDir
Ah, that makes sense. Let's write it explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 14 files reviewed, 35 unresolved discussions (waiting on @lehins and @nh2)
rio/src/RIO/File.hs, line 563 at r2 (raw file):
liftIO $ withDirectory (takeDirectory filePath) $ \dirFd -> withFileInDirectory dirFd filePath ReadMode (const $ return ())
Does this function actually still call fsync()
anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 14 files reviewed, 35 unresolved discussions (waiting on @fsync, @lehins, and @nh2)
rio/cbits/rio.c, line 34 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Calling into C is not necessary, all of this can be done at compile time with a
.hsc
file.It would probably be easier there.
If we want to keep calling into C for some reasons, we should probably add
haskell_rio
into the function names or so to ensure those symbols are there only once, in case some other C library linked into the final project also decides to call their functions__o_tmpfile
.
sounds good. I'll add rio_
to the name, adding haskell into C function names feels weird.
rio/src/RIO/File.hs, line 49 at r1 (raw file):
That's not the reason why though, it is because a copy of the file was made prior to any modification taking place.
This is because the file is atomically linked into the file system at the end.
Moreover I don't want to put "linking" in the public documentation because many will not know what this even means, secondly linking is done only on Linux and only when the file does NOT exist, because with linking you can't atomically replace a file.
rio/src/RIO/File.hs, line 151 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
Not quite. Since we use special flag
AT_FDCWD
in place ofFd
in some cases, but in other we useFd
I created synonyms for all system calls that improve on type safety. So that should be taken care of
rio/src/RIO/File.hs, line 228 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
If the pathname given in pathname is relative, then it is interpreted relative to the directory referred to by the file descriptor dirfd (rather than relative to the current working directory of the calling process, as is done by open() for a relative pathname). If pathname is relative and dirfd is the special value AT_FDCWD, then pathname is interpreted relative to the current working directory of the calling process (like open()).
None of these two seem to forbid to do
openat(myDirFd, "path/to/file/from/there")
.So what I mean is that this function should support that use case, and that at the top-level place where it's called (I just added a comment in one such location), the split should be done accordingly, like
openat(openDirFd(takeDirName path), takeFileName path)
as opposed to implementing the
takeFileName
missing up there all the way down here.
What you are now recommending is to drop the path for every file, regardless if it's relative or not. That suggestion should work, I think, since we always derive the directory from the file path. I'll change that.
I thought about doing it that way before, but I was worrying a little bit if there is some edge case that I am not thinking of where a file path is absolute and we switch it to relative. But it does make sense, because all files are always opened in the directory, which means this should be ok.
Note that it is different from what you where suggesting to not drop the path and keep the file path in the relative or absolute form, cause using relative path would be (was) a bug. Eg:
openat(openDirFd(takeDirName "foo/bar.txt"), "foo/bar.txt")
while this is ok
openat(openDirFd(takeDirName "/foo/bar.txt"), "/foo/bar.txt")
rio/src/RIO/File.hs, line 257 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
You are thinking wrong ;) See my comments at use sites. Also if you don't believe me, try removing it and running the test suite.
Now that we'll make all files paths to be relative to the directory Fd
, that function is unnecessary
rio/src/RIO/File.hs, line 262 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Could we derive it from the
eDir
argument (that one'sfdName
if it's aLeft
, andtakeFileName
if it's aRight
) instead?It seems a bit redundant to pass in that extra info to name something that is anonymous, when we already have information about the thing that it's for, and removing an argument would make the function easier to understand.
I got rid of FilePath
for the directory. So the argument cannot be dropped, but now instead of an Either DirFd FilePath
we have Maybe DirFd
, which is certainly an improvement. Thanks.
rio/src/RIO/File.hs, line 446 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Simple
doesn't tell me much here, perhaps this should bewithImmediatelyDeletedBinaryTempFileFor
orwithNonAnonymous...
to more obviously contrast with theanonymous
approach?
it is not really ImmediatelyDeleted
, but immediately closed. Anyways NonAnonymous
sounds fine, changed to that
rio/src/RIO/File.hs, line 453 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Source file path
doesn't mean anything to me in context of what I'd expect the function to do. Can we clarify this somehow? (Same for function above.)
Every single exported function in this module supplies a FilePath
, that is the one I am talking about. What would you like me to write in this context?
rio/src/RIO/File.hs, line 548 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Comment on what this means (the platform doesn't support
O_TMPFILE
)
What's going on in here, the code is outdated?
I changed it to o_TMPFILE == o_TMPFILE_not_supported
I Also added a comment describing what that binding means.
rio/src/RIO/File.hs, line 598 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Let's add here something like
If possible, use one of the other functions in this module that does not perform re-opening of already closed files.
I can add it, but I do not understand what you are trying to say. Is your intention is to say to the user: "This function does not provide the same guarantee as if you would open and modify a file using withBinaryFileDurable
or writeBinaryFileDurable
, since they ensure that the @fsync()@ is called before the file is closed, so if possible use those instead."
rio/src/RIO/File.hs, line 685 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Let's add here
If you do not need durability but only atomicity, use
withBinaryFileAtomic
instead, which is faster as it does not perform @fsync()@.
Done.
rio/src/RIO/File.hs, line 721 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
What it means is that
->It means that
Done.
rio/src/RIO/File.hs, line 725 at r1 (raw file):
I'll rephrase it a bit, since it is not clear which performance caveats. @fsync@ performance or copy file performance
The same performance caveats apply as for
withBinaryFileDurableAtomic
due
to making a copy of the content of existing files during non-truncating
writes.
Something like that maybe, what do you think?
rio/src/RIO/File.hs, line 563 at r2 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Does this function actually still call
fsync()
anywhere?
Good catch! Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 11 of 14 files reviewed, 35 unresolved discussions (waiting on @fsync and @nh2)
rio/src/RIO/File.hs, line 280 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
Cool. I'll need to do a bit of work on that. Fallback functionality is already available, so it is a matter of catching an exception and retrying it.
EOPNOTSUPP
Cool, I implemented it by checking if the error is of UnsupportedOperation
type, in which case it falls back on a usual named temporary file handling.
rio/src/RIO/File.hs, line 323 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Ah, that makes sense. Let's write it explicitly.
Done
rio/src/RIO/File.hs, line 331 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Relative, absolute, and what happens in each case and for either case of what the
mDirFd
is?
mDirFd
will depend on that file path, if supplied, but that invariant is not enforced within this function, since it is internal to this module. It will not matter if it is relative or not, since only the filename will be used whenever we are dealing in a directory context. If mDirFd
is nothing and eTmpFile
is not a handle, than it is just a usual rename.
I added a bit of comment, but most of that information is a lot easier to understand from just following the code rather than trying to understand my comments
rio/src/RIO/File.hs, line 335 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
a stating
I mean a comment stating
I added a note that it is a Linux only function
rio/src/RIO/File.hs, line 342 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
This needs a direct reference to the man page, perhaps even a quote, to show that this is the standard way of doing it and not something fancy we came up with.
I added a comment to use man
for those who do not trust us.
rio/src/RIO/File.hs, line 347 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
What are we ignoring here? Looks like
()
. Let's givesafeLink
a type signature with:: ... ()
, and then match()
here to remove the question.
I switched to throwErrnoIfMinus1Retry_
, in usage of safeLink
, and added pattern match on ()
so it is easy to see that it is ()
. Type signature can be inferred easily and as such is unnecessary.
rio/src/RIO/File.hs, line 367 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Like for
dropDirectoryIfRelative
, I think that thistakeFileName
is probably not good.
What do you mean not good? It is required, since it will be a bug otherwise. If I do not call takeFileName
here it will only work for absolute file paths and will be broken for relative ones.
Even in case of dropDirectoryIfRelative
it was absolutely fine, I only got rid of it cause you did not like it for some reason.
rio/src/RIO/File.hs, line 376 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
It's not clear from the docs that this file must already exist, or the function will fail.
Let's add that to top-level haddocks of the function. Those haddocks should also say in which situations this function is to be used.
Why do you need the docs for something obvious like that? Wouldn't you expect mv foo bar
to fail if foo
doesn't exists? Well, it is a file renaming function, as such you cannot rename something that wasn't there to begin with. I wouldn't even add such haddock if the function was exported as part of public API.
It shouldn't be used anywhere, where it isn't already used.
rio/src/RIO/File.hs, line 390 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
They are indeed required
I switched to takeFileName
in both places.
rio/src/RIO/File.hs, line 407 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
delted
->deleted
Should also comment exactly on how the created file is called.
Fixed spelling error.
The file is called temporary :) But if you are referring to the name of the file, then it is right in the code, few lines below, adding it here is redundant.
rio/src/RIO/File.hs, line 410 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
I'd put that behaviour right into the function name.
Sure, you would. But I wouldn't. I like the way it is named.
rio/src/RIO/File.hs, line 428 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Haddocks on this one and the following with recommendations on when which one is or should be used.
It's right in the name. One is anonymous, another one is not. Secondly, it shouldn't be used anywhere that it isn't already used. The whole function is literally a one liner and if someone can't understand what it is doing, he/she should not be using it.
All of these functions are internal, so they do not deserve as much haddock.
rio/src/RIO/File.hs, line 522 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
an durable operations
->durable durable operations
you mean just durable operations
? if so then fixed.
rio/src/RIO/File.hs, line 528 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
This looks like the place to me where we should
takeFileName
: Right next to where we dotakeDirectory
, as opposed to inside the function.
The full path is now used to name the file descriptor, so we don't want to loose the file path here
rio/src/RIO/File.hs, line 548 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
What's going on in here, the code is outdated?
I changed it too_TMPFILE == o_TMPFILE_not_supported
I Also added a comment describing what that binding means.
improved it further, withAnonymousBinaryTempFileFor
now returns Nothing
when it is not supported.
rio/test/RIO/FileSpec.hs, line 69 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
Is this correct? As far as I can tell,
hGetNonBlocking
might as well return"Hell"
if you're unlucky. I thinkhGet
would be correct.
It was left over from me trying things out, so switched to hGet
instead. Thanks
rio/test/RIO/FileSpec.hs, line 121 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
What is the purpose of using a
QuickCheck
Gen
here, why not just run all 3 cases deterministically?
It will run a 100 tests instead of 3, but it's not that important, so I switched it to mapM_
.
* Fixed commercialhaskell#160 * Fix race condition where file exists, but possibly removed prior to copy * Fix durability on original content of the file * Prevent atomic renamed of corrupt files, which could have been caused by an exception * Added atomic file writing without durability * Make sure there are no temporary dangling files are left around upon exception * Make sure permissions are properly restored on the atomically renamed file * Reorganize the module RIO.Files to improve clarity * Improve type safety of FFI calls * For atomic write switch to using O_TMPFILE on Linux, while falling back on usual temporary files for OSs and FSs tha do not support that feature * Improve tests for file operations and add more tests for atomic file operations * Move OS specific deps into library. * Enable and fix all compile time warnings. * Addition of `readBinaryFile`, `writeBinaryFile` and `withBinaryFileLazy`. * Blank tests on Windows for atomic+durable writes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r3, 4 of 4 files at r4.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @fsync and @lehins)
rio/src/RIO/File.hs, line 49 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
That's not the reason why though, it is because a copy of the file was made prior to any modification taking place.
This is because the file is atomically linked into the file system at the end.
Moreover I don't want to put "linking" in the public documentation because many will not know what this even means, secondly linking is done only on Linux and only when the file does NOT exist, because with linking you can't atomically replace a file.
The explanations you added are good, that makes it clear.
rio/src/RIO/File.hs, line 151 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
I created synonyms for all system calls that improve on type safety. So that should be taken care of
Yep, that looks good.
rio/src/RIO/File.hs, line 228 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
What you are now recommending is to drop the path for every file, regardless if it's relative or not. That suggestion should work, I think, since we always derive the directory from the file path. I'll change that.
I thought about doing it that way before, but I was worrying a little bit if there is some edge case that I am not thinking of where a file path is absolute and we switch it to relative. But it does make sense, because all files are always opened in the directory, which means this should be ok.
Note that it is different from what you where suggesting to not drop the path and keep the file path in the relative or absolute form, cause using relative path would be (was) a bug. Eg:
openat(openDirFd(takeDirName "foo/bar.txt"), "foo/bar.txt")
while this is ok
openat(openDirFd(takeDirName "/foo/bar.txt"), "/foo/bar.txt")
Moving from dropDirectoryIfRelative
to takeFileName
isn't exactly what I meant.
I was arguing that we should support that the given FilePath
argument can be a proper relative path (relative to dirFd
, and possibly having multiple path components), thus making openFileFromDir
a very thin wrapper around openat
that supports the same functionality.
For example, I think it would make sense and be intuitive if openFileFromDir
could be callled like this:
withDirectory "/home/alexey" $ \homeFd -> do
handle <- openFileFromDir fd ".aws/credentials" ...
I had in mind that for the situation where you already have a full FilePath
like path/to/somedir/somefile
or /path/to/somedir/somefile
at hand and you "just want to open it with openat", the caller would do the splitting accordingly, providing the corresponding arguments to openFileFromDir
. For example withBinaryFileDurablePosix
(which calls openFileFromDir
via withFileInDirectory
) would, instead of the current
withDirectory (takeDirectory filePath) $ \dirFd ->
withFileInDirectory dirFd filePath iomode $ \tmpFileHandle -> do
res <- action tmpFileHandle
use the following:
withDirectory (takeDirectory filePath) $ \dirFd ->
withFileInDirectory dirFd (takeFileName filePath) iomode $ \tmpFileHandle -> do
res <- action tmpFileHandle
Do you see what I mean?
I think this would be better because then openFileFromDir
would behave just like openat
, thus having familiar behaviour. It would also remove a not-stated invariant from the function (namely dirFd
corresponds to takeDirName fileName
).
This should be easy to do: All we need is to move the call to takeFileName
from here up to the call sites of openFileFromDir
(which is currently just uses of withFileInDirectory
).
The only thing it'll lack will be the name-of-the-file-path argument to mkHandleFromFD
that appears in error messages. I would make that an explicit argument the caller can provide.
Thus:
openFileFromDir :: MonadIO m => DirFd -> FilePath -> FilePath -> IOMode -> m Handle
openFileFromDir dirFd relPathFromDir fullPathForErrors iomode =
-- implementation stays unchanged (just the variable names get updated)
I can also make a branch to show what I mean if you like.
rio/src/RIO/File.hs, line 280 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
Cool, I implemented it by checking if the error is of
UnsupportedOperation
type, in which case it falls back on a usual named temporary file handling.
Nice
rio/src/RIO/File.hs, line 331 at r1 (raw file):
I see. I'm happy with that explanation, but still want to to point out:
most of that information is a lot easier to understand from just following the code rather than trying to understand my comments
I still don't buy that: I understand it now, but it's because of your answer to my question.
As you pointed out: There are many cross-function invariants in this 800-lines file which are neither enforced nor stated. It means one cannot judge the correctness of a given function in isolation. Figuring out all invariants from "just following the code" and keeping it all in working memory is hard. It is much easier to understand your comments.
rio/src/RIO/File.hs, line 367 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
What do you mean not good? It is required, since it will be a bug otherwise. If I do not call
takeFileName
here it will only work for absolute file paths and will be broken for relative ones.Even in case of
dropDirectoryIfRelative
it was absolutely fine, I only got rid of it cause you did not like it for some reason.
I think we had a misunderstanding on that one (I've added a longer explanation in the ohter place further up at openFileFromDir
).
With "not good" meant, like over there: "This function should support files in relative subdirectories or ../
style paths, like openat()
does; there is no reason not to support those."
rio/src/RIO/File.hs, line 376 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
Why do you need the docs for something obvious like that? Wouldn't you expect
mv foo bar
to fail iffoo
doesn't exists? Well, it is a file renaming function, as such you cannot rename something that wasn't there to begin with. I wouldn't even add such haddock if the function was exported as part of public API.It shouldn't be used anywhere, where it isn't already used.
I think you're right, I must have confused myself somehow here.
rio/src/RIO/File.hs, line 390 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
I switched to
takeFileName
in both places.
I meant this in relation to my comment on openFileFromDir
: Like openat()
, renameat(..., oldpath, ... newpath)
supports giving full relative file paths as oldpath
and newpath
, not only file names.
So we could remove the invariant Whenever DirFd is supplied, it must be the containgin directory fo this file
, and instead simply make filePath
relative from dirFd
.
rio/src/RIO/File.hs, line 407 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
Fixed spelling error.
The file is called temporary :) But if you are referring to the name of the file, then it is right in the code, few lines below, adding it here is redundant.
What I mean is that the user-facing documentation (e.g. module or exported-function haddocks) should explain how those temporary files are named.
Here is an example of how this information is important, from a production server breakage system that occurred to me 3 days ago:
I had an rsync
loop to backup files from one server to another. Software that wrote into the directory was upgraded to do atomic renames, like we add here. Because of the way rsync works (first obtaining file names, then doing the copy), it is possible that it sees a .xxx.tmp
file that's gone when it wants to copy it, and thus rsync errored out with file vanished
errors, and I got paged. The correct solution here is to not backup temporary files, using the --exclude
flag. For that you have to know how the temporary files are named.
rio/src/RIO/File.hs, line 435 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
What do you mean ignored? it is also further passed down into
openAnonymousTempFile
.
It was passed into the function, but that one didn't do any filesytem-y things with it if a directory Fd
was given; it used it only in the fdName
used for exceptions, as far as I can tell.
I had just pointed it out because I reviewed it (as you noticed) from the perspective of pointing out all unstated cross-function invariants (as my opinion is that there should be none).
rio/src/RIO/File.hs, line 453 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
Every single exported function in this module supplies a
FilePath
, that is the one I am talking about. What would you like me to write in this context?
I am not sure what the best name for it should be. But for example in one callee, withBinaryTempFileFor
, this becomes a target file
instead of a source file
.
Doesn't source
suggest that we read from that file somehow? As far as I can tell, we don't read from it.
rio/src/RIO/File.hs, line 522 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
you mean just
durable operations
? if so then fixed.
Ah yes :D
rio/src/RIO/File.hs, line 548 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
improved it further,
withAnonymousBinaryTempFileFor
now returnsNothing
when it is not supported.
That's great
rio/src/RIO/File.hs, line 598 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
I can add it, but I do not understand what you are trying to say. Is your intention is to say to the user: "This function does not provide the same guarantee as if you would open and modify a file using
withBinaryFileDurable
orwriteBinaryFileDurable
, since they ensure that the @fsync()@ is called before the file is closed, so if possible use those instead."
Yes
rio/src/RIO/File.hs, line 725 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
I'll rephrase it a bit, since it is not clear which performance caveats. @fsync@ performance or copy file performance
The same performance caveats apply as for
withBinaryFileDurableAtomic
due
to making a copy of the content of existing files during non-truncating
writes.Something like that maybe, what do you think?
That's good
rio/src/RIO/File.hs, line 178 at r3 (raw file):
c_renameat :: DirFd -> CFilePath -> DirFd -> CFilePath -> IO CInt c_renameat (DirFd (Fd fdFrom)) cFpFrom (DirFd (Fd fdTo)) cFdTo = c_safe_renameat fdFrom cFpFrom fdTo cFdTo
Is cFdTo
a typo of cFpTo
(it's a CFilePath
, not an Fd
)?
rio/src/RIO/File.hs, line 304 at r3 (raw file):
than
-> then
There seems to be some word missing in the sentence, I can't quite parse it:
If the file descriptor for the directory the target file is/will be located in
Maybe something like "if given"?
rio/test/RIO/FileSpec.hs, line 121 at r1 (raw file):
Previously, lehins (Alexey Kuleshevich) wrote…
It will run a 100 tests instead of 3, but it's not that important, so I switched it to
mapM_
.
There's another instance of this just above
rio/src/RIO/File.hs, line 228 at r1 (raw file): IMHO the type signature and functionality of this function is strictly worse:
than this:
With the former it is way to easy to mix up the arguments and get into unpredictable behavior. There is still an invariant that the file must exist in the folder that One more important point is that it allows for anybody to think that this is correct, which it is not, because in that case the full path will be used and the
See
Moreover, I think looking at the call site it is pretty clear that we are dealing with the same file:
One more point is, which I am not quite sure about, but I think I'am correct in my assumption is that doing something like that:
would mess up durable writes, since than it would |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @fsync and @lehins)
rio/src/RIO/File.hs, line 228 at r1 (raw file):
in that case the full path will be used and the dirFd will simply be ignored
To me it being like openat()
would be the main benefit; it would inherit it's behaviour, but you'd have to read only 1 man page and knowledge from it would directly translate to the Haskell functions without surprises.
I would also find it good if we could export such openat wrapper then.
But we can do that once we actually want to give that openat wrapper to somebody.
I'm OK with the current approach; let's keep it as is until we want to export such functionality.
would mess up durable writes
Yes, ensuring that the right dir is fsynced would remain the task of the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @fsync and @lehins)
rio/src/RIO/File.hs, line 367 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
I think we had a misunderstanding on that one (I've added a longer explanation in the ohter place further up at
openFileFromDir
).With "not good" meant, like over there: "This function should support files in relative subdirectories or
../
style paths, likeopenat()
does; there is no reason not to support those."
Clarified in the other discussion.
rio/src/RIO/File.hs, line 390 at r1 (raw file):
Previously, nh2 (Niklas Hambüchen) wrote…
I meant this in relation to my comment on
openFileFromDir
: Likeopenat()
,renameat(..., oldpath, ... newpath)
supports giving full relative file paths asoldpath
andnewpath
, not only file names.So we could remove the invariant
Whenever DirFd is supplied, it must be the containgin directory fo this file
, and instead simply makefilePath
relative fromdirFd
.
Clarified in the other discussion.
rio/src/RIO/File.hs, line 331 at r1 (raw file): Previously, nh2 (Niklas Hambüchen) wrote…
It has 440 lines of code, it is 800 lines file only because of comments. :P |
rio/src/RIO/File.hs, line 178 at r3 (raw file): Previously, nh2 (Niklas Hambüchen) wrote…
Yes, it was a typo. Thanks. |
rio/src/RIO/File.hs, line 304 at r3 (raw file): Previously, nh2 (Niklas Hambüchen) wrote…
I just tried to read that piece of documentation myself and it didn't make any sense to me either. :D
See, it is a lot easier to make human language ambiguous, at least for me ;) Fixed. |
rio/src/RIO/File.hs, line 407 at r1 (raw file): Previously, nh2 (Niklas Hambüchen) wrote…
Totally agree with you, if that function was exported, except it is not. So comment here would really be useless for the user and for the developer it is easier to see it in the code, rather than read comments. At the same time, there is no mention of this temporary file naming pattern anywhere in the functions that are exported, I'll add the naming pattern to the relevant places, which will be included in generated haddock. |
rio/src/RIO/File.hs, line 453 at r1 (raw file): Previously, nh2 (Niklas Hambüchen) wrote…
I'll simply remove the comment for this argument. You can't name it, I don't want to name it. This function is not exported, so I really don't care. |
rio/src/RIO/File.hs, line 598 at r1 (raw file): Previously, nh2 (Niklas Hambüchen) wrote…
Done. |
rio/test/RIO/FileSpec.hs, line 121 at r1 (raw file): Previously, nh2 (Niklas Hambüchen) wrote…
Force of habit relying on generators. Fixed. |
* Durability of file writes with the help of `fsync()` syscall on the file and containing folder * Atomicity of file writes with the help of atomic renames with `rename()`, `renameat()` and `linkat()` syscalls, depending on how it was opened * Anonymous temp file utilization on Linux for atomic writes with the help of `openat()` and `O_TMPFILE` * Instead of merging this functionality into commercialhaskell/rio#167 it has been moved upstream into `unliftio`
* Durability of file writes with the help of `fsync()` syscall on the file and containing folder * Atomicity of file writes with the help of atomic renames with `rename()`, `renameat()` and `linkat()` syscalls, depending on how it was opened * Anonymous temp file utilization on Linux for atomic writes with the help of `openat()` and `O_TMPFILE` * Instead of merging this functionality into commercialhaskell/rio#167 it has been moved upstream into `unliftio`
This should be closed in favor of fpco/unliftio#48 Once it is merged in unliftio I'll create a separate PR with simple re-exports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5.
Reviewable status: complete! all files reviewed, all discussions resolved
* Durability of file writes with the help of `fsync()` syscall on the file and containing folder * Atomicity of file writes with the help of atomic renames with `rename()`, `renameat()` and `linkat()` syscalls, depending on how it was opened * Anonymous temp file utilization on Linux for atomic writes with the help of `openat()` and `O_TMPFILE` * Instead of merging this functionality into commercialhaskell/rio#167 it has been moved upstream into `unliftio` * Dropped support for ghc-7.8
* Fixed version of the functions where moved upstream into `unliftio` and are now re-exported from `rio`. See commercialhaskell#167 for more info and the actual PR review. * Bump up the version, update changelog and pin newer version of `typed-process` in stack.yaml with the hash
Modified version of the code from this PR was merged into |
This PR implements atomic file writes and fixes a bunch of issues with durable atomic and only durable file writes. In order to see the actual locations with short description of each issue see this PR
Below is higher level overview of the issues being fixed:
withBinaryFileDurableAtomic
fails with nested folders #160 (atomic renames aren't working with relative paths)copyFile
and we cannot guarantee successful fsync when handle is closedwithBinaryFileDurable
produces an error withReadMode
doesFileExist
andcopyFile
, which can cause an unexpectedDoesNotExist
exceptionopen
withO_TMPFILE
we can provide isolation, which is the implemented approach in this PR.Besides addition of
writeBinaryFileAtomic
andwithBinaryFileAtomic
, this PR also includes a bunch of tests for the all functions inRIO.Files
(close to 100% coverage).Few general changes:
-DWINDOWS
flag into thelibrary
section, all tests are OS agnostic and do not directly depend onunix
norWin32
-Wall
by default.readBinaryFile
,writeBinaryFile
andwithBinaryFileLazy
as synonyms for consistent namingRIO.Prelude.Types
fromRIO.Prelude
(it was imported qualified, which prevented from types being re-exported).import RIO
is not affected.RIO.File
This change is