Skip to content

Do we need to clarify the fsync() close() Unix-ism? #42619

@federicomenaquintero

Description

@federicomenaquintero

I was doing a little investigation of how/when we close(2) files. This is done in impl Drop for FileDesc, libstd/sys/unix/fd.rs.

This is of course fine. In addition, I was wondering if Rust needs to expose the close() call in the public API, so people could actually check for errors when closing files, and per Unix lore, be safe when closing files on NFS yields an error.

Pretty much every API which wraps close(2) has run into the same issue. The links provided in the comments in the implementation for Glib's g_close() are enlightening, particularly this one. Summary:

  • You can't recover from EINTR on close(), so you should just ignore it and give up.
  • All other errors are of concern; pending writes and quotas (or just low disk space) mean that close() could give you ENOSPC or EIO or whatever.

I think there are two cases to consider:

  • The program is writing a random, "non-important" file. Maybe a temporary file or whatever. Maybe it can just close() the file and don't do anything about errors.
  • The program is writing a file with the user's data (i.e. File/Save on a word processor). In that case you really want to check errors on close(), to tell the user that something didn't work. Is it low disk space? Maybe the user can delete some files and try to save again. I/O error? Try saving to another disk. I.e. do anything and save before the program has a chance to crash elsewhere :)

People recommend doing fsync() before close() if you really care about the bits being on the disk. This makes sense; fsync() means that both file data and its metadata are written out. The question is whether a successful fsync() means that we can pretty much ignore the result of close().

My next question was about what the kernel is actually doing. On Linux, close() is implemented here:

SYSCALL_DEFINE1(close, unsigned int, fd)
{
	int retval = __close_fd(current->files, fd);

	/* can't restart close syscall because file table entry was cleared */
	if (unlikely(retval == -ERESTARTSYS ||
		     retval == -ERESTARTNOINTR ||
		     retval == -ERESTARTNOHAND ||
		     retval == -ERESTART_RESTARTBLOCK))
		retval = -EINTR;

	return retval;
}

That __close_fd() is here:

int __close_fd(struct files_struct *files, unsigned fd)
{
	struct file *file;

        ...
	return filp_close(file, files);

out_unlock:
	...
	return -EBADF;
}

And filp_close() is the one that actually calls file system implementations:

int filp_close(struct file *filp, fl_owner_t id)
{
	int retval = 0;

        ...
	if (filp->f_op->flush)
		retval = filp->f_op->flush(filp, id);

        ...
	return retval;
}

On Linux, the only thing (outside of EBADF) that can cause close() to return an error is a file system's flush() returning an error... or EINTR from a signal.

I then looked for which file systems implement f_op->flush(). It's only afs, cifs, ecryptfs, exofs, fuse, nfs. And on nfs's implementation, it just calls vfs_flush(), which calls underlying_filesystem->fsync(). Fuse's is hairier, as it actually depends on the implementing process. I didn't look at the others.

Also, I have no idea of what other operating systems do!

I think we could have a recommendation to call File.sync_all() for really important, user's data. I am not yet sure if we should export close() in the File API, as I cannot answer "is successful fsync() then close() enough for the safety of the user's data" yet :(

Comments are appreciated.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions