New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use atomic file writing and updating for configuration and key files #1060

Closed
enkore opened this Issue May 18, 2016 · 14 comments

Comments

Projects
None yet
3 participants
@enkore
Copy link
Contributor

enkore commented May 18, 2016

Proposed sequence

  • write to temporary
  • fsync(temp)
  • rename(temp, real)
  • fsync(containing_dir) <-- important for the repository config when we implement DEKs

馃挵 there is a bounty for this

@enkore enkore self-assigned this May 18, 2016

@enkore enkore added this to the 1.1 - near future goals milestone May 18, 2016

@ThomasWaldmann

This comment has been minimized.

Copy link
Member

ThomasWaldmann commented May 21, 2016

Note: the temporary file needs to be in same directory as the target file (just maybe some prefix/postfix, like ~*.tmp).

@ThomasWaldmann

This comment has been minimized.

Copy link
Member

ThomasWaldmann commented Jun 26, 2016

Note: I've put a bounty on this and have unassigned @enkore (who was assigned on this for rather long).

Anybody (including @enkore) feel free to work on this, please tell on bountysource when you start.

@enkore

This comment has been minimized.

Copy link
Contributor

enkore commented Jun 26, 2016

Had forgot about it, but here's a draft I had laying around from a month ago or so. Feel free to recycle :)

class SaveFile:
    """
    Update file contents atomically.

    Must be used as a context manager.

    On a journaling file system the file contents are always updated
    atomically and won't become corrupted, even on pure failures or
    crashes (for caveats see SyncFile).
    """

    SUFFIX = '.tmp'

    def __init__(self, path):
        self.path = path
        self.tmppath = os.path.join(self.path, self.SUFFIX)

    def __enter__(self):
        # XXX use platform-specific SyncFile
        self.fd = open(self.tmppath, 'xb')
        return self.fd

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.fd.close()
        if exc_type is not None:
            os.unlink(self.tmppath)
            return
        os.rename(self.tmppath, self.path)
        # XXX use platform-specific sync_dir
        sync_dir(os.path.dirname(self.path))

    def write(self, data):
        self.fd.write(data)

(this would go into borg.platform.base)

@PlasmaPower

This comment has been minimized.

Copy link
Contributor

PlasmaPower commented Jun 27, 2016

fsync(containing_dir)

fsync for python at least only applies to specific file descriptors. Any ideas on how to do this? I think on *nix you might be able to open a file descriptor for a directory, but I think that's not true for Windows.

@enkore

This comment has been minimized.

Copy link
Contributor

enkore commented Jun 27, 2016

Got an abstraction for that

        # XXX use platform-specific sync_dir
        sync_dir(os.path.dirname(self.path))

It's possible on Windows but that stuff needs more investigating how one can reliably cause a transaction flush there.

@enkore

This comment has been minimized.

Copy link
Contributor

enkore commented Jun 30, 2016

While doing "corruption testing" (results below) with various file systems (NTFS-3g, ext4, XFS) [2] I noticed that given just the right sequence of operations [1] one can manage to put some data in a freshly created repository on XFS and then lose the repo config rendering the repo unusable until an advanced power-user re-creates the config file.

I don't think this is a critical/big problem in normal usage patterns since the first create after an init will almost certainly take longer than the dirty-writeout timeout of the kernel (one to a couple minutes).


Results:

  • ntfs-3g isn't reliable at all on this system (specific USB controller + hub + thumb drive + software versions, other combinations yet to be tested)
    • See below for one example, but I observed even more fatal corruption (EIO on data/0 for example) which wouldn't even allow one to copy the repository to safety. Ouch!
  • ext4 doesn't seem to exhibit any corruption / issues at all
  • XFS seems more lenient than ext4, likely because ext4 is ordered write mode by default
    • Except the described case no corruption of the repository was seen
    • And the FS is also not corrupted at all
    • IOW ext4 simply guarantees more data-metadata consistency than XFS, in some circumstances and with the default options (disable ordered write and it'd be exactly the same as XFS).
  • XFS seems to "pin" the kernel device so that one needs -o nouuid to re-mount the thumb drive after it was removed since the kernel still thinks the device is present (explanation a few weeks ago from @verygreen, much appreciated). This can be seen from repeated XFS messages on the old device after it was physically removed when the XFS driver internally still seems to try to access the journal.
  • Naturally the lock is stale and needs break-lock (this may be something cleaned up with a auto-lock-release)

[1] borg init, borg create small archive (<1 MB in my case), borg create big archive, pull drive during create.
[2] First tests made for #1211, with rather ugly results:

Rather curious: ntfs-3g journaling doesn't seem to be very reliable. After a few pulls with no repository corruption I get an EIO on open(2) on a data file and a rather ugly ntfs-3g message:

Jun 27 20:59:35 tiger ntfs-3g[27314]: Trying to read non-allocated mft records (83 > 82): Nicht erlaubter Seek (seek not permissible)
Jun 27 20:59:35 tiger ntfs-3g[27314]: Trying to read non-allocated mft records (83 > 82): Nicht erlaubter Seek

Even root can't touch that inode anymore, not even rm the dirent. I'll see whether Windows chkdsk can fix that, or whether I have to reformat my thumb drive?!

@enkore enkore referenced this issue Jul 9, 2016

Open

Dealing with attic issues #5

240 of 246 tasks complete
@enkore

This comment has been minimized.

Copy link
Contributor

enkore commented Jul 9, 2016

Stapling jborg/attic#177

safe to say it's an "issue" outside attic. i.e. crash before the file was committed to disk and stuff like that. #1060 goes in the direction.

@PlasmaPower

This comment has been minimized.

Copy link
Contributor

PlasmaPower commented Jul 26, 2016

ntfs-3g isn't reliable at all on this system

Could you test this on a loop device? I'm not experiencing any corruption issues there:

truncate -s 1G part.raw
mkfs.ntfs -F part.raw
mkdir /mnt/loop
ntfs-3g part.raw /mnt/loop
@enkore

This comment has been minimized.

Copy link
Contributor

enkore commented Jul 26, 2016

How can I remove a loop device in a similarly unexpected fashion as physically pulling a hot drive?

@PlasmaPower

This comment has been minimized.

Copy link
Contributor

PlasmaPower commented Jul 26, 2016

@enkore Good question, I think pkill -9 ntfs-3g will do that.

@enkore

This comment has been minimized.

Copy link
Contributor

enkore commented Jul 26, 2016

No, doesn't seem to cause the same issue. It smells like a hardware problem e.g. someone in the USB stack may be reordering writes or dropping sync-like commands.

@PlasmaPower

This comment has been minimized.

Copy link
Contributor

PlasmaPower commented Jul 26, 2016

Do you think it might be a race condition visible with the increased latency? I don't know of a good way to simulate increased latency on a disk, maybe sshfs with tc qdisc netem?

@enkore

This comment has been minimized.

Copy link
Contributor

enkore commented Jul 26, 2016

I really don't know. That live/visible metadata refers to non-existing MFT entries suggests (in my mind) that some basic precondition of the journaling must've been broken. I ATM don't have any mag drives I'd be willing to use for testing this, but when I get some I'll test more hardware wrt. this bug. (And take it upstream, if they have a proper list/bugtracker)

@ThomasWaldmann

This comment has been minimized.

Copy link
Member

ThomasWaldmann commented Aug 14, 2016

@enkore claim the bounty ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment