
[extension/filestorage] Compaction fails if the target directory is on a different filesystem #13449

Closed
swiatekm-sumo opened this issue Aug 22, 2022 · 5 comments

@swiatekm-sumo
Contributor

Describe the bug
File storage compaction uses an intermediate directory to store the compacted DB before moving it back to the storage path. The move is implemented as an os.Rename:

if err = os.Rename(compactedDbPath, dbPath); err != nil {

which fails with an EXDEV error code if the two directories are on different filesystems.
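
For reference, a minimal standalone sketch (not from the repo; the paths are made up) showing how the failure surfaces:

package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Two paths assumed to live on different filesystems,
	// e.g. a disk-backed volume and tmpfs.
	err := os.Rename("/tmp/otc/compacted.db", "/var/lib/otc/storage/data.db")
	if errors.Is(err, syscall.EXDEV) {
		// The kernel refuses cross-device renames with EXDEV,
		// which Go surfaces as "invalid cross-device link".
		fmt.Println("rename failed:", err)
	}
}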

Steps to reproduce
Set the storage directory and the compaction directory on different filesystems, and trigger a compaction.

What did you expect to see?
I expected the compaction to succeed.

What did you see instead?
I got an "invalid cross-device link" error message, and the storage then got into a state where nothing could be written to it, as the DB wasn't open after the error.

What version did you use?
v0.54.0 and v0.58.0.

What config did you use?
Cut down to the important bits:

extensions:
  file_storage:
    directory: /var/lib/otc/storage
    compaction:
      on_start: true
      on_rebound: true
      directory: /tmp/otc

Environment
Originally saw this in Kubernetes, AWS EKS 1.21 to be exact, with the directories placed on an EBS volume and a local tmpDir volume, respectively. It can be reproduced on any two different filesystems, though, including tmpfs on Linux.

Additional context

By itself, this problem should simply cause the compaction to fail and the storage to continue functioning normally, but because we unnecessarily remove the original DB file here, it actually stops the storage client from working completely. Not sure if that's worth opening a separate issue, WDYT @djaglowski?

@swiatekm-sumo added the bug label on Aug 22, 2022
@swiatekm-sumo
Contributor Author

FYI @djaglowski I'm planning to submit a PR to fix this; opened the issue so I don't forget. I think the fix is simply to fall back to a Truncate + Copy if the Rename fails, but I'm open to alternatives.
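
A minimal sketch of that fallback, with assumed names (moveOrCopy is hypothetical, not necessarily what the PR will look like):

package filestorage

import (
	"errors"
	"io"
	"os"
	"syscall"
)

// moveOrCopy is a hypothetical helper: rename if possible, otherwise
// truncate the destination and copy the compacted file over it.
func moveOrCopy(src, dst string) error {
	err := os.Rename(src, dst)
	if err == nil || !errors.Is(err, syscall.EXDEV) {
		return err // succeeded, or failed for an unrelated reason
	}
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()
	// O_TRUNC is the "Truncate" part: reuse the original DB file
	// instead of renaming across devices.
	out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0600)
	if err != nil {
		return err
	}
	defer out.Close()
	if _, err := io.Copy(out, in); err != nil {
		return err
	}
	if err := out.Sync(); err != nil {
		return err
	}
	return os.Remove(src) // drop the temporary compacted copy
}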

@djaglowski
Member

> By itself, this problem should simply cause the compaction to fail and the storage to continue functioning normally, but because we unnecessarily remove the original DB file here, it actually stops the storage client from working completely. Not sure if that's worth opening a separate issue, WDYT @djaglowski?

I agree the current behavior is incorrect.

I wonder if a more robust solution would be to use something like a blue/green pattern, where the client alternates between two file names. This would mean we would not need to depend on any renaming, and also would not leak files. Very roughly:

func (c *Client) Start() error {
	// Open the "blue" copy first; error handling elided elsewhere.
	db, err := bbolt.Open(name+".blue", 0600, nil)
	if err != nil {
		return err
	}
	c.currentDB = db
	c.blue = true
	return nil
}

func (c *Client) Compact() {
	// Alternate between the two file names.
	compactName := name + ".green"
	if !c.blue {
		compactName = name + ".blue"
	}
	compactDB, err := bbolt.Open(compactName, 0600, nil)
	if err != nil {
		return // keep using the old one
	}
	// bbolt.Compact copies all keys from the source DB into the target.
	if err := bbolt.Compact(compactDB, c.currentDB, 0); err == nil {
		oldPath := c.currentDB.Path()
		c.currentDB.Close()
		os.Remove(oldPath)
		c.currentDB = compactDB
		c.blue = !c.blue
	} else {
		compactDB.Close() // keep using the old one
	}
}

@swiatekm-sumo
Contributor Author

> I wonder if a more robust solution would be to use something like a blue/green pattern, where the client alternates between two file names. [...]

That'd be easier to get right, but it has the downside of requiring both files to be on the same filesystem. One situation where compaction is useful is when we've run out of space on the device; compacting to a different device and then moving the file back lets us actually reclaim the space. With this proposal, we'd simply be unable to compact at all.

On the other hand, it's tricky to get the current method working correctly so that it always leaves a working database in case of failure. Moving files between devices is non-atomic, unlike os.Rename, so we can end up with a corrupt DB if the process is killed at an unlucky time.
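
For illustration, one way to keep that window small is to stage the copy on the destination filesystem and finish with a same-filesystem rename; a sketch with a hypothetical helper name:

package filestorage

import (
	"io"
	"os"
	"path/filepath"
)

// crashSafeCopy copies src into a temp file next to dst, fsyncs it,
// and then publishes it with an atomic same-filesystem rename, so a
// crash mid-copy leaves the old dst intact rather than corrupted.
func crashSafeCopy(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()
	tmp, err := os.CreateTemp(filepath.Dir(dst), "compacted-*.db")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // no-op once the rename succeeds
	if _, err := io.Copy(tmp, in); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush before publishing
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// tmp and dst share a filesystem, so this rename is atomic:
	// dst is either the old file or the complete new one.
	return os.Rename(tmp.Name(), dst)
}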

@djaglowski
Member

Good points. I think you are more tuned into the compaction use case, so I'm happy to review the implementation you think is most appropriate here.

@swiatekm-sumo
Contributor Author

Fixed in #13730
