Fix sync issue where data writes could appear before metadata writes #948
Merged
Conversation
Some forms of storage, mainly anything with an FTL (eMMC, SD, etc.), do not guarantee a strict write order for writes to different blocks. It would be good to test that this doesn't break littlefs.

This adds LFS_EMUBD_POWERLOSS_OOO to lfs_emubd, which tells lfs_emubd to try to break any order-dependent code on powerloss. The behavior right now is a bit simple, but does result in test breakage:

1. Save the state of the block on the first write (erase, really) after sync/init.
2. On powerloss, revert the first write to its original state.

This might be a bit confusing when debugging, since the block will appear to time-travel, but doing anything fancier would make emubd quite a bit more complicated. You could also get fancier with which/how many blocks to revert, but this should at least be sufficient to make sure bd sync calls are in the right place.
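The two steps above can be sketched as a toy block device model. Everything here (names, block sizes, the single-snapshot policy) is illustrative, not emubd's real internals; it just shows the save-on-first-write / revert-on-powerloss mechanism:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  16
#define BLOCK_COUNT 4

typedef struct {
    uint8_t blocks[BLOCK_COUNT][BLOCK_SIZE];
    uint8_t saved[BLOCK_SIZE]; // snapshot of the first block written after sync
    int saved_block;           // which block the snapshot is for, -1 if none
} ooo_bd;

void ooo_init(ooo_bd *bd) {
    memset(bd->blocks, 0, sizeof(bd->blocks));
    bd->saved_block = -1;
}

void ooo_write(ooo_bd *bd, int block, const uint8_t *data) {
    // save the state of the block on the first write after sync/init
    if (bd->saved_block == -1) {
        memcpy(bd->saved, bd->blocks[block], BLOCK_SIZE);
        bd->saved_block = block;
    }
    memcpy(bd->blocks[block], data, BLOCK_SIZE);
}

void ooo_sync(ooo_bd *bd) {
    // sync is an ordering barrier: drop the snapshot
    bd->saved_block = -1;
}

void ooo_powerloss(ooo_bd *bd) {
    // revert the first post-sync write, making it appear that a later
    // write landed first (the block "time-travels")
    if (bd->saved_block != -1) {
        memcpy(bd->blocks[bd->saved_block], bd->saved, BLOCK_SIZE);
        bd->saved_block = -1;
    }
}
```

Reverting only the first write is the simplest policy that still catches missing sync barriers: any code that depends on the first post-sync write surviving before later writes will break under this model.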
Long story short, we aren't calling sync correctly in littlefs. This fixes that.

Some forms of storage, mainly anything with an FTL (eMMC, SD, etc.), do not guarantee a strict write order for writes to different blocks. In theory this is what bd sync is for: to tell the bd when it is important for the writes to be ordered.

Currently, littlefs calls bd sync after committing metadata. This is useful as it ensures that user code can rely on lfs_file_sync for ordering external side-effects. But this is insufficient for handling storage with out-of-order writes.

Consider the simple case of a file with one data block:

1. lfs_file_write(blablabla) => writes data into a new data block
2. lfs_file_sync() => commits metadata to point to the new data block

But with out-of-order writes, the bd is free to reorder things such that the metadata is updated _before_ the data is written. If we lose power, that would be bad.

The solution is to call bd sync twice: once before we commit the metadata, to tell the bd that these writes must be ordered, and once after, to allow ordering with user code.

As a small optimization, we only call bd sync if the current file is not inlined and has actually been modified (LFS_F_DIRTY). It's possible for inlined files to be interleaved with writes to other files.

Found by MFaehling and alex31
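A minimal sketch of where the two sync barriers land relative to the metadata commit. toy_bd and file_sync are illustrative stand-ins, not littlefs's real types or code paths:

```c
#include <assert.h>

typedef struct {
    int syncs;          // number of bd sync barriers issued
    int meta_committed; // has the metadata commit happened?
} toy_bd;

static void bd_sync(toy_bd *bd) {
    bd->syncs++;
}

static void file_sync(toy_bd *bd, int dirty, int inlined) {
    // first barrier: only needed if there are out-of-line data writes
    // that must land before the metadata commit (the LFS_F_DIRTY-style
    // check is the small optimization described above)
    if (dirty && !inlined) {
        bd_sync(bd);
    }

    // commit metadata pointing at the new data block
    bd->meta_committed = 1;

    // second barrier: lets user code rely on lfs_file_sync for
    // ordering external side-effects
    bd_sync(bd);
}
```

A dirty, non-inlined file pays for both barriers; an inlined file only pays for the trailing one, since its data rides along in the metadata commit itself.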
Unlike the heuristic-based testing, exhaustive powerloss testing effectively forks the current test and runs both the interrupted and uninterrupted test states to completion. But emubd wasn't expecting bd->cfg->powerloss_cb to return. The fix here is to keep track of both the old+new out-of-order block states and unrevert them if bd->cfg->powerloss_cb returns. This may leak the temporary copy, but powerloss testing is already inherently leaky.
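The unrevert idea can be sketched as follows, with hypothetical names: keep both the old (pre-write) and new (current) block contents around the callback, and restore the new contents if the callback returns:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OOO_BLOCK_SIZE 16

typedef void (*powerloss_cb_t)(void *ctx);

void ooo_emulate_powerloss(uint8_t *block, const uint8_t *old_state,
        powerloss_cb_t cb, void *ctx) {
    // remember the new state as well as the old one
    uint8_t new_state[OOO_BLOCK_SIZE];
    memcpy(new_state, block, OOO_BLOCK_SIZE);

    // revert the first post-sync write to emulate out-of-order loss
    memcpy(block, old_state, OOO_BLOCK_SIZE);

    // under heuristic testing the callback never returns; under
    // exhaustive testing it does, and both branches run to completion
    if (cb) {
        cb(ctx);
    }

    // unrevert so the uninterrupted branch sees the new state again
    memcpy(block, new_state, OOO_BLOCK_SIZE);
}
```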
We need to decrement the saved block state on sync, when we reset out-of-order emulation. Otherwise we leak blocks out the wazoo.
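A sketch of the leak fix, assuming refcounted block snapshots (all names here are hypothetical): the saved out-of-order snapshot holds a reference, and sync must release it when it resets out-of-order emulation, or every sync leaks a copy:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int refs;
    uint8_t data[16];
} rc_block;

rc_block *rc_block_alloc(void) {
    rc_block *b = malloc(sizeof(rc_block));
    b->refs = 1;
    return b;
}

// decrement the refcount; returns 1 if the block was actually freed
int rc_block_release(rc_block *b) {
    if (b && --b->refs == 0) {
        free(b);
        return 1;
    }
    return 0;
}

typedef struct {
    rc_block *ooo_saved; // snapshot held for powerloss emulation
    int freed;           // count of freed snapshots, for illustration
} rc_bd;

void rc_bd_sync(rc_bd *bd) {
    // the fix: drop the snapshot's reference when out-of-order
    // emulation resets on sync, instead of leaking it
    bd->freed += rc_block_release(bd->ooo_saved);
    bd->ooo_saved = NULL;
}
```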
Tests passed ✓, Code: 17024 B (+0.3%), Stack: 1432 B (+0.0%), Structs: 812 B (+0.0%)
This PR also adds LFS_EMUBD_POWERLOSS_OOO to lfs_emubd, which tells lfs_emubd to try to break any order-dependent code on powerloss. This should prevent a regression in the future.
Related issues: #822, #936