Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward-looking erase-state CRCs #497

Merged
merged 12 commits into from
Apr 26, 2023
Merged

Forward-looking erase-state CRCs #497

merged 12 commits into from
Apr 26, 2023

Conversation

geky
Copy link
Member

@geky geky commented Dec 7, 2020

Original issue here: #245 (comment)

This is a proposal to fix the out-of-order writes found by @pjsg's fuzzing work.

The problem is that it is possible for (non-NOR) block devices to write pages in any order, or to even write random data in the case of a power-loss. This breaks littlefs's use of the first bit in a page to indicate the erase-state.

@pjsg notes this behavior is documented in the W25Q here:
https://community.cypress.com/docs/DOC-10507


The basic idea here is to CRC the next page, and use this "erase-state CRC" to check if the next page is erased and ready to accept programs.

.------------------. \   commit
|     metadata     | |
|                  | +---.
|                  | |   |
|------------------| |   |
| erase-state CRC -----. |
|------------------| | | |
|   commit CRC    ---|-|-'
|------------------| / |
|     padding      |   | padding (doesn't need CRC)
|                  |   |
|------------------| \ | next prog
|     erased?      | +-'
|        |         | |
|        v         | /
|                  |
|                  |
'------------------'

This is made a bit annoying since littlefs doesn't actually store the page (prog_size) in the superblock, since it doesn't need to know the size for any other operation. We can work around this by storing both the CRC and size of the next page when necessary.

Another interesting note is that we don't need to any bit tweaking information, since we read the next page every time we would need to know how to clobber the erase-state CRC. And since we only read prog_size, this works really well with our caching, since the caches must be a multiple of prog_size.

This also brings back the internal lfs_bd_crc function, in which we can use some optimizations added to lfs_bd_cmp in #395.

TODO:

  • Fix failing tests cases
  • Add tests for backwards compatibility
  • Implement proper handling of on-disk minor version bump
  • Documentation update for DESIGN/SPEC

Depends on #495
Related to #245

@geky geky added needs minor version new functionality only allowed in minor versions needs fix we know what is wrong needs documentation needs documentation needs test all fixes need test coverage to prevent regression labels Dec 7, 2020
@geky geky changed the base branch from devel to test-and-bench-runners December 11, 2022 06:26
@geky geky added needs test all fixes need test coverage to prevent regression and removed needs fix we know what is wrong needs test all fixes need test coverage to prevent regression labels Dec 11, 2022
@geky geky force-pushed the crc-rework-2 branch 2 times, most recently from 918bb02 to 31f4d26 Compare December 11, 2022 07:18
@geky geky force-pushed the crc-rework-2 branch 2 times, most recently from db2abfb to aa225a9 Compare December 16, 2022 06:05
geky and others added 6 commits December 17, 2022 12:42
This change is necessary to handle out-of-order writes found by pjsg's
fuzzing work.

The problem is that it is possible for (non-NOR) block devices to write
pages in any order, or to even write random data in the case of a
power-loss. This breaks littlefs's use of the first bit in a page to
indicate the erase-state.

pjsg notes this behavior is documented in the W25Q here:
https://community.cypress.com/docs/DOC-10507

---

The basic idea here is to CRC the next page, and use this "erase-state CRC" to
check if the next page is erased and ready to accept programs.

.------------------. \   commit
|     metadata     | |
|                  | +---.
|                  | |   |
|------------------| |   |
| erase-state CRC -----. |
|------------------| | | |
|   commit CRC    ---|-|-'
|------------------| / |
|     padding      |   | padding (doesn't need CRC)
|                  |   |
|------------------| \ | next prog
|     erased?      | +-'
|        |         | |
|        v         | /
|                  |
|                  |
'------------------'

This is made a bit annoying since littlefs doesn't actually store the
page (prog_size) in the superblock, since it doesn't need to know the
size for any other operation. We can work around this by storing both
the CRC and size of the next page when necessary.

Another interesting note is that we don't need to any bit tweaking
information, since we read the next page every time we would need to
know how to clobber the erase-state CRC. And since we only read
prog_size, this works really well with our caching, since the caches
must be a multiple of prog_size.

This also brings back the internal lfs_bd_crc function, in which we can
use some optimizations added to lfs_bd_cmp.

Needs some cleanup but the idea is passing most relevant tests.
- General cleanup from integration, including cleaning up some older
  commit code
- Partial-prog tests do not make sense when prog_size == block_size
  (there can't be partial-progs!)
- Fixed signed-comparison issue in modified filebd
Previously forward-looking CRCs was just two new CRC types, one for
commits with forward-looking CRCs, one without. These both contained the
CRC needed to complete the current commit (note that the commit CRC
must come last!).

         [--   32   --|--   32   --|--   32   --|--   32   --]
with:    [  crc3 tag  | nprog size |  nprog crc | commit crc ]
without: [  crc2 tag  | commit crc ]

This meant there had to be several checks for the two possible structure
sizes, messying up the implementation.

         [--   32   --|--   32   --|--   32   --|--   32   --|--   32   --]
with:    [nprogcrc tag| nprog size |  nprog crc | commit tag | commit crc ]
without: [ commit tag | commit crc ]

But we already have a mechanism for storing optional metadata! The
different metadata tags! So why not use a separate tage for the
forward-looking CRC, separate from the commit CRC?

I wasn't sure this would actually help that much, there are still
necessary conditions for wether or not a forward-looking CRC is there,
but in the end it simplified the code quite nicely, and resulted in a ~200 byte
code-cost saving.
This fixes most of the remaining bugs (except one with multiple padding
commits + noop erases in test_badblocks), with some other code tweaks.

The biggest change was dropping reliance on end-of-block commits to know
when to stop parsing commits. We can just continue to parse tags and
rely on the crc for catch bad commits, avoiding a backwards-compatiblity
hiccup. So no new commit tag.

Also renamed nprogcrc -> fcrc and commitcrc -> ccrc and made naming in
the code a bit more consistent.
Initially I thought the fcrc would be sufficient for all of the
end-of-commit context, since indicating that there is a new commit is a
simple as invalidating the fcrc. But it turns out there are cases that
make this impossible.

The surprising, and actually common, case, is that of an fcrc that
will end up containing a full commit. This is common as soon as the
prog_size is big, as small commits are padded to the prog_size at
minimum.

  .------------------. \
  |     metadata     | |
  |                  | |
  |                  | +-.
  |------------------| | |
  |   foward CRC ------------.
  |------------------| / |   |
  |   commit CRC    -----'   |
  |------------------|       |
  |     padding      |       |
  |                  |       |
  |------------------| \   \ |
  |     metadata     | |   | |
  |                  | +-. | |
  |                  | | | +-'
  |------------------| / | |
  |   commit CRC --------' |
  |------------------|     |
  |                  |     /
  '------------------'

When the commit + crc is all contained in the fcrc, something silly
happens with the math behind crcs. Everything in the commit gets
canceled out:

  crc(m) = m(x) x^|P|-1 mod P(x)

  m ++ crc(m) = m(x) x^|P|-1 + (m(x) x^|P|-1 mod P(x))

  crc(m ++ crc(m)) = (m(x) x^|P|-1 + (m(x) x^|P|-1 mod P(x))) x^|P|-1 mod P(x)

  crc(m ++ crc(m)) = (m(x) x^|P|-1 + m(x) x^|P|-1) x^|P|-1 mod P(x)

  crc(m ++ crc(m)) = 0 * x^|P|-1 mod P(x)

This is the reason the crc of a message + naive crc is zero. Even with an
initializer/bit-fiddling, the crc of the whole commit ends up as some
constant.

So no manipulation of the commit can change the fcrc...

But even if this did work, or we changed this scheme to use two
different checksums, it would still require calculating the fcrc of
the whole commit to know if we need to tweak the first bit to invalidate
the unlikely-but-problematic case where we happen to match the fcrc. This
would add a large amount of complexity to the commit code.

It's much simpler and cheaper to keep the 1-bit counter in the tag, even
if it adds another moving part to the system.
…orphans

This of course should never happen normally, two half-orphans requires
two parents, which is disallowed in littlefs for this reason. But it can
happen if there is an outdated half-orphan later in the metadata
linked-list. The two half-orphans can cause the deorphan step to get
stuck, constantly "fixing" the first half-orphan before it has a chance
to remove the problematic, outdated half-orphan later in the list.

The solution here is to do a full check for half-orphans before
restarting the half-orphan loop. This strategy has the potential to
visit more metadata blocks unnecessarily, but avoids situations where
removing a later half-orphan will eventually cause an earlier
half-orphan to resolve itself.

Found with heuristic powerloss testing with test_relocations_reentrant_renames
after 192 nested powerlosses.
@pjsg
Copy link

pjsg commented Dec 21, 2022

This looks exciting!

@geky geky added this to the v2.6 milestone Apr 17, 2023
@geky geky force-pushed the crc-rework-2 branch 2 times, most recently from 7e5f2fc to ba1c764 Compare April 19, 2023 18:52
This wasn't implemented correctly anyways, as it would need to recursively
rename directories that may not exist. Things would also get a bit
complicated if only some files in a directory were renamed.

Doable, but not needed for our use case.

For now just ignore any directory components. Though this may be worth
changing if the source directory structure becomes more complicated in
the future (maybe with a -r/--recursive flag?).
@geky geky force-pushed the crc-rework-2 branch 6 times, most recently from 9954939 to 81a7a1e Compare April 21, 2023 05:01
This is a bit tricky since we need two different version of littlefs in
order to test for most compatibility concerns.

Fortunately we already have scripts/changeprefix.py for version-specific
symbols, so it's not that hard to link in the previous version of
littlefs in CI as a separate set of symbols, "lfsp_" in this case.

So that we can at least test the compatibility tests locally, I've added
an ifdef against the expected define "LFSP" to define a set of aliases
mapping "lfsp_" symbols to "lfs_" symbols. This is manual at the moment,
and a bit hacky, but gets the job done.

---

Also changed BUILDDIR creation to derive subdirectories from a few
Makefile variables. This makes the subdirectories less manual and more
flexible for things like LFSP. Note this wasn't possible until BUILDDIR
was changed to default to "." when omitted.
This uses the "github.event.pull_request.base.ref" variable as the
"lfsp" target for compatibility testing.
This just means a rewrite of the superblock entry with the new minor
version.

Though it's interesting to note, we don't need to rewrite the superblock
entry until the first write operation in the filesystem, an optimization
that is already in use for the fixing of orphans and in-flight moves.

To keep track of any outdated minor version found during lfs_mount, we
can carve out a bit from the reserved bits in our gstate. These are
currently used for a counter tracking the number of orphans in the
filesystem, but this is usually a very small number so this hopefully
won't be an issue.

In-device gstate tag:

  [--       32      --]
  [1|- 11 -| 10 |1| 9 ]
   ^----^-----^--^--^-- 1-bit has orphans
        '-----|--|--|-- 11-bit move type
              '--|--|-- 10-bit move id
                 '--|-- 1-bit needs superblock
                    '-- 9-bit orphan count
See SPEC.md for more info.

Also considered adding an explanation to DESIGN.md, but there's not a
great place for it. Maybe FCRCs are too low-level for the high-level
design document. Though may be worth reconsidering if DESIGN.md gets
revisited.
@geky geky removed needs documentation needs documentation needs test all fixes need test coverage to prevent regression needs minor version new functionality only allowed in minor versions labels Apr 21, 2023
@geky
Copy link
Member Author

geky commented Apr 21, 2023

Just added documentation about this change to the SPEC.md:
https://github.com/littlefs-project/littlefs/pull/497/files?short_path=7426c9e#diff-7426c9e3a694ca6015df5f98637912975f2edea23270203ee89a8bdeed246ee0

Things look like they are working so this should be ready to merge soon.

Sorry again about the delay, this is the first change that involves bumping the on-disk minor version. This sort of irreversible change can lead to unpleasant situations if a mistake is made, so I wanted to make sure forward/backward compatibility is decently tested first.

116332d and 4c93600 should hopefully provide decent confidence that the on-disk minor version is respected. This will also need a good explanation in the release notes...

@geky geky marked this pull request as ready for review April 26, 2023 06:00
@geky geky changed the base branch from test-and-bench-runners to devel April 26, 2023 06:09
@geky geky merged commit 6f074eb into devel Apr 26, 2023
93 checks passed
@geky geky mentioned this pull request May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants