Open speed and cache sizes #783

bitzer-tjo opened this issue Mar 9, 2023 · 14 comments

@bitzer-tjo
Hi

We have done a lot of performance testing with LittleFS and some of the results don't add up. Maybe someone can tell us why this is and explain what we see.
We are using an STM32H7 and a Micron data flash running with a 66 MHz SPI clock.
We use LittleFS 2.4.5 but have also tried the latest 2.5.1 and the results are similar; it is a bit faster and shaves off a few hundred ms where file creation is worst.

File creation time (open and sync)
We have concentrated on file creation times (so an open and sync). We tested with different cache/prog sizes and file sizes. We created 100 files in the root folder and measured the time in milliseconds for doing an open and sync. Then we write data to the file, close it, and create a new file until we reach 100 files. If we tested with a 128-byte file size, 128 bytes were written to each file.
As expected, the file creation time goes up as more and more files are in the folder. But some of the creations take on the order of seconds. I have checked, and the time is not spent in the data flash driver. It also changes with the file size and cache size.
Some measurements with file size 0, 128, 256 and 512 bytes

file creation 0 size
file creation 128 byte size
file creation 256 byte size
file creation 512 byte size

Everything above 512 bytes looks the same, even if every file has a size of 11 KB.
For file size 0, there are some spikes and file no. 92 takes almost 1.4 seconds.
If the file size is within the cache buffer size, it seems to perform very well, but outside that we see these spikes!
For file sizes above 512 bytes, file no. 81 takes more than 3.5 seconds (up until an 11 KB file size).
We ran with this config:
Block size: 4 KB
Read size: 32
Program size: 256
Lookahead size: 256
Cache size: 256
Why do we see this? Is there a good explanation?

Here is the same test but with different cache sizes. With a higher cache size we see spikes more frequently?

File creation vs cache size

Cache and program buffer size impact:
When we look at the cache and program buffer sizes, we see the following. It is the same test, but I only log the time for the file that took the longest to create, and I increment the file size by 32 bytes for each loop.
Pseudo code:
Loop until file size is 1 KB
Format
Mount
Loop 100 times: start time – open – sync – diff time – write file-size data to file – close – log highest diff
Increment file size by 32 bytes
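
For reference, here is a minimal C sketch of that measurement loop using the public littlefs API. This is not our actual test code; millis(), the data buffer, and the file naming are placeholders, and error handling is omitted for brevity.

```c
#include <stdio.h>
#include <stdint.h>
#include "lfs.h"

extern lfs_t lfs;               // assumed already formatted and mounted
extern uint32_t millis(void);   // placeholder millisecond timer

// Create 100 files, timing only the open + sync of each one.
void create_files(lfs_size_t file_size) {
    static uint8_t data[1024] = {0};

    for (int i = 0; i < 100; i++) {
        char name[16];
        snprintf(name, sizeof(name), "file%03d", i);

        uint32_t start = millis();
        lfs_file_t file;
        lfs_file_open(&lfs, &file, name, LFS_O_WRONLY | LFS_O_CREAT);
        lfs_file_sync(&lfs, &file);
        uint32_t diff = millis() - start;   // creation time = open + sync

        lfs_file_write(&lfs, &file, data, file_size);
        lfs_file_close(&lfs, &file);

        printf("file %d: %lu ms\n", i, (unsigned long)diff);
    }
}
```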

worst cache 256
worst cache 512 bytes

Is this what to expect?

Best regards
Thomas

@MWP

MWP commented Mar 9, 2023

Do you have the tools to perform runtime code profiling?
It should make the cause pretty clear.

@bitzer-tjo

bitzer-tjo commented Mar 9, 2023

I have tried doing profiling using trace buffers. The TRACE pins are not available on my hardware.
This is what I can get, creating 100 files. File no. 92 took 1.1 s to create.

code profiling

I don't exactly see what this tells me?

@geky

geky commented Mar 24, 2023

Hi @TjoBitDk, these are some interesting results, thanks for sharing.

One of the main bottlenecks in littlefs is its $O(m^2)$ metadata compaction, which I think is the main problem here. This is something I'm looking at very closely at the moment.


As expected, the file creation time goes up as more and more files are in the folder. But some of the creations take on the order of seconds.

littlefs stores metadata in small, two-block logs. These are quick to append to, but need to be "compacted" when they are full. The current compaction algorithm is fairly naive, built on the bad assumption that these blocks are "small", and it leads to runaway read operations.

The reason for the spikes is that this cost is only paid when the metadata blocks are full. If they still have space, appending is pretty fast.

file no 92 takes almost 1.4 seconds.

This may be caused by the block-level garbage collector, though it would take more investigation to confirm.

Program size 256
Lookahead size 256
Cache size 256

Just to check, are you changing cache size independently of program size? littlefs generally benefits from a smaller program size. Larger program sizes limit the granularity of what can be written and require adding padding. This can lead to more compactions and wasted erases when files aren't aligned.

Here is the same test but with different cache sizes. With a higher cache size we see spikes more frequently?

This is an unfortunate consequence of cache size determining inline-file limits. They were merged at one point to reduce the config options, but this was probably a mistake and they should be disentangled.

Understanding why it behaves this way requires understanding how inline files work in littlefs.

Normally, files are allocated and stored at the block level. But this is very wasteful for small files, so small files can be "inlined" directly into the metadata blocks. This is normally a net benefit, but very large inline files can increase the amount of metadata written, leading to more compactions.

The measurements you have show this quite nicely.

The threshold for inline files is currently controlled by two things:

  1. cache_size - for other reasons inline files need to be fully cacheable.
  2. 1/8th block_size - a somewhat arbitrary cutoff to avoid runaway compactions.

The 1/8th block_size is probably why you're seeing a change in behavior at 512 bytes (4096/8).
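
For illustration only, here is a rough sketch of how those two bounds combine into an effective inline limit. This is not the exact lfs.c source, just the relationship described above (lfs_min comes from lfs_util.h):

```c
#include "lfs.h"
#include "lfs_util.h"  // for lfs_min()

// Rough sketch (not the actual lfs.c code): a file can only stay inline
// while it is smaller than both of the bounds described above.
static lfs_size_t inline_limit(const struct lfs_config *cfg) {
    return lfs_min(cfg->cache_size,       // inline files must be fully cacheable
                   cfg->block_size / 8);  // ~1/8-block cutoff against runaway compaction
}
```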


In theory the inline file threshold doesn't need to be bound to cache_size, as long as it's $\le$ cache_size. This should probably be reintroduced as a separate config option.

As for the $\gt 1$ second performance spikes, performance has previously not been a priority for littlefs, which is why certain scenarios behave rather terribly. It's something I'm working on, but it will end up needing a significant amount of changes.

Hopefully this at least helps explain what's going on.

@geky added the "enhancement" and "needs fix (we know what is wrong)" labels Mar 24, 2023
@bitzer-tjo

Hi Geky,

Thanks for your technical explanation.
Would it be possible to control this 'compacting' (by a setting) so we don't have read runaways?
The biggest issue for us is the case of power failure: then we need to save some data to a file, but we cannot predict when it will compact, so we could hit one of those 'long' file creation times. We have 1-2 seconds of backup power when power fails, so we have limited time to save the file.
We can live with wasting storage space by doing no (or limited) compacting.

Thomas

@evpopov

evpopov commented Mar 27, 2023

@TjoBitDk, thanks for sharing your findings and starting this great discussion.
I don't want to side-track this, I just wanted to mention a suggestion that may save you a bunch of time when writing files. It's also probably worth a completely new thread, but here goes....
Try creating a simple erase cache on power-up and then maintaining it on a regular basis by traversing the fs and looking for newly abandoned blocks. In doing so, you can keep track of which blocks are "unused and erased" vs "unused and dirty" and proactively erase the dirty ones. Once implemented, this approach allows you to greatly improve the performance of your .erase() callback: when LittleFS calls your .erase() function with a block that you know is already erased, simply return immediately.
There's also one more trick I pulled at boot time: reading ((lfs_t *)pLFS)->free.off, which gets me the first block that LittleFS will attempt to write to, and making sure my erased-sector checker (garbage collector) begins there.
As long as your write throughput is much lower than your ability to proactively erase sectors, your erase cache will "outrun" LittleFS's need for clean sectors and every call to .erase() will return immediately.
The above-described erase cache (garbage collection) algorithm has made writing a lot faster and more predictable for me. Maybe someday I can share this in a PR if there's enough interest.
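
Not my actual implementation, but a minimal sketch of the idea at the block-device layer. The erased_map bitmap, BLOCK_COUNT, and flash_erase_block() are made-up names, and a real version would also need to clear a block's "erased" flag whenever it gets programmed and repopulate the map after mount:

```c
#include <stdbool.h>
#include <stdint.h>
#include "lfs.h"

#define BLOCK_COUNT 1024  // hypothetical number of blocks on the device

extern int flash_erase_block(lfs_block_t block);  // hypothetical low-level driver call

// Bitmap of blocks known to be erased, maintained by a background task that
// traverses the filesystem and proactively erases abandoned blocks.
static uint8_t erased_map[BLOCK_COUNT / 8];

static bool is_marked_erased(lfs_block_t block) {
    return erased_map[block / 8] & (1u << (block % 8));
}

// .erase callback handed to littlefs via struct lfs_config
int bd_erase(const struct lfs_config *c, lfs_block_t block) {
    (void)c;
    if (is_marked_erased(block)) {
        return 0;  // already erased by the background task, return immediately
    }

    int err = flash_erase_block(block);
    if (err == 0) {
        erased_map[block / 8] |= (uint8_t)(1u << (block % 8));
    }
    return err;
}
```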

I hope that helps, and again, sorry for the slightly off-topic post.

@GaryJSAdv

Following with interest as offline garbage collection (as part of an idle task) is something I'd quite like to add to the codebase I'm working on. I'd be interested to see your PR @evpopov. There's a related PR here:

#610

@geky

geky commented Apr 10, 2023

Sorry about the late response.

Would it be possible to control this 'compacting' (by a setting) so we don't have read runaways?

Currently, no, unfortunately. Compaction only triggers when the block is full, so there is no way to delay compaction.

A wishlist item I've had would be to add a garbage collection routine that checks how full each metadata block is, and eagerly compacts if it's above a user-configured threshold. But this has been lower priority for me personally vs trying to solve the underlying issue in the on-disk data-structure.

Such a "metadata-threshold" garbage collector tackles a different form of garbage than @evpopov's proposal. There are two forms of "garbage" that need to be collected in littlefs: outdated metadata in metadata blocks, and unused blocks for block allocation. Unfortunately I don't think @evpopov's proposal or #610 would help in this case, though they may help with the larger spike you are seeing.

It would be interesting to see both of the garbage collectors in a common gc routine that could be called in the background.

I'll update @evpopov's gc discussion with more info (#791)


It is probably worth mentioning the metadata_max config added by @mon in #502:

littlefs/lfs.h

Lines 261 to 265 in 6a53d76

// Optional upper limit on total space given to metadata pairs in bytes. On
// devices with large blocks (e.g. 128kB) setting this to a low size (2-8kB)
// can help bound the metadata compaction time. Must be <= block_size.
// Defaults to block_size when zero.
lfs_size_t metadata_max;

This config option artificially limits the metadata block size to put an upper limit on how long compact operations take. It can improve the speed, but it wastes storage in metadata blocks.

It's heavy-handed, as it is a temporary workaround, but it may be helpful in this case if latency is worth the storage tradeoff.
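
For example, something along these lines (a sketch only: the values are illustrative, block_count is arbitrary, and the block device callbacks are assumed to be defined elsewhere) caps each metadata log well below the 4 KB block size:

```c
#include "lfs.h"

// Hypothetical block device callbacks, defined elsewhere.
extern int bd_read(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, void *buffer, lfs_size_t size);
extern int bd_prog(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, const void *buffer, lfs_size_t size);
extern int bd_erase(const struct lfs_config *c, lfs_block_t block);
extern int bd_sync(const struct lfs_config *c);

const struct lfs_config cfg = {
    .read  = bd_read,
    .prog  = bd_prog,
    .erase = bd_erase,
    .sync  = bd_sync,

    .read_size      = 32,
    .prog_size      = 256,
    .block_size     = 4096,
    .block_count    = 1024,
    .block_cycles   = 500,
    .cache_size     = 256,
    .lookahead_size = 256,

    // Bound metadata compaction time, at the cost of wasted space in each
    // metadata block (0 would default back to block_size).
    .metadata_max   = 512,
};
```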

@bitzer-tjo

Hi

I did try metadata_max, and yes, it did indeed help the performance.
Here are some tests with the default (4 KB) metadata_max:
image
...and 300 bytes metadata_max:
image

But this of course wastes a lot of space.
Here is the number of files of different sizes that can be created in a 200 KB filesystem with 4 KB and with 300 bytes metadata_max:

image
So with a file size of 100 bytes I could create 304 files on the 200 KB filesystem (that is still not many), but only 37 files with metadata_max set to 300 bytes. The bigger the files, the more closely the possible file counts match. I think that makes sense.
I also looked at the allocated blocks with different metadata_max sizes after 100 files were created, again with different file sizes:

image

I would guess it is all within what is expected.

@geky

geky commented Apr 19, 2023

Hi @TjoBitDk, thanks for sharing this. This is a really valuable analysis.

@CalebDueck

I'm using NAND flash memory (W25N01GVZEIR) where each block of memory consists of 64 pages and each page contains 2048 bytes.
I experimented with different sizes of metadata_max and found that I got the highest speed when metadata_max <= page size, with no advantage to decreasing metadata_max below the size of one page.

Below are the speeds for writing 1000 rows to a file, 1 row at a time.

writing rows diff metadata

A few questions:

  1. Is there any way I could call compaction (or whatever causes the drop in write time)? I know you mentioned there isn't a function for this, but if you could let me know where this action takes place in lfs I could look into developing some custom code.

  2. How does the metadata_max size waste space?
    If I set metadata_max to 1 page, does that mean I reduce my total storage capacity by pagesize/blocksize?
    Or is there just 1 instance of metadata for each file?

  3. Would uninitializing lfs after each operation make the speed faster? (I only need to write one row of 100-200 bytes once every 25 seconds)

@geky

geky commented May 23, 2023

Hi @CalebDueck, thanks for sharing.

Is there any way I could call compaction (or whatever causes the drop in write time)? I know you mentioned there isn't a function for this, but if you could let me know where this action takes place in lfs I could look into developing some custom code.

The metadata commit layers are a bit complex, as commits can trigger a number of different operations (and a bit messy as the code has evolved over time).

The beginning of compaction is here, in lfs_dir_relocatingcommit: lfs.c#L2229-L2239

This can end up:

  1. Relocating the mdir
  2. Splitting the mdir into two mdirs
  3. Creating orphans that need to be cleaned up

(Internally, lfs_mdir_t represents a fetched metadata pair; the naming could be more consistent in the documentation.)

I think the easiest thing to do would be to take advantage of the fact that lfs_dir_commit always compacts when mdir.erased is false.

Triggering compaction could be done by the following steps (a rough sketch follows the list):

  1. Fetch each mdir with lfs_dir_fetch, lfs_dir_rawtraverse may be a good reference for this.
  2. Check if mdir.off / lfs->cfg->block_size is > some threshold. Keep in mind compaction will only reduce an mdir to 1/2 the block_size. It may not be worth compacting if there are only one or two commits after the previous compaction.
  3. If you want to compact, set mdir.erased = false
  4. Call lfs_dir_commit with an empty commit (lfs_dir_commit(lfs, mdir, NULL, 0)), this should trigger compaction.
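
A very rough sketch of those steps, under some assumptions: lfs_dir_fetch, lfs_dir_commit, and lfs_pair_isnull are internal static functions, so this would have to be compiled as part of lfs.c; the 3/4-full threshold is arbitrary; and cycle detection plus the split/relocation/orphan cases mentioned above are ignored:

```c
// Sketch only: walk the metadata-pair list and force compaction of any
// mdir whose log is more than ~3/4 full. Relies on lfs.c internals.
static int lfs_force_compaction(lfs_t *lfs) {
    lfs_mdir_t mdir = {.tail = {0, 1}};  // traversal starts at the superblock pair

    while (!lfs_pair_isnull(mdir.tail)) {
        int err = lfs_dir_fetch(lfs, &mdir, mdir.tail);
        if (err) {
            return err;
        }

        // mdir.off is how much of the block the log currently uses
        if (mdir.off > (3*lfs->cfg->block_size)/4) {
            mdir.erased = false;  // forces lfs_dir_commit to compact
            err = lfs_dir_commit(lfs, &mdir, NULL, 0);
            if (err) {
                return err;
            }
        }
    }

    return 0;
}
```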

If you do get something working, do make a PR; even if it's not user-ready it could be a good starting point and would be appreciated.


How does the metadata_max size waste space?
If I set metadata_max to 1 page, does that mean I reduce my total storage capacity by pagesize/blocksize?
Or is there just 1 instance of metadata for each file?

This is a good question. metadata_max reduces the storage capacity of each metadata block.

This means you end up needing effectively ~ $\frac{\texttt{block\_size}}{\texttt{metadata\_max}}$ times the number of metadata blocks, and the unused storage goes wasted. So if metadata_max is 1/4 the block size, you effectively need 4x the number of metadata blocks to store the same amount of metadata.

Multiple files can share metadata pairs, though the metadata related to a file will never span multiple metadata pairs. Each file can be uniquely identified by its metadata pair + a pair-local id.


Would uninitializing lfs after each operation make the speed faster?

I don't think so. The metadata logs reside on disk. If anything, uninitializing littlefs would lead to more work, since it needs to scan for free blocks on the first block allocation after lfs_mount.

@CalebDueck

Hi @geky, thanks for the details. I've talked with my team and we will put this on hold as we have found that it is running quickly enough for our needs as of right now.
If we determine we need higher or more deterministic timing we will dig deeper into metadata compaction.

Thanks for all your help!

@geky

geky commented Dec 21, 2023

lfs_fs_gc should have the ability to compact mdirs at a user-specified time soon: #913

@geky

geky commented Apr 29, 2024

Since this is being referenced elsewhere I should mention that #913 was merged in v2.9 and lfs_fs_gc can now be used to potentially move metadata compactions out of hot loops.
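
For anyone finding this later, a minimal sketch of how it might be called from a background or idle task, assuming lfs is already mounted:

```c
#include "lfs.h"

// Minimal sketch: run garbage collection (which, since v2.9, can include
// metadata compaction) outside of latency-critical writes.
void idle_task_gc(lfs_t *lfs) {
    int err = lfs_fs_gc(lfs);
    if (err < 0) {
        // handle or log the error
    }
}
```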
