read/prog sizes for SPI NAND flash with internal HW-buffer #277

Open
toesus opened this issue Aug 28, 2019 · 10 comments

@toesus

toesus commented Aug 28, 2019

Hi,
I am trying to use littlefs on a SPI nand flash device. This device has an internal buffer of 1 page (2048 bytes), while the erasable block size is 64 pages.
Reading/Writing from the flash memory to the hw-buffer is done in full 2048 byte pages, so my first guess for read_size and prog_size was 2048.

However, SPI-access to the hw-buffer can be done one byte at a time. This means that I could set the read_size to 1. In the same way I could set the program size to 1 as well. If lfs tried to write less than a full page, I could first reset the hw-buffer, write only some part, and then program the buffer content to flash.

So I guess technically I could choose any value from 1 to 2048 for the read_size and prog_size. Are there any guidelines on what would be a useful value?

@lsilvaalmeida

I can't remember where, but I saw geky saying that smaller read/prog size can improve performance.

@geky
Member

geky commented Sep 4, 2019

So this is where the difference between cache_size vs read_size becomes useful.

littlefs reads in multiples of the read_size (the same is true for writing). So it's more an alignment field than an upper-limit.

What this means is if you have cache_size=2048, read_size=4, and you write to a large file, littlefs will send the data to the block device in 2048 byte chunks aligned to 4 bytes. Small files and metadata will be written out in smaller chunks, but still aligned to 4 bytes.

So what you can do, is set cache_size=2048, read_size=1, and write a greedy read function. This is one of the best ways to take advantage of these small buffers without limiting the block device.

int my_read(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, void *buffer, lfs_size_t size) {
    // walk the output through a byte pointer; arithmetic on void* is not standard C
    uint8_t *data = buffer;
    while (size > 0) {
        if (off % MY_PAGE_SIZE == 0 && size >= MY_PAGE_SIZE) {
            // page-aligned with at least a full page left: read a page
            do_page_read(block, off, data);
            data += MY_PAGE_SIZE;
            off += MY_PAGE_SIZE;
            size -= MY_PAGE_SIZE;
        } else {
            // otherwise fall back to reading a single byte
            do_byte_read(block, off, data);
            data += 1;
            off += 1;
            size -= 1;
        }
    }
    return 0;
}

If lfs tried to write less than a full page, I could first reset the hw-buffer, write only some part, and then program the buffer content to flash.

Does the NAND device support rewriting some bits that way? Sometimes flash parts only support a single write after erase. May be worth checking. If not, it's fine for read_size != prog_size.


As for performance:

A smaller prog size reduces wear on the device, as fewer bytes are wasted on alignment.

In theory, a smaller prog size also means fewer garbage collection cycles, which improves performance. It also means fewer bytes sent over the wire, though this may be inconsequential.

However, we are finding scalability issues with the number of commits in a directory, and smaller prog size = more commits (#203). This is especially an issue for NAND devices, where blocks are very large.

So if you need performance, the best answer is to experiment. You may find that small prog size turns into too many commits for fetching to run efficiently. It mostly depends on your use case.
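
To make the alignment trade-off above concrete, here is a small sketch of the arithmetic. The helper names (`align_up`, `padding_bytes`, `commits_per_block`) are made up for illustration and are not part of the littlefs API, and the model is simplified: it treats a commit as a plain payload rounded up to prog_size and ignores littlefs's own tags and CRCs.

```c
#include <assert.h>
#include <stdint.h>

// Round size up to the next multiple of align.
uint32_t align_up(uint32_t size, uint32_t align) {
    assert(align != 0);
    return ((size + align - 1) / align) * align;
}

// Bytes programmed only to satisfy prog_size alignment.
uint32_t padding_bytes(uint32_t commit_size, uint32_t prog_size) {
    return align_up(commit_size, prog_size) - commit_size;
}

// How many aligned commits of this size fit in one erase block.
uint32_t commits_per_block(uint32_t block_size, uint32_t commit_size,
        uint32_t prog_size) {
    assert(commit_size != 0);
    return block_size / align_up(commit_size, prog_size);
}
```

With a hypothetical 40-byte commit on a 64 x 2048 = 131072 byte block, prog_size=2048 pads each commit with 2008 bytes and fits 64 commits per block, while prog_size=16 pads with 8 bytes and fits 2730. Less wasted space, but far more commits to scan per block, which is the scalability concern mentioned above.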

@toesus
Author

toesus commented Sep 9, 2019

I have implemented the read according to your proposed "greedy" function, and it works well. Lfs seems to either read only a few bytes, or a complete cache-size of 2kb. So far I have only tested it with small files, but eventually I want to use up to about 1MB per file (but only a small number of files).

Does the NAND device support rewriting some bits that way? Sometimes flash parts only support a single write after erase. May be worth checking. If not, it's fine for read_size != prog_size.

According to the datasheet it supports up to 4 partial writes per page. So I guess this could be forced by setting prog-size to a quarter page. For now I have just left prog-size == page-size.
However, I'm worried about a requirement on the sequence of page writes. The datasheet mentions:

The pages within the block have to be programmed sequentially from the lower order page address to the higher order page address within the block. Programming pages out of sequence is prohibited.

Is there any chance that this is enforced by lfs? Or maybe also just working by chance? So far I have not run into an issue...

@geky
Member

geky commented Sep 9, 2019

So I guess this could be forced by setting prog-size to a quarter page.

That's a clever solution. 👍

The pages within the block have to be programmed sequentially from the lower order page address to the higher order page address within the block.

This is guaranteed by littlefs. 👍

Because of how the erase -> prog two step write works, it's actually kinda challenging to make a filesystem that writes to prog pages in an erase block out of order. I don't know of any filesystems that don't write pages sequentially.
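
If you want to verify both datasheet rules on your own driver, a debug-only tracker in front of the prog callback is one option. The sketch below is an assumption-heavy illustration, not littlefs code: `order_check_t` and its functions are hypothetical names, it keeps one tracker per erase block, and it assumes each prog call stays within a single 2048-byte page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NAND_PAGE_SIZE    2048u  // page size from the original post
#define MAX_PARTIAL_PROGS 4u     // partial writes per page, per the datasheet

// Tracking state for one erase block (hypothetical helper type).
typedef struct {
    uint32_t last_page;       // highest page programmed since the last erase
    uint32_t partial_count;   // programs issued to last_page so far
    bool     any_programmed;  // false until the first program after an erase
} order_check_t;

// Call from the erase callback: the block is blank again.
void order_check_erase(order_check_t *s) {
    assert(s);
    s->last_page = 0;
    s->partial_count = 0;
    s->any_programmed = false;
}

// Call from the prog callback with the byte offset inside the block.
// Returns false if the program would go back to a lower page, or would
// exceed MAX_PARTIAL_PROGS programs on the current page.
bool order_check_prog(order_check_t *s, uint32_t off) {
    assert(s);
    uint32_t page = off / NAND_PAGE_SIZE;
    if (!s->any_programmed || page > s->last_page) {
        // first program after erase, or moving forward to a new page
        s->last_page = page;
        s->partial_count = 1;
        s->any_programmed = true;
        return true;
    }
    if (page < s->last_page) {
        return false;  // out-of-order page program
    }
    if (s->partial_count >= MAX_PARTIAL_PROGS) {
        return false;  // too many partial programs on this page
    }
    s->partial_count++;
    return true;
}
```

With prog_size set to a quarter page (512 bytes here), four progs land on each page before the tracker expects the next page; a fifth prog to the same page, or a prog to an earlier page, is flagged.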

@sslupsky

sslupsky commented Nov 5, 2019

Thank you to everyone that has contributed here. This information has helped me create an SPI NAND flash driver.

There is an edge case that I am uncertain about and would like to seek some input.

Here are the device specific parameters:

#define TC58CVG2S0_PAGE_DATA_BYTES 4096 ///<  See datasheet page 8
#define TC58CVG2S0_PAGE_EXTRA_BYTES 128
#define TC58CVG2S0_PAGE_ECC_BYTES 128
#define TC58CVG2S0_PARTIAL_PAGE_WRITES 4
#define TC58CVG2S0_PARTIAL_PAGE_BYTES (TC58CVG2S0_PAGE_DATA_BYTES / TC58CVG2S0_PARTIAL_PAGE_WRITES)
#define TC58CVG2S0_PAGES_PER_BLOCK 64 ///<  See datasheet page 8
#define TC58CVG2S0_BLOCK_COUNT 2048   ///<  See datasheet page 8
#define TC58CVG2S0_BLOCK_BYTES (TC58CVG2S0_PAGES_PER_BLOCK * TC58CVG2S0_PAGE_DATA_BYTES)
#define TC58CVG2S0_TOTAL_BYTES (TC58CVG2S0_BLOCK_BYTES * TC58CVG2S0_BLOCK_COUNT)
#define TC58CVG2S0_INVALID_PAGE 0xffffffff

And here are the LFS configuration parameters:

#define EXTERNAL_LFS_READ_SIZE   (TC58CVG2S0_PAGE_DATA_BYTES / TC58CVG2S0_PARTIAL_PAGE_WRITES)
#define EXTERNAL_LFS_PROG_SIZE   (TC58CVG2S0_PAGE_DATA_BYTES / TC58CVG2S0_PARTIAL_PAGE_WRITES)
#define EXTERNAL_LFS_BLOCK_COUNT TC58CVG2S0_BLOCK_COUNT
#define EXTERNAL_LFS_BLOCK_SIZE  TC58CVG2S0_BLOCK_BYTES
#define EXTERNAL_LFS_FS_SIZE     TC58CVG2S0_TOTAL_BYTES
#define EXTERNAL_LFS_PARTIAL_PAGE_WRITES  TC58CVG2S0_PARTIAL_PAGE_WRITES
#define EXTERNAL_LFS_CACHE_SIZE  EXTERNAL_LFS_PROG_SIZE

At the moment, my driver commits the data from the LFS "prog" callback to the on chip hw buffer. If the "prog" spans across a page boundary of the device, each page is committed to flash every time "prog" crosses the page boundary. The last partial page of data is not committed to flash until the LFS "sync" callback.

I think I understand that LFS will "sync" after a "prog". Is there ever a case where LFS could "read" from a different page between the "prog" and the "sync"? Do I need to check for this case during "read" and sync the hw buffer before loading a new page if there is a "pending sync"?
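
Whatever littlefs does in practice, one defensive approach is to make the driver safe on its own: remember whether the hardware buffer holds uncommitted data, and commit it before loading a different page. The sketch below only simulates this with in-memory arrays standing in for the NAND array and the chip's page buffer. All names (`sim_nand_t`, `sim_prog`, `sim_read`, `sim_sync`) are hypothetical, it does not use the TC58CVG2S0 command set, and it ignores ECC and the partial-page write limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SIM_PAGE_SIZE  4096u
#define SIM_PAGE_COUNT 8u
#define SIM_NO_PAGE    0xffffffffu

typedef struct {
    uint8_t  array[SIM_PAGE_COUNT][SIM_PAGE_SIZE];  // stands in for the NAND array
    uint8_t  buffer[SIM_PAGE_SIZE];                 // stands in for the on-chip buffer
    uint32_t buffered_page;                         // page currently held in buffer
    bool     dirty;                                 // buffer not yet programmed
} sim_nand_t;

void sim_init(sim_nand_t *n) {
    memset(n->array, 0xff, sizeof n->array);
    memset(n->buffer, 0xff, sizeof n->buffer);
    n->buffered_page = SIM_NO_PAGE;
    n->dirty = false;
}

// Commit the buffer to the array if it holds pending data.
void sim_sync(sim_nand_t *n) {
    if (n->dirty) {
        memcpy(n->array[n->buffered_page], n->buffer, SIM_PAGE_SIZE);
        n->dirty = false;
    }
}

// Make `page` the buffered page, committing pending data first.
static void sim_select(sim_nand_t *n, uint32_t page) {
    if (n->buffered_page != page) {
        sim_sync(n);
        memcpy(n->buffer, n->array[page], SIM_PAGE_SIZE);
        n->buffered_page = page;
    }
}

// Write into the buffer only; the array is updated on sync or page switch.
void sim_prog(sim_nand_t *n, uint32_t page, uint32_t col,
        const void *data, uint32_t size) {
    assert(page < SIM_PAGE_COUNT && col + size <= SIM_PAGE_SIZE);
    sim_select(n, page);
    memcpy(&n->buffer[col], data, size);
    n->dirty = true;
}

// Read through the buffer; switching pages flushes pending data first.
void sim_read(sim_nand_t *n, uint32_t page, uint32_t col,
        void *data, uint32_t size) {
    assert(page < SIM_PAGE_COUNT && col + size <= SIM_PAGE_SIZE);
    sim_select(n, page);
    memcpy(data, &n->buffer[col], size);
}
```

The design choice is that a read never silently discards pending data: a read of the same page is served from the buffer, and a read of another page forces the commit that "sync" would otherwise have done.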

@pauloet

pauloet commented May 21, 2021

I've got the boot count read working, but not reliably. It counts correctly up to a varying number (73, 79, 80, 84, or somewhere around these), and then LFS erases block 0 (where the file is being accessed). After the erase, the count is no longer read correctly. Is this normal? Here are the logs:

[FLASHAPP] boot_count: 71
...
[FLASHAPP] boot_count: 72
...
[LFS] lfs_file_open(0xa0002034, 0xa00020ac, "boot_count", 103)
[LFS] Read Block: 1 Offset: 0 Size: 16
[FLASH] Read page. Block: 1. Page: 0. Col: 0. Size: 16
[LFS] Read Block: 0 Offset: 0 Size: 16
[FLASH] Read page. Block: 0. Page: 0. Col: 0. Size: 16
[LFS] Read Block: 1 Offset: 0 Size: 4096
[FLASH] Read page. Block: 1. Page: 0. Col: 0. Size: 4096
[LFS] lfs_file_open -> 0
[LFS] lfs_file_read(0xa0002034, 0xa00020ac, 0xa001fea8, 4)
[LFS] lfs_file_read -> 4
[LFS] lfs_file_rewind(0xa0002034, 0xa00020ac)
[LFS] lfs_file_rewind -> 0
[LFS] lfs_file_write(0xa0002034, 0xa00020ac, 0xa001fea8, 4)
[LFS] lfs_file_write -> 4
[LFS] lfs_file_close(0xa0002034, 0xa00020ac)
[LFS] Prog Block: 1 Offset: 1248 Size: 16
[FLASH] Write page. Block: 1. Page: 0. Col: 1248. Size: 16
[LFS] Sync done
[LFS] Read Block: 1 Offset: 1248 Size: 16
[FLASH] Read page. Block: 1. Page: 0. Col: 1248. Size: 16
[LFS] lfs_file_close -> 0
[FLASHAPP] boot_count: 73

...

[LFS] lfs_file_open(0xa0002034, 0xa00020ac, "boot_count", 103)
[LFS] Read Block: 1 Offset: 0 Size: 16
[FLASH] Read page. Block: 1. Page: 0. Col: 0. Size: 16
[LFS] Read Block: 0 Offset: 0 Size: 16
[FLASH] Read page. Block: 0. Page: 0. Col: 0. Size: 16
[LFS] Read Block: 1 Offset: 0 Size: 4096
[FLASH] Read page. Block: 1. Page: 0. Col: 0. Size: 4096
[LFS] lfs_file_open -> 0
[LFS] lfs_file_read(0xa0002034, 0xa00020ac, 0xa001fea8, 4)
[LFS] lfs_file_read -> 4
[LFS] lfs_file_rewind(0xa0002034, 0xa00020ac)
[LFS] lfs_file_rewind -> 0
[LFS] lfs_file_write(0xa0002034, 0xa00020ac, 0xa001fea8, 4)
[LFS] lfs_file_write -> 4
[LFS] lfs_file_close(0xa0002034, 0xa00020ac)
[FLASH] Block 0 erased. Raddr: 0
[LFS] Read Block: 0 Offset: 80 Size: 16
[FLASH] Read page. Block: 0. Page: 0. Col: 80. Size: 16
[LFS] Prog Block: 0 Offset: 0 Size: 80
[FLASH] Write page. Block: 0. Page: 0. Col: 0. Size: 80
[LFS] Sync done
[LFS] Read Block: 0 Offset: 0 Size: 80
[FLASH] Read page. Block: 0. Page: 0. Col: 0. Size: 80
[LFS] lfs_file_close -> 0
[FLASHAPP] boot_count: 23
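
For readers following the log, each [FLASH] line is the preceding [LFS] block/offset pair translated into a NAND page and column. A sketch of that translation is below. The geometry (4096-byte pages, 64 pages per block) is an assumption borrowed from the device parameters posted earlier in the thread, since the poster's driver and part are not shown, and `nand_addr_t` and `nand_addr_from_lfs` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_PAGE_SIZE       4096u  // assumed page size
#define LOG_PAGES_PER_BLOCK 64u    // assumed pages per erase block

typedef struct {
    uint32_t row;   // absolute page index on the device (row address)
    uint32_t page;  // page index inside the block
    uint32_t col;   // byte offset inside the page (column address)
} nand_addr_t;

// Translate a littlefs (block, offset) pair into NAND page/column addresses.
nand_addr_t nand_addr_from_lfs(uint32_t block, uint32_t off) {
    assert(off < LOG_PAGE_SIZE * LOG_PAGES_PER_BLOCK);
    nand_addr_t a;
    a.page = off / LOG_PAGE_SIZE;
    a.col  = off % LOG_PAGE_SIZE;
    a.row  = block * LOG_PAGES_PER_BLOCK + a.page;
    return a;
}
```

Under these assumptions, "Prog Block: 1 Offset: 1248" maps to page 0, column 1248, matching the [FLASH] line that follows it in the log.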

@zbqxyz

zbqxyz commented Aug 11, 2021

Can littlefs manage bad blocks on NAND?

@kyrreaa

kyrreaa commented Feb 12, 2024

Just a small comment here. It may be worth seeing what effects I got from modifying write_size for an even larger NAND flash with 4096 byte pages and 64x4096 byte erase-blocks (see #661).
Since the datasheet said it was allowed to do 4 non-overlapping partial programmings, I did so. This both sped up the traversal and later re-packing of data, and reduced the initial overhead in metadata.

I also want to warn about the "internal programming buffer" and sync, as it appears sync is only a littlefs sync, and not a flash sync. The code does not show any calls to sync after calling lfs_bd_flush(), which relies on the .prog configured for the block device.

@geky
Member

geky commented Feb 27, 2024

I also want to warn about the "internal programming buffer" and sync, as it appears sync is only a littlefs sync, and not a flash sync. The code does not show any calls to sync after calling lfs_bd_flush(), which relies on the .prog configured for the block device.

@kyrreaa, I'm curious, is this resolved by #948?

@kyrreaa

kyrreaa commented Feb 27, 2024

I have not looked very hard at it, but to me the idea of calling sync here seems reasonable. Also, I did make a mistake: I have since found the call to lfs->cfg->sync in lfs_bd_sync(), which is the underlying block device sync. This is what is required to make sure any data cached in partial program buffers etc. is committed to memory.
