Expanding superblock deletes File #962

Open
nobody19 opened this issue Apr 8, 2024 · 7 comments
Labels
needs investigation no idea what is wrong

Comments

@nobody19

nobody19 commented Apr 8, 2024

Hi,
I have a problem with superblock expansion. When I increment the boot counter 500 times, lfs reports "Expanding superblock at rev 1001", and after that the boot counter file is no longer available. For now I'm incrementing the boot counter in a for loop to trigger the error quickly.

I'm using lfs v2.9.1 with STM32U5 octaspi flash MX25UM51245G.

image
Maybe this is related to #953.

Thanks a lot for any comment which can help me to solve the problem.
nobody

@geky geky added the needs investigation no idea what is wrong label Apr 12, 2024
@geky
Member

geky commented Apr 12, 2024

Hi @nobody19, thanks for creating an issue.

I think you're right that this is the same issue as #953 (ignoring the #959 red herring).

If you can reproduce this efficiently that's really quite promising.

The next goal is to try to reproduce this locally, to rule out hardware issues and make debugging tractable. So, sorry for the barrage of questions:

  1. It looks like this is, or at least started as, the example in the README.md.

    Are there many other operations going on? Is the littlefs-specific code sharable? Or could it be reduced to sharable code that still reproduces the issue?

  2. What is your block_size x block_count?

  3. What is your block_cycles?

    Actually, sharing your entire lfs_config struct may help reproduce this.

@nobody19
Author

Hi @geky,

Thanks a lot for your help. Yes, it is more or less the example from the README. Below is the lfs_config structure, and the attached file should contain all the functions I use to run littlefs. If you have any input, just let me know.

const struct lfs_config mylfs_cfg = {
    .context = nullptr,
    // block device operations
    .read  = user_provided_block_device_read,
    .prog  = user_provided_block_device_prog,
    .erase = user_provided_block_device_erase,
    .sync  = user_provided_block_device_sync,

    // block device configuration
    .read_size = 32,
    .prog_size = 256,
    .block_size = 4096,
    .block_count = 16384,
    .block_cycles = 500,
    .cache_size = 4096,
    .lookahead_size = LFS__LOOKAHEAD_CNT,
    .read_buffer = lfs_read_buffer,
    .prog_buffer = lfs_prog_buffer,
    .lookahead_buffer = (uint8_t*)lfs_lookahead_buffer,
    .name_max = LFS_NAME_MAX, 
    .file_max = LFS_FILE_MAX,
    .attr_max = LFS_ATTR_MAX
};

FileSys.txt
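
As a quick sanity check on this geometry, the sketch below tests the cache_size constraints described in the lfs.h comments (a multiple of read_size and prog_size, and a factor of block_size). The struct geometry type and the geometry_ok and check_posted_config helpers are hypothetical stand-ins for illustration, not littlefs API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical mirror of the size fields from the lfs_config above.
struct geometry {
    uint32_t read_size;
    uint32_t prog_size;
    uint32_t block_size;
    uint32_t cache_size;
};

// Returns true if cache_size is a multiple of read_size and prog_size
// and divides block_size evenly.
static bool geometry_ok(const struct geometry *g) {
    if (g->read_size == 0 || g->prog_size == 0 || g->cache_size == 0) {
        return false;
    }
    return g->cache_size % g->read_size == 0
        && g->cache_size % g->prog_size == 0
        && g->block_size % g->cache_size == 0;
}

// Check the values posted in this comment.
static void check_posted_config(void) {
    struct geometry posted = {32, 256, 4096, 4096};
    assert(geometry_ok(&posted));
}
```

The posted values pass, so the size fields are at least consistent with each other.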

@geky
Member

geky commented Apr 16, 2024

Hi @nobody19, thanks for the extra info.

Unfortunately I wasn't able to reproduce this locally. test_writeBootCountToFile was able to trigger superblock expansion, but the resulting filesystem continued operating without data loss.

Is it possible that the BSP_OSPI_NOR_Erase_Block address should not be multiplied by the block_size? I couldn't really find documentation for this function, though I only searched briefly. Superblock expansion may be the first time littlefs tries to use blocks other than 0 and 1, so I could see the address wrapping around and corrupting things.
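
To make the address question concrete, here is a minimal sketch of the block-to-byte-address conversion with bounds checks, using the block_size and block_count from the config posted above. The helper names block_to_addr and erase_addr_ok are hypothetical, not part of littlefs or the BSP. The idea is only that a check like this at the top of the erase callback would catch an unaligned or out-of-range address before it lands on blocks 0 and 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Geometry taken from the lfs_config posted earlier in this thread.
#define BLOCK_SIZE  4096u
#define BLOCK_COUNT 16384u

// 4096 * 16384 bytes = 64 MiB, i.e. a 512 Mbit device.
static_assert((uint64_t)BLOCK_SIZE * BLOCK_COUNT == 64ull * 1024 * 1024,
              "geometry should cover 64 MiB");

// Hypothetical helper: convert a block index plus offset into a byte
// address, rejecting anything outside the device.
static bool block_to_addr(uint32_t block, uint32_t off, uint32_t *addr) {
    if (block >= BLOCK_COUNT || off >= BLOCK_SIZE) {
        return false;
    }
    *addr = block * BLOCK_SIZE + off;
    return true;
}

// Hypothetical check for an erase callback: the byte address must be
// block-aligned and inside the device.
static bool erase_addr_ok(uint32_t addr) {
    return addr % BLOCK_SIZE == 0 && addr / BLOCK_SIZE < BLOCK_COUNT;
}
```

If the driver expected a block index rather than a byte address, passing block * BLOCK_SIZE would fail this kind of check for any block beyond the first few.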


Also just a note:

file.cfg = &fcfg;
int retOpen =  lfs_file_open(&lfs, &file, "boot_count2", LFS_O_RDWR);

The file.cfg = &fcfg here doesn't actually do anything. You need to call lfs_file_opencfg instead, since any initial state in lfs_file_t is ignored:

int retOpen =  lfs_file_opencfg(&lfs, &file, "boot_count2", LFS_O_RDWR, &fcfg);

I don't think this is the source of the problem though; littlefs will fall back to malloc.

@nobody19
Author

Hi @geky

I also expected some trouble in the erase function, but the multiplication by block_size looks correct. I checked the code and also the datasheet. The datasheet expects a 24/32-bit address for the sector erase.

image

Thanks for the note.

@nobody19
Author

I modified the code a bit to get some more information, and it is now almost identical to the example. I realised that I already run into a problem earlier. When I mount once (before the for loop), I already have trouble reading the boot counter after some counts. But when I mount and unmount inside the for loop, the problem only appears after the superblock expansion. Do you have any input or explanation for this behavior?

Current code:

// int err = lfs_mount(&lfs, &mylfs_cfg);

// reformat if we can't mount the filesystem
// this should only happen on the first boot
// if (err) {
    lfs_format(&lfs, &mylfs_cfg);
    lfs_mount(&lfs, &mylfs_cfg);
// }

// read current count
for (int i = 0; i < 500; i++) {
    // lfs_mount(&lfs, &mylfs_cfg);
    uint32_t boot_count = 0;
    lfs_file_open(&lfs, &file, "boot_count", LFS_O_RDWR | LFS_O_CREAT);
    lfs_file_read(&lfs, &file, &boot_count, sizeof(boot_count));

    // update boot count
    boot_count += 1;
    lfs_file_rewind(&lfs, &file);
    lfs_file_write(&lfs, &file, &boot_count, sizeof(boot_count));

    // remember the storage is not updated until the file is closed successfully
    lfs_file_close(&lfs, &file);

    // print the boot count
    SEGGER_RTT_printf(0, "boot_count: %d\n", boot_count);
    // lfs_unmount(&lfs);
}

// release any resources we were using
lfs_unmount(&lfs);

mounting/dismounting every time:
image
image

boot_count: 496
boot_count: 497
../Middlewares/Third_Party/littlefs-2.9.1/lfs.c:2182:debug: Expanding superblock at rev 1502
boot_count: 498
assertion "lfs_mlist_isopen(lfs->mlist, (struct lfs_mlist*)file)" failed: file "../Middlewares/Third_Party/littlefs-2.9.1/lfs.c", line 6127, function: lfs_file_read

mounting once:
image
image

@nobody19
Author

I found out why it first "crashes" around 500. It depends on block_cycles, which I have set to 500. But the "why" is not clear to me yet.

    // Number of erase cycles before littlefs evicts metadata logs and moves
    // the metadata to another block. Suggested values are in the
    // range 100-1000, with large values having better performance at the cost
    // of less consistent wear distribution.
    //
    // Set to -1 to disable block-level wear-leveling.
    int32_t block_cycles;

@geky
Member

geky commented Apr 18, 2024

> I also expected some trouble in the erase function, but the multiplication by block_size looks correct. I checked the code and also the datasheet. The datasheet expects a 24/32-bit address for the sector erase.

Ah yeah, looks like you're right. I had searched for BSP_OSPI_NOR_Erase_Block and found some code that omitted the multiplication, but it could have just been a coincidence.

> When I mount once (before the for loop), I already have trouble reading the boot counter after some counts. But when I mount and unmount inside the for loop, the problem only appears after the superblock expansion. Do you have any input or explanation for this behavior?

Hmm, if the code is more stable with remounting, that suggests something is going wrong on the device/RAM side, or that the driver is falling out of sync with what actually exists on disk, since the remount could be temporarily fixing whatever is going wrong.

> assertion "lfs_mlist_isopen(lfs->mlist, (struct lfs_mlist*)file)" failed: file "../Middlewares/Third_Party/littlefs-2.9.1/lfs.c", line 6127, function: lfs_file_read

This assert is failing because the file is not open. Most likely an error occurred during lfs_file_open. You can't continue to use the file after an error occurs.

Though if the error is LFS_ERR_NOENT because the file disappeared that is a problem.

> I found out why it first "crashes" around 500. It depends on block_cycles, which I have set to 500. But the "why" is not clear to me yet.

The superblock expansion, and mdir relocations, etc, are controlled by block_cycles. block_cycles determines the number of erases allowed to a block before the data must move to another block. In the case of the superblock, the superblock can't really move, so an additional superblock is created to delay wear.

You could make it very small temporarily to speed up debugging. Unfortunately I haven't been able to reproduce what you're seeing even with block_cycles=2...
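
For reference, a rough sketch of when a relocation (and, for the superblock pair, an expansion) gets triggered. This is an assumption based on a reading of lfs.c rather than a documented API: the check appears to be (rev + 1) % ((block_cycles + 1) | 1) == 0, where the | 1 forces an odd period. The names needs_relocation and check_against_logs are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Sketch of the relocation trigger (assumed from lfs.c, not a public API).
// A block_cycles value <= 0 is treated as "wear leveling disabled".
static bool needs_relocation(uint32_t rev, int32_t block_cycles) {
    if (block_cycles <= 0) {
        return false;
    }
    // (block_cycles + 1) | 1 keeps the period odd
    uint32_t period = ((uint32_t)block_cycles + 1u) | 1u;
    return (rev + 1u) % period == 0;
}

// Compare against the revisions reported in the logs in this thread.
static void check_against_logs(void) {
    assert(needs_relocation(1001, 500));  // "Expanding superblock at rev 1001"
    assert(needs_relocation(1502, 500));  // "Expanding superblock at rev 1502"
    assert(!needs_relocation(1000, 500));
}
```

With block_cycles = 500 the period works out to 501, which lines up with the revisions 1001 and 1502 seen in the logs. With block_cycles = 2 the period would be 3, so the same code path should be hit after only a few writes.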
