
NVS should allow overwriting existing index even if there's no room to keep the old value #51019

Closed
maximevince opened this issue Oct 6, 2022 · 9 comments
Labels: Enhancement (Changes/Updates/Additions to existing features)

@maximevince (Contributor)

Is your enhancement proposal related to a problem? Please describe.
When the NVS pages are full (i.e. an item of the size we're trying to write no longer fits on the existing pages), but we want to overwrite an existing data item (which might be of smaller or equal size), NVS starts moving (through the GC) all id/data pairs on all pages to a new page, loops back to the first page, and finally bails out without a successful update, because it detects the loop condition: https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/fs/nvs/nvs.c#L1073
See below for an example log.

Describe the solution you'd like
I'd like to be able to update the data item at an existing index with a smaller or equally-sized data item.
IMO, this should be possible by doing it this way:

  • find the id/data pair that must be updated
  • copy all other items on this page to the new / scratch page, but don't copy the item that must be updated
  • write the updated id/data pair to the new page as well
  • erase the original page, which now becomes your new empty / scratch page.

Describe alternatives you've considered
As an alternative approach, you could manually delete the id/data pair first, then write the new item, in which case the operation (obviously) succeeds. However, it leaves a tiny window where data could be lost in case of a power-off / reset / ...

Additional context
Here is the log of what happens when trying to replace a big chunk (e.g. 1800 bytes) when 5 such chunks are already present:

*** Booting Zephyr OS build v3.1.99-ncs1  ***
[00:00:00.000,000] <dbg> fs_nvs: nvs_recover_last_ate: Recovering last ate from sector 0
[00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at ff0
[00:00:00.000,000] <inf> fs_nvs: 4 Sectors of 4096 bytes
[00:00:00.000,000] <inf> fs_nvs: alloc wra: 0, fe8
[00:00:00.000,000] <inf> fs_nvs: data wra: 0, 0
[*] Writing idx 0
Generated random data for idx 0: Wrote data@idx 0: 1800
[*] Writing idx 1
Generated random data for idx 1: Wrote data@idx 1: 1800
[*] Writing idx 2
Generated random data for idx 2: [00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at ff0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at fe000, len 4096
Wrote data@idx 2: 1800
[*] Writing idx 3
Generated random data for idx 3: Wrote data@idx 3: 1800
[*] Writing idx 4
Generated random data for idx 4: [00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at ff0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at ff000, len 4096
Wrote data@idx 4: 1800
[*] Writing idx 5
Generated random data for idx 5: Wrote data@idx 5: 1800
[*] Overwriting idx 0
Generated random data for idx 0: [00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 1, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 0, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at fe0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at fc000, len 4096
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 3, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 2, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at fe0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at fd000, len 4096
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 5, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 4, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at fe0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at fe000, len 4096
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 0, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 1, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at fe0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at ff000, len 4096
Wrote data@idx 0: -28

So we see that all existing entries are shuffled around, which causes useless page writes and erases, only to end in a failure to write data to idx 0.

What I would like to see is that index 0 is not garbage collected / moved around, but instead the new data entry for index 0 is written to its new destination sector at once.
I've created a quick hack of nvs.c to demonstrate that this might work, but wanted to consult here about any possible side effects or design considerations I might not be aware of.
See the patch over here: https://gist.github.com/maximevince/2e3b83f56cc0374bb01cf6e67fe5139d

This results in the following log for the same test case as above:

*** Booting Zephyr OS build v3.1.99-ncs1  ***
[00:00:00.000,000] <dbg> fs_nvs: nvs_recover_last_ate: Recovering last ate from sector 0
[00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at ff0
[00:00:00.000,000] <inf> fs_nvs: 4 Sectors of 4096 bytes
[00:00:00.000,000] <inf> fs_nvs: alloc wra: 0, fe8
[00:00:00.000,000] <inf> fs_nvs: data wra: 0, 0
[*] Writing idx 0
Generated random data for idx 0: Wrote data@idx 0: 1800
[*] Writing idx 1
Generated random data for idx 1: Wrote data@idx 1: 1800
[*] Writing idx 2
Generated random data for idx 2: [00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at ff0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at fe000, len 4096
Wrote data@idx 2: 1800
[*] Writing idx 3
Generated random data for idx 3: Wrote data@idx 3: 1800
[*] Writing idx 4
Generated random data for idx 4: [00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at ff0
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at ff000, len 4096
Wrote data@idx 4: 1800
[*] Writing idx 5
Generated random data for idx 5: Wrote data@idx 5: 1800
[*] Overwriting idx 0
Generated random data for idx 0: [00:00:00.000,000] <dbg> fs_nvs: nvs_gc: Moving 1, len 1800
[00:00:00.000,000] <dbg> fs_nvs: nvs_gc: NOT Moving 0, len 1800, since this is the one we're replacing
[00:00:00.000,000] <dbg> fs_nvs: nvs_add_gc_done_ate: Adding gc done ate at fe8
[00:00:00.000,000] <dbg> fs_nvs: nvs_flash_erase_sector: Erasing flash at fc000, len 4096
Wrote data@idx 0: 1800
[*] Reading back idx 0
Read BHP idx 0: 1800

So now index 0 is not being moved before writing the new index 0 entry, and it doesn't trigger the whole moving-everything-around loop, either.

If someone with the right knowledge of NVS can confirm that this could indeed be a possible improvement, I can create a proper PR out of it.

Looking forward to your feedback.

@maximevince maximevince added the Enhancement Changes/Updates/Additions to existing features label Oct 6, 2022
@Laczen (Collaborator) commented Oct 6, 2022

@maximevince, I understand your request.

The reason NVS does not allow this is that you might lose data if there is a power-down during the proposed operation, as you correctly state. A second problem that might arise is that the new write takes up more space and so might not fit, leaving the NVS system without the data.

The workaround you propose (first deleting, then writing) is exactly the method that should be used, and the decision to do this is down to the user.

@maximevince (Contributor, Author)

@Laczen, thanks for your quick response.

I am wondering, however, if there isn't a way to avoid the potential loss of data.
If we only delete the old entry 0 after writing the new entry 0 to the new sector, there is no risk of losing any data, right?
Even if there were a power-down in between operations?

@Laczen (Collaborator) commented Oct 6, 2022

> @Laczen, thanks for your quick response.
>
> I am wondering, however, if there isn't a way to avoid the potential loss of data. If we only delete the old entry 0 after writing the new entry 0 to the new sector, there is no risk of losing any data, right? Even if there were a power-down in between operations?

Yes, but then there would be the second problem: what if larger data is written, is there enough space? And there is another problem: suppose the old data is not in the last sector; the new data is written and the sector has insufficient space to copy the data in the last sector. This would be a GC failure.

@Laczen (Collaborator) commented Oct 9, 2022

@maximevince, can this be closed? There already is a solution (first erase, then write).

@maximevince (Contributor, Author)

I would like to elaborate a little more, if that's okay.

@maximevince (Contributor, Author)

> > @Laczen, thanks for your quick response.
> > I am wondering, however, if there isn't a way to avoid the potential loss of data. If we only delete the old entry 0 after writing the new entry 0 to the new sector, there is no risk of losing any data, right? Even if there were a power-down in between operations?
>
> Yes, but then there would be the second problem: what if larger data is written, is there enough space? And there is another problem: suppose the old data is not in the last sector; the new data is written and the sector has insufficient space to copy the data in the last sector. This would be a GC failure.

For the problem of the bigger size: that's the same in both cases, no? If you want to write a bigger entry and there's no room for it, the action should fail. That seems like normal behavior to me.
In that case, you could still keep the old entry (history), since it hasn't been deleted yet, right?

@Laczen (Collaborator) commented Oct 9, 2022

For the problem of the bigger size: it needs to decide before it writes, not after.
For the second problem: there is not enough space for saving the other entries from the oldest sector before it is deleted. Space will become available, but it is not available when it is needed.

BTW: Enjoy your Sunday, from a sunny Koksijde.

@maximevince (Contributor, Author)

Hi,
Thanks for the fast replies. I'm okay with closing the ticket, although I feel there's still a little room for improvement here.
I can live with the delete + write workaround for my current use case.

P.S.: Haha, enjoy the seaside! Greetings from an equally sunny Hasselt ;)

@Laczen (Collaborator) commented Oct 9, 2022

@maximevince, if you find a solution that guarantees no data is lost, a PR is welcome. If you have other questions/proposals for NVS, do not hesitate to add them as an issue/PR.
