[efr32] Settings #4706
Diego, you are correct that the maximum number of writes to a flash word between erase cycles is 2 on the EFR32MG21. It is a statistically safe recommendation according to our testing. Writing a 3rd time may not cause a problem immediately, but it is risky and violates our recommendation, in which case we can no longer guarantee that the flash always performs to its datasheet specifications, such as write/erase cycle endurance and data retention. Two questions about PR #4552:
I should add that we ask those using EFR32 to adopt the Silicon Labs NVM3 solution instead of the OT default NV manager. Note that token values cannot be expected to be preserved across a firmware upgrade that switches from the OT default NV manager to NVM3. Best regards, |
@yuping-xiao, thank you for the input. @LuDuda, can you comment on any flash limitations that the nRF52xx may have? |
@jwhui @dismirlian thanks for finding this issue! Indeed for nRF52840/nRF52833 the OPS says:
The nWRITE value is defined as 2 (maximum). I'll ask the hardware engineers to elaborate on the implications of writing to the same address 3 times. Most likely it matches @yuping-xiao's description. In any case, I think we should follow the datasheets (we now know that several CPUs are affected) and fix this asap. Jonathan, can you provide answers to these questions:
|
Hi @yuping-xiao, thanks for your response!
Each record begins with a
The code adds a record by writing on erased flash space, aligned to 4 bytes. During During
Ok, that's clear, thanks. |
@LuDuda, given that only |
@dismirlian, I think this could work. Of course, we would also need to support reading the existing delete flag for backward compatibility. Would you like to take a stab at implementing this? |
@jwhui I can give it a shot. @yuping-xiao, the EFR32MG21 reference manual says:
Could you clarify the last sentence? Specifically, what does it mean that "... the other 32-bit word cannot be used"? Thanks. |
@jwhui on second thought, a word can be written more than 2 times with this method. Imagine we put the deleted and first flags in openthread/src/core/utils/flash.cpp Lines 287 to 291 in 2f618bd
What do you think? Any other ideas? |
@dismirlian, some thoughts:
In summary, the last record with the "first" flag is where index 0 is. We should have at most 2 writes to the header (once when writing the header out after writing the value and potentially a second time if the record is deleted using an explicit index value). Do you think this will work? |
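For illustration, here is a minimal sketch of the "at most 2 writes to the header" budget described above. It simulates a single flash word and a header with one active-low "deleted" bit. The names `FlashWord`, `kFlagNotDeleted`, `AddRecord`, `DeleteRecord` and the bit layout are all hypothetical and are not the actual OpenThread settings format. It also models one 32-bit word only, whereas some parts count writes per wider unit.

```cpp
#include <cstdint>

// Simulated flash word: programming can only clear bits (AND semantics),
// and the number of writes since the last erase is tracked.
class FlashWord
{
public:
    static constexpr unsigned kMaxWrites = 2; // limit discussed in this thread

    // Returns false (and changes nothing) once the write budget is used up.
    bool Program(uint32_t aValue)
    {
        if (mWrites >= kMaxWrites)
        {
            return false;
        }
        mValue &= aValue;
        mWrites++;
        return true;
    }

    void     Erase() { mValue = 0xFFFFFFFFu; mWrites = 0; }
    uint32_t Get() const { return mValue; }
    unsigned Writes() const { return mWrites; }

private:
    uint32_t mValue  = 0xFFFFFFFFu; // erased state
    unsigned mWrites = 0;
};

// Hypothetical layout: bit 31 stays 1 while the record is live and is
// cleared to 0 on delete; bits 16..30 hold the length, bits 0..15 the key.
constexpr uint32_t kFlagNotDeleted = 1u << 31;

// First (and normally only) header write, done after the value is written.
bool AddRecord(FlashWord &aHeader, uint16_t aKey, uint16_t aLength)
{
    uint32_t word = kFlagNotDeleted | (static_cast<uint32_t>(aLength & 0x7FFF) << 16) | aKey;
    return aHeader.Program(word);
}

// Second header write: clears only the "not deleted" bit.
bool DeleteRecord(FlashWord &aHeader)
{
    return aHeader.Program(~kFlagNotDeleted);
}

bool IsDeleted(const FlashWord &aHeader)
{
    return (aHeader.Get() & kFlagNotDeleted) == 0;
}
```

With this shape, an add followed by a delete uses exactly two header writes, and any further attempt to program the same word is rejected by the simulation.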
@dismirlian , any thoughts on the above? Have you been able to make any progress on this? |
Hi @jwhui, sorry for the delay, I've been busy... I'm planning to do it over the weekend. I'll keep you updated. |
@dismirlian about your question on April 10 about flash write on EFR32MG21 (sorry about the super late response, I missed it earlier but thought it'd be better late than never): The Flash memory is organized into 64-bit wide double-words. Each 64-bit double-word can be written only twice between erase cycles, so long as no bit is ever cleared (i.e. written as a 0) more than once. If a bit is written during the first write (1->0), then in the second write this bit should be masked. Any other “1” bits can be cleared to 0. For example, if first time we write Hope this is clearer now! |
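As a rough sketch of the masking rule described above (the values used in the note below are illustrative and are not the example from the original comment): since programming only clears bits, the word for the second write can be built by forcing every bit that the first write already cleared back to 1. The helper names `FlashProgram`, `SecondWriteWord` and `IsReachable` are hypothetical.

```cpp
#include <cstdint>

// Flash programming can only clear bits (1 -> 0), so the stored value after
// a write is the bitwise AND of the old contents and the written word.
constexpr uint64_t FlashProgram(uint64_t aStored, uint64_t aWritten)
{
    return aStored & aWritten;
}

// True if aTarget can be reached from aFirst without an erase, i.e. the
// target does not need any bit set back to 1.
constexpr bool IsReachable(uint64_t aFirst, uint64_t aTarget)
{
    return (aTarget & ~aFirst) == 0;
}

// Word to program on the second write: bits already cleared by the first
// write are masked (written as 1) so that they are never cleared twice;
// all other bits take the value wanted in aTarget.
constexpr uint64_t SecondWriteWord(uint64_t aFirst, uint64_t aTarget)
{
    return aTarget | ~aFirst;
}
```

For instance, if the first write was `0xFFFFFFFFFFFF00FF` and the desired final contents are `0xFFFFFFFFFFFF0000`, the second write would be `0xFFFFFFFFFFFFFF00`: the byte cleared by the first write is masked with 1s, and only the low byte is cleared.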
Hi @silabs-YupingX, thanks for your answer. Yes, it's clear! @jwhui I think this would require a larger change to the settings driver. Given that the problem is specific for the EFR32MG2x platform, and that Silabs has transitioned to the NVM3 implementation, I don't think this is worth tackling. What do you think? |
Are you suggesting to close #4926 or keep it since it potentially helps the Nordic case? What do you think of my proposal in #4706 (comment) ? This would allow effectively two writes (once when we initially create the record, and only one more when we delete the record). |
Hi @jwhui, I think #4926 potentially helps the Nordic case. If you (and Nordic) agree, we could keep it (or a variation of it). I don't know if it's worth tackling the 64-bit word case. About your proposals:
I think this has the minor disadvantage that data written as 0xff will not be counted/"detected"; I know this is probably nitpicking.
I actually liked this one, but when you posted your idea I had already begun working on the alternative I ended up proposing. If you prefer this one, I can update the PR. |
@dismirlian, thanks again for your contributions on this issue. In general, if we can solve the issue more generally, I prefer that. In this case, while EFR32 has its preferred path moving forward, there may be other platforms in the future that have similar limitations. It would be nice not to have to keep churning on the flash layout.
One possibility is to check and avoid writing "0xff"?
Apologies for the additional effort. As mentioned above, I would prefer a more general solution. Thanks again for your effort on this. |
@dismirlian , any updates on this? Is this still something you would like to push forward? |
Hi @jwhui, I'm sorry for the slow responsiveness; I've been very busy. I can give it a shot this weekend. |
Hi @jwhui, I've been thinking about this. I think your proposals are a step in the right direction, but I still have some concerns:
In sum, I propose the following path forward:
As an unrelated issue, I propose implementing an erase counter for every swap area. I think this would provide valuable insights. This would be easy if we increase |
I wonder if we should go one step further and add some kind of integrity check to the end of the record, like a CRC-32. The integrity check would ensure the header and data were written properly. Thoughts?
Agree.
Why would Operation 4 cause 3 writes to record 2? Operation 3 should set the "delete" flag in record 1. Operation 4 should set the "delete" flag in record 2. I must be missing something.
An erase counter sounds like a great idea. |
Well, this would make the implementation much more robust. We could use the CRC16 implementation already present (here).
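To make the integrity-check idea concrete, here is a minimal standalone sketch. It does not use the existing OpenThread CRC16 class (its API is not shown in this thread); instead it is a plain bitwise CRC-16 with polynomial 0x1021 and initial value 0 (the variant commonly called CRC-16/XMODEM). `SketchHeader`, `RecordCrc` and `RecordIsIntact` are hypothetical names, and the two-field header is an assumption, not the real record layout.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-16, polynomial 0x1021, MSB first. Pass 0 as the initial
// value, or a previous result to continue over more data.
uint16_t Crc16Update(uint16_t aCrc, const uint8_t *aData, size_t aLength)
{
    for (size_t i = 0; i < aLength; i++)
    {
        aCrc = static_cast<uint16_t>(aCrc ^ (aData[i] << 8));

        for (int bit = 0; bit < 8; bit++)
        {
            aCrc = (aCrc & 0x8000) ? static_cast<uint16_t>((aCrc << 1) ^ 0x1021)
                                   : static_cast<uint16_t>(aCrc << 1);
        }
    }

    return aCrc;
}

// Hypothetical header holding only fields that never change after the
// record is written.
struct SketchHeader
{
    uint16_t mKey;
    uint16_t mLength; // number of data bytes that follow
};

// CRC over the header bytes followed by the data bytes.
uint16_t RecordCrc(const SketchHeader &aHeader, const uint8_t *aData)
{
    uint16_t crc = Crc16Update(0, reinterpret_cast<const uint8_t *>(&aHeader), sizeof(aHeader));

    return Crc16Update(crc, aData, aHeader.mLength);
}

// A record is accepted only if the CRC stored at its end matches.
bool RecordIsIntact(const SketchHeader &aHeader, const uint8_t *aData, uint16_t aStoredCrc)
{
    return RecordCrc(aHeader, aData) == aStoredCrc;
}
```

One design point to keep in mind: any bits that are rewritten later (such as a delete flag) would have to be kept out of the checksummed bytes, otherwise deleting a record would invalidate its CRC.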
Maybe I'm making a mistake, but here is what I understand:
Commit 9e7e920 elides the marking of the record with
I have another concern:
What do you think? Thanks! |
Thanks for your continued effort on this.
Sounds good to me.
I was hoping to remove the "first" flag. We only write to a record a second time if it is deleted. In the case of deleting all records for a given key, we could write a new zero-length record at the end indicating that all previous records have been deleted.
I think the swap areas need to be sufficiently large to avoid wear. If the swap areas are too small, then each new write could cause an erase. My inclination is that the size increase should not be much of a concern. |
Ok, so for the BTW, I think that marking the record "deleted" instead of writing a zero-length record at the end would simplify the
Great! |
Yes, I think the zero-length record is an optimization. I'll leave it to your judgement on which approach to take. |
Recently, a rework of the settings storage was merged (see PR #4552), which retains format compatibility with the original implementation. This settings format (both the original and new implementations) has an issue with the efr32 platform:
The MG12 datasheet states:
The MG21 datasheet says something similar:
The current implementation writes the `RecordHeader` twice during `Add` and once more during `Delete`.
Additionally, PR #4521 provides an alternate implementation of the settings area, specifically for the efr32 platforms. This will not retain compatibility with the old format.
I wanted to "expose" a potential issue for those with devices in the field:
In our case, we are not impacted by this potential problem, because we don't have devices in the field. I am opening this issue to find out to what extent this is an actual problem. If it's not, then we can close it.
Thanks!