Skip to content

Use a tempfile / rename for atomic writes in order to prevent dataloss during a HardFault on a repeater#2386

Open
winnieXY wants to merge 2 commits intomeshcore-dev:devfrom
winnieXY:atomic-write-on-repeater
Open

Use a tempfile / rename for atomic writes in order to prevent dataloss during a HardFault on a repeater#2386
winnieXY wants to merge 2 commits intomeshcore-dev:devfrom
winnieXY:atomic-write-on-repeater

Conversation

@winnieXY
Copy link
Copy Markdown

This is the first patch to make repeaters more reliable in the field - even if errors occur.

On a repeater the atomic write (tmpfile + rename) should work even on nrf52 Repeaters with internal flash, the amount of data stored is much smaller than the data used on a companion. I estimated the total size needed to be around 40-50kB with MAX_CLIENTS and MAX_REGIONS = 32.

This patch should ensure that no setting is lost during a failed write operation into the flash on a repeater. The repeater will be still stuck - but after restart all settings are preserved.

This is most likely the first of several smaller patches to mitigate this bug #2283.

@winnieXY
Copy link
Copy Markdown
Author

@oltaco : Could you review this patch? For repeaters and roomservers the 1MB flash ram is sufficient for this operation. I've tested that on a P1 and it's working.

Comment thread src/helpers/ClientACL.cpp Outdated
#include "ClientACL.h"

static File openWrite(FILESYSTEM* _fs, const char* filename) {
static File openWriteTemp(FILESYSTEM* _fs, const char* filename, const char* tmp_filename) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cleaner way would be to keep method sig the same, but for NRF52 case, concat ".tmp" to the given filename, as use that to open in write mode

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be much cleaner to just let openWrite handle the atomicity for the NRF52 case. All the openWrite callsites could then just stay as they are.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I added a getTmpPath() function to handle the tmp path name generation and moved back to openWrite().

Comment thread src/helpers/ClientACL.cpp
}
file.close();

#if defined(NRF52_PLATFORM) || defined(STM32_PLATFORM)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oltaco Can you verify that _fs->rename() definitely will work in this case? ie. does the real_path file get overwritten/replaced?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed, the underlying lfs_rename() call will point the old directory entry for file at the blocks referenced by file.tmp and then remove the directory entry for file.tmp

@oltaco
Copy link
Copy Markdown
Member

oltaco commented Apr 26, 2026

I estimated the total size needed to be around 40-50kB with MAX_CLIENTS and MAX_REGIONS = 32.

I think the total maximum size of files on disk for repeaters/rooms is much less than that, I think less than 10kB in total? The UserData area that repeaters and room servers use is only 28kb total (actual usable space is less than due to overheads).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants