Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime assertion in lfs_format due to uninitialized variable use (VS 2022) #756

Open
knaxr opened this issue Dec 9, 2022 · 7 comments
Open
Labels

Comments

@knaxr
Copy link

knaxr commented Dec 9, 2022

MSVC debug build will emit a runtime assertion related to the use of uninitialized variable in
lfs_dir_traverse:
"Run-Time Check Failure #3 - The variable 'disk' is being used without being initialized."
This occurs when calling lfs_format on a completely erased (0xff bytes) ram block device.
callstack:

lfs_dir_traverse() Line 850 (littlefs-2.5.1\lfs.c:850)
lfs_dir_compact() Line 1799 (littlefs-2.5.1\lfs.c:1799)
lfs_dir_splittingcompact() Line 2000 (littlefs-2.5.1\lfs.c:2000)
lfs_dir_relocatingcommit() Line 2117 (littlefs-2.5.1\lfs.c:2117)
lfs_dir_orphaningcommit() Line 2198 (littlefs-2.5.1\lfs.c:2198)
lfs_dir_commit() Line 2370 (littlefs-2.5.1\lfs.c:2370)
lfs_rawformat() Line 4066 (littlefs-2.5.1\lfs.c:4066)
lfs_format() Line 5324 (littlefs-2.5.1\lfs.c:5324)

We removed the displayed assertion message box by initializing struct lfs_diskoff disk with 0 in lfs.c line 811:
struct lfs_diskoff disk = {0};

Can someone confirm, that zero-initializing disk is safe or if this may lead to unexpected side effects?
If required, I'm happy to provide more information.

@geky geky added the needs investigation no idea what is wrong label Dec 12, 2022
@geky
Copy link
Member

geky commented Dec 12, 2022

Hi @knaxr, thanks for creating an issue. Do you know does MSVC report which line is responsible for the uninitialized read?

I wonder if this is caused by the move of the variable onto the explicit stack here. This stack contains most of the variables in the function, some which are uninitialized but never read depending on the code path.

I'm also curious why the Valgrind testing doesn't report this.

@knaxr
Copy link
Author

knaxr commented Dec 13, 2022

Hi @geky,

To my understanding line 849 is the statement responsible for the assertion. Line 846 is the according read operation.

msvc_runtime_assertion

@downtimes
Copy link

Hello,

We just wanted to update from 2.4.2 to the newest 2.5.1 and ran into the same problem under VS 2019. Our test suite also initializes an empty file system in a ram buffer and runs into the same not initialized exception. We get it on the same line 864 when copying the value to the explicit stack.

@downtimes
Copy link

Hello again,

this issue is currently marked as "needs investigation". Therefore, I wanted to ask if we could help in any way to get this minor fix under way.
On my end, I tried to isolate the problem further. I can roughly describe what we are doing:

Scenario:

We have a big buffer array of bytes in ram and want to initialize the file system with lfs_format.
The big buffer is initialized with 0xFF and has size of 4096 * 4096 bytes (16Mb)
LittleFS version is the tag 2.7.0

Callstack:

lfs_dir_traverse(lfs * lfs, const lfs_mdir * dir, unsigned int off, unsigned int ptag, const lfs_mattr * attrs, int attrcount, unsigned int tmask, unsigned int ttag, unsigned short begin, unsigned short end, short diff, int(*)(void *, unsigned int, const void *) cb, void * data) Line 899
lfs_dir_compact(lfs * lfs, lfs_mdir * dir, const lfs_mattr * attrs, int attrcount, lfs_mdir * source, unsigned short begin, unsigned short end) Line 1952
lfs_dir_splittingcompact(lfs * lfs, lfs_mdir * dir, const lfs_mattr * attrs, int attrcount, lfs_mdir * source, unsigned short begin, unsigned short end) Line 2161	
lfs_dir_relocatingcommit(lfs * lfs, lfs_mdir * dir, const unsigned int * pair, const lfs_mattr * attrs, int attrcount, lfs_mdir * pdir) Line 2279
lfs_dir_orphaningcommit(lfs * lfs, lfs_mdir * dir, const lfs_mattr * attrs, int attrcount) Line 2360	
lfs_dir_commit(lfs * lfs, lfs_mdir * dir, const lfs_mattr * attrs, int attrcount) Line 2532	
lfs_rawformat(lfs * lfs, const lfs_config * cfg) Line 4275
lfs_format(lfs * lfs, const lfs_config * cfg) Line 5713

What happens:

First time in the while (true) loop:

  • Case (attrcount > 0) (attrcount = 3)
  • results in not setting of the local disk variable
  • Hit the continue case in lfs.c:910 (if ((mask & tmask & tag) != (mask & tmask & ttag)))

Second time in the while (true) loop:

  • Case (attrcount > 0) (attrcount = 2)
  • results in not setting of local disk variable.
  • the condition in lfs.c:914 is true (lfs_tag_id(tmask) != 0)
  • access to disk variable without it being initialized while copying to the manual stack.

Initializing the disk variable like @knaxr described earlier fixes this access violation. The test suite runs without any problem after the fix.

I initially wanted to try and isolate the problem in more detail to also ensure that our ram device simulation is not the source of the mistake. I was sadly unable to patch our ram device into the test suite or even comprehend how to write tests for littlefs. The code generation for the test system went over my head.

Please feel free to let us know if we can help in any way.

@geky geky added fixed? and removed needs investigation no idea what is wrong labels Aug 16, 2023
@geky
Copy link
Member

geky commented Aug 16, 2023

Hi @downtimes, @knaxr, sorry for not updating this issue.

Thanks to @mdahamshi, this is being patched in the next patch release: #855

Just to clarify, this is an innocuous assertion. We only use the disk variable when the top bit of tag is set, and we only set the top bit of the tag when initializing disk.

The reason this triggers MSVC is because we move the disk variable in and out of the explicit stack, which MSVC thinks is using the contents of the variable.


I was sadly unable to patch our ram device into the test suite or even comprehend how to write tests for littlefs. The code generation for the test system went over my head.

Yeah, unfortunately I imagine the test framework is a bit impenetrable at the moment. The simulated block device is tightly integrated into the test framework to allow testing physical conditions such as bad blocks and different powerloss failure modes.

@downtimes
Copy link

Thank you very much. Sorry for the unnecessary ping, I should have looked through the Pull-Requests as well...

@geky
Copy link
Member

geky commented Aug 16, 2023

No worries, it was good because I missed that this issue would also be resolved by the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants