spiffsgen.py under windows does not generate valid binary images (IDFGH-4615) #6429

Closed
ctag-fh-kiel opened this issue Jan 19, 2021 · 16 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@ctag-fh-kiel

Environment

  • Module or chip used: ESP32-WROVER-IB, 16MB flash, 4MB SPIRAM
  • IDF version (run git describe --tags to find it):
    release/v4.1
  • Build System: CMake + idf.py
  • Compiler version gcc for release/v4.1
  • Operating System: Windows
  • (Windows only) environment type: MSYS2 mingw32 & mingw64
  • Using an IDE?: No
  • Power Supply: external 3.3V

Problem Description

spiffsgen.py on Windows builds a corrupt image from a directory. When this image is flashed, the firmware does not run because some files in the SPIFFS partition are missing or corrupt.

diff storage_ok.bin storage_bad.bin reports that the files differ, which indicates an erroneous image. Both images are attached: storage_ok.bin was built correctly on macOS / Linux, storage_bad.bin was built on Windows.

Expected Behavior

see attached storage_ok.bin in storage.zip
see attached spiffs_image.zip, which contains the desired file structure (compressed for this issue report)

Actual Behavior

see attached storage_bad.bin in storage.zip
bad image

Steps to reproduce

The image is built by lines 20-38 of the following CMake file:
https://github.com/ctag-fh-kiel/ctag-tbd/blob/master/CMakeLists.txt

"idf.py build" executes this command (which seems to be correct):
C:\msys64\home\rma\ctag-tbd\build && C:\msys64\mingw64\bin\python.exe C:/msys64/home/rma/esp-idf/components/spiffs/spiffsgen.py 0x300000 C:/msys64/home/rma/ctag-tbd/build/spiffs_image C:/msys64/home/rma/ctag-tbd/build/storage.bin --page-size=256 --obj-name-len=32 --meta-len=4 --use-magic --use-magic-len"

which generates a bad spiffs image (storage_bad.bin attached)

Manual execution of $IDF_PATH/components/spiffs/spiffsgen.py 0x300000 build/spiffs_image build/storage.bin
sometimes led to a good image and sometimes did not. The behavior was not reproducible, which is very strange.

Further information

Original file structure to be put in spiffs image can be found here
https://github.com/ctag-fh-kiel/ctag-tbd/tree/master/spiffs_image

I tried to set up a working ESP-IDF Windows environment manually, because the ESP-IDF installer was not able to create one for me on Windows (the CMake version in release/v4.1 is too old). I have written down my steps at the link below, but ran into the issue above. The issue also occurs for a colleague of mine who uses the standard ESP-IDF Windows installer.
https://github.com/ctag-fh-kiel/ctag-tbd/blob/dev/doc/how-to-dev-windows.md

@github-actions github-actions bot changed the title spiffsgen.py under windows does not generate valid binary images spiffsgen.py under windows does not generate valid binary images (IDFGH-4615) Jan 19, 2021
@Alvin1Zhang
Collaborator

Thanks for the very detailed report and sorry for the inconvenience, we will look into it.

@georgik
Collaborator

georgik commented Feb 8, 2021

@ctag-fh-kiel Thank you for reporting the issue.
After comparing the two files, I found one major difference.
The file marked as storage_bad contains the files in alphabetical order. The first directory is data, the second is www.
The file marked as storage_ok contains the files in reverse order. The first directory is www, the second is data.

Does your application depend on the order of files in SPIFFS?

Could you please tell us which version of Python you are using:

C:\msys64\mingw64\bin\python.exe --version

@georgik
Collaborator

georgik commented Feb 8, 2021

@ctag-fh-kiel Meanwhile, I was able to reproduce the behavior.
Let's say that we have files a, b, c, d, e, f.
On Windows, the files are processed in the order a, b, c, d, e, f.
On macOS, the files are processed in the order a, f, c, d, e, b.

Please check whether this different order is causing the problem in your application.

It's also important to note that SPIFFS has a flat structure; the nested structure of the source filesystem is just mapped to file names.
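As a rough illustration of both points, here is a minimal Python sketch. It is not taken from spiffsgen.py; the helper name flat_spiffs_names is hypothetical. It walks a directory, maps each nested path to a flat SPIFFS-style name, and sorts the result so that the order no longer depends on the host filesystem (Python documents the order returned by os.listdir as arbitrary).

```python
import os


def flat_spiffs_names(base_dir):
    """Return flat, SPIFFS-style names for all files under base_dir, sorted.

    Hypothetical helper for illustration only: nested directories are not
    stored as such, they simply become part of the file name.
    """
    names = []
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), base_dir)
            # Use forward slashes regardless of the host OS.
            names.append("/" + rel.replace(os.sep, "/"))
    # Sorting removes the dependency on the host's directory listing order.
    return sorted(names)
```

Calling flat_spiffs_names("build/spiffs_image") on two different hosts should then yield the same list, which makes it easier to tell an ordering difference from a content difference.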

@ctag-fh-kiel
Author

If this is the only difference between the files, then that is probably what is causing my issues.

Is it possible to create the SPIFFS image consistently on Mac, Linux and Windows?
It seems that Mac and Linux are consistent with each other, whereas Windows is not. Also, the Windows image doesn't work with my firmware, whereas the Mac and Linux images do.

Thanks for looking into the issue!

@georgik
Collaborator

georgik commented Feb 8, 2021

@ctag-fh-kiel I investigated the content of the filesystem further.
It's possible to extract the content of a SPIFFS image using the tool https://github.com/igrr/mkspiffs written by @igrr. Check the releases page for a version for your platform:

mkspiffs -u bad storage_bad.bin
mkspiffs -u ok storage_ok.bin

Then I used KDiff3 to compare the content of both directories: http://kdiff3.sourceforge.net/
The content of both directories is the same.

The only thing that differs is the output of the two commands, which contains the list of files. After sorting the lines of both outputs, the results are identical.

It seems that the only difference between the images is the order of the files in the image.
Could you please provide more information on which files are causing the problem?
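For anyone who wants to repeat the comparison without a GUI tool, here is a minimal Python sketch. The helper names tree_digest and diff_trees are hypothetical; the directories bad and ok are assumed to be the ones produced by the mkspiffs -u commands above. It compares two extracted trees by content hash, independent of file order.

```python
import hashlib
import os


def tree_digest(base_dir):
    """Map each relative file path under base_dir to the SHA-256 of its content."""
    digests = {}
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, base_dir).replace(os.sep, "/")
            with open(path, "rb") as f:
                digests[rel] = hashlib.sha256(f.read()).hexdigest()
    return digests


def diff_trees(dir_a, dir_b):
    """Return (only_in_a, only_in_b, changed) as sorted lists of relative paths."""
    a, b = tree_digest(dir_a), tree_digest(dir_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, changed
```

If diff_trees("bad", "ok") returns three empty lists, the two images hold the same files with the same content and differ only in layout.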

@Visuelle-Musik

Visuelle-Musik commented Feb 8, 2021

Hi Juraj Michálek,
first of all, thanks for investigating the issue. I am a contributor to the TBD project of @ctag-fh-kiel and ran into these issues too,
so I thought I could add some observations. I am not sure whether they are helpful, but given the nature of the issue I guess anything may be worth mentioning.
I also used mkspiffs as an alternative tool and also noticed that even when the files cause a crash on the device, the content extracted by mkspiffs (when restored to a filesystem on the "host") is correct.
I am using Windows 8.1 and the latest Ubuntu (20.04.2 LTS) as my operating systems.

The problems with the bins occur using ESP IDF 4.1 with the following setups:

  • Win 8.1 with idf.py
  • Win 8.1 with mkspiffs
  • Ubuntu 20.04.2 LTS with idf.py
  • Win 8.1 with Docker and ESP IDF image for docker (please note "Docker Toolbox" which is not supported anymore got used here, because the current Docker for Windows requires Win 10 or later)

I also tried the most recent version for ESP IDF (4.2) on Windows with idf.py and mkspiffs.
This did not work either, but interestingly the errors were slightly different: if I remember correctly, "parsing error" instead of "file not found" or similar.

Untested so far:

  • Ubuntu 20.04.2 with mkspiffs

Working solutions for me with ESP IDF 4.1:

  • Ubuntu 20.04.2 LTS with docker (latest version)
  • Ubuntu on GitHub with automated build via Docker (not entirely sure which versions GitHub is using, sorry; I guess it's also ESP IDF 4.1 and a slightly older Ubuntu)

So to sum things up: with our file content and ESP IDF 4.1, SPIFFS runs without errors on the ESP32 only when the image is built on macOS or with the ESP IDF Docker image on Ubuntu, which seems very strange for such a widespread architecture. The application itself does not depend on the ordering of the files in the bins; it simply reads (and afterwards writes) the content of the files. The error occurs at startup of the application, when it reads the JSN files needed for a webserver.

BTW: I also looked for weird (non-ASCII / non-UTF-8) characters in our files but did not find anything interesting so far. Also, as you found out, there is no major difference between the working and non-working bins.

What you discovered concerning the different sort order in the bins I find really interesting.
Because this is the only difference, maybe it triggers the error at runtime on the ESP32 once the SPIFFS files get used? Maybe there is a rare condition that is caused by the content. But all this is speculation, sorry to have no hard facts yet!

All the best, Mathias (Brüssel)

@georgik
Collaborator

georgik commented Feb 9, 2021

@Visuelle-Musik Thank you very much for providing these details.
It seems that the problem might be in some IDF library and occurs only if the image has the particular file order that was produced on Windows. I investigated this, but so far without any result.

My idea is to write a small app which reads all the files, just to make sure that the read operation works well even on the bigger files in the image.

@Visuelle-Musik

@georgik Thanks again for looking into this!
Yes, in the beginning I also thought it would be a Windows-only issue. But as mentioned, I had the identical behaviour with Ubuntu too, unless I was using Docker on Ubuntu. Then again, the Docker Toolbox on Windows 8.1 uses Ubuntu "under the hood", and it did not work there either. So there is no apparent logic regarding the underlying OS as far as Windows and Linux are concerned. BTW: There is no indication so far that the error is more likely to happen with larger files. In fact, one file it is likely to crash with is "spm-config.jsn", which is about 8 kB.
E (1853) JSON: could not open file /spiffs/data/spm-config.jsn
One more apparent thing is that it never happened with MacOS.

@georgik
Collaborator

georgik commented Feb 11, 2021

@ctag-fh-kiel @Visuelle-Musik I've noticed that the partition table requires a chip with 16 MB flash.

Right now I have only chips with 4 MB of flash. I've tried to build the project on Windows from ESP-IDF examples/storage/spiffsgen with data from reported spiffs_image. The example app booted without a problem and it was possible to read original /spiffs/data/spm-config.jsn. It was just necessary to bump up the size of the read buffer.

Here are two ideas to investigate the problem:

Please check whether the read buffer is large enough when working with the .jsn file.

Please try to flash the device, read the data back with esptool, and compare it with the generated image:

esptool -p PORT read_flash START_ADDR SIZE spiffsgen-out.bin

This test can give a hint as to whether the bug is caused by the integrity of the flash or by a problem in esptool when flashing data.
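To compare the read-back file with the generated image, something like the following Python sketch may help. The function name first_difference is hypothetical; spiffsgen-out.bin and build/storage.bin are the file names used in this thread. It reports the offset of the first differing byte, or None if the files are identical.

```python
def first_difference(path_a, path_b, chunk_size=4096):
    """Return the offset of the first differing byte, or None if both files match.

    If one file is a strict prefix of the other, the length of the shorter
    file is returned.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Common part is equal, so one file ended earlier.
                return offset + min(len(a), len(b))
            if not a:
                # Both files reached end-of-file without a mismatch.
                return None
            offset += len(a)
```

Note that a mismatch is only meaningful if the firmware has not yet mounted and written to the partition, since the application in question also writes files back to SPIFFS.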

@georgik
Collaborator

georgik commented Feb 26, 2021

@ctag-fh-kiel @Visuelle-Musik I investigated the problem further.
The app starts crashing when it is not possible to read spm-config.jsn.
The problem might be that build\storage.bin does not contain the file, or that the file is not valid JSON.
The readme.md mentions a file spm-config.json, which looks like a typo.
In case of a "corrupted" image, please:

  • check the syntax of the file with a JSON validator
  • open build\storage.bin in an editor like vim and search for spm-config.jsn; the record should be present
  • if the problem occurs, run idf.py fullclean before building again
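The first two checks can also be scripted. The following is a minimal Python sketch with hypothetical helper names; it assumes the file name is stored as plain ASCII bytes in the image, which is what the editor search above relies on.

```python
import json


def is_valid_json(path):
    """Return True if the file at path can be read and parses as JSON."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
        return True
    except (ValueError, OSError):
        # ValueError covers json.JSONDecodeError and UnicodeDecodeError.
        return False


def image_contains_name(image_path, name):
    """Return True if the raw image bytes contain the given file name."""
    with open(image_path, "rb") as f:
        return name.encode("ascii") in f.read()
```

For example, is_valid_json("spiffs_image/data/spm-config.jsn") checks the source file, and image_contains_name("build/storage.bin", "spm-config.jsn") checks that a record with that name made it into the image. Finding the name does not prove that the file content is complete.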

BTW: I've noticed that components/network/CMakeLists.txt contains the following code:

idf_component_register(SRCS network.cpp
        INCLUDE_DIRS . $ENV{IDF_PATH}/examples/common_components/protocol_examples_common/include
         PRIV_REQUIRES nvs_flash mdns esp_netif)

This code will cause a failure when building on Windows. The part with ENV is redundant and can be removed:

idf_component_register(SRCS network.cpp
        INCLUDE_DIRS .
         PRIV_REQUIRES nvs_flash mdns esp_netif)

@ctag-fh-kiel
Author

Hi georgik,
Thanks for looking into the issue. The file is present. We are using GitHub Actions for building the firmware, based on the official ESP IDF Docker image with an Ubuntu host. Everything seems fine there.

@georgik
Collaborator

georgik commented Mar 5, 2021

@ctag-fh-kiel I'm glad that the image is working for you.
I discovered an issue that occurs when the build machine has just a single CPU core. It causes the first build to fail:

RuntimeError: given base directory C:/.../build/spiffs_image does not exist
ninja failed with exit code 1

The next build is ok.

It might not be related, but it is a very strange issue with the parallel execution of build steps. How many cores do your Ubuntu and Windows machines have? macOS on common Apple hardware always has more than one CPU core.

@Visuelle-Musik

@georgik @ctag-fh-kiel Hi Juraj, that's an interesting observation, but like you I don't think it's related to our problem.
My Windows machine has 4 cores, my Ubuntu machine has 2 cores. I can't tell how many cores the GitHub servers have, where the Docker version of ESP-IDF works fine. When I use Docker on my Ubuntu box, which otherwise shows the issue, it works perfectly, whereas Docker (older version) on Windows 8.1 also produces the issue. All this is rather strange and I don't know what the cause may be. In essence, we never had the issue on macOS or with Docker on Ubuntu; that's all I can say, sorry.

@georgik
Collaborator

georgik commented Mar 8, 2021

@ctag-fh-kiel @Visuelle-Musik I identified a potential source of the problem. The call to spiffs_create_partition_image does not declare a dependency on the targets which perform the file copy. As a result, spiffs_create_partition_image can fail, or the SPIFFS image can be built while files are still being copied to the build directory.

The pull request is here: ctag-fh-kiel/ctag-tbd#11

You can find a description of DEPENDS here: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/storage/spiffs.html

Please review and test the suggested change.
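Independently of the fix, a small diagnostic can show whether the staged directory was complete when the image was generated. The following Python sketch is only an illustration: the function name missing_from_staging is hypothetical, and it assumes that files are copied unmodified from the source directory (for example spiffs_image) to the staging directory (for example build/spiffs_image).

```python
import os


def missing_from_staging(source_dir, staged_dir):
    """Return relative paths that are absent or differ in size in staged_dir.

    Assumes the build copies files unmodified, so a size mismatch suggests
    an incomplete or stale copy.
    """
    missing = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(staged_dir, rel)
            if not os.path.isfile(dst) or os.path.getsize(dst) != os.path.getsize(src):
                missing.append(rel.replace(os.sep, "/"))
    return sorted(missing)
```

A non-empty result right before spiffsgen.py runs would be consistent with the missing dependency described above, although it cannot replace declaring the dependency in CMake.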

@Visuelle-Musik

@georgik @ctag-fh-kiel Hi Juraj, I just tested your fix and can confirm that it works for me (Windows 8.1, idf.py) :-)
Thanks so much for putting all that work into this, that's really very kind of you!
I copied your fix locally to my working directory for now, but of course I'll test again when it's part of the official repo!
Kind of amazing that the issue did not happen on macOS, isn't it? Maybe some kind of race condition, in relation to the ordering of the files in the SPIFFS bin file? But of course this is speculation. Anyway, thanks very much again! All the best, Mathias

@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Done Issue is done internally labels Mar 10, 2021
@Alvin1Zhang
Collaborator

@Visuelle-Musik Thanks for reporting and testing the fix, and glad it works. Will close now, feel free to reopen.
