SPIFFS randomly corrupted #87

Open
sieren opened this issue Apr 11, 2020 · 34 comments · Fixed by #153
Labels
bug, needs more info

Comments

@sieren
Owner

sieren commented Apr 11, 2020

Occasionally SPIFFS gets corrupted in failsafe (or otherwise), resulting in files unable to be edited or created (deleting however always works).
Often files are only "half" written during save and afterwards cannot be overwritten properly.

Currently no steps to reproduce, seems random.
Only "fix" is to reflash the bin partition.

@sieren sieren added the "bug" and "needs more info" labels Apr 11, 2020
@sieren sieren mentioned this issue Aug 22, 2020
@jonathanmbradshaw

I have seen this (or something related). After a number of updates to the config file using the HTTP UI, I get to the point where it cannot be accessed for read or write. The only option is to reflash the M5 using the USB port.

@sieren sieren changed the title from "Unable to read SPIFFS in Recovery Mode" to "SPIFFS randomly corrupted" Nov 15, 2020
@AstroTyrannus

Hi Sieren,

Have you considered that SPIFFS itself could be the cause?
Try replacing SPIFFS with LittleFS.

@htvekov

htvekov commented Jan 10, 2021

Hi' @sieren

Issue with occasional corrupt SPIFFS still persists I'm afraid.
I'm in a test phase atm. 😁 and am editing and writing numerous revisions to the config file these days.
I've now experienced for the third time (both on HP v0.05 M5Stack and ESP32_generic files) that SPIFFS suddenly corrupts after many writes (min. 20-30 or so).

Issue can't be resolved with an OTA update but can only be recovered with a complete serial reflash.

Ciao !

@sieren
Owner Author

sieren commented Jan 10, 2021

Thanks for checking, but yeah, sadly no news on that front. I'm afraid it's in the SPIFFS implementation itself and unrelated to Homepoint.

@htvekov

htvekov commented Jan 21, 2021

Hi' @sieren

There's definitely an issue with SPIFFS in HomePoint.
Either with SPIFFS itself or the configuration.

Tried to upload my icon pack via the web interface and it dies uploading file no. 12.
Icon files are at most some 2 kB each incl. overhead, so the partition is filled after only some 22 kB of uploads.
I've done this three times with exactly the same steps, and it stops every time at exactly the same file.

So I tend more to believe that it's some sort of partition size issue, rather than an issue with repeated writes to eg. config.json

Have the released binary files accidentally been compiled with some strange, ultra-minimal SPIFFS partition scheme?
I would wish I could be more specific, but I'm afraid my knowledge on this specific subject is very close to zero 😒

@cerietke

cerietke commented Feb 1, 2021

Just wanted to confirm I see similar behaviour on an M5Stack Core. Have had file uploads be refused, have had config not being written and have had config getting corrupted at different times. I sometimes see a 500 error with no descriptive text. My suspicion was also partition size, but not that alone as after reflash using USB and uploading the exact same files and writing the same config it will succeed. I was thinking it might be storing some version history, hence it running out after doing a bunch of things, not immediately.

@sieren
Owner Author

sieren commented Feb 10, 2021

LittleFS will make it into ESP-IDF v4.2, but sadly the underlying Arduino Library doesn't support it yet.
I will investigate replacing SPIFFS once that has happened.

@htvekov

htvekov commented Feb 16, 2021

Hi' Matt.

Yes, that would really make a huge improvement to replace SPIFFS.
It's hard 'tinkering' with HomePoint with current SPIFFS issue as frequent flashing is needed.
But the good thing is that HomePoint already is absolutely 'rock solid' if left alone

Ciao !

@ghosty-be
Contributor

I just encountered this too after messing with some settings that I had a corrupted config.json it seems...
I have (for another project) actually https://randomnerdtutorials.com/install-esp32-filesystem-uploader-arduino-ide/ installed in my arduino IDE... was wondering if it would be possible to just rewrite the spiffs like that...
I tried in arduino IDE to just create an empty project, copied the data folder from your github and (luckily) a local copy of my config.json pasted in there... but it seems that it does not want to upload the spiffs like that by choosing esp32 sketch data upload ...
Here are my settings... (just trying something random as I have no prior experience with spiffs... )
arduino-ide
In the end I reflashed the M5Stack Core ESP32 with the bin image and started over with the access point mode config to copy-paste my previous config.json

@sieren
Owner Author

sieren commented Feb 23, 2021

It should be possible...at least when using the raw project you can update the spiffs bin with make spiffs_spiffs_bin and flash it manually with something along the lines of esptool.py --port /dev/<serial_port_goes_here> --chip esp32 -b 921600 write_flash --flash_mode dio --flash_freq 80m --flash_size 4MB 0x2b0000 spiffs.bin (I'd assume)

Your Arduino project needs to know where to upload the data. That's defined in the partition scheme (partition.csv) in the root of this project - the address is 0x2b0000
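
For anyone following along, here is a rough Python sketch of looking up the SPIFFS offset in a partition table and assembling the esptool.py call quoted above. Only the 0x2b0000 offset, the roughly 500 kB size and the esptool.py flags come from this thread; the sample table rows and the helper names (`parse_size`, `find_partition`, `build_flash_command`) are made up for illustration, so check them against the real partition CSV before flashing anything.

```python
import csv
import io

# Hypothetical partition table. Only the spiffs offset (0x2b0000) and its
# approximate size (500K) are taken from this thread; the other rows are
# illustrative assumptions, not the project's real layout.
SAMPLE_PARTITIONS_CSV = """\
# Name,   Type, SubType, Offset,   Size
nvs,      data, nvs,     0x9000,   0x5000
factory,  app,  factory, 0x10000,  0x2a0000
spiffs,   data, spiffs,  0x2b0000, 500K
"""


def parse_size(text):
    """Convert '0x2b0000', '500K' or '1M' into a number of bytes."""
    text = text.strip()
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:].upper()
    if suffix in multipliers:
        return int(text[:-1], 0) * multipliers[suffix]
    return int(text, 0)


def find_partition(csv_text, name):
    """Return (offset, size) in bytes for the named partition, or None."""
    for row in csv.reader(io.StringIO(csv_text)):
        fields = [field.strip() for field in row]
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if fields[0] == name:
            return parse_size(fields[3]), parse_size(fields[4])
    return None


def build_flash_command(port, offset, image="spiffs.bin"):
    """Assemble the esptool.py arguments quoted in the comment above."""
    return [
        "esptool.py", "--port", port, "--chip", "esp32", "-b", "921600",
        "write_flash", "--flash_mode", "dio", "--flash_freq", "80m",
        "--flash_size", "4MB", hex(offset), image,
    ]


if __name__ == "__main__":
    offset, size = find_partition(SAMPLE_PARTITIONS_CSV, "spiffs")
    print(hex(offset), size)
    print(" ".join(build_flash_command("/dev/<serial_port_goes_here>", offset)))
```

The script only prints the command instead of running it, so a wrong offset can be spotted before anything gets written to the device.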

@ghosty-be
Contributor

reading up on this it seems that it's not that straightforward to hand that partition layout to the Arduino IDE... I guess it requires creating a new board definition specifically with your partitions.csv layout ... darn why can't stuff be simpler :)
https://robotzero.one/arduino-ide-partitions/

@ghosty-be
Contributor

ok messing around with mkspiffs ... I saw in your partitions.csv the spiffs is 500kB ?
Tried something like mkspiffs -c ./data -s 512000 spiffs.bin but that says
...
/captive/app.js
SPIFFS_write error(-10001): File system is full.

error adding file!
Error for adding content from captive!
/power_inactive.jpg
SPIFFS_write error(-10001): File system is full.

error adding file!

While it should fit according to this logic:
$ du -bs spiffs.bin
512000 spiffs.bin
$ du -bs data/
432687 data/

so not sure what I am doing wrong there... googles on

@ghosty-be
Contributor

so after a bit more searching: spiffs by default use 4k block ...
du -B4K data/
130
when I multiply that by 4 it's 520k :/
So you use a different block size for your spiffs somehow? :)
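
The estimate above can be written out as a small Python sketch: round every file up to whole 4k blocks and compare the total with the partition size. This only mirrors the rough `du -B4K` reasoning in this comment; it is an assumption that it approximates what SPIFFS does, since the real filesystem also spends space on its own metadata. The function names are made up for illustration.

```python
import math
import os

BLOCK_SIZE = 4096            # the 4k block size mentioned above
PARTITION_SIZE = 500 * 1024  # 512000 bytes, the size passed to mkspiffs above


def directory_file_sizes(path):
    """Collect the size in bytes of every file below path."""
    sizes = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            sizes.append(os.path.getsize(os.path.join(root, name)))
    return sizes


def blocks_needed(file_sizes, block_size=BLOCK_SIZE):
    """Count blocks when each file is rounded up to whole blocks."""
    return sum(math.ceil(size / block_size) for size in file_sizes)


def fits(file_sizes, partition_size=PARTITION_SIZE, block_size=BLOCK_SIZE):
    """True if the block-rounded total is within the partition size."""
    return blocks_needed(file_sizes, block_size) * block_size <= partition_size


if __name__ == "__main__":
    sizes = directory_file_sizes("./data")
    used = blocks_needed(sizes) * BLOCK_SIZE
    print(f"{len(sizes)} files, {used} bytes block-rounded, fits: {fits(sizes)}")
```

With the numbers from this thread, 130 blocks of 4096 bytes come to 532480 bytes, while a 512000 byte partition holds only 125 such blocks, which matches the "File system is full" error even though the raw byte count (432687) is smaller than the partition.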

@sieren
Owner Author

sieren commented Feb 24, 2021

oh you may be onto something.. maybe this is the reason for the corruptions all along??

@ghosty-be
Contributor

just wondering how you got all the data there in the end :/
As I assume that your github /data should be a reflection of the data onto the spiffs ...
Was also looking if it was possible to like download the whole spiff space to do a compare or stuff... but that doesn't appear to be possible...
What I did notice after the corruption yesterday might be then caused by the same issue: I deleted through the webinterface the config.json and tried to upload my backup copy but that always failed ...
So might go in and delete some images I don't really use and see if I can do a similar operation successful: deleting and re-uploading the config.json after... (but that'll be for this evening when I have more time to tinker with it )

@sieren
Owner Author

sieren commented Feb 24, 2021

This is really great info! We might be onto something here.
You are right, even spiffs.bin is beyond 500kb by now. It's weird that none of the ESP tools were complaining about the fact that the bin-file is beyond what's specified in the partition scheme. This would make perfect sense though.
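
Since the tools apparently stay silent, a check like the following could run before flashing. It is only a sketch: `check_image_fits` is a hypothetical helper, not part of esptool or ESP-IDF, and the partition size has to be taken from the partition scheme actually in use.

```python
import os


def check_image_fits(image_path, partition_size):
    """Raise ValueError if the image is larger than the partition.

    Returns the number of spare bytes when the image fits.
    """
    image_size = os.path.getsize(image_path)
    if image_size > partition_size:
        raise ValueError(
            f"{image_path} is {image_size} bytes, "
            f"{image_size - partition_size} bytes larger than the partition"
        )
    return partition_size - image_size


if __name__ == "__main__":
    # 500 * 1024 is the SPIFFS partition size discussed in this thread.
    print(check_image_fits("spiffs.bin", 500 * 1024))
```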

It seems I need to create per-device SPIFFS Partition Schemes too, some ESP32 devices only support 4MB, others like the M5Stack are made for up to 16MB. But it looks like there was still some space to bump up the SPIFFS partition.

I've attached a special build that remedies this. It'd be great if you could test it and report back.
ping @htvekov maybe you too, since the icon pack might be hammering the SPIFFS partition pretty well.

Mind you, this requires a hard flash through USB with the _full.bin - don't forget to make a backup of your config beforehand :)

homepoint_release.zip

@ghosty-be
Contributor

Did not really find how to trigger it yet still trying with the current release v0.07.2
tried deleting a bunch of images and uploading the backup config.json ok
reflashed
tried deleting config.json, uploading the backup config.json ok
tried adding a couple lines to config.json, reloaded and rebooted a couple times ok
it started to act weird when I after that uploaded a readme.txt, then uploaded another readme.txt (other content) all still only a couple bytes...
shortly after that the webinterface started to act up (no files visible anymore in the webinterface... but the reload and restart still worked... )
after reflash again tried to upload some files, even a specially crafted blob.abc (just urandom data with dd) which was over 4kB in size to take up 2x4kB blocks ... but still couldn't corrupt the spiffs...

Now just flashed your above release...
it works, but I'm not sure how to verify that it actually fixed anything :)
I also uploaded a couple of icons from @htvekov to test, but so far it still works...
So what is the change now in the release? just the size of the spiffs is larger and how large?
Would like to continue actually my quest into rewriting the spiffs without re-flashing the actual code ... tinkering all the way

@htvekov

htvekov commented Mar 4, 2021

Hi' Sieren.

Sorry about the extremely late reply. I've been quite busy at work last 14 days.
Just tested and loaded some 30+ icons, all working!! 👌😁🎉

I'll leave it active for a few days and see if i can crash this Home Point version abusing spiffs 😉

Ciao !

@ghosty-be
Contributor

running it for a solid week now without problems... but then again since I didn't mess with it much I can't really say if its now gone :)
I could not make it crash before on purpose...

@sieren
Owner Author

sieren commented Mar 5, 2021

Cool, sounds promising so far. Once you've had a chance to mess with it a bit more, let me know. Otherwise I'll roll these changes into the next update

@sieren
Owner Author

sieren commented Mar 5, 2021

Leaving open for now

@sieren sieren reopened this Mar 5, 2021
@htvekov

htvekov commented Mar 15, 2021

Can't kill it, Matt !😉

Has been running stable and without any issues - loading files or otherwise.
I would merge fix and release as stable build.

Ciao !

@ghosty-be
Contributor

I today played a lot with the config, copy - pasting other configs and reloading... that's what trashed mine before, but this version has been running fine for 18 days now...

@dresende

dresende commented Apr 1, 2021

I was just bitten by this, and since the wifi settings are in the config, the device is "bricked" and I have to reflash. I would suggest 2 features to solve this:

  1. Fallback to no config after perhaps 5min? 10min of not being able to connect?
  2. Have an option to reformat SPIFFS.

@sieren
Owner Author

sieren commented Apr 1, 2021

Using the version attached to this issue a few comments back? The fix hasn't been rolled into a main release yet

@dresende

dresende commented Apr 2, 2021

No, I used the latest version, I didn't see that attachment. Will try that and see how it goes.

@cerietke

cerietke commented Apr 3, 2021

Is the file attached to the Feb 24th post the latest? My core 2 seems unstable (though no space problems so far), it keeps restarting and has trouble connecting to the wifi, once it has connected it sometimes reboots or turns off on a click.

@dresende

dresende commented Apr 3, 2021

I flashed 2 hours ago and so far so good 😄

@sieren
Owner Author

sieren commented Apr 3, 2021

@cerietke see #145 - known issue right now

@ghosty-be
Contributor

so far been messing with it quite a bit ... not seen the spiffs corruption issue with the version attached to this thread... (did not get it corrupted despite rewriting a bunch of times the configs, adding a couple of extra icons etc...)
However, reading above about the crashes on the M5Stack Core 2 (I went to look in that thread, but I have a 1st generation Core): I have seen some similar behavior on my M5Stack Core, but not that often.
I have it attached to usb power adapter on my desk (I removed the battery shield) and have seen sometimes that it like reloads showing the wifi is disconnected for a couple of seconds before it reconnects... (lets say once a week or even less frequent)
Not sure if it could be some kind of condition where the connection to the broker or wifi is lost and that causes homepoint to freak out and reload or something? :/

@iqbalibrahim1992

Hi @sieren absolutely love this, I got it all running without any issues at first, but I've noticed that I experience this problem (cannot edit/create files) after I upload a new icon file.

I've created two JPG files both 50px x 50px called office_active.jpg and office_inactive.jpg - as soon as they upload, I'm not able to edit the config.json file anymore. Using the M5Stack Basic Development Kit. What I'm having to do is build the config.json file in Notepad++, flash the M5Stack via USB, then edit the config.json in the Web UI for it to work, and then basically not touching anything in the Web UI.

Any advice on how to resolve this issue? Happy to try and help with it, although my knowledge is probably nowhere near yours on this... Thank you

@cerietke

cerietke commented Aug 19, 2021

Are you sure it's always the case that you can't edit it?

I have a number of images I created. I upload them one after another after a fresh install, then edit my config (copy-paste) and then it's usually fine. If I have to edit the config again later, I sometimes find that it loads only partially and I can't seem to save it again. If that happens I reset: do a fresh install and start from scratch.

I believe the new code is very different, I tried it on my core2, but unfortunately I couldn't get creating my own images to work. I did not see the same issue with the config.json though.

@gon0

gon0 commented Sep 24, 2021

Maybe this helps with reproducing the issue: on my M5 Stack, I entered a German letter, "ß", in the config.json file as part of a name, e.g.

    "name": "Gießen",
    "type": "Sensor",
    "icon": "door",

After I saw the message "Configuration invalid! Login via browser Could not Parse config file", it was not possible to edit the config-file. After some minutes, it was completely empty. I have reflashed the M5 Stack to get back to a working system again.
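
As a possible precaution, the config can be checked on a PC before it is uploaded. The Python sketch below only verifies that the bytes are valid UTF-8 and valid JSON and lists any non-ASCII characters, so they can be tried with and without. Whether the parser on the device actually has a problem with characters like "ß" is not established in this thread, and `validate_config_bytes` is a hypothetical helper name.

```python
import json


def validate_config_bytes(raw):
    """Check raw config bytes on the host before uploading them.

    Returns (ok, message, non_ascii_chars). The list of non-ASCII
    characters is informational only; they are legal in JSON.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as error:
        return False, f"not valid UTF-8: {error}", []
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return False, f"not valid JSON: {error}", []
    non_ascii = sorted({char for char in text if ord(char) > 127})
    return True, "ok", non_ascii


if __name__ == "__main__":
    with open("config.json", "rb") as handle:
        print(validate_config_bytes(handle.read()))
```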

@jonathanmbradshaw

It's a long shot, but might this be related to NTP & DHCP options? I don't have the serial log at this point, but I had a lot of NTP retry 1 - 10 errors, and also a set time error on the file system while updating config.json

9 participants