Strange Total-Crash Failure after a couple of days/weeks ESP32 RF gateway #854

Closed
Dattel opened this issue Jan 30, 2021 · 26 comments

@Dattel
Contributor

Dattel commented Jan 30, 2021

Hi,

I'm seeing strange behaviour with OMG version 0.9.5 on an ESP8266 NodeMCU board.

espressif8266@2.6.2

My attached devices are:
-BME280
-RF-Sender
-RF-Receiver

I'm using my own prod_env.ini with the following compiler settings:

[platformio]
build_flags =
  '-DESPWifiManualSetup="1"'
  '-DWifiManager_password="*****"'
  '-DOMG_VERSION="v0.9.5"'
  '-Dwifi_ssid="*****"'
  '-Dwifi_password="*****"'
  '-DMQTT_USER="*****"'
  '-DMQTT_PASS="*****"'
  '-DMQTT_SERVER="*****"'
  '-DMQTT_PORT="1884"'
  '-Dota_password="*****"'  

[com-esp01]
lib_deps =
  ${com-esp.lib_deps}
build_flags =
  ${env.build_flags}
  ${platformio.build_flags}
  -DMQTT_MAX_PACKET_SIZE=1024
  '-DsimpleReceiving=true'
  '-DTRACE=0'

[env:nodemcuv2-OpenMediaGateway]
platform = ${com.esp8266_platform}
board = nodemcuv2
lib_deps =
  ${com-esp01.lib_deps}
  ${libraries.rc-switch}
  ${libraries.bme280}
build_flags = 
  ${com-esp01.build_flags}
  ${platformio.build_flags}
  '-DGateway_Name="OpenMQTTGateway"'
  '-DZgatewayRF="RF"'
  '-DZsensorBME280="BME280"'
  '-DRF_RECEIVER_GPIO=14' #D5
  '-DRF_EMITTER_GPIO=12' #D6

After a variable period (days to weeks) the board hangs. If I try to capture information over the serial port, I need to set the baud rate to 74880, and all I get is:

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3584, room 16 
tail 0
chksum 0xb0
csum 0xb0
v2843a5ac
~ld
⸮⸮⸮
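
For reference, the first line of this output can be decoded mechanically. The sketch below is a minimal host-side C++ example, not part of OMG: `BootInfo`, `parseBootLine`, `describeResetCause` and `describeBootMode` are hypothetical names, and the meaning tables follow the commonly documented ESP8266 boot ROM values, which is an assumption worth checking against the ESP8266 Arduino core documentation.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Numbers reported by the ESP8266 boot ROM (printed at 74880 baud).
struct BootInfo {
    int resetCause = -1;  // number after "rst cause:"
    int bootMode = -1;    // first number in "boot mode:(x,y)"
    int bootDetail = -1;  // second number in "boot mode:(x,y)"
};

// Extracts the three numbers from a line such as
// " ets Jan  8 2013,rst cause:2, boot mode:(3,7)".
// Returns false if the expected fields are not all present.
bool parseBootLine(const std::string& line, BootInfo& out) {
    const char* start = std::strstr(line.c_str(), "rst cause:");
    if (start == nullptr) return false;
    BootInfo parsed;
    const int matched = std::sscanf(start, "rst cause:%d, boot mode:(%d,%d)",
                                    &parsed.resetCause, &parsed.bootMode,
                                    &parsed.bootDetail);
    if (matched != 3) return false;
    out = parsed;
    return true;
}

// Assumed mapping of reset causes (commonly documented values).
const char* describeResetCause(int cause) {
    switch (cause) {
        case 1: return "power-on (normal boot)";
        case 2: return "external reset pin";
        case 3: return "software reset";
        case 4: return "watchdog reset";
        default: return "unknown";
    }
}

// Assumed mapping of the first boot mode number.
const char* describeBootMode(int mode) {
    switch (mode) {
        case 1: return "UART download";
        case 3: return "boot from SPI flash";
        default: return "other or unexpected";
    }
}
```

Applied to the first line of the output above, this gives reset cause 2 (external reset pin) and boot mode 3 (boot from SPI flash). That describes how the chip was restarted, not why the firmware stops afterwards.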

The only way to recover from this state is to flash a sample blink sketch from the Arduino IDE with the "Erase all Flash Content" option and then reflash OMG. If I flash without "Erase all Flash Content", the board stays in this dead state.

I also see this issue on other ESP-01 based boards that have only BME280 sensors attached, but there it happens much less often.

Any ideas how to fix that? Or any clue where I should dig deeper?

Thanks
D

@Dattel
Contributor Author

Dattel commented Jan 30, 2021

I saw a similar problem here, with a suggested solution:

esp8266/Arduino#2414 (comment)

@1technophile
Owner

1technophile commented Jan 30, 2021

Hello,

Could you try to isolate the cause of the crash by deactivating the modules one by one, please?
I would start with the BME280.

Alternatively you could try to update the BME280 library (platformio.ini file):
bme280 = SparkFun BME280@2.0.4
to
bme280 = SparkFun BME280@2.0.9

@Dattel
Contributor Author

Dattel commented Jan 30, 2021

I will try BME@2.0.9.
Disabling the BME would not make much sense for me, because my ESP01 boards use only the BME sensors, so if I disable that sensor, the OMG instances will be useless for me :-)

I just compiled and uploaded that version to my NodeMCU board, which also uses the RF receiver/sender. I'll keep an eye on it and post feedback.
Thanks for your quick reply

@1technophile
Owner

Disabling the BME would not make much sense for me, because my ESP01 boards use only the BME sensors, so if I disable that sensor, the OMG instances will be useless for me :-)

This is just to identify the defective module, not a request to remove it permanently.

@Dattel
Contributor Author

Dattel commented Jan 31, 2021

Thanks, I completely understand.

I've updated my NodeMCU with RF and BME to BME lib 2.0.9 and will give it a try. If it doesn't work, I'll remove the BME from this board completely for testing.

@Dattel
Contributor Author

Dattel commented Feb 10, 2021

Updating to BME@2.0.8 does not seem to fix it.
I had a power outage yesterday; I'm not sure whether that accelerates the occurrence of the failure.

Moving to BME@2.0.9 was not possible quickly, since there are changes in the initialization process that seem to require some changes in OMG to open the I2C bus.

Nevertheless, I have removed the BME lib and the sensor completely from the board for testing purposes.
I'll keep you updated here.

@Albertowue

Hi,
I have exactly the same problem (after a variable period the board hangs). My setup is:

OMG in Version 0.9.5 development
ESP8266 NodeMCU Board
espressif8266@2.6.3

-DHT22
-RF-Sender
-RF-Receiver

For me too, the only way to unlock the ESP is Dattel's technique.
I have a different sensor, so the problem isn't the DHT or BME module.

@Dattel
Contributor Author

Dattel commented Feb 24, 2021

Today my device crashed again, this time with only the RF send/receive modules attached.
So I assume there is nothing wrong with the BME lib.

It took 14 days to crash this time.

@1technophile
Owner

Thanks for the feedback.

Did you check your power supply voltage and stability?

Also, I found out that there is an issue with the manual Wi-Fi setup in v0.9.5. Could you erase the flash, comment out this line of code, and upload again?

The issue is that wifi.begin() may be called too soon after a previous wifi.begin(), resulting in a continuous reconnect loop.
I don't know whether this is causing your problem, but it may be a lead worth following.
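
One way to avoid back-to-back connection attempts is to enforce a minimum interval between them. The following is a minimal sketch in plain C++ under that assumption, not the actual OMG fix: `ReconnectThrottle` and `tryBegin` are hypothetical names, and the current time is passed in so the logic can be checked off-device.

```cpp
#include <cstdint>

// Allows a new connection attempt only if enough time has passed
// since the previous one. Hypothetical helper, not taken from OMG.
class ReconnectThrottle {
public:
    explicit ReconnectThrottle(uint32_t minIntervalMs)
        : minIntervalMs_(minIntervalMs) {}

    // Returns true and records the attempt when a new attempt may start
    // at time nowMs (a millisecond counter such as Arduino's millis()).
    // Returns false when the previous attempt is still too recent.
    bool tryBegin(uint32_t nowMs) {
        // Unsigned subtraction stays correct when the counter wraps around.
        if (hasAttempted_ && (nowMs - lastAttemptMs_) < minIntervalMs_) {
            return false;
        }
        hasAttempted_ = true;
        lastAttemptMs_ = nowMs;
        return true;
    }

private:
    uint32_t minIntervalMs_;
    uint32_t lastAttemptMs_ = 0;
    bool hasAttempted_ = false;
};
```

On the device, the guard would wrap the connect call, for example `if (WiFi.status() != WL_CONNECTED && throttle.tryBegin(millis())) WiFi.begin(ssid, password);`. The interval itself is a tuning choice, not a documented requirement.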

@1technophile
Owner

@Albertowue did you get serial monitor output (115200) to see what's going on?
Also, for both of you, it would be helpful to set the log level to LOG_LEVEL_TRACE.

@Dattel
Contributor Author

Dattel commented Feb 25, 2021

Hi @1technophile, thanks for your fast reply.

I just updated my sources to the latest development sources and commented out your line, but I didn't change the log level, since the device is running off-site with only a power supply connected. As I mentioned earlier, when the device crashes I only get rubbish on the serial port and the baud rate magically changes to 74880, so I assume nothing more useful would come through at that point.

I will give it a try now.

@1technophile
Owner

1technophile commented Mar 2, 2021

Hello @Dattel @Albertowue,

I have updated the RCSwitch library.
Could you change line 87 of the platformio.ini file
from

rc-switch = https://github.com/1technophile/rc-switch.git#385a7e0

to

rc-switch = https://github.com/1technophile/rc-switch.git#0e0d210

and see if you get better stability?

@Dattel
Contributor Author

Dattel commented Mar 3, 2021

Sure.
I updated my device with the latest git sources and changed to rc-switch #0e0d210.
But this time I didn't comment out the line you suggested earlier.

@Albertowue

I just updated to the latest git sources and changed line 87 of platformio.ini as indicated. The system just crashed after 10 days of operation.

@hvorragend

I am using OMG with BLE discovery. I've tried the current code on several ESP32 boards and noticed crashes after several days.
I will try be5a93a#diff-4446afd728a4f34cbcddc306a9cb6be845d1a61c216076a295683bcc9c106724 soon.

@1technophile
Owner

I've tried the current code on several ESP32

Could you indicate what you mean by the current code?

@1technophile
Owner

I just updated to the latest git sources and changed line 87 of platformio.ini as indicated. The system just crashed after 10 days of operation.

Hmm, I don't understand how you can have a system crash after 10 days of operation if you took a change that was provided 2 days before your message :-)

@hvorragend

I've tried the current code on several ESP32

Could you indicate what you mean by the current code?

I am using the current dev branch, not the release.

@1technophile
Owner

I am using the current dev-branch. Not the release.

OK, several corrections were made in the last few days. Could you try the latest dev?

@Albertowue

I just updated to the latest git sources and changed line 87 of platformio.ini as indicated. The system just crashed after 10 days of operation.

Hmm, I don't understand how you can have a system crash after 10 days of operation if you took a change that was provided 2 days before your message :-)

Excuse my bad English. I meant that it had locked up again before I made the change you suggested. So far, with your changes, I have reached 12 days of operation without any lockups.

@1technophile
Owner

No problem, thanks for the feedback!

@1technophile changed the title from "Strange Total-Crash Failure after a couple of days/weeks" to "Strange Total-Crash Failure after a couple of days/weeks ESP32 RF gateway" on Mar 16, 2021
@1technophile added this to the v0.9.6 milestone on Mar 16, 2021
@1technophile
Owner

I'm closing the issue. If it reappears, do not hesitate to comment here.

@Dattel
Contributor Author

Dattel commented Apr 10, 2021

Sorry, the problem isn't fixed. Two devices reached the lockdown state a couple of days ago. I didn't notice at the time, so I can't say how long they really lasted without crashing.

@1technophile
Owner

@Dattel which version are you using? By lockdown state, do you mean that it didn't restart itself?
Do you still have the BME detached?

@Dattel
Contributor Author

Dattel commented Apr 13, 2021

By "lockdown state" I mean that I have to reflash the device, including "Erase UserData & Wifi settings".

I have two "locked down" devices:

  • One with the BME attached (ESP01).
  • The other (NodeMCU dev board, ESP8266) without the BME attached and without its libs in the build section (just rc-switch).

I'm using the git checkout from 03.03.2021 with rc-switch#0e0d210.

@Dattel
Contributor Author

Dattel commented Apr 15, 2021

I reflashed both devices 2 days ago. The NodeMCU dev board lasted for 2 days and then got stuck again.
So I just pulled the latest git sources and reflashed again, and I will see whether this version is more stable.
