Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak on ESP8266 when using api encryption after socket error #3320

Open
Dilbert66 opened this issue May 27, 2022 · 18 comments
Open

Memory leak on ESP8266 when using api encryption after socket error #3320

Dilbert66 opened this issue May 27, 2022 · 18 comments
Assignees

Comments

@Dilbert66
Copy link

Dilbert66 commented May 27, 2022

The problem

Heap Free is consistently dropping in size over the course of a few hours until the ESP will crash and reboot due to lack of heap space due to a "socket operation failed: BAD_INDICATOR errono=11" error caused by a HA instance attempting to connect in plain text. The memory decrease is related to the socket error since the heap size immediately after is lower. Possible unreleased socket memory structure after error? The issue is not the error itself but the associated heap leakage which should not be happening in this case. I have confirmed that the heap decrease is related to the socket error by stopping the erroneous connection attempts. Without the error , heap is staying consistent.

Which version of ESPHome has the issue?

2022.5.1

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

2021.12.4

What platform are you using?

ESP8266

Board

Wemos D1 mini

Component causing the issue

api encryption

Example YAML snippet

substitutions:
  # change the name to suit your needs. This is what esphome will use as the friendly name for your component.
  # also used as a prefix for all fields
  systemName: "dscalarmmodule" 
  
 
  
esphome:
  name: $systemName
  platform: ESP8266
  board: nodemcuv2


wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  ap:
    ssid: "$systemName"
    password: !secret wifi_password

logger:
  baud_rate: 115200
  level: DEBUG

api:
   password: !secret api_password  
   encryption:
    key: !secret encryption_key
   

ota:
   password: !secret ota_password
   safe_mode: True


 

binary_sensor:
    #zone status open/close for each zone 
  - platform: template
    id: z1
    name: "$systemName Front door(z1)"
    device_class: door

    
text_sensor:
    #general system status online/disconnected 
  - platform: template
    id: system_status
    name: "$systemName System Status"
    icon: "mdi:shield"
    #battery level status for wireless channels, tamper , in alarm, etc for individual zones
  
  - platform: debug
    device:
      name: "Device Info"    
        
debug:
  update_interval: 30s

sensor:
  - platform: debug
    free:
      name: "Heap Free"
    fragmentation:
      name: "Heap Fragmentation"
    block:
      name: "Heap Max Block"
    loop_time:
      name: "Loop Time"

Anything in the logs that might be useful for us?

�[0;36m[D][sensor:124]: 'Heap Free': Sending state 18944.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Max Block': Sending state 16400.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Fragmentation': Sending state 6.00000 % with 1 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Loop Time': Sending state 35.00000 ms with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Free': Sending state 18944.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Max Block': Sending state 16400.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Fragmentation': Sending state 6.00000 % with 1 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Loop Time': Sending state 43.00000 ms with 0 decimals of accuracy�[0m
�[0;36m[D][api:102]: Accepted 192.168.2.229�[0m
�[0;33m[W][api.connection:071]: 192.168.2.229: Socket operation failed: BAD_INDICATOR errno=11�[0m
�[0;36m[D][sensor:124]: 'Heap Free': Sending state 18840.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Max Block': Sending state 16400.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Fragmentation': Sending state 6.00000 % with 1 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Loop Time': Sending state 35.00000 ms with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Free': Sending state 18840.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Max Block': Sending state 16400.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Fragmentation': Sending state 6.00000 % with 1 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Loop Time': Sending state 42.00000 ms with 0 decimals of accuracy�[0m
�[0;36m[D][api:102]: Accepted 192.168.2.229�[0m
�[0;33m[W][api.connection:071]: 192.168.2.229: Socket operation failed: BAD_INDICATOR errno=11�[0m
�[0;36m[D][sensor:124]: 'Heap Free': Sending state 18736.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Max Block': Sending state 16296.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Fragmentation': Sending state 6.00000 % with 1 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Loop Time': Sending state 37.00000 ms with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Free': Sending state 18736.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Max Block': Sending state 16296.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Fragmentation': Sending state 6.00000 % with 1 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Loop Time': Sending state 35.00000 ms with 0 decimals of accuracy�[0m
�[0;36m[D][api:102]: Accepted 192.168.2.229�[0m
�[0;33m[W][api.connection:071]: 192.168.2.229: Socket operation failed: BAD_INDICATOR errno=11�[0m
�[0;36m[D][sensor:124]: 'Heap Free': Sending state 18632.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Max Block': Sending state 16024.00000 B with 0 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Heap Fragmentation': Sending state 7.00000 % with 1 decimals of accuracy�[0m
�[0;36m[D][sensor:124]: 'Loop Time': Sending state 36.00000 ms with 0 decimals of accuracy�[0m

Additional information

No response

@Dilbert66 Dilbert66 changed the title Memory leak on ESP8266 when using api encryption Memory leak on ESP8266 when using api encryption after socket error May 27, 2022
@probot-esphome
Copy link

Hey there @OttoWinter, mind taking a look at this issue as it has been labeled with an integration (api) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)

@tickleberrydude
Copy link

tickleberrydude commented Aug 24, 2022

I am having a similar issue, Crappy HiLetgo ESP8266 nodemcu. I am also using a motion sensor in mine,

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Dec 23, 2022
@zorrobyte
Copy link

Also seeing this:

[18:23:09][W][api.connection:071]: ::FFFF:10.0.0.137: Socket operation failed: HANDSHAKESTATE_READ_FAILED errno=11
[18:23:15][D][api:102]: Accepted ::FFFF:10.0.0.137
[18:23:15][W][api.connection:071]: ::FFFF:10.0.0.137: Socket operation failed: HANDSHAKESTATE_READ_FAILED errno=11
[18:23:17][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:23:17][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:17][D][sensor:127]: 'Front Walkway Humidity': Sending state 35.00000 % with 0 decimals of accuracy
[18:23:18][D][dht:048]: Got Temperature=18.2°C Humidity=37.0%
[18:23:18][D][sensor:127]: 'Battery Box Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:18][D][sensor:127]: 'Battery Box Humidity': Sending state 37.00000 % with 0 decimals of accuracy
[18:23:25][D][api:102]: Accepted ::FFFF:10.0.0.137
[18:23:25][W][api.connection:071]: ::FFFF:10.0.0.137: Socket operation failed: HANDSHAKESTATE_READ_FAILED errno=11
[18:23:27][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:23:27][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:27][D][sensor:127]: 'Front Walkway Humidity': Sending state 35.00000 % with 0 decimals of accuracy
[18:23:28][D][dht:048]: Got Temperature=18.3°C Humidity=37.0%
[18:23:28][D][sensor:127]: 'Battery Box Temperature': Sending state 18.30000 °C with 1 decimals of accuracy
[18:23:28][D][sensor:127]: 'Battery Box Humidity': Sending state 37.00000 % with 0 decimals of accuracy
[18:23:37][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:23:37][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:37][D][sensor:127]: 'Front Walkway Humidity': Sending state 35.00000 % with 0 decimals of accuracy
[18:23:38][D][dht:048]: Got Temperature=18.2°C Humidity=37.0%
[18:23:38][D][sensor:127]: 'Battery Box Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:38][D][sensor:127]: 'Battery Box Humidity': Sending state 37.00000 % with 0 decimals of accuracy
[18:23:44][D][api:102]: Accepted ::FFFF:10.0.0.137
[18:23:44][W][api.connection:071]: ::FFFF:10.0.0.137: Socket operation failed: HANDSHAKESTATE_READ_FAILED errno=11
[18:23:47][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:23:47][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:47][D][sensor:127]: 'Front Walkway Humidity': Sending state 35.00000 % with 0 decimals of accuracy
[18:23:48][D][dht:048]: Got Temperature=18.2°C Humidity=37.0%
[18:23:48][D][sensor:127]: 'Battery Box Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:48][D][sensor:127]: 'Battery Box Humidity': Sending state 37.00000 % with 0 decimals of accuracy
[18:23:57][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:23:57][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:57][D][sensor:127]: 'Front Walkway Humidity': Sending state 35.00000 % with 0 decimals of accuracy
[18:23:58][D][dht:048]: Got Temperature=18.2°C Humidity=37.0%
[18:23:58][D][sensor:127]: 'Battery Box Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:23:58][D][sensor:127]: 'Battery Box Humidity': Sending state 37.00000 % with 0 decimals of accuracy
[18:24:07][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:24:07][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:24:07][D][sensor:127]: 'Front Walkway Humidity': Sending state 35.00000 % with 0 decimals of accuracy
[18:24:08][D][dht:048]: Got Temperature=18.2°C Humidity=37.0%
[18:24:08][D][sensor:127]: 'Battery Box Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:24:08][D][sensor:127]: 'Battery Box Humidity': Sending state 37.00000 % with 0 decimals of accuracy
[18:24:17][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:24:17][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:24:17][D][sensor:127]: 'Front Walkway Humidity': Sending state 35.00000 % with 0 decimals of accuracy
[18:24:18][D][dht:048]: Got Temperature=18.2°C Humidity=37.0%
[18:24:18][D][sensor:127]: 'Battery Box Temperature': Sending state 18.20000 °C with 1 decimals of accuracy
[18:24:18][D][sensor:127]: 'Battery Box Humidity': Sending state 37.00000 % with 0 decimals of accuracy
[18:24:19][D][api:102]: Accepted ::FFFF:10.0.0.137
[18:24:19][W][api.connection:071]: ::FFFF:10.0.0.137: Socket operation failed: HANDSHAKESTATE_READ_FAILED errno=11
[18:24:27][D][dht:048]: Got Temperature=18.2°C Humidity=35.0%
[18:24:27][D][sensor:127]: 'Front Walkway Temperature': Sending state 18.20000 °C with 1 decimals of accuracy

Code:

# Enable web server
web_server:
  port: 80

switch:
  - platform: gpio
    pin: GPIO21
    name: "Battery Box Fan"
    id: battery_box_fan
    icon: "mdi:fan"
    restore_mode: ALWAYS_ON
sensor:
  - platform: dht
    pin: GPIO23
    temperature:
      name: "Front Walkway Temperature"
    humidity:
      name: "Front Walkway Humidity"
    update_interval: 10s

  - platform: dht
    pin: GPIO22
    temperature:
      name: "Battery Box Temperature"
    humidity:
      name: "Battery Box Humidity"
    update_interval: 10s

@github-actions github-actions bot removed the stale label Dec 29, 2022
@enter360
Copy link

enter360 commented Jan 3, 2023

Any movement on this ? I'm having a similar issue.

@Keeze
Copy link

Keeze commented Feb 2, 2023

Same here, after the latest ESPHome update I was advised to migrate to use the encryptionkey for API use, so I updated al my (old) devices to use the key and now I have several devices with this issue...

@sygys
Copy link

sygys commented Feb 4, 2023

i installed esphome 2022.12.8 and core 2023.2.1 i also got the repair warning to update the api password for an encryption key. i did with one to test it and now the device keeps disconnecting every few seconds and the logs show this:

[12:10:19][V][panasonic_ac:245]: TX: 5A.11.10.09.00.38.01.01.30.01.11.00.80.00.00.B0.00.02.31.00.00.A0.00.00.A1.00.00.A5.00.00.A4.00.00.B2.00.02.35.00.02.33.00.02.34.00.02.32.00.00.BB.00.00.BE.00.02.20.00.02.21.00.00.86.00.47 (63)
[12:10:19][V][panasonic_ac:247]: RX: 5A.11.10.89.00.76.00.01.30.01.11.00.80.01.31.00.B0.01.43.02.31.01.26.00.A0.01.34.00.A1.01.42.00.A5.01.43.00.A4.01.43.00.B2.01.41.02.35.01.41.02.33.01.41.02.34.01.41.02.32.01.42.00.BB.01.0F.00.BE.01.7E.02.20.01.41.02.21.01.41.00.86.2E.2A.00.00.0B.01.01.48.30.30.30.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.52 (125)
[12:10:19][D][panasonic_ac.dnskp11:418]: Received query response
[12:10:19][W][panasonic_ac:064]: Received out of range outside temperature: 126
[12:10:19][D][climate:385]: 'Panasonic AC' - Sending state:
[12:10:19][D][climate:388]:   Mode: OFF
[12:10:19][D][climate:396]:   Custom Fan Mode: 3
[12:10:19][D][climate:402]:   Custom Preset: Normal
[12:10:19][D][climate:405]:   Swing Mode: OFF
[12:10:19][D][climate:408]:   Current Temperature: 15.00°C
[12:10:19][D][climate:414]:   Target Temperature: 19.00°C
[12:10:19][V][component:200]: Component panasonic_ac.climate took a long time for an operation (0.10 s).
[12:10:19][V][component:201]: Components should block for at most 20-30ms.
[12:10:30][D][api:102]: Accepted ::FFFF:192.168.178.51
[12:10:30][W][api.connection:071]: ::FFFF:192.168.178.51: Socket operation failed: BAD_INDICATOR errno=11
[12:10:30][D][api:102]: Accepted ::FFFF:192.168.178.51
[12:10:30][V][api:114]: Removing connection to ::FFFF:192.168.178.51
[12:10:30][W][api.connection:071]: ::FFFF:192.168.178.51: Socket operation failed: BAD_INDICATOR errno=11
[12:10:30][V][api:114]: Removing connection to ::FFFF:192.168.178.51
[12:10:30][D][api:102]: Accepted ::FFFF:192.168.178.51
[12:10:30][V][component:200]: Component api took a long time for an operation (0.05 s).
[12:10:30][V][component:201]: Components should block for at most 20-30ms.
[12:10:30][V][api.connection:900]: Hello from client: 'aioesphomeapi (::FFFF:192.168.178.51)' | API Version 1.7
[12:10:30][D][api.connection:159]: Home Assistant 2023.2.1 (::FFFF:192.168.178.51) requested disconnected
[12:10:30][W][api.connection:083]: aioesphomeapi (::FFFF:192.168.178.51): Connection closed
[12:10:30][D][api:102]: Accepted ::FFFF:192.168.178.51
[12:10:30][V][api:114]: Removing connection to aioesphomeapi (::FFFF:192.168.178.51)
[12:10:30][V][component:200]: Component api took a long time for an operation (0.06 s).
[12:10:30][V][component:201]: Components should block for at most 20-30ms.
[12:10:30][V][api:114]: Removing connection to Home Assistant 2023.2.1 (::FFFF:192.168.178.51)
[12:10:30][V][api.connection:900]: Hello from client: 'Home Assistant 2023.2.1 (::FFFF:192.168.178.51)' | API Version 1.7
[12:10:30][D][api.connection:918]: Home Assistant 2023.2.1 (::FFFF:192.168.178.51): Connected successfully

@sygys
Copy link

sygys commented Feb 4, 2023

But also with only api: configured without key or password will give these disconnects now.

@moverperfect
Copy link

moverperfect commented Feb 15, 2023

I am getting Dilbert66's issue after upgrading to using encryption key instead of password. Running ESPHOME 2023.2.0 and Home Assistant 2023.2.4. Out of the 4 ESP8266, only 1 is showing this error.

EDIT: Upgraded the OS Version to 9.5 and the issue is now fixed. (Not sure if the OS upgrade or the HA restart fixed it)

@razorback16
Copy link

razorback16 commented Feb 18, 2023

I am still seeing the issue on EspHome 2023.2.2 and HomeAssistant 2023.2.5

@Dilbert66
Copy link
Author

Dilbert66 commented Feb 18, 2023

FYI, i opened this issue to report a memory leak problem, not a connection issue or error. If you folks are having connection problems, this is not the same issue.

@robloh
Copy link

robloh commented Mar 2, 2023

Apparently a HASS restart can solve the connection errors: #4223

I was seeing these and my device was restarting every 20 minutes or so (a Shelly 1). I'm doing some more testing to see if this is just related to encryption (ie, memory leak) or if it was just HASS needing a restart after having moved to encryption

@crashbash2020
Copy link

crashbash2020 commented Jun 1, 2023

FYI, i opened this issue to report a memory leak problem, not a connection issue or error. If you folks are having connection problems, this is not the same issue.

Hey was there a solution to this problem? I have the same issue (appears to work correctly besides the 2-3hourly restart) and have verified it with your symptoms (decreasing heap leading up to reboot)

you mentioned in your original bug report;

I have confirmed that the heap decrease is related to the socket error by stopping the erroneous connection attempts. Without the error , heap is staying consistent.

does that mean there was something you can do to stop it from erroring out?

cheers,

@CataCluj
Copy link

I also have ESP8266 problems after updating to API Key. Was OK with Password; never disconnecting; It warned me to upgrade to Key; I did; now it keeps disconnecting with "[api.connection:083]: ...: Socket operation failed: BAD_INDICATOR errno=11"
I restarted HAOS many times as there are updates all the time; to no avail.
Also updated Home Assistant which prompted FW update on the device.
HAOS 10.3, core 2023.7.1

@mzpx70
Copy link

mzpx70 commented Sep 7, 2023

I also have ESP8266 problems after updating to API Key. It was OK with Password; never disconnected. It warned me to upgrade to Key; I did; now it keeps disconnecting with "Socket operation failed: BAD_INDICATOR errno=11"
I restarted HAOS many times as there are updates all the time; to no avail.

  1. I had 3 other module with the same error. After modifying some dummy data in the project and rebuild the project, then delete them from devices list and add them again to HA, the problem solved.

  2. I have only one project which suffer this problem no matter what I have done.

  3. I added the debug function to monitor some internal parameters. It seems the heap is consumed finally the module restarted. See the attached
    esp-heap

In the project there are only:

  • switch
  • cover
  • light

@mzpx70
Copy link

mzpx70 commented Sep 21, 2023

No any advice or proposal how to get rid of this memory leak issue?

@JSchoeck
Copy link

JSchoeck commented Nov 30, 2023

Maybe it would be a good idea to not recommend changing to the api encryption on Shelly 1 / ESP8266 devices @OttoWinter ?

I also "bricked" a Shelly 1 due to this, which worked just fine with the password.

Update: after a week, my Shelly 1 now does not loose connection anymore and has been up continuously. I don't understand what causes this problematic startup behavior.

@fencer
Copy link

fencer commented Dec 11, 2023

Same here. Running on esphome 2023.11.6 I'm getting the same issue on various devices mainly a number of wemos d1 mini and sonoff devices. Device reboots after some time of getting the Bad Indicator error

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests