Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Problem using Plugwise USB Stick: Rpi hangs due to running out of memory #10

Closed
MagnusThome opened this issue Nov 1, 2022 · 26 comments
Assignees
Labels
bug Something isn't working

Comments

@MagnusThome
Copy link

Describe the bug.

Plugwise might be totally innocent but here's my scenario:

Two fully separate HA installations in different houses. Both are Rpi4 with new original Rpi OEM PSUs. Both with Plugwise USB Stick plugged in. Both regularly crash and need power cycling to get back up again.

In addition to the Plugwise Beta plus Sticks both also have Tasmota installed. Other than that they have a few things installed but not the same on both.

Just curious if the Stick can be part of the problem if anyone else have had issues? And if so if Plugwise beta is something to dig out logs from and so on.

What version of Home Assistant Core has the issue?

All (both Rpis set up from scratch this spring)

What was the last working version of Home Assistant Core?

None

What type of installation are you running?

Home Assistant OS

How did you install plugwise-beta?

HACS

What kind of Plugwise device is having issues?

USB: Stick

What firmware version is your Plugwise product at?

Circles are firmware: 2015-09-18 10:53:15

Logging

No response

Diagnostics information

No response

Additional information

No response

@MagnusThome MagnusThome added the bug Something isn't working label Nov 1, 2022
@bouwew
Copy link
Contributor

bouwew commented Nov 2, 2022

Do you use SD-cards as the storage-medium on these Pi's?
If yes, that's most likely the reason for the crashes. Your SD-card will become full in no time if you don't take special measures in limiting the growth of the HA database.

@MagnusThome
Copy link
Author

Thanks for the reply. Both use same brand Sandisk USB sticks to run on. One has 32GB and is just now reporting 20GB free, the other one runs on 16GB and reports 5,7GB free.

@bouwew
Copy link
Contributor

bouwew commented Nov 2, 2022

Ok, so you're saying, that should not be the cause of your problem?

Next question: is there any operation-system log that you can check for clues? Like the journal-log?
This also depends on how you have installed Home Assistant?

Also, what is crashing? The Pi operating system or HA? If HA is crashing, is the operating system still running?

@MagnusThome
Copy link
Author

It doesn't look like the USB sticks should be a problem I think?

I've been looking at the different available logs but honestly don't know what to search for. I do see some errors from the plugwise addon but not sure what is okay noise and what is an actual problem.

On one of the installs I see this reoccuring:

2022-11-02 13:02:05.026 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000037B255 because max retries 3 reached
2022-11-02 13:02:05.173 WARNING (serial_reader_thread) [plugwise.controller] Drop CirclePowerUsageRequest to 000D6F000037B255 because max retries 3 reached

On the other install I see many of these too:

2022-11-02 13:41:59.901 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 13:42:14.697 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 13:42:16.500 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 13:42:17.676 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 13:42:25.295 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 13:43:04.013 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 13:43:32.335 WARNING (serial_reader_thread) [plugwise.controller] Drop CirclePowerUsageRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 13:43:57.782 WARNING (serial_reader_thread) [plugwise.controller] Drop CirclePowerUsageRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 13:43:59.583 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached

When they have "crashed" and I ping I seem to get replies and port 22 and 8123 are also open according to my port scanner, but trying with ssh or web browser there is nothing coming back from either.

@bouwew
Copy link
Contributor

bouwew commented Nov 2, 2022

The messages you show are indeed okay noise :) I see many of them too in my system.

Ok, so your Pi OS crashes? With which method have you installed HA on your Pi?

Also, have you searched for "Pi crashes" or similar in https://community.home-assistant.io/?

@MagnusThome
Copy link
Author

MagnusThome commented Nov 2, 2022

I also found two of these watchdog and timeout events but only on one of the Pis, not the other one

2022-11-02 14:15:28.055 ERROR (receive_timeout_thread) [root] Uncaught thread exception

2022-11-02 14:15:46.436 WARNING (watchdog_thread) [plugwise.controller] Unexpected halt of receive thread, restart thread

But it continued to run at these two occasions though, no crash

2022-11-02 14:13:17.392 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:13:27.878 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:13:27.887 WARNING (receive_timeout_thread) [plugwise.controller] Drop CirclePowerUsageRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:13:30.904 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:13:33.955 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:13:35.716 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:13:49.010 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:13:49.157 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:14:02.240 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:14:19.867 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:14:21.099 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:14:21.701 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:14:33.192 WARNING (MainThread) [homeassistant.helpers.template_entity] Template loop detected while processing event: <Event state_changed[L]: entity_id=sensor.all_effekt, old_state=<state sensor.all_effekt=267.65; unit_of_measurement=W, device_class=power, friendly_name=All effekt @ 2022-11-02T14:14:31.184424+01:00>, new_state=<state sensor.all_effekt=265.45; unit_of_measurement=W, device_class=power, friendly_name=All effekt @ 2022-11-02T14:14:32.192254+01:00>>, skipping template render for Template[{% set ns = namespace(states=[]) %} {% for s in states.sensor %}
{% if s.object_id.startswith('power_usage_') %}
{% set ns.states = ns.states + [ s.state | float ] %}
{% endif %}
{% endfor %} {{ ns.states | sum | round(2) }}]
2022-11-02 14:14:50.474 WARNING (serial_reader_thread) [plugwise.controller] Drop CirclePowerUsageRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:14:50.632 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:14:50.769 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:14:50.919 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:14:52.157 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:15:14.514 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:15:22.295 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:15:22.424 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:15:22.667 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:15:28.042 WARNING (receive_timeout_thread) [plugwise.controller] Drop CirclePowerUsageRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:15:28.055 ERROR (receive_timeout_thread) [root] Uncaught thread exception
Traceback (most recent call last):
File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.10/site-packages/plugwise/controller.py", line 329, in _receive_timeout_loop
if self.expected_responses[seq_id][3] is not None:
KeyError: b'DE3A'
2022-11-02 14:15:42.937 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:15:46.436 WARNING (watchdog_thread) [plugwise.controller] Unexpected halt of receive thread, restart thread
2022-11-02 14:15:47.465 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:15:47.613 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:15:56.837 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:16:02.835 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:16:10.827 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:16:10.987 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:16:16.486 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:16:31.642 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:16:32.699 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:16:32.881 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:16:33.160 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:16:46.528 WARNING (receive_timeout_thread) [plugwise.controller] Drop CircleEnergyCountersRequest to 000D6F000356273F because max retries 3 reached
2022-11-02 14:16:46.529 WARNING (receive_timeout_thread) [plugwise.controller] Drop CircleEnergyCountersRequest to 000D6F000356273F because max retries 3 reached
2022-11-02 14:16:46.532 WARNING (receive_timeout_thread) [plugwise.controller] Drop CircleEnergyCountersRequest to 000D6F000356273F because max retries 3 reached
2022-11-02 14:38:43.390 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:38:43.545 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:38:48.695 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:38:48.915 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:38:48.917 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:38:48.920 WARNING (receive_timeout_thread) [plugwise.controller] Drop CircleEnergyCountersRequest to 000D6F0003562646 because max retries 3 reached
2022-11-02 14:38:48.923 WARNING (receive_timeout_thread) [plugwise.controller] Drop CircleEnergyCountersRequest to 000D6F00035626FE because max retries 3 reached
2022-11-02 14:38:48.925 WARNING (receive_timeout_thread) [plugwise.controller] Drop CircleEnergyCountersRequest to 000D6F00035626FE because max retries 3 reached
2022-11-02 14:39:02.373 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:39:03.963 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:39:03.971 ERROR (receive_timeout_thread) [root] Uncaught thread exception
Traceback (most recent call last):
File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.10/site-packages/plugwise/controller.py", line 329, in _receive_timeout_loop
if self.expected_responses[seq_id][3] is not None:
KeyError: b'02EF'
2022-11-02 14:39:06.904 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:39:07.042 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:39:31.037 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:39:31.174 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:39:31.333 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:39:31.471 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:39:48.063 WARNING (watchdog_thread) [plugwise.controller] Unexpected halt of receive thread, restart thread
2022-11-02 14:39:50.175 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:39:59.336 WARNING (MainThread) [homeassistant.helpers.template_entity] Template loop detected while processing event: <Event state_changed[L]: entity_id=sensor.all_effekt, old_state=<state sensor.all_effekt=293.34; unit_of_measurement=W, device_class=power, friendly_name=All effekt @ 2022-11-02T14:39:57.312549+01:00>, new_state=<state sensor.all_effekt=291.19; unit_of_measurement=W, device_class=power, friendly_name=All effekt @ 2022-11-02T14:39:58.326518+01:00>>, skipping template render for Template[{% set ns = namespace(states=[]) %} {% for s in states.sensor %}
{% if s.object_id.startswith('power_usage_') %}
{% set ns.states = ns.states + [ s.state | float ] %}
{% endif %}
{% endfor %} {{ ns.states | sum | round(2) }}]
2022-11-02 14:40:21.082 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:40:21.684 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F000038C184 because max retries 3 reached
2022-11-02 14:40:23.334 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:40:23.492 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:40:33.137 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached
2022-11-02 14:40:33.143 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768C91 because max retries 3 reached
2022-11-02 14:40:33.146 WARNING (receive_timeout_thread) [plugwise.controller] Drop NodePingRequest to 000D6F0000768CAD because max retries 3 reached
2022-11-02 14:40:45.802 WARNING (serial_reader_thread) [plugwise.controller] Drop NodePingRequest to 000D6F00004BF6F1 because max retries 3 reached

@MagnusThome
Copy link
Author

Ok, so your Pi OS crashes?

Sort of half crashes I guess since it still has the ports open but they are sort of dead anyway.

With which method have you installed HA on your Pi?

Used the downloadable image for the Pi and made the USB sticks with it

Also, have you searched for "Pi crashes" or similar in https://community.home-assistant.io/?

Ah, will do!

@bouwew
Copy link
Contributor

bouwew commented Nov 2, 2022

@MagnusThome
Copy link
Author

Very similar to you case? https://community.home-assistant.io/t/unexpected-crashes-how-to-find-the-cause/361822

Ok, worth a try to add a fan. Will give it a go :-)

@MagnusThome
Copy link
Author

I found HA isn't crashing, its under super heavy load, these are the two HA I run and what it looks like when they don't reply at all and I have to power cycle them. Is there any way I can check if these loads are from the plugwise addon or if its something else?

image

image

@bouwew
Copy link
Contributor

bouwew commented Nov 22, 2022

There is clearly a memory-leak. You can disable the plugwise (Stick) integration and then monitor your system again.

I must say, I use the Stick integration myself, next to an Adam and P1 integration, I never have any problems with the memory getting full or similar.
I'm running HA in a virtual machine under Proxmox.

@MagnusThome
Copy link
Author

So far no memory peaks like that after disabling. But I'll let it run a couple of days more.

@MagnusThome
Copy link
Author

Interesting, it's way more responsive and fast in the web interface also. It's a Rpi4 .

@MagnusThome
Copy link
Author

MagnusThome commented Nov 24, 2022

Sorry to say but HA runs flawlessly with Plugwise beta disabled.

Something with using an USB stick instead of the other options I guess then 😟

Or could it be something with old data stats still in the circles? They have all been reset though.

@bouwew
Copy link
Contributor

bouwew commented Nov 25, 2022

@brefra any idea about the above data & conclusion?

@MagnusThome
Copy link
Author

MagnusThome commented Dec 1, 2022

Some logs with/without plugwise beta running with USB stick

Screenshot 2022-12-01 164717

@MagnusThome
Copy link
Author

What can I do to help pinpoint the issue?

@herwinjan
Copy link

herwinjan commented Jan 13, 2023

Hi,

There is a memory leak.. but where, if i enable the plugwise integration, my systemen (i5 NUC ESXi with 8 GB memory assigned) if full in 1 to 2 days.

CleanShot 2023-01-13 at 08 25 39@2x

See the graph with Plugwise enabled and disabled.

My system:

  • Nuc 8 i5 with 32 GB memory, 8 assigned to Home Assistant
  • HASOS
  • latest versions
  • Plugwise Stick with USB connected to the VM
  • 18 circles
  • Debugging turned off (to prevent lots of writes in the log files)

Please let me know how to help debugging!

@MagnusThome
Copy link
Author

I know you've spent time on this and I'm very grateful. Is there anything I can do to help?

I noticed this odd thing with both a reoccuring same growing rate of memory usage but also a suddenly much faster rise in memory usage. Surprised me a bit

image

@MagnusThome
Copy link
Author

Very different behaviour lately with latest updates
Screenshot_20230405_222052

@MagnusThome
Copy link
Author

MagnusThome commented Apr 13, 2023

@brefra @bouwew

An update of my logging :-)

I've marked with red arrows where homeassistant totally stopped responding and I had to power cycle it to get it going again.

Then I disabled plugwise beta and that is where the spikes totally stop coming:

Untitled-1

@MagnusThome
Copy link
Author

Screenshot_20230414_154126_com android chrome

@MagnusThome
Copy link
Author

Screenshot_20230418_195800_com android chrome

@CoMPaTech CoMPaTech changed the title Anyone else using USB Stick and have Rpi crashes? [Question] Anyone else using USB Stick and have Rpi crashes? Apr 19, 2023
@CoMPaTech CoMPaTech transferred this issue from plugwise/plugwise-beta Apr 19, 2023
@MagnusThome MagnusThome changed the title [Question] Anyone else using USB Stick and have Rpi crashes? Problem using Plugwise USB Stick: Rpi hangs due to running out of memory (mem leak) Apr 24, 2023
@MagnusThome MagnusThome changed the title Problem using Plugwise USB Stick: Rpi hangs due to running out of memory (mem leak) Problem using Plugwise USB Stick: Rpi hangs due to running out of memory (mem leak?) Apr 24, 2023
@MagnusThome MagnusThome changed the title Problem using Plugwise USB Stick: Rpi hangs due to running out of memory (mem leak?) [bug] Problem using Plugwise USB Stick: Rpi hangs due to running out of memory Apr 24, 2023
@MagnusThome MagnusThome changed the title [bug] Problem using Plugwise USB Stick: Rpi hangs due to running out of memory [BUG] Problem using Plugwise USB Stick: Rpi hangs due to running out of memory Apr 24, 2023
@dirixmjm
Copy link
Contributor

@MagnusThome can you please try with release 0.31.1b1?
see #110

@MagnusThome
Copy link
Author

@MagnusThome can you please try with release 0.31.1b1? see #110

Sorry can't, I've sold off all Plugwise gear.

@dirixmjm
Copy link
Contributor

@MagnusThome ok I understand.
@CoMPaTech can you please close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants