High CPU usage with docker #6612

Closed
2 of 3 tasks
Giel538 opened this issue Jan 11, 2024 · 32 comments · Fixed by #6640
Labels
bug Something isn't working

Comments

@Giel538

Giel538 commented Jan 11, 2024

Checklist

  • I am not using Home Assistant. Or: a developer has told me to come here.
  • I have checked the troubleshooting section and my problem is not described there.
  • I have read the changelog and my problem is not mentioned there.

Deploy method

Docker

Z-Wave JS UI version

9.6.2.6e369a1

ZwaveJS version

12.4.1

Describe the bug

I am running all my smart home software inside a Debian virtual machine. For months it used a steady CPU load of almost 2%. Since the update of Z-Wave JS UI to 9.6.0, CPU usage has increased to 3.1%. I know that is still low in absolute terms, but in relative terms it is still almost 35%! The VM uses 2 cores of my Intel i5 13500 CPU. People running a Raspberry Pi or similar systems may have bigger problems with the increased CPU usage.

ZwaveJS uses the most CPU of all my Docker containers, even more than Home Assistant or Zigbee2mqtt (which has a lot more devices).

I checked the changelog, but I don't know what changed that could cause higher CPU usage.
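
For reference, the two percentages can be related with a little arithmetic. The sketch below uses the rounded figures from this report (2% before, 3.1% after); the function names are made up for illustration. Measured against the old baseline the increase is about 55%, while a figure close to 35% results when the increase is measured against the new total.

```javascript
// Increase relative to the old value (the usual "percent increase").
function relativeChangePercent(before, after) {
  return ((after - before) / before) * 100;
}

// Increase expressed as a share of the new value.
function shareOfNewTotalPercent(before, after) {
  return ((after - before) / after) * 100;
}

// Rounded figures from the report: 2% before the update, 3.1% after.
console.log(relativeChangePercent(2, 3.1).toFixed(1));  // "55.0"
console.log(shareOfNewTotalPercent(2, 3.1).toFixed(1)); // "35.5"
```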

To Reproduce

I restarted the container a few times, but it stays the same. When I stop the container, CPU usage goes back to 2%.

Expected behavior

After a "small" update, I would not expect CPU usage to change this much.

Additional context

I am using an AEON Labs | Z‐Stick Gen5 USB controller, but it is connected via ser2net through another computer.

This is my docker stack:

services:
  zwavejs2mqtt:
    container_name: zwavejs2mqtt
    image: zwavejs/zwavejs2mqtt:latest
    restart: unless-stopped
    tty: true
    stop_signal: SIGINT
    environment:
#      - SESSION_SECRET=mysupersecretkey
      - TZ=Europe/Amsterdam
#      - ZWAVEJS_EXTERNAL_CONFIG=/usr/src/app/store/.config-db      
#    devices:
#      - /dev/serial/by-id/usb-0658_0200-if00:/dev/zwave
    volumes:
      - /home/xxxxx/data/zwavejs2mqtt/store:/usr/src/app/store
    ports:
      - '8091:8091' # port for web interface
      - '3000:3000' # port for Z-Wave JS websocket server
@Giel538 Giel538 added the bug Something isn't working label Jan 11, 2024
@robertsLando
Member

robertsLando commented Jan 12, 2024

@Giel538 Looking at the changes on my side, most are related to the build, so nothing that could cause such a CPU increase. Are you able to identify the exact version that causes this? I mean, did you switch from 5.1.0 to 9.6.2? Could you step through the versions one at a time and tell me the exact version that introduced this issue?

cc @AlCalzone

@AlCalzone
Member

Also driver logs on level debug could be helpful.

@t-o-o-m

t-o-o-m commented Jan 13, 2024

I am running HAOS in a VM as well; also my controller is an AEON Labs | Z‐Stick Gen5 USB as well, but I connected it to the host machine and pass it through to the VM.

zwavejs2 is also leading my docker stats by far (docker stats --no-stream --format "table {{.Name}}\t{{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}" | sort -k 3 -h -r) with a usage of 3 - 11%. The VM is granted two cores of an ancient i5 6260u.

7 devices are in my network, with one of them reporting every few seconds and quite some data to unpack (a metering plug - Node 7 in my attached logs).

Happy to provide anything that can help narrow this down!

EDIT: Attached driver logs on level debug
zwave-js-ui-store.zip

This might be related to zwave-js/zwave-js-ui#3323

@Giel538
Author

Giel538 commented Jan 14, 2024

Hmm, I feel a bit stupid... Downgrading to older versions did not seem to help.

The post from @t-o-o-m got me thinking.

It seems that on the exact same date I created an automation in Home Assistant that forces my Z-Wave wall plug to send an update of the actual power usage every 2 seconds. I use zwave_js.refresh_value for this action. I needed a fast update cycle because the power usage fluctuates a lot within a narrow range (between 7 and 12 W).

@t-o-o-m

t-o-o-m commented Jan 14, 2024

Could be related to our power metering devices, yes.
If I could, I would configure mine to send far fewer values, but so far I could not find an option to do so.

But just to put things into perspective: my zigbee network has 23 nodes, not 7, and a few of those have ridiculous reporting intervals as well. Zigbee2mqtt hovers around 0.01% usage with peaks at 0.05%, though.
So I still think this would be worth looking into.

@AlCalzone
Member

Anyone with this problem able to capture a Node.js CPU profile?
That would help narrow down where the load comes from.

@robertsLando
Member

It seems like at the exact same date I created an automation in Home Assistant to force my Z-Wave wall plug to send an update of the actual power usage every 2 seconds

I have a feeling this could be the reason, but only a CPU profile can show what's going on there...

@t-o-o-m

t-o-o-m commented Jan 15, 2024

I tried to stop the add-on and call the command mentioned here: docker run ghcr.io/hassio-addons/zwave-js-ui/amd64:3.1.0 /bin/sh -c 'node --prof-process --preprocess $(ls -1t isolate* | head -n1)' > processed.txt, but I guess this lacks some environment vars / device passthrough / volume mount that is usually taken care of by the supervisor.

Should I assemble a docker compose file taking care of that? Would that work together with the supervisor?

@robertsLando
Member

robertsLando commented Jan 15, 2024

Can't speak for the add-on side, but you should launch the Node.js process with --inspect. To do this with Docker you could try:

docker run -p 9229:9229 zwavejs/zwave-js-ui:latest node --inspect=0.0.0.0:9229 server/bin/www

@t-o-o-m

t-o-o-m commented Jan 15, 2024

Created a separate VM and passed my ZWave stick through. Profiled it for a while - attached the processed tick file.

Can also run any command and use the inspect port if you need me to.
processed.txt

@Giel538
Author

Giel538 commented Jan 15, 2024

Can't speak for the add-on side, but you should launch the Node.js process with --inspect. To do this with Docker you could try:

docker run -p 9229:9229 zwavejs/zwave-js-ui:latest node --inspect=0.0.0.0:9229 server/bin/www

If I run this command it stops at this line:

2024-01-15 18:29:54.714 WARN Z-WAVE: Z-Wave driver not inited, no port configured

How can I attach a ser2net device with docker run? Or is this not possible? Otherwise I have to move my stick to the right computer temporarily.

@t-o-o-m

t-o-o-m commented Jan 15, 2024

I guess you also have to publish port 8091 (-p 8091:8091), visit this port in your browser, and set the serial port in the UI to tcp://ip:port of your ser2net server, as you probably did during the initial configuration of your current setup.

Or it's probably sufficient to mount your current config into the container, as you did in the docker compose config you posted: -v /home/xxxxx/data/zwavejs2mqtt/store:/usr/src/app/store. This hopefully contains the serial port setting as well.

@robertsLando
Member

@Giel538 Yeah, you should mount the volume, or just edit your existing docker-compose file and add command: node --inspect=0.0.0.0:9229 server/bin/www

@t-o-o-m

t-o-o-m commented Jan 16, 2024

@robertsLando @AlCalzone: did you take a look at the CPU profile posted above? Can you deduce anything from it? It looks quite similar to the one provided in zwave-js/zwave-js-ui#3323

Happy to provide other things - also whatever stuff is possible through the inspect port.

@robertsLando
Member

I checked it, but I can't find anything that points to the Z-UI side. I see many entries from node_modules/@zwave-js, so it is likely something on the driver side...

@AlCalzone just throwing some ideas, could it be the jsonl db?

@t-o-o-m just to rule out a possible cause, could you disable logging to file and see whether the CPU usage stays high?

@t-o-o-m

t-o-o-m commented Jan 16, 2024

Completely disabled logs - tick analysis: processed_no_logs.txt

@robertsLando
Member

@t-o-o-m Do you still see high usage? I'll check the logs now.

@t-o-o-m

t-o-o-m commented Jan 16, 2024

@robertsLando Yes, still high.
I'd love to test a different Z-Wave stick as it might be actually connected to the driver, but unfortunately the Zooz sticks are currently not really available here.

Edit: disconnected the Z-Wave stick - the CPU usage immediately dropped to ~0%. So I guess this excludes all kinds of infinite loops :)

@AlCalzone
Member

just throwing some ideas, could it be the jsonl db?

While this does loop quite a bit, most of the time should be spent sleeping. And it's only 50 of ~18000 total ticks.

@t-o-o-m you could try editing /usr/src/app/node_modules/@alcalzone/jsonl-db/build/lib/db.js line 592 for a test:

-         const sleepDuration = 20; // ms
+         const sleepDuration = 250; // ms

for example.
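
To illustrate why the sleep duration matters for an idle process: a loop that sleeps for a fixed time and then wakes up to check for work wakes 1000 / sleepDuration times per second, whether or not there is anything to do. The following is a generic simulation on a virtual clock, not the jsonl-db source; `simulateIdlePolling` and its parameters are hypothetical names.

```javascript
// Counts how often a fixed-interval polling loop wakes up while idle.
// Uses a virtual clock, so it returns immediately instead of really sleeping.
function simulateIdlePolling(sleepDurationMs, totalMs) {
  if (!(sleepDurationMs > 0)) {
    throw new RangeError("sleepDurationMs must be positive");
  }
  let now = 0;
  let wakeups = 0;
  while (now + sleepDurationMs <= totalMs) {
    now += sleepDurationMs; // "sleep" for one interval
    wakeups++;              // wake up, find nothing to do, go back to sleep
  }
  return wakeups;
}

// One idle minute with each of the two values from the diff above.
console.log(simulateIdlePolling(20, 60000));  // 3000 wakeups
console.log(simulateIdlePolling(250, 60000)); // 240 wakeups
```

Each individual wakeup is cheap, which would fit a profile where the function itself accounts for few ticks, but the timer and event-loop overhead around every wakeup still adds up when nothing else is running.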

@robertsLando
Member

disconnected the Z-Wave stick - the CPU usage immediately dropped to ~0%. So I guess this excludes all kinds of infinite loops :)

Humm this is interesting, try making the change @AlCalzone suggested if you can

@AlCalzone
Member

AlCalzone commented Jan 17, 2024

Disconnecting the stick destroys the driver. So it should exclude the UI from the list of potential reasons.

@AlCalzone
Member

Have we considered serialport yet?
serialport/node-serialport#2659

@robertsLando
Member

We didn't check that in the other issue, but it could be! Maybe I could try creating a test container running on Node 20.2?

@robertsLando
Member

@t-o-o-m

t-o-o-m commented Jan 17, 2024

just throwing some ideas, could it be the jsonl db?

While this does loop quite a bit, most of the time should be spent sleeping. And it's only 50 of ~18000 total ticks.

@t-o-o-m you could try editing /usr/src/app/node_modules/@alcalzone/jsonl-db/build/lib/db.js line 592 for a test:

-         const sleepDuration = 20; // ms
+         const sleepDuration = 250; // ms

for example.

Tried this now - this brings the CPU usage down to the realm of 0.xx%; at least using the standalone container without HA 🎉.

Would love to try this within HA as well!

Edit: also seems to work with the HA community addon. Just had to edit /opt/node_modules/@alcalzone/jsonl-db/build/lib/db.js instead (different path). Killed the node process which immediately resurrected with much less CPU usage.

Thanks for sticking with me on this! It was never really a big issue but I kinda am hunting down any Watt I can save :D

@AlCalzone
Member

Huh, guess I'll have to rework that function then.

@robertsLando
Member

Huh, seems we found it then. @AlCalzone, do you want me to submit a PR? I think it may be worth a patch release of zwave-js before your return :)

@AlCalzone
Member

I'm not sure I like the solution we tried here. Might have to do something a little more complicated.
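
One common alternative to a fixed sleep is to let the loop wait until it is explicitly notified that work has arrived, so an idle process does not wake at all. The sketch below is only a generic illustration of that pattern; it is not taken from jsonl-db or from the eventual fix, and `WorkSignal`, `runWorker`, and `demo` are hypothetical names.

```javascript
// A one-slot signal: wait() resolves once notify() has been called.
class WorkSignal {
  constructor() {
    this.pending = false; // notify() was called while nobody was waiting
    this.wake = null;     // resolver of the currently waiting consumer
  }
  notify() {
    this.pending = true;
    if (this.wake) {
      const wake = this.wake;
      this.wake = null;
      wake();
    }
  }
  wait() {
    if (this.pending) {
      this.pending = false;
      return Promise.resolve();
    }
    return new Promise((resolve) => {
      this.wake = () => {
        this.pending = false;
        resolve();
      };
    });
  }
}

// Drains the queue whenever it is notified; returns how often it woke up.
async function runWorker(signal, queue, handled, shouldStop) {
  let wakeups = 0;
  while (true) {
    await signal.wait(); // no timer: nothing runs while idle
    wakeups++;
    while (queue.length > 0) handled.push(queue.shift());
    if (shouldStop()) return wakeups;
  }
}

async function demo() {
  const signal = new WorkSignal();
  const queue = [];
  const handled = [];
  let stop = false;
  const worker = runWorker(signal, queue, handled, () => stop);

  queue.push("a");
  signal.notify();
  await new Promise((resolve) => setTimeout(resolve, 50)); // idle period
  queue.push("b");
  stop = true;
  signal.notify();

  return { handled, wakeups: await worker };
}

demo().then((result) => console.log(result));
```

In this demo the worker wakes exactly twice, once per notification, and not at all during the idle period. The trade-off is that every producer has to remember to call `notify()`.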

@robertsLando robertsLando changed the title from "CPU usage increase since update to 9.6.0 (docker)" to "High CPU usage with docker" Jan 18, 2024
@AlCalzone
Member

@robertsLando Can you transfer to node-zwave-js please?

@robertsLando robertsLando transferred this issue from zwave-js/zwave-js-ui Jan 18, 2024
@Giel538
Author

Giel538 commented Jan 18, 2024

just throwing some ideas, could it be the jsonl db?

While this does loop quite a bit, most of the time should be spent sleeping. And it's only 50 of ~18000 total ticks.

@t-o-o-m you could try editing /usr/src/app/node_modules/@alcalzone/jsonl-db/build/lib/db.js line 592 for a test:

-         const sleepDuration = 20; // ms
+         const sleepDuration = 250; // ms

for example.

Hi.

I can also confirm that this is working for me as well.

@robertsLando
Member

Hi guys, could you give the master tag a try?

@t-o-o-m

t-o-o-m commented Jan 25, 2024

Can confirm low idle CPU usage on the master branch. Great work @AlCalzone @robertsLando!
