
ADS-B Feeder | High CPU Usage on Home Assistant OS 10 #151

Closed
MaxWinterstein opened this issue Apr 21, 2023 · 41 comments

Comments

@MaxWinterstein
Owner

Is anybody else experiencing unusually high CPU usage with this addon since OS v10? It usually sat at ~3%, and now it idles at 20% with the addon enabled.

Edit:
[Screenshot from 2023-04-21 10:50]

Originally posted by @maweki in #149 (comment)

@therealhalifax

Same issue for me. First I updated the addon to x.3, and afterwards I updated the OS to 10.0. The CPU load increased from the usual 3% to 26%. If I restart the addon, the CPU load goes down to 1-2% for about 30 s and then switches back to 26%.

@MaxWinterstein
Owner Author

Out of curiosity, does this also happen when disabling the http server?

@MaxWinterstein changed the title from "High CPU Usage on Home Assistant OS 10" to "ADS-B Feeder | High CPU Usage on Home Assistant OS 10" on Apr 21, 2023
@maweki

maweki commented Apr 21, 2023

Out of curiosity, does this also happen when disabling the http server?

The FR24 feeder service seems to be the issue. Deactivating that drops the CPU usage.

@mrkaqz

mrkaqz commented Apr 21, 2023

Confirmed, I'm having the same issue, with 27% CPU consumption shown on the add-on page.

@Tntdruid

My HA looks like this with the addon running:
firefox_2023-04-21_15-31-17

From a Proxmox VM.

@lopeti

lopeti commented Apr 21, 2023

Same issue for me.

@mrkaqz

mrkaqz commented Apr 22, 2023

Out of curiosity, does this also happen when disabling the http server?

The FR24 feeder service seems to be the issue. Deactivating that drops the CPU usage.

Yes, after disabling FR24 Feed the CPU came back to normal. It seems like that's what's causing the issue.

@chiefcomm

Any progress on this issue? Higher than normal CPU isn't a concern by itself, but with higher CPU comes higher CPU temperatures, which do concern me somewhat. Nothing unmanageable yet, and for me it's currently the cooler months, but I'd prefer to have some idea of when we could see a fix :-(

image

@mrkaqz

mrkaqz commented Apr 27, 2023

Any progress on this issue? Higher than normal CPU isn't a concern by itself, but with higher CPU comes higher CPU temperatures, which do concern me somewhat. Nothing unmanageable yet, and for me it's currently the cooler months, but I'd prefer to have some idea of when we could see a fix :-(

image

I am waiting for a fix as well. I'm not a programmer, so I can only help test, not fix. I have disabled FR24feed for the moment while waiting for a fix. If it takes longer I will have to fire up another VM just to run it outside of HA, but if anyone can help fix this issue I will be very happy :)

@mrkaqz

mrkaqz commented Apr 28, 2023

With version [1.21.0] - 2023-04-27, updated this morning, the high CPU issue is still not fixed.

image

@MaxWinterstein
Owner Author

Yeah, the update was not really meant to fix the CPU issue.

For the moment I am not sure what is happening there. FR24feed was updated along with v1.19.0 - that was a few weeks ago. I hope to reproduce this by playing with the ulimit defaults that changed with Docker v23, but I need to find some time for that.
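For reference, a quick way to see what open-file limit containers actually receive on a given host is to check from inside a throwaway container. This is only a generic sketch (it assumes shell access to the Docker host), not something specific to the add-on:

# Print the soft limit and the full "open files" line a fresh container gets on this host.
# Useful for comparing a host before and after the Docker v23 update mentioned above.
docker run --rm alpine sh -c 'ulimit -n; grep "open files" /proc/self/limits'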

@Tntdruid

The latest update crashed my HA with an OOM, I had to roll back to 1.20.3.

@martkopecky

Any updates here, please? I have had to disable the addon while the issue lasts because the CPU temperature is rising uncomfortably :(

@mrkaqz

mrkaqz commented May 13, 2023

I am waiting for a fix as well :)

@SenMorgan

Me too, waiting for a fix.

@MaxWinterstein
Owner Author

I just updated my OrangePi5 to Docker v23, but I am having a hard time replicating this.

Running with FR24:

image

Running without:

image

@mrkaqz What sensor are you looking at? Where is the information from?

@mrkaqz

mrkaqz commented May 13, 2023

I can see it from the add-on info page.

Here it is with FR24 disabled - the CPU usage is about 1%.

Screenshot_20230513-162658

Then once FR24 is enabled, the usage goes up to 30%.

Screenshot_20230513-162745

@Delta1977
Contributor

This is my Intel i7 HA machine when the feeder runs. I can hear the fan at 100%:
image

@maweki

maweki commented May 15, 2023

Logging into the ha-os container with docker exec, I find that the fr24feed binary is causing huge CPU spikes.

[screenshot]

I started the feeder service under strace, and the high CPU usage starts when this happens:

ugetrlimit(RLIMIT_NOFILE, {rlim_cur=1073741816, rlim_max=1073741816}) = 0
fcntl64(0, F_GETFD)                     = 0
fcntl64(1, F_GETFD)                     = 0
fcntl64(2, F_GETFD)                     = 0
fcntl64(3, F_GETFD)                     = 0
fcntl64(4, F_GETFD)                     = 0
fcntl64(5, F_GETFD)                     = 0
fcntl64(6, F_GETFD)                     = 0
fcntl64(7, F_GETFD)                     = 0
fcntl64(8, F_GETFD)                     = 0
fcntl64(9, F_GETFD)                     = -1 EBADF (Bad file descriptor)
fcntl64(10, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(11, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(12, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(13, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(14, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(15, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(16, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(17, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(18, F_GETFD)                    = -1 EBADF (Bad file descriptor)
fcntl64(19, F_GETFD)                    = -1 EBADF (Bad file descriptor)
...

which looks to me like it is iterating over all the file descriptors, but I'm not a systems programmer. Under strace it takes so long that I've not reached the end yet. Without strace the load drops after some time and then, after a few seconds, it picks up again (I guess the process is being started anew).

If the feeder goes through all possible file descriptors, then maybe setting a low limit might help.
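(For anyone who wants to check which limit the processes inside the add-on container actually see, here is a small sketch. It assumes a shell inside the container and that pidof is available; the binary name is taken from the run script quoted further down:)

# Soft open-file limit of the current shell inside the container
ulimit -n
# Limits of the already-running feeder process
grep "open files" /proc/$(pidof fr24feed)/limits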

@maweki

maweki commented May 15, 2023

Okay, starting with ( ulimit -n 1024 && fr24feed ) works, as it only iterates up to 1024.

Therefore I propose the following "fix":

in /etc/s6-overlay/s6-rc.d/fr24feed/run (or wherever that actually is in all this container stuff)

change

if [ "$SERVICE_ENABLE_FR24FEED" != "false" ]; then
	set -eo pipefail
	/fr24feed/fr24feed/fr24feed 2>&1 | mawk -W interactive '{printf "%c[34m[fr24feed]%c[0m %s\n", 27, 27, $0}'
	# awk -W interactive ...  (prefix log messages with color and "[fr24feed]")
else
	tail -f /dev/null
fi

to

if [ "$SERVICE_ENABLE_FR24FEED" != "false" ]; then
	set -eo pipefail
	( ulimit -n 1024 && /fr24feed/fr24feed/fr24feed ) 2>&1 | mawk -W interactive '{printf "%c[34m[fr24feed]%c[0m %s\n", 27, 27, $0}'
	# awk -W interactive ...  (prefix log messages with color and "[fr24feed]")
else
	tail -f /dev/null
fi

@mblauth

mblauth commented May 15, 2023

Very interesting. Did you try setting the ulimits via docker/podman (--ulimit option)? Maybe this would also do the trick without having to modify the scripts in the container.
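(For illustration, this is the flag being referred to; the image name below is only a placeholder:)

# docker and podman share the same syntax: --ulimit nofile=<soft>:<hard>
docker run --ulimit nofile=1024:1024 some/fr24feed-image
podman run --ulimit nofile=1024:1024 some/fr24feed-image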

@maweki

maweki commented May 15, 2023

I am not sure I can do that within the Home Assistant appliance.

The thing is that the feeder binary seems to iterate over all possible file descriptors (for whatever reason). So imposing the limit from outside should work as well, BUT it would then be imposed on everything within the container (the webserver, the other feeder, etc.). Then 1024 might be too low a number.

@mblauth

mblauth commented May 15, 2023

That makes perfect sense. I came from the sdr-enthusiasts/docker-flightradar24 project, where you linked your post; I am running the fr24 stuff in a separate container, hence my suggestion. I didn't realize it didn't fit the context of this project.

@Delta1977
Contributor

In the newest image, Thom-x exposes the ulimit in the Dockerfile. Maybe that makes it easier to configure the parameter in HA:

Thom-x/docker-fr24feed-piaware-dump1090@36856fe

@maweki

maweki commented May 15, 2023

I wouldn't know where to start guessing a limit that is both easy on the CPU (the lower the better, because every descriptor is iterated over) and large enough for all the services included in the addon.

Seeing that 1024 seemed to be enough for fr24feed, the million-odd value from the Dockerfile could already be overly large.

It's CPU time and energy we could be cumulatively wasting here.

@mrkaqz

mrkaqz commented May 18, 2023

In the newest image, Thom-x exposes the ulimit in the Dockerfile. Maybe that makes it easier to configure the parameter in HA:

Thom-x/docker-fr24feed-piaware-dump1090@36856fe

@MaxWinterstein Could you look at this please? The container is updated to version 1.23 now.
I'm not sure whether just updating the version will fix the issue or whether some other parameter needs to be set as well.

Fingers crossed.

Thanks

@Delta1977
Contributor

@MaxWinterstein Hey Max, are you still actively taking care of this repository? What is your plan for this issue? In my opinion, only the ENV needs to be set.

@MaxWinterstein
Owner Author

Sadly this is not that easy / I don't fully understand it yet.

The mentioned upstream commits only take care of this for the thttp server component, not the fr24 feeder. Also, I am still unsure what values might be good.

Simply overwriting the original files is not good style, as all upstream changes that follow would need to be adjusted as well.

I will try to find a flexible way of patching this and make the value configurable; then we can figure out good values.

@Delta1977
Contributor

Delta1977 commented Jun 5, 2023

Thanks for your feedback.
As I understand it, the ulimit is system-wide and not only for the one process.

https://man7.org/linux/man-pages/man3/ulimit.3.html

Maybe you can expose the ENV in the addon settings and default it to -1 so everybody can override it.

That way we would have a quick workaround and could see if it helps.
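(A rough sketch of how the run script could honour such a setting; the variable name and the "-1 means leave it unchanged" convention are only illustrations of this proposal, not an actual implementation:)

# Hypothetical: FR24FEED_ULIMIT_N would come from the add-on options; -1 means "do not touch".
if [ "${FR24FEED_ULIMIT_N:--1}" != "-1" ]; then
	ulimit -n "$FR24FEED_ULIMIT_N"
fi
/fr24feed/fr24feed/fr24feed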

@MaxWinterstein
Owner Author

It's kind of ugly that Thom-x sells this as a system setting, as in fact it only targets the thttp server process.

Also, I am confused why there are no issues with the fr24 feed in his repository - this should be an issue for him as well?!

@Delta1977
Contributor

Sorry, you are right. He sets the ulimit right before the thttp start, so all other processes started before it keep running with the unlimited value.

@maweki

maweki commented Jun 5, 2023

Simply overwriting the original files is not good style

It's not too uncommon to package/provide a custom run script. My proposal was not very invasive.

@MaxWinterstein
Owner Author

It's not too uncommon to package/provide a custom run script. My proposal was not very invasive.

Sure, don't get me wrong. It is just another thing on the list of things I have to remember to double-check whenever the upstream releases. And I am lazy :)


I just released an update that now contains two new settings:

  • ULIMIT_N - used by the thttp server - defaults to the previous default value
  • FR24FEED_ULIMIT_N - used by FR24Feed; I just rolled the dice and chose 1024

This should allow everyone who is having issues to test.

Feedback is highly welcomed ❤️
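(For anyone testing: one way to verify on the host that the new limit actually reached the feeder process. The container name below is just a placeholder, and it assumes pidof exists inside the add-on image:)

# Find the add-on container name first
docker ps --format '{{.Names}}'
# Then check the limit the running feeder process got
docker exec <addon-container-name> sh -c 'grep "open files" /proc/$(pidof fr24feed)/limits'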

@CoMPaTech

CoMPaTech commented Jun 5, 2023

Installing as we speak/type, initially looks very promising:
proxmox_1 23
(using the defaults)

After some minutes, the same graph but from the PVE hypervisor rather than the VM (the one above is from the VM):
proxmox_1 23-pve-stats

Seems to hold fine about 25 minutes in ... will report back tomorrow.

@mrkaqz

mrkaqz commented Jun 6, 2023 via email

@chiefcomm

Lovely, just lovely - both CPU usage and CPU temp dropped back to normal with this update :-)

image
image

@yousaf465

image

image

There is a clear difference now, even though I hadn't noticed an issue before.

@Delta1977
Contributor

Thanks for the patch. CPU went down from 34% to 3%. I set both ulimits to 1024, which IIRC was the default in Docker < v23 (HassOS < v10).

@therealhalifax

Perfect result even with the default settings 🤗👍

@MaxWinterstein
Owner Author

Super happy to see that this improved the CPU issue 🥳

But I'm still confused why the upstream (thom-x repo) doesn't seem to have this issue 🤔

@MaxWinterstein
Owner Author

Alright, thanks for the ride everyone, it seems we have a solution figured out.

Thanks ❤️
