Reduce frequency of Reaper and avoid copying the cache when possible #297
Decreasing the reaper frequency didn't help that much on its own. It is the cache copy that kills the performance. Since it's a rare exception, it might make sense to trap it and only do the copy if we hit the exception.
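A minimal sketch of the "trap it and only copy on the exception" idea, assuming the cache behaves like a plain dict mapping a key to an expiry timestamp. The helper name `expired_keys` and the cache layout are hypothetical, not zeroconf's actual API; the only thing relied on is that Python raises `RuntimeError` when a dict changes size during iteration.

```python
from typing import Dict, List


def expired_keys(cache: Dict[str, float], now: float) -> List[str]:
    """Return keys whose expiry timestamp has passed.

    Hypothetical helper: iterates the live dict first and copies it only
    if another thread resized it mid-iteration.
    """
    try:
        # Fast path: no copy of the cache is made.
        return [key for key, expires_at in cache.items() if expires_at <= now]
    except RuntimeError:
        # "dictionary changed size during iteration" is rare, so the cost
        # of snapshotting the items is paid only in this case.
        snapshot = list(cache.items())
        return [key for key, expires_at in snapshot if expires_at <= now]
```

The common case then costs one pass over the cache with no allocation for a copy, and the fallback behaves the same as the old always-copy approach.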
At this point I'm happy with this as it equates to a significant decrease in CPU usage. I'm going to update everything I have in production with it and run it for 24 hours to make sure it doesn't have any unexpected side effects.
For completeness' sake:
LGTM otherwise.
I'll try setting it back to 10s and re-run all the profiles. If it still looks good, I'll adjust here and add the top 10 after.
The cache reaper was running at least every 10 seconds, making a copy of the cache, and iterating over all the entries to check if they were expired so they could be removed.

In practice the reaper was actually running much more frequently because it used self.zc.wait, which would unblock any time a record was updated, a listener was added, or a listener was removed.

This change ensures the reaper runs only once every 10s, and that it first attempts to iterate the cache before falling back to making a copy.

Previously it made sense to expire the cache more frequently because we had places where we frequently had to enumerate all the cache entries. With python-zeroconf#247 and python-zeroconf#232 we no longer have to account for this concern.

On a mostly idle RPi running Home Assistant on a busy network, the total time spent reaping the cache was more than the total time spent processing the mDNS traffic.

Top 10 functions, idle RPi (before)

%Own    %Total  OwnTime  TotalTime  Function (filename:line)
0.00%   0.00%   2.69s    2.69s      handle_read (zeroconf/__init__.py:1367)  <== Incoming mDNS
0.00%   0.00%   1.51s    2.98s      run (zeroconf/__init__.py:1431)  <== Reaper
0.00%   0.00%   1.42s    1.42s      is_expired (zeroconf/__init__.py:502)  <== Reaper
0.00%   0.00%   1.12s    1.12s      entries (zeroconf/__init__.py:1274)  <== Reaper
0.00%   0.00%   0.620s   0.620s     do_execute (sqlalchemy/engine/default.py:593)
0.00%   0.00%   0.620s   0.620s     read_utf (zeroconf/__init__.py:837)
0.00%   0.00%   0.610s   0.610s     do_commit (sqlalchemy/engine/default.py:546)
0.00%   0.00%   0.540s   1.16s      read_name (zeroconf/__init__.py:853)
0.00%   0.00%   0.380s   0.380s     do_close (sqlalchemy/engine/default.py:549)
0.00%   0.00%   0.340s   0.340s     write (asyncio/selector_events.py:908)

After this change, the Reaper code paths do not show up in the top 10 function sample.

%Own    %Total  OwnTime  TotalTime  Function (filename:line)
4.00%   4.00%   2.72s    2.72s      handle_read (zeroconf/__init__.py:1378)  <== Incoming mDNS
4.00%   4.00%   1.81s    1.81s      read_utf (zeroconf/__init__.py:837)
1.00%   5.00%   1.68s    3.51s      read_name (zeroconf/__init__.py:853)
0.00%   0.00%   1.32s    1.32s      do_execute (sqlalchemy/engine/default.py:593)
0.00%   0.00%   0.960s   0.960s     readinto (socket.py:669)
0.00%   0.00%   0.950s   0.950s     create_connection (urllib3/util/connection.py:74)
0.00%   0.00%   0.910s   0.910s     do_commit (sqlalchemy/engine/default.py:546)
1.00%   1.00%   0.880s   0.880s     write (asyncio/selector_events.py:908)
0.00%   0.00%   0.700s   0.810s     __eq__ (zeroconf/__init__.py:606)
2.00%   2.00%   0.670s   0.670s     unpack (zeroconf/__init__.py:737)
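One way to picture the interval fix described above: the wait can still wake early on notifications, but a reap is skipped unless a full interval has elapsed since the last one. This is only a sketch under that assumption. `ReapGate` and `should_reap` are hypothetical names for illustration and are not taken from the zeroconf source.

```python
from typing import Optional

REAP_INTERVAL_SECONDS = 10.0  # the 10s interval named in the description


class ReapGate:
    """Decides whether a wake-up should trigger a reap (hypothetical helper)."""

    def __init__(self, interval: float = REAP_INTERVAL_SECONDS) -> None:
        self.interval = interval
        self.last_reap: Optional[float] = None

    def should_reap(self, now: float) -> bool:
        # Wake-ups caused by record updates or listener changes arrive
        # early; ignore them until a full interval has passed.
        if self.last_reap is not None and now - self.last_reap < self.interval:
            return False
        self.last_reap = now
        return True
```

A reaper loop would call `should_reap(time.monotonic())` after each wake-up and go straight back to waiting when it returns `False`.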
The reaper continues to be absent from the profile after reverting to a 10s interval 👍 The CPU graph on one of my servers with 3 Home Assistant instances is down ~6% and remains so with the switch.
Nice!
Thanks again @jstasiak. Do you have a moment to put out a release today? I want to get this in today to go out in the Home Assistant 0.115 beta tomorrow, as I'm still working with a few users to track performance issues. Any unrelated data we can get out of the profile will go a long way to getting better reports.
Sure thing, 0.28.4 is out with your change. :)
Avoid copying the entire cache and reduce frequency of Reaper
The cache reaper was running at least every 10 seconds, making
a copy of the cache, and iterated all the entries to
check if they were expired so they could be removed.
In practice the reaper was actually running much more frequently
because it used self.zc.wait which would unblock any time
a record was updated, a listener was added, or when a
listener was removed.
This change ensures the reaper frequency is only every 10s, and
will first attempt to iterate the cache before falling back to
making a copy.
Previously it made sense to expire the cache more frequently
because we had places where we frequently had to enumerate
all the cache entries. With #247 and #232 we no longer
have to account for this concern.
On a mostly idle RPi running HomeAssistant and a busy
network the total time spent reaping the cache was
more than the total time spent processing the mDNS traffic.
Top 10 functions, idle RPi (before)
After this change, the Reaper code paths do not show up in the top
10 function sample.