Reduce buffering between watcher and Store #1494

Merged

Conversation

@fabriziosestito (Contributor)

Motivation

Fixes: #1487

This PR reduces allocations when the watcher is paging/streaming the initial list.
It updates the watcher::Event enum to emit events that can be used to update a buffer on the receiving side, and updates the reflector store accordingly.

Solution

ListWatch strategy

When the paginated API request starts, a RestartedStart event is returned
and the state changes from Empty to InitPage.
As pages are received, RestartedPage(objs) events are returned until the end of the list; the state then changes to InitPageDone and the RestartedDone event is returned.
The Store is updated to react to the new events:

  • On RestartedStart it initializes the buffer.
  • On RestartedPage(objs) it extends the buffer with the received objects.
  • On RestartedDone it swaps the buffer with the internal store, which allows
    the store to be updated atomically (see the sketch after this list).
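
For illustration, here is a minimal sketch of the marker events described above. The variant names follow this description; the exact enum shape (payload types, other variants) is an assumption and may differ from what ships in kube-runtime.

```rust
// Sketch only: marker events as described in this PR, not the exact
// kube-runtime API. `K` is any watched Kubernetes resource type.
enum Event<K> {
    // regular watch events after the initial list is established
    Applied(K),
    Deleted(K),
    // markers emitted while the initial list is (re)built
    RestartedStart,        // paginated LIST or initial watch begins
    RestartedPage(Vec<K>), // one page of objects (ListWatch strategy)
    RestartedApplied(K),   // one object from the initial watch (StreamingList strategy)
    RestartedDeleted(K),   // deletion observed during the initial watch (StreamingList strategy)
    RestartedDone,         // initial list complete; the store can swap buffers
}
```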

StreamingList strategy

The RestartedStart event is returned when the InitialWatch is started; RestartedApplied(obj) or RestartedDeleted(obj) events are then returned until a bookmark with the "k8s.io/initial-events-end" annotation is received.
The RestartedDone event is then returned.
The store reacts to RestartedApplied/RestartedDeleted events by updating the buffer (see the store-side sketch below).
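
To make the buffer-and-swap behaviour concrete, here is a hypothetical store-side consumer built on the `Event<K>` sketch above. It keys objects by a plain String for brevity; the actual kube-runtime store keys objects by an object reference and shares its map behind a lock, both of which are omitted here.

```rust
use std::collections::HashMap;
use std::mem;

// Hypothetical consumer of the `Event<K>` enum sketched above;
// not the actual kube-runtime Store.
struct BufferedStore<K> {
    store: HashMap<String, K>,  // what readers currently see
    buffer: HashMap<String, K>, // staging area for the initial list
}

impl<K> BufferedStore<K> {
    fn apply(&mut self, event: Event<K>, key: impl Fn(&K) -> String) {
        match event {
            // initial list begins: start from an empty staging buffer
            Event::RestartedStart => self.buffer.clear(),
            // ListWatch: stage a whole page of objects
            Event::RestartedPage(objs) => {
                self.buffer.extend(objs.into_iter().map(|o| (key(&o), o)));
            }
            // StreamingList: stage objects one by one
            Event::RestartedApplied(obj) => {
                self.buffer.insert(key(&obj), obj);
            }
            Event::RestartedDeleted(obj) => {
                self.buffer.remove(&key(&obj));
            }
            // initial list complete: expose it to readers in one step
            Event::RestartedDone => {
                mem::swap(&mut self.store, &mut self.buffer);
                self.buffer = HashMap::new();
            }
            // regular watch events update the live store directly
            Event::Applied(obj) => {
                self.store.insert(key(&obj), obj);
            }
            Event::Deleted(obj) => {
                self.store.remove(&key(&obj));
            }
        }
    }
}
```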

Benchmarks

Watching 10000 RoleBindings:

ListWatch strategy:

             With Store   Without Store
  Patched    ~58 MB       ~13 MB
  Upstream   ~92 MB       ~76 MB

StreamingList strategy:

             With Store   Without Store
  Patched    ~53 MB       ~7 MB
  Upstream   ~94 MB       ~76 MB

NOTE: this is a WIP; tests still need to be updated.

@fabriziosestito fabriziosestito force-pushed the fix/reduce-buffering-between-watcher-and-store branch 2 times, most recently from 14df858 to 7d8ec65 Compare May 13, 2024 11:09
@clux clux added the changelog-change changelog change category for prs label May 13, 2024
@fabriziosestito fabriziosestito force-pushed the fix/reduce-buffering-between-watcher-and-store branch 2 times, most recently from 8732b66 to 2797cbf Compare May 15, 2024 11:19
@clux (Member) left a comment

I like this a lot, thanks for doing this. I had a deeper dig into the watcher code, and the way this is enabled in the watcher's trampolines (by injecting extra events via the state machinery) and bubbled up to the store to swap the buffer is just great.

I've left some questions and suggestions here and there initially. Unsure of the best way forward for subscribers at the moment.

Review threads on kube-runtime/src/watcher.rs, kube-runtime/src/utils/event_flatten.rs, and kube-runtime/src/reflector/store.rs.
@fabriziosestito fabriziosestito force-pushed the fix/reduce-buffering-between-watcher-and-store branch from 90c310f to 045b6b5 Compare May 16, 2024 11:40
@mateiidavid (Contributor) left a comment

Looking great so far! I left some nitpicks and a larger comment around the dispatch / subscribe interface. It's relatively new so I don't think we need to be too conservative in the changes we make there.

I'll do a more thorough review once I get my bearings a bit more. It's a pretty big change. Super exciting though!

Review threads on kube-runtime/src/reflector/store.rs.
@clux (Member) commented May 21, 2024

I got tests to pass in #1497. You can cherry-pick the last commit there and I think this should work.

The complications I ran into:

  • restarts now go through store contents in a non-deterministic order (could be fixed if we care)
  • marker watcher::Events do not count towards buffer size in broadcast channel (so tests are less clear with them)

So I have stopped matching on explicit objects (in tests) and moved some tests towards Apply events only (but kept one, without backpressure).

@fabriziosestito (Contributor, Author)

@clux I've cherry-picked the commit, please let me know if some action is needed from my side to move forward.


codecov bot commented May 22, 2024

Codecov Report

Attention: patch coverage is 77.01149%, with 40 lines in your changes missing coverage. Please review.

Project coverage is 74.9%. Comparing base (c30ce34) to head (31a4110).

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1494     +/-   ##
=======================================
+ Coverage   74.8%   74.9%   +0.1%     
=======================================
  Files         78      78             
  Lines       6808    6860     +52     
=======================================
+ Hits        5091    5136     +45     
- Misses      1717    1724      +7     
Files                                      Coverage          Δ
kube-runtime/src/controller/mod.rs         29.9% <100.0%>    (ø)
kube-runtime/src/reflector/dispatcher.rs   96.2% <100.0%>    (+0.7%) ⬆️
kube-runtime/src/reflector/mod.rs          100.0% <100.0%>   (ø)
kube-runtime/src/utils/event_modify.rs     95.5% <100.0%>    (ø)
kube-runtime/src/utils/reflect.rs          100.0% <100.0%>   (ø)
kube-runtime/src/utils/event_flatten.rs    89.8% <91.7%>     (-2.1%) ⬇️
kube-runtime/src/reflector/store.rs        91.8% <69.7%>     (-7.3%) ⬇️
kube-runtime/src/watcher.rs                39.8% <48.3%>     (+1.9%) ⬆️

fabriziosestito and others added 12 commits May 22, 2024 10:12
(commit sign-offs by Fabrizio Sestito <fabrizio.sestito@suse.com> and clux <sszynrae@gmail.com>)
@fabriziosestito fabriziosestito force-pushed the fix/reduce-buffering-between-watcher-and-store branch from 952f616 to d573f4f Compare May 22, 2024 08:18
@clux clux added this to the 0.92.0 milestone May 22, 2024
@clux (Member) commented May 22, 2024

I think this should be in a good state now. Feel free to mark it as ready if you are, because I am perfectly happy with this as it stands. Real-world usage is also in line with the numbers in the PR, and the upgrade was trivial in cases where I was not using custom stores.


There is one minor bikeshed to maybe consider:

maybe there are better watcher::Event names for the markers (I feel RestartInit + Restart is slightly confusing, but I don't have a lot of good suggestions other than maybe ReList + ReListed + Initial)

It does not block this PR, as we could tweak this separately to spare you from that before release.

@fabriziosestito fabriziosestito marked this pull request as ready for review May 22, 2024 13:51
@mateiidavid (Contributor) left a comment

🚀 looks great! I think @clux's suggestion around event names is worth following up but I also don't see it as a blocker for this PR. Thanks a lot @fabriziosestito for doing the heavy lifting on this one. :)

@clux (Member) commented May 23, 2024

Yeah thanks again!

The best result I've seen with this is on a workload controller that's been running for a day operating on ~2k pods (and it's already quite optimised with metadata watches), and still, container_memory_working_set_bytes has gone from a consistent ~100M to ~50M.

[Screenshot 2024-05-23: memory usage graph; yellow = old, green = this branch]
