Reduce resource consumption of ListUnifiedResources by rosstimothy · Pull Request #53213 · gravitational/teleport

rosstimothy · 2025-03-20T02:03:57Z

The changes here were a result of examining CPU and memory profiles of BenchmarkListUnifiedResources. The biggest inefficiencies stemmed from the authorizer and (types.Server) MatchSearch.

When the authorizer computes lock targets it will call (types.LockTarget) Equals, which was relying on proto.Equals to determine if nine strings were equivalent. The fields are now individually checked and a test was added to ensure that if any new fields are added they must be added to the check for the test to pass.

ListUnifiedResources attempts to be smart and evaluate whether the user has read/list access to a particular resource kind once. However, it was always checking if users had access to all supported resources instead of just the resources that the user was requesting. While most of the checks are not very expensive, checking all resources is still worse than only checking the required resources.

Lastly, the (types.Server) MatchSearch implementation is no longer allocating a slice of search fields and using types.MatchSearch. Instead a variant of the loop from types.MatchSearch is inlined. While this was only applied to types.Server it should likely be extended to other resource types in the future - or types.MatchSearch should be refactored in a way that does not require callers to allocate.

$ benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/auth
cpu: Apple M2 Pro
                                              │   old.txt   │               new.txt               │
                                              │   sec/op    │   sec/op     vs base                │
ListUnifiedResourcesFilter/labels-12            195.0m ± 6%   173.3m ± 2%  -11.12% (p=0.000 n=10)
ListUnifiedResourcesFilter/predicate_path-12    230.6m ± 3%   210.8m ± 4%   -8.57% (p=0.000 n=10)
ListUnifiedResourcesFilter/predicate_index-12   229.8m ± 5%   209.0m ± 7%   -9.08% (p=0.000 n=10)
ListUnifiedResourcesFilter/search_lower-12      600.6m ± 3%   364.3m ± 1%  -39.35% (p=0.000 n=10)
ListUnifiedResourcesFilter/search_upper-12      687.0m ± 2%   423.4m ± 1%  -38.37% (p=0.000 n=10)
geomean                                         335.7m        259.5m       -22.69%

                                              │    old.txt     │               new.txt                │
                                              │      B/op      │     B/op      vs base                │
ListUnifiedResourcesFilter/labels-12              868.8Ki ± 0%   414.1Ki ± 0%  -52.34% (p=0.000 n=10)
ListUnifiedResourcesFilter/predicate_path-12      7.731Mi ± 0%   7.287Mi ± 0%   -5.74% (p=0.000 n=10)
ListUnifiedResourcesFilter/predicate_index-12     7.731Mi ± 0%   7.288Mi ± 0%   -5.74% (p=0.000 n=10)
ListUnifiedResourcesFilter/search_lower-12      22106.9Ki ± 0%   486.8Ki ± 0%  -97.80% (p=0.000 n=10)
ListUnifiedResourcesFilter/search_upper-12      22107.0Ki ± 0%   486.3Ki ± 0%  -97.80% (p=0.000 n=10)
geomean                                           7.494Mi        1.371Mi       -81.70%

                                              │    old.txt    │               new.txt               │
                                              │   allocs/op   │  allocs/op   vs base                │
ListUnifiedResourcesFilter/labels-12             11.401k ± 0%   4.865k ± 0%  -57.32% (p=0.000 n=10)
ListUnifiedResourcesFilter/predicate_path-12      161.5k ± 0%   155.0k ± 0%   -4.04% (p=0.000 n=10)
ListUnifiedResourcesFilter/predicate_index-12     161.5k ± 0%   155.0k ± 0%   -4.04% (p=0.000 n=10)
ListUnifiedResourcesFilter/search_lower-12      162.212k ± 0%   5.277k ± 0%  -96.75% (p=0.000 n=10)
ListUnifiedResourcesFilter/search_upper-12      162.224k ± 0%   5.275k ± 0%  -96.75% (p=0.000 n=10)
geomean                                           95.22k        20.07k       -78.93%

Changelog: Improve resource consumption when retrieving resources via the Web UI or tsh ls.

strideynet · 2025-03-20T11:18:57Z

api/types/lock_test.go

+	t.Run("equal", func(t *testing.T) {
+		require.True(t, (LockTarget{}).Equals(LockTarget{}), "empty targets equal")
+
+		for i, field := range reflect.VisibleFields(reflect.TypeOf(LockTarget{})) {


I feel like a testhelper for this scramble behaviour would be super useful at somepoint

I shamelessly stole this from TestLockTargetIsEmpty. If we are to continue hand rolling these functions I agree this could be made generic to cover other resources. Though perhaps we should also consider using goderive, or option (gogoproto.equal) = true; instead of hand rolling.

rosstimothy · 2025-03-20T13:52:25Z

Flame graphs that inspired this change:

rosstimothy · 2025-03-20T13:53:40Z

lib/auth/auth_with_roles_test.go

+			types.ServerSpecV2{
+				Hostname: "node." + strconv.Itoa(i),
+			},


This change is inconsequential, but as I was running the benchmarks I noticed that our invalid hostname detection was flagging these nodes.

rosstimothy · 2025-03-20T13:53:44Z

lib/auth/auth_with_roles_test.go

+			types.ServerSpecV2{
+				Hostname: "node." + strconv.Itoa(i),
+			},


This change is inconsequential, but as I was running the benchmarks I noticed that our invalid hostname detection was flagging these nodes.

espadolini · 2025-03-20T16:41:21Z

api/types/server.go

+Outer:
+	for _, searchV := range values {
+		for key, value := range s.Metadata.Labels {
+			if containsFold(key, searchV) || containsFold(value, searchV) {


While we're doing perf improvements, could we replace containsFold with github.com/charlievieth/strcase.Contains?

Comparison of benchmarks with containsFold vs github.com/charlievieth/strcase.Contains.

$ benchstat new.txt strcase.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/auth cpu: Apple M2 Pro │ new.txt │ strcase.txt │ │ sec/op │ sec/op vs base │ ListUnifiedResourcesFilter/labels-12 173.3m ± 2% 178.2m ± 5% +2.78% (p=0.011 n=10) ListUnifiedResourcesFilter/predicate_path-12 210.8m ± 4% 214.9m ± 1% +1.94% (p=0.023 n=10) ListUnifiedResourcesFilter/predicate_index-12 209.0m ± 7% 211.1m ± 1% ~ (p=0.529 n=10) ListUnifiedResourcesFilter/search_lower-12 364.3m ± 1% 240.3m ± 3% -34.03% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 423.4m ± 1% 278.1m ± 5% -34.31% (p=0.000 n=10) geomean 259.5m 222.1m -14.43% │ new.txt │ strcase.txt │ │ B/op │ B/op vs base │ ListUnifiedResourcesFilter/labels-12 414.1Ki ± 0% 414.6Ki ± 0% ~ (p=0.436 n=10) ListUnifiedResourcesFilter/predicate_path-12 7.287Mi ± 0% 7.289Mi ± 0% ~ (p=0.089 n=10) ListUnifiedResourcesFilter/predicate_index-12 7.288Mi ± 0% 7.288Mi ± 0% ~ (p=0.529 n=10) ListUnifiedResourcesFilter/search_lower-12 486.8Ki ± 0% 427.8Ki ± 0% -12.12% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 486.3Ki ± 0% 449.6Ki ± 0% -7.55% (p=0.000 n=10) geomean 1.371Mi 1.316Mi -4.04% │ new.txt │ strcase.txt │ │ allocs/op │ allocs/op vs base │ ListUnifiedResourcesFilter/labels-12 4.865k ± 0% 4.869k ± 0% ~ (p=0.670 n=10) ListUnifiedResourcesFilter/predicate_path-12 155.0k ± 0% 155.0k ± 0% ~ (p=0.643 n=10) ListUnifiedResourcesFilter/predicate_index-12 155.0k ± 0% 155.0k ± 0% ~ (p=0.468 n=10) ListUnifiedResourcesFilter/search_lower-12 5.277k ± 0% 4.949k ± 0% -6.22% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 5.275k ± 0% 5.076k ± 0% -3.77% (p=0.000 n=10) geomean 20.07k 19.66k -2.02%

Good find E!

lib/auth/auth_with_roles.go

The changes here were a result of examining CPU and memory profiles of BenchmarkListUnifiedResources. The biggest inefficiencies stemmed from the authorizer and `(types.Server) MatchSearch`. When the authorizer computes lock targets it will call `(types.LockTarget) Equals`, which was relying on proto.Equals to determine if nine strings were equivalent. The fields are now individually checked and a test was added to ensure that if any new fields are added they must be added to the check for the test to pass. ListUnifiedResources attempts to be smart and evaluate whether the user has read/list access to a particular resource kind once. However, it was always checking if users had access to _all_ supported resources instead of just the resources that the user was requesting. While most of the checks are not very expensive, checking all resources is still worse than only checking the required resources. Lastly, the `(types.Server) MatchSearch` implementation is no longer allocating a slice of search fields and using `types.MatchSearch`. Instead a variant of the loop from `types.MatchSearch` is inlined. While this was only applied to `types.Server` it should likely be extended to other resource types in the future - or `types.MatchSearch` should be refactored in a way that does not require callers to allocate. ```bash $ benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/auth cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ListUnifiedResourcesFilter/labels-12 195.0m ± 6% 173.3m ± 2% -11.12% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_path-12 230.6m ± 3% 210.8m ± 4% -8.57% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_index-12 229.8m ± 5% 209.0m ± 7% -9.08% (p=0.000 n=10) ListUnifiedResourcesFilter/search_lower-12 600.6m ± 3% 364.3m ± 1% -39.35% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 687.0m ± 2% 423.4m ± 1% -38.37% (p=0.000 n=10) geomean 335.7m 259.5m -22.69% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ListUnifiedResourcesFilter/labels-12 868.8Ki ± 0% 414.1Ki ± 0% -52.34% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_path-12 7.731Mi ± 0% 7.287Mi ± 0% -5.74% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_index-12 7.731Mi ± 0% 7.288Mi ± 0% -5.74% (p=0.000 n=10) ListUnifiedResourcesFilter/search_lower-12 22106.9Ki ± 0% 486.8Ki ± 0% -97.80% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 22107.0Ki ± 0% 486.3Ki ± 0% -97.80% (p=0.000 n=10) geomean 7.494Mi 1.371Mi -81.70% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ListUnifiedResourcesFilter/labels-12 11.401k ± 0% 4.865k ± 0% -57.32% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_path-12 161.5k ± 0% 155.0k ± 0% -4.04% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_index-12 161.5k ± 0% 155.0k ± 0% -4.04% (p=0.000 n=10) ListUnifiedResourcesFilter/search_lower-12 162.212k ± 0% 5.277k ± 0% -96.75% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 162.224k ± 0% 5.275k ± 0% -96.75% (p=0.000 n=10) geomean 95.22k 20.07k -78.93% ```

backport-bot-workflows · 2025-03-21T19:44:38Z

@rosstimothy See the table below for backport results.

Branch	Result
branch/v15	Failed
branch/v16	Failed
branch/v17	Failed

The changes here were a result of examining CPU and memory profiles of BenchmarkListUnifiedResources. The biggest inefficiencies stemmed from the authorizer and `(types.Server) MatchSearch`. When the authorizer computes lock targets it will call `(types.LockTarget) Equals`, which was relying on proto.Equals to determine if nine strings were equivalent. The fields are now individually checked and a test was added to ensure that if any new fields are added they must be added to the check for the test to pass. ListUnifiedResources attempts to be smart and evaluate whether the user has read/list access to a particular resource kind once. However, it was always checking if users had access to _all_ supported resources instead of just the resources that the user was requesting. While most of the checks are not very expensive, checking all resources is still worse than only checking the required resources. Lastly, the `(types.Server) MatchSearch` implementation is no longer allocating a slice of search fields and using `types.MatchSearch`. Instead a variant of the loop from `types.MatchSearch` is inlined. While this was only applied to `types.Server` it should likely be extended to other resource types in the future - or `types.MatchSearch` should be refactored in a way that does not require callers to allocate. ```bash $ benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/auth cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ListUnifiedResourcesFilter/labels-12 195.0m ± 6% 173.3m ± 2% -11.12% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_path-12 230.6m ± 3% 210.8m ± 4% -8.57% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_index-12 229.8m ± 5% 209.0m ± 7% -9.08% (p=0.000 n=10) ListUnifiedResourcesFilter/search_lower-12 600.6m ± 3% 364.3m ± 1% -39.35% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 687.0m ± 2% 423.4m ± 1% -38.37% (p=0.000 n=10) geomean 335.7m 259.5m -22.69% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ListUnifiedResourcesFilter/labels-12 868.8Ki ± 0% 414.1Ki ± 0% -52.34% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_path-12 7.731Mi ± 0% 7.287Mi ± 0% -5.74% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_index-12 7.731Mi ± 0% 7.288Mi ± 0% -5.74% (p=0.000 n=10) ListUnifiedResourcesFilter/search_lower-12 22106.9Ki ± 0% 486.8Ki ± 0% -97.80% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 22107.0Ki ± 0% 486.3Ki ± 0% -97.80% (p=0.000 n=10) geomean 7.494Mi 1.371Mi -81.70% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ListUnifiedResourcesFilter/labels-12 11.401k ± 0% 4.865k ± 0% -57.32% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_path-12 161.5k ± 0% 155.0k ± 0% -4.04% (p=0.000 n=10) ListUnifiedResourcesFilter/predicate_index-12 161.5k ± 0% 155.0k ± 0% -4.04% (p=0.000 n=10) ListUnifiedResourcesFilter/search_lower-12 162.212k ± 0% 5.277k ± 0% -96.75% (p=0.000 n=10) ListUnifiedResourcesFilter/search_upper-12 162.224k ± 0% 5.275k ± 0% -96.75% (p=0.000 n=10) geomean 95.22k 20.07k -78.93% ```

rosstimothy added backport/branch/v15 backport/branch/v17 labels Mar 20, 2025

strideynet reviewed Mar 20, 2025

View reviewed changes

rosstimothy marked this pull request as ready for review March 20, 2025 13:52

github-actions bot added the size/sm label Mar 20, 2025

github-actions bot requested review from Joerger and hugoShaka March 20, 2025 13:53

rosstimothy commented Mar 20, 2025

View reviewed changes

espadolini reviewed Mar 20, 2025

View reviewed changes

rosstimothy requested a review from espadolini March 20, 2025 20:07

rosstimothy force-pushed the tross/improve_unified_resources_perf branch from 479534f to df17553 Compare March 20, 2025 20:13

zmb3 approved these changes Mar 20, 2025

View reviewed changes

espadolini approved these changes Mar 20, 2025

View reviewed changes

lib/auth/auth_with_roles.go Outdated Show resolved Hide resolved

rosstimothy force-pushed the tross/improve_unified_resources_perf branch 2 times, most recently from c2da40c to b91364e Compare March 21, 2025 18:20

rosstimothy force-pushed the tross/improve_unified_resources_perf branch from b91364e to 8834392 Compare March 21, 2025 19:00

rosstimothy enabled auto-merge March 21, 2025 19:15

rosstimothy added this pull request to the merge queue Mar 21, 2025

Merged via the queue into master with commit 6d13e96 Mar 21, 2025
43 checks passed

rosstimothy deleted the tross/improve_unified_resources_perf branch March 21, 2025 19:42

rosstimothy mentioned this pull request Mar 21, 2025

[v17] Reduce resource consumption of ListUnifiedResources #53302

Merged

rosstimothy mentioned this pull request Mar 21, 2025

[v16] Reduce resource consumption of ListUnifiedResources #53303

Merged

rosstimothy mentioned this pull request Mar 21, 2025

[v15] Reduce resource consumption of ListUnifiedResources #53304

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce resource consumption of ListUnifiedResources#53213

Reduce resource consumption of ListUnifiedResources#53213
rosstimothy merged 1 commit intomasterfrom
tross/improve_unified_resources_perf

rosstimothy commented Mar 20, 2025 •

edited

Loading

Uh oh!

strideynet Mar 20, 2025

Uh oh!

rosstimothy Mar 20, 2025

Uh oh!

rosstimothy commented Mar 20, 2025 •

edited

Loading

Uh oh!

rosstimothy Mar 20, 2025

Uh oh!

rosstimothy Mar 20, 2025

Uh oh!

espadolini Mar 20, 2025

Uh oh!

rosstimothy Mar 20, 2025

Uh oh!

rosstimothy Mar 20, 2025

Uh oh!

zmb3 Mar 20, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

backport-bot-workflows bot commented Mar 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

rosstimothy commented Mar 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rosstimothy commented Mar 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

backport-bot-workflows bot commented Mar 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

rosstimothy commented Mar 20, 2025 •

edited

Loading

rosstimothy commented Mar 20, 2025 •

edited

Loading