Compact MetricName tag map implementation #1368

Merged (17 commits) on Mar 1, 2022

carterkozak
Contributor

Before this PR

Excessive cost to create and store MetricName due to tags.

After this PR

==COMMIT_MSG==
Compact MetricName tag map implementation
==COMMIT_MSG==

Possible downsides?

Custom code

@changelog-app
changelog-app bot commented Feb 28, 2022

Generate changelog in changelog/@unreleased


static final TagMap EMPTY = new TagMap(new String[0]);

private final String[] values;
private final int hash;
Contributor Author

I went back and forth on pre-computing the hash since the MetricName implementation already caches the hashCode.

It would be helpful for map wrappers with additional entries; however, that case may perform better if we flatten it and create a new TagMap instance with an array two entries larger (assuming relatively few entries, which is generally true). Note that I have not verified this, so take it with a grain of salt.

In the common TaggedMetricSet case which applies additional entries, we never hash the name. It is passed to the metric log subscriber, and only hashed for gauges to ensure we're not blocked recording the same gauge twice.
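To make the flattening idea concrete, here is a minimal sketch of an array-backed tag map whose "add one entry" operation copies into an array two slots larger. The class name `CompactTagsSketch`, the alternating key/value layout, and the `withEntry` method are assumptions for illustration, not the `TagMap` code in this PR (which may, for example, keep keys sorted).

```java
import java.util.Arrays;
import java.util.function.BiConsumer;

/** Hypothetical sketch of a compact tag map; not the TagMap from this PR. */
final class CompactTagsSketch {
    static final CompactTagsSketch EMPTY = new CompactTagsSketch(new String[0]);

    // Assumed layout: alternating keys and values, [k0, v0, k1, v1, ...].
    private final String[] values;

    private CompactTagsSketch(String[] values) {
        this.values = values;
    }

    int size() {
        return values.length / 2;
    }

    /** Linear scan over the keys; reasonable when there are only a few tags. */
    String get(String key) {
        for (int i = 0; i < values.length; i += 2) {
            if (values[i].equals(key)) {
                return values[i + 1];
            }
        }
        return null;
    }

    /** Returns a flattened copy with the entry added, or replaced if the key exists. */
    CompactTagsSketch withEntry(String key, String value) {
        for (int i = 0; i < values.length; i += 2) {
            if (values[i].equals(key)) {
                String[] copy = values.clone();
                copy[i + 1] = value;
                return new CompactTagsSketch(copy);
            }
        }
        // New key: the backing array grows by exactly two slots.
        String[] copy = Arrays.copyOf(values, values.length + 2);
        copy[values.length] = key;
        copy[values.length + 1] = value;
        return new CompactTagsSketch(copy);
    }

    /** Visits entries in insertion order without allocating Map.Entry objects. */
    void forEach(BiConsumer<String, String> consumer) {
        for (int i = 0; i < values.length; i += 2) {
            consumer.accept(values[i], values[i + 1]);
        }
    }
}
```

Each instance holds a single array and no per-entry node objects, which is where a layout like this would be expected to save allocation; whether it beats a wrapper in practice would still need measuring.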

Contributor

yeah, we could drop the hash field as all the current usages of TagMap will now compute the hashCode after creating a TagMap, and we now do create a new TagMap when adding an entry in https://github.com/palantir/tritium/pull/1368/files#diff-07d251c64776a9fd9112f154669063beff3ea2c65f4bece8c4bf9c24acce747aR87

Dropped the hash field from #1367

Contributor

Dropped hash field here as well

@carterkozak
Contributor Author

I'm preparing a change to remove the ExtraEntryMap wrapper in favor of using TagMap directly

@carterkozak
Contributor Author

NestedMetricsBenchmark results

Benchmarks after:

Benchmark                                                                    Mode  Cnt        Score        Error   Units
NestedMetricsBenchmark.benchmarkForEach                                      avgt   10   292477.731 ±  12461.604   ns/op
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate.norm                  avgt   10   362048.602 ±      4.604    B/op
NestedMetricsBenchmark.benchmarkGetMetrics                                   avgt   10   610141.258 ±   3238.038   ns/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate.norm               avgt   10  1040735.489 ±     15.408    B/op

Benchmarks before:

Benchmark                                                                    Mode  Cnt        Score        Error   Units
NestedMetricsBenchmark.benchmarkForEach                                      avgt   10   295342.347 ±   6560.391   ns/op
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate.norm                  avgt   10   324494.444 ±      8.378    B/op
NestedMetricsBenchmark.benchmarkGetMetrics                                   avgt   10   623551.299 ±  25264.115   ns/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate.norm               avgt   10  1000733.243 ±     18.299    B/op

Analysis

These don't seem to tell the whole story: they measure object creation, but we never consume the MetricName components, and there's no iteration over the tag keys.

More representative benchmarks

With the benchmark changes introduced in 173efe9 we have larger tag maps, and the consumer blackholes the safeName and all tag key/value pairs.
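For readers unfamiliar with the two consumer styles compared in the tables that follow, here is a rough sketch of what "forEach(biconsumer)" and "entrySet + enhanced for-loop" consumption look like. It is not the benchmark code from 173efe9: the class `TagConsumerSketch` is hypothetical, and the static `sink` field stands in for a JMH `Blackhole` so the snippet runs without the JMH dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical consumer sketch; the real benchmark uses a JMH Blackhole. */
final class TagConsumerSketch {
    // Stand-in for a Blackhole: folding each string's length into a field
    // keeps every read observable so it cannot be optimized away trivially.
    static long sink;

    static void consume(String value) {
        sink += value.length();
    }

    /** Style 1: Map.forEach with a BiConsumer. */
    static void consumeWithForEach(String safeName, Map<String, String> tags) {
        consume(safeName);
        tags.forEach((key, value) -> {
            consume(key);
            consume(value);
        });
    }

    /** Style 2: entrySet() with an enhanced for-loop. */
    static void consumeWithEntrySet(String safeName, Map<String, String> tags) {
        consume(safeName);
        for (Map.Entry<String, String> entry : tags.entrySet()) {
            consume(entry.getKey());
            consume(entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("service-name", "example");
        tags.put("endpoint", "getThing");
        consumeWithForEach("server.response", tags);
        consumeWithEntrySet("server.response", tags);
        System.out.println(sink);
    }
}
```

Both styles touch the same data; they differ in whether the map implementation has to materialize an entry set view and `Map.Entry` objects, which is why they are measured separately.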

After:

tags forEach(biconsumer)

Benchmark                                                                    Mode  Cnt        Score        Error   Units
NestedMetricsBenchmark.benchmarkForEach                                      avgt   10      398.119 ±      2.923   us/op
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate                       avgt   10      513.892 ±      3.774  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate.norm                  avgt   10   322031.270 ±      8.948    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space              avgt   10      512.321 ±    287.807  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space.norm         avgt   10   320913.039 ± 179534.072    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space          avgt   10        0.009 ±      0.025  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space.norm     avgt   10        5.628 ±     15.830    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.count                            avgt   10       13.000               counts
NestedMetricsBenchmark.benchmarkForEach:·gc.time                             avgt   10       18.000                   ms
NestedMetricsBenchmark.benchmarkGetMetrics                                   avgt   10      777.042 ±     13.052   us/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate                    avgt   10      916.455 ±     15.157  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate.norm               avgt   10  1120738.250 ±     16.985    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space           avgt   10      906.663 ±    287.963  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space.norm      avgt   10  1108490.138 ± 348872.370    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space       avgt   10        0.234 ±      0.089  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space.norm  avgt   10      286.871 ±    110.637    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.count                         avgt   10       23.000               counts
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.time                          avgt   10       31.000                   ms

tags entrySet + enhanced for-loop

Benchmark                                                                    Mode  Cnt        Score        Error   Units
NestedMetricsBenchmark.benchmarkForEach                                      avgt   10      404.491 ±     15.991   us/op
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate                       avgt   10      506.065 ±     19.342  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate.norm                  avgt   10   322031.456 ±      8.778    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space              avgt   10      512.328 ±    287.825  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space.norm         avgt   10   325076.827 ± 176804.791    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space          avgt   10        0.009 ±      0.024  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space.norm     avgt   10        5.472 ±     15.307    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.count                            avgt   10       13.000               counts
NestedMetricsBenchmark.benchmarkForEach:·gc.time                             avgt   10       19.000                   ms
NestedMetricsBenchmark.benchmarkGetMetrics                                   avgt   10      809.683 ±     24.776   us/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate                    avgt   10      879.724 ±     26.745  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate.norm               avgt   10  1120738.161 ±     16.376    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space           avgt   10      866.973 ±    251.475  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space.norm      avgt   10  1104453.933 ± 317276.137    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space       avgt   10        0.200 ±      0.117  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space.norm  avgt   10      255.357 ±    152.886    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.count                         avgt   10       22.000               counts
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.time                          avgt   10       29.000                   ms

Before:

tags forEach(biconsumer)

Benchmark                                                                    Mode  Cnt        Score        Error   Units
NestedMetricsBenchmark.benchmarkForEach                                      avgt   10     1496.868 ±     44.749   us/op
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate                       avgt   10     1241.821 ±     36.396  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate.norm                  avgt   10  2924587.239 ±     19.219    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space              avgt   10     1221.408 ±    188.776  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space.norm         avgt   10  2877447.293 ± 451164.940    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space          avgt   10        0.007 ±      0.015  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space.norm     avgt   10       16.604 ±     33.975    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.count                            avgt   10       31.000               counts
NestedMetricsBenchmark.benchmarkForEach:·gc.time                             avgt   10       37.000                   ms
NestedMetricsBenchmark.benchmarkGetMetrics                                   avgt   10     1864.543 ±     48.843   us/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate                    avgt   10     1227.344 ±     31.588  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate.norm               avgt   10  3600849.967 ±     28.255    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space           avgt   10     1221.425 ±    188.120  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space.norm      avgt   10  3582756.816 ± 524419.814    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space       avgt   10        0.039 ±      0.038  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space.norm  avgt   10      115.083 ±    113.371    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.count                         avgt   10       31.000               counts
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.time                          avgt   10       41.000                   ms

tags entrySet + enhanced for-loop

Benchmark                                                                    Mode  Cnt        Score        Error   Units
NestedMetricsBenchmark.benchmarkForEach                                      avgt   10     1610.645 ±     69.380   us/op
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate                       avgt   10     1202.110 ±     51.561  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate.norm                  avgt   10  3044592.417 ±      5.200    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space              avgt   10     1182.330 ±      0.580  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space.norm         avgt   10  2996674.988 ± 129049.113    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space          avgt   10        0.008 ±      0.019  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space.norm     avgt   10       20.970 ±     49.960    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.count                            avgt   10       30.000               counts
NestedMetricsBenchmark.benchmarkForEach:·gc.time                             avgt   10       34.000                   ms
NestedMetricsBenchmark.benchmarkGetMetrics                                   avgt   10     1962.866 ±    102.056   us/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate                    avgt   10     1205.972 ±     60.899  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate.norm               avgt   10  3720853.849 ±     16.751    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space           avgt   10     1181.870 ±      0.728  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space.norm      avgt   10  3650231.367 ± 188346.090    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space       avgt   10        0.146 ±      0.189  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space.norm  avgt   10      453.699 ±    590.895    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.count                         avgt   10       30.000               counts
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.time                          avgt   10       38.000                   ms
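As a reading aid, the before/after ratios for the forEach(biconsumer) runs can be worked out directly from the scores above. The helper below is purely illustrative (the class name `BenchmarkRatioSketch` is made up); the numbers are copied from the tables and the error bars are ignored.

```java
/** Illustrative arithmetic on the scores reported above; not part of the PR. */
final class BenchmarkRatioSketch {
    /** How many times larger the "before" score is than the "after" score. */
    static double ratio(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        // Average time per operation (us/op), forEach(biconsumer) tables.
        System.out.printf("benchmarkForEach time:     %.2fx%n", ratio(1496.868, 398.119));
        System.out.printf("benchmarkGetMetrics time:  %.2fx%n", ratio(1864.543, 777.042));
        // Normalized allocation (B/op), forEach(biconsumer) tables.
        System.out.printf("benchmarkForEach alloc:    %.2fx%n", ratio(2924587.239, 322031.270));
        System.out.printf("benchmarkGetMetrics alloc: %.2fx%n", ratio(3600849.967, 1120738.250));
    }
}
```

This works out to roughly 3.8x and 2.4x less time per operation, and roughly 9x and 3.2x less allocation per operation, for benchmarkForEach and benchmarkGetMetrics respectively.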

@carterkozak
Contributor Author

I was curious whether the hashCode avoidance had an excessive impact (this PR makes RealMetricName.hashCode lazy).

Adding this to the consumer:

blackhole.consume(name.hashCode());

Resulted in:

Benchmark                                                                    Mode  Cnt        Score        Error   Units
NestedMetricsBenchmark.benchmarkForEach                                      avgt   10      448.480 ±     14.674   us/op
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate                       avgt   10      459.839 ±     14.760  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.alloc.rate.norm                  avgt   10   324478.852 ±      8.861    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space              avgt   10      433.598 ±    188.539  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Eden_Space.norm         avgt   10   305944.055 ± 131950.841    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space          avgt   10        0.009 ±      0.025  MB/sec
NestedMetricsBenchmark.benchmarkForEach:·gc.churn.G1_Survivor_Space.norm     avgt   10        6.111 ±     17.581    B/op
NestedMetricsBenchmark.benchmarkForEach:·gc.count                            avgt   10       11.000               counts
NestedMetricsBenchmark.benchmarkForEach:·gc.time                             avgt   10       16.000                   ms
NestedMetricsBenchmark.benchmarkGetMetrics                                   avgt   10      818.882 ±     11.323   us/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate                    avgt   10      869.588 ±     12.228  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.alloc.rate.norm               avgt   10  1120738.842 ±     17.464    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space           avgt   10      867.047 ±    251.145  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Eden_Space.norm      avgt   10  1118063.341 ± 329708.755    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space       avgt   10        0.223 ±      0.130  MB/sec
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.churn.G1_Survivor_Space.norm  avgt   10      287.676 ±    167.739    B/op
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.count                         avgt   10       22.000               counts
NestedMetricsBenchmark.benchmarkGetMetrics:·gc.time                          avgt   10       30.000                   ms

The optimization gives us roughly a 10% uplift, which is a very small fraction of the improvement from this PR.
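For context, a lazily computed hashCode generally follows the pattern below, the same idea `java.lang.String` uses. This is a sketch under assumptions: `LazyHashNameSketch`, its fields, and the `computations` counter are hypothetical and are not taken from `RealMetricName`.

```java
import java.util.Arrays;

/** Hypothetical value type with a lazily cached hashCode. */
final class LazyHashNameSketch {
    private final String safeName;
    private final String[] tags;

    // 0 means "not computed yet". Racy writes are benign because every thread
    // computes the same value from immutable state. If the real hash happens
    // to be 0 it is simply recomputed on each call.
    private int hash;

    // Demo-only counter showing how often the hash was actually computed.
    int computations;

    LazyHashNameSketch(String safeName, String... tags) {
        this.safeName = safeName;
        this.tags = tags.clone();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof LazyHashNameSketch)) {
            return false;
        }
        LazyHashNameSketch that = (LazyHashNameSketch) other;
        return safeName.equals(that.safeName) && Arrays.equals(tags, that.tags);
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            computations++;
            h = 31 * safeName.hashCode() + Arrays.hashCode(tags);
            hash = h;
        }
        return h;
    }
}
```

Names that are only passed along and never used as map keys never pay for hashing, while names that are hashed repeatedly pay once.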

@carterkozak carterkozak marked this pull request as ready for review March 1, 2022 22:51
@carterkozak
Contributor Author

👍

@bulldozer-bot bulldozer-bot bot merged commit a120435 into develop Mar 1, 2022
@svc-autorelease
Collaborator

Released 0.41.0

@carterkozak carterkozak changed the title from "[WIP] Compact MetricName tag map implementation" to "Compact MetricName tag map implementation" Sep 28, 2022
@schlosna schlosna deleted the ckozak/compact_tag_map branch October 4, 2023 18:46