Initial Caffeine CacheStats recorder #1897

Merged: 16 commits, Mar 7, 2024
15 changes: 15 additions & 0 deletions README.md
@@ -80,6 +80,21 @@ Service instrumentedService = Tritium.instrument(Service.class,
interestingService, environment.metrics());
```

## Instrumenting a [Caffeine cache](https://github.com/ben-manes/caffeine/)

```java
import com.palantir.tritium.metrics.caffeine.CacheStats;

TaggedMetricRegistry taggedMetricRegistry = ...
Cache<Integer, String> cache = Caffeine.newBuilder()
.recordStats(CacheStats.of(taggedMetricRegistry, "unique-cache-name"))
.build();

LoadingCache<String, Integer> loadingCache = Caffeine.newBuilder()
.recordStats(CacheStats.of(taggedMetricRegistry, "unique-loading-cache-name"))
.build(String::length);
```

## Creating a metric registry with reservoirs backed by [HDR Histograms](https://hdrhistogram.github.io/HdrHistogram/).

HDR histograms are more useful for long-running services because the stats represent the lifetime of the server. The default exponentially decaying reservoir can lead to misinterpretation of timings (especially higher percentiles and values such as max dropping over time) if the consumer isn't aware of these assumptions.
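
To illustrate why a reported maximum can appear to drop over time, here is a simplified JDK-only sketch. It does not use the Dropwizard Metrics or HdrHistogram APIs: `MaxOverTime`, `LifetimeMax`, and `WindowedMax` are hypothetical names, and a fixed-size sliding window is only a rough stand-in for an exponentially decaying reservoir.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration only; not part of Tritium, Dropwizard Metrics, or HdrHistogram. */
final class MaxOverTime {

    /** Remembers the largest value ever recorded, like a lifetime (non-decaying) histogram. */
    static final class LifetimeMax {
        private long max = Long.MIN_VALUE;

        void record(long value) {
            max = Math.max(max, value);
        }

        long max() {
            return max;
        }
    }

    /** Remembers only the most recent {@code capacity} values; older samples are forgotten. */
    static final class WindowedMax {
        private final int capacity;
        private final Deque<Long> recent = new ArrayDeque<>();

        WindowedMax(int capacity) {
            this.capacity = capacity;
        }

        void record(long value) {
            if (recent.size() == capacity) {
                recent.removeFirst(); // drop the oldest sample
            }
            recent.addLast(value);
        }

        long max() {
            return recent.stream().mapToLong(Long::longValue).max().orElse(Long.MIN_VALUE);
        }
    }

    public static void main(String[] args) {
        LifetimeMax lifetime = new LifetimeMax();
        WindowedMax windowed = new WindowedMax(3);

        // One slow request (500 ms) followed by several fast ones (10 ms).
        long[] latenciesMillis = {500, 10, 10, 10, 10};
        for (long latency : latenciesMillis) {
            lifetime.record(latency);
            windowed.record(latency);
        }

        System.out.println("lifetime max = " + lifetime.max()); // prints "lifetime max = 500"
        System.out.println("windowed max = " + windowed.max()); // prints "windowed max = 10"
    }
}
```

The slow request is still visible in the lifetime view but has aged out of the recency-biased view, which is the kind of surprise the paragraph above warns about.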
18 changes: 18 additions & 0 deletions changelog/@unreleased/pr-1897.v2.yml
@@ -0,0 +1,18 @@
type: improvement
improvement:
description: |
Initial Caffeine CacheStats recorder

Example usage:
```
TaggedMetricRegistry taggedMetricRegistry = ...
Cache<Integer, String> cache = Caffeine.newBuilder()
.recordStats(CacheStats.of(taggedMetricRegistry, "unique-cache-name"))
.build();

LoadingCache<String, Integer> loadingCache = Caffeine.newBuilder()
.recordStats(CacheStats.of(taggedMetricRegistry, "unique-loading-cache-name"))
.build(String::length);
```
links:
- https://github.com/palantir/tritium/pull/1897
2 changes: 2 additions & 0 deletions tritium-caffeine/build.gradle
@@ -1,4 +1,5 @@
apply plugin: 'com.palantir.external-publish-jar'
apply plugin: 'com.palantir.metric-schema'

dependencies {
api 'com.github.ben-manes.caffeine:caffeine'
@@ -11,6 +12,7 @@ dependencies {
implementation 'com.palantir.safe-logging:preconditions'
implementation 'com.palantir.safe-logging:safe-logging'
implementation 'io.dropwizard.metrics:metrics-core'
implementation 'org.checkerframework:checker-qual'

testImplementation 'org.assertj:assertj-core'
testImplementation 'org.awaitility:awaitility'
@@ -0,0 +1,132 @@
/*
* (c) Copyright 2024 Palantir Technologies Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.palantir.tritium.metrics.caffeine;

import com.codahale.metrics.Counting;
import com.codahale.metrics.Meter;
import com.codahale.metrics.Timer;
import com.github.benmanes.caffeine.cache.RemovalCause;
import com.github.benmanes.caffeine.cache.stats.StatsCounter;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Maps;
import com.palantir.logsafe.Safe;
import com.palantir.tritium.metrics.caffeine.CacheMetrics.Load_Result;
import com.palantir.tritium.metrics.registry.TaggedMetricRegistry;
import java.util.Arrays;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;
import org.checkerframework.checker.index.qual.NonNegative;

public final class CacheStats implements StatsCounter, Supplier<StatsCounter> {
private final String name;
private final Meter hitMeter;
private final Meter missMeter;
private final Timer loadSuccessTimer;
private final Timer loadFailureTimer;
private final Meter evictionsTotalMeter;
private final ImmutableMap<RemovalCause, Meter> evictionMeters;
private final LongAdder totalLoadNanos = new LongAdder();

/**
* Creates a {@link CacheStats} instance that registers metrics for Caffeine cache statistics.
* <p>
* Example usage for a {@link com.github.benmanes.caffeine.cache.Cache} or
* {@link com.github.benmanes.caffeine.cache.LoadingCache}:
* <pre>
* LoadingCache&lt;Integer, String&gt; cache = Caffeine.newBuilder()
* .recordStats(CacheStats.of(taggedMetricRegistry, "your-cache-name"))
* .build(key -&gt; computeSomethingExpensive(key));
* </pre>
* @param taggedMetricRegistry tagged metric registry to add cache metrics
* @param name cache name
* @return Caffeine stats instance to register via
* {@link com.github.benmanes.caffeine.cache.Caffeine#recordStats(Supplier)}.
*/
public static CacheStats of(TaggedMetricRegistry taggedMetricRegistry, @Safe String name) {
return new CacheStats(CacheMetrics.of(taggedMetricRegistry), name);
}

private CacheStats(CacheMetrics metrics, @Safe String name) {
this.name = name;
this.hitMeter = metrics.hit(name);
this.missMeter = metrics.miss(name);
this.loadSuccessTimer =
metrics.load().cache(name).result(Load_Result.SUCCESS).build();
this.loadFailureTimer =
metrics.load().cache(name).result(Load_Result.FAILURE).build();
this.evictionsTotalMeter = metrics.eviction(name);
this.evictionMeters = Arrays.stream(RemovalCause.values())
.collect(Maps.toImmutableEnumMap(cause -> cause, cause -> metrics.evictions()
.cache(name)
.cause(cause.toString())
.build()));
}

@Override
public StatsCounter get() {
return this;
}

@Override
public void recordHits(@NonNegative int count) {
hitMeter.mark(count);
}

@Override
public void recordMisses(@NonNegative int count) {
missMeter.mark(count);
}

@Override
public void recordLoadSuccess(@NonNegative long loadTime) {
loadSuccessTimer.update(loadTime, TimeUnit.NANOSECONDS);
totalLoadNanos.add(loadTime);
}

@Override
public void recordLoadFailure(@NonNegative long loadTime) {
loadFailureTimer.update(loadTime, TimeUnit.NANOSECONDS);
totalLoadNanos.add(loadTime);
// Review comment (Contributor): I suppose we could make the meter into a timer, which also provides a “count” value in addition to percentiles, though that means more timeseries.
}

@Override
public void recordEviction(@NonNegative int weight, RemovalCause cause) {
Meter counter = evictionMeters.get(cause);
if (counter != null) {
counter.mark(weight);
}
evictionsTotalMeter.mark(weight);
}

@Override
public com.github.benmanes.caffeine.cache.stats.CacheStats snapshot() {
return com.github.benmanes.caffeine.cache.stats.CacheStats.of(
hitMeter.getCount(),
missMeter.getCount(),
loadSuccessTimer.getCount(),
loadFailureTimer.getCount(),
totalLoadNanos.sum(),
evictionsTotalMeter.getCount(),
evictionMeters.values().stream().mapToLong(Counting::getCount).sum());
}

@Override
public String toString() {
return name + ": " + snapshot();
}
}
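
The eviction bookkeeping in the recorder above follows a simple pattern: one counter per removal cause, registered up front, plus a running total that is updated alongside it. A minimal JDK-only sketch of that pattern is shown below. `EvictionCounters` and its `Cause` enum are hypothetical names for illustration; they stand in for Caffeine's `RemovalCause` and the Dropwizard meters and are not part of Tritium.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for per-cause eviction bookkeeping; not part of Tritium or Caffeine. */
final class EvictionCounters {

    /** Hypothetical subset of removal causes, mirroring the idea of Caffeine's RemovalCause. */
    enum Cause { EXPLICIT, REPLACED, EXPIRED, SIZE }

    private final Map<Cause, LongAdder> byCause;
    private final LongAdder total = new LongAdder();

    EvictionCounters() {
        // Pre-register one counter per cause so the hot path never mutates the map.
        EnumMap<Cause, LongAdder> counters = new EnumMap<>(Cause.class);
        for (Cause cause : Cause.values()) {
            counters.put(cause, new LongAdder());
        }
        this.byCause = Collections.unmodifiableMap(counters);
    }

    /** Adds the evicted weight to both the per-cause counter and the running total. */
    void recordEviction(int weight, Cause cause) {
        byCause.get(cause).add(weight);
        total.add(weight);
    }

    long count(Cause cause) {
        return byCause.get(cause).sum();
    }

    long total() {
        return total.sum();
    }

    public static void main(String[] args) {
        EvictionCounters counters = new EvictionCounters();
        counters.recordEviction(2, Cause.SIZE);
        counters.recordEviction(3, Cause.EXPIRED);
        counters.recordEviction(1, Cause.SIZE);

        System.out.println("SIZE = " + counters.count(Cause.SIZE));   // prints "SIZE = 3"
        System.out.println("total = " + counters.total());            // prints "total = 6"
    }
}
```

As in the recorder above, both counters accumulate the evicted weight rather than the number of eviction events, so the two figures only coincide when every entry has a weight of 1.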
@@ -33,6 +33,7 @@
import com.palantir.tritium.metrics.registry.TaggedMetricRegistry;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.function.Supplier;

public final class CaffeineCacheStats {

@@ -43,7 +44,7 @@ private CaffeineCacheStats() {}

/**
* Register specified cache with the given metric registry.
*
* <p>
* Callers should ensure that they have {@link Caffeine#recordStats() enabled stats recording}
* {@code Caffeine.newBuilder().recordStats()} otherwise there are no cache metrics to register.
*
@@ -69,14 +70,17 @@ public static void registerCache(MetricRegistry registry, Cache<?, ?> cache, Str

/**
* Register specified cache with the given metric registry.
*
* <p>
* Callers should ensure that they have {@link Caffeine#recordStats() enabled stats recording}
* {@code Caffeine.newBuilder().recordStats()} otherwise there are no cache metrics to register.
*
* @param registry metric registry
* @param cache cache to instrument
* @param name cache name
* <p>
* Soon to be deprecated; prefer {@link Caffeine#recordStats(Supplier)} with {@link CacheStats#of(TaggedMetricRegistry, String)}.
*/
// Soon to be @Deprecated
public static void registerCache(TaggedMetricRegistry registry, Cache<?, ?> cache, @Safe String name) {
checkNotNull(registry, "registry");
checkNotNull(cache, "cache");
37 changes: 37 additions & 0 deletions tritium-caffeine/src/main/metrics/cache-metrics.yml
@@ -0,0 +1,37 @@
options:
javaPackage: com.palantir.tritium.metrics.caffeine
javaVisibility: packagePrivate
namespaces:
cache:
docs: Cache statistic metrics
metrics:
hit:
type: meter
tags: [cache]
docs: Count of cache hits
miss:
type: meter
tags: [cache]
docs: Count of cache misses
load:
type: timer
tags:
- name: cache
- name: result
values: [success, failure]
docs: Time taken by cache loads, tagged by result (success or failure)
evictions:
type: meter
tags: [cache, cause]
docs: Count of evicted entries by cause
eviction:
type: meter
tags: [cache]
docs: Total count of evicted entries
stats.disabled:
type: meter
tags: [cache]
docs: |
Registered cache does not have stats recording enabled, so stats will always be zero.
To enable cache metrics, stats recording must be enabled when constructing the cache:
Caffeine.newBuilder().recordStats()
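
Consumers of the `hit` and `miss` meters defined above will often want a hit rate derived from the two counts. A minimal sketch of that arithmetic follows. `HitRate` is a hypothetical helper, not part of Tritium, and returning 1.0 when there have been no requests is an assumption that mirrors the convention documented for Caffeine's `CacheStats.hitRate()`.

```java
/** Hypothetical helper for illustration; not part of Tritium or Caffeine. */
final class HitRate {
    private HitRate() {}

    /** Returns hits / (hits + misses), or 1.0 when no requests have been recorded. */
    static double of(long hits, long misses) {
        long requests = hits + misses;
        return requests == 0 ? 1.0 : (double) hits / requests;
    }

    public static void main(String[] args) {
        System.out.println(HitRate.of(3, 1)); // prints 0.75
        System.out.println(HitRate.of(0, 0)); // prints 1.0
    }
}
```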