Track schema registry usage #9974
Merged: piochelepiotr merged 13 commits into master from piotr.wolski/track-schema-registry-usage on Nov 19, 2025.

Commits (13)
07e5e9a  Track schema registry usage
b139559  fix cluster ID tagging for consumer
b362939  Easier instrumentation
7d26fe3  fix deserialize instrumentation
867cfa9  Add back Kafka cluster ID tagging
1043c98  Make jacoco pass
e45acd9  exclude schema usage from test coverage
324c0b9  Don't use reflection
77b92a5  add Kafka cluster ID instrumentation to Kafka 3.8
c67fe80  Fix instrumentation crash
a74e7f9  rename setSchemaRegistryUsage
026f824  Add helper for extracting schema ID
325ae25  Merge branch 'master' into piotr.wolski/track-schema-registry-usage

All commits by piochelepiotr.
```diff
@@ -46,6 +46,10 @@ out/
 ######################
 .vscode

+# Cursor #
+##########
+.cursor
+
 # Others #
 ##########
 /logs/*
```
...gent/instrumentation/confluent-schema-registry/confluent-schema-registry-7.0/build.gradle (26 additions, 0 deletions)

```gradle
apply from: "$rootDir/gradle/java.gradle"

muzzle {
  pass {
    group = "io.confluent"
    module = "kafka-schema-registry-client"
    versions = "[7.0.0,)"
    assertInverse = true
  }
}

dependencies {
  compileOnly project(':dd-java-agent:instrumentation:kafka:kafka-common')
  compileOnly group: 'io.confluent', name: 'kafka-schema-registry-client', version: '7.0.0'
  compileOnly group: 'io.confluent', name: 'kafka-avro-serializer', version: '7.0.0'
  compileOnly group: 'io.confluent', name: 'kafka-protobuf-serializer', version: '7.0.0'
  compileOnly group: 'org.apache.kafka', name: 'kafka-clients', version: '3.0.0'

  testImplementation project(':dd-java-agent:instrumentation:kafka:kafka-common')
  testImplementation group: 'io.confluent', name: 'kafka-schema-registry-client', version: '7.5.2'
  testImplementation group: 'io.confluent', name: 'kafka-avro-serializer', version: '7.5.2'
  testImplementation group: 'io.confluent', name: 'kafka-protobuf-serializer', version: '7.5.1'
  testImplementation group: 'org.apache.kafka', name: 'kafka-clients', version: '3.5.0'
  testImplementation group: 'org.apache.avro', name: 'avro', version: '1.11.0'
}
```
...tadog/trace/instrumentation/confluentschemaregistry/KafkaDeserializerInstrumentation.java (115 additions, 0 deletions)

```java
package datadog.trace.instrumentation.confluentschemaregistry;

import static datadog.trace.agent.tooling.bytebuddy.matcher.NameMatchers.named;
import static net.bytebuddy.matcher.ElementMatchers.isMethod;
import static net.bytebuddy.matcher.ElementMatchers.isPublic;
import static net.bytebuddy.matcher.ElementMatchers.takesArgument;
import static net.bytebuddy.matcher.ElementMatchers.takesArguments;

import com.google.auto.service.AutoService;
import datadog.trace.agent.tooling.Instrumenter;
import datadog.trace.agent.tooling.Instrumenter.MethodTransformer;
import datadog.trace.agent.tooling.InstrumenterModule;
import datadog.trace.bootstrap.InstrumentationContext;
import datadog.trace.bootstrap.instrumentation.api.AgentTracer;
import datadog.trace.instrumentation.kafka_common.ClusterIdHolder;
import java.util.HashMap;
import java.util.Map;
import net.bytebuddy.asm.Advice;
import org.apache.kafka.common.serialization.Deserializer;

/**
 * Instruments Confluent Schema Registry deserializers (Avro, Protobuf, and JSON) to capture
 * deserialization operations.
 */
@AutoService(InstrumenterModule.class)
public class KafkaDeserializerInstrumentation extends InstrumenterModule.Tracing
    implements Instrumenter.ForKnownTypes, Instrumenter.HasMethodAdvice {

  public KafkaDeserializerInstrumentation() {
    super("confluent-schema-registry", "kafka");
  }

  @Override
  public String[] knownMatchingTypes() {
    return new String[] {
      "io.confluent.kafka.serializers.KafkaAvroDeserializer",
      "io.confluent.kafka.serializers.json.KafkaJsonSchemaDeserializer",
      "io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer"
    };
  }

  @Override
  public String[] helperClassNames() {
    return new String[] {
      "datadog.trace.instrumentation.kafka_common.ClusterIdHolder",
      packageName + ".SchemaIdExtractor"
    };
  }

  @Override
  public Map<String, String> contextStore() {
    Map<String, String> contextStores = new HashMap<>();
    contextStores.put("org.apache.kafka.common.serialization.Deserializer", "java.lang.Boolean");
    return contextStores;
  }

  @Override
  public void methodAdvice(MethodTransformer transformer) {
    // Instrument configure to capture the isKey value
    transformer.applyAdvice(
        isMethod()
            .and(named("configure"))
            .and(isPublic())
            .and(takesArguments(2))
            .and(takesArgument(1, boolean.class)),
        getClass().getName() + "$ConfigureAdvice");

    // Instrument deserialize(String topic, Headers headers, byte[] data).
    // The 2-arg version calls this one, so instrumenting only this overload avoids duplicates.
    transformer.applyAdvice(
        isMethod()
            .and(named("deserialize"))
            .and(isPublic())
            .and(takesArguments(3))
            .and(takesArgument(0, String.class))
            .and(takesArgument(2, byte[].class)),
        getClass().getName() + "$DeserializeAdvice");
  }

  public static class ConfigureAdvice {
    @Advice.OnMethodExit(suppress = Throwable.class)
    public static void onExit(
        @Advice.This Deserializer deserializer, @Advice.Argument(1) boolean isKey) {
      // Store the isKey value in InstrumentationContext for later use
      InstrumentationContext.get(Deserializer.class, Boolean.class).put(deserializer, isKey);
    }
  }

  public static class DeserializeAdvice {
    @Advice.OnMethodExit(onThrowable = Throwable.class, suppress = Throwable.class)
    public static void onExit(
        @Advice.This Deserializer deserializer,
        @Advice.Argument(0) String topic,
        @Advice.Argument(2) byte[] data,
        @Advice.Return Object result,
        @Advice.Thrown Throwable throwable) {

      // Get isKey from InstrumentationContext (stored during configure)
      Boolean isKeyObj =
          InstrumentationContext.get(Deserializer.class, Boolean.class).get(deserializer);
      boolean isKey = isKeyObj != null && isKeyObj;

      // Get cluster ID from thread-local (set by Kafka consumer instrumentation)
      String clusterId = ClusterIdHolder.get();

      boolean isSuccess = throwable == null;
      int schemaId = isSuccess ? SchemaIdExtractor.extractSchemaId(data) : -1;

      // Record the schema registry usage
      AgentTracer.get()
          .getDataStreamsMonitoring()
          .reportSchemaRegistryUsage(topic, clusterId, schemaId, isSuccess, isKey, "deserialize");
    }
  }
}
```
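The configure/deserialize handshake above can be sketched without the agent tooling: a plain map stands in for `InstrumentationContext`, and the null-safe unboxing mirrors the advice's default of `isKey = false` when `configure` was never intercepted. The class and method names here are illustrative, not part of the tracer.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class IsKeyContextSketch {
  // Stand-in for InstrumentationContext.get(Deserializer.class, Boolean.class)
  private static final Map<Object, Boolean> CONTEXT = new IdentityHashMap<>();

  // Mirrors ConfigureAdvice: remember isKey per deserializer instance
  public static void onConfigure(Object deserializer, boolean isKey) {
    CONTEXT.put(deserializer, isKey);
  }

  // Mirrors DeserializeAdvice: null-safe lookup, defaulting to false
  public static boolean isKeyFor(Object deserializer) {
    Boolean isKeyObj = CONTEXT.get(deserializer);
    return isKeyObj != null && isKeyObj;
  }

  public static void main(String[] args) {
    Object keyDeserializer = new Object();
    Object unconfigured = new Object();
    onConfigure(keyDeserializer, true);
    System.out.println(isKeyFor(keyDeserializer)); // true
    System.out.println(isKeyFor(unconfigured));    // false: never configured
  }
}
```

An identity map is used because the real context store keys on object identity, not `equals`.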
...datadog/trace/instrumentation/confluentschemaregistry/KafkaSerializerInstrumentation.java (114 additions, 0 deletions)

```java
package datadog.trace.instrumentation.confluentschemaregistry;

import static datadog.trace.agent.tooling.bytebuddy.matcher.NameMatchers.named;
import static net.bytebuddy.matcher.ElementMatchers.isMethod;
import static net.bytebuddy.matcher.ElementMatchers.isPublic;
import static net.bytebuddy.matcher.ElementMatchers.returns;
import static net.bytebuddy.matcher.ElementMatchers.takesArgument;
import static net.bytebuddy.matcher.ElementMatchers.takesArguments;

import com.google.auto.service.AutoService;
import datadog.trace.agent.tooling.Instrumenter;
import datadog.trace.agent.tooling.Instrumenter.MethodTransformer;
import datadog.trace.agent.tooling.InstrumenterModule;
import datadog.trace.bootstrap.InstrumentationContext;
import datadog.trace.bootstrap.instrumentation.api.AgentTracer;
import datadog.trace.instrumentation.kafka_common.ClusterIdHolder;
import java.util.HashMap;
import java.util.Map;
import net.bytebuddy.asm.Advice;
import org.apache.kafka.common.serialization.Serializer;

/**
 * Instruments Confluent Schema Registry serializers (Avro, Protobuf, and JSON) to capture
 * serialization operations.
 */
@AutoService(InstrumenterModule.class)
public class KafkaSerializerInstrumentation extends InstrumenterModule.Tracing
    implements Instrumenter.ForKnownTypes, Instrumenter.HasMethodAdvice {

  public KafkaSerializerInstrumentation() {
    super("confluent-schema-registry", "kafka");
  }

  @Override
  public String[] knownMatchingTypes() {
    return new String[] {
      "io.confluent.kafka.serializers.KafkaAvroSerializer",
      "io.confluent.kafka.serializers.json.KafkaJsonSchemaSerializer",
      "io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer"
    };
  }

  @Override
  public String[] helperClassNames() {
    return new String[] {
      "datadog.trace.instrumentation.kafka_common.ClusterIdHolder",
      packageName + ".SchemaIdExtractor"
    };
  }

  @Override
  public Map<String, String> contextStore() {
    Map<String, String> contextStores = new HashMap<>();
    contextStores.put("org.apache.kafka.common.serialization.Serializer", "java.lang.Boolean");
    return contextStores;
  }

  @Override
  public void methodAdvice(MethodTransformer transformer) {
    // Instrument configure to capture the isKey value
    transformer.applyAdvice(
        isMethod()
            .and(named("configure"))
            .and(isPublic())
            .and(takesArguments(2))
            .and(takesArgument(1, boolean.class)),
        getClass().getName() + "$ConfigureAdvice");

    // Instrument both serialize(String topic, Object data)
    // and serialize(String topic, Headers headers, Object data) for Kafka 2.1+
    transformer.applyAdvice(
        isMethod()
            .and(named("serialize"))
            .and(isPublic())
            .and(takesArgument(0, String.class))
            .and(returns(byte[].class)),
        getClass().getName() + "$SerializeAdvice");
  }

  public static class ConfigureAdvice {
    @Advice.OnMethodExit(suppress = Throwable.class)
    public static void onExit(
        @Advice.This Serializer serializer, @Advice.Argument(1) boolean isKey) {
      // Store the isKey value in InstrumentationContext for later use
      InstrumentationContext.get(Serializer.class, Boolean.class).put(serializer, isKey);
    }
  }

  public static class SerializeAdvice {
    @Advice.OnMethodExit(onThrowable = Throwable.class, suppress = Throwable.class)
    public static void onExit(
        @Advice.This Serializer serializer,
        @Advice.Argument(0) String topic,
        @Advice.Return byte[] result,
        @Advice.Thrown Throwable throwable) {

      // Get isKey from InstrumentationContext (stored during configure)
      Boolean isKeyObj =
          InstrumentationContext.get(Serializer.class, Boolean.class).get(serializer);
      boolean isKey = isKeyObj != null && isKeyObj;

      // Get cluster ID from thread-local (set by Kafka producer instrumentation)
      String clusterId = ClusterIdHolder.get();

      boolean isSuccess = throwable == null;
      int schemaId = isSuccess ? SchemaIdExtractor.extractSchemaId(result) : -1;

      // Record the schema registry usage
      AgentTracer.get()
          .getDataStreamsMonitoring()
          .reportSchemaRegistryUsage(topic, clusterId, schemaId, isSuccess, isKey, "serialize");
    }
  }
}
```
...rc/main/java/datadog/trace/instrumentation/confluentschemaregistry/SchemaIdExtractor.java (23 additions, 0 deletions)

```java
package datadog.trace.instrumentation.confluentschemaregistry;

/**
 * Helper class to extract the schema ID from the Confluent Schema Registry wire format. Wire
 * format: [magic_byte][4-byte schema id][data]
 */
public class SchemaIdExtractor {
  public static int extractSchemaId(byte[] data) {
    if (data == null || data.length < 5 || data[0] != 0) {
      return -1;
    }

    try {
      // Confluent wire format: [magic_byte][4-byte schema id][data]
      return ((data[1] & 0xFF) << 24)
          | ((data[2] & 0xFF) << 16)
          | ((data[3] & 0xFF) << 8)
          | (data[4] & 0xFF);
    } catch (Throwable ignored) {
      return -1;
    }
  }
}
```
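The wire-format parsing above can be exercised with a standalone round trip: encode a magic byte, a big-endian 4-byte schema ID, and a payload, then extract the ID back. The extraction logic is reproduced from `SchemaIdExtractor` so the demo compiles on its own; the `encode` helper is an illustration, not part of the PR.

```java
import java.nio.ByteBuffer;

public class WireFormatDemo {
  // Same logic as SchemaIdExtractor.extractSchemaId, reproduced for a standalone demo
  public static int extractSchemaId(byte[] data) {
    if (data == null || data.length < 5 || data[0] != 0) {
      return -1; // not Confluent wire format
    }
    return ((data[1] & 0xFF) << 24)
        | ((data[2] & 0xFF) << 16)
        | ((data[3] & 0xFF) << 8)
        | (data[4] & 0xFF);
  }

  // Build a wire-format message for a given schema ID (hypothetical helper)
  public static byte[] encode(int schemaId, byte[] payload) {
    ByteBuffer buf = ByteBuffer.allocate(5 + payload.length);
    buf.put((byte) 0); // magic byte
    buf.putInt(schemaId); // ByteBuffer writes big-endian by default
    buf.put(payload);
    return buf.array();
  }

  public static void main(String[] args) {
    byte[] msg = encode(123456, new byte[] {1, 2, 3});
    System.out.println(extractSchemaId(msg)); // 123456
    System.out.println(extractSchemaId(new byte[] {9, 0, 0})); // -1: wrong magic byte / too short
  }
}
```

Returning -1 for malformed input matches the advice's behavior of reporting failed (de)serializations without a schema ID.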
Review conversation

Reviewer: [nit] Should we clear the cluster ID from the holder? Just to be extra careful. As of today, I do not see a path that can bring us here without having passed through `poll` (which sets the new value)... but just in case of API changes in the future.

piochelepiotr: We are clearing it in the place we are setting it. For the producer, we could clear it after using it, but for the consumer that's not possible, because you can poll a batch of messages; if we cleared it after using it, we would only have the Kafka cluster ID for the first message. So for consistency, I was thinking of only setting the Kafka cluster ID from the Kafka instrumentation. What do you think?

Reviewer: Makes total sense about the issue with batching. Unsure of what you mean, but the current implementation (setting on `poll` enter + the `ClusterIdHolder.clear()` that you just added to the `poll` exit) feels like enough to me. Do you have a way in mind to improve on it?

piochelepiotr: No, that's what I was thinking of 👍
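The set-on-`poll`-enter / clear-on-`poll`-exit pattern the thread converges on can be sketched as a standalone thread-local holder. The real `ClusterIdHolder` lives in `dd-java-agent:instrumentation:kafka:kafka-common` and its exact API is not shown in this diff, so this is an illustrative version of the contract; the cluster ID value is made up.

```java
public class ClusterIdHolderSketch {
  private static final ThreadLocal<String> CLUSTER_ID = new ThreadLocal<>();

  public static void set(String clusterId) {
    CLUSTER_ID.set(clusterId);
  }

  public static String get() {
    return CLUSTER_ID.get();
  }

  public static void clear() {
    CLUSTER_ID.remove();
  }

  public static void main(String[] args) {
    // poll() enter: consumer instrumentation records the cluster ID
    set("lkc-abc123"); // hypothetical cluster ID
    // every record in the batch is deserialized on the same thread and can read it
    System.out.println(get()); // lkc-abc123
    // poll() exit: holder is cleared so a stale ID never leaks to unrelated code
    clear();
    System.out.println(get()); // null
  }
}
```

Clearing on `poll` exit rather than after each record is what makes the batch case work: all deserializations in one batch see the same cluster ID, and nothing survives past the poll.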