-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(observability): micrometer telemetry provider #1089
feat(observability): micrometer telemetry provider #1089
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a great start. I've left several comments/questions on the diff itself but I also note there's no new tests for verifying correctness. Please add additional tests to cover the new provider.
@ExperimentalApi | ||
public class MicrometerTelemetryProvider( | ||
meterRegistry: MeterRegistry, | ||
override val loggerProvider: LoggerProvider = GlobalTelemetryProvider.instance.loggerProvider, | ||
) : TelemetryProvider { | ||
override val meterProvider: MeterProvider = MicrometerMeterProvider(meterRegistry) | ||
override val tracerProvider: TracerProvider = TracerProvider.None | ||
override val contextManager: ContextManager = ContextManager.None | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like Micrometer supports more than just metrics, it also supports tracing (either directly or via bridges to other providers like OTel and Brave) and context propagation—which seem like natural fits for new TracingProvider
and ContextManager
implementations. Have you looked yet at what it would take to add support for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At my org we use another tool for tracing, so atm I'm only interested in metrics
|
||
@ExperimentalApi | ||
public class MicrometerTelemetryProvider( | ||
meterRegistry: MeterRegistry, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Would it make sense for meterRegistry
to default to Metrics.globalRegistry
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, why not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal class MicrometerMeterProvider(meterRegistry: MeterRegistry) : MeterProvider { | ||
private val meter = MicrometerMeter(meterRegistry) | ||
|
||
override fun getOrCreateMeter(scope: String): Meter = meter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: The scope
parameter is intended to delineate which component of the SDK is under observation by specific meters. For instance, meters related to individual calls are given the scope aws.sdk.kotlin.<service-name>
whereas meters related to the default HTTP engine are given the scope aws.smithy.kotlin.runtime.http.engine.okhttp
. Could the scope be used here, perhaps as tags attached to newly-created meters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...micrometer/jvm/src/aws/smithy/kotlin/runtime/telemetry/micrometer/MicrometerMeterProvider.kt
Show resolved
Hide resolved
private class MicrometerUpDownCounter( | ||
private val meterMetadata: MeterMetadata, | ||
private val meterRegistry: MeterRegistry, | ||
) : UpDownCounter { | ||
override fun add(value: Long, attributes: Attributes, context: Context?) { | ||
meterMetadata | ||
.counter() | ||
.tags(attributes.toTags()) | ||
.register(meterRegistry) | ||
.increment(value.toDouble()) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correctness: The UpDownCounter
type accepts both increments and decrements (e.g., something like queue length which could rise/fall over time). Micrometers' Counter
seems to accept only positive increments. What will happen if MicrometerUpDownCounter.add
gets called with a negative value
? If the underlying Micrometer implementation would throw an exception or otherwise misbehave then it's better to make UpDownCounter
a no-op implementation (i.e., createUpDownCounter
returns a dummy object that does nothing when add
is called).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would work fine, you can use this snippet to check it yourself with a Kotlin scratch file in IntelliJ:
import io.micrometer.core.instrument.simple.SimpleMeterRegistry
val registry = SimpleMeterRegistry()
val counter = registry.counter("test.counter")
counter.increment(3.0)
println(counter.count())
counter.increment(-2.0)
println(counter.count())
Will produce:
3.0
1.0
...micrometer/jvm/src/aws/smithy/kotlin/runtime/telemetry/micrometer/MicrometerMeterProvider.kt
Outdated
Show resolved
Hide resolved
runtime/observability/telemetry-provider-micrometer/build.gradle.kts
Outdated
Show resolved
Hide resolved
...micrometer/jvm/src/aws/smithy/kotlin/runtime/telemetry/micrometer/MicrometerMeterProvider.kt
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with this a bit today and ran some sample tests using the new Micrometer telemetry provider with the Cloudwatch backend. I'm seeing metrics named with suffixes that aren't specified in the code. For instance, during a simple S3 ListBuckets
call, I saw the following metrics emitted into Cloudwatch:
smithy.client.call.attempt_duration.count
smithy.client.call.attempt_duration.sum
smithy.client.call.attempt_overhead_duration.count
smithy.client.call.attempt_overhead_duration.sum
smithy.client.call.attempts.count
smithy.client.call.auth.resolve_identity_duration.count
smithy.client.call.auth.resolve_identity_duration.sum
smithy.client.call.auth.signing_duration.count
smithy.client.call.auth.signing_duration.sum
smithy.client.call.deserialization_duration.count
smithy.client.call.deserialization_duration.sum
smithy.client.call.duration.count
smithy.client.call.duration.sum
smithy.client.call.resolve_endpoint_duration.count
smithy.client.call.resolve_endpoint_duration.sum
smithy.client.call.serialization_duration.count
smithy.client.call.serialization_duration.sum
Note that every metric name is suffixed with either .count
or .sum
but the metrics emitted by the SDK into telemetry providers don't include those.
Furthermore, none of those emitted metrics showed any non-zero values except for smithy.client.call.attempts.count
. See image below and/or this CSV file.
I'm not familiar enough with Micrometer to understand what's going wrong but I'm hoping you have some insight @monosoul!
My test code
val cwConfig = object : CloudWatchConfig {
override fun get(s: String) = null
override fun namespace() = "ian.micrometer.test"
}
val registry = CloudWatchMeterRegistry(cwConfig, Clock.SYSTEM, CloudWatchAsyncClient.create())
val telProvider = MicrometerTelemetryProvider(registry)
S3Client.fromEnvironment {
telemetryProvider = telProvider
}.use { s3 ->
s3.listBuckets()
}
registry.close()
@monosoul have you had a chance to dig deeper into the metric naming issues? |
@ianbotsf sorry, I was away for a few weeks. Metrics with |
…etry-provider Conflicts: gradle/libs.versions.toml
@ianbotsf I used this snippet to try it locally (uses testcontainers to spin up localstack instance): import aws.sdk.kotlin.runtime.auth.credentials.StaticCredentialsProvider
import aws.sdk.kotlin.services.s3.S3Client
import aws.sdk.kotlin.services.s3.createBucket
import aws.smithy.kotlin.runtime.ExperimentalApi
import aws.smithy.kotlin.runtime.net.url.Url
import aws.smithy.kotlin.runtime.telemetry.micrometer.MicrometerTelemetryProvider
import io.micrometer.core.instrument.Clock
import io.micrometer.core.instrument.logging.LoggingMeterRegistry
import io.micrometer.core.instrument.logging.LoggingRegistryConfig
import kotlinx.coroutines.runBlocking
import org.testcontainers.containers.localstack.LocalStackContainer
import org.testcontainers.utility.DockerImageName
import java.time.Duration
fun main(): Unit = runBlocking {
val localstack = LocalStackContainer(DockerImageName.parse("localstack/localstack:3.5.0"))
.withServices(LocalStackContainer.Service.S3)
.also { it.start() }
val config = object : LoggingRegistryConfig {
override fun get(p0: String): String? = null
override fun step(): Duration = Duration.ofSeconds(1)
}
val registry = LoggingMeterRegistry(config, Clock.SYSTEM)
val telProvider = MicrometerTelemetryProvider(registry)
S3Client.fromEnvironment {
endpointUrl = Url.parse(localstack.getEndpointOverride(LocalStackContainer.Service.S3).toString())
credentialsProvider = StaticCredentialsProvider {
accessKeyId = localstack.accessKey
secretAccessKey = localstack.secretKey
}
region = localstack.region
telemetryProvider = telProvider
httpClient {
telemetryProvider = telProvider
}
}.use { s3 ->
s3.createBucket {
bucket = "mybucket"
}
s3.listBuckets()
}
registry.stop()
Thread.sleep(2000)
registry.close()
} Here's what I got as the output:
UPD: nvm, there was a missing call: aba4715 |
@ianbotsf I can add tests as well if you're fine with the current implementation |
Looks like that was it. With the latest commits I now see the full suite of expected metrics now from my test code!
Yes, that sounds great. Thank you! |
…etry-provider Conflicts: gradle/libs.versions.toml
@ianbotsf I've added tests and resolved conflicts with main branch, should be good to merge now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! I'll let @lauzadis chime in as well and then we can get this shipped,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks for the contribution!
OK this has been merged to main and should be available in the next release, tentatively scheduled for tomorrow. Thanks again for the contribution! |
This is a PoC telemetry provider implementation using micrometer
Issue #
Fixes #1087
Description of changes
Introduces telemetry provider using micrometer as a backend
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.