Skip to content

fix(gax): record fractional latency metrics#12979

Merged
blakeli0 merged 3 commits intomainfrom
fix/gax-fractional-latency-metrics
May 1, 2026
Merged

fix(gax): record fractional latency metrics#12979
blakeli0 merged 3 commits intomainfrom
fix/gax-fractional-latency-metrics

Conversation

@rahul2393
Copy link
Copy Markdown
Contributor

@rahul2393 rahul2393 commented May 1, 2026

Summary

Record GAX operation and attempt latency metrics with fractional millisecond precision instead of truncating to whole milliseconds.

Previously MetricsTracer used:

stopwatch.elapsed(TimeUnit.MILLISECONDS)

This floors sub-millisecond precision before the value is recorded into the histogram. For low-latency RPCs, this can make Cloud Monitoring percentiles look significantly lower than application-observed latency.

For example, a Spanner customer investigation showed:

  • application/client observed P50: ~3.7ms
  • trace-measured GAX operation latency: ~3.9ms
  • exported operation_latencies P50: ~2.9ms

The apparent ~800us P50 gap looked like extra client-side latency in Spanner, but raw metric prints showed values being recorded as whole numbers only (3.0, 4.0, 7.0). The sub-ms precision was lost before histogram bucketization, so this was a rounding/truncation artifact rather than actual Spanner client overhead.

Fix

Use nanosecond elapsed time and convert to fractional milliseconds:

stopwatch.elapsed(TimeUnit.NANOSECONDS) / 1_000_000.0

Applied to:

  • operation latency
  • attempt latency

@rahul2393 rahul2393 requested a review from a team as a code owner May 1, 2026 04:43
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates MetricsTracer to support fractional millisecond latency reporting by calculating elapsed time from nanoseconds instead of using the millisecond precision provided by the Stopwatch class. It also introduces Ticker injection to improve testability, allowing for more precise unit tests using FakeTicker. A review comment suggests replacing the magic number 1,000,000.0 with TimeUnit constants to improve code readability and self-documentation.

I am having trouble creating individual review comments. Click here to see my feedback.

sdk-platform-java/gax-java/gax/src/main/java/com/google/api/gax/tracing/MetricsTracer.java (246)

medium

To improve readability and avoid using a magic number, consider using the TimeUnit class for the conversion. This makes the code more self-documenting about the units involved.

    return stopwatch.elapsed(TimeUnit.NANOSECONDS) / (double) TimeUnit.MILLISECONDS.toNanos(1);

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 1, 2026

Quality Gate Failed Quality Gate failed for 'gapic-generator-java-root'

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 1, 2026

Quality Gate Failed Quality Gate failed for 'gapic-generator-java-root'

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@blakeli0 blakeli0 merged commit d690333 into main May 1, 2026
252 of 255 checks passed
@blakeli0 blakeli0 deleted the fix/gax-fractional-latency-metrics branch May 1, 2026 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants