Built-in metrics fail to export due to instance_id missing after executing a request that would not get sent  #3611

@olavloite

Description

The trigger for this issue report is GoogleCloudPlatform/pgadapter#2760

The built-in metrics exporter checks that all metrics include an instance_id. If at least one metric does not include an instance_id, the entire export is skipped. The metric without an instance_id then remains in the collection of unexported metrics, so all built-in metric exports stop from that point until the client is restarted.
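The effect of this all-or-nothing check can be illustrated with a small, self-contained model. This is only a sketch of the behaviour described above, not the actual exporter code: the class `AllOrNothingExportDemo`, the `pending` list, and the attribute values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of the described behaviour; not the real exporter code.
class AllOrNothingExportDemo {
  // Metrics that have been collected but not yet exported.
  static final List<Map<String, String>> pending = new ArrayList<>();

  // Mimics the described check: a single metric without instance_id skips
  // the whole export. Returns the number of metrics exported in this cycle.
  static int export() {
    for (Map<String, String> attributes : pending) {
      if (!attributes.containsKey("instance_id")) {
        return 0; // nothing is exported and nothing leaves the backlog
      }
    }
    int exported = pending.size();
    pending.clear();
    return exported;
  }

  public static void main(String[] args) {
    pending.add(Map.of("method", "ExecuteSql", "instance_id", "my-instance"));
    System.out.println("cycle 1 exported: " + export());

    // An attempt that never left the client, followed by a healthy one.
    pending.add(Map.of("method", "ExecuteSql"));
    pending.add(Map.of("method", "ExecuteSql", "instance_id", "my-instance"));
    System.out.println("cycle 2 exported: " + export());

    // Later cycles stay blocked because the invalid metric is still pending.
    pending.add(Map.of("method", "Commit", "instance_id", "my-instance"));
    System.out.println("cycle 3 exported: " + export() + ", backlog: " + pending.size());
  }
}
```

In this model the backlog only grows after the first invalid metric, which matches the observation that exports do not resume until the client is restarted.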

A client can easily collect a metric without an instance_id, because the `instance_id` is set in the header interceptor, which is called when the client sends the headers. That never happens if the client cannot establish a network connection to Spanner.
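A simplified simulation of this ordering is shown below. It assumes a single hypothetical method, `attempt`, that stands in for one RPC attempt; the attribute names other than `instance_id` and the exception type are illustrative and are not taken from the client library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the client internals; names are illustrative only.
class HeaderInterceptorDemo {
  // Simulates one RPC attempt and returns the attributes recorded for its metric.
  static Map<String, String> attempt(boolean networkAvailable) {
    Map<String, String> attributes = new HashMap<>();
    attributes.put("method", "ExecuteSql");
    try {
      if (!networkAvailable) {
        // The connection fails before any headers are written.
        throw new IllegalStateException("UNAVAILABLE: no network");
      }
      // Header interceptor step: the only place where instance_id is attached.
      attributes.put("instance_id", "my-instance");
      attributes.put("status", "OK");
    } catch (IllegalStateException e) {
      // The failed attempt is still recorded as a metric.
      attributes.put("status", "UNAVAILABLE");
    }
    return attributes;
  }

  public static void main(String[] args) {
    System.out.println("online has instance_id:  " + attempt(true).containsKey("instance_id"));
    System.out.println("offline has instance_id: " + attempt(false).containsKey("instance_id"));
  }
}
```

The offline attempt is recorded with a status but without an `instance_id`, because the step that would have added it is never reached.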

Copy-paste from the PGAdapter issue:

The problem occurs when a request is sent by PGAdapter (or, more correctly, by the underlying Java client) but never actually leaves the client, for example due to a network problem. The reason is that:

  1. The RPC is collected as a failed attempt and included in the metrics.
  2. However, the instance_id is not set before the request is sent. Instead, that happens in this interceptor when the headers are sent.
  3. If the request is never sent due to a network problem, no instance_id is set, and the metric is added to the collection without an instance_id.
  4. Once that has happened, the client keeps logging the warning, because the entire export is skipped (instead of only the metric without an instance_id).
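Point 4 hints at a narrower alternative: skipping only the metric that lacks an instance_id. The sketch below shows what such per-metric filtering could look like. It is an assumption about one possible approach, not the fix adopted by the library, and `PerMetricFilterDemo` and `drainExportable` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical per-metric filter; an illustration, not the library's implementation.
class PerMetricFilterDemo {
  // Empties the backlog and returns only the metrics that carry instance_id.
  // Metrics without it are discarded so they cannot block later export cycles.
  static List<Map<String, String>> drainExportable(List<Map<String, String>> pending) {
    List<Map<String, String>> exportable = new ArrayList<>();
    for (Map<String, String> attributes : pending) {
      if (attributes.containsKey("instance_id")) {
        exportable.add(attributes);
      }
    }
    pending.clear();
    return exportable;
  }

  public static void main(String[] args) {
    List<Map<String, String>> pending = new ArrayList<>();
    pending.add(Map.of("method", "ExecuteSql", "instance_id", "my-instance"));
    pending.add(Map.of("method", "ExecuteSql")); // attempt that never left the client
    pending.add(Map.of("method", "Commit", "instance_id", "my-instance"));

    List<Map<String, String>> exportable = drainExportable(pending);
    System.out.println("exportable: " + exportable.size() + ", backlog: " + pending.size());
  }
}
```

Discarding the invalid metric loses one data point for a request that never reached Spanner, but it keeps a single failed attempt from halting every subsequent export.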

The easiest way to reproduce the problem (and verify that it indeed happens in the way described above):

  1. Create a client and successfully execute a simple query (this ensures that everything has been initialized and simplifies the next steps).
  2. Set a breakpoint at this line (this is where the client initiates the request)
  3. Set a breakpoint at this line (this is where the headers are being sent)
  4. Disable your network (disable WiFi or unplug your cable).
  5. Try to execute another query. You will see that the breakpoint from step 2 is reached and the breakpoint from step 3 is not. This also means that no instance_id is added to the metric attributes.

Metadata

Labels

api: spanner (Issues related to the googleapis/java-spanner API.)

type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
