[server] Coordinator metrics not updated after restart when no events arrive #3147

@swuferhong

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

main (development)

Please describe the bug 🐞

After a CoordinatorServer restart, once context initialization completes, coordinator-level metrics such as fluss_coordinator_activeTabletServerCount remain at 0 as long as no client requests or events arrive.

Solution

CoordinatorEventThread.doWork() has two issues:

  1. lastMetricsUpdateTime is initialized to System.currentTimeMillis(), so the first doWork() invocation skips the metrics update because less than 5s has elapsed since initialization.

  2. queue.take() blocks indefinitely. If no events are enqueued, the thread never loops back to re-check the metrics update condition.

As a result, metrics are only updated as a side effect of event processing. In an idle coordinator with no incoming requests, the gauges stay at their initial zero values.
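
One possible way to address both points is sketched below. This is not the actual Fluss code: the class IdleMetricsLoopSketch, its constructor parameters, and the put() helper are hypothetical stand-ins, and the 5s interval is taken from the description above. The sketch starts lastMetricsUpdateTime at 0 so the first iteration is always due, and replaces the unbounded take() with a timed poll() so an idle thread still loops back to the metrics check.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the event loop; not the actual CoordinatorEventThread. */
class IdleMetricsLoopSketch {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Runnable metricsUpdater;
    private final long metricsIntervalMs;
    private final long pollTimeoutMs;

    // Start at 0 instead of System.currentTimeMillis(), so the very first
    // doWork() call sees the interval as already elapsed.
    private long lastMetricsUpdateTime = 0L;

    IdleMetricsLoopSketch(Runnable metricsUpdater, long metricsIntervalMs, long pollTimeoutMs) {
        this.metricsUpdater = metricsUpdater;
        this.metricsIntervalMs = metricsIntervalMs;
        this.pollTimeoutMs = pollTimeoutMs;
    }

    /** Enqueues an event (assumed here to be a plain Runnable). */
    void put(Runnable event) {
        queue.add(event);
    }

    /** One loop iteration: refresh metrics if due, then wait a bounded time for an event. */
    void doWork() {
        long now = System.currentTimeMillis();
        if (now - lastMetricsUpdateTime >= metricsIntervalMs) {
            metricsUpdater.run();
            lastMetricsUpdateTime = now;
        }
        try {
            // poll(timeout) returns null when the queue stays empty, so the
            // caller's loop gets back to the metrics check even when idle.
            Runnable event = queue.poll(pollTimeoutMs, TimeUnit.MILLISECONDS);
            if (event != null) {
                event.run();
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt so the owning thread can shut down.
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] updates = {0};
        IdleMetricsLoopSketch loop =
                new IdleMetricsLoopSketch(() -> updates[0]++, 5_000L, 100L);
        loop.doWork(); // no events were enqueued
        System.out.println("metrics updates after one idle iteration: " + updates[0]);
    }
}
```

The poll timeout only needs to be no longer than the metrics interval. A shorter value makes the idle loop wake up more often without changing how frequently the gauges are refreshed.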

Are you willing to submit a PR?

  • I'm willing to submit a PR!
