event service does not send ready/handshake immediately for newly registered or reset dispatchers when no new resolved-ts arrives, causing dispatcher initialization or recovery to stall #4873

@zier-one

Description

What did you do?

Registered a new dispatcher, or reset an existing dispatcher after it had reached the ready stage, while keeping the upstream idle so that no new events advanced the resolved-ts after the register/reset.

What did you expect to see?

The event service should send a ReadyEvent immediately after a dispatcher is registered successfully, and send a HandshakeEvent immediately after a reset switches the dispatcher to a new epoch, so that the event collector can complete the register/reset control-plane transition without waiting for new upstream traffic.
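The expected behavior could be sketched roughly as follows. This is a toy model, not the actual event service code: `EventService`, `Register`, `Reset`, the `Event` struct, and the `sent` slice are all hypothetical names introduced for illustration. Only the names `ReadyEvent` and `HandshakeEvent` come from this report.

```go
package main

import "fmt"

// EventKind and the types below are hypothetical; they only model the
// expected control-plane behavior described in this issue.
type EventKind string

const (
	ReadyEvent     EventKind = "ReadyEvent"
	HandshakeEvent EventKind = "HandshakeEvent"
)

// Event is a simplified control-plane message sent to the event collector.
type Event struct {
	Kind         EventKind
	DispatcherID string
	Epoch        uint64
}

type dispatcherState struct {
	epoch uint64
}

// EventService is a toy stand-in for the event service.
type EventService struct {
	dispatchers map[string]*dispatcherState
	sent        []Event // stands in for messages delivered to the collector
}

func NewEventService() *EventService {
	return &EventService{dispatchers: make(map[string]*dispatcherState)}
}

// Register adds the dispatcher and emits ReadyEvent right away,
// without waiting for any resolved-ts notification.
func (s *EventService) Register(id string) {
	s.dispatchers[id] = &dispatcherState{}
	s.sent = append(s.sent, Event{Kind: ReadyEvent, DispatcherID: id})
}

// Reset switches the dispatcher to a new epoch and emits HandshakeEvent
// right away, again independent of upstream traffic.
func (s *EventService) Reset(id string, epoch uint64) error {
	d, ok := s.dispatchers[id]
	if !ok {
		return fmt.Errorf("dispatcher %q is not registered", id)
	}
	d.epoch = epoch
	s.sent = append(s.sent, Event{Kind: HandshakeEvent, DispatcherID: id, Epoch: epoch})
	return nil
}

func main() {
	svc := NewEventService()
	svc.Register("d1")
	if err := svc.Reset("d1", 1); err != nil {
		panic(err)
	}
	for _, e := range svc.sent {
		fmt.Printf("%s dispatcher=%s epoch=%d\n", e.Kind, e.DispatcherID, e.Epoch)
	}
}
```

In this sketch both control-plane events are produced synchronously by the register/reset calls themselves, so an idle upstream cannot delay them.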

What did you see instead?

Both ReadyEvent and HandshakeEvent were coupled to the onNotify() -> scanReady() path. When no new resolved-ts arrived:

  • the collector never received ReadyEvent during initial registration, so it could not issue the follow-up reset request;
  • after reset, the new epoch never received HandshakeEvent, so the collector could not initialize the new event stream.

As a result, the dispatcher could remain stuck in an uninitialized or post-reset unhandshaked state until another upstream resolved-ts happened to arrive.
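The stall can be reproduced in a simplified model like the one below. This is only a sketch of the coupling described above, under the assumption that control-plane events are emitted solely from the scan path. `coupledService`, `register`, `reset`, and the bookkeeping fields are hypothetical. `onNotify` and `scanReady` borrow their names from this report but do not reflect the real signatures.

```go
package main

import "fmt"

// dispatcher tracks which control-plane events have been sent (hypothetical).
type dispatcher struct {
	readySent     bool
	handshakeSent bool
	epoch         uint64 // 0 means "never reset" in this toy model
}

// coupledService models the reported behavior: ReadyEvent and HandshakeEvent
// are only produced inside scanReady, which only runs from onNotify.
type coupledService struct {
	dispatchers map[string]*dispatcher
	sent        []string // event names delivered to the collector
}

func newCoupledService() *coupledService {
	return &coupledService{dispatchers: make(map[string]*dispatcher)}
}

// register records the dispatcher but sends nothing by itself.
func (s *coupledService) register(id string) {
	s.dispatchers[id] = &dispatcher{}
}

// reset moves the dispatcher to a new epoch but sends nothing by itself.
func (s *coupledService) reset(id string, epoch uint64) {
	if d, ok := s.dispatchers[id]; ok {
		d.epoch = epoch
		d.handshakeSent = false
	}
}

// onNotify is invoked only when a new resolved-ts arrives from upstream.
func (s *coupledService) onNotify(id string) {
	s.scanReady(id)
}

// scanReady is the only place where control-plane events are emitted.
func (s *coupledService) scanReady(id string) {
	d, ok := s.dispatchers[id]
	if !ok {
		return
	}
	if !d.readySent {
		d.readySent = true
		s.sent = append(s.sent, "ReadyEvent")
		return
	}
	if d.epoch > 0 && !d.handshakeSent {
		d.handshakeSent = true
		s.sent = append(s.sent, "HandshakeEvent")
	}
}

func main() {
	s := newCoupledService()
	s.register("d1")
	fmt.Println("after register, idle upstream:", s.sent)
	s.onNotify("d1") // a resolved-ts finally arrives
	s.reset("d1", 1)
	fmt.Println("after reset, idle upstream:", s.sent)
	s.onNotify("d1") // another resolved-ts arrives
	fmt.Println("after second notification:", s.sent)
}
```

In this model nothing reaches the collector between `register` (or `reset`) and the next `onNotify` call, which mirrors the stuck state when the upstream stays idle.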

Versions of the cluster

master
