Description
Support separate callback handling per change type in RoutingTableProvider, so that it can deliver updated information to its listeners quickly.
CURRENT_STATE is the source of truth for both the router listener and Helix (which uses it to update the EV). If writes were clogging ZK, then both the Helix EV update and the router listener would suffer the same latency, or the EV update would be worse because of the extra hop.
If instead too many participants were reading from ZK, so that the observer (the router) was stuck in the ZK callback queue, then we would see different routers with different refresh times.
For a CurrentStates-based RoutingTableProvider, this is what happens, in order:
- The source of truth changes state.
- ZooKeeper notifies the Helix current-state-based RoutingTableProvider.
- RoutingTableProvider reads the instance changes, instance config changes, and current state changes from ZooKeeper.
- RoutingTableProvider calculates a snapshot and invokes the callback of the Espresso logic with that snapshot.
Solution:
- Update BasicClusterDataCache to refresh selectively. When a change happens, the cache is refreshed only for that change type (e.g. an instance config change). This avoids a full refresh on every change of any type and improves read performance.
- Improve RoutingTableProvider.queueEvent() and RoutingTableProvider.handleEvent(). Before the change, an instanceConfigs update could be stuck behind a currentStates refresh, so the callback waited a long time for its snapshot. After the change, the instanceConfigs snapshot is returned to the callback immediately instead of waiting for the currentStates refresh to complete.
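
To illustrate the selective-refresh idea from the first bullet, here is a minimal, self-contained sketch. It is not the actual BasicClusterDataCache code: the `ChangeType` enum, the `SelectiveRefreshCache` class, and all of its methods are hypothetical names made up for this example, and ZooKeeper reads are stood in for by plain `Supplier` callbacks.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical change types, named after the data mentioned in the description. */
enum ChangeType { INSTANCE_CONFIG, LIVE_INSTANCE, CURRENT_STATE }

/** Illustrative cache that re-reads only the data types marked as changed. */
class SelectiveRefreshCache {
    // Stand-ins for the (expensive) reads from ZooKeeper, one per change type.
    private final Map<ChangeType, Supplier<String>> readers = new EnumMap<>(ChangeType.class);
    // Last value read for each change type.
    private final Map<ChangeType, String> cached = new EnumMap<>(ChangeType.class);
    // Types that must be re-read on the next refresh; everything is dirty at start.
    private final Set<ChangeType> dirty = EnumSet.allOf(ChangeType.class);
    // Read counters, kept only so the effect of selective refresh is observable.
    private final Map<ChangeType, Integer> readCounts = new EnumMap<>(ChangeType.class);

    synchronized void registerReader(ChangeType type, Supplier<String> reader) {
        readers.put(type, reader);
    }

    /** Called from a change callback: marks only the affected type as stale. */
    synchronized void notifyChange(ChangeType type) {
        dirty.add(type);
    }

    /** Re-reads only the dirty types and returns a copy of the cached data. */
    synchronized Map<ChangeType, String> refresh() {
        for (ChangeType type : dirty) {
            Supplier<String> reader = readers.get(type);
            if (reader != null) {
                cached.put(type, reader.get());
                readCounts.merge(type, 1, Integer::sum);
            }
        }
        dirty.clear();
        return new EnumMap<>(cached);
    }

    synchronized int readCount(ChangeType type) {
        return readCounts.getOrDefault(type, 0);
    }
}
```

In this sketch, calling `notifyChange(ChangeType.INSTANCE_CONFIG)` followed by `refresh()` re-reads only the instance configs; the current states are served from the cached copy. That is the property the second bullet relies on: a snapshot for an instance config change can be built without waiting for a current state read.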