Propose design for aggregated cluster view service #266

zhan849 · 2018-08-21T02:17:10Z

This PR adds a design doc for the aggregated cluster view service.

kishoreg · 2018-10-21T15:28:17Z

designs/aggregated-cluster-view/design.md

+
+
+## Problem Statement
+We identified a couple of use cases for accessing cross datacenter information. [Ambry](https://github.com/linkedin/ambry) is one of them.


Can you expand more on why Ambry needs this feature?

zhan849 · 2018-10-23

Sure (will also update the design doc about it).

Ambry uses the Helix spectator in both its router (to retry GET requests remotely when they fail locally) and its storage nodes (for data replication). Given the number of clients that need global information, it would be more cost-effective for them if aggregated information were provided locally.

kishoreg · 2018-10-21T15:29:39Z

designs/aggregated-cluster-view/design.md

+
+To provide aggregated cluster view, the solution I'm proposing is to add a special type of cluster, i.e. **View Cluster**.
+View cluster leverages current Helix semantics to store aggregated information of various **Source Clusters**.
+There will be another microservice (Helix View Aggregator) running, fetching information from the clusters to be aggregated (likely in other data centers) and storing it in the view cluster.


Why can't we just set up ZooKeeper observers?

zhan849 · 2018-10-23

Setting up observers local to the clients could reduce cross-data-center traffic, but it has a few drawbacks:

- All data changes are propagated immediately. If the information is not needed that frequently, the extra traffic is wasted. Building a service makes it possible to customize the aggregation granularity.
- Using ZooKeeper observers leaves the aggregation logic to the client. Providing already-aggregated data makes it easier for users to consume.
- Building a service leaves room to customize the aggregated data in the future, e.g. if we want to aggregate the ideal state, we might not need to aggregate the preference lists.

Will add these points to the design doc.
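
To make the difference from plain observers concrete, here is a rough sketch of the two knobs discussed above: a refresh period (aggregation granularity) and a choice of which property types to copy. Everything in it is an assumption for illustration only: `ViewAggregator`, `maybe_refresh`, the snapshot shape, and the property names `external_view` and `ideal_state` are hypothetical and are not part of Helix or of the design doc.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

# Hypothetical snapshot shape: {property_type: {resource_name: payload}}
Snapshot = Dict[str, Dict[str, dict]]


@dataclass
class ViewAggregator:
    """Toy stand-in for the proposed aggregator service (not a real Helix API)."""

    # Source cluster name -> callable returning that cluster's current snapshot.
    sources: Dict[str, Callable[[], Snapshot]]
    # Property types to copy into the view cluster; everything else is skipped.
    properties: Set[str]
    # Minimum number of seconds between two refreshes of the view.
    refresh_period_s: float
    # Aggregated view: {property_type: {resource_name: {source_cluster: payload}}}
    view: Dict[str, Dict[str, Dict[str, dict]]] = field(default_factory=dict)
    last_refresh: float = float("-inf")
    fetch_count: int = 0

    def maybe_refresh(self, now: float) -> bool:
        """Rebuild the view if the refresh period has elapsed; return True if rebuilt."""
        if now - self.last_refresh < self.refresh_period_s:
            # Changes made in the meantime are picked up by the next refresh.
            return False
        new_view: Dict[str, Dict[str, Dict[str, dict]]] = {}
        for cluster, fetch in self.sources.items():
            snapshot = fetch()  # stands in for one cross-data-center read
            self.fetch_count += 1
            for prop in self.properties:
                for resource, payload in snapshot.get(prop, {}).items():
                    new_view.setdefault(prop, {}).setdefault(resource, {})[cluster] = payload
        self.view = new_view
        self.last_refresh = now
        return True


# Usage: two source clusters, aggregating only the (hypothetical) external view.
dc1 = {"external_view": {"db": {"p0": "MASTER"}}, "ideal_state": {"db": {"p0": ["n1", "n2"]}}}
dc2 = {"external_view": {"db": {"p0": "SLAVE"}}, "ideal_state": {"db": {"p0": ["n3", "n4"]}}}
aggregator = ViewAggregator(
    sources={"dc1": lambda: dc1, "dc2": lambda: dc2},
    properties={"external_view"},
    refresh_period_s=30.0,
)
aggregator.maybe_refresh(now=0.0)
```

With a ZooKeeper observer, every write in a source cluster would reach the local data center and each client would still have to merge the per-cluster data itself. In this sketch the remote reads happen at most once per refresh period, and clients read a single pre-merged view.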

zhan849 force-pushed the harry/view-aggregator-design branch 2 times, most recently from 318ae2f to 9c0e4dd on August 21, 2018 18:10

kishoreg reviewed Oct 21, 2018

zhan849 mentioned this pull request Nov 2, 2018

Implement view cluster aggregator #294

Closed

Propose design for aggregated cluster view service

1e07ec3

zhan849 force-pushed the harry/view-aggregator-design branch from 9c0e4dd to 1e07ec3 on November 2, 2018 22:38

asfgit force-pushed the master branch 2 times, most recently from 5ebe967 to 9d89e93 on November 16, 2018 23:57

junkaixue closed this Jul 22, 2019

zhan849 commented Aug 21, 2018

kishoreg Oct 21, 2018

zhan849 Oct 23, 2018

kishoreg Oct 21, 2018

zhan849 Oct 23, 2018


