Can you check my understanding of the relationships between servers, replicas, groups, shards, and tablets described in the documentation?
- A cluster is the entire deployment of a Dgraph instance. A cluster contains at least one server, at least one zero, and zero or more Raft quorums.
- A zero is a super node. It monitors and balances the data held on servers.
- A server is a worker node within a cluster. It contains the actual data on disk, and serves both reads and writes. A server contains exactly one group.
- A group is a logical collection of predicates. A group contains at least one predicate.
- A predicate is a logical collection of ground truths.
- A tablet is a physical collection of ground truths stored on the disk. A tablet contains the physical data for one predicate.
- The terms shard, data shard, and tablet as used in the documentation are synonyms.
- A replica is a server that belongs to a distinct quorum. Depending on the value of the --replicas flag, either all servers are replicas (>= 2) or none of them are (== 1).
- A quorum consists of a set of replicas that provide read/write access to consistent copies of the exact same group, where consensus is handled by Raft. A quorum contains at least two servers, but ideally 3 or 5 to avoid split-brain situations. (I've sketched this hierarchy below to check myself.)
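
If that reading is right, the containment hierarchy can be written down as types. This is just a sketch of my mental model in Go; none of these type names come from the actual Dgraph codebase:

```go
package main

import "fmt"

// Predicate is a logical collection of ground truths.
type Predicate struct {
	Name string
}

// Tablet holds the on-disk data for exactly one predicate.
type Tablet struct {
	Predicate Predicate
}

// Group is a logical collection of predicates (at least one tablet).
type Group struct {
	Tablets []Tablet
}

// Server serves reads and writes for exactly one group.
type Server struct {
	Group Group
}

// Quorum is a set of replicas holding consistent copies of the same
// group, with consensus handled by Raft.
type Quorum struct {
	Replicas []Server
}

// Zero monitors and balances tablets across groups.
type Zero struct{}

// Cluster is the whole deployment: zeros plus zero or more quorums.
type Cluster struct {
	Zeros   []Zero
	Quorums []Quorum
}

func main() {
	follows := Predicate{Name: "follows"}
	g := Group{Tablets: []Tablet{{Predicate: follows}}}
	c := Cluster{
		Zeros:   []Zero{{}},
		Quorums: []Quorum{{Replicas: []Server{{Group: g}, {Group: g}, {Group: g}}}},
	}
	fmt.Printf("%d zero(s), %d quorum(s) of %d replicas\n",
		len(c.Zeros), len(c.Quorums), len(c.Quorums[0].Replicas))
}
```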
Capacity Planning
The --replicas tunable applies to the entire cluster and not to a particular predicate. To plan the number of servers for a cluster with replicas > 1, you have to predict the number and size of the predicates in your schema, and then solve a bin-packing problem to estimate how many groups Dgraph would optimally break that data into.
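
For example, here's a toy first-fit-decreasing estimate of the group count. The predicate sizes and the per-group capacity target are made-up inputs, and real tablet placement is up to zero, so treat this purely as back-of-the-envelope sizing:

```go
package main

import (
	"fmt"
	"sort"
)

// estimateGroups packs predicted tablet sizes into groups of a given
// capacity using first-fit decreasing, and returns the group count.
func estimateGroups(predicateSizesGB []float64, groupCapacityGB float64) int {
	sizes := append([]float64(nil), predicateSizesGB...)
	sort.Sort(sort.Reverse(sort.Float64Slice(sizes)))

	var remaining []float64 // remaining capacity per group
	for _, s := range sizes {
		placed := false
		for i := range remaining {
			if remaining[i] >= s {
				remaining[i] -= s
				placed = true
				break
			}
		}
		if !placed {
			remaining = append(remaining, groupCapacityGB-s)
		}
	}
	return len(remaining)
}

func main() {
	sizes := []float64{120, 80, 40, 35, 20, 5} // predicted tablet sizes, GB
	groups := estimateGroups(sizes, 200)       // assume ~200 GB per group
	replicas := 3
	fmt.Printf("~%d groups -> ~%d servers at --replicas=%d\n",
		groups, groups*replicas, replicas)
}
```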
When sizing a cluster, if you opt for few groups and large disks, you'll run into read-throughput issues due to page faults. Alternatively, if you opt for more groups, you'll have to provision redundant servers for predicates that see low query traffic, resulting in higher TCO.
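To put rough, illustrative numbers on that trade-off: at --replicas=3, two large groups need only 6 servers, but if each server then carries, say, 200 GB of tablets against 64 GB of RAM, cold reads will constantly fault pages in; splitting the same data into ten small groups shrinks each working set, but costs 30 servers, three of which are dedicated to every low-traffic predicate's group.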
If I understand things correctly, maybe the thing to do here is to let folks set replica policy on a per-predicate basis, and then let the zeros migrate tablets based on that policy rather than on cluster-wide settings? Something like the hypothetical sketch below.
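
Nothing like this exists in Dgraph today; this is only what I imagine the proposal might imply:

```go
package main

import "fmt"

// TabletPolicy is a hypothetical per-predicate replication setting that
// zeros could use when placing tablets, instead of a single --replicas
// value for the whole cluster.
type TabletPolicy struct {
	Predicate string
	Replicas  int // desired number of copies of this predicate's tablet
}

func main() {
	policies := []TabletPolicy{
		{Predicate: "follows", Replicas: 5},   // hot predicate: replicate widely
		{Predicate: "audit.log", Replicas: 1}, // cold predicate: single copy
	}
	for _, p := range policies {
		fmt.Printf("zero would aim for %d replica(s) of <%s>\n",
			p.Replicas, p.Predicate)
	}
}
```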