
cluster data storage and replication #32

Closed
genericgithubuser opened this issue May 24, 2019 · 4 comments

Labels
question

Comments

@genericgithubuser

Can someone clarify what data writing looks like in cluster mode? Is data actually replicated across the cluster, is it hashed and written to one specific vmstorage node, is it just fanned out so it can land on any vmstorage node, or is there some other setup?
When data is queried, is the query sent to all vmstorage nodes, or does the query layer know where the data lives and ask only the appropriate node?

@tenmozes added the question label May 25, 2019
@tenmozes
Collaborator

Hello! There are three services in the cluster version:
vmstorage - persistent storage (stateful)
vmselect - read gateway (stateless)
vminsert - write gateway (stateless)
vminsert knows about all the storage nodes and uses consistent hashing to pick one particular node and write to it (the hash is computed from the metric name plus its labels in sorted order); if that node is unavailable at the moment, it writes to the next one. See the sketch below.
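
A minimal Go sketch of that routing, assuming hypothetical node names and a plain FNV hash with modulo placement plus spill-over to the next node; the real vminsert uses a proper consistent-hashing scheme, so treat this as an illustration of the idea, not the actual implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// pickNode hashes the metric name plus its labels in sorted order and
// maps the hash to one vmstorage node. If that node is down, it spills
// over to the next node, as described above. Node names are placeholders.
func pickNode(nodes []string, alive map[string]bool, metric string, labels map[string]string) (string, error) {
	// Canonical key: metric name followed by label pairs in sorted order.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	h.Write([]byte(metric))
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte(labels[k]))
	}

	// Simplified placement: modulo instead of a real consistent-hash ring.
	idx := int(h.Sum64() % uint64(len(nodes)))
	for i := 0; i < len(nodes); i++ {
		node := nodes[(idx+i)%len(nodes)]
		if alive[node] {
			return node, nil
		}
	}
	return "", fmt.Errorf("no vmstorage node is available")
}

func main() {
	nodes := []string{"vmstorage-0:8400", "vmstorage-1:8400", "vmstorage-2:8400"}
	alive := map[string]bool{nodes[0]: true, nodes[1]: false, nodes[2]: true}

	node, err := pickNode(nodes, alive, "http_requests_total",
		map[string]string{"instance": "10.0.0.1:9100", "job": "node"})
	if err != nil {
		panic(err)
	}
	fmt.Println("write to:", node)
}
```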
vmselect reads from all the nodes and merges the results; if one of the nodes is not reachable, it marks the result as partial. A matching sketch of the read path follows.
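
Again with made-up types and node names, and illustrating only the fan-out, merge, and partial-result marking described above rather than vmselect's actual internals:

```go
package main

import (
	"fmt"
	"sync"
)

// Result models a per-node response in this toy example.
type Result struct {
	Samples []float64
	Err     error
}

// queryAll fans the query out to every vmstorage node concurrently,
// merges whatever comes back, and flags the merged result as partial
// if any node was unreachable.
func queryAll(nodes []string, query func(node string) Result) (merged []float64, partial bool) {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	for _, node := range nodes {
		wg.Add(1)
		go func(node string) {
			defer wg.Done()
			res := query(node)
			mu.Lock()
			defer mu.Unlock()
			if res.Err != nil {
				partial = true // at least one node did not answer
				return
			}
			merged = append(merged, res.Samples...)
		}(node)
	}
	wg.Wait()
	return merged, partial
}

func main() {
	nodes := []string{"vmstorage-0", "vmstorage-1"}
	merged, partial := queryAll(nodes, func(node string) Result {
		if node == "vmstorage-1" {
			return Result{Err: fmt.Errorf("%s is unreachable", node)}
		}
		return Result{Samples: []float64{1, 2, 3}}
	})
	fmt.Println("merged:", merged, "partial:", partial)
}
```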

A bit more info is here: https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster#cluster-availability

@valyala
Collaborator

valyala commented May 25, 2019

A few words about replication, in addition to the info provided by @tenmozes:

We haven't come up with a replication scheme at the VictoriaMetrics level that is reliable yet simple and that could provide data safety and high availability in the event of storage loss. So vmstorage nodes rely on durable replicated disks such as Google Compute Engine persistent disks instead of implementing replication themselves.

The most straightforward approach to replication at the VictoriaMetrics level (just putting N copies of the data on different vmstorage nodes, where N is the replication factor) requires a complex and fragile data-reshuffling scheme in order to restore the required replication factor for data stored on broken disks. Automatic reshuffling may hurt cluster availability and performance due to increased network, disk and CPU usage. Additionally, reshuffling may fail in edge cases such as temporary network unavailability between vmstorage nodes. So we chose the simplest approach: shifting the data-safety headache to durable disks.

Replication can be implemented at the Prometheus level by running multiple VictoriaMetrics clusters in distinct availability zones and writing data to all of these clusters in parallel. The data can then be queried via promxy sitting in front of all the VictoriaMetrics clusters. A hypothetical config for the write side is sketched below.
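
For illustration, a hypothetical prometheus.yml fragment; the hostnames and the 0 accountID are placeholders, and the URL layout follows the cluster docs, so double-check it against your version:

```yaml
# Write every sample to two independent VictoriaMetrics clusters
# running in different availability zones (hostnames are placeholders).
remote_write:
  - url: http://vminsert-zone-a:8480/insert/0/prometheus/
  - url: http://vminsert-zone-b:8480/insert/0/prometheus/
```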

@valyala changed the title from "cluster data storage" to "cluster data storage and replication" May 28, 2019
@valyala
Collaborator

valyala commented Jul 24, 2019

Related issue: #118

@valyala
Collaborator

valyala commented May 27, 2020

FYI, release v1.36.0 contains application-level replication support for the cluster version of VictoriaMetrics. See more details about replication here. A hypothetical invocation is sketched below.
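
For reference, a hypothetical vminsert invocation enabling it; the node addresses are placeholders, so consult the cluster docs for the authoritative flag set:

```sh
# Keep 2 copies of every incoming sample spread over the listed
# vmstorage nodes (addresses are placeholders).
./vminsert-prod \
  -replicationFactor=2 \
  -storageNode=vmstorage-1:8400 \
  -storageNode=vmstorage-2:8400 \
  -storageNode=vmstorage-3:8400
```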

Closing this issue.

@valyala closed this as completed May 27, 2020