Enable fine-tuning of dqlite parameters #32

Merged (2 commits) on Nov 29, 2022

neoaggelos
Contributor

Summary

Support tuning snapshot parameters, network latency and kine compaction and polling intervals in k8s-dqlite.

Changes

Tuning is done by adding a per-node tuning.yaml file with the following contents:

# configure trailing and threshold log entries for snapshots.
snapshot:
  trailing: 2048
  threshold: 1024

# configure average network latency between nodes (duration, so `s`, `ms` units are needed)
network-latency: 50ms

# configure kine compact and poll interval (duration, so `m`, `s` units are needed)
kine-compact-interval: 5m
kine-poll-interval: 1s

If any value is not set, then the dqlite defaults are used.
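
As a rough illustration of why the comments above insist on units, here is a minimal Go sketch that parses the three duration values. It assumes the values follow Go's `time.ParseDuration` syntax, which this PR does not state explicitly. The `parseDurations` helper is hypothetical and is not part of k8s-dqlite.

```go
package main

import (
	"fmt"
	"time"
)

// parseDurations converts raw string values (as they might appear in
// tuning.yaml) into time.Duration values.
// Hypothetical helper for illustration only; not taken from k8s-dqlite.
func parseDurations(raw map[string]string) (map[string]time.Duration, error) {
	parsed := make(map[string]time.Duration, len(raw))
	for key, value := range raw {
		d, err := time.ParseDuration(value)
		if err != nil {
			return nil, fmt.Errorf("invalid duration for %q: %w", key, err)
		}
		parsed[key] = d
	}
	return parsed, nil
}

func main() {
	raw := map[string]string{
		"network-latency":       "50ms",
		"kine-compact-interval": "5m",
		"kine-poll-interval":    "1s",
	}
	parsed, err := parseDurations(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Iterate over a fixed key order so the output is stable.
	for _, key := range []string{"network-latency", "kine-compact-interval", "kine-poll-interval"} {
		fmt.Printf("%s = %v\n", key, parsed[key])
	}

	// A bare number carries no unit, so time.ParseDuration rejects it.
	if _, err := parseDurations(map[string]string{"network-latency": "50"}); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Under that assumption, a value such as `50` without a unit would be rejected rather than silently interpreted as milliseconds or seconds.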

@ktsakalozos
Member

Why do the tests fail? Is there a change in how dqlite is built?

@ktsakalozos
Member

I wonder if any of those parameters need to be set across all nodes in the cluster.

@neoaggelos neoaggelos merged commit e925e3e into master Nov 29, 2022
@MathieuBordere
MathieuBordere commented Jan 9, 2023

Snapshot.Trailing Determines how many log entries are kept around in the log after taking a snapshot. The higher this number, the more log entries are kept in memory and on disk, but the lower the chance of having to send a full snapshot to a lagging node.

Snapshot.Threshold Determines after how many new log entries a raft snapshot is taken. Increasing this number makes snapshots less frequent, so more log entries accumulate in the log before a snapshot is taken, which increases memory usage and the disk space occupied by log entries. Decreasing it has the opposite effect.

As a practical example, let's take Dqlite's default parameters (here).

Snapshot.Trailing is 8192 and Snapshot.Threshold is 1024. This means that on average 8192 + 1024/2 = 8704 raft log entries are kept around, with a peak of 8192 + 1024 = 9216 just before taking the snapshot and a bottom of 8192 right after taking it.

Changing the parameters to, e.g., Snapshot.Trailing 8192 and Snapshot.Threshold 8192 means that on average 8192 + 8192/2 = 12288 raft log entries are kept around, with a peak of 8192 + 8192 = 16384 and a bottom of 8192. The snapshot frequency decreases 8x, which can significantly lower the amount of disk IO the system has to do.
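
The arithmetic in the two examples above can be reproduced with a small Go sketch. The `LogBounds` type and `EstimateLogBounds` function are hypothetical names introduced only for illustration; they are not part of dqlite or k8s-dqlite. The estimate also assumes log entries arrive at a steady rate, so the average is simply the midpoint between bottom and peak.

```go
package main

import "fmt"

// LogBounds summarises how many raft log entries are retained
// between two consecutive snapshots.
// Hypothetical type for illustration only.
type LogBounds struct {
	Bottom  int     // entries kept right after a snapshot (trailing)
	Peak    int     // entries kept just before the next snapshot
	Average float64 // midpoint between Bottom and Peak
}

// EstimateLogBounds applies the formulas from the comment above:
// bottom = trailing, peak = trailing + threshold,
// average = trailing + threshold/2.
func EstimateLogBounds(trailing, threshold int) LogBounds {
	return LogBounds{
		Bottom:  trailing,
		Peak:    trailing + threshold,
		Average: float64(trailing) + float64(threshold)/2,
	}
}

func main() {
	cases := []struct{ trailing, threshold int }{
		{8192, 1024}, // defaults quoted in the comment above
		{8192, 8192}, // larger threshold, snapshots taken 8x less often
	}
	for _, c := range cases {
		b := EstimateLogBounds(c.trailing, c.threshold)
		fmt.Printf("trailing=%d threshold=%d bottom=%d average=%.0f peak=%d\n",
			c.trailing, c.threshold, b.Bottom, b.Average, b.Peak)
	}
}
```

The same helper can be used to gauge the memory and disk trade-off of other values, for instance the `trailing: 2048` and `threshold: 1024` pair from the sample tuning.yaml.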
