checkpoint daemon doesn't randomise checkpoint interval #4589

Closed
sergepetrenko opened this issue Oct 28, 2019 · 2 comments
Labels
bug Something isn't working

Comments

@sergepetrenko
Collaborator

Tarantool version: 2.1, 2.2, 2.3

Bug description:
#4432 resolved the issue after box.snapshot(), but it turns out that
checkpoint_daemon doesn't randomise checkpoint time at all.
This was introduced in 4c04808

Take a look at the logs. Note that before the manual box.snapshot(), checkpoints are
made every 10 seconds. Issuing box.snapshot() reschedules the next checkpoint after a random interval,
but the following automatic reschedule sets the interval back to 10 seconds.

2019-10-28 15:25:10.360 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:20 2019
2019-10-28 15:25:20.362 [17665] snapshot/101/main I> saving snapshot `./00000000000000000035.snap.inprogress'
2019-10-28 15:25:20.367 [17665] snapshot/101/main I> done
2019-10-28 15:25:20.368 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:30 2019
2019-10-28 15:25:30.362 [17665] snapshot/101/main I> saving snapshot `./00000000000000000036.snap.inprogress'
2019-10-28 15:25:30.367 [17665] snapshot/101/main I> done
2019-10-28 15:25:30.368 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:40 2019

tarantool> box.snapshot()

2019-10-28 15:25:36.991 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:53 2019
2019-10-28 15:25:36.991 [17665] snapshot/101/main I> saving snapshot `./00000000000000000037.snap.inprogress'
2019-10-28 15:25:36.996 [17665] snapshot/101/main I> done
2019-10-28 15:25:53.995 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:26:03 2019
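
The delays in the log above can be checked mechanically. The following is a minimal sketch, not part of Tarantool: the helper name `scheduled_delays` is hypothetical, and the parsing assumes the exact log line layout shown in this report. It measures, for every "scheduled next checkpoint" line, how far in the future the checkpoint was scheduled.

```python
from datetime import datetime

# Log lines copied from the report above (only the scheduling lines matter).
LOG = """\
2019-10-28 15:25:10.360 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:20 2019
2019-10-28 15:25:20.368 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:30 2019
2019-10-28 15:25:30.368 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:40 2019
2019-10-28 15:25:36.991 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:25:53 2019
2019-10-28 15:25:53.995 [17665] main/104/checkpoint_daemon I> scheduled next checkpoint for Mon Oct 28 15:26:03 2019
"""

MARKER = "scheduled next checkpoint for "


def scheduled_delays(log):
    """Return, per scheduling line, the rounded delay in seconds
    between the log timestamp and the announced checkpoint time."""
    delays = []
    for line in log.splitlines():
        if MARKER not in line:
            continue
        # The first 23 characters are the timestamp, e.g. "2019-10-28 15:25:10.360".
        logged = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")
        # The tail is a ctime-style date, e.g. "Mon Oct 28 15:25:20 2019".
        target = datetime.strptime(line.split(MARKER, 1)[1].strip(),
                                   "%a %b %d %H:%M:%S %Y")
        delays.append(round((target - logged).total_seconds()))
    return delays


print(scheduled_delays(LOG))  # [10, 10, 10, 16, 9]
```

Only the line written at 15:25:36.991, right after the manual box.snapshot(), shows a delay other than roughly 10 seconds. The final 9 is a rounding effect: the announced time is printed with whole-second precision.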
@sergepetrenko sergepetrenko added the bug Something isn't working label Oct 28, 2019
@kostja
Contributor

kostja commented Oct 29, 2019

What is not randomized? You should not reset the interval after each checkpoint, otherwise you're bound to have collisions in a running setup once in a while. A collision can be easily fixed with box.snapshot(). This is an imperfect solution but the right one is kostja#1
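
One reading of this argument can be sketched with a toy simulation. Everything below is an assumption made for illustration and is not Tarantool's scheduling code: two instances on one host, a 10-second period, a "collision" defined as two checkpoints starting within 1 second of each other, and a re-randomised interval drawn uniformly from [period, 2 * period). With fixed intervals the initial offset between the instances is preserved forever, while re-drawing the interval every time lets the schedules drift into each other.

```python
import random

PERIOD = 10.0   # checkpoint interval in seconds (taken from the log above)
WINDOW = 1.0    # hypothetical: checkpoints closer than this count as a collision
ROUNDS = 1000   # number of checkpoints simulated per instance


def schedule(start, rounds, next_interval):
    """Return checkpoint times, adding next_interval() after each checkpoint."""
    t, times = start, []
    for _ in range(rounds):
        t += next_interval()
        times.append(t)
    return times


def count_collisions(a, b, window):
    """Count checkpoints in sorted list a that have a checkpoint
    from sorted list b within `window` seconds."""
    hits, j = 0, 0
    for t in a:
        # Skip checkpoints of b that are too far in the past.
        while j < len(b) and b[j] < t - window:
            j += 1
        if j < len(b) and abs(b[j] - t) <= window:
            hits += 1
    return hits


# Fixed interval: instances start 5 seconds apart and stay 5 seconds apart.
fixed_hits = count_collisions(
    schedule(0.0, ROUNDS, lambda: PERIOD),
    schedule(5.0, ROUNDS, lambda: PERIOD),
    WINDOW,
)

# Interval re-drawn after every checkpoint (assumed distribution).
rng = random.Random(42)
random_hits = count_collisions(
    schedule(0.0, ROUNDS, lambda: rng.uniform(PERIOD, 2 * PERIOD)),
    schedule(5.0, ROUNDS, lambda: rng.uniform(PERIOD, 2 * PERIOD)),
    WINDOW,
)

print("fixed interval collisions:", fixed_hits)
print("re-randomised interval collisions:", random_hits)
```

In this model the fixed schedule never collides once the instances start apart, which matches the point that an unlucky initial overlap can be fixed once with box.snapshot().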

@sergepetrenko
Collaborator Author

@kostja Makes sense, thanks. Closing then.
