Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement snapshotting at shutdown #363

Merged
merged 35 commits into from
Apr 22, 2023
Merged

Conversation

danielealbano
Copy link
Owner

This PR introduces the ability to trigger a snapshot when the workers start to shutdown.

To achieve this quite some changes have been done to the workers events loops: meanwhile before the initialization and shutdown of the internal database and the networking could have performed before entering the loop and at its end, with the need to be able to perform I/O the loop has still to run at the shutdown but it's necessary to terminate all the connections and pending operations.
The PR introduces the required changes and therefore it's now possible perform operations both at the startup and at the shutdown, within the loop!

The PR also introduces a new field in the snapshot structure to keep track of the number of iteration performed, very handy to check if a snapshot has been taken, used now internally.

A number of additional fixes have been introduced, e.g. to better sync the shutdown of the signal handler thread or of the epoch gc workers., better tests for the SAVE and BGSAVE commands, improve tests for SHUTDOWN, etc.

Closes #316

…ng within the loop and to trigger a snapshot once all the network connections are closed
…he global terminate variable in a worker and program one
… if snapshots are in progress unreliable and flaky
@danielealbano danielealbano added bug Something isn't working enhancement New feature or request labels Apr 22, 2023
@danielealbano danielealbano added this to the v0.3 milestone Apr 22, 2023
@danielealbano danielealbano self-assigned this Apr 22, 2023
@danielealbano danielealbano added this to In Progress in cachegrand via automation Apr 22, 2023
@codecov
Copy link

codecov bot commented Apr 22, 2023

Codecov Report

Patch coverage: 82.83% and project coverage change: +0.13 🎉

Comparison is base (8475737) 78.97% compared to head (0fdc880) 79.09%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #363      +/-   ##
==========================================
+ Coverage   78.97%   79.09%   +0.13%     
==========================================
  Files         182      182              
  Lines       12546    12609      +63     
==========================================
+ Hits         9907     9973      +66     
+ Misses       2639     2636       -3     
Flag Coverage Δ
unittests 79.09% <82.83%> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/config_cyaml_schema.c 100.00% <ø> (ø)
src/storage/db/storage_db.h 30.77% <ø> (ø)
src/support/io_uring/io_uring_support.c 82.42% <0.00%> (-3.79%) ⬇️
...orker/fiber/worker_fiber_storage_db_snapshot_rdb.c 72.22% <ø> (ø)
src/worker/storage/worker_storage_iouring_op.c 87.50% <ø> (ø)
src/worker/worker_context.h 100.00% <ø> (ø)
src/worker/network/worker_network_op.c 71.52% <50.00%> (+1.46%) ⬆️
src/program.c 39.83% <60.00%> (+3.70%) ⬆️
src/worker/worker.c 83.49% <91.67%> (+2.24%) ⬆️
...dule/redis/command/module_redis_command_shutdown.c 85.71% <100.00%> (+10.71%) ⬆️
... and 7 more

... and 9 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@danielealbano danielealbano enabled auto-merge (squash) April 22, 2023 12:56
uint8_t listeners_count) {
// TODO: should use a struct with fp pointers, not ifs
if (worker_context->config->network->backend == CONFIG_NETWORK_BACKEND_IO_URING) {
worker_network_iouring_cleanup(listeners, listeners_count);

Check warning

Code scanning / CodeQL

Expression has no effect

This expression has no effect (because [worker_network_iouring_cleanup](1) has no external side effects).
@danielealbano danielealbano merged commit b872b52 into main Apr 22, 2023
cachegrand automation moved this from In Progress to Completed Apr 22, 2023
@danielealbano danielealbano deleted the feat-snapshot-at-shutdown branch April 22, 2023 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
cachegrand
  
Completed
1 participant