Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

checkpoints: fix performance drop on big volume restores #1345

Conversation

alaaeddineelamri
Copy link
Contributor

@alaaeddineelamri alaaeddineelamri commented Jan 1, 2023

Thank you for contributing to the Bareos Project!

Description

In this PR, jobmedia table entries are not updated anymore, and only new entries can be created. This change fixes a performance issue on restores of big volumes.

Please check

  • Short description and the purpose of this PR is present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please give a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

General
  • PR name is meaningful
  • Purpose of the PR is understood
  • Commit descriptions are understandable and well formatted
  • Check backport line
  • Is the PR title usable as CHANGELOG entry?
  • Separate commit for CHANGELOG.md ("update CHANGELOG.md"). The PR number is correct.
Source code quality
  • Source code changes are understandable
  • Variable and function names are meaningful
  • Code comments are correct (logically and spelling)
  • Required documentation changes are present and part of the PR
  • bareos-check-sources --since-merge does not report any problems

@alaaeddineelamri alaaeddineelamri force-pushed the dev/alaaeddineelamri/master/change-jobmedia-update branch from f183406 to 52e56d7 Compare January 4, 2023 12:33
@alaaeddineelamri alaaeddineelamri force-pushed the dev/alaaeddineelamri/master/change-jobmedia-update branch from 52e56d7 to 8882856 Compare February 6, 2023 07:53
@alaaeddineelamri alaaeddineelamri force-pushed the dev/alaaeddineelamri/master/change-jobmedia-update branch 3 times, most recently from f838940 to b6c219c Compare February 7, 2023 09:00
@alaaeddineelamri alaaeddineelamri marked this pull request as ready for review February 7, 2023 09:56
@alaaeddineelamri alaaeddineelamri changed the title checkpoints: update jobmedia table updates with checkpoints checkpoints: update how jobmedia table entries are added Feb 7, 2023
@alaaeddineelamri alaaeddineelamri changed the title checkpoints: update how jobmedia table entries are added checkpoints: fix performance drop on big volume restores Feb 7, 2023
@alaaeddineelamri alaaeddineelamri force-pushed the dev/alaaeddineelamri/master/change-jobmedia-update branch from 12325b2 to 8c0b01a Compare February 8, 2023 17:45
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks fine. I have some comments, but it is mostly about naming and using const.

auto now
= std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
time_t next_checkpoint_time = now + me->checkpoint_interval;
bool checkpoints_enabled = me->checkpoint_interval > 0 ? true : false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool checkpoints_enabled = me->checkpoint_interval > 0 ? true : false;
const bool checkpoints_enabled = me->checkpoint_interval > 0;

Comment on lines 317 to 312
bool block_changed
= current_block_number != jcr->sd_impl->dcr->block->BlockNumber;
bool volume_changed
= jcr->sd_impl->dcr->VolMediaId != current_volumeid && block_changed;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool block_changed
= current_block_number != jcr->sd_impl->dcr->block->BlockNumber;
bool volume_changed
= jcr->sd_impl->dcr->VolMediaId != current_volumeid && block_changed;
const bool block_changed
= current_block_number != jcr->sd_impl->dcr->block->BlockNumber;
const bool volume_changed
= jcr->sd_impl->dcr->VolMediaId != current_volumeid && block_changed;

void DoBackupCheckpoint(JobControlRecord* jcr);
void DoTimedCheckpoint(JobControlRecord* jcr);
void DoVolumeChangeBackupCheckpoint(JobControlRecord* jcr);
void SetReadyForCheckpoint(bool ready) { ready_for_checkpoint_ = ready; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rest of the codebase uses SetXYZ() / ClearXYZ(), so maybe we want to apply that pattern here, too.
Also ClearReadyForCheckpoint() could even be made private.

void DoTimedCheckpoint(JobControlRecord* jcr);
void DoVolumeChangeBackupCheckpoint(JobControlRecord* jcr);
void SetReadyForCheckpoint(bool ready) { ready_for_checkpoint_ = ready; }
bool ReadyForCheckpoint() const { return ready_for_checkpoint_; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this looks completely reasonable here in context, ReadyForCheckpoint() could also mean "I am ready for a checkpoint now" instead of "Am I ready for a checkpoint?"
Mark at least as [[nodiscard]] or rename to IsXYZ().


namespace storagedaemon {

CheckpointHandler::CheckpointHandler(time_t interval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the class' name is CheckpointHandler I would have named the files checkpoint_handler.cc and checkpoint_handler.h.

@alaaeddineelamri alaaeddineelamri force-pushed the dev/alaaeddineelamri/master/change-jobmedia-update branch from 812e8a4 to 878d413 Compare February 13, 2023 11:43
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the include guards needs to be changed after renaming the file, also bareos-check-sources is unhappy with the headers in append.cc and append.h.

Also, some of the commit headlines exceed the limit that pr-tool is willing to accept.

Comment on lines 22 to 23
#ifndef BAREOS_STORED_CHECKPOINTHANDLER_H_
#define BAREOS_STORED_CHECKPOINTHANDLER_H_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include guard needs to be changed on rename, too.

};

} // namespace storagedaemon
#endif // BAREOS_STORED_CHECKPOINTHANDLER_H_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include guard needs to be changed on rename, too.

@alaaeddineelamri alaaeddineelamri force-pushed the dev/alaaeddineelamri/master/change-jobmedia-update branch 2 times, most recently from 4b49f99 to 6f55b4b Compare February 13, 2023 12:59
@arogge arogge self-requested a review February 23, 2023 13:50
remove unused `PossibleIncompleteJob` function
Previously, when updating backup checkpoints, DIR would try to
update the jobmedia table entries or create a new one in case 
an update is not possible which in consequence made the 
jobmedia table entries kind of merged.

The change lowered the number of entries in the jobmedia table
but had the unwanted consequence of slowing restores in certain
situations (restores on big volumes for example).

With this change, jobmedia entries will only get created, and 
checkpoints will add an entry into the jobmedia table instead
of updating existing entries.
* changed when checkpoints are triggered relative to attribute saving
* add extra checks for when to trigger checkpoints
@arogge arogge force-pushed the dev/alaaeddineelamri/master/change-jobmedia-update branch from 6f55b4b to b270c90 Compare February 23, 2023 13:51
@arogge arogge merged commit 1e5cdb4 into bareos:master Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants