-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vadp-dumper: fix multithreaded backup/restore issues #1593
vadp-dumper: fix multithreaded backup/restore issues #1593
Conversation
3971a50
to
a0afab7
Compare
a552473
to
2fecb27
Compare
2fecb27
to
0b5d769
Compare
Neither write nor read guarantee that they will finish everything in one call. One has to repeatedly call them in a loop to ensure that everything gets written/read.
This code path is only taken if multithreaded restore is selected.
Since the input size is a variable, we need to ensure that we hold enough capacity to actually read the data into the buffer. Otherwise, if we get unlucky, we will run into the situation were the first few cbts are very small, and the rest are really big, ensuring a buffer overflow; possibly causing a crash.
We need to ensure that the writer does not write into the block we are currently reading. Currently we only distinguish between data thats inside the queue (which can neither be written to nor read from) and data thats outside the queue (which can _both_ be written to and read from). To fix this data race we split the data into three categories instead: - The head of the queue (which can only be read from) - The rest of the queue (which can neither be read from nor written to) - The data outside the queue (which can only be written to) To accomplish this we split the normal dequeue() function into two parts. The normal dequeue() becomes peek-like, only returning the head element, and the new function FinishDeq() does the actual dequeuing.
The general gist before was this: * copy thread: - Wait for start - Wait for dequeue - if data: handle data; loop - if no data: break inner (this only happens on flush) - `signal` that queue was fushed and that the thread is waiting for work - goto begin * main thread - do some setup - start enqueueing data; signal start to thread if not started (this was done with an unsynchronized read!) - flush when done - During cleanup, cancel the thread while it is waiting on a start signal. with some locks & unlocks sprinkled in. One should note that the cleanup copy thread callback always unlockes the lock regardless of whether the copy thread locked the mutex or not! This was slightly changed: * Clearly there is no need to wait for "start" at all, since we wait on the dequeue anyways -- so this was removed. * Instead of just canceling the thread, we set a flag that tells the Copythread to exit, which the thread checks after every flush. As such we also make sure flush the queue on cleanup. * We also now properly initialize all thread_context members!
Since sectors_per_call is not actually bounded by SECTORS_PER_CALL, we need to ensure that the buffer has enough space at runtime.
a6c996d
to
1cd8ec5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, I'd like to suggest the following changes:
- When passing
-s 0
tobareos_vadp_dumper
it excessively consumes cpu and memory, so that value should be rejected. - Output like the following should only appear when
-v
was passed:
Making slot 3 bigger (4194304 -> 8388608)
Making slot 15 bigger (6291456 -> 8388608)
Changed len: 3471835136, Saved len: 3471835136
The same will probably happen with |
d91ff72
to
043e3b2
Compare
I made it reject all numbers <= 0 as well as incorrect inputs (so |
pr-tool said the headline of commit "vadp-dumper: add support for VixDiskLib_QueryAllocatedBlocks()" is too long. |
This function returns -- up to a certain amount of precision -- a list of intervals of all sectors containing useful (i.e. allocated) data. For full backups its recommended to backup only the allocated blocks; for other kinds of backups (incremental/differential) it is only necessary to backup those sectors that are both changed & allocated. Since we currently cannot take advantage of information regarding unallocated, changed blocks, we just ignore them. In the future these could be used for faster restores & consolidation. The algorithm used for finding out the intersection is trivial as both lists are sorted (and the intervals themselves are disjoint in each set). As such it is enough to just go through both arrays linearly at the same time, sending the pairwise intersection if any, and finally advancing the pointer to the "smaller" interval.
Since we cannot meaningfully recover from a mutex error, we should just assert that they do not happen.
043e3b2
to
923c25f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, looks good now.
Thank you for contributing to the Bareos Project!
This pr fixes some bugs of the vadp_dumper, mostly related to its multithreaded mode. For example:
Additionally this PR adds support for the QueryAllocatedBlocks function in the following way:
Previously we backed up every block that was reported by vmware to have changed since some point
in time (or at all in case of full backups). This list can contain blocks that were once changed but are now
not needed anymore which we previously backed up needlessly. As such we now ask vmware to also give us a list
of all allocated (i.e. needed) blocks and backup only sectors that are both part of a allocated block and part of a
changed block.
Please check
If you have any questions or problems, please give a comment in the PR.
Helpful documentation and best practices
Checklist for the reviewer of the PR (will be processed by the Bareos team)
Make sure you check/merge the PR using
devtools/pr-tool
to have some simple automated checks run and a proper changelog record added.General
Source code quality