Fix stale volumes/devices in bareos-sd after failure to acquire a read device on MAC jobs #1106

Merged 4 commits on Mar 10, 2022

@arogge arogge commented Mar 7, 2022

When a job fails because the SD cannot switch its read-device to match the required media type (which can happen if the medium is not accessible to the SD), the device that was reserved or acquired for writing is not released.
This PR adds a test for that behaviour and fixes the problem.

Thank you for contributing to the Bareos Project!

Please check

  • Short description and the purpose of this PR is present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please give a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

General
  • PR name is meaningful
  • Purpose of the PR is understood
  • Separate commit for this PR in the CHANGELOG.md, PR number referenced is same
  • Commit descriptions are understandable and well formatted
  • If backport: add original PR number and target branch at top of this file: Backport of PR#000 to bareos-2x

@arogge arogge force-pushed the dev/arogge/master/stale-volumes-devices branch from 2aec7c5 to e0e9495 Compare March 7, 2022 17:10
@pstorz pstorz self-requested a review March 10, 2022 12:45

@pstorz pstorz left a comment

Good work!

In DoActualMigration(), jcr->db_batch->WriteBatchFileRecords() was
called without checking whether jcr->db_batch was valid. Although the
function itself took precautions to work correctly in that scenario,
calling it through an invalid pointer is still undefined behaviour.
This patch adds the proper check.

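The kind of check described above can be sketched as follows. This is a minimal illustration, not the actual Bareos code: `BatchDb`, `Jcr`, and `FlushBatchRecords` are hypothetical stand-ins, and the signature of `WriteBatchFileRecords()` is simplified.

```cpp
// Hypothetical stand-in for the batch database handle (not the Bareos class).
struct BatchDb {
  int flushed = 0;
  // Simplified signature; the real function may take arguments.
  void WriteBatchFileRecords() { ++flushed; }
};

// Hypothetical stand-in for the job control record.
struct Jcr {
  BatchDb* db_batch = nullptr;
};

// Flushes pending batch records only if a batch handle exists.
// Calling a member function through a null pointer is undefined
// behaviour even when the callee guards against it internally,
// so the check has to happen at the call site.
bool FlushBatchRecords(Jcr* jcr) {
  if (jcr == nullptr || jcr->db_batch == nullptr) {
    return false;  // nothing to flush
  }
  jcr->db_batch->WriteBatchFileRecords();
  return true;
}
```

With this shape, a job that never opened a batch connection simply skips the flush instead of dereferencing a null pointer.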
This patch splits copy-bscan into individual subtests and adds a new
test, "impossible-copy". That test makes the copy fail while changing
the read-device, which exposes a bug in the SD, and then checks for
the presence of that bug.

Previously, the write-device might not be released on specific
failures. This patch makes sure the write-device is always released
when appropriate.
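One common way to guarantee a release on every exit path is a scope guard. The sketch below illustrates that general technique only; it is an assumption and not necessarily how this PR implements the fix. `ScopeGuard`, `Device`, and `RunCopyJob` are hypothetical names.

```cpp
#include <functional>
#include <utility>

// Runs the stored callback when the guard goes out of scope.
class ScopeGuard {
 public:
  explicit ScopeGuard(std::function<void()> on_exit)
      : on_exit_(std::move(on_exit)) {}
  ~ScopeGuard() {
    if (on_exit_) { on_exit_(); }
  }
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;

 private:
  std::function<void()> on_exit_;
};

// Hypothetical device model: only tracks whether it is reserved.
struct Device {
  bool reserved = false;
};

// Simulates a copy job. `read_acquire_ok` models whether the SD could
// switch to a read device that matches the required media type.
bool RunCopyJob(Device& read_dev, Device& write_dev, bool read_acquire_ok) {
  write_dev.reserved = true;
  ScopeGuard release_write([&write_dev] { write_dev.reserved = false; });

  if (!read_acquire_ok) {
    return false;  // early failure: the guard still releases write_dev
  }

  read_dev.reserved = true;
  ScopeGuard release_read([&read_dev] { read_dev.reserved = false; });

  // ... the actual data copy would happen here ...
  return true;
}
```

Because the release is tied to scope exit, an early return after a failed read-device acquisition cannot leave the write device reserved.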