bvfs: fix cache race#2642
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthrough
ChangesCache Claim Optimization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
sebsura
left a comment
There was a problem hiding this comment.
The fix looks correct, but I have not been able to reliably reproduce the problem.
The test itself does not seem to be able to reproduce the issue as it always succeeds, even if i run it without the proposed fix.
930237a to
3aaa258
Compare
Atomically claim BVFS cache generation before filling PathVisibility so concurrent .bvfs_update runs cannot start the same job twice.
3aaa258 to
c972e6a
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/src/cats/bvfs.cc`:
- Around line 150-160: The current branch treats a benign "already in progress"
as an error because it only checks existence with "SELECT 1 ... HasCache=1" and
returns false; change the query and return semantics so the caller can
distinguish three outcomes: computed, in-progress, and missing/error. In the
bvfs.cc path that checks updated_rows (the block using Mmsg/QueryDb/SqlNumRows
and JobId), SELECT HasCache FROM Job WHERE JobId = %s and examine the returned
HasCache value (1 => computed -> retval=true; 0 or -1 => in-progress -> return a
distinct status or encoded value, e.g., set retval to a new enum/constant
BVFS_IN_PROGRESS instead of false; NULL/no row => treat as missing/error). Then
update DotBvfsUpdateCmd to handle the new BVFS_IN_PROGRESS value separately from
an error (currently it treats any false as ERROR).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: fe6e4749-91bd-4fb1-82e6-1166749b95d9
📒 Files selected for processing (1)
core/src/cats/bvfs.cc
| if (updated_rows == 0) { | ||
| Mmsg(cmd, "SELECT 1 FROM Job WHERE JobId = %s AND HasCache=1", jobid); | ||
| if (!QueryDb(jcr, cmd)) { goto bail_out; } | ||
|
|
||
| if (!QueryDb(jcr, cmd) || SqlNumRows() > 0) { | ||
| Dmsg1(dbglevel, "already in progress %d\n", (uint32_t)JobId); | ||
| retval = false; | ||
| if (SqlNumRows() > 0) { | ||
| Dmsg1(dbglevel, "Already computed %d\n", (uint32_t)JobId); | ||
| retval = true; | ||
| } else { | ||
| Dmsg1(dbglevel, "already in progress %d\n", (uint32_t)JobId); | ||
| retval = false; | ||
| } |
There was a problem hiding this comment.
Don't report the expected "already in progress" path as an error.
This branch returns false for the benign concurrent case, and DotBvfsUpdateCmd() in core/src/dird/ua_dotcmds.cc:110-126 treats every false as ERROR. It also can't distinguish a missing JobId from HasCache=-1, because SELECT 1 ... HasCache=1 only tells you "computed" vs "not computed". Please surface a distinct status here, or otherwise handle "in progress" separately from real failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@core/src/cats/bvfs.cc` around lines 150 - 160, The current branch treats a
benign "already in progress" as an error because it only checks existence with
"SELECT 1 ... HasCache=1" and returns false; change the query and return
semantics so the caller can distinguish three outcomes: computed, in-progress,
and missing/error. In the bvfs.cc path that checks updated_rows (the block using
Mmsg/QueryDb/SqlNumRows and JobId), SELECT HasCache FROM Job WHERE JobId = %s
and examine the returned HasCache value (1 => computed -> retval=true; 0 or -1
=> in-progress -> return a distinct status or encoded value, e.g., set retval to
a new enum/constant BVFS_IN_PROGRESS instead of false; NULL/no row => treat as
missing/error). Then update DotBvfsUpdateCmd to handle the new BVFS_IN_PROGRESS
value separately from an error (currently it treats any false as ERROR).
Atomically claim BVFS cache generation before filling PathVisibility so concurrent .bvfs_update runs cannot start the same job twice.
Add a python-bareos system test that runs concurrent BVFS updates against the same job set to cover the regression.
Thank you for contributing to the Bareos Project!
Please check
If you have any questions or problems, please give a comment in the PR.
Helpful documentation and best practices
Checklist for the reviewer of the PR (will be processed by the Bareos team)
Make sure you check/merge the PR using
devtools/pr-toolto have some simple automated checks run and a proper changelog record added.General
Source code quality