New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed how blob storage PutBlob errors are handled in content.Manager #117
Conversation
In order to guarantee that all index entries have corresponding pack blobs, we must ensure that `content.Manager.Flush` will not succeed unless all pending writes have completed. Added test that simulates various patterns of PutBlock failures and ensures that data remains durable despite those, assuming all calls to `WriteContent()` and `Flush()` are retried.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ✅
Just a couple of questions inline
@@ -152,6 +153,16 @@ func (bm *Manager) addToPackUnlocked(ctx context.Context, contentID ID, data []b | |||
bm.cond.Wait() | |||
} | |||
|
|||
// see if we have any packs that have failed previously | |||
// retry writing them now. | |||
fp := append([]*pendingPackInfo(nil), bm.failedPacks...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a subtle "dependency" here on the implementation of .writePackAndAddToIndex(...)
.
Would it make sense to add a comment here explaining why the slice needs to be copied, namely that bm.failedPacks
is modified inside bm.writePackAndAddToIndex
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
return errors.Wrap(err, "error writing previously failed pack") | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this change, but "surprising" if you will. Is the assumption (expected semantics here) that when bm.flushPackIndexesLocked(ctx)
returns an error below, it should not release the lock?
For what I can tell from the calling paths into addToPackUnlocked()
, both from WriteContent
and RewriteContent
, this is unexpected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right, clearly forgot that one
In order to guarantee that all index entries have corresponding
pack blobs, we must ensure that
content.Manager.Flush
willnot succeed unless all pending writes have completed.
Added test that simulates various patterns of PutBlock failures and
ensures that data remains durable despite those, assuming all calls
to
WriteContent()
andFlush()
are retried.