Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a defensive check on blob status #258

Merged
merged 4 commits into from
Feb 21, 2024

Conversation

jianoaix
Copy link
Contributor

Why are these changes needed?

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@@ -124,6 +124,10 @@ func (f *finalizer) updateBlobs(ctx context.Context, metadatas []*disperser.Blob
for _, m := range metadatas {
stageTimer := time.Now()
blobKey := m.GetBlobKey()
if m.BlobStatus != disperser.Confirmed {
Copy link
Contributor

@ian-shim ian-shim Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the suspicion is that confirmationMetadata.ConfirmationInfo is nil in L138, right?
What if m.BlobStatus is Confirmed, but is missing confirmationMetadata.ConfirmationInfo?
This would be an even more serious issue if that's the case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's failing either on confirmationMetadata being nil, or confirmationMetadata.ConfirmationInfo being nil, which are the only two pointers at L138.

The confirmationMetadata is created by https://github.com/Layr-Labs/eigenda/blob/master/disperser/common/blobstore/blob_metadata_store.go#L55, it doesn't return nil as long as err != nil (which already checked at L132). So confirmationMetadata.ConfirmationInfo being nil seems to be only case.

These blobs were created from L98 or L113 by using GetBlobMetadataByStatusWithPagination(ctx, disperser.Confirmed, ...), so it seems either 1) this function returned some blobs that are not actually Confirmed; or 2) the blob's status got changed from Confirmed to something else after that call. For 1), it's a dynamo query, should be right; for 2), do we only transition Confirmed to Finalized? I haven't tracked down further.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation! So this logging should help us track down the root cause

@@ -124,6 +124,10 @@ func (f *finalizer) updateBlobs(ctx context.Context, metadatas []*disperser.Blob
for _, m := range metadatas {
stageTimer := time.Now()
blobKey := m.GetBlobKey()
if m.BlobStatus != disperser.Confirmed {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation! So this logging should help us track down the root cause

@jianoaix jianoaix merged commit 402eafb into Layr-Labs:master Feb 21, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants