Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include missing file data if querier fails to find trace by ID #1730

Closed
AlexDCraig opened this issue Sep 12, 2022 · 3 comments · Fixed by #1737
Closed

Include missing file data if querier fails to find trace by ID #1730

AlexDCraig opened this issue Sep 12, 2022 · 3 comments · Fixed by #1737
Labels
good first issue Good for newcomers

Comments

@AlexDCraig
Copy link
Contributor

Is your feature request related to a problem? Please describe.

In the event you have corruption within a folder, for instance if bloom files are missing, querier will fail to iterate over it and then log a somewhat opaque error like the following:

level=warn ts=2022-09-09T21:51:49.828334419Z caller=querier.go:246 msg="failed to query 1 blocks" blockErrs="error finding trace by id, blockID: bd96ed36-2e90-42f0-9d3f-995b3290be17: error retrieving bloom (single-tenant, bd96ed36-2e90-42f0-9d3f-995b3290be17): does not exist"

Describe the solution you'd like

A list of the bloom files or other files, or even a subset of those files if the list is long, would be helpful debugging info.

Describe alternatives you've considered

You can examine the folder to see if the bloom files do not miss a number.

Additional context

If this is irrelevant with the parquet backend, we can close.

@joe-elliott joe-elliott added the good first issue Good for newcomers label Sep 12, 2022
@joe-elliott
Copy link
Member

This is not a bad idea. I think we could probably wrap errors in these methods to include information about the requested objects:

func (r *reader) Read(ctx context.Context, name string, blockID uuid.UUID, tenantID string, shouldCache bool) ([]byte, error) {
objReader, size, err := r.r.Read(ctx, name, KeyPathForBlock(blockID, tenantID), shouldCache)
if err != nil {
return nil, err
}
defer objReader.Close()
return tempo_io.ReadAllWithEstimate(objReader, size)
}
func (r *reader) StreamReader(ctx context.Context, name string, blockID uuid.UUID, tenantID string) (io.ReadCloser, int64, error) {
return r.r.Read(ctx, name, KeyPathForBlock(blockID, tenantID), false)
}
func (r *reader) ReadRange(ctx context.Context, name string, blockID uuid.UUID, tenantID string, offset uint64, buffer []byte, shouldCache bool) error {
return r.r.ReadRange(ctx, name, KeyPathForBlock(blockID, tenantID), offset, buffer, shouldCache)
}

@joe-elliott
Copy link
Member

Oof, scratch that. We check the return of a lot of these methods with statements like:

if err == backend.ErrDoesNotExist {

If we wrapped the error on the lines recommended above it would break these checks. Perhaps it would be better just to make the log change where we request the bloom filters:

return nil, fmt.Errorf("error retrieving bloom (%s, %s): %w", b.meta.TenantID, b.meta.BlockID, err)

return false, fmt.Errorf("error retrieving bloom (%s, %s): %w", b.meta.TenantID, b.meta.BlockID, err)

@AlexDCraig
Copy link
Contributor Author

Cool I'll take a look

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants