Include missing file data if querier fails to find trace by ID #1730

AlexDCraig · 2022-09-12T16:56:56Z

Is your feature request related to a problem? Please describe.

If a block folder is corrupted, for instance because bloom files are missing, the querier fails to iterate over it and logs a somewhat opaque error like the following:

level=warn ts=2022-09-09T21:51:49.828334419Z caller=querier.go:246 msg="failed to query 1 blocks" blockErrs="error finding trace by id, blockID: bd96ed36-2e90-42f0-9d3f-995b3290be17: error retrieving bloom (single-tenant, bd96ed36-2e90-42f0-9d3f-995b3290be17): does not exist"

Describe the solution you'd like

A list of the bloom files or other files, or even a subset of those files if the list is long, would be helpful debugging info.

Describe alternatives you've considered

You can examine the folder manually to check whether any number is missing from the sequence of bloom files.

Additional context

If this is not relevant to the parquet backend, we can close this.

joe-elliott · 2022-09-12T19:25:14Z

This is not a bad idea. I think we could probably wrap errors in these methods to include information about the requested objects:

tempo/tempodb/backend/raw.go

Lines 116 to 131 in 9a135a9

    func (r *reader) Read(ctx context.Context, name string, blockID uuid.UUID, tenantID string, shouldCache bool) ([]byte, error) {
    	objReader, size, err := r.r.Read(ctx, name, KeyPathForBlock(blockID, tenantID), shouldCache)
    	if err != nil {
    		return nil, err
    	}
    	defer objReader.Close()
    	return tempo_io.ReadAllWithEstimate(objReader, size)
    }

    func (r *reader) StreamReader(ctx context.Context, name string, blockID uuid.UUID, tenantID string) (io.ReadCloser, int64, error) {
    	return r.r.Read(ctx, name, KeyPathForBlock(blockID, tenantID), false)
    }

    func (r *reader) ReadRange(ctx context.Context, name string, blockID uuid.UUID, tenantID string, offset uint64, buffer []byte, shouldCache bool) error {
    	return r.r.ReadRange(ctx, name, KeyPathForBlock(blockID, tenantID), offset, buffer, shouldCache)
    }

joe-elliott · 2022-09-12T19:29:53Z

Oof, scratch that. We check the return of a lot of these methods with statements like:

if err == backend.ErrDoesNotExist {

If we wrapped the error on the lines recommended above it would break these checks. Perhaps it would be better just to make the log change where we request the bloom filters:

tempo/tempodb/encoding/v2/backend_block.go

Line 56 in 9a135a9

    return nil, fmt.Errorf("error retrieving bloom (%s, %s): %w", b.meta.TenantID, b.meta.BlockID, err)

tempo/tempodb/encoding/vparquet/block_findtracebyid.go

Line 150 in 9a135a9

    return false, fmt.Errorf("error retrieving bloom (%s, %s): %w", b.meta.TenantID, b.meta.BlockID, err)

AlexDCraig · 2022-09-12T20:16:20Z

Cool I'll take a look

joe-elliott added the "good first issue" label Sep 12, 2022

AlexDCraig mentioned this issue Sep 14, 2022

Identify bloom that could not be retrieved from backend block #1737

Merged

mdisibio closed this as completed in #1737 Sep 14, 2022
