[vparquet3] Add command to tempo-cli to analyse blocks for dedicated columns #2622

Merged
merged 17 commits into from
Jul 28, 2023

Conversation

mapno
Member

@mapno mapno commented Jul 6, 2023

What this PR does:

NOTE: Depends on vparquet3 being merged to main

Adds two new commands to tempo-cli that analyse parquet blocks and output summaries of their generic attribute columns:

Analyse block

Analyses a block and outputs a summary of the block's generic attributes.
It's of particular use when trying to determine what attributes to configure for dedicated columns in vParquet3.

Arguments:

  • tenant-id The tenant ID. Use single-tenant for single tenant setups.
  • block-id The block ID as UUID string.

Options:

  • Backend options
  • --num-attr <value> Number of attributes to output (default: 10)

Example:

tempo-cli analyse block --backend=local --bucket=./cmd/tempo-cli/test-data/ single-tenant b18beca6-4d7f-4464-9f72-f343e688a4a0

Analyse blocks

Analyses all blocks in a given time range and outputs a summary of the blocks' generic attributes. It's of particular use when trying to determine what attributes to configure for dedicated columns in vParquet3.

Arguments:

  • tenant-id The tenant ID. Use single-tenant for single tenant setups.

Options:

  • Backend options
  • --num-attr <value> Number of attributes to output (default: 10)
  • --min-compaction-level <value> Minimum compaction level to include in the analysis (default: 3)
  • --max-blocks <value> Maximum number of blocks to analyse (default: 10)

Example:

tempo-cli analyse blocks --backend=local --bucket=./cmd/tempo-cli/test-data/ single-tenant

Which issue(s) this PR fixes:
Fixes #2630

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Collaborator

@stoewer stoewer left a comment

Works like a charm :)

Comment on lines 60 to 64
case vparquet.VersionString:
return vparquet.FieldSpanAttrKey, vparquetSpanAttrs
case vparquet2.VersionString:
return vparquet2.FieldSpanAttrKey, vparquet2SpanAttrs
}
Collaborator

I assume this will be vParquet2 and vParquet3 as soon as vParquet3 is released?

Member Author

Certainly 👍

- func newBackendBlock(meta *backend.BlockMeta, r backend.Reader) *backendBlock {
- 	return &backendBlock{
+ func NewBackendBlock(meta *backend.BlockMeta, r backend.Reader) *BackendBlock {
Collaborator

@stoewer stoewer Jul 14, 2023

I'm wondering whether there is an alternative to making this function and BackendBlock public. Maybe the CLI could have a thin wrapper around backend.ContextReader that implements io.ReaderAt and can be passed into parquet.OpenFile().

Member Author

Used NewBackendReaderAt() and passed the reader to parquet.OpenFile() instead. No need to export BackendBlock or any of its methods. Nice call.

@mapno mapno mentioned this pull request Jul 17, 2023
3 tasks

## Analyse blocks
Analyses all blocks in a given time range and outputs a summary of the blocks' generic attributes.
It's of particular use when trying to determine what attributes to configure for dedicated columns in vParquet3.
Contributor

Maybe link to the dedicated columns page (PR#2664)?

Contributor

@knylander-grafana knylander-grafana left a comment

Approving the doc portion of the PR. Thank you for adding doc!

mapno and others added 3 commits July 18, 2023 09:55
Collaborator

@stoewer stoewer left a comment

LGTM

@stoewer stoewer merged commit 41dfbc4 into grafana:main Jul 28, 2023
15 checks passed
Successfully merging this pull request may close these issues.

Method to identify candidates for dedicated columns