feat: add a compaction test to the load generator #24925

Closed
hiltontj wants to merge 7 commits

@hiltontj hiltontj commented Apr 17, 2024

Part of #24919

Added a new compact sub-command to the load generator. This provides a means to test compaction performance for a variety of parameters:

  • --num-tags: number of tags in generated measurement data
  • --num-rows: number of rows per file generated
  • --cardinality: the cardinality, i.e., total unique combinations of tags
  • --num-input-files: the number of files that will be generated, and fed into the compaction routine
  • --series-id: use a data model with the _series_id column, or not
  • --num-threads: number of threads to give the iox_query::Executor that performs the compaction
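To make the relationship between `--num-tags` and `--cardinality` concrete, here is a minimal sketch of one way to produce exactly `cardinality` unique tag combinations from a fixed number of tags. It is not taken from this PR: the function names (`values_per_tag`, `tag_set`, `all_tag_sets`) and the `tagN=vM` naming scheme are hypothetical, and the load generator may distribute tag values differently.

```rust
/// Smallest number of distinct values per tag such that
/// `values.pow(num_tags) >= cardinality`. (Hypothetical helper.)
fn values_per_tag(num_tags: u32, cardinality: u64) -> u64 {
    assert!(num_tags > 0, "at least one tag is required");
    let mut values: u64 = 1;
    while values.saturating_pow(num_tags) < cardinality {
        values += 1;
    }
    values
}

/// Tag values for one series, obtained by writing the series index as
/// mixed-radix digits, one digit per tag. Distinct series indexes below
/// `cardinality` always yield distinct tag sets.
fn tag_set(series: u64, num_tags: u32, cardinality: u64) -> Vec<String> {
    let base = values_per_tag(num_tags, cardinality);
    let mut rest = series;
    (0..num_tags)
        .map(|t| {
            let digit = rest % base;
            rest /= base;
            format!("tag{t}=v{digit}")
        })
        .collect()
}

/// All tag sets for the requested cardinality.
fn all_tag_sets(num_tags: u32, cardinality: u64) -> Vec<Vec<String>> {
    (0..cardinality)
        .map(|s| tag_set(s, num_tags, cardinality))
        .collect()
}
```

With three tags and a cardinality of 10, each tag takes one of three values, and `all_tag_sets(3, 10)` yields ten distinct three-tag combinations.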

It works by generating a set of parquet files with the given number of rows, each row having the given number of tags, and resulting in the given cardinality. Each generated file is sorted by the primary key of the data model under test, i.e., with or without _series_id column.
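The generation step can be sketched roughly as follows. This is an illustration only, not the PR's code: it builds rows in memory instead of writing parquet, the `Row` type and function names are hypothetical, the primary key is assumed to be (tags, time) or (series ID, time), and `DefaultHasher` is a stand-in for however `_series_id` is actually derived.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One generated row (hypothetical shape; a real file would also hold fields).
#[derive(Debug, Clone, PartialEq)]
struct Row {
    tags: Vec<String>,
    series_id: u64,
    time: i64,
}

/// Stand-in series ID: a hash of the tag values. This is an assumption,
/// not the hashing scheme used by the data model in the PR.
fn hash_tags(tags: &[String]) -> u64 {
    let mut hasher = DefaultHasher::new();
    tags.hash(&mut hasher);
    hasher.finish()
}

/// Generate `num_rows` rows spread round-robin over `cardinality` series,
/// then sort by the primary key of the chosen data model.
fn generate_file(
    num_rows: usize,
    num_tags: usize,
    cardinality: usize,
    use_series_id: bool,
) -> Vec<Row> {
    assert!(cardinality > 0, "cardinality must be non-zero");
    let mut rows: Vec<Row> = (0..num_rows)
        .map(|i| {
            let series = i % cardinality;
            let tags: Vec<String> = (0..num_tags)
                .map(|t| format!("s{series}-t{t}"))
                .collect();
            let id = hash_tags(&tags);
            Row { tags, series_id: id, time: i as i64 }
        })
        .collect();
    if use_series_id {
        // Assumed key with _series_id: (series ID, time).
        rows.sort_by(|a, b| (a.series_id, a.time).cmp(&(b.series_id, b.time)));
    } else {
        // Assumed key without _series_id: (tag values, time).
        rows.sort_by(|a, b| (&a.tags, a.time).cmp(&(&b.tags, b.time)));
    }
    rows
}
```

Calling `generate_file(100, 3, 10, false)` gives 100 rows over 10 series, ordered by tag values and then time. Passing `true` orders the same rows by the stand-in series ID instead.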

It then streams the generated files through the iox_query::ReorgPlanner::compact routine to compact the generated data and saves the output in a new set of files. The new set of files is a re-shuffle of the input: there are as many output files as input files, but each has reduced cardinality because compaction sorts the data.
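Why the output files end up with lower cardinality can be seen in a small model of the compaction step. The sketch below does not use `iox_query`. It assumes compaction behaves like a k-way merge of the sorted inputs followed by an even split into output files, and the `Row` alias and function names are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Simplified row: (series key, time). Inputs are assumed sorted by this pair.
type Row = (u64, i64);

/// Merge sorted input files into one sorted stream, then cut the stream into
/// roughly as many output files as there were inputs (exactly as many when
/// the total row count divides evenly).
fn compact(inputs: &[Vec<Row>]) -> Vec<Vec<Row>> {
    // Min-heap of (row, file index, row index), seeded with each file's head.
    let mut heap = BinaryHeap::new();
    for (file_idx, file) in inputs.iter().enumerate() {
        if let Some(&row) = file.first() {
            heap.push(Reverse((row, file_idx, 0usize)));
        }
    }
    let mut merged: Vec<Row> = Vec::new();
    while let Some(Reverse((row, file_idx, row_idx))) = heap.pop() {
        merged.push(row);
        // Refill the heap from the file the popped row came from.
        if let Some(&next) = inputs[file_idx].get(row_idx + 1) {
            heap.push(Reverse((next, file_idx, row_idx + 1)));
        }
    }
    let num_files = inputs.len().max(1);
    let rows_per_file = ((merged.len() + num_files - 1) / num_files).max(1);
    merged.chunks(rows_per_file).map(|c| c.to_vec()).collect()
}

/// Number of distinct series keys in one file.
fn distinct_series(file: &[Row]) -> usize {
    let mut keys: Vec<u64> = file.iter().map(|row| row.0).collect();
    keys.sort_unstable();
    keys.dedup();
    keys.len()
}
```

Because the merged stream is ordered by series first, each output file covers a contiguous range of series. For example, four input files that each contain the same eight series compact into four output files holding two series each.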

hiltontj commented May 7, 2024

Closing after we decided to move on from _series_id. See #24815

@hiltontj hiltontj closed this May 7, 2024