Parallel replicas feature is Beta #63151

Open

alexey-milovidov
Member

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

The "parallel replicas" feature is no longer experimental; it is now in beta. The setting allow_experimental_parallel_reading_from_replicas is renamed to use_parallel_replicas.


Modify your CI run:

NOTE: If you merge the PR with a modified CI, you MUST KNOW what you are doing
NOTE: Checked options will be applied if set before CI RunConfig/PrepareRunConfig step

Include tests (required builds will be added automatically):

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Unit tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with Analyzer
  • Add your option here

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • Add your option here

Extra options:

  • do not test (only style check)
  • disable merge-commit (no merge from master before tests)
  • disable CI cache (job reuse)

Only specified batches in multi-batch jobs:

  • 1
  • 2
  • 3
  • 4

@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-improvement Pull request with some product improvements label Apr 30, 2024
@robot-ch-test-poll3
Contributor

robot-ch-test-poll3 commented Apr 30, 2024

This is an automated comment for commit e73981e with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

| Check name | Description | Status |
| --- | --- | --- |
| A Sync | A status of the workflow running in the cloud repository against the sync PR | ⏳ pending |
| CI running | A meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR | ❌ failure |
| ClickHouse build check | Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often have enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log, grepping for cmake. Use these options and follow the general build process | ⏳ pending |
| Fast test | Normally this is the first check that is run for a PR. It builds ClickHouse and runs most of the stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here | ❌ failure |
| Mergeable Check | Checks if all other necessary checks are successful | ❌ failure |

Successful checks

| Check name | Description | Status |
| --- | --- | --- |
| PR Check | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success |
| Push to Dockerhub | The check for building and pushing the CI related docker images to docker hub | ✅ success |
| Style check | Runs a set of checks to keep the code style clean. If some of the tests failed, see the related log from the report | ✅ success |

@alexey-milovidov
Member Author

Currently, there are four modes for parallel replicas:

  1. Sample key;
  2. Task-based;
  3. Custom key by hash;
  4. Custom key by range;

Confusingly, the "sample key" mode is available even without use_parallel_replicas,
and the choice between the "custom key" and "task-based" modes depends on the value of "parallel_replicas_custom_key".

We should introduce a new setting "parallel_replicas_mode" which will decide between these modes.
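
To make the difference concrete, here is a minimal Python sketch that models the implicit selection described above next to the proposed explicit setting. It is an illustration only, not ClickHouse code. Assumptions: the precedence between settings, the use of parallel_replicas_custom_key_filter_type ("default" or "range") to tell the two custom key modes apart, and the value names accepted by parallel_replicas_mode are all hypothetical. The real sample key mode also requires a table with a sampling key, which is not modeled here.

```python
from enum import Enum
from typing import Mapping, Optional


class ParallelReplicasMode(Enum):
    # Value names are hypothetical; the PR discussion does not define them.
    SAMPLE_KEY = "sample_key"
    TASK_BASED = "task_based"
    CUSTOM_KEY_HASH = "custom_key_hash"
    CUSTOM_KEY_RANGE = "custom_key_range"


def resolve_mode_implicit(settings: Mapping[str, object]) -> Optional[ParallelReplicasMode]:
    """Model the current, implicit choice of mode (assumed precedence)."""
    # With a single replica there is nothing to parallelize.
    if int(settings.get("max_parallel_replicas", 1)) <= 1:
        return None
    # A non-empty custom key selects one of the two custom key modes.
    if settings.get("parallel_replicas_custom_key"):
        # Assumption: the filter type distinguishes hash from range.
        filter_type = settings.get("parallel_replicas_custom_key_filter_type", "default")
        if filter_type == "range":
            return ParallelReplicasMode.CUSTOM_KEY_RANGE
        return ParallelReplicasMode.CUSTOM_KEY_HASH
    # Otherwise the feature flag selects the task-based mode.
    if settings.get("use_parallel_replicas"):
        return ParallelReplicasMode.TASK_BASED
    # The sample key mode is reachable even without the feature flag.
    return ParallelReplicasMode.SAMPLE_KEY


def resolve_mode_explicit(settings: Mapping[str, object]) -> Optional[ParallelReplicasMode]:
    """Model the proposed behaviour: one setting names the mode directly."""
    raw = settings.get("parallel_replicas_mode")
    if raw is None:
        return None
    try:
        return ParallelReplicasMode(str(raw))
    except ValueError:
        raise ValueError(f"unknown parallel_replicas_mode: {raw!r}") from None
```

With the explicit variant, a reader of a query's SETTINGS clause can tell which mode is in effect without knowing how several unrelated settings interact.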

@@ -31,7 +31,7 @@ test1() {
 GROUP BY CounterID, URL, EventDate
 ORDER BY URL, EventDate
 LIMIT 5 OFFSET 10
-SETTINGS optimize_aggregation_in_order = 1, enable_memory_bound_merging_of_aggregation_results = 1, allow_experimental_parallel_reading_from_replicas = 1, parallel_replicas_for_non_replicated_merge_tree = 1, max_parallel_replicas = 3"
+SETTINGS optimize_aggregation_in_order = 1, enable_memory_bound_merging_of_aggregation_results = 1, use_parallel_replicas = 1, parallel_replicas_for_non_replicated_merge_tree = 1, max_parallel_replicas = 3"
Member

Let's keep at least one test that uses the old name, as a compatibility check.

Member

@devcrafter devcrafter left a comment

LGTM

@nikitamikhaylov nikitamikhaylov self-assigned this May 6, 2024
@nikitamikhaylov
Member

@alexey-milovidov Parallel replicas modes 1, 3, and 4 currently work on top of a Distributed table, while task-based parallel replicas work on top of MergeTree. We should unify this as well.

Member

@nikitamikhaylov nikitamikhaylov left a comment

Let's make the parallel replicas with custom key work on top of MergeTree first and introduce the setting parallel_replicas_mode: #63521

@devcrafter
Member

> Let's make the parallel replicas with custom key work on top of MergeTree first and introduce the setting parallel_replicas_mode: #63521

Unifying the settings and implementing custom key on top of MergeTree are two different things and can be done independently.

Labels
pr-improvement Pull request with some product improvements

5 participants