MSQ: Change default clusterStatisticsMergeMode to SEQUENTIAL. #14310

gianm · 2023-05-18T15:23:19Z

This is an undocumented parameter that controls how cluster-by statistics are merged. In PARALLEL mode, statistics are gathered from workers all at once. In SEQUENTIAL mode, statistics are gathered time chunk by time chunk. This improves accuracy for jobs with many time chunks, and reduces memory usage.

The main downside of SEQUENTIAL is that it can take longer, but in most situations I've seen, PARALLEL is only really usable in cases where the sketches are small enough that SEQUENTIAL would also run relatively quickly. So it seems like SEQUENTIAL is a better default.
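To make the memory argument concrete, here is a toy model of the two modes. It is only a sketch and uses no Druid classes: `MergeModeSketch`, `MergeMode`, and `peakControllerBytes` are hypothetical names. It assumes the controller can discard one time chunk's statistics before fetching the next in SEQUENTIAL mode, whereas in PARALLEL mode everything from every worker is held at once.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names are Druid classes.
final class MergeModeSketch {
    enum MergeMode { PARALLEL, SEQUENTIAL }

    /**
     * Each list element is one worker, mapping time chunk -> size in bytes of the
     * statistics that worker holds for that chunk. Returns the peak number of bytes
     * the controller would hold at once under the given mode.
     */
    static long peakControllerBytes(MergeMode mode, List<Map<Long, Long>> workerSketchSizes) {
        if (mode == MergeMode.PARALLEL) {
            // Everything from every worker is gathered at once.
            long total = 0;
            for (Map<Long, Long> worker : workerSketchSizes) {
                for (long size : worker.values()) {
                    total += size;
                }
            }
            return total;
        }

        // SEQUENTIAL: gather time chunk by time chunk, so only one chunk's
        // statistics (summed across workers) are resident at a time.
        Map<Long, Long> bytesPerChunk = new TreeMap<>();
        for (Map<Long, Long> worker : workerSketchSizes) {
            for (Map.Entry<Long, Long> entry : worker.entrySet()) {
                bytesPerChunk.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        long peak = 0;
        for (long chunkBytes : bytesPerChunk.values()) {
            peak = Math.max(peak, chunkBytes);
        }
        return peak;
    }
}
```

With two workers and two time chunks, the sequential peak is the largest single chunk rather than the sum of all chunks, which is where the memory saving for jobs with many time chunks would come from under these assumptions.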

cryptoe · 2023-05-22T06:16:06Z

Segment cuts in sequential mode would be at least equal to segment cuts in parallel mode. I have seen cases where the job ran 30% faster after changing modes from parallel to sequential when the number of workers was 1000.

cryptoe · 2023-06-24T01:10:50Z

We should also change MSQTestBase#245 to PARALLEL_MERGE_CONTEXT, since the default has now changed.

This can be done in another PR as well.

gianm · 2023-06-26T05:39:05Z

> We should also change MSQTestBase#245 to PARALLEL_MERGE_CONTEXT, since the default has now changed. This can be done in another PR as well.

Ah, I'll do it in this patch, since it needs updates anyway due to one of the test cases failing. Currently, the test case MSQFaultsTest.testInsertCannotBeEmptyFault is timing out, I suppose because the sequential fetching logic doesn't handle the case where no workers have any time chunks. @cryptoe any idea why that might be happening? If not, I'll take a deeper look into it soon.

cryptoe · 2023-06-26T10:05:02Z

@gianm Quite possible we missed this case, since we throw InsertCannotBeEmptyFault only after getting the partition boundaries.

    if (isTimeBucketed && partitionBoundaries.equals(ClusterByPartitions.oneUniversalPartition())) {
      throw new MSQException(new InsertCannotBeEmptyFault(task.getDataSource()));
    } else {
      log.info("Query [%s] generating %d segments.", queryDef.getQueryId(), partitionBoundaries.size());
    }

The fix would be here: WorkerSketchFetcher#235. We need to check whether CompleteKeyStatisticsInformation is empty.

gianm · 2023-06-26T16:20:23Z

> The fix would be here: WorkerSketchFetcher#235. We need to check whether CompleteKeyStatisticsInformation is empty.

TY, I added a block there that registers ClusterByStatisticsSnapshot.empty() for all workers if completeKeyStatisticsInformation.getTimeSegmentVsWorkerMap().isEmpty(). Please let me know if this looks good.
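A rough sketch of the shape of that guard, for readers following along. This is not the actual patch: `EmptyTimeChunkSketch`, `Snapshot`, and `fetchSequentially` are hypothetical stand-ins, and the real fetcher works asynchronously against workers. It only illustrates the idea that when the time-chunk-to-worker map is empty, every worker gets an empty snapshot so the caller is not left waiting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;

// Hypothetical stand-ins; these are not the Druid classes of similar names.
final class EmptyTimeChunkSketch {
    /** Stand-in for a per-worker statistics snapshot: just the time chunks it covers. */
    static final class Snapshot {
        final List<Long> timeChunks = new ArrayList<>();

        static Snapshot empty() {
            return new Snapshot();
        }

        boolean isEmpty() {
            return timeChunks.isEmpty();
        }
    }

    /**
     * Walks time chunks in order and records, per worker, which chunks were fetched.
     * If no worker reported any time chunk, registers an empty snapshot for every
     * worker instead of waiting on fetches that will never be issued.
     */
    static Map<Integer, Snapshot> fetchSequentially(
            SortedMap<Long, Set<Integer>> timeSegmentVsWorkerMap,
            int workerCount
    ) {
        Map<Integer, Snapshot> snapshots = new HashMap<>();

        if (timeSegmentVsWorkerMap.isEmpty()) {
            // The guard: nothing to fetch, so complete immediately with empty results.
            for (int worker = 0; worker < workerCount; worker++) {
                snapshots.put(worker, Snapshot.empty());
            }
            return snapshots;
        }

        // Normal path (simplified): one time chunk at a time, in ascending order.
        for (Map.Entry<Long, Set<Integer>> entry : timeSegmentVsWorkerMap.entrySet()) {
            for (int worker : entry.getValue()) {
                snapshots.computeIfAbsent(worker, w -> new Snapshot())
                         .timeChunks.add(entry.getKey());
            }
        }
        return snapshots;
    }
}
```

In this simplified model the empty-input case returns a complete result for all workers, so a downstream check such as the InsertCannotBeEmptyFault one can run instead of timing out.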

cryptoe

Thanks for the changes @gianm

gianm added the Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 label May 18, 2023

cryptoe approved these changes May 22, 2023

Merge branch 'master' into msq-cmm-default

6eed88a

gianm added 3 commits June 24, 2023 15:46

Merge branch 'master' into msq-cmm-default

7de24a7

Switch off-test from SEQUENTIAL to PARALLEL.

8fceed1

Merge branch 'master' into msq-cmm-default

db1e950

gianm added 2 commits June 26, 2023 08:11

Merge branch 'master' into msq-cmm-default

36dd301

Fix sequential merge for situations where there are no time chunks at all.

0d2dd03

Add a couple more tests.

548d43d

cryptoe approved these changes Jun 26, 2023

gianm merged commit 8211379 into apache:master Jun 26, 2023
45 checks passed

gianm deleted the msq-cmm-default branch June 26, 2023 17:54

abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023

AmatyaAvadhanula mentioned this pull request Aug 6, 2023

[DRAFT] 27.0.0 release notes #14761

Closed
