Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support maxRowsPerSegment for auto compaction #6780

Merged
merged 9 commits into from
Jan 10, 2019

Conversation

jihoonson
Copy link
Contributor

This PR mainly contains three changes:

@@ -116,23 +116,31 @@ public void testIndexTaskTuningConfigTargetPartitionSizeOrNumShards() throws Exc
IndexTask.IndexTuningConfig.class
);

Assert.assertEquals(10, (int) tuningConfig.getTargetPartitionSize());
Assert.assertEquals(10, (int) tuningConfig.getMaxRowsPerSegment());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please add a backwards compatibility test if not already present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I added a test when both maxRowsPerSegment and targetPartitionSize don't exist. We already have a test for when only targetPartitionSize is provided here.

skipOffsetFromLatest,
tuningConfig,
taskContext
);
}

public static class UserCompactTuningConfig extends ClientCompactQueryTuningConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have two separate configs ? the only difference seems to be maxRowsPerSegment in these two.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. So, I added maxRowsPerSegment as a constructor parameter because it's similar to targetPartitionSize and they should be at the same level. But, this can make users confused because ClientCompactionQueryTuningConfig also has maxRowsPerSegment. So I added UserCompactTuningConfig which doesn't have maxRowsPerSegment.

@jihoonson
Copy link
Contributor Author

@nishantmonu51 do you have more comments?

this.maxRowsInMemory = maxRowsInMemory == null ? DEFAULT_MAX_ROWS_IN_MEMORY : maxRowsInMemory;
this.maxTotalRows = maxTotalRows == null ? DEFAULT_MAX_TOTAL_ROWS : maxTotalRows;
this.indexSpec = indexSpec == null ? DEFAULT_INDEX_SPEC : indexSpec;
this.maxPendingPersists = maxPendingPersists == null ? DEFAULT_MAX_PENDING_PERSISTS : maxPendingPersists;
this.publishTimeout = publishTimeout == null ? DEFAULT_PUBLISH_TIMEOUT : publishTimeout;
this.pushTimeout = pushTimeout == null ? DEFAULT_PUBLISH_TIMEOUT : pushTimeout;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change DEFAULT_PUBLISH_TIMEOUT to DEFAULT_PUSH_TIMEOUT?

@@ -47,7 +47,8 @@ Compaction tasks merge all segments of the given interval. The syntax is:
|`id`|Task id|No|
|`dataSource`|DataSource name to be compacted|Yes|
|`interval`|Interval of segments to be compacted|Yes|
|`dimensions`|Custom dimensionsSpec. compaction task will use this dimensionsSpec if exist instead of generating one. See below for more details.|No|
|`dimensionsSpec`|Custom dimensionsSpec. Compaction task will use this dimensionsSpec if exist instead of generating one. See below for more details.|No|
|`metricsSpec`|Custom metricsSpec. Compaction task will use this metricsSpec if specified rather than generating one.|No|
|`segmentGranularity`|If this is set, compactionTask will change the segment granularity for the given interval. See [segmentGranularity of Uniform Granularity Spec](./ingestion-spec.html#uniform-granularity-spec) for more details. See the below table for the behavior.|No|
|`keepSegmentGranularity`|Deprecated. Please use `segmentGranularity` instead. See the below table for its behavior.|No|
|`targetCompactionSizeBytes`|Target segment size after comapction. Cannot be used with `targetPartitionSize`, `maxTotalRows`, and `numShards` in tuningConfig.|No|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still is targetPartitionSize? And after look into the code, it seems there is no maxRowsPerSegment for compaction tuningConfig.

@@ -77,7 +78,7 @@ An example of compaction task is

This compaction task reads _all segments_ of the interval `2017-01-01/2018-01-01` and results in new segments.
Since both `segmentGranularity` and `keepSegmentGranularity` are null, the original segment granularity will be remained and not changed after compaction.
To control the number of result segments per time chunk, you can set `targetPartitionSize` or `numShards`. See [indexTuningConfig](../ingestion/native_tasks.html#tuningconfig) for more details.
To control the number of result segments per time chunk, you can set `maxRowsPerSegment` or `numShards`. See [indexTuningConfig](../ingestion/native_tasks.html#tuningconfig) for more details.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same, there is no maxRowsPerSegment and numShards for compaction tuningConfig. The maxRowsPerSegment is at the same level with targetCompactionSizeBytes, not in tuningConfig.

Copy link
Member

@QiuMM QiuMM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jihoonson
Copy link
Contributor Author

@QiuMM thanks for the review!
@nishantmonu51 I'm merging this PR, but please leave any comments if you have more. I'll do in a follow-up pr.

@jihoonson jihoonson merged commit c35a39d into apache:master Jan 10, 2019
@jon-wei jon-wei added this to the 0.14.0 milestone Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants