[ARCTIC-946][AMS] Add balanced schedule policy for optimizer #993

Merged
3 commits merged on Jan 13, 2023

Conversation

shidayang
Contributor

@shidayang shidayang commented Jan 10, 2023

Why are the changes needed?

linked: #946

Brief change log

  • Add an optimizer group config that determines the scheduling policy for optimization
  • Add schedule policies in com.netease.arctic.ams.server.service.impl.OptimizeQueueService

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions github-actions bot added module:ams-server Ams server module module:ams-dashboard Ams dashboard module type:docs Improvements or additions to documentation labels Jan 10, 2023
@shidayang shidayang closed this Jan 10, 2023
@shidayang shidayang reopened this Jan 10, 2023
@shidayang
Contributor Author

Under the quota policy, the stock table is not optimized for a long time.
Under the balanced policy, it looks like this:
image

I also ran a benchmark.
image

@@ -58,6 +58,9 @@ public class ConfigFileProperties {
public static final String OPTIMIZE_GROUP_NAME = "name";
public static final String OPTIMIZE_GROUP_CONTAINER = "container";
public static final String OPTIMIZE_GROUP_PROPERTIES = "properties";
public static final String OPTIMIZE_SCHEDULING_POLICY = "scheduling_policy";
public static final String OPTIMIZE_SCHEDULING_POLICY_QUOTA = "quota";
Contributor

I think it would be better to define the specific policies with an enum.

Contributor Author

The config system needs to be refactored, for example by introducing a ConfigOption class to resolve all configs, but that is separate work.
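
For illustration only, the enum suggested above could look roughly like the sketch below. The `SchedulingPolicy` type and its `fromConfig` method are hypothetical names that do not appear in this PR. The `"balanced"` value and the fallback to `QUOTA` for an unset value are assumptions, since only the `"quota"` constant is visible in the diff excerpt.

```java
import java.util.Locale;

// Hypothetical enum; the PR itself keeps plain String constants in ConfigFileProperties.
enum SchedulingPolicy {
  QUOTA("quota"),
  BALANCED("balanced"); // assumed config value, not shown in the diff excerpt

  private final String configValue;

  SchedulingPolicy(String configValue) {
    this.configValue = configValue;
  }

  String configValue() {
    return configValue;
  }

  // Resolve a "scheduling_policy" config value, ignoring case and surrounding spaces.
  // Falling back to QUOTA for a missing value is an assumption of this sketch.
  static SchedulingPolicy fromConfig(String value) {
    if (value == null || value.trim().isEmpty()) {
      return QUOTA;
    }
    String normalized = value.trim().toLowerCase(Locale.ROOT);
    for (SchedulingPolicy policy : values()) {
      if (policy.configValue.equals(normalized)) {
        return policy;
      }
    }
    throw new IllegalArgumentException("Unknown scheduling policy: " + value);
  }
}
```

With an enum like this, an unknown policy name fails fast at config parsing time instead of surfacing later as a string mismatch.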

* A policy that determines how to schedule a table.
* Tables at the head of the list have higher optimizing priority
*/
interface SchedulePolicy {
Contributor

The structure would be clearer if the interface and its implementations were top-level classes.
Otherwise, the OptimizeQueueService class contains too much code.

Contributor Author

@shidayang shidayang Jan 11, 2023

I don't understand what you mean. Could you explain in detail?

Contributor

> I don't understand what you mean. Could you explain in detail?

I mean, wouldn't it be better to define the SchedulePolicy interface and its implementations as top-level classes rather than inner classes?

Contributor Author

SchedulePolicy is specific to the queue, so I still think it works better as an inner class.
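
To make the inner-class design under discussion concrete, here is a minimal sketch. It is not the code from this PR. Only the `SchedulePolicy` name and its Javadoc contract (tables at the head of the list have higher priority) come from the diff. `OptimizeQueueSketch`, `TableItem`, and both implementations are hypothetical. The two orderings, lowest quota occupy first for quota and least recently optimized first for balanced, are assumptions about how such policies might rank tables.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for OptimizeQueueService. Only the SchedulePolicy name
// comes from the PR; every other name here is illustrative.
class OptimizeQueueSketch {

  // Minimal per-table state needed to order tables.
  static final class TableItem {
    final String name;
    final double quotaOccupy;      // actual usage relative to the table's quota
    final long lastOptimizeTimeMs; // timestamp of the last finished optimization

    TableItem(String name, double quotaOccupy, long lastOptimizeTimeMs) {
      this.name = name;
      this.quotaOccupy = quotaOccupy;
      this.lastOptimizeTimeMs = lastOptimizeTimeMs;
    }
  }

  /** Tables at the head of the returned list have higher optimizing priority. */
  interface SchedulePolicy {
    List<TableItem> schedule(List<TableItem> tables);
  }

  // Assumed quota ordering: the table with the lowest quota occupy goes first.
  static final class QuotaSchedulePolicy implements SchedulePolicy {
    @Override
    public List<TableItem> schedule(List<TableItem> tables) {
      List<TableItem> sorted = new ArrayList<>(tables); // leave the input untouched
      sorted.sort(Comparator.comparingDouble((TableItem t) -> t.quotaOccupy));
      return sorted;
    }
  }

  // Assumed balanced ordering: the least recently optimized table goes first.
  static final class BalancedSchedulePolicy implements SchedulePolicy {
    @Override
    public List<TableItem> schedule(List<TableItem> tables) {
      List<TableItem> sorted = new ArrayList<>(tables); // leave the input untouched
      sorted.sort(Comparator.comparingLong((TableItem t) -> t.lastOptimizeTimeMs));
      return sorted;
    }
  }
}
```

Keeping the interface nested ties it to the queue's own table state, which is the author's argument. Moving it to a top-level file, as the reviewer suggests, would shrink the enclosing class without changing this contract.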

Contributor

@zhoujinsong zhoujinsong left a comment

@shidayang Thanks a lot for your contribution. The implementation looks good to me.
I did find some code that could probably be improved, though, so please take another look.

@zhoujinsong
Contributor

zhoujinsong commented Jan 11, 2023

> Under the quota policy, the stock table is not optimized for a long time. Under the balanced policy, it looks like this: image
>
> I also ran a benchmark. image

Cool! Performance improves greatly once the balanced optimizing schedule policy is introduced.
Should we update the performance test results in the Benchmark chapter on the site?
Of course, we can do that in another PR.

@shidayang
Contributor Author

> > Under the quota policy, the stock table is not optimized for a long time. Under the balanced policy, it looks like this: image
> > I also ran a benchmark. image
>
> Cool! Performance improves greatly once the balanced optimizing schedule policy is introduced. Should we update the performance test results in the Benchmark chapter on the site? Of course, we can do that in another PR.

OK! I will update the performance test results in this PR.

@github-actions github-actions bot added the module:mixed-hive Hive moduel for Mixed Format label Jan 11, 2023
Contributor

@zhoujinsong zhoujinsong left a comment

LGTM.


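If you are using tables that are never updated, such as logs or sensor data, and you are already used to the optimize command provided by Iceberg, consider disabling self-optimizing on the table with the following configuration:
## Self-optimizing scheduling policy
The scheduling policy is the strategy AMS uses when dispatching tasks to optimizers. It determines which tables' tasks are handed to the optimizer in a given scheduling round.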
Contributor

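The scheduling policy is the strategy AMS uses to decide the order in which different tables run self-optimizing. The chosen policy determines how much self-optimizing resource each table can actually occupy. Arctic uses Quota to define the expected resource usage of each table, and Quota occupy is the percentage of resources actually used relative to that expected usage.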
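
As a worked illustration of the Quota occupy definition above (actual usage as a percentage of expected usage), a minimal sketch follows. The `QuotaOccupy` class and its `percent` method are hypothetical, and the units of both arguments are an assumption: they only need to match each other. How AMS actually measures usage is not shown in this conversation.

```java
// Hypothetical helper; not part of Arctic. It only restates the definition
// "actual usage relative to expected usage, as a percentage".
final class QuotaOccupy {
  private QuotaOccupy() {}

  // actualUsage and quota must be in the same unit (assumed, e.g. CPU cores).
  static double percent(double actualUsage, double quota) {
    if (quota <= 0) {
      throw new IllegalArgumentException("quota must be positive");
    }
    return actualUsage / quota * 100.0;
  }
}
```

For example, a table with a quota of 2.0 that actually consumed 3.0 has a Quota occupy of 150 percent, meaning it used more than its expected share.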

Contributor

I suggest adding a screenshot of quota and quota occupy below this to illustrate what the two terms mean.

Contributor

@majin1102 majin1102 left a comment

modify documents

zhoujinsong pushed a commit that referenced this pull request May 31, 2023
* Add balanced schedule policy for optimizer

* Fix timezone error of int96 predicate push down

* Fix timezone error of int96 predicate push down
@shidayang shidayang deleted the auto-quota branch October 24, 2023 13:08
Labels
module:ams-dashboard Ams dashboard module module:ams-server Ams server module type:docs Improvements or additions to documentation

4 participants