
Misc. improvements for priority task processing #2897

Merged: merged 1 commit into temporalio:master on May 24, 2022

Conversation

@yycptt (Member) commented May 24, 2022

What changed?

  • Tune various configuration default values
  • Handle namespace not found errors in the priority assigner (see the sketch after this list)
  • Add more buckets for dimensionless histogram metrics
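
As a hedged illustration of the second item, here is a minimal sketch of how a priority assigner might swallow a namespace-not-found lookup error and fall back to a default priority. The registry interface, method names, and priority values are assumptions for illustration, not the server's actual code; only the use of `serviceerror.NamespaceNotFound` from go.temporal.io/api is taken from the real error type.

```go
// A minimal sketch, not the actual priority assigner: if the namespace lookup
// fails because the namespace no longer exists, assign a default priority
// instead of returning the error, so the task can still be processed.
package tasks

import (
	"errors"

	"go.temporal.io/api/serviceerror"
)

type Priority int

const (
	PriorityHigh    Priority = 0
	PriorityDefault Priority = 10
)

// namespaceRegistry is a stand-in for the server's namespace cache/registry.
type namespaceRegistry interface {
	GetNamespaceName(id string) (string, error)
}

type priorityAssigner struct {
	registry namespaceRegistry
}

func (a *priorityAssigner) Assign(namespaceID string) (Priority, error) {
	if _, err := a.registry.GetNamespaceName(namespaceID); err != nil {
		var notFound *serviceerror.NamespaceNotFound
		if errors.As(err, &notFound) {
			// Namespace was deleted or never registered on this cluster:
			// don't fail the assignment, just use the default priority.
			return PriorityDefault, nil
		}
		return PriorityDefault, err
	}
	return PriorityHigh, nil
}
```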

Why?

  • Improve priority task processing performance

How did you test it?

  • Tested locally with bench

Potential risks
N/A, still disabled by default.

Is hotfix candidate?

  • No

@yycptt yycptt requested a review from a team as a code owner May 24, 2022 21:33
+	10000,
+	20000,
+	50000,
+	100000,
@yycptt (Member, Author):

We emit a metric for the number of pending tasks in a shard and suspend task loading when that number gets too high. The original maximum bucket of 1000 is too low: it neither gives a good estimate of the number of pending tasks nor tells us whether task loading has been suspended. (A sketch of the extended bucket list is below.)
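
For illustration, this is roughly what the extended bucket list might look like. The variable name and the smaller bucket values are assumptions; only the new top values (10000, 20000, 50000, 100000) come from this PR's diff.

```go
// A minimal sketch, not the server's actual metrics definitions: extending the
// dimensionless histogram buckets so pending-task counts well above 1000 are
// still resolved instead of all landing in a single overflow bucket.
package metrics

// pendingTaskCountBuckets is a hypothetical name; the smaller values are
// illustrative, while 10000/20000/50000/100000 are the buckets added here.
var pendingTaskCountBuckets = []float64{
	1, 2, 5, 10, 20, 50, 100, 200, 500, 1000,
	2000, 5000, 10000, 20000, 50000, 100000,
}
```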

@@ -25,7 +25,7 @@
 package tasks

 const (
-	WeightedChannelDefaultSize = 100
+	WeightedChannelDefaultSize = 10000
@yycptt (Member, Author):

This can be regarded as the host-level task buffer for handling a burst of task submissions. The size should be at least larger than the priority weight. Setting it to a larger number means that during a burst, tasks are buffered in the scheduler and processed as soon as possible instead of waiting for a periodic reschedule (which happens 3-5s later), which shortens the end-to-end latency. (A sketch of the buffered weighted channel is below.)
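
To make the role of this constant concrete, here is a minimal sketch of a buffered weighted channel. The type and constructor names are illustrative; only `WeightedChannelDefaultSize` comes from the diff above.

```go
// A minimal sketch of the idea, not the server's actual scheduler code: each
// priority gets a buffered submission channel, and the buffer size acts as the
// host-level burst absorber so submissions don't have to wait for the periodic
// reschedule.
package tasks

const WeightedChannelDefaultSize = 10000

// Task is a stand-in for the executable task interface.
type Task interface{ Execute() error }

// WeightedChannel pairs a submission buffer with a scheduling weight.
type WeightedChannel struct {
	weight int
	ch     chan Task
}

func NewWeightedChannel(weight int) *WeightedChannel {
	return &WeightedChannel{
		weight: weight,
		// A large buffer lets a burst of submissions queue up immediately; with
		// a small buffer, overflow tasks fall back to the 3-5s rescheduler.
		ch: make(chan Task, WeightedChannelDefaultSize),
	}
}
```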

@@ -316,8 +316,8 @@ func NewConfig(dc *dynamicconfig.Collection, numberOfShards int32, isAdvancedVis
TimerTaskMaxRetryCount: dc.GetIntProperty(dynamicconfig.TimerTaskMaxRetryCount, 100),
TimerProcessorEnableSingleCursor: dc.GetBoolProperty(dynamicconfig.TimerProcessorEnableSingleCursor, false),
TimerProcessorEnablePriorityTaskScheduler: dc.GetBoolProperty(dynamicconfig.TimerProcessorEnablePriorityTaskScheduler, false),
TimerProcessorSchedulerWorkerCount: dc.GetIntProperty(dynamicconfig.TimerProcessorSchedulerWorkerCount, 200),
TimerProcessorSchedulerQueueSize: dc.GetIntProperty(dynamicconfig.TimerProcessorSchedulerQueueSize, 10000),
@yycptt (Member, Author):

I originally thought this was the queue size for the weighted channels, but it's actually the size of the single channel that the worker pool polls. (The task scheduler basically moves tasks from the weighted channels into that single channel according to their weights.)

If this number is high, the task scheduler simply moves tasks into that single channel in the order they are received, and we lose the effect of the weighted round-robin scheduler.

I will probably remove this config in the future; I don't think there's a need to tune this number. (A sketch of the weighted round-robin hand-off is below.)
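
To illustrate why a large queue here defeats the weighting, here is a hedged sketch of a weighted round-robin hand-off. The names (`wrrScheduler`, `dispatch`) are illustrative, not the server's API, and the `Task`/`WeightedChannel` types from the earlier sketch are redeclared so the snippet stands alone.

```go
// A minimal sketch of the hand-off described above, not the actual scheduler:
// the dispatcher drains each weighted channel up to its weight per round and
// pushes into the single output channel the worker pool polls. If outputCh is
// very large, every submitted task is forwarded immediately in arrival order
// and the per-priority weighting has little effect.
package tasks

type Task interface{ Execute() error }

type WeightedChannel struct {
	weight int
	ch     chan Task
}

type wrrScheduler struct {
	channels []*WeightedChannel // one submission channel per priority
	outputCh chan Task          // the single channel polled by the worker pool
}

func (s *wrrScheduler) dispatch(stopCh <-chan struct{}) {
	for {
		select {
		case <-stopCh:
			return
		default:
		}
		for _, wc := range s.channels {
			// Forward at most `weight` tasks from this channel per round.
		drain:
			for i := 0; i < wc.weight; i++ {
				select {
				case task := <-wc.ch:
					s.outputCh <- task
				default:
					break drain // channel empty, move to the next priority
				}
			}
		}
		// A real implementation would block for new submissions here instead
		// of spinning when all channels are empty; omitted for brevity.
	}
}
```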

@yycptt yycptt merged commit c39cfb7 into temporalio:master May 24, 2022
@yycptt yycptt deleted the misc-priority-task-improvments branch May 24, 2022 22:26
Sushisource pushed a commit to Sushisource/temporal that referenced this pull request Jun 7, 2022