Batch Indexing short circuits #11913
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
done close
This issue is no longer marked as stale.
This pull request has been marked as stale due to 60 days of inactivity.
This pull request/issue has been closed due to lack of activity. If you think that |
Description
Add configurations to the index_hadoop and index task type tuning configs that allow certain ingestion jobs to short circuit early if they are determined to breach the thresholds added by this PR. Both circuit breakers are turned off by default.
short circuit 1: maxSegmentsIngested - short circuits the ingestion job if it is determined that the job will generate a number of segments greater than what is specified in the tuningConfig
short circuit 2: maxIntervalsIngested - short circuits the ingestion job if it is determined that the job will generate a number of segment intervals greater than what is specified in the tuningConfig
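The threshold check described by the two short circuits can be sketched roughly as follows. This is only an illustration, not the code in this PR: the class ShortCircuitSketch, its methods, and the use of a null Integer to mean "turned off" are all assumptions made for the example; only the names maxSegmentsIngested and maxIntervalsIngested come from the description.

```java
// Hypothetical sketch of a "circuit breaker" threshold check.
// Nothing here is taken from the Druid code base; the class and method
// names are invented for illustration.
final class ShortCircuitSketch {

    private ShortCircuitSketch() {
    }

    // Returns true when the projected count is strictly greater than the
    // configured maximum. A null maximum is ASSUMED to mean the breaker is
    // turned off, mirroring the "off by default" behavior described above.
    static boolean breaches(Integer configuredMax, int projectedCount) {
        return configuredMax != null && projectedCount > configuredMax;
    }

    // Fails fast with a descriptive message if the threshold is breached.
    static void check(String configName, Integer configuredMax, int projectedCount) {
        if (breaches(configuredMax, projectedCount)) {
            throw new IllegalStateException(
                "Short circuiting ingestion: projected count " + projectedCount
                + " exceeds " + configName + " = " + configuredMax);
        }
    }

    public static void main(String[] args) {
        // Breaker disabled: any projected count passes.
        check("maxSegmentsIngested", null, 5000);

        // Breaker enabled and not breached: exactly at the limit passes.
        check("maxIntervalsIngested", 30, 30);

        // Breaker enabled and breached: the job would stop here.
        try {
            check("maxSegmentsIngested", 100, 101);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The comparison is strict because the description says the job short circuits when the count is greater than the configured value, so a job that lands exactly on the limit is allowed to proceed.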
These short circuits only apply in certain scenarios:
why not have both circuit breakers in effect for all batch job types?
index task type, because the partitions are generated dynamically when the segments are being generated.
index_hadoop
Why is this useful?
Key changed/added classes in this PR
JobHelper
IndexTask
HashPartitionAnalysis
HadoopTuningConfig
HadoopDruidDetermineConfigurationJob
This PR has: