Tuning #430
app/com/linkedin/drelephant/tuning/obt/TuningTypeManagerOBTAlgoIPSO.java
app/com/linkedin/drelephant/tuning/engine/MRExecutionEngine.java
app/com/linkedin/drelephant/tuning/AbstractTuningTypeManager.java
app/com/linkedin/drelephant/tuning/engine/MRExecutionEngine.java
app/com/linkedin/drelephant/tuning/obt/TuningTypeManagerOBT.java
app/com/linkedin/drelephant/tuning/obt/TuningTypeManagerOBTAlgoIPSO.java
app/com/linkedin/drelephant/tuning/obt/TuningTypeManagerOBTAlgoIPSO.java
app/com/linkedin/drelephant/tuning/obt/TuningTypeManagerOBTAlgoIPSO.java
app/com/linkedin/drelephant/tuning/Schduler/AzkabanJobStatusManager.java
I have done the initial review. One general comment: we need to look into logging in detail and decide on an overall logging strategy. For each manager and each execution, we need to decide what should be logged at info and what at debug. I know most of these logging statements were already there, but as the number of calls to the tuning APIs increases because of the change in default behavior, logging will become much more important from a performance and debugging perspective.
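As a rough illustration of the info/debug split raised in this comment, here is a minimal sketch. It assumes one summary line per manager run at info and per-execution detail at debug. The class and method names are hypothetical, and it uses java.util.logging (where FINE plays the role of debug) only to stay self-contained; the project's own logger would be used in practice.

```java
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not code from this PR.
final class TuningLoggingSketch {

  // Processes a batch of executions and returns how many were handled.
  static int processExecutions(Logger logger, List<String> executionIds) {
    int processed = 0;
    for (String id : executionIds) {
      // Per-execution detail goes to debug (FINE). The guard skips building
      // the message string when debug logging is disabled.
      if (logger.isLoggable(Level.FINE)) {
        logger.fine("Processing execution " + id);
      }
      processed++;
    }
    // One summary line per run at info keeps log volume flat as call counts grow.
    logger.info("Processed " + processed + " executions");
    return processed;
  }
}
```

With the logger at info, a run over three executions emits a single record; at debug it emits one record per execution plus the summary.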
app/com/linkedin/drelephant/mapreduce/heuristics/ConfigurationHeuristic.java
app/com/linkedin/drelephant/tuning/Schduler/AzkabanJobStatusManager.java
app/com/linkedin/drelephant/tuning/engine/SparkHBTParamRecommender.java
app/com/linkedin/drelephant/tuning/engine/SparkHBTParamRecommender.java
(long) (suggestedDriverMemory + (roundedContainerSize - suggestedDriverMemory - memoryOverhead - FileUtils.ONE_MB) * 0.9);

return driverMemory;
}
Can we combine these two methods, getRoundedExecutorMemory and getRoundedDriverMemory?
I think it's fine as is. The two methods use different parameters. The logic is similar at present, but it may diverge going forward.
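For reference, a middle ground between the two positions could look like the sketch below: a shared helper holds the common arithmetic, while separate driver and executor entry points remain so that either can diverge later. The formula follows the driver-memory expression quoted in the diff above. The class name, method names, and the assumption that the executor variant currently uses the same expression are hypothetical.

```java
// Hypothetical sketch, not code from this PR.
final class MemoryRoundingSketch {
  // Same value as FileUtils.ONE_MB in Apache Commons IO (1024 * 1024 bytes).
  static final long ONE_MB = 1024L * 1024L;
  // Share of the leftover container headroom that is added back (0.9 in the quoted diff).
  static final double HEADROOM_FRACTION = 0.9;

  // Shared helper: suggested memory plus 90% of the headroom that remains in
  // the rounded container after the overhead and a 1 MB margin are subtracted.
  static long roundedMemory(long suggestedMemory, long roundedContainerSize, long memoryOverhead) {
    long headroom = roundedContainerSize - suggestedMemory - memoryOverhead - ONE_MB;
    return (long) (suggestedMemory + headroom * HEADROOM_FRACTION);
  }

  // Separate entry points stay in place, so either one can change independently.
  static long roundedDriverMemory(long suggestedDriverMemory, long roundedContainerSize,
      long memoryOverhead) {
    return roundedMemory(suggestedDriverMemory, roundedContainerSize, memoryOverhead);
  }

  static long roundedExecutorMemory(long suggestedExecutorMemory, long roundedContainerSize,
      long memoryOverhead) {
    return roundedMemory(suggestedExecutorMemory, roundedContainerSize, memoryOverhead);
  }
}
```

If the executor logic later needs its own rule, only roundedExecutorMemory would change and the driver path would be unaffected.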
* threshold of max executor core then good, otherwise reduce the core and check memory
* requirement.
*/
private void suggestExecutorMemoryCore() {
I think, for the next version, we should also consider using the number of partitions/tasks as a parameter when computing the final executor memory.
Definitely.
app/com/linkedin/drelephant/tuning/AbstractBaselineManager.java
app/com/linkedin/drelephant/tuning/AbstractBaselineManager.java
app/com/linkedin/drelephant/tuning/AbstractBaselineManager.java
app/com/linkedin/drelephant/tuning/AbstractJobStatusManager.java
…disable tuning for HBT
…rk for HBT to OBT toggle : Added unit test cases : Enabled debug log for debugging
…anagerOBT, minor changes in naming variables/methods
LGTM. Pending changes can be taken care of in the next PR.
Squashed commit of the following:
* Made fitness calculation, tuning type specific
* Changed Boolean to boolean
* Removing spark heuristics specific changes to separate commit
* Removing spark heuristics specific changes to separate commit
* Handling scenario where no param generator implementation for algorithm type
* Documentation/Javadoc added
* Updated HBT parameter generation for Spark
* Changing log status to debug
* Code changes to handle multiple jobs in Spark
* Changing spark input configuration for version 1 api
* Adding Spark Job Specific Configuration Input Output
* Changed logic to have algorithm selection based on version
* Changed logging to debug from info, optimization and indentation changes
* Unit test cases for Paramgenerator HBT. Documentation added for entity
* Unit test cases for Fitness Manager for HBT
* Added Unit Test cases, Changes specific to IPSO
* Renamed TuningType to ParamGenerator, Generalized ParameterGenerateManagerOBT, minor changes in naming variables/methods
* New Spark HBT algorithm for specific suggestion
* Changed to get tuning type from DB instead of Azkaban : Changed to work for HBT to OBT toggle : Added unit test cases : Enabled debug log for debugging
* Added Documentation ; Added is_suggested_param_set ; Added logic for disable tuning for HBT
* Ipso configuration changes for Unified Architecture
* unified architecture changes
What's the code coverage for the newly added code? As this is not master, Travis CI won't run for this PR. We could run the quality tools locally and share the results.
app/com/linkedin/drelephant/tuning/engine/SparkConfigurationConstants.java
Squashed commit of the following:
* Made fitness calculation, tuning type specific
* Changed Boolean to boolean
* Removing spark heuristics specific changes to separate commit
* Removing spark heuristics specific changes to separate commit
* Handling scenario where no param generator implementation for algorithm type
* Documentation/Javadoc added
* Updated HBT parameter generation for Spark
* Changing log status to debug
* Code changes to handle multiple jobs in Spark
* Changing spark input configuration for version 1 api
* Adding Spark Job Specific Configuration Input Output
* Changed logic to have algorithm selection based on version
* Changed logging to debug from info, optimization and indentation changes
* Unit test cases for Paramgenerator HBT. Documentation added for entity
* Unit test cases for Fitness Manager for HBT
* Added Unit Test cases, Changes specific to IPSO
* Renamed TuningType to ParamGenerator, Generalized ParameterGenerateManagerOBT, minor changes in naming variables/methods
* New Spark HBT algorithm for specific suggestion
* Changed to get tuning type from DB instead of Azkaban : Changed to work for HBT to OBT toggle : Added unit test cases : Enabled debug log for debugging
* Added Documentation ; Added is_suggested_param_set ; Added logic for disable tuning for HBT
* Ipso configuration changes for Unified Architecture
* unified architecture changes
(cherry picked from commit 6241016)
This pull request has been cherry-picked and merged into customSHS.
How do I configure the Azkaban scheduler in Dr. Elephant, or configure the default scheduler?
@mkumar1984
How do I configure the scheduler in Dr. Elephant, or the default one? @mkumar1984 @pralabhkumar
@fusonghe you can have a look at sample
How do I configure Compare and Flow History in Dr. Elephant?
@fusonghe can you create an issue for this? This PR is not the right place to discuss it.
This pull request has been cherry-picked and merged into customSHS. Hence closing this. eeec95d
What is the code coverage for the newly added code? As Travis CI won't run for this branch, we can probably run the quality tools locally and share the results.
Resolving this as coverage was added later after integration of code coverage in Travis CI.
*/
public abstract class AbstractBaselineManager implements Manager {
  protected final String BASELINE_EXECUTION_COUNT = "baseline.execution.count";
  protected Integer NUM_JOBS_FOR_BASELINE_DEFAULT = 30;
Use int instead of Integer
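A minimal sketch of what this suggestion could look like is shown below. The configuration key and the default of 30 come from the quoted diff. The class name, the `Map<String, String>` stand-in for the real configuration object, and the fallback rules are assumptions made for illustration. A primitive `int` cannot be null and compares by value, whereas a boxed `Integer` compared with `==` compares references.

```java
import java.util.Map;

// Hypothetical sketch, not code from this PR.
final class BaselineDefaultsSketch {
  static final String BASELINE_EXECUTION_COUNT = "baseline.execution.count";
  // Primitive int: cannot be null and compares by value with ==.
  static final int NUM_JOBS_FOR_BASELINE_DEFAULT = 30;

  // Reads the configured count, falling back to the default when the key is
  // missing, unparsable, or negative. The Map stands in for the real configuration.
  static int numJobsForBaseline(Map<String, String> conf) {
    String raw = conf.get(BASELINE_EXECUTION_COUNT);
    if (raw == null) {
      return NUM_JOBS_FOR_BASELINE_DEFAULT;
    }
    try {
      int parsed = Integer.parseInt(raw.trim());
      return parsed >= 0 ? parsed : NUM_JOBS_FOR_BASELINE_DEFAULT;
    } catch (NumberFormatException e) {
      return NUM_JOBS_FOR_BASELINE_DEFAULT;
    }
  }
}
```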
@@ -0,0 +1,174 @@
package com.linkedin.drelephant.tuning;
Include the file header. This comment applies to the other files as well.
ignoreExecutionWaitInterval =
    Utils.getNonNegativeLong(configuration, IGNORE_EXECUTION_WAIT_INTERVAL, 2 * 60 * AutoTuner.ONE_MIN);

// #executions after which tuning will stop even if parameters don't converge
On what basis were 39 and 18 chosen? Also, add constants for them.
This pull request implements the Unified Architecture, an effort to combine heuristic-based tuning (HBT) and optimization-based tuning (OBT). It also refactors TuneIn so that it is easily extensible.
For further details, please see the design doc below.
Unified Architecture: https://docs.google.com/document/d/1U7s1NDYujsG5cnvX39KrB0vrtVJUtIBx524BAyqGFcw/edit
Integration Test cases
https://docs.google.com/spreadsheets/d/1Z4YRtMkhPcSil3JU68tjMpyrNAjTM1B-b8umQ6Ev_y4/edit#gid=0
Unit test cases have been written to cover the new functionality.