
Tuning #430

Closed · wants to merge 22 commits

Conversation

pralabhkumar (Contributor) commented Aug 31, 2018

This pull request implements the Unified Architecture. It is basically an effort to combine heuristic-based tuning and optimization-based tuning. It also refactors TuneIn so that it is easily extensible. For further details, please see the design doc below:

Unified Architecture: https://docs.google.com/document/d/1U7s1NDYujsG5cnvX39KrB0vrtVJUtIBx524BAyqGFcw/edit

(class diagram image)

Integration test cases:
https://docs.google.com/spreadsheets/d/1Z4YRtMkhPcSil3JU68tjMpyrNAjTM1B-b8umQ6Ev_y4/edit#gid=0

Unit test cases have been written to cover the new functionality.

mkumar1984 (Contributor) commented Oct 3, 2018

I have done an initial review. One general comment: we need to look at logging in detail and think about an overall logging strategy. For each manager and each execution, we need to decide what should be logged at info level and what at debug level. I know most of these logging statements were already there, but since the number of calls to the tuning APIs is increasing because of the change in default behavior, logging will become much more important from a performance and debugging perspective.
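As an illustration of the kind of strategy meant here, a minimal sketch (the class name and messages are hypothetical, not from this PR): per-execution detail goes to debug behind a guard, while manager-level milestones go to info.

    import org.apache.log4j.Logger;

    public class SomeTuningManager {
      private static final Logger logger = Logger.getLogger(SomeTuningManager.class);

      public void execute() {
        // Per-execution detail stays at debug, guarded so the message string
        // is never built when debug logging is off.
        if (logger.isDebugEnabled()) {
          logger.debug("Generating parameters for job execution id=" + 42);
        }
        // Manager-level milestones stay at info.
        logger.info("Parameter generation pass finished");
      }
    }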

    long driverMemory =
        (long) (suggestedDriverMemory + (roundedContainerSize - suggestedDriverMemory - memoryOverhead - FileUtils.ONE_MB) * 0.9);

    return driverMemory;
  }
pralabhkumar (Contributor, Author) commented:

Can we combine these two methods, getRoundedExecutorMemory and getRoundedDriverMemory?

Contributor replied:

I think that's fine as is. The two methods take different parameters; the logic is currently similar, but it may diverge going forward.
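For illustration only, a minimal sketch of what a combined helper could look like, based solely on the fragment quoted above; the method name, parameter list, and the shared 0.9 headroom factor are assumptions:

    import org.apache.commons.io.FileUtils;

    // Hypothetical merge of getRoundedExecutorMemory and getRoundedDriverMemory:
    // each caller passes its own suggested memory and rounded container size,
    // and the headroom calculation from the quoted fragment is shared.
    final class MemoryRounding {
      static long getRoundedMemory(long suggestedMemory, long roundedContainerSize, long memoryOverhead) {
        return (long) (suggestedMemory
            + (roundedContainerSize - suggestedMemory - memoryOverhead - FileUtils.ONE_MB) * 0.9);
      }
    }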

* threshold of max executor core then good, otherwise reduce the core and check memory
* requirement.
*/
private void suggestExecutorMemoryCore() {
pralabhkumar (Contributor, Author) commented:

I think, for the next version, we should also consider adding the number of partitions/tasks as parameters when coming up with the final executor memory.

Contributor replied:

Definitely.
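A rough sketch of the loop that javadoc describes, for readers following along; every name and number below is hypothetical, and the real implementation is in the PR diff:

    // Start from the maximum executor core count; if the memory required at
    // that count fits within the container threshold, keep it, otherwise drop
    // a core and re-check the memory requirement.
    final class ExecutorSuggestionSketch {
      private static final long CONTAINER_LIMIT_MB = 8 * 1024;  // assumed container cap
      private static final long MEMORY_PER_CORE_MB = 2 * 1024;  // assumed per-core need

      static int suggestExecutorCores(int maxExecutorCores) {
        for (int cores = maxExecutorCores; cores > 1; cores--) {
          if (cores * MEMORY_PER_CORE_MB <= CONTAINER_LIMIT_MB) {
            return cores;
          }
        }
        return 1;
      }
    }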

mkumar1984 (Contributor) commented:

LGTM. Pending changes can be taken care of in the next PR.

varunsaxena pushed a commit that referenced this pull request Oct 17, 2018
Squashed commit of the following:

    * Made fitness calculation, tuning type specific
    * Changed Boolean to boolean
    * Removing spark heuristics specific changes to separate commit
    * Removing spark heuristics specific changes to separate commit
    * Handling scenario where no param generator implementation exists for the algorithm type
    * Documentation/Javadoc added
    * Updated HBT parameter generation for Spark
    * Changing log status to debug
    * Code changes to handle multiple jobs in Spark
    * Changing spark input configuration for version 1 api
    * Adding Spark Job Specific Configuration Input Output
    * Changed logic to have algorithm selection based on version
    * Changed logging from info to debug, optimization and indentation changes
    * Unit test cases for ParamGenerator HBT. Documentation added for entity
    * Unit test cases for Fitness Manager for HBT
    * Added unit test cases, changes specific to IPSO
    * Renamed TuningType to ParamGenerator, generalized ParameterGenerateManagerOBT, minor changes in naming variables/methods
    * New Spark HBT algorithm for specific suggestion
    * Changed to get tuning type from DB instead of Azkaban; changed to work for HBT to OBT toggle; added unit test cases; enabled debug log for debugging
    * Added documentation; added is_suggested_param_set; added logic to disable tuning for HBT
    * IPSO configuration changes for Unified Architecture
    * Unified Architecture changes
varunsaxena (Contributor) left a comment:

What's the code coverage for the newly added code? As this is not master, Travis CI won't run for this PR. Probably run the quality tools locally and share the results.

varunsaxena pushed a commit that referenced this pull request Oct 29, 2018
Squashed commit of the following: (identical to the commit list in the Oct 17 commit above)

(cherry picked from commit 6241016)
pralabhkumar (Contributor, Author) commented Feb 19, 2019

This pull request has been cherry-picked and merged to customSHS.

fusonghe commented May 7, 2019

Dr-elephant: how do I configure the Azkaban scheduler, or configure the default?

fusonghe commented May 7, 2019

@mkumar1984 Dr-elephant: how do I configure the Azkaban scheduler, or configure the default?

fusonghe commented May 9, 2019

(screenshot attached)

Dr-elephant: how do I configure the scheduler or the default? @mkumar1984 @pralabhkumar

ShubhamGupta29 (Contributor) commented:

@fusonghe you can have a look at the sample app-conf/SchedulerConf.xml. Just in case, here is a sample scheduler configuration as it would appear in SchedulerConf.xml:

    <scheduler>
      <name>azkaban</name>
      <classname>com.linkedin.drelephant.schedulers.AzkabanScheduler</classname>
      <params>
        <exception_enabled>true</exception_enabled>
        <workflow_client>com.linkedin.drelephant.clients.azkaban.AzkabanWorkflowClient</workflow_client>
        <username>sample_username</username>
        <password>sample_password</password>
      </params>
    </scheduler>
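If I'm reading the sample file right, this <scheduler> block sits inside the root <schedulers> element of app-conf/SchedulerConf.xml, alongside any other scheduler entries.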

fusonghe commented:

(screenshot attached)

Dr-elephant: how do I configure Compare and Flow History? The configured Azkaban scheduler is also not displayed. @mkumar1984 @ShubhamGupta29

ShubhamGupta29 (Contributor) commented:

@fusonghe can you create an Issue for this? This PR is not the right place to discuss it.

fusonghe commented:

(screenshot attached)
@pralabhkumar

pralabhkumar (Contributor, Author) commented Jul 8, 2019

This pull request has been cherry-picked and merged to customSHS (commit eeec95d). Hence closing this.

varunsaxena self-assigned this Jul 15, 2019
varunsaxena (Contributor) left a comment:

What is the code coverage for the newly added code? As Travis CI won't run for this branch, we can probably run the quality tools locally and share the results.

Resolving this, as coverage was added later, after code coverage was integrated into Travis CI.

 */
public abstract class AbstractBaselineManager implements Manager {
  protected final String BASELINE_EXECUTION_COUNT = "baseline.execution.count";
  protected Integer NUM_JOBS_FOR_BASELINE_DEFAULT = 30;
Contributor commented:

Use int instead of Integer
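That is, something like this sketch of the suggested change:

    // A boxed Integer invites accidental null unboxing; a primitive is enough here.
    protected int NUM_JOBS_FOR_BASELINE_DEFAULT = 30;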

@@ -0,0 +1,174 @@
package com.linkedin.drelephant.tuning;
Contributor commented:

Include a file header. This comment applies to other files as well.

ignoreExecutionWaitInterval =
Utils.getNonNegativeLong(configuration, IGNORE_EXECUTION_WAIT_INTERVAL, 2 * 60 * AutoTuner.ONE_MIN);

// #executions after which tuning will stop even if parameters don't converge
Contributor commented:

On what basis were 39 and 18 decided? Also add constants for them.
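For example, a sketch of that suggestion with hypothetical constant names (what 39 and 18 actually represent is in the diff, not quoted here):

    // Name the magic numbers so their basis is documented in one place.
    private static final int MAX_TUNING_EXECUTIONS = 39;
    private static final int MIN_TUNING_EXECUTIONS = 18;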
