Fixes MapReduce aggregator and heuristic to correctly handle task data when sampling is enabled #222

shkhrgpt · 2017-03-08T01:47:04Z

Also updates all the heuristic unit tests to take non-sampled tasks into account.

…a when sampling is enabled (linkedin#33) * Fixes MapReduce aggregator and heuristic to correctly handle task data when sampling is enabled. * Adds a comment

shkhrgpt · 2017-03-08T01:47:34Z

@akshayrai @shankar37

Please take a look.

shankar37 · 2017-03-14T05:32:32Z

app/com/linkedin/drelephant/mapreduce/TaskLevelAggregatedMetrics.java

@@ -106,6 +106,9 @@ private void compute(MapReduceTaskData[] taskDatas, long containerSize, long ide
    }

    for (MapReduceTaskData taskData: taskDatas) {
+      if (!taskData.isTimeAndCounterDataPresent()) {


Isn't this always true? Can you elaborate on what exactly the issue is that you are trying to fix? I remember you mentioning something at the meetup, but I don't recollect exactly what the issue was.

No, it's not always true. If sampling is enabled, it is only true for sampled tasks. In the MapReduceFetcherHadoop2 class, a MapReduceTaskData instance is created for every task, but time and counter data are stored only for the sampled ones. Please look at the logic here.
Without this check, the unit test in this change fails with a NullPointerException.
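The guard described above can be sketched as follows. This is a minimal illustration, not the code in this change: `SampledTask` and `SampledAggregator` are hypothetical stand-ins for MapReduceTaskData and the aggregator, and the `continue` is an assumption about what the new `if` does, since the diff hunk shows only its first line.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a task record; not Dr. Elephant's MapReduceTaskData. */
class SampledTask {
  // null when the task was not sampled, so no time data was fetched for it
  private final Long runtimeMs;

  SampledTask(Long runtimeMs) {
    this.runtimeMs = runtimeMs;
  }

  /** Same idea as isTimeAndCounterDataPresent(): true only for sampled tasks. */
  boolean isTimeAndCounterDataPresent() {
    return runtimeMs != null;
  }

  long getRuntimeMs() {
    // Unboxing throws NullPointerException if called on a non-sampled task.
    return runtimeMs;
  }
}

/** Hypothetical aggregator that tolerates tasks without time and counter data. */
class SampledAggregator {
  /** Sums runtime over sampled tasks only, so the total undercounts when sampling is on. */
  static long totalRuntimeMs(List<SampledTask> tasks) {
    long total = 0;
    for (SampledTask task : tasks) {
      if (!task.isTimeAndCounterDataPresent()) {
        continue; // assumed behavior: skip tasks that were not sampled
      }
      total += task.getRuntimeMs();
    }
    return total;
  }
}
```

With the guard in place, a list that mixes sampled and non-sampled tasks aggregates without an exception, but the result covers the sampled tasks only.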

shankar37 · 2017-03-14T05:48:25Z

app/com/linkedin/drelephant/mapreduce/heuristics/JobQueueLimitHeuristic.java

 import java.util.Properties;
 import java.util.concurrent.TimeUnit;

 import com.linkedin.drelephant.analysis.Heuristic;
 import com.linkedin.drelephant.analysis.HeuristicResult;
 import com.linkedin.drelephant.analysis.Severity;

-
 public class JobQueueLimitHeuristic implements Heuristic<MapReduceApplicationData> {


@akshayrai Is this heuristic used at LinkedIn? I don't see it enabled.

No, we don't use this. As far as I remember, this heuristic was created to warn about jobs that were coming close to the default queue timeout limit of 15 minutes.
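As a rough illustration of that kind of warning, here is a minimal sketch. It is not the JobQueueLimitHeuristic implementation: the class name, the method, and the 90% warning threshold are assumptions made for the example. Only the 15-minute limit comes from the comment above.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of Dr. Elephant. */
class QueueLimitCheck {
  // The 15-minute limit is taken from the discussion above.
  static final long QUEUE_LIMIT_MS = TimeUnit.MINUTES.toMillis(15);

  // Assumed for illustration: warn once a task reaches 90% of the limit.
  static final long WARN_THRESHOLD_MS = QUEUE_LIMIT_MS * 9 / 10;

  /** Returns true when a task's runtime is at or beyond the warning threshold. */
  static boolean isCloseToLimit(long taskRuntimeMs) {
    return taskRuntimeMs >= WARN_THRESHOLD_MS;
  }
}
```

Under these assumptions, a 14-minute task would be flagged and a 5-minute task would not.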

shankar37 · 2017-03-14T06:02:13Z

app/com/linkedin/drelephant/mapreduce/TaskLevelAggregatedMetrics.java

@@ -106,6 +106,9 @@ private void compute(MapReduceTaskData[] taskDatas, long containerSize, long ide
    }

    for (MapReduceTaskData taskData: taskDatas) {
+      if (!taskData.isTimeAndCounterDataPresent()) {


Can you add a comment to the function saying that the aggregated metrics are not expected to be accurate when sampling is enabled, so that it's clear?

Thanks @shankar37 for the review.
Added a comment. PTAL.

akshayrai · 2017-03-28T11:42:51Z

+1

…a when sampling is enabled (linkedin#222)

Fixes MapReduce aggregator and heuristic to correctly handle task dat…

7fce04c

shankar37 reviewed Mar 14, 2017

shankar37 approved these changes Mar 14, 2017

Adds a comment

8395712

akshayrai merged commit 5a98701 into linkedin:master Mar 28, 2017

skakker pushed a commit to skakker/dr-elephant that referenced this pull request Dec 14, 2017

Fixes MapReduce aggregator and heuristic to correctly handle task dat…

fdfb643

…a when sampling is enabled (linkedin#222)
