Fix Rounding Bug in RatioBasedEstimator #1542

Merged: 8 commits into develop, Apr 4, 2016

Conversation

isnotinvain (Contributor)

Fixes the rounding bug mentioned in #1541, and adds a maximum for reducer estimation as well.

*/
override def estimateReducers(info: FlowStrategyInfo): Option[Int] =
def estimateExactReducers(info: FlowStrategyInfo): Option[Double] = {

isnotinvain (Contributor Author):
probably a bad name
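
For context, a minimal sketch of the shape this split takes, not the actual patch: in the real code both methods take a FlowStrategyInfo and pull sizes from the flow's history, so the byte counts below are stand-in parameters.

```scala
// Illustrative sketch only: keep the estimate as a Double, and only when
// producing the final reducer count round it up (never truncate to zero).
object RoundingSketch {
  def estimateExactReducers(reducerInputBytes: Double, bytesPerReducer: Double): Option[Double] =
    if (reducerInputBytes <= 0 || bytesPerReducer <= 0) None
    else Some(reducerInputBytes / bytesPerReducer)

  def estimateReducers(reducerInputBytes: Double, bytesPerReducer: Double): Option[Int] =
    estimateExactReducers(reducerInputBytes, bytesPerReducer)
      .map(exact => math.ceil(exact).toInt.max(1))
}
```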

@johnynek (Collaborator)

Can you have a test that shows this issue and that this fixes it?

@isnotinvain (Contributor Author)

@johnynek yes:

Needs tests, and some discussion about a hard cap as well

@isnotinvain (Contributor Author)

Running the new test on the develop branch yields:

[info] - should handle mapper output explosion over small data correctly *** FAILED ***
[info]   1000 did not equal 2 (RatioBasedEstimatorTest.scala:216)
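
With illustrative numbers (not the actual test fixtures), the intended behavior is that a small input whose mappers emit much more data still gets a small, non-zero reducer count once the ratio is kept as a Double and only the final count is rounded up; pasted into a Scala REPL:

```scala
// Illustrative numbers only, not the test's fixtures.
val mapperInBytes   = 10L * 1024 * 1024    // small input step
val mapperOutBytes  = 200L * 1024 * 1024   // mapper output "explosion"
val bytesPerReducer = 100L * 1024 * 1024

val ratio    = mapperOutBytes.toDouble / mapperInBytes                   // 20.0, kept as a Double
val reducers = math.ceil(mapperInBytes * ratio / bytesPerReducer).toInt  // ceil(2.0) = 2
```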

@isnotinvain (Contributor Author)

I'm adding a cap as well @gerashegalov

isnotinvain changed the title from "[WIP] Initial pass at Issue 1541" to "Fix Rounding Bug in RatioBasedEstimator" on Apr 1, 2016
val maxEstimatedReducersKey = "scalding.reducer.estimator.max.estimated.reducers"

// TODO: what's a reasonable default? Int.maxValue? 5k? 100k?
val defaultMaxEstimatedReducers = 100 * 1000

Collaborator:
100K seems way too many, no? Pig's max is 999.


isnotinvain (Contributor Author):
Yeah I have no idea. I guess I can spot check some jobs in our hadoop cluster.
But if you can have 30k mappers, why not 30k reducers? And if 30k is somewhat normal, then 100k isn't that far off in terms of being "way too much".

I don't know whether Pig's max of 999 was chosen with much thought, or what size of cluster it was chosen for.


Collaborator:
One reason I can think of: too many reducers means too many files.

Yeah, I don't know the historical reason for the 999-reducer limit. I think Hive has the same limit too.
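
Whichever default is chosen, the cap itself is read from the configuration key in the diff above. A sketch of how that read might look, assuming the step strategy's Hadoop JobConf is in scope (the object and helper names are illustrative):

```scala
import org.apache.hadoop.mapred.JobConf

object MaxReducersSketch {
  // Key and default as in the diff above.
  val maxEstimatedReducersKey = "scalding.reducer.estimator.max.estimated.reducers"
  val defaultMaxEstimatedReducers = 100 * 1000

  // Read the configured cap, falling back to the default.
  def configuredMaxReducers(conf: JobConf): Int =
    conf.getInt(maxEstimatedReducersKey, defaultMaxEstimatedReducers)
}
```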

@isnotinvain (Contributor Author)

I can make the max non-fatal, though I'm not entirely convinced.

Having the max act as a cap means that it's another tuning dimension that users need to worry about, that might be causing them to have an unexpected number of reducers (if the cap is set too low).

On the other hand, if we set the cap relatively high, at a value where we think "anything that estimates needing this is clearly broken", and treat exceeding it as exceptional, this becomes a faster feedback mechanism for users. I actually think that in many cases fail-fast is a much better user experience than "just try to make it work and hope". The grey area is whether this is one of those cases. If the estimator estimates needing a ton of reducers, either they really are needed, in which case capping probably won't go well, or the estimator is broken.

That's my pitch, but I don't feel too strongly about it if everyone else wants the cap to not be exceptional.

@johnynek (Collaborator)

johnynek commented Apr 1, 2016

Thinking as a user, I would not want scalding (at least by default) to fail to run if it uses too many reducers.

If I set N as the max, and it runs slow, I should have alerts that I notice, and I can rewrite my job or turn up the max. If it runs too fast because of a bug (and I waste resources) I should have a system to watch that too, and if not, set a lower max so the waste is not that bad.

I have a hard time imagining making the default to fail the job.

I can see adding an option to fail if we exceed the max (since we expect a bug in that case, or a totally giant input), and some users may want it, but I would err on the side of running the job and not changing default behavior.

@isnotinvain (Contributor Author)

I'll update to make it non-fatal. I will also add a property to the hadoop config so that tools that monitor hadoop config properties can warn users that the max has been applied.
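
A sketch of what that could look like; the marker property name below is hypothetical, not necessarily the one the PR adds.

```scala
import org.apache.hadoop.mapred.JobConf

object CapMarkerSketch {
  // Hypothetical key, for illustration only.
  val capAppliedKey = "scalding.reducer.estimator.max.applied"

  // If the estimate exceeds the cap, note that in the JobConf (so external
  // monitoring tools can surface it) and return the capped value.
  def applyCap(conf: JobConf, estimated: Int, configuredMax: Int): Int =
    if (estimated > configuredMax) {
      conf.setBoolean(capAppliedKey, true)
      configuredMax
    } else {
      estimated
    }
}
```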

@isnotinvain (Contributor Author)

OK, this should be good to go. LMK if 5k is not a good default for the cap.

@@ -179,14 +195,27 @@ object ReducerEstimatorStepStrategy extends FlowStepStrategy[JobConf] {
val info = FlowStrategyInfo(flow, preds.asScala, step)

// if still None, make it '-1' to make it simpler to log

Collaborator:
Is this comment still valid? Maybe move it down to just before you log it?


isnotinvain (Contributor Author):
I guess this could be moved to below where it becomes -1.

@sriramkrishnan (Collaborator)

LGTM - may want @rubanm to take a look at it too.

""".stripMargin)
}

n.min(configuredMax)

Contributor:
This could go in an if-else block above?
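
A sketch of that suggestion, with the warning threaded in as a parameter so the snippet stands alone; the message wording is illustrative.

```scala
object CapIfElseSketch {
  // Sketch: fold the cap into an if-else instead of a trailing n.min(configuredMax),
  // so warning about the cap and applying it happen in one place.
  def capEstimate(n: Int, configuredMax: Int, warn: String => Unit): Int =
    if (n > configuredMax) {
      warn(s"Estimated reducers ($n) exceed the configured maximum ($configuredMax); using $configuredMax.")
      configuredMax
    } else {
      n
    }
}
```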

@rubanm (Contributor)

rubanm commented Apr 4, 2016

One minor comment, LGTM!

@isnotinvain (Contributor Author)

I will merge once the tests run

isnotinvain merged commit 315f9c0 into develop on Apr 4, 2016.