
Conversation

@uncleGen
Contributor

uncleGen commented Feb 7, 2017

What changes were proposed in this pull request?

First, there is no need to set 'spark.master' multiple times with different values. Second, users can set a 'spark.master' in code that differs from the one given to the spark-submit command, which is confusing. So, we should check whether 'spark.master' already exists in the settings and whether the previous value matches the current one, and throw an IllegalArgumentException when the previous value differs from the current value.
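
A minimal sketch of the proposed check, simplified from the patch (previousOne is the value already recorded for 'spark.master', if any):

if (key == "spark.master") {
  val previousOne = getOption(key)
  if (previousOne.isDefined && !previousOne.get.equals(value)) {
    throw new IllegalArgumentException(s"'spark.master' should not be set with different " +
      s"values, previous value is ${previousOne.get} and current value is $value.")
  }
}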

BTW, I have seen many cases where users forgot to remove "local[*]" from their code and then submitted the job on the command line. Many new users do not think to check whether the job actually runs in standalone or YARN mode, so the job keeps running on the master node or gateway node the whole time.

How was this patch tested?

Added a new unit test.

@uncleGen
Contributor Author

uncleGen commented Feb 7, 2017

retest this please.

@SparkQA

SparkQA commented Feb 7, 2017

Test build #72483 has finished for PR 16827 at commit 0cc5997.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 7, 2017

Test build #72485 has finished for PR 16827 at commit b50a0bd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 7, 2017

Test build #72486 has finished for PR 16827 at commit 8198db3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@uncleGen
Contributor Author

uncleGen commented Feb 7, 2017

Working on UT failure.

val commandLineArgs = Array(
  "--deploy-mode", "cluster",
  "--master", masterUrl,
  "--name", mainClass,

Contributor Author

I think this setting is unnecessary; it will be overwritten on the REST server side with the right value.

val conf = new SparkConf(false).setAppName("My app")
sc = new SparkContext("local[2]", "My other app", conf)
assert(sc.master === "local[2]")
assert(sc.appName === "My other app")

Contributor Author

Removed it, as the value can no longer be overridden.

assert(Utils.isDynamicAllocationEnabled(
  conf.set("spark.executor.instances", "0")) === true)
conf.remove("spark.master")
assert(Utils.isDynamicAllocationEnabled(conf.set("spark.master", "local")) === false)

Contributor Author

This is a workaround for the unit test: 'spark.master' must be removed before it can be set to a different value, otherwise the new check makes the test fail.

newReloadConf.getOption(prop).foreach { value =>
  // (trick) avoid SPARK-19482.
  newSparkConf.remove(prop)
  newSparkConf.set(prop, value)

Contributor Author

Here we should remove the properties that are about to be reloaded; otherwise re-setting some special properties, like 'spark.master', would fail the new check.

}
if (previousOne.isDefined && !previousOne.get.equals(value)) {
  throw new IllegalArgumentException(s"'spark.master' should not be set with different " +
    s"values, previous value is ${previousOne.get} and current value is $value.")

Contributor Author

Here we choose to throw an IllegalArgumentException to fail the job. Alternatively, we could log a warning in a gentler way; see the sketch below.
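
A hypothetical sketch of that gentler variant, assuming the same key/value/previousOne names as above (logWarning is available because SparkConf extends Spark's Logging trait):

if (key == "spark.master") {
  val previousOne = getOption(key)
  if (previousOne.isDefined && !previousOne.get.equals(value)) {
    // Hypothetical alternative: warn instead of throwing, and keep the new value.
    logWarning(s"'spark.master' is being changed from '${previousOne.get}' to '$value'; " +
      "make sure this is intended.")
  }
}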

@srowen
Member

srowen commented Feb 7, 2017

I don't think we want or need to special-case behavior of this one property. It's also a behavior change.

@SparkQA

SparkQA commented Feb 7, 2017

Test build #72494 has finished for PR 16827 at commit 8565f97.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@uncleGen
Contributor Author

uncleGen commented Feb 7, 2017

@srowen Well, this change may be rather radical. As I said, many new users do not think to check whether their job actually runs in standalone or YARN mode. Maybe they just forgot to remove "local[*]" from the code after debugging, or did not know the precedence among the 'spark.master' settings. This is not a rare case; I have received many reports like this. Granted, one could say only newcomers make this mistake, but it is still better to warn users in some way.

What about just logging a warning? But I don't think users would notice it.

Besides, I do not see any strong reason to set 'spark.master' multiple times with different values.

Any suggestion is appreciated.

@uncleGen
Contributor Author

@srowen Could you please take a second look?
cc @rxin as well

@srowen
Member

srowen commented Feb 10, 2017

I still don't see a reason to special-case this property. It also means you can't override a property set by defaults, which could be a problem.
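
For concreteness, a minimal sketch of the override scenario srowen describes, with hypothetical values (spark-submit copies spark-defaults.conf entries into system properties, which new SparkConf() picks up):

// spark-defaults.conf (hypothetical entry):
//   spark.master  spark://some-host:7077

// Application code that legitimately overrides the default:
val conf = new SparkConf()        // picks up spark.master from the defaults
  .setMaster("yarn")              // the proposed check would throw here
  .setAppName("OverrideExample")
val sc = new SparkContext(conf)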

@uncleGen
Contributor Author

@srowen Makes sense. What about logging a warning/error message when 'spark.master' is set with different values?

@srowen
Member

srowen commented Feb 10, 2017

I just don't think that setting properties to override defaults, which is the common reason this would occur, is a warning or error situation.

@uncleGen
Contributor Author

@srowen Makes sense. Closing it for now until there is a follow-up.

@uncleGen uncleGen closed this Feb 13, 2017