[SPARK-13808][test-maven] Don't build assembly in dev/run-tests #11701

Closed
wants to merge 7 commits into from

JoshRosen
Contributor

As of SPARK-9284 we should no longer need to build the full Spark assembly JAR in order to run tests. Therefore, we should remove the assembly step from dev/run-tests in order to reduce build + test time.

Most of the changes in this PR were originally part of #11178.

@JoshRosen
Contributor Author

/cc @vanzin, this patch removes the need to build an assembly before running tests.

@@ -323,7 +323,7 @@ def get_hadoop_profiles(hadoop_version):
 def build_spark_maven(hadoop_version):
     # Enable all of the profiles for the build:
     build_profiles = get_hadoop_profiles(hadoop_version) + modules.root.build_profile_flags
-    mvn_goals = ["clean", "package", "-DskipTests"]
+    mvn_goals = ["clean", "package", "-DskipTests", "-pl", "!assembly"]
Contributor

If you're looking at speeding up the build, building and testing in one shot might be a good thing to do, at least for Maven. Using -fae (--fail-at-end) would allow the build to go as far as it can when something fails.

No need to do that in this change, though.
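
A rough sketch of what building and testing in one shot could look like on the Maven path. The helper name `one_shot_maven_goals` and its arguments are hypothetical and not part of dev/run-tests; the only Maven options used are `-fae` (`--fail-at-end`) and `-pl` (`--projects`).

```python
def one_shot_maven_goals(build_profiles, exclude_assembly=True):
    """Return an mvn argument list that compiles and tests in one invocation.

    Hypothetical helper for illustration; dev/run-tests does not define it.
    """
    # "package" also runs the test phase, so there is no separate
    # -DskipTests build step before the tests.
    # -fae (--fail-at-end) lets Maven keep building modules that are not
    # affected by a failure and report the failure at the end.
    goals = ["clean", "package", "-fae"]
    if exclude_assembly:
        # Same module exclusion as in the diff above.
        goals += ["-pl", "!assembly"]
    return list(build_profiles) + goals


print(one_shot_maven_goals(["-Phive"]))
```

Whether this is actually faster depends on how much work the separate build step currently duplicates, so treat it as a starting point for an experiment rather than a recommendation.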

@vanzin
Contributor

vanzin commented Mar 14, 2016

LGTM if tests pass, although I have a comment about the pyspark tests workaround.

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53094 has finished for PR 11701 at commit 2c10193.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53120 has finished for PR 11701 at commit 1154eb4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53110 has finished for PR 11701 at commit 267aaf9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53133 has finished for PR 11701 at commit 1154eb4.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53148 has finished for PR 11701 at commit 1154eb4.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53158 has finished for PR 11701 at commit 1fd489c.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Mar 15, 2016

Test build #53209 has finished for PR 11701 at commit 1fd489c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@JoshRosen
Contributor Author

(Pretty sure that something is wrong here, but hoping that maybe it's just flaky; I'll dig in more when I have time.)

@SparkQA

SparkQA commented Mar 16, 2016

Test build #53285 has finished for PR 11701 at commit 1fd489c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen JoshRosen changed the title [SPARK-13808] Don't build assembly in dev/run-tests [SPARK-13808][test-maven] Don't build assembly in dev/run-tests Mar 16, 2016
@JoshRosen
Contributor Author

Jenkins, retest this please.

@JoshRosen
Contributor Author

(Going to quickly make sure this also works with Maven tests, just to sanity-check)

@SparkQA

SparkQA commented Mar 16, 2016

Test build #53332 has finished for PR 11701 at commit 1fd489c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Mar 16, 2016

retest this please

@SparkQA

SparkQA commented Mar 16, 2016

Test build #53333 has finished for PR 11701 at commit 1fd489c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Mar 16, 2016

16/03/16 11:43:23.870 SparkLauncherSuite-2 INFO OutputRedirector: Failed to find Spark jars directory (/home/jenkins/workspace/SparkPullRequestBuilder@2/assembly/target/scala-2.10).
16/03/16 11:43:23.870 SparkLauncherSuite-2 INFO OutputRedirector: You need to build Spark before running this program.

Looks like some adjustment is needed in my previous change?

@vanzin
Contributor

vanzin commented Mar 16, 2016

Perhaps change the check in bin/spark-class so that it is skipped when SPARK_TESTING is set:

if [ ! -d "$SPARK_JARS_DIR" ] && [ -z "$SPARK_TESTING" ] ; then
  echo "Failed to find Spark jars directory ($SPARK_JARS_DIR)." 1>&2
  echo "You need to build Spark before running this program." 1>&2
  exit 1
fi
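
To make the effect of the extra condition concrete, here is a small Python model of that check. It is only a sketch: `should_abort_launch` is a made-up name, and the real logic lives in the shell script.

```python
import os


def should_abort_launch(jars_dir, env):
    """Model of the proposed bin/spark-class check (hypothetical helper).

    Abort only when the jars directory is missing AND SPARK_TESTING is
    unset or empty, mirroring `[ ! -d ... ] && [ -z "$SPARK_TESTING" ]`.
    """
    jars_missing = not os.path.isdir(jars_dir)
    testing = bool(env.get("SPARK_TESTING"))
    return jars_missing and not testing


# A path that should not exist, used to stand in for a skipped assembly build.
missing = os.path.join(os.getcwd(), "no-such-dir-for-spark-jars")
print(should_abort_launch(missing, {}))
print(should_abort_launch(missing, {"SPARK_TESTING": "1"}))
```

With the assembly skipped, only the second call (SPARK_TESTING set) lets the launcher continue past the check.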

@SparkQA

SparkQA commented Mar 16, 2016

Test build #53336 has finished for PR 11701 at commit 20d2e2a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

fi

LAUNCH_CLASSPATH="$SPARK_JARS_DIR/*"

# Add the launcher build dir to the classpath if requested.
if [ -n "$SPARK_PREPEND_CLASSES" ]; then
  LAUNCH_CLASSPATH="${SPARK_HOME}/launcher/target/scala-$SPARK_SCALA_VERSION/classes:$LAUNCH_CLASSPATH"
Contributor

Somehow this is not being picked up (test is failing because the Main class is not found). Is this path correct for maven too?

@vanzin
Contributor

vanzin commented Mar 16, 2016

Hmmm. I think this is because of this code in load-spark-env.sh:

if [ -z "$SPARK_SCALA_VERSION" ]; then
  USER_SCALA_VERSION_SET=0
  ASSEMBLY_DIR2="${SPARK_HOME}/assembly/target/scala-2.11"
  ASSEMBLY_DIR1="${SPARK_HOME}/assembly/target/scala-2.10"

  if [[ -d "$ASSEMBLY_DIR2" && -d "$ASSEMBLY_DIR1" ]]; then
    echo -e "Presence of build for both scala versions(SCALA 2.10 and SCALA 2.11) detected." 1>&2
    echo -e 'Either clean one of them or, export SPARK_SCALA_VERSION=2.11 in spark-env.sh.' 1>&2
    exit 1
  fi

  if [ -d "$ASSEMBLY_DIR2" ]; then
    export SPARK_SCALA_VERSION="2.11"
  else
    export SPARK_SCALA_VERSION="2.10"
  fi
else
    USER_SCALA_VERSION_SET=1
fi

Since the assembly build is being skipped, neither assembly directory exists, so it's defaulting to Scala 2.10 and failing to find the classes.
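
The fallback can be reproduced outside the shell script. Below is a minimal Python model of the quoted logic, assuming only what the snippet shows; `detect_scala_version` is a hypothetical name used for illustration.

```python
import os
import tempfile


def detect_scala_version(spark_home, env):
    """Python model of the load-spark-env.sh fallback quoted above (illustrative only)."""
    # An explicitly set, non-empty SPARK_SCALA_VERSION always wins.
    if env.get("SPARK_SCALA_VERSION"):
        return env["SPARK_SCALA_VERSION"]
    dir_211 = os.path.join(spark_home, "assembly", "target", "scala-2.11")
    dir_210 = os.path.join(spark_home, "assembly", "target", "scala-2.10")
    if os.path.isdir(dir_211) and os.path.isdir(dir_210):
        raise RuntimeError("builds for both Scala 2.10 and 2.11 detected")
    # With no assembly directory at all, this falls through to 2.10,
    # regardless of which Scala version the other modules were built for.
    return "2.11" if os.path.isdir(dir_211) else "2.10"


with tempfile.TemporaryDirectory() as home:
    # No assembly build present: the default is chosen.
    print(detect_scala_version(home, {}))
    # Once the 2.11 assembly directory exists, it is detected.
    os.makedirs(os.path.join(home, "assembly", "target", "scala-2.11"))
    print(detect_scala_version(home, {}))
```

Exporting SPARK_SCALA_VERSION before the script runs sidesteps the directory probe entirely, which is why setting it from the build is one possible fix.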

@vanzin
Contributor

vanzin commented Mar 16, 2016

BTW, I'm just waiting for this patch before working on SPARK-13579; so it's probably ok to keep building the assembly in the maven build, because it will become a lot cheaper once I implement that change (basically just copy files around).

@JoshRosen
Contributor Author

Yeah, I literally just found the same issue with the defaults in load-spark-env.sh. I wonder whether it would make sense to just have Maven set that variable appropriately based on the profile.

@JoshRosen
Contributor Author

Since all of this code is going to be changed heavily / removed after your final patch, I'm going to go ahead and just leave the Maven test path unchanged so that we can get this merged.

original_working_dir = os.getcwd()
os.chdir(SPARK_HOME)
cp = subprocess_check_output(
    ["./build/sbt", "-Phive", "export assembly/managedClasspath"], universal_newlines=True)
Contributor Author

@vanzin, as part of your next patch will SPARK_DIST_CLASSPATH just point to the libs directory? If so, we can remove this as part of that patch.

Contributor

Potentially. I'll make a note to take a look at this.
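
For context, a sketch of how the exported classpath might be handed to the PySpark tests. That the classpath is the last non-empty line of the sbt output, and that it ends up in SPARK_DIST_CLASSPATH, are assumptions drawn from the question above rather than code shown in this thread; `classpath_from_sbt_output` is a hypothetical name.

```python
import os


def classpath_from_sbt_output(output):
    """Pick the classpath out of `sbt export` output (hypothetical helper).

    Assumes sbt prints log lines first and the classpath as the last
    non-empty line; this is an assumption, not verified behavior.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("sbt produced no output")
    return lines[-1]


# Made-up sample output; the log line and jar paths are placeholders.
sample = "[info] Loading project definition\n/tmp/a.jar:/tmp/b.jar\n"

# Build a child-process environment instead of mutating os.environ in place.
env = dict(os.environ, SPARK_DIST_CLASSPATH=classpath_from_sbt_output(sample))
print(env["SPARK_DIST_CLASSPATH"])
```

If SPARK_DIST_CLASSPATH later points straight at a libs directory, a helper like this and the sbt call that feeds it would no longer be needed.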

@JoshRosen JoshRosen changed the title [SPARK-13808][test-maven] Don't build assembly in dev/run-tests [SPARK-13808] Don't build assembly in dev/run-tests Mar 16, 2016
@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Mar 16, 2016

Test build #53345 has finished for PR 11701 at commit 0900b13.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 17, 2016

Test build #53364 has finished for PR 11701 at commit 0900b13.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Okay, so this passes all Scala tests but is failing PySpark Maven tests...

@vanzin
Contributor

vanzin commented Mar 17, 2016

Are the failures legitimate? When I looked before it seemed only one of the python interpreters failed, which generally screams "flaky test" to me...

@JoshRosen JoshRosen changed the title [SPARK-13808] Don't build assembly in dev/run-tests [SPARK-13808][test-maven] Don't build assembly in dev/run-tests Mar 17, 2016
@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Mar 17, 2016

Test build #53443 has finished for PR 11701 at commit 0900b13.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Mar 17, 2016

Test build #53453 has finished for PR 11701 at commit 0900b13.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

I'm going to close this in favor of @vanzin's #11796

@JoshRosen JoshRosen closed this Mar 22, 2016
vanzin pushed a commit to vanzin/spark that referenced this pull request Mar 25, 2016
vanzin pushed a commit to vanzin/spark that referenced this pull request Mar 25, 2016
@JoshRosen JoshRosen deleted the remove-assembly-in-tests branch August 29, 2016 19:20