[SPARK-9284] [tests] Allow all tests to run without an assembly. #7629

Closed
wants to merge 14 commits into from


vanzin
Contributor

@vanzin vanzin commented Jul 23, 2015

This change aims to speed up the dev cycle a little by making sure that
all tests behave the same with respect to where the code under test is
loaded from. That is, tests no longer rely on the assembly; they load
all needed classes from the build directories instead.

The main change is to make sure all build directories (classes and test-classes)
are added to the classpath of child processes when running tests.
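
As a rough illustration of the idea (not the code in this patch), the sketch below collects the classes and test-classes directories of a few modules and joins them into a classpath string that a child JVM could be started with. The class name TestClasspathSketch, the module names, and the Maven-style target/ layout are all assumptions made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TestClasspathSketch {

    // Returns <home>/<module>/target/classes and .../target/test-classes for each module.
    // Assumes a Maven-style layout; an sbt build would use different directories.
    static List<String> buildDirs(String home, List<String> modules) {
        List<String> entries = new ArrayList<>();
        for (String module : modules) {
            String target = String.join(File.separator, home, module, "target");
            entries.add(target + File.separator + "classes");
            entries.add(target + File.separator + "test-classes");
        }
        return entries;
    }

    // Joins entries with the platform path separator (":" on Unix, ";" on Windows).
    static String toClasspath(List<String> entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // "core" and "yarn" are example module names, not an exhaustive list.
        List<String> dirs = buildDirs("/path/to/spark", Arrays.asList("core", "yarn"));
        // A child process would receive this value through -cp or an equivalent option.
        System.out.println(toClasspath(dirs));
    }
}
```

Keeping the directory discovery separate from the joining step makes it easy to reuse the same list for processes that take their classpath in different ways.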

YarnClusterSuite required some custom code since the executors are run
differently (i.e. not through the launcher library, like standalone and
Mesos do).

I also found a couple of tests that could leak a SparkContext on failure,
and added code to handle those.
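
For readers unfamiliar with this kind of leak, here is a minimal sketch of the usual fix: stop the context in a finally block so that a failing test body cannot leave it running. FakeContext is a hypothetical stand-in for a SparkContext so the example runs without Spark, and withContext is an illustrative helper name, not something taken from this patch.

```java
final class ContextCleanupSketch {

    // Hypothetical stand-in for a SparkContext: it only records whether stop() was called.
    static final class FakeContext {
        private boolean stopped = false;
        void stop() { stopped = true; }
        boolean isStopped() { return stopped; }
    }

    // A test body that may fail with any exception.
    interface TestBody {
        void run(FakeContext ctx) throws Exception;
    }

    // Runs the body and always stops the context, even when the body throws.
    static void withContext(FakeContext ctx, TestBody body) throws Exception {
        try {
            body.run(ctx);
        } finally {
            ctx.stop();
        }
    }

    public static void main(String[] args) {
        FakeContext ctx = new FakeContext();
        try {
            withContext(ctx, c -> { throw new IllegalStateException("simulated test failure"); });
        } catch (Exception e) {
            // The failure still propagates; only the cleanup is guaranteed.
        }
        System.out.println("stopped after failure: " + ctx.isStopped());
    }
}
```

The exception is not swallowed by the helper, so the test still fails; the only difference is that the next test does not start with a stale context.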

With this patch, it's possible to run the following command from a clean
source directory and have all tests pass:

mvn -Pyarn -Phadoop-2.4 -Phive-thriftserver install

Marcelo Vanzin added 2 commits July 23, 2015 11:23
Doing this may cause weird errors when tests are run on maven, depending
on the flags used. Instead, expose the needed functionality through methods
that do not expose shaded classes.
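
A minimal sketch of that pattern, under stated assumptions: RelocatedServer is a hypothetical stand-in for a class that is relocated (shaded) at packaging time, and boundPort is an illustrative accessor name. Neither is taken from the patch itself.

```java
final class ShadedFacadeSketch {

    // Hypothetical stand-in for a class that is relocated (shaded) when the build packages it.
    private static final class RelocatedServer {
        private final int port;
        RelocatedServer(int port) { this.port = port; }
        int getLocalPort() { return port; }
    }

    private final RelocatedServer server;

    ShadedFacadeSketch(int port) {
        this.server = new RelocatedServer(port);
    }

    // Callers get a plain int, so their compiled code never references the relocated type.
    int boundPort() {
        return server.getLocalPort();
    }

    public static void main(String[] args) {
        System.out.println(new ShadedFacadeSketch(8080).boundPort());
    }
}
```

Because the relocated type never appears in a method signature, test code compiled against unshaded classes and run against shaded ones (or the reverse) has nothing to mismatch on.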
@vanzin
Contributor Author

vanzin commented Jul 23, 2015

Note I included a separate patch in this PR, since I was hitting that issue every time with these changes. I'll rebase once that patch is committed.

@SparkQA

SparkQA commented Jul 23, 2015

Test build #38281 has finished for PR 7629 at commit 8d2b0ea.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Jul 24, 2015

Weird that scalastyle passed locally.

@SparkQA

SparkQA commented Jul 24, 2015

Test build #38387 has finished for PR 7629 at commit 0f5a5bb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class InternalRow extends Serializable
    • case class ChangeDecimalPrecision(child: Expression) extends UnaryExpression
    • class GenericRow(protected[sql] val values: Array[Any]) extends Row
    • class GenericInternalRow(protected[sql] val values: Array[Any]) extends InternalRow
    • class GenericInternalRowWithSchema(values: Array[Any], val schema: StructType)
    • class GenericMutableRow(val values: Array[Any]) extends MutableRow

@SparkQA

SparkQA commented Jul 24, 2015

Test build #38393 has finished for PR 7629 at commit 3ec7cd4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class InternalRow extends Serializable
    • case class ChangeDecimalPrecision(child: Expression) extends UnaryExpression
    • class GenericRow(protected[sql] val values: Array[Any]) extends Row
    • class GenericInternalRow(protected[sql] val values: Array[Any]) extends InternalRow
    • class GenericInternalRowWithSchema(values: Array[Any], val schema: StructType)
    • class GenericMutableRow(val values: Array[Any]) extends MutableRow

@SparkQA

SparkQA commented Jul 24, 2015

Test build #38401 has finished for PR 7629 at commit 469a732.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Jul 25, 2015

Ok, this is down to that Kinesis test that's failing for everybody.

@vanzin
Contributor Author

vanzin commented Jul 27, 2015

Jenkins, retest this please.

@SparkQA

SparkQA commented Jul 27, 2015

Test build #119 has finished for PR 7629 at commit 469a732.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Jul 28, 2015

Jenkins retest this please.

@SparkQA

SparkQA commented Jul 28, 2015

Test build #138 has finished for PR 7629 at commit 469a732.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Jul 28, 2015

Jenkins, retest this please.

@SparkQA

SparkQA commented Jul 29, 2015

Test build #140 has finished for PR 7629 at commit 469a732.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 29, 2015

Test build #38774 has finished for PR 7629 at commit 469a732.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Jul 31, 2015

retest this please

@SparkQA

SparkQA commented Jul 31, 2015

Test build #182 has finished for PR 7629 at commit 469a732.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 31, 2015

Test build #39251 has finished for PR 7629 at commit 469a732.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Jul 31, 2015

retest this please

@SparkQA

SparkQA commented Aug 1, 2015

Test build #39280 has finished for PR 7629 at commit 469a732.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 1, 2015

Test build #183 has finished for PR 7629 at commit 469a732.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 1, 2015

Test build #39303 has finished for PR 7629 at commit 8e4a136.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Aug 1, 2015

retest this please

@SparkQA

SparkQA commented Aug 1, 2015

Test build #188 has finished for PR 7629 at commit 8e4a136.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 24, 2015

Test build #41471 timed out for PR 7629 at commit d234399 after a configured wait of 175m.

@vanzin
Contributor Author

vanzin commented Aug 24, 2015

Seems like all tests passed; no idea why Jenkins thinks they timed out.

AFAICT, this is good to go. @pwendell ?

@pwendell
Contributor

Yeah, sounds good - might be good to let it run one more time just to be
sure it's not affecting Jenkins somehow.


@pwendell
Contributor

Jenkins, test this please.

@SparkQA

SparkQA commented Aug 25, 2015

Test build #41492 timed out for PR 7629 at commit d234399 after a configured wait of 175m.

@vanzin
Contributor Author

vanzin commented Aug 25, 2015

175m is starting to look really low. The Scala/Java unit tests took 143m to run. Anyway, retest this please.

@pwendell
Contributor

Does this PR increase test time in some way? Just wondering why this would
consistently time out when others don't.


@vanzin
Contributor Author

vanzin commented Aug 25, 2015

No, it shouldn't affect the test runtime at all. And I get timeouts in lots of PRs, not just this one.

@SparkQA

SparkQA commented Aug 25, 2015

Test build #41539 has finished for PR 7629 at commit d234399.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@pwendell
Contributor

Jenkins, retest this please.

@SparkQA

SparkQA commented Aug 25, 2015

Test build #41559 has finished for PR 7629 at commit d234399.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 26, 2015

Test build #41574 has finished for PR 7629 at commit a2c7c59.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Aug 26, 2015

I've seen the failing test fail in other PRs (and it has passed here before)... let's try again just in case. retest this please

@SparkQA

SparkQA commented Aug 26, 2015

Test build #41584 timed out for PR 7629 at commit a2c7c59 after a configured wait of 175m.

@pwendell
Contributor

Marcelo, you will need to change the timeout in the code itself for it to
increase from 175m.

On Aug 25, 2015 11:24 PM, "UCB AMPLab" notifications@github.com wrote:

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41584/
Test FAILed.



@vanzin
Contributor Author

vanzin commented Aug 26, 2015

> you will need to change the timeout in the code

Yes, but I don't want to do that as part of this change, since the two are unrelated. All tests have been passing; the timeouts are unrelated to the PR.

@pwendell
Contributor

Okay then - I would do that separately, get this PR passing, then merge it.
It is not good to merge a PR that deterministically fails Jenkins. Have we
done that in the recent past? I saw a few other PRs that hit an occasional
timeout, but this one seems to be timing out with certainty every time.


@vanzin
Contributor Author

vanzin commented Aug 26, 2015

> but this one seems to be timing out with certainty every time

Often, but not deterministically every time. It seems to time out as often as any other PR that needs to run all tests (I've already cc'ed you on at least one other PR that times out just as often).

@pwendell
Contributor

I still don't understand - why not fix the issue in a separate PR, get this
one passing, and then merge it? That would also benefit the other PRs that
are facing this issue.


@vanzin
Contributor Author

vanzin commented Aug 26, 2015

I'm working on fixing the root cause of the timeouts (running unnecessary tests). If you think it would be beneficial to just bump the timeout right now, please just send a PR for that; I'm pretty confident that this PR does not make the timeout issue any worse.

@pwendell
Contributor

Jenkins, retest this please.

@pwendell
Contributor

K - I just sent a hotfix to up the timeout.


@SparkQA

SparkQA commented Aug 26, 2015

Test build #41647 has finished for PR 7629 at commit a2c7c59.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor Author

vanzin commented Aug 26, 2015

Yay!

@pwendell
Contributor

Great - looks good!

@vanzin
Contributor Author

vanzin commented Aug 28, 2015

Ok I'm merging this.

@asfgit asfgit closed this in c53c902 Aug 28, 2015
@vanzin vanzin deleted the SPARK-9284 branch September 9, 2015 23:10