SPARK-4297 [BUILD] Build warning fixes omnibus #3157

Closed
wants to merge 1 commit into from

Conversation

srowen
Member

@srowen srowen commented Nov 7, 2014

There are a number of warnings generated in a normal, successful build right now. They're mostly Java unchecked cast warnings, which can be suppressed. But there's a grab bag of other Scala language warnings and so on that can all be easily fixed. The forthcoming PR fixes about 90% of the build warnings I see now.
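As a rough illustration of the unchecked cast case mentioned above (a minimal sketch, not code from this PR; the class and method names are hypothetical), the suppression can be scoped to the single declaration that performs the cast rather than to a whole class:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of Spark.
final class UncheckedCastSketch {

  // Casting Object to List<String> cannot be verified at run time, so javac
  // reports an unchecked cast. The annotation is placed on the local variable
  // so that only this one statement is exempted.
  static List<String> asStrings(Object untyped) {
    @SuppressWarnings("unchecked")
    List<String> typed = (List<String>) untyped;
    return typed;
  }

  public static void main(String[] args) {
    List<String> names = new ArrayList<>();
    names.add("spark");
    System.out.println(asStrings(names));
  }
}
```

Keeping the annotation on the local variable leaves any other unchecked warnings in the same method visible.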

@SparkQA

SparkQA commented Nov 7, 2014

Test build #23054 has started for PR 3157 at commit 17bc581.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 7, 2014

Test build #23054 has finished for PR 3157 at commit 17bc581.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23054/
Test FAILed.

@SparkQA

SparkQA commented Nov 7, 2014

Test build #23055 has started for PR 3157 at commit 27800f7.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 7, 2014

Test build #23055 has finished for PR 3157 at commit 27800f7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23055/
Test PASSed.

@@ -63,7 +62,7 @@ object StreamingKMeans {
val trainingData = ssc.textFileStream(args(0)).map(Vectors.parse)
val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)

val model = new StreamingKMeans()
Contributor

Why was this change needed?

Member

This was fixed in [https://github.com//pull/3568] by renaming the example. My apologies for not seeing this PR before submitting 3568!

Member Author

Got it, I will undo this bit, and rebase too for good measure.

@srowen
Member Author

srowen commented Nov 15, 2014

The object and the MLlib class have the same name so one was shadowing the other. It was correct but generated a warning IIRC. This just disambiguates.

@SparkQA

SparkQA commented Nov 21, 2014

Test build #23727 has started for PR 3157 at commit f4b6c5e.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 21, 2014

Test build #23727 has finished for PR 3157 at commit f4b6c5e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23727/
Test PASSed.

@SparkQA

SparkQA commented Dec 3, 2014

Test build #24100 has started for PR 3157 at commit 36a5947.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Dec 3, 2014

Test build #24100 has finished for PR 3157 at commit 36a5947.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24100/
Test PASSed.

@jkbradley
Member

@srowen With my settings, comparing warnings between master and this PR merged with master shows this PR eliminates 2 warnings (TaskResultGetter, ParquetTypes). The others don't appear (for my settings). Hopefully someone else can check too.

@srowen
Member Author

srowen commented Dec 3, 2014

@jkbradley No, there should be many more. Here's what I see on master now: https://gist.github.com/srowen/ddf5e606ba9cb888999f

Not all of these are addressed in the PR and some are spurious. But maybe the difference is that several of the warnings are from the build (i.e. antrun plugin), test code, and Java code?

@jkbradley
Member

@srowen I hadn't checked the test build. With that, I think I've verified all the warning fixes/suppressions expected, except for JavaRowSuite (for which I don't see any warnings in master or your PR). Hopefully someone else can check, but FWIW, this LGTM

@SparkQA

SparkQA commented Dec 11, 2014

Test build #24376 has started for PR 3157 at commit b5df4e7.

  • This patch merges cleanly.

@srowen
Member Author

srowen commented Dec 11, 2014

@jkbradley I rebased just now. Hm, JavaRowSuite shows a warning in IntelliJ for an unchecked generic array creation, and I see this if I compile without the annotation:

[warn] Note: /Users/srowen/Documents/spark/sql/core/src/test/java/org/apache/spark/sql/api/java/JavaRowSuite.java uses unchecked or unsafe operations.
[warn] Note: Recompile with -Xlint:unchecked for details.

I'm going to fix a few more new, easy warnings and push. Take a look in a minute.
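For readers unfamiliar with this warning, here is one common way an unchecked generic array creation arises in Java. This is a sketch under assumptions, not the actual JavaRowSuite code: the class and the `totalSize` helper are hypothetical. A varargs parameter of a parameterized type forces javac to build a generic array at each call site, and the warning can be silenced with an annotation where the call is known to be safe:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class, not part of Spark.
final class GenericArrayWarningSketch {

  // A varargs parameter of a non-reifiable type such as List<String> means
  // javac must create a List<String>[] for every call. The annotation here
  // covers the warning reported at the declaration.
  @SuppressWarnings("unchecked")
  static int totalSize(List<String>... rows) {
    int total = 0;
    for (List<String> row : rows) {
      total += row.size();
    }
    return total;
  }

  // The call to totalSize is where javac reports the unchecked generic array
  // creation, so the caller carries its own annotation.
  @SuppressWarnings("unchecked")
  static int example() {
    List<String> first = Arrays.asList("a", "b");
    List<String> second = Arrays.asList("c");
    return totalSize(first, second);
  }

  public static void main(String[] args) {
    System.out.println(example()); // prints 3
  }
}
```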

@SparkQA

SparkQA commented Dec 11, 2014

Test build #24377 has started for PR 3157 at commit 4d2847b.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Dec 11, 2014

Test build #24376 has finished for PR 3157 at commit b5df4e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24376/
Test PASSed.

@SparkQA

SparkQA commented Dec 11, 2014

Test build #24377 has finished for PR 3157 at commit 4d2847b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24377/
Test PASSed.

@nchammas
Contributor

+1 on this kind of cleanup work.

@jkbradley
Member

Sorry for the slow response; testing now, but it will take a bit longer to finish

@SparkQA

SparkQA commented Dec 19, 2014

Test build #24642 has started for PR 3157 at commit 8c9e469.

  • This patch merges cleanly.

@srowen
Member Author

srowen commented Dec 19, 2014

Sorry for the noise on this one. I couldn't work out why the test was failing after a rebase, but I've figured it out now. I think that, in the process of fixing the warning, the change actually fixed the test, which in turn reveals a small problem in the test.

In https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetQuerySuite.scala#L487 it looks like it should really be

    checkFilter[Operators.NotEq[Integer]](!('a.int === 1))

instead of

    checkFilter[Operators.Not](!('a.int === 1))

@liancheng does that make sense to you? I think this was the commit with the test:
423baea

@SparkQA

SparkQA commented Dec 19, 2014

Test build #24642 has finished for PR 3157 at commit 8c9e469.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24642/
Test FAILed.

@srowen
Member Author

srowen commented Dec 19, 2014

Hm, I can't see how the Hive test failure here is related to this PR. It doesn't occur for me locally and I can't see any relevant change that would cause a NoSuchMethodError. For the moment I think this is an unrelated error. That said, I don't see why other PRs aren't failing, except that perhaps they do not trigger Hive-related tests?

@liancheng
Contributor

@srowen Sorry for the late reply, missed this thread. Yes, the Operators.Not should be replaced with Operators.NotEq. The original Parquet filter test cases in ParquetQuerySuite didn't catch this error because the type information in this assertion is actually erased at runtime.

On the other hand, ParquetQuerySuite will be removed soon. It has been deprecated in favor of a new set of Parquet test suites introduced in #3644. A similar type erasure problem doesn't exist in the new ParquetFilterSuite.
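To illustrate the erasure problem in Java terms (the real test is Scala, and the class and method names below are hypothetical): a check expressed only through a type parameter is erased and cannot fail at the cast, whereas a check against a `Class` token is performed at run time.

```java
// Hypothetical example class, not part of Spark.
final class ErasureSketch {

  // T is erased, so the cast below compiles to a cast to Object and performs
  // no run-time check: every value "passes".
  @SuppressWarnings("unchecked")
  static <T> boolean looksLike(Object value) {
    try {
      T ignored = (T) value;
      return true;
    } catch (ClassCastException e) {
      return false;
    }
  }

  // Passing the class explicitly keeps the type available at run time.
  static <T> boolean isInstance(Class<T> type, Object value) {
    return type.isInstance(value);
  }

  public static void main(String[] args) {
    System.out.println(ErasureSketch.<Integer>looksLike("not an integer")); // prints true
    System.out.println(isInstance(Integer.class, "not an integer"));        // prints false
  }
}
```

An assertion built like `looksLike` succeeds regardless of the type argument, which is the kind of silent pass described above.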

@liancheng
Contributor

Oh I see you fixed the type erasure issue, thanks :) I'm looking into the Hive test failures.

@liancheng
Contributor

The NoSuchMethodError was caused by the wrong datanucleus-core version. The target method is defined in datanucleus-core 3.2.2 but not in 3.2.10. The former is used when compiling against Hive 0.12.0, while the latter is used when compiling against Hive 0.13.1. Currently on Jenkins we first do a clean compile against 0.12.0, then build the assembly jar against 0.13.1 without cleaning, and run the tests.

This behavior left both versions of datanucleus-core in the lib_managed directory, and may sometimes mess up class paths. I'll open a PR to clean lib_managed before compiling Hive 0.13.1 in dev/run-tests.

I've seen this error once before, several days ago, right after the most recent Jenkins upgrade. Not sure why this issue wasn't detected earlier.

@liancheng
Contributor

Just ran dev/run-tests locally, confirmed that both sets of datanucleus jars can be found:

$ find . -name "datanucleus-*"
./lib_managed/jars/datanucleus-api-jdo-3.2.1.jar
./lib_managed/jars/datanucleus-api-jdo-3.2.6.jar
./lib_managed/jars/datanucleus-core-3.2.10.jar
./lib_managed/jars/datanucleus-core-3.2.2.jar
./lib_managed/jars/datanucleus-rdbms-3.2.1.jar
./lib_managed/jars/datanucleus-rdbms-3.2.9.jar
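A quick way to spot this kind of conflict is to group jar file names by artifact and flag any artifact that appears with more than one version. The following is only a sketch: the class and method names are hypothetical, and it assumes the simple `name-version.jar` naming seen in the listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of dev/run-tests.
final class DuplicateJarSketch {

  // Splits "artifact-1.2.3.jar" into the artifact name and a dotted numeric version.
  private static final Pattern JAR = Pattern.compile("^(.+)-(\\d[\\d.]*)\\.jar$");

  // Returns only the artifacts that are present in two or more versions.
  static Map<String, List<String>> conflictingVersions(List<String> fileNames) {
    Map<String, List<String>> byArtifact = new TreeMap<>();
    for (String name : fileNames) {
      Matcher m = JAR.matcher(name);
      if (m.matches()) {
        byArtifact.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(m.group(2));
      }
    }
    byArtifact.values().removeIf(versions -> versions.size() < 2);
    return byArtifact;
  }

  public static void main(String[] args) {
    List<String> jars = Arrays.asList(
        "datanucleus-api-jdo-3.2.1.jar",
        "datanucleus-api-jdo-3.2.6.jar",
        "datanucleus-core-3.2.10.jar",
        "datanucleus-core-3.2.2.jar",
        "datanucleus-rdbms-3.2.1.jar",
        "datanucleus-rdbms-3.2.9.jar");
    System.out.println(conflictingVersions(jars));
  }
}
```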

@srowen
Member Author

srowen commented Dec 21, 2014

@liancheng Ah, good catch. Yeah sounds like a separate issue. I think this PR is good to go then, if that was the only error.

@liancheng
Contributor

Opened #3756 to fix the datanucleus issue.

asfgit pushed a commit that referenced this pull request Dec 23, 2014
This PR tries to fix the Hive tests failure encountered in PR #3157 by cleaning `lib_managed` before building assembly jar against Hive 0.13.1 in `dev/run-tests`. Otherwise two sets of datanucleus jars would be left in `lib_managed` and may mess up class paths while executing Hive test suites. Please refer to [this thread] [1] for details. A clean build would be even safer, but we only clean `lib_managed` here to save build time.

This PR also takes the chance to clean up some minor typos and formatting issues in the comments.

[1]: #3157 (comment)

Author: Cheng Lian <lian@databricks.com>

Closes #3756 from liancheng/clean-lib-managed and squashes the following commits:

e2bd21d [Cheng Lian] Adds lib_managed to clean set
c9f2f3e [Cheng Lian] Cleans lib_managed before compiling with Hive 0.13.1

(cherry picked from commit 395b771)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
@srowen
Member Author

srowen commented Dec 23, 2014

Jenkins, retest this please.

@SparkQA

SparkQA commented Dec 23, 2014

Test build #24747 has started for PR 3157 at commit 8c9e469.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Dec 23, 2014

Test build #24747 has finished for PR 3157 at commit 8c9e469.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24747/
Test PASSed.

@JoshRosen
Contributor

Yay, the tests passed! @srowen it looks like I could merge this now, but are there any other updates / changes that you want to make? Also, should this be pulled into any branches other than master?

@srowen
Member Author

srowen commented Dec 24, 2014

@JoshRosen No, probably enough for now. I'll have another look at cleaning up warnings in a couple months.

@@ -1012,7 +1012,7 @@ public void testPairToPairFlatMapWithChangingTypes() { // Maps pair -> pair
 }
 });
 JavaTestUtils.attachTestOutputStream(flatMapped);
-List<List<Tuple2<String, Integer>>> result = JavaTestUtils.runStreams(ssc, 2, 2);
+List<List<Tuple2<Integer, String>>> result = JavaTestUtils.runStreams(ssc, 2, 2);
Contributor

Wait, how did this even compile before if you were able to swap the type parameters to Tuple2?

Member Author

The generic types are wrong, but the underlying objects are fine. JavaTestUtils.runStreams returns a List<List<V>> so happily binds the Tuple2 type to whatever the caller says. The reason the comparison compiled was that assertEquals(Object, Object) accepts anything. The ultimate List.equals() method doesn't care about types and compares values which are in fact correct and of the right type and equal.
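The effect described above can be reproduced in a small standalone sketch. The `run` method is a hypothetical stand-in for a helper like `JavaTestUtils.runStreams`, and `Map.Entry` stands in for `Tuple2`; neither is the actual Spark code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical example class, not part of Spark.
final class InferredTypeSketch {

  // The element type V is chosen by the caller and never verified.
  @SuppressWarnings("unchecked")
  static <V> List<List<V>> run(List<List<Object>> actual) {
    return (List<List<V>>) (List<?>) actual;
  }

  static boolean demo() {
    // What the "stream" really produced: one batch holding (1, "a").
    List<Object> batch = new ArrayList<>();
    batch.add(new SimpleEntry<Integer, String>(1, "a"));
    List<List<Object>> actual = new ArrayList<>();
    actual.add(batch);

    // Wrong declared type (key and value swapped). This still compiles,
    // because V is simply inferred from the declaration.
    List<List<Map.Entry<String, Integer>>> result = run(actual);

    List<Map.Entry<Integer, String>> expectedBatch = new ArrayList<>();
    expectedBatch.add(new SimpleEntry<Integer, String>(1, "a"));
    List<List<Map.Entry<Integer, String>>> expected = new ArrayList<>();
    expected.add(expectedBatch);

    // equals(Object) compares run-time values, not declared types.
    return expected.equals(result);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints true
  }
}
```

The mismatch would only surface if the test read an element through the wrong type, for example by assigning a key to a `String` variable, which would throw a `ClassCastException`.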

Contributor

Ah, that makes sense.

@JoshRosen
Contributor

This looks good to me. Since @liancheng and @jkbradley have done a pretty extensive pass through the changes, too, I'm going to merge this into master.

@asfgit asfgit closed this in 29fabb1 Dec 24, 2014
@srowen srowen deleted the SPARK-4297 branch December 25, 2014 09:11