
[SPARK-9613] [CORE] Ban use of JavaConversions and migrate all existing uses to JavaConverters #8033

Closed
srowen wants to merge 1 commit into apache:master from srowen:SPARK-9613

Conversation
Conversation

srowen (Member) commented Aug 7, 2015

Replace JavaConversions implicits with JavaConverters

Most occurrences I've seen so far are necessary conversions; a few have been avoidable. None are in critical code as far as I can see, yet.
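
As a sketch of what the migration means in practice (illustrative code, not lines from the patch):

```scala
import java.util.{List => JList}

import scala.collection.JavaConverters._

object ConversionSketch {
  def main(args: Array[String]): Unit = {
    val scalaSeq = Seq(1, 2, 3)

    // Before (JavaConversions): the Seq became a java.util.List via an
    // invisible implicit, so the wrapper allocation was easy to miss:
    //   import scala.collection.JavaConversions._
    //   val jlist: JList[Int] = scalaSeq

    // After (JavaConverters): the conversion is an explicit method call.
    val jlist: JList[Int] = scalaSeq.asJava

    // The round trip back to Scala is equally explicit.
    val back: Seq[Int] = jlist.asScala
    println(back)
  }
}
```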

SparkQA commented Aug 7, 2015

Test build #40172 has finished for PR 8033 at commit d2aed5a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 10, 2015

Test build #40291 has finished for PR 8033 at commit d82a730.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ShuffledHashOuterJoin(

SparkQA commented Aug 10, 2015

Test build #40311 has finished for PR 8033 at commit a353317.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ShuffledHashOuterJoin(

JoshRosen (Contributor) commented:

> Is the point to get rid of implicits only, or also to replace JavaConversions with JavaConverters? I wasn't sure whether the latter is preferred.

The point is to get rid of JavaConversions, since it makes it really hard for code reviewers to spot when inefficient implicit conversions are taking place. There are many places where performing these conversions does make sense, such as in some of the Java API wrappers, and in those cases I'd prefer that we use the explicit .asJava and .asScala methods from JavaConverters.
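
A small sketch of why the explicit form is easier to review (illustrative only; the names are hypothetical, not from this PR):

```scala
import scala.collection.JavaConverters._

object VisibilitySketch {
  // A Scala-side function that expects a Scala collection.
  def total(xs: Seq[Int]): Int = xs.sum

  def main(args: Array[String]): Unit = {
    val jlist: java.util.List[Int] = java.util.Arrays.asList(1, 2, 3)

    // With `import scala.collection.JavaConversions._` in scope, this could
    // be written `total(jlist)`, hiding the wrapper allocation entirely.
    // With JavaConverters the conversion has to be written out, so a
    // reviewer sees exactly where Java <-> Scala wrapping happens:
    println(total(jlist.asScala))
  }
}
```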

srowen (Member, Author) commented Aug 10, 2015

Good to know, since that's what I've been doing. Yes, it's clear this will end up creating somewhat more verbose code, but code in which the conversions are visible. I personally agree that's worth it.

What do you think so far? I'm about 1/3 of the way through, so I want to check whether it feels about right before going all the way through.

There are certainly a few places where the code can just be simplified; that's about 10% of the changes I seem to be making.

SparkQA commented Aug 10, 2015

Test build #40332 has finished for PR 8033 at commit f8f8898.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

srowen force-pushed the SPARK-9613 branch 2 times, most recently from 4d2de40 to 08ff887 on August 11, 2015 13:59
SparkQA commented Aug 11, 2015

Test build #40456 has finished for PR 8033 at commit 08ff887.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ShuffledHashOuterJoin(

```diff
@@ -38,7 +38,7 @@ object LogisticRegressionSuite {
       scale: Double,
       nPoints: Int,
       seed: Int): java.util.List[LabeledPoint] = {
-    seqAsJavaList(generateLogisticInput(offset, scale, nPoints, seed))
+    seqAsJavaListConverter(generateLogisticInput(offset, scale, nPoints, seed)).asJava
```
Contributor (inline review comment):

Do we need seqAsJavaListConverter? asJava should return a Java List when the input is a Scala Seq.

srowen (Member, Author) replied:

Oh! I had overlooked this entirely: the point of JavaConverters is that the "xAsY" functions are still implicit. Well, this will clean things up a lot. Thanks for pointing that out before I got too far; it won't be hard to touch this up in the changes so far.
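
Concretely, the named converter and the bare method in the diff above are the same thing; a minimal sketch (hypothetical values, not the patched test code):

```scala
import scala.collection.JavaConverters._

object ConverterSketch {
  val points: Seq[Int] = Seq(1, 2, 3)

  // Idiomatic form: the seqAsJavaListConverter enrichment is applied
  // implicitly, so only .asJava needs to be written.
  val jlist: java.util.List[Int] = points.asJava

  // Fully spelled-out form: this is what the compiler inserts above.
  val jlist2: java.util.List[Int] = seqAsJavaListConverter(points).asJava
}
```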

SparkQA commented Aug 11, 2015

Test build #40518 has finished for PR 8033 at commit 5815596.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SQLTransformer (override val uid: String) extends Transformer
    • case class EqualNullSafe(attribute: String, value: Any) extends Filter

srowen force-pushed the SPARK-9613 branch 2 times, most recently from 03b747b to 7c92505 on August 12, 2015 09:43
SparkQA commented Aug 12, 2015

Test build #40618 has finished for PR 8033 at commit 7c92505.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 12, 2015

Test build #40617 has finished for PR 8033 at commit 03b747b.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

srowen force-pushed the SPARK-9613 branch 2 times, most recently from e1268c9 to b57657c on August 13, 2015 17:11
SparkQA commented Aug 13, 2015

Test build #40781 has finished for PR 8033 at commit b57657c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait Identifiable
    • class VectorUDT extends UserDefinedType[Vector]
    • class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol, HasSeed):

SparkQA commented Aug 14, 2015

Test build #40863 has finished for PR 8033 at commit 14c912f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 17, 2015

Test build #41013 has finished for PR 8033 at commit e772d9f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class FilterNode(condition: Expression, child: LocalNode) extends UnaryLocalNode
    • abstract class LocalNode extends TreeNode[LocalNode]
    • abstract class LeafLocalNode extends LocalNode
    • abstract class UnaryLocalNode extends LocalNode
    • case class ProjectNode(projectList: Seq[NamedExpression], child: LocalNode) extends UnaryLocalNode
    • case class SeqScanNode(output: Seq[Attribute], data: Seq[InternalRow]) extends LeafLocalNode
    • public final class UTF8String implements Comparable<UTF8String>, Externalizable

srowen (Member, Author) commented Aug 17, 2015

@zhzhan could I get your eyes on this change? My change is causing OrcQuerySuite to fail, and I'm not yet sure why. I see you created much of this test, so I'm looking for clues before I start taking my change apart piece by piece.

The failure is that the Maps, which are expected to map 1 -> 1, 2 -> 2, etc., don't match up. My change replaces implicit Scala <-> Java conversions with more explicit ones, so evidently either a typo or a behavior difference introduced in the process broke the test, which passes normally. If anything jumps out at you as a good place to look, let me know.

```
[info] OrcQuerySuite:
[info] - Read/write All Types (1 second, 895 milliseconds)
[info] - Read/write binary data (374 milliseconds)
10:22:00.729 WARN org.apache.spark.scheduler.TaskSetManager: Stage 10988 contains a task of very large size (100 KB). The maximum recommended task size is 100 KB.
[info] - Read/write all types with non-primitive type *** FAILED *** (6 seconds, 266 milliseconds)
[info]   Results do not match for query:
...
[info]   ![10,10,10,10.0,10.0,10,10,true,WrappedArray(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),WrappedArray(0, null, null, 3, null, null, 6, null, null, 9),Map(0 -> 0, 5 -> 5, 1 -> 1, 6 -> 6, 9 -> 9, 2 -> 2, 7 -> 7, 3 -> 3, 8 -> 8, 4 -> 4),Map(0 -> 0, 5 -> 5, 10 -> null, 1 -> 1, 6 -> 6, 9 -> 9, 2 -> 2, 7 -> 7, 3 -> 3, 8 -> 8, 4 -> 4),[WrappedArray(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),[10,10]]]
[info]    [10,10,10,10.0,10.0,10,10,true,WrappedArray(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),WrappedArray(0, null, null, 3, null, null, 6, null, null, 9),Map(0 -> 0, 5 -> 1, 1 -> 2, 6 -> 3, 9 -> 4, 2 -> 5, 7 -> 6, 3 -> 7, 8 -> 8, 4 -> 9),Map(0 -> 0, 5 -> 1, 10 -> 2, 1 -> 3, 6 -> 4, 9 -> 5, 2 -> 6, 7 -> 7, 3 -> 8, 8 -> 9, 4 -> null),[WrappedArray(0, 1, 2, 3, 4, 5, 6, 7, 8, 9),[10,10]]]
...
```
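
The shifted pairs in the second row (5 -> 1, 1 -> 2, and so on) are the classic signature of a map rebuilt from separately iterated keys and values rather than from its entries. The sketch below illustrates that failure class only; it is not the actual OrcRelation.scala code, and the names are hypothetical:

```scala
import scala.collection.JavaConverters._

object MapPairingSketch {
  def main(args: Array[String]): Unit = {
    val jmap = new java.util.HashMap[Int, Int]()
    (0 to 9).foreach(i => jmap.put(i, i))

    // Safe: convert entry by entry, so each key keeps its own value.
    val safe: Map[Int, Int] = jmap.asScala.toMap

    // Fragile: pairing separately obtained key and value iterators relies
    // on both iterating in the same order, which the java.util.Map contract
    // does not guarantee. If the orders diverge, values attach to the wrong
    // keys, producing exactly the shifted pairs seen in the log above.
    val fragile: Map[Int, Int] =
      jmap.keySet.asScala.iterator.zip(jmap.values.asScala.iterator).toMap

    println(safe == fragile) // happens to be true for HashMap, but not promised
  }
}
```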

zhzhan (Contributor) commented Aug 17, 2015

@srowen It seems the mapping got messed up; I don't have a clue yet and didn't find any obvious reason why the patch would break the test. I will dig more and let you know if I have any new findings.

srowen (Member, Author) commented Aug 17, 2015

@zhzhan if you have a moment to look, I'd be grateful; I am a little stuck. At the moment I suspect these lines: https://github.com/apache/spark/pull/8033/files#diff-287b5c238371c028d6c21fb5c2ebe4bcR205

zhzhan (Contributor) commented Aug 17, 2015

@srowen You could probably revert the change in sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala.

SparkQA commented Aug 18, 2015

Test build #41141 timed out for PR 8033 at commit 7568e13 after a configured wait of 175m.

SparkQA commented Aug 18, 2015

Test build #1650 has finished for PR 8033 at commit 7568e13.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 19, 2015

Test build #41222 has finished for PR 8033 at commit 45f3f2e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 19, 2015

Test build #1668 has finished for PR 8033 at commit 45f3f2e.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 19, 2015

Test build #41256 has finished for PR 8033 at commit 2fd6d1f.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 19, 2015

Test build #41258 has finished for PR 8033 at commit d53c9ec.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 20, 2015

Test build #41317 timed out for PR 8033 at commit 1d351e2 after a configured wait of 175m.

SparkQA commented Aug 20, 2015

Test build #1674 has finished for PR 8033 at commit 1d351e2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 20, 2015

Test build #41324 has finished for PR 8033 at commit 304e352.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 21, 2015

Test build #41364 timed out for PR 8033 at commit b4f58d5 after a configured wait of 175m.

SparkQA commented Aug 21, 2015

Test build #1679 has finished for PR 8033 at commit b4f58d5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

srowen (Member, Author) commented Aug 21, 2015

@JoshRosen I wanted to call your attention to this PR since it's ready for a look. It is complete and passes tests, usually. It needs constant rebasing, though, so I wanted to get it reviewed while in a passing state that needs only a minor rebase.

SparkQA commented Aug 22, 2015

Test build #41409 has finished for PR 8033 at commit 68a6bc4.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 24, 2015

Test build #1684 timed out for PR 8033 at commit 68a6bc4 after a configured wait of 175m.

SparkQA commented Aug 24, 2015

Test build #1687 has finished for PR 8033 at commit 68a6bc4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

srowen (Member, Author) commented Aug 25, 2015

I think this PR is clearly passing, and as I understand it, several people are in favor of the general idea. I believe I've implemented it correctly. Since it frequently needs rebasing and is evidently passing in general already, I am going to merge it for master / 1.6 the next time it passes.

SparkQA commented Aug 25, 2015

Test build #41528 timed out for PR 8033 at commit 1d16afb after a configured wait of 175m.

srowen (Member, Author) commented Aug 25, 2015

I'm going to merge this on the grounds that it has passed in its current form (modulo rebases) several times now; the last test run showed all Java/Scala/Python 2.6 tests succeeding, and it merely timed out at the end on Python 3.4. It will only go into master as well. Since it needs a rebase every 24-48 hours and about half the runs fail due to flaky tests, it is rare to get a fully clean pass in any event.

srowen closed this Aug 25, 2015
srowen deleted the SPARK-9613 branch August 25, 2015 11:34
asfgit pushed a commit that referenced this pull request Aug 25, 2015
[SPARK-9613] [CORE] Ban use of JavaConversions and migrate all existing uses to JavaConverters

Replace `JavaConversions` implicits with `JavaConverters`

Most occurrences I've seen so far are necessary conversions; a few have been avoidable. None are in critical code as far as I see, yet.

Author: Sean Owen <sowen@cloudera.com>

Closes #8033 from srowen/SPARK-9613.