
Boosting Spark to 2.1.0. Adding tests that include Java types encoded … #105

Closed · wants to merge 2 commits

Conversation

imarios
Contributor

@imarios imarios commented Jan 29, 2017

…using injection.

@imarios imarios changed the title Boosting Spark to 2.1.0. Adding tests that include Java type encoded … Boosting Spark to 2.1.0. Adding tests that include Java types encoded … Jan 29, 2017
@codecov-io

codecov-io commented Jan 29, 2017

Codecov Report

Merging #105 into master will increase coverage by 0.09%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #105      +/-   ##
==========================================
+ Coverage   90.72%   90.82%   +0.09%     
==========================================
  Files          25       25              
  Lines         507      512       +5     
  Branches        7        7              
==========================================
+ Hits          460      465       +5     
  Misses         47       47
Impacted Files Coverage Δ
core/src/main/scala/frameless/Injection.scala 100% <ø> (ø) ⬆️
dataset/src/main/scala/frameless/implicits.scala 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f3ad8d3...59c6a61. Read the comment docs.

@OlivierBlanvillain
Contributor

Apparently Spark's Scala 2.10 jar contains an old version of shapeless, introduced by apache/spark#14150. I hope this won't cause binary incompatibilities...

@@ -32,4 +32,12 @@ object implicits {
// implicit def floatToDouble[T](col: TypedColumn[T, Float]): TypedColumn[T, Double] = col.cast[Double]
// implicit def floatToBigDecimal[T](col: TypedColumn[T, Float]): TypedColumn[T, BigDecimal] = col.cast[BigDecimal]
}

object injections {
implicit val javaBoolean: Injection[java.lang.Boolean, Boolean] = Injection(a => a, b => b)
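The `javaBoolean` instance above relies on Scala's implicit (un)boxing between `java.lang.Boolean` and `Boolean`, so identity functions suffice. A self-contained sketch, using a minimal stand-in that mirrors the shape of `frameless.Injection` (in the repo you would use the real trait):

```scala
// Minimal stand-in for frameless.Injection so the example runs on its own.
trait Injection[A, B] extends Serializable {
  def apply(a: A): B   // encode A as B
  def invert(b: B): A  // decode B back to A
}

object Injection {
  def apply[A, B](f: A => B, g: B => A): Injection[A, B] =
    new Injection[A, B] {
      def apply(a: A): B = f(a)
      def invert(b: B): A = g(b)
    }
}

// The instance from the diff: Scala auto-(un)boxes between the Java wrapper
// and the primitive, so both directions are just identity.
val javaBoolean: Injection[java.lang.Boolean, Boolean] =
  Injection(a => a, b => b)
```

Note that `apply` throws a `NullPointerException` if handed a `null` wrapper, which is what the review comment below the diff is getting at.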
Contributor

Did you consider Injection[java.lang.Boolean, Option[Boolean]]? Often when I see Java boxed primitives, it's Java-style Scala code using nulls.
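A sketch of that suggestion, with a stand-in trait included so it runs standalone (the instance name and trait name are illustrative, not from the PR): encoding the Java wrapper as `Option[Boolean]` maps a `null` to `None` instead of failing at encoding time.

```scala
// Stand-in mirroring the shape of frameless.Injection, for a runnable sketch.
trait NullSafeInjection[A, B] extends Serializable {
  def apply(a: A): B
  def invert(b: B): A
}

// Hypothetical null-safe instance: java.lang.Boolean round-trips through
// Option[Boolean], so Java-style nulls become None.
val javaBooleanOpt: NullSafeInjection[java.lang.Boolean, Option[Boolean]] =
  new NullSafeInjection[java.lang.Boolean, Option[Boolean]] {
    def apply(a: java.lang.Boolean): Option[Boolean] =
      Option(a).map(_.booleanValue)                 // null  => None
    def invert(b: Option[Boolean]): java.lang.Boolean =
      b.map(java.lang.Boolean.valueOf).orNull       // None => null
  }
```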

Contributor Author
Hmm, I didn't think about this. From what I tested, using Option here doesn't come for free in terms of the extra code that gets generated. My main goal with the Java types here was completeness and enriching some of the tests.

@imarios
Contributor Author

imarios commented Jan 31, 2017

@OlivierBlanvillain looks to be OK for now! I didn't see any serious issues with 2.1.0.

@OlivierBlanvillain
Contributor

@imarios
Contributor Author

imarios commented Feb 1, 2017

@OlivierBlanvillain I didn't really see any problems in our code. We can try to add an exclusion rule on the Spark dependency to make sure our own shapeless is used. Otherwise we will need to publish shapeless under a separate Maven coordinate so the classpaths don't collide. Shading will not really work for us here.
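The exclusion-rule idea could look roughly like this in sbt. This is a hypothetical fragment: the module that drags shapeless in and the exact coordinates are assumptions, not verified against Spark's POMs.

```scala
// Hypothetical build.sbt fragment: drop Spark's transitive shapeless so
// frameless's own shapeless is the only copy on the classpath.
libraryDependencies ++= Seq(
  ("org.apache.spark" %% "spark-mllib" % "2.1.0" % "provided")
    .exclude("com.chuusai", "shapeless_2.11"),
  "com.chuusai" %% "shapeless" % "2.3.2"
)
```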

@julien-truffaut
Contributor

Will there be a way to get support for several Spark versions, e.g. 2.0 and 2.1?

@imarios
Contributor Author

imarios commented Feb 1, 2017

@julien-truffaut I think the best approach is to release a new version of Frameless for every new version of Spark and have a table saying: for Spark 2.0.x use Frameless 0.2.0, for Spark 2.1.0 use Frameless 0.3.0, etc. Supporting multiple Spark versions in every release would be a lot of extra work for us (testing and validating all new features across different Spark versions). We might want to do that later if we see people asking for it and we have more contributors to actually take on the challenge.

@kanterov
Contributor

kanterov commented Feb 2, 2017

Fixed problems with UDFs in #107

@imarios
Contributor Author

imarios commented Feb 19, 2017

Hey @kanterov, let me know if you want any changes for this PR. thanks!

@jeremyrsmith
Contributor

@imarios hate to be the one to say it, but I really want to use frameless at $work (hence the sudden flurry of PRs) and we're stuck on 2.0 for the time being. If someone needs to maintain cross-building for 2.0 and 2.1 then I'll volunteer, as long as we can drop Scala 2.10 (hopefully nobody is using Spark 2 on Scala 2.10?)
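Cross-building along those lines could be sketched in sbt like so. The property name and version numbers are hypothetical, not taken from the repo's build.

```scala
// Hypothetical build.sbt fragment: pick the Spark version at build time,
// e.g.  sbt -DsparkVersion=2.0.2 test
val sparkVersion = sys.props.getOrElse("sparkVersion", "2.1.0")

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided"
)

// Dropping Scala 2.10, as proposed: cross-build only for 2.11.
crossScalaVersions := Seq("2.11.8")
```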

@imarios
Contributor Author

imarios commented Feb 19, 2017

@jeremyrsmith so glad that you guys use it at work! I think it's a good idea to move Frameless together with Spark. Maybe not same-day releases, but if we are making a new release it should be against the latest Spark. The assumption here is that anyone using frameless in production is also brave enough to update their Spark dependencies as new versions come out :D

@OlivierBlanvillain
Contributor

@imarios Could you explain the need for this injections object? If it's only for testing I would rather have it moved to tests...

@imarios
Contributor Author

imarios commented Apr 24, 2017

@OlivierBlanvillain I will actually revisit this PR. Maybe close it and open two separate ones: one for boosting Spark and one for the Injection tests. Right now I wouldn't consider this for 0.3.0.

@imarios imarios closed this May 27, 2017
6 participants