Restructure collection encoders to resolve encoding issues in the REPL #195
Codecov Report
@@ Coverage Diff @@
## master #195 +/- ##
==========================================
- Coverage 95.7% 95.69% -0.01%
==========================================
Files 40 39 -1
Lines 814 790 -24
Branches 10 10
==========================================
- Hits 779 756 -23
+ Misses 35 34 -1
Continue to review full report at Codecov.
|
Hmm ... strange indeed. Were you able to reproduce the bug in the REPL? |
Yeah. Pretty weird. The offending code generation is at I'll continue looking later today |
@iravid thank you for looking into this! I'm really curious to see what is causing this ... |
@imarios Hey - this PR shouldn't be merged yet; I will update it soon(TM) with some restructuring of the typed encoders for collections. If by "wait" you meant the 0.4 release, it'd be good if we could wait a few more days to see if we can come up with a solution. |
@imarios This now resolves the issue. Our encoders are now more in line with how Spark serializes/deserializes collections. |
@iravid overall looks pretty good! I tried this and it works super smoothly:
case class X(a: Array[Array[Array[Long]]], b: Map[Array[String], Array[Map[Int,Int]]])
val t = TypedDataset.create(Seq(X(Array(Array(Array(12323L))), Map(Array("a","aasdas")->Array(Map(10->10), Map(1->10, 2->20))))))
t.collect().run.head.b.keys.head(1) |
Awesome :-) |
I’ll work on the coverage shortly |
@iravid great! can you also take a look at the 2 comments I made? Thank you very much for taking the time to re-write this!! |
@imarios did you submit the review? I don't see any comments |
def toCatalyst(path: Expression): Expression =
  T.jvmRepr match {
    case IntegerType | LongType | DoubleType | FloatType | ShortType | BooleanType =>
CharType does not exist, right?
No, there is no such type. I tested again whether Array[Char] works; still no luck.
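For readers following the thread, here is a toy sketch of the pattern in the hunk above: a match that lists a fixed set of primitive element types, with everything else going to a fallback. Every name in it (`ToyTypes`, `ElemType`, `branchFor`, the two branch objects, `OtherType`) is hypothetical. It does not use Spark's `DataType` classes, and what the PR does inside each branch is not shown in this excerpt.

```scala
// Toy model only: these objects merely share names with the types listed in
// the diff. They are NOT Spark's DataType hierarchy.
object ToyTypes {
  sealed trait ElemType
  case object IntegerType extends ElemType
  case object LongType extends ElemType
  case object DoubleType extends ElemType
  case object FloatType extends ElemType
  case object ShortType extends ElemType
  case object BooleanType extends ElemType
  // Stands in for any element type that is not in the list above.
  final case class OtherType(name: String) extends ElemType

  // Placeholder results; the real branches are not visible in this excerpt.
  sealed trait Branch
  case object PrimitiveBranch extends Branch
  case object FallbackBranch extends Branch

  // Same shape as the match in the diff: an alternative pattern over the
  // listed primitive types, plus a catch-all for everything else.
  def branchFor(t: ElemType): Branch = t match {
    case IntegerType | LongType | DoubleType | FloatType | ShortType | BooleanType =>
      PrimitiveBranch
    case _ =>
      FallbackBranch
  }
}
```

In this toy model, a character element type has no case of its own, so `ToyTypes.branchFor(ToyTypes.OtherType("char"))` takes the fallback. Whether that is related to the `Array[Char]` failure mentioned above is not established by this thread.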
@@ -196,6 +196,116 @@ object TypedEncoder {
    )
  }

  implicit def arrayEncoder[T: ClassTag](
    implicit
      T: TypedEncoder[T]
Reuse of T looks kind of strange to me ...
Sorry, habits from cats :-) I'll change it to underlying
I think it's a cute idea :) ... but it does look kind of funky
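To illustrate the naming question in this thread, here is a minimal, self-contained sketch of an array instance whose element evidence is called `underlying` rather than sharing the type parameter's name. The `Codec` trait, its `toRepr`/`fromRepr` methods, and `roundTrip` are hypothetical stand-ins, not the `TypedEncoder` API from this PR.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-in for a typed encoder; NOT the frameless trait.
trait Codec[A] {
  def toRepr(a: A): Any      // JVM value -> generic representation
  def fromRepr(repr: Any): A // generic representation -> JVM value
}

object Codec {
  implicit val longCodec: Codec[Long] = new Codec[Long] {
    def toRepr(a: Long): Any = a
    def fromRepr(repr: Any): Long = repr.asInstanceOf[Long]
  }

  // The context bound supplies the ClassTag needed to rebuild an Array[T];
  // the element codec gets a descriptive name instead of reusing `T`.
  implicit def arrayCodec[T: ClassTag](
    implicit underlying: Codec[T]
  ): Codec[Array[T]] = new Codec[Array[T]] {
    def toRepr(a: Array[T]): Any =
      a.toVector.map(underlying.toRepr)
    def fromRepr(repr: Any): Array[T] =
      repr.asInstanceOf[Vector[Any]].map(underlying.fromRepr).toArray
  }

  // Encodes a value and decodes it again using whatever codec is in scope.
  def roundTrip[A](a: A)(implicit codec: Codec[A]): A =
    codec.fromRepr(codec.toRepr(a))
}
```

Because `arrayCodec` only asks for a codec of its element type, nested arrays resolve recursively: `Codec.roundTrip(Array(Array(1L, 2L), Array(3L)))` finds the instance for `Array[Array[Long]]` by way of the one for `Array[Long]`.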
@iravid yup, didn't click submit :/ |
Looks great. If you have time to improve coverage, that would be awesome. This will be the last PR before 0.4. I think @adelbertc is waiting for us to give him the go-ahead to publish master as 0.4. |
cf2f8fb to 1b2aa37
Coverage looks good now @imarios |
@GrafBlutwurst btw - this also solves the issue of encoding |
LGTM! |
@imarios - I added tests to try and exercise the scenario you described in #193, but they seem to pass locally. Any chance that this is a REPL thing?