Flatten operator for Optional columns. #307

imarios · 2018-06-19T05:01:46Z

val ds: TypedDataset[(Int,Option[Int])] = TypedDataset.create(Seq((1,Option(1))))
ds.flatten('_2): TypedDataset[(Int,Int)]

codecov-io · 2018-06-19T06:11:59Z

Codecov Report

Merging #307 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #307   +/-   ##
=======================================
  Coverage   94.79%   94.79%           
=======================================
  Files          52       52           
  Lines         961      961           
  Branches        9        9           
=======================================
  Hits          911      911           
  Misses         50       50

Impacted Files	Coverage Δ
...ataset/src/main/scala/frameless/TypedDataset.scala	`100% <ø> (ø)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c763176...a0e8c2e.

OlivierBlanvillain

Looks really cool! Am I right in saying that this is not in vanilla Spark? A few comments:

If it is not, it should probably be better documented, at least with an example.
The name makes it sound more general than it is: .flatten should be .flatMap(identity).
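
To make the naming concern concrete, here is a minimal sketch on plain Scala collections, with no Spark or frameless involved. `flattenColumn` is a hypothetical helper that only models the row-level behaviour proposed in this PR (unwrap one optional column, drop rows where it is empty). It is not the frameless implementation.

```scala
object FlattenNaming {
  // On standard collections, flatten behaves like flatMap with the
  // identity function: it removes one level of nesting everywhere.
  val nested: List[Option[Int]] = List(Some(1), None, Some(3))
  val viaFlatten: List[Int] = nested.flatten
  val viaFlatMap: List[Int] = nested.flatMap(x => x) // x => x is identity

  // Hypothetical model of the operator discussed here: rows are pairs,
  // only the second (optional) column is unwrapped, and rows where it
  // is None are dropped. The first column is carried along untouched.
  def flattenColumn[A, B](rows: Seq[(A, Option[B])]): Seq[(A, B)] =
    rows.collect { case (a, Some(b)) => (a, b) }

  val rows: Seq[(Int, Option[Int])] =
    Seq((1, Some(10)), (2, None), (3, Some(30)))
  val flattened: Seq[(Int, Int)] = flattenColumn(rows)
}
```

The first half is the general operation the name `flatten` suggests; the second half is narrower, since it targets a single column of a row, which is the gap the comment points at.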

imarios · 2018-06-27T03:52:49Z

@OlivierBlanvillain Yeah, I think in vanilla Spark you can use flatMap(_.a) on a Dataset, at the cost of having to serialize all the columns. Another way to do that with DataFrames is filter($"a".isNotNull), which is essentially the dynamically typed version of it. Both examples are worth documenting, so let me do that.
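
The two vanilla Spark routes mentioned above could look roughly like the sketch below. The case classes `Rec` and `Flat` and the method names are assumptions made for illustration; they do not come from this PR. Unlike `flatMap(_.a)`, which would keep only the unwrapped column, the typed variant here rebuilds a full row so the two routes are comparable. It needs Spark SQL on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoders}
import org.apache.spark.sql.functions.col

object VanillaAlternatives {
  // Hypothetical row types, used only for this illustration.
  final case class Rec(id: Int, a: Option[Int])
  final case class Flat(id: Int, a: Int)

  // Typed route: flatMap runs a Scala closure, so every row is
  // deserialized into a Rec (all columns) before it can be inspected.
  def typedFlatten(ds: Dataset[Rec]): Dataset[Flat] =
    ds.flatMap(r => r.a.map(v => Flat(r.id, v)).toList)(Encoders.product[Flat])

  // Untyped route: a column expression evaluated by Spark SQL.
  // col("a") is the same column reference as $"a". Rows with a null
  // "a" are removed, but the result type does not record that fact.
  def untypedFilter(ds: Dataset[Rec]): DataFrame =
    ds.toDF().filter(col("a").isNotNull)
}
```

The typed route yields a `Dataset[Flat]` whose type reflects the unwrapping but pays for the closure; the untyped route stays inside Spark SQL but leaves the result as a `DataFrame`. The operator in this PR aims to combine a column-level filter with a statically tracked result type.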

Do you have any alternative naming suggestions?

OlivierBlanvillain · 2018-06-27T06:31:22Z

I see, then maybe

flatMapOption
flatMapNotNull
flattenOption
flattenNotNull

?

imarios · 2018-06-29T05:33:54Z

@OlivierBlanvillain I think flattenOption feels more appropriate.

OlivierBlanvillain · 2018-06-29T07:19:28Z

LGTM! Could you maybe add a minimal example to the scaladoc of flattenOption?

For discoverability, what do you think about adding "new APIs" and "missing APIs" sections to the readme? This way someone familiar with Spark could pick up frameless in no time by looking at this diff.

imarios · 2018-07-01T02:54:21Z

@OlivierBlanvillain done! If there are no more suggestions, let me squash and merge.

OlivierBlanvillain · 2018-07-01T05:16:08Z

LGTM!

Flatten operator for Optional columns.

cbd850b

imarios requested a review from OlivierBlanvillain June 19, 2018 05:03

fix typo

39cd63b

imarios mentioned this pull request Jun 24, 2018

Add Na Function 'drop' #309

Closed

OlivierBlanvillain reviewed Jun 26, 2018

Renaming to flattenOption

4355649

Adding example in Scala Docs

a0e8c2e

imarios merged commit 3ad68b3 into typelevel:master Jul 1, 2018

imarios mentioned this pull request Jul 1, 2018

Projection from Option[B] to B #292

Closed
