-
Notifications
You must be signed in to change notification settings - Fork 703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate methods for flattening the results of many joins #1097
Conversation
These look like bijections rather than something for scalding? |
implicit class LeftNestedAlternatingTupleOptionFlatten22[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V](nested: (Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[(Option[A], Option[B])], Option[C])], Option[D])], Option[E])], Option[F])], Option[G])], Option[H])], Option[I])], Option[J])], Option[K])], Option[L])], Option[M])], Option[N])], Option[O])], Option[P])], Option[Q])], Option[R])], Option[S])], Option[T])], Option[U])], Option[V]) ) { | ||
def flatten: (Option[A], Option[B], Option[C], Option[D], Option[E], Option[F], Option[G], Option[H], Option[I], Option[J], Option[K], Option[L], Option[M], Option[N], Option[O], Option[P], Option[Q], Option[R], Option[S], Option[T], Option[U], Option[V]) = { | ||
val (rest1, v) = nested | ||
val (rest2, u) = rest1.getOrElse(pairOfNones) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. How does it compare to:
val (rest21, v) = nested
val t21 = LeftNestedAlternatingTupleOptionFlatten21(rest21)
(t21._1, t21._2 ....t21._21, v)
performance (kind of recursive calls vs getOrElse calls) might just the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that.
It would work, though now you're making nested function calls, so it seems like it would be slightly more expensive.
The other option would be to short circuit after the first .getOrElse
is triggered, but to do that is tricky.
'flatten' normally removes Nones in the context of collections. The context is quite different here and I can't think of a better name, but it could cause some confusion during casual read of the code. |
@rangadi maybe, but this is a method on a tuple, not on a collection (don't know if that makes it better)
The nice thing about generating these methods vs. a macro is that this 'flatten' method will only be available on a Tuple with exactly this structure, and if you look at a tuple of this structure and wonder "what would it mean to flatten this?" I think the answer is what I've got here. If you look at the outer join case, the result of (Option[A], Option[B]...) is about as flat as you can get unless you went with Option[(A, B...)] which isn't very useful and seems like it wouldn't be called flatten? Any ideas for a better name? |
@ianoc don't know if you saw the discussion in #1096 for context These could be written as Bijections (though I don't think the inverse is useful even though it is possible), but I think it wouldn't be as easy to use, eg you would have to do: As for whether it belongs in the Bijection codebase -- maybe? As far as I can tell you would only ever have tuples with this odd structure as a result of a scalding join, so the motivation for these is to help with the output of scalding joins. I think long-term, the best would be to have this happen automatically whenever there's a join followed by a join? I guess the best API would be if:
just automatically returned you a |
I agree, flatten is a very good name. I can't think of good alternative. |
What if you generate the methods for
Which just added the right |
@avibryant that would work, but would it make more sense for putting it on |
I was assuming you'd just inline it, since as you point out the methods are not really very useful in other contexts. |
Yeah makes sense. I'll update with that + the overloaded join methods mentioned in #1096 |
re: serialization -- we'll sidestep this one by only doing join1 through 6, and just inline them CoGroupable (though I'd like to get to the bottom of the fear of self: bounds someday) re: fancy types -- done! good call. re: "Left" well, I didn't know what to name the difference between: Except that the first is something you only get as a result of leftJoins and the second is something you only get as a result of outerJoins I need to also add rightJoins.. |
OK, even with only join2-6 it's 83 extra lines in CoGroupable so I think we should leave it in a separate trait |
Updated to use better types, rename to flattenValueTuple, removed some extra { }, removed use of |
} | ||
} | ||
|
||
implicit def toFlattenLeftJoin[KEY, KLL[KLL_K, +KLL_V] <: KeyedListLike[KLL_K, KLL_V, KLL], A, B, C](nested: KLL[KEY, ((A, B), C)]) = new FlattenLeftJoin3(nested) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you could not have two implicits with the same name. You sure these resolve? I think they shadow each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that might be why it wasn't working :)
I haven't had a chance to look into it
working on this so it can go out today in 0.12 |
Seems sane, would love to see a simple test even just to act as a scaladoc on this if nothing else |
I added a test that calls MultiJoin. This code is so simple, the compiler is really doing most of the work for us. A large set of errors are ruled out due to the types. |
Generate methods for flattening the results of many joins
Thanks @johnynek !
|
...looks like @ianoc merged already :) |
Well discuss away... I haven't finished the release yet... |
I felt like we got consensus after @julienledem and @avibryant fine tuned the .multi proposal. There was a feeling (from me, and strong +1 from Avi) that adding 66 methods to CoGroupable was too much clutter. |
Here's what the code-gen based solution to #1096 looks like (I'm fine with using a macro instead, but I just wanted to see how hard this would be and it was pretty straightforward).
works for both the result of left/right joins:
And for the result of outer joins:
Doesn't have any tests yet, and the generator is a bit long.