feat: Add ArrayType and explode function #25
base: main
@@ -86,6 +86,14 @@ object Encoder:
type ColumnType = DoubleOptType
def catalystType = sql.types.DoubleType

inline given arrayFromMirror[A](using encoder: Encoder[A]): (Encoder[Seq[A]] { type ColumnType = ArrayOptType[encoder.ColumnType] }) =
Adding basic support for arrays probably deserves a separate PR of its own, as it's slightly more complex: we should reuse the encoders of the element types and support both nullable and non-nullable arrays. I have a draft implementation locally, mixed with other changes, but I'll try to extract it and get it merged to main.
import org.virtuslab.iskra.api.*
import functions.explode

case class Foo(ints: Seq[Int])
We should consider more cases here: not only Seq[Int] but also Seq[Option[Int]], Option[Seq[Int]] and Option[Seq[Option[Int]]], and check how these should behave at compile time and at runtime.
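To make the four shapes concrete, here is a minimal Scala 3 sketch of an array encoder that reuses the element encoder, as suggested earlier in this review. It does not use the real iskra or Spark types: `CatalystType`, `isNullable` and the given instances below are simplified stand-ins invented for illustration, and the actual design may well differ.

```scala
// Simplified stand-ins for Catalyst types; NOT the real iskra or Spark API.
enum CatalystType:
  case IntegerType
  case ArrayType(element: CatalystType, containsNull: Boolean)

// Hypothetical, stripped-down encoder: a column type plus a nullability flag.
trait Encoder[A]:
  def catalystType: CatalystType
  def isNullable: Boolean

object Encoder:
  given Encoder[Int] with
    def catalystType: CatalystType = CatalystType.IntegerType
    def isNullable: Boolean = false

  // Option marks the value as nullable but keeps the underlying type.
  given [A](using e: Encoder[A]): Encoder[Option[A]] with
    def catalystType: CatalystType = e.catalystType
    def isNullable: Boolean = true

  // The array encoder reuses the element encoder for both the element type
  // and the element nullability; the array itself is non-nullable.
  given [A](using e: Encoder[A]): Encoder[Seq[A]] with
    def catalystType: CatalystType =
      CatalystType.ArrayType(e.catalystType, containsNull = e.isNullable)
    def isNullable: Boolean = false

// The four shapes, resolved at compile time.
val plain    = summon[Encoder[Seq[Int]]]
val optElems = summon[Encoder[Seq[Option[Int]]]]
val optArray = summon[Encoder[Option[Seq[Int]]]]
val optBoth  = summon[Encoder[Option[Seq[Option[Int]]]]]
```

In this model, Seq[Option[Int]] yields an array with containsNull = true, while Option[Seq[Int]] yields a nullable column holding an array with non-nullable elements. Whether the runtime behaviour should match this is exactly what the tests would need to pin down.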
Foo(Seq(1)),
Foo(Seq(2)),
Foo(Seq()),
Foo(null),
For maximal type safety, optional values should be represented by Option[...]. TBH I haven't yet thought about how to prevent users from using nulls explicitly. Maybe -Yexplicit-nulls could come to the rescue. Alternatively, we could perform some runtime assertions when toTypedDF is called. However, both of these would probably have to be opt-in.
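As a rough illustration of the runtime-assertion option, the sketch below walks case class instances and reports where nulls occur. `nullPaths` and `requireNoNulls` are hypothetical helpers, not part of iskra, and how such a check would be wired into toTypedDF is left open.

```scala
// Mirrors the case class from the test in this PR.
final case class Foo(ints: Seq[Int])

// Hypothetical helper: collects the paths of all null values reachable
// through Options, collections and case class fields.
def nullPaths(value: Any, path: String): List[String] = value match
  case null => List(path)
  case opt: Option[?] => opt.toList.flatMap(v => nullPaths(v, path))
  case it: Iterable[?] =>
    it.zipWithIndex.toList.flatMap((v, i) => nullPaths(v, s"$path[$i]"))
  case p: Product =>
    p.productElementNames.zip(p.productIterator).toList
      .flatMap((name, v) => nullPaths(v, s"$path.$name"))
  case _ => Nil

// Hypothetical opt-in assertion that could run before building a typed DataFrame.
def requireNoNulls[A](rows: Seq[A]): Seq[A] =
  val found = nullPaths(rows, "rows")
  require(found.isEmpty, s"null values found at: ${found.mkString(", ")}")
  rows
```

With the data from the test above, requireNoNulls would reject the Foo(null) row and name its position. Under -Yexplicit-nulls, Foo(null) should already be rejected by the compiler, which is why the two approaches are alternatives rather than complements.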
import org.virtuslab.iskra.Column
import org.virtuslab.iskra.types.{ ArrayOptType, DataType }

def explode[T <: DataType](c: Column[ArrayOptType[T]]): Column[T] = Column(sql.functions.explode(c.untyped))
We should have a way to prevent users from using explode more than once in the same select clause, as that would result in a runtime error. This constraint doesn't seem easy to express in the current model of iskra. However, I'm in the middle of a major redesign of the library's model, so I'll try to take this use case into account.
@nightscape thanks for your contribution! I'm afraid your changes can't be incorporated into the main branch at the moment, for the reasons mentioned in the comments, but I'll try to find some time to address them and unblock you.
@prolativ I hadn't seen your
Not sure if I understood everything correctly, I was mostly applying monkey see, monkey do 😉