[SPARK-16286][SQL] Implement stack table generating function #14033
Changes from all commits
51738d7
4c5ed83
8abcab5
9913dd2
7ed75b4
6cef803
4445248
@@ -93,6 +93,59 @@ case class UserDefinedGenerator( | |
override def toString: String = s"UserDefinedGenerator(${children.mkString(",")})" | ||
} | ||
|
||
/** | ||
* Separate v1, ..., vk into n rows. Each row will have ceil(k/n) columns; missing trailing values are NULL. n must be constant. | ||
* {{{ | ||
* SELECT stack(2, 1, 2, 3) -> | ||
* 1 2 | ||
* 3 NULL | ||
* }}} | ||
*/ | ||
@ExpressionDescription( | ||
usage = "_FUNC_(n, v1, ..., vk) - Separate v1, ..., vk into n rows.", | ||
extended = "> SELECT _FUNC_(2, 1, 2, 3);\n [1,2]\n [3,null]") | ||
case class Stack(children: Seq[Expression]) | ||
extends Expression with Generator with CodegenFallback { | ||
|
||
private lazy val numRows = children.head.eval().asInstanceOf[Int] | ||
private lazy val numFields = Math.ceil((children.length - 1.0) / numRows).toInt | ||
|
||
override def checkInputDataTypes(): TypeCheckResult = { | ||
if (children.length <= 1) { | ||
TypeCheckResult.TypeCheckFailure(s"$prettyName requires at least 2 arguments.") | ||
} else if (children.head.dataType != IntegerType || !children.head.foldable || numRows < 1) { | ||
TypeCheckResult.TypeCheckFailure("The number of rows must be a positive constant integer.") | ||
include the value of
?
||
} else { | ||
for (i <- 1 until children.length) { | ||
val j = (i - 1) % numFields | ||
if (children(i).dataType != elementSchema.fields(j).dataType) { | ||
return TypeCheckResult.TypeCheckFailure( | ||
s"Argument ${j + 1} (${elementSchema.fields(j).dataType}) != " + | ||
It's better to say
I wasn't sure for 1st and 2nd. So, I borrowed
Should I replace to
not a big deal,
I see. Thank you for this.
||
s"Argument $i (${children(i).dataType})") | ||
} | ||
} | ||
TypeCheckResult.TypeCheckSuccess | ||
} | ||
} | ||
|
||
override def elementSchema: StructType = | ||
StructType(children.tail.take(numFields).zipWithIndex.map { | ||
case (e, index) => StructField(s"col$index", e.dataType) | ||
}) | ||
|
||
override def eval(input: InternalRow): TraversableOnce[InternalRow] = { | ||
val values = children.tail.map(_.eval(input)).toArray | ||
for (row <- 0 until numRows) yield { | ||
val fields = new Array[Any](numFields) | ||
for (col <- 0 until numFields) { | ||
val index = row * numFields + col | ||
fields.update(col, if (index < values.length) values(index) else null) | ||
} | ||
InternalRow(fields: _*) | ||
} | ||
} | ||
} | ||
|
||
/** | ||
* A base class for Explode and PosExplode | ||
*/ | ||
|
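For readers following the row layout in `eval`, here is a minimal plain-Scala sketch of the same index arithmetic. It does not use Spark's `Expression` or `InternalRow` types: `StackSketch`, its `stack` method, and the use of `Option[Int]` in place of nullable SQL values are assumptions made for illustration only, not part of this PR.

```scala
// Illustrative sketch only; StackSketch is a hypothetical name, not Spark code.
object StackSketch {
  // Lays out `values` row by row into `numRows` rows.
  // None stands in for SQL NULL when the values run out.
  def stack(numRows: Int, values: Seq[Int]): Seq[Seq[Option[Int]]] = {
    require(numRows >= 1, "The number of rows must be a positive integer.")
    // Same arithmetic as numFields in the diff: ceil(k / n).
    val numFields = math.ceil(values.length.toDouble / numRows).toInt
    for (row <- 0 until numRows) yield {
      for (col <- 0 until numFields) yield {
        val index = row * numFields + col
        if (index < values.length) Some(values(index)) else None
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Mirrors the doc-comment example: SELECT stack(2, 1, 2, 3)
    stack(2, Seq(1, 2, 3)).foreach { row =>
      println(row.map(_.fold("NULL")(_.toString)).mkString(" "))
    }
  }
}
```

With three values and two rows, `numFields` is 2, so the second row is padded with one missing value, matching the `3 NULL` row in the doc comment.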
Can you also include the number of args passed and the args?
What do you mean? Could you give an example of what you want?
Sorry, I merged this PR before seeing your comments. Yes, including the number of args makes the error message friendlier, but it's not a big deal. @dongjoon-hyun, you can fix it in your next PR.
Ah. Now I understand what you mean. Sure, someday later.
This pattern may be common, so we can fix all of these error messages together in a single PR.
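As one possible reading of the request above, a message that reports the argument count and the arguments themselves might look like the sketch below. `ArgCountMessageSketch` and `tooFewArgsMessage` are hypothetical names, the arguments are passed as plain strings rather than Spark expressions, and the wording is an assumption, not what was eventually merged.

```scala
// Hypothetical helper for illustration; not part of this PR or of Spark.
object ArgCountMessageSketch {
  // Builds a "too few arguments" message that also reports what was passed.
  def tooFewArgsMessage(prettyName: String, args: Seq[String]): String =
    s"$prettyName requires at least 2 arguments, " +
      s"but got ${args.length}: ${args.mkString("(", ", ", ")")}."

  def main(cliArgs: Array[String]): Unit = {
    println(tooFewArgsMessage("stack", Seq("2")))
  }
}
```

Keeping the message in a small helper like this is one way the same pattern could be applied to other functions in a single follow-up change.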