Approx Percentile #3301
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Force-pushed from 6d42899 to 8527f94
… is at least as accurate as CPU
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuApproximatePercentile.scala
override def aggBufferAttributes: Seq[AttributeReference] = outputBuf :: Nil

// Mark as lazy to avoid being initialized when creating a GpuApproximatePercentile.
override lazy val initialValues: Seq[GpuExpression] = throw new UnsupportedOperationException
I think this is needed when a given task gets no data, or for a reduction where no data showed up at all. In those cases, initialValues ends up being what is returned.
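To illustrate the point, here is a minimal sketch in plain Scala of why a reduction over zero batches returns the initial value. It does not use Spark or the plugin classes; `SumAgg` and `reduceBatches` are hypothetical names invented for this example, and a simple sum stands in for the real aggregation.

```scala
// Hypothetical, simplified model of an aggregate; not the plugin's API.
final case class SumAgg(initialValue: Long) {
  // Fold every batch into the buffer, starting from the initial value.
  // With no batches at all, the fold returns initialValue unchanged.
  def reduceBatches(batches: Seq[Seq[Long]]): Long =
    batches.foldLeft(initialValue)((buf, batch) => buf + batch.sum)
}

object EmptyReductionDemo {
  def main(args: Array[String]): Unit = {
    val agg = SumAgg(initialValue = 0L)
    // Two non-empty batches are combined as usual.
    println(agg.reduceBatches(Seq(Seq(1L, 2L), Seq(3L))))
    // No batches: the result is simply the initial value.
    println(agg.reduceBatches(Seq.empty))
  }
}
```

If the initial value cannot be produced, as with an initializer that throws, the empty-input case has nothing to fall back on.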
Okay. I will work on writing a test case to trigger this exception and then implement something here.
Any update on this?
It looks like this is only invoked in reduction use cases when there are empty batches. We don't support reduction yet for this operator, so I have updated the exception to indicate that.
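For readers following along, a minimal sketch of the pattern being discussed: a `lazy val` whose initializer throws, so that constructing the object is safe and the error only surfaces if something actually reads the value. `SketchAggregate` and the exception message are hypothetical and are not the code or wording used in this PR.

```scala
import scala.util.Try

// Hypothetical stand-in for an aggregate function; not the plugin's class.
class SketchAggregate {
  // Evaluated only on first access, so construction never throws.
  lazy val initialValues: Seq[Long] =
    throw new UnsupportedOperationException(
      "reduction is not supported for this aggregate (illustrative message)")
}

object LazyInitialValuesDemo {
  def main(args: Array[String]): Unit = {
    // Creating the instance does not evaluate the lazy val.
    val agg = new SketchAggregate
    // Reading the value runs the initializer, which throws.
    val result = Try(agg.initialValues)
    println(result.isFailure)
  }
}
```

A descriptive message makes it clearer to whoever hits the exception that the unsupported path is the reduction case rather than a bug elsewhere.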
…tePercentile.scala Co-authored-by: Alessandro Bellina <abellina@gmail.com>
… into approx-percentile
// The result type is the same as the input type.
private lazy val internalDataType: DataType = {
  if (returnPercentileArray) ArrayType(child.dataType,
Nit: the indentation seems a little odd in this if statement.
override def columnarEval(batch: ColumnarBatch): Any = {
  val expr = child.asInstanceOf[GpuExpression]
  withResource(expr.columnarEval(batch).asInstanceOf[GpuColumnVector]) { cv =>

Nit: extra blank line.
build
build
build failed with connection error, will try again
build
Signed-off-by: Andy Grove <andygrove@nvidia.com>
Depends on rapidsai/cudf#9094
Status: