Skip to content

feat(spark): add some numeric function mappings#317

Merged
Blizzara merged 1 commit intosubstrait-io:mainfrom
andrew-coleman:numeric_functions
Nov 26, 2024
Merged

feat(spark): add some numeric function mappings#317
Blizzara merged 1 commit intosubstrait-io:mainfrom
andrew-coleman:numeric_functions

Conversation

@andrew-coleman
Copy link
Copy Markdown
Member

No description provided.

Signed-off-by: Andrew Coleman <andrew_coleman@uk.ibm.com>
Copy link
Copy Markdown
Contributor

@Blizzara Blizzara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Approved but with a note on the StddevSamp

s[Max]("max"),
s[First]("any_value"),
s[HyperLogLogPlusPlus]("approx_count_distinct")
s[HyperLogLogPlusPlus]("approx_count_distinct"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tehcnically the "approx_count_distinct" says it's HyperLogLog while this is HLL++, but given the ++ should just be better that's probably fine!

s[First]("any_value"),
s[HyperLogLogPlusPlus]("approx_count_distinct")
s[HyperLogLogPlusPlus]("approx_count_distinct"),
s[StddevSamp]("std_dev")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Substrait's std_dev needs an option to specify if it's sample or population based: https://github.com/substrait-io/substrait/blob/9cccb04fba336489b70ed42b71f73a0a1e34f9f5/extensions/functions_arithmetic.yaml#L1335. Spark has both, as StddevSamp and StddevPop.

Still I think this is fine to merge liek this for now, just noting that plans produced without the option may not work as expected elsewhere or in future with Spark.

I have some code for handling the options which should work here, I'll try to push it up soon.

"q70", "q71", "q73", "q76", "q77", "q79",
"q80", "q81", "q82", "q85", "q86", "q87", "q88",
"q90", "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
val failingSQL: Set[String] = Set(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@andrew-coleman
Copy link
Copy Markdown
Member Author

@Blizzara @vbarua can this one be merged now? Thanks :)

@Blizzara Blizzara merged commit 6bb46ac into substrait-io:main Nov 26, 2024
@Blizzara
Copy link
Copy Markdown
Contributor

@Blizzara @vbarua can this one be merged now? Thanks :)

Ah yep, sorry for the delay!

@andrew-coleman andrew-coleman deleted the numeric_functions branch November 26, 2024 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants