feat(spark): bitwise functions#309
Conversation
3bf4782 to
89a79ba
Compare
| extends TypeVisitor.TypeThrowsVisitor[DataType, RuntimeException]("Unknown expression type.") { | ||
|
|
||
| override def visit(expr: Type.I8): DataType = ByteType | ||
| override def visit(expr: Type.I16): DataType = ShortType |
There was a problem hiding this comment.
while you're at it, mind adding these also to ToSparkType?
There was a problem hiding this comment.
This is in ToSparkType. The conversion in ToSubstraitType is already there further down this file. It converts both ways.
| } | ||
|
|
||
| override def visit(expr: SExpression.I16Literal): Expression = { | ||
| Literal(expr.value().asInstanceOf[Short], ToSubstraitType.convert(expr.getType)) |
There was a problem hiding this comment.
and here also the other direction for the conversion?
There was a problem hiding this comment.
It's already covered in the other direction in ToSubstraitExpression on the following line:
case SubstraitLiteral(substraitLiteral) => Some(substraitLiteral)
The unapply method in this object invokes the conversion.
| "org.apache.spark.sql.catalyst.expressions.PromotePrecision") => | ||
| translateUp(p.children.head) | ||
| case CaseWhen(branches, elseValue) => translateCaseWhen(branches, elseValue) | ||
| case InSet(value, set) => translateIn(value, set.toSeq.map(v => Literal(v))) |
There was a problem hiding this comment.
nit: mind moving this next to the case In(..)?
There was a problem hiding this comment.
Sure, although InSet needs to come before ScalarFunction otherwise it matches the latter (since it's a UnaryExpression). I've moved In up the list so they are together.
| Params: | ||
| base – the base number to shift. | ||
| shift – number of bits to right shift. | ||
| impls: |
There was a problem hiding this comment.
It looks like in Spark the base can be either int or long, and return type is set accordingly. I we should add both options?
Quickly testing this, it works for select shiftright(col, 2) from (values (bigint(1234)) as table(col)) but not for select shiftright(col, 2) from (values (1234) as table(col)), so yep I think we need to list both versions here.
There was a problem hiding this comment.
Overall I think it's fine to add the function here initially, but it'd be good to also file the PR/Issue on the core functions since this seems general enough, I think :)
There was a problem hiding this comment.
I agree, although that might take a bit longer ;)
89a79ba to
8544c81
Compare
|
I'll fixup the conflict :) |
8544c81 to
024b800
Compare
Adds support in the spark module for 8-bit and 16-bit integer types and for some bitwise functions. The catalyst optimizer generates expressions using these for certain query types. Note that `shift_right` (and other bit shifting functions) might want to be considered for the core substrait function catalog, but it has been added here (temporarily?) as spark extension pending a longer term discussion/decision on their wider utility. Signed-off-by: Andrew Coleman <andrew_coleman@uk.ibm.com>
024b800 to
dda5582
Compare
Adds support in the spark module for 8-bit and 16-bit integer types and for some bitwise functions. The catalyst optimizer generates expressions using these for certain query types.
Note that
shift_right(and other bit shifting functions) might want to be considered for the core substrait function catalog, but it has been added here (temporarily?) as spark extension pending a longer term discussion/decision on their wider utility.