New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-42725][CONNECT][PYTHON] Make LiteralExpression support array params #40349
Conversation
28004bc
to
012b9df
Compare
@@ -277,14 +281,26 @@ def _infer_type(cls, value: Any) -> DataType: | |||
return DateType() | |||
elif isinstance(value, datetime.timedelta): | |||
return DayTimeIntervalType() | |||
else: | |||
elif isinstance(value, np.generic): | |||
if isinstance(value, np.generic): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems duplicate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
@@ -437,7 +436,6 @@ def test_literal_with_unsupported_type(self): | |||
(0.1, DecimalType()), | |||
(datetime.date(2022, 12, 13), TimestampType()), | |||
(datetime.timedelta(1, 2, 3), DateType()), | |||
([1, 2, 3], ArrayType(IntegerType())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add test for Array<Array> type as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arrayType was not support before, here just check the raised AssertionError
self.assertEqual(l2.array.elements[1].double, -3.0) | ||
self.assertEqual(l2.array.elements[2].double, 0.0) | ||
|
||
l3 = LiteralExpression._from_value([[3, 4], [5, 6, 7]]).to_plan(None).literal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WeichenXu123 l3
and l4
are array < array >
012b9df
to
3411ecd
Compare
add a test for array_string
3411ecd
to
518ffb2
Compare
…arams ### What changes were proposed in this pull request? Make LiteralExpression support array ### Why are the changes needed? MLIib requires literal to carry the array params, like `IntArrayParam`, `DoubleArrayArrayParam`. Note that this PR doesn't affect existing `functions.lit` method which apply unresolved `CreateArray` expression to support array input. ### Does this PR introduce _any_ user-facing change? No, dev-only ### How was this patch tested? added UT Closes #40349 from zhengruifeng/connect_py_ml_lit. Authored-by: Ruifeng Zheng <ruifengz@apache.org> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org> (cherry picked from commit d6d0fc7) Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
merged into master/branch-3.4 |
DataType elementType = 1; | ||
repeated Literal element = 2; | ||
DataType element_type = 1; | ||
repeated Literal elements = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I first defined it, I confirmed to @hvanhovell that it should be element
rather than elements
, so I defined it as element
.
At the same time, I found that we sometimes use the singular and sometimes the plural in the definition of the repeated
field.
spark/connector/connect/common/src/main/protobuf/spark/connect/expressions.proto
Lines 62 to 65 in 79b5abe
repeated Expression partition_spec = 2; | |
// (Optional) Ordering of rows in a partition. | |
repeated SortOrder order_spec = 3; |
spark/connector/connect/common/src/main/protobuf/spark/connect/expressions.proto
Line 288 in 79b5abe
repeated string name = 2; |
I think we should have a unified rule.
…arams ### What changes were proposed in this pull request? Make LiteralExpression support array ### Why are the changes needed? MLIib requires literal to carry the array params, like `IntArrayParam`, `DoubleArrayArrayParam`. Note that this PR doesn't affect existing `functions.lit` method which apply unresolved `CreateArray` expression to support array input. ### Does this PR introduce _any_ user-facing change? No, dev-only ### How was this patch tested? added UT Closes apache#40349 from zhengruifeng/connect_py_ml_lit. Authored-by: Ruifeng Zheng <ruifengz@apache.org> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org> (cherry picked from commit d6d0fc7) Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
What changes were proposed in this pull request?
Make LiteralExpression support array
Why are the changes needed?
MLIib requires literal to carry the array params, like
IntArrayParam
,DoubleArrayArrayParam
.Note that this PR doesn't affect existing
functions.lit
method which apply unresolvedCreateArray
expression to support array input.Does this PR introduce any user-facing change?
No, dev-only
How was this patch tested?
added UT