
[SPARK-43939][CONNECT][PYTHON] Add try_* functions to Scala and Python #41653

Closed · 8 commits

Conversation

@panbingkun (Contributor) commented Jun 19, 2023

What changes were proposed in this pull request?

Add the following functions (a short usage sketch follows below):

  • try_add
  • try_avg
  • try_divide
  • try_element_at
  • try_multiply
  • try_subtract
  • try_sum
  • try_to_binary
  • try_to_number
  • try_to_timestamp

to:

  • Scala API
  • Python API
  • Spark Connect Scala Client
  • Spark Connect Python Client
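
A minimal usage sketch of the new Scala API (this assumes an active `SparkSession`; the literals and column aliases are illustrative only):

import org.apache.spark.sql.functions._

spark.range(1).select(
  try_add(lit(Int.MaxValue), lit(1)).as("overflow_add"),  // overflow => null instead of an error
  try_divide(lit(1), lit(0)).as("div_by_zero"),           // divisor 0 => null
  try_to_number(lit("abc"), lit("999")).as("bad_parse")   // unparseable input => null
).show()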

Why are the changes needed?

For parity: these functions already exist in SQL but are missing from the Scala and Python APIs.

Does this PR introduce any user-facing change?

Yes, new functions.

How was this patch tested?

  • Added new unit tests.

* Returns `dividend`/`divisor`. It always performs floating point division. Its result is
* always null if `divisor` is 0.
*
* @note
Contributor:

In this PR, let's use `call_udf` for better parity.

panbingkun (author):

👌🏻

* Returns the sum of `left` and `right` and the result is null on overflow. The acceptable
* input types are the same as those of the `+` operator.
*
* @note
Contributor:

It should be supported naturally in Connect.

* Returns `left``*``right` and the result is null on overflow. The acceptable input types are the
* same with the `*` operator.
*
* @note
Contributor:

ditto

* Returns `left` - `right` and the result is null on overflow. The acceptable input types are the
* same as those of the `-` operator.
*
* @note
Contributor:

ditto


.. versionadded:: 3.5.0

Notes
Contributor:

ditto


.. versionadded:: 3.5.0

Notes
Contributor:

ditto


.. versionadded:: 3.5.0

Notes
Contributor:

ditto

* Returns the sum of `left` and `right` and the result is null on overflow. The acceptable
* input types are the same as those of the `+` operator.
*
* @note
Contributor:

Please check the similar functions in the vanilla Scala APIs.

* @since 3.5.0
*/
def try_add(left: Column, right: Column): Column = {
  call_udf("try_add", left, right)
}
Contributor:

Sorry, let's directly use `UnresolvedFunction` for this case, since there are already some similar cases in `functions`, e.g.:

/**
 * Left-pad the binary column with pad to a byte length of len. If the binary column is longer
 * than len, the return value is shortened to len bytes.
 *
 * @group string_funcs
 * @since 3.3.0
 */
def lpad(str: Column, len: Int, pad: Array[Byte]): Column = withExpr {
  UnresolvedFunction("lpad", Seq(str.expr, lit(len).expr, lit(pad).expr), isDistinct = false)
}
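
Applied to this PR, `try_add` could then look roughly like the following (a sketch of the suggested pattern, not necessarily the code that was finally merged):

def try_add(left: Column, right: Column): Column = withExpr {
  // Resolve the built-in "try_add" SQL function by name, as in the lpad example above.
  UnresolvedFunction("try_add", Seq(left.expr, right.expr), isDistinct = false)
}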

panbingkun (author):

Okay, let me first test whether there is any issue with the scaladoc.

zhengruifeng (Contributor):
Thank you @panbingkun, merged to master.

@try_remote_functions
def try_element_at(col: "ColumnOrName", extraction: "ColumnOrName") -> Column:
    """
    (array, index) - Returns element of array at given (1-based) index. If Index is 0, Spark will
Contributor:

@panbingkun For `try_element_at`, maybe we should unify the second parameter's name to `index`, in both Scala and Python.

panbingkun (author):

Let me fix it in a follow-up PR.

Contributor:

@panbingkun I just noticed that `try_element_at` uses the same parameter names as `element_at`, so the current commit is fine. We don't need any follow-up.
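
For context, `try_element_at` keeps the 1-based indexing of `element_at` but yields null instead of an error when the index exceeds the array length. A minimal Scala sketch (assuming a `SparkSession` named `spark`; the data is illustrative):

import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(Seq(1, 2, 3)).toDF("arr")
df.select(
  try_element_at($"arr", lit(2)).as("second"),  // returns 2
  try_element_at($"arr", lit(9)).as("oob")      // index out of range => null
).show()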
