[SPARK-48766][PYTHON] Document the behavior difference of `extraction` between `element_at` and `try_element_at`

### What changes were proposed in this pull request?
Document the behavior difference of `extraction` between `element_at` and `try_element_at`.

### Why are the changes needed?
When the function `try_element_at` was introduced in 3.5, its `extraction` handling was unintentionally inconsistent with that of `element_at`, which causes confusion.
This PR documents this behavior difference (I don't think we can fix it, since that would be a breaking change).

```
In [1]: from pyspark.sql import functions as sf

In [2]: df = spark.createDataFrame([({"a": 1.0, "b": 2.0}, "a")], ['data', 'b'])

In [3]: df.select(sf.try_element_at(df.data, 'b')).show()
+-----------------------+
|try_element_at(data, b)|
+-----------------------+
|                    1.0|
+-----------------------+

In [4]: df.select(sf.element_at(df.data, 'b')).show()
+-------------------+
|element_at(data, b)|
+-------------------+
|                2.0|
+-------------------+
```

### Does this PR introduce _any_ user-facing change?
doc changes

### How was this patch tested?
ci, added doctests

### Was this patch authored or co-authored using generative AI tooling?
no

Closes apache#47161 from zhengruifeng/doc_element_at_extraction.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
HyukjinKwon · Jul 1, 2024 · 5ac7c9b
1 parent 5c29d8d
Showing 1 changed file with 36 additions and 0 deletions.
diff --git a/python/pyspark/sql/functions/builtin.py b/python/pyspark/sql/functions/builtin.py
@@ -14098,10 +14098,13 @@ def element_at(col: "ColumnOrName", extraction: Any) -> Column:
     Notes
     -----
     The position is not zero based, but 1 based index.
+    If extraction is a string, :meth:`element_at` treats it as a literal string,
+    while :meth:`try_element_at` treats it as a column name.
 
     See Also
     --------
     :meth:`get`
+    :meth:`try_element_at`
 
     Examples
     --------
@@ -14148,6 +14151,17 @@ def element_at(col: "ColumnOrName", extraction: Any) -> Column:
     +-------------------+
     |               NULL|
     +-------------------+
+
+    Example 5: Getting a value from a map using a literal string as the key
+
+    >>> from pyspark.sql import functions as sf
+    >>> df = spark.createDataFrame([({"a": 1.0, "b": 2.0}, "a")], ['data', 'b'])
+    >>> df.select(sf.element_at(df.data, 'b')).show()
+    +-------------------+
+    |element_at(data, b)|
+    +-------------------+
+    |                2.0|
+    +-------------------+
     """
     return _invoke_function_over_columns("element_at", col, lit(extraction))
 
@@ -14172,6 +14186,17 @@ def try_element_at(col: "ColumnOrName", extraction: "ColumnOrName") -> Column:
     extraction :
         index to check for in array or key to check for in map
 
+    Notes
+    -----
+    The position is not zero based, but 1 based index.
+    If extraction is a string, :meth:`try_element_at` treats it as a column name,
+    while :meth:`element_at` treats it as a literal string.
+
+    See Also
+    --------
+    :meth:`get`
+    :meth:`element_at`
+
     Examples
     --------
     Example 1: Getting the first element of an array
@@ -14228,6 +14253,17 @@ def try_element_at(col: "ColumnOrName", extraction: "ColumnOrName") -> Column:
     +-----------------------+
     |                   NULL|
     +-----------------------+
+
+    Example 6: Getting a value from a map using a column name as the key
+
+    >>> from pyspark.sql import functions as sf
+    >>> df = spark.createDataFrame([({"a": 1.0, "b": 2.0}, "a")], ['data', 'b'])
+    >>> df.select(sf.try_element_at(df.data, 'b')).show()
+    +-----------------------+
+    |try_element_at(data, b)|
+    +-----------------------+
+    |                    1.0|
+    +-----------------------+
     """
     return _invoke_function_over_columns("try_element_at", col, extraction)
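
The difference documented above comes down to whether a string `extraction` is wrapped with `lit(...)` before the call (as in `element_at`) or passed through as a column name (as in `try_element_at`). The following is a minimal sketch of that distinction in plain Python, with no Spark involved. It is a toy model only: a dict stands in for a row, and `element_at_like` and `try_element_at_like` are hypothetical names made up for illustration, not PySpark functions.

```python
# Toy model only: a plain dict stands in for a DataFrame row, and both
# function names below are hypothetical, not PySpark APIs.


def element_at_like(row: dict, map_col: str, extraction: str):
    """Mimic element_at: the string is treated as a literal, so it is the key."""
    return row[map_col].get(extraction)


def try_element_at_like(row: dict, map_col: str, extraction: str):
    """Mimic try_element_at: the string names a column whose value is the key."""
    key = row[extraction]
    return row[map_col].get(key)


# Same single row as in the doctests: a map column `data` and a string column `b`.
row = {"data": {"a": 1.0, "b": 2.0}, "b": "a"}

print(element_at_like(row, "data", "b"))      # 2.0, the key is the literal "b"
print(try_element_at_like(row, "data", "b"))  # 1.0, the key is row["b"] == "a"
```

In PySpark itself, passing an explicit `Column` such as `sf.lit('b')` or `sf.col('b')` instead of a bare string should make the intent unambiguous for either function. That is an assumption based on the two `return` lines in this diff, not something this patch documents or tests.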