[SPARK-23157][SQL] Explain restriction on column expression in withColumn()

## What changes were proposed in this pull request?

It's not obvious from the comments that any added column must be a
function of the dataset that we are adding it to. Add a comment to
that effect to Scala, Python and R Data* methods.

Author: Henry Robinson <henry@cloudera.com>

Closes #20429 from henryr/SPARK-23157.

(cherry picked from commit 8b98324)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Henry Robinson authored and gatorsmile committed Jan 30, 2018
1 parent a81ace1 commit bb7502f9a506d52365d7532b3b0281098dd85763
@@ -2090,7 +2090,8 @@ setMethod("selectExpr",
#'
#' @param x a SparkDataFrame.
#' @param colName a column name.
-#' @param col a Column expression, or an atomic vector in the length of 1 as literal value.
+#' @param col a Column expression (which must refer only to this DataFrame), or an atomic vector in
+#' the length of 1 as literal value.
#' @return A SparkDataFrame with the new column added or the existing column replaced.
#' @family SparkDataFrame functions
#' @aliases withColumn,SparkDataFrame,character-method
@@ -1829,11 +1829,15 @@ def withColumn(self, colName, col):
Returns a new :class:`DataFrame` by adding a column or replacing the
existing column that has the same name.
+The column expression must be an expression over this dataframe; attempting to add
+a column from some other dataframe will raise an error.
:param colName: string, name of the new column.
:param col: a :class:`Column` expression for the new column.
>>> df.withColumn('age2', df.age + 2).collect()
[Row(age=2, name=u'Alice', age2=4), Row(age=5, name=u'Bob', age2=7)]
"""
assert isinstance(col, Column), "col should be Column"
return DataFrame(self._jdf.withColumn(colName, col._jc), self.sql_ctx)
@@ -2150,6 +2150,9 @@ class Dataset[T] private[sql](
* Returns a new Dataset by adding a column or replacing the existing column that has
* the same name.
*
+* `column`'s expression must only refer to attributes supplied by this Dataset. It is an
+* error to add a column that refers to some other Dataset.
*
* @group untypedrel
* @since 2.0.0
*/
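
The restriction documented in these hunks can be illustrated with a small toy model in plain Python. This is a sketch of the idea only, not Spark's implementation: `ToyFrame`, `Column`, `with_column` and the integer attribute ids are hypothetical names invented for this example. The assumption is that each frame supplies its own set of attributes, a column expression remembers which attributes it reads, and adding a column fails when any of those attributes comes from a different frame.

```python
import itertools

# Hypothetical global counter that hands out unique attribute ids.
_next_id = itertools.count()


class Column:
    """Toy column expression that remembers which attribute ids it reads."""

    def __init__(self, refs, evaluate):
        self.refs = frozenset(refs)
        self.evaluate = evaluate  # maps {attribute id: value} to a result

    def __add__(self, literal):
        # Only literal operands are supported in this sketch.
        return Column(self.refs, lambda env: self.evaluate(env) + literal)


class ToyFrame:
    """Toy stand-in for a DataFrame: named columns with unique attribute ids."""

    def __init__(self, names, rows, attr_ids=None):
        self.names = list(names)
        if attr_ids is None:
            attr_ids = [next(_next_id) for _ in self.names]
        self.attr_ids = list(attr_ids)
        self.rows = [tuple(r) for r in rows]

    def __getitem__(self, name):
        attr = self.attr_ids[self.names.index(name)]
        return Column({attr}, lambda env: env[attr])

    def with_column(self, name, col):
        # The documented restriction: every attribute the expression reads
        # must be supplied by this frame.
        missing = col.refs - set(self.attr_ids)
        if missing:
            raise ValueError("column refers to attributes not supplied by this frame")
        values = [col.evaluate(dict(zip(self.attr_ids, row))) for row in self.rows]
        names, ids = list(self.names), list(self.attr_ids)
        rows = [list(r) for r in self.rows]
        new_id = next(_next_id)
        if name in names:
            # Replace the existing column that has the same name.
            i = names.index(name)
            ids[i] = new_id
            for r, v in zip(rows, values):
                r[i] = v
        else:
            # Append a new column.
            names.append(name)
            ids.append(new_id)
            for r, v in zip(rows, values):
                r.append(v)
        return ToyFrame(names, rows, ids)


people = ToyFrame(["age", "name"], [(2, "Alice"), (5, "Bob")])
other = ToyFrame(["age"], [(40,), (50,)])

# An expression over the same frame is accepted.
ok = people.with_column("age2", people["age"] + 2)

# An expression over a different frame is rejected, even though the
# column name "age" exists in both frames.
try:
    people.with_column("age2", other["age"] + 2)
    rejected = False
except ValueError:
    rejected = True
```

In real Spark code, the usual way to use a column from another DataFrame is to join the two DataFrames first and then call `withColumn` on the joined result.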
