[SPARK-29664][PYTHON][SQL][FOLLOW-UP] Add deprecation warnings for getItem instead #28327

Closed

HyukjinKwon wants to merge 3 commits into apache:master from HyukjinKwon:SPARK-29664

Conversation

@HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Apr 24, 2020

What changes were proposed in this pull request?

This PR proposes a different approach instead of breaking the behaviour outright, per Michael's rubric added at https://spark.apache.org/versioning-policy.html. It deprecates the behaviour for now; it will be gradually removed in future releases.

After this change,

```python
import warnings
warnings.simplefilter("always")
from pyspark.sql.functions import *
df = spark.range(2)
map_col = create_map(lit(0), lit(100), lit(1), lit(200))
df.withColumn("mapped", map_col.getItem(col('id'))).show()
```

```
/.../python/pyspark/sql/column.py:311: DeprecationWarning: A column as 'key' in getItem is
deprecated as of Spark 3.0, and will not be supported in the future release. Use `column[key]`
or `column.key` syntax instead.
  DeprecationWarning)
...
```
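For comparison, the non-deprecated form the warning points to, `column[key]`, works on the same DataFrame without any warning. A minimal sketch (an equivalent expression, not taken from the PR diff):

```python
# Recommended: index the map column with another column directly,
# instead of passing a Column to getItem.
df.withColumn("mapped", map_col[col('id')]).show()
```

The same deprecation applies to `Column.getField` when it is given a Column as the field name: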
```python
import warnings
warnings.simplefilter("always")
from pyspark.sql.functions import *
df = spark.range(2)
struct_col = struct(lit(0), lit(100), lit(1), lit(200))
df.withColumn("struct", struct_col.getField(lit("col1"))).show()
```

```
/.../spark/python/pyspark/sql/column.py:336: DeprecationWarning: A column as 'name'
in getField is deprecated as of Spark 3.0, and will not be supported in the future release. Use
`column[name]` or `column.name` syntax instead.
  DeprecationWarning)
```
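And the corresponding non-deprecated forms for the struct case, using `column[name]` or `column.name` (a sketch; `col1` is the default field name produced by the `struct(...)` call above):

```python
# Recommended: access the struct field by name with [] or attribute syntax,
# instead of passing a Column to getField.
df.withColumn("struct", struct_col["col1"]).show()
df.withColumn("struct", struct_col.col1).show()
```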

Why are the changes needed?

To avoid a radical behaviour change, in line with the amended versioning policy.

Does this PR introduce any user-facing change?

Yes, it now shows a deprecation warning message.

How was this patch tested?

Manually tested.
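For reference, a minimal sketch of how the warning can be checked programmatically (not part of the PR's test suite; assumes an active SparkSession bound to `spark`):

```python
import warnings

from pyspark.sql.functions import col, create_map, lit

df = spark.range(2)
map_col = create_map(lit(0), lit(100), lit(1), lit(200))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # getItem with a Column key triggers the deprecated code path.
    df.withColumn("mapped", map_col.getItem(col("id"))).show()

# At least one DeprecationWarning should have been recorded.
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```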

@HyukjinKwon
Member Author

cc @imback82 and @viirya, can you take a look please?

@SparkQA

SparkQA commented Apr 24, 2020

Test build #121749 has finished for PR 28327 at commit eb2d788.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@imback82 imback82 left a comment

LGTM

@SparkQA

SparkQA commented Apr 27, 2020

Test build #121863 has finished for PR 28327 at commit 3ec641d.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

@HyukjinKwon
Member Author

Thanks @viirya and @imback82.

@SparkQA

SparkQA commented Apr 27, 2020

Test build #121868 has finished for PR 28327 at commit 3ec641d.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

@SparkQA

SparkQA commented Apr 27, 2020

Test build #121872 has finished for PR 28327 at commit 3ec641d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Merged to master and branch-3.0.

HyukjinKwon added a commit that referenced this pull request Apr 27, 2020
…tItem instead

### What changes were proposed in this pull request?

This PR proposes to use a different approach instead of breaking it per Michael's rubric added at https://spark.apache.org/versioning-policy.html. It deprecates the behaviour for now. It will be gradually removed in the future releases.

After this change,

```python
import warnings
warnings.simplefilter("always")
from pyspark.sql.functions import *
df = spark.range(2)
map_col = create_map(lit(0), lit(100), lit(1), lit(200))
df.withColumn("mapped", map_col.getItem(col('id'))).show()
```

```
/.../python/pyspark/sql/column.py:311: DeprecationWarning: A column as 'key' in getItem is
deprecated as of Spark 3.0, and will not be supported in the future release. Use `column[key]`
or `column.key` syntax instead.
  DeprecationWarning)
...
```

```python
import warnings
warnings.simplefilter("always")
from pyspark.sql.functions import *
df = spark.range(2)
struct_col = struct(lit(0), lit(100), lit(1), lit(200))
df.withColumn("struct", struct_col.getField(lit("col1"))).show()
```

```
/.../spark/python/pyspark/sql/column.py:336: DeprecationWarning: A column as 'name'
in getField is deprecated as of Spark 3.0, and will not be supported in the future release. Use
`column[name]` or `column.name` syntax instead.
  DeprecationWarning)
```

### Why are the changes needed?

To prevent the radical behaviour change after the amended versioning policy.

### Does this PR introduce any user-facing change?

Yes, it will show the deprecated warning message.

### How was this patch tested?

Manually tested.

Closes #28327 from HyukjinKwon/SPARK-29664.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 5dd581c)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@HyukjinKwon HyukjinKwon deleted the SPARK-29664 branch July 27, 2020 07:44