[SPARK-41225][CONNECT][PYTHON][FOLLOWUP] Disable semanticHash, `sameSemantics`, `_repr_html_`

### What changes were proposed in this pull request?
Disable `semanticHash`, `sameSemantics`, and `_repr_html_` on the Spark Connect Python `DataFrame`.

### Why are the changes needed?
1. Disable `semanticHash` and `sameSemantics`, per the discussion in apache#38742.
2. Disable `_repr_html_`, since it requires [eager mode](https://github.com/apache/spark/blob/40a9a6ef5b89f0c3d19db4a43b8a73decaa173c3/python/pyspark/sql/dataframe.py#L878); without eager mode it just returns `None`:

```
In [2]: spark.range(start=0, end=10)._repr_html_() is None
Out[2]: True

```
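
As a rough illustration of why a `None` result is unhelpful, here is a minimal, self-contained sketch of a `_repr_html_` gated on an eager-evaluation flag. `ToyFrame` and its `eager_eval` attribute are hypothetical names invented for this example; this is not the PySpark implementation, only a model of the behavior described above.

```python
from typing import List, Optional


class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not PySpark code."""

    def __init__(self, rows: List[int], eager_eval: bool = False) -> None:
        self.rows = rows
        # Assumed flag that plays the role of an eager-evaluation switch.
        self.eager_eval = eager_eval

    def _repr_html_(self) -> Optional[str]:
        # Without eager evaluation nothing is computed, so there is
        # nothing to render and the method returns None.
        if not self.eager_eval:
            return None
        cells = "".join(f"<tr><td>{r}</td></tr>" for r in self.rows)
        return f"<table>{cells}</table>"


lazy = ToyFrame([0, 1, 2])
eager = ToyFrame([0, 1, 2], eager_eval=True)
print(lazy._repr_html_() is None)
print(eager._repr_html_())
```

In this model the lazy frame silently yields `None`, which matches the session output shown above; raising an explicit error instead makes the missing support visible.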

### Does this PR introduce _any_ user-facing change?
These three methods now raise `NotImplementedError`.
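
For callers that may run against either client, one possible way to cope with the new error is to catch it and fall back. The sketch below is an assumption-laden example: `ConnectFrameStub` and `semantic_hash_or_none` are hypothetical names, and the stub only mirrors the shape of the method added in this commit.

```python
from typing import Any, Optional


class ConnectFrameStub:
    """Hypothetical stand-in mirroring the stub added in this commit."""

    def semanticHash(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("semanticHash() is not implemented.")


def semantic_hash_or_none(df: Any) -> Optional[int]:
    """Return df.semanticHash(), or None when the client lacks support."""
    try:
        return df.semanticHash()
    except NotImplementedError:
        # Hypothetical fallback: signal "unsupported" with None.
        return None


print(semantic_hash_or_none(ConnectFrameStub()))
```

Whether `None` is an acceptable fallback depends on the caller; code that uses the hash for caching decisions might prefer to skip caching entirely.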

### How was this patch tested?
Added test cases.

Closes apache#38815 from zhengruifeng/connect_disable_repr_html_sematic.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
zhengruifeng authored and beliefer committed Dec 15, 2022
1 parent 41d2ff6 commit 4d3074d
Showing 2 changed files with 12 additions and 0 deletions.
9 changes: 9 additions & 0 deletions python/pyspark/sql/connect/dataframe.py
@@ -1133,6 +1133,15 @@ def writeStream(self, *args: Any, **kwargs: Any) -> None:
    def toJSON(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("toJSON() is not implemented.")

    def _repr_html_(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("_repr_html_() is not implemented.")

    def semanticHash(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("semanticHash() is not implemented.")

    def sameSemantics(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("sameSemantics() is not implemented.")


class DataFrameNaFunctions:
    """Functionality for working with missing data in :class:`DataFrame`.
3 changes: 3 additions & 0 deletions python/pyspark/sql/tests/connect/test_connect_plan_only.py
@@ -328,6 +328,9 @@ def test_unsupported_functions(self):
            "toLocalIterator",
            "checkpoint",
            "localCheckpoint",
            "_repr_html_",
            "semanticHash",
            "sameSemantics",
        ):
            with self.assertRaises(NotImplementedError):
                getattr(df, f)()
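
The loop-over-names pattern in this test can be reproduced standalone. The sketch below is not the project's test; `FrameStub` is a hypothetical stand-in for the Connect `DataFrame`, and wrapping each iteration in `subTest` is an optional variation (not used in the commit) so that a failure reports which method name was at fault.

```python
import unittest
from typing import Any


class FrameStub:
    """Hypothetical stand-in with the three disabled methods."""

    def _repr_html_(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("_repr_html_() is not implemented.")

    def semanticHash(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("semanticHash() is not implemented.")

    def sameSemantics(self, *args: Any, **kwargs: Any) -> None:
        raise NotImplementedError("sameSemantics() is not implemented.")


class UnsupportedFunctionsTest(unittest.TestCase):
    def test_unsupported_functions(self) -> None:
        df = FrameStub()
        for f in ("_repr_html_", "semanticHash", "sameSemantics"):
            # subTest labels each iteration with the method under test.
            with self.subTest(method=f):
                with self.assertRaises(NotImplementedError):
                    getattr(df, f)()


# Run the single test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(UnsupportedFunctionsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```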