[SPARK-49694][PYTHON][CONNECT] Support scatter plots #48219

Closed
xinrong-meng wants to merge 3 commits into apache:master from xinrong-meng:plot_scatter



xinrong-meng (Member) commented Sep 24, 2024

What changes were proposed in this pull request?

Support scatter plots with the plotly backend on both Spark Connect and Spark Classic.

Why are the changes needed?

While Pandas on Spark supports plotting, PySpark currently lacks this feature. The proposed API will enable users to generate visualizations. This will provide users with an intuitive, interactive way to explore and understand large datasets directly from PySpark DataFrames, streamlining the data analysis workflow in distributed environments.

See the PySpark Plotting API Specification (in progress) for more details.

Part of https://issues.apache.org/jira/browse/SPARK-49530.

Does this PR introduce any user-facing change?

Yes. Scatter plots are supported as shown below.

>>> data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
>>> columns = ["length", "width", "species"]
>>> sdf = spark.createDataFrame(data, columns)
>>> fig = sdf.plot(kind="scatter", x="length", y="width")  # or fig = sdf.plot.scatter(x="length", y="width")
>>> fig.show() 

[Image: the resulting scatter plot of width against length]
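As a rough illustration of what the call above selects, the sketch below pulls the two requested columns out of the same sample rows using plain Python. It is not PySpark's implementation; `scatter_points` is a hypothetical helper introduced only to show how `x="length"` and `y="width"` map to the point coordinates, and the `species` column is simply ignored.

```python
# Illustrative only: NOT how PySpark builds the figure. Plain Python lists
# stand in for the DataFrame so the column-to-axis mapping is easy to see.
data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
columns = ["length", "width", "species"]


def scatter_points(rows, columns, x, y):
    """Return parallel x and y lists for the named columns (hypothetical helper)."""
    xi, yi = columns.index(x), columns.index(y)
    return [row[xi] for row in rows], [row[yi] for row in rows]


xs, ys = scatter_points(data, columns, x="length", y="width")
print(xs)  # [5.1, 4.9, 7.0, 6.4, 5.9]
print(ys)  # [3.5, 3.0, 3.2, 3.2, 3.0]
```

Each `(xs[i], ys[i])` pair corresponds to one marker in the plot.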

How was this patch tested?

Unit tests.
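The diff context quoted later in this conversation shows existing tests calling `self._check_fig_data("bar", fig["data"][0], ...)`. A scatter test could plausibly follow the same shape. The sketch below is an assumption, not the code in this PR: `check_fig_data` is a hypothetical stand-in for the suite's helper, and a plain dict stands in for the Plotly trace so the example runs without Spark or plotly installed.

```python
def check_fig_data(kind, trace, expected_x, expected_y, expected_name=""):
    """Hypothetical stand-in for the test suite's _check_fig_data helper."""
    if kind == "scatter":
        # Plotly scatter traces drawn as individual points use mode "markers".
        assert trace["type"] == "scatter"
        assert trace["mode"] == "markers"
    else:
        assert trace["type"] == kind
    assert list(trace["x"]) == expected_x
    assert list(trace["y"]) == expected_y
    assert trace["name"] == expected_name


# A plain dict shaped like a single Plotly trace (assumption for illustration);
# a real test would pass fig["data"][0] from sdf.plot.scatter(...) instead.
trace = {
    "type": "scatter",
    "mode": "markers",
    "x": [5.1, 4.9, 7.0, 6.4, 5.9],
    "y": [3.5, 3.0, 3.2, 3.2, 3.0],
    "name": "",
}

check_fig_data("scatter", trace, [5.1, 4.9, 7.0, 6.4, 5.9], [3.5, 3.0, 3.2, 3.2, 3.0])
```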

Was this patch authored or co-authored using generative AI tooling?

No.

xinrong-meng marked this pull request as ready for review on September 24, 2024, 03:20
        self._check_fig_data("bar", fig["data"][0], ["A", "B", "C"], [10, 30, 20], "int_val")
        self._check_fig_data("bar", fig["data"][1], ["A", "B", "C"], [1.5, 2.5, 3.5], "float_val")

    def test_barh_plot(self):
Contributor

Why is this test being removed?

Member Author

My bad, it was removed by mistake while rebasing.

@xinrong-meng
Member Author

Merged to master, thank you!
