[SPARK-38993][PYTHON] Impl DataFrame.boxplot and DataFrame.plot.box #36317

Closed
wants to merge 3 commits into from

Contributor

@zhengruifeng zhengruifeng commented Apr 22, 2022

What changes were proposed in this pull request?

Implement DataFrame.boxplot and DataFrame.plot.box in the pandas API on Spark.

Why are the changes needed?

To increase pandas API coverage in PySpark.

Does this PR introduce any user-facing change?

Yes, DataFrame.boxplot and DataFrame.plot.box are now supported:

In [2]: df = ps.DataFrame([[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1], [6.4, 3.2, 1], [5.9, 3.0, 2]], columns=['length', 'width', 'species'])

In [3]: df.boxplot()
Out[3]:

In [4]: df.plot.box()
Out[4]:

image
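
For readers who want to sanity-check the figure, here is a minimal sketch of the summary statistics a box plot conventionally draws (quartiles, and whiskers at 1.5 times the IQR), applied to the `length` column from the example above. This is plain Python for illustration only and is not the code added in this PR; the helper names `quantile` and `box_stats` are hypothetical, and the linear-interpolation quantile is an assumption, so values drawn by the actual plotting backend may differ.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def box_stats(values, whis=1.5):
    """Hypothetical helper: conventional box-plot statistics for one column."""
    data = sorted(values)
    q1, med, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    # Points outside the fences are drawn as individual outliers.
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        # Whiskers extend to the most extreme points still inside the fences.
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# The `length` column from the DataFrame in the example above.
length = [5.1, 4.9, 7.0, 6.4, 5.9]
stats = box_stats(length)
print(stats)
```

With only five rows every point falls inside the fences, so the whiskers reach the column minimum and maximum and no outliers are drawn.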

How was this patch tested?

Added unit tests and tested manually.

@zhengruifeng zhengruifeng marked this pull request as draft April 22, 2022 03:22
@zhengruifeng zhengruifeng changed the title [SPARK-38993][ML] Impl DataFrame.boxplot and DataFrame.plot.box [SPARK-38993][PYTHON][WIP] Impl DataFrame.boxplot and DataFrame.plot.box Apr 22, 2022
@zhengruifeng zhengruifeng marked this pull request as ready for review April 22, 2022 10:54
@zhengruifeng zhengruifeng changed the title [SPARK-38993][PYTHON][WIP] Impl DataFrame.boxplot and DataFrame.plot.box [SPARK-38993][PYTHON] Impl DataFrame.boxplot and DataFrame.plot.box Apr 22, 2022
@zhengruifeng
Contributor Author

Gentle ping @HyukjinKwon @ueshin @itholic @xinrong-databricks

@zhengruifeng
Contributor Author

For comparison, boxplot on the pandas side:

In [25]: df = pd.DataFrame([[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1], [6.4, 3.2, 1], [5.9, 3.0, 2]], columns=['length', 'width', 'species'])

In [26]: df.boxplot(backend='plotly')
Out[26]:

In [27]: df.plot.box(backend='plotly')
Out[27]:


image

@HyukjinKwon
Member

Merged to master.

@HyukjinKwon
Member

@zhengruifeng Thank you so much for working on this. This is really awesome.

@zhengruifeng
Contributor Author

@HyukjinKwon Thanks for reviewing!

@zhengruifeng zhengruifeng deleted the impl_box_plot branch April 29, 2022 05:35
2 participants