Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unified PlotAccessor for DataFrame and Series #1662

Merged
merged 16 commits into from
Aug 12, 2020

Conversation

DumbMachine
Copy link
Contributor

@DumbMachine DumbMachine commented Jul 17, 2020

  • Ready for review

Aims to fix #1661
New PlotAccessor works the same way as the old.

TODO:

  • Confirm each function has appropriate example and options in the docs

Examples:

import pandas as pd
import databricks.koalas as ks
import matplotlib.pyplot as plt
kf =  ks.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=["A", "B", "C", "D"])
# dataframe like
print(kf.plot)
kf.plot(kind="line")
kf.plot(kind="line", backend="plotly", width=400, height=400).show()

<databricks.koalas.plot.PlotAccessor object at 0x7f32d669b590>
image
image

# series like
print(kf.A.plot)
kf.A.plot(kind="line");
kf.A.plot(kind="line", backend="plotly", width=400, height=400).show()

<databricks.koalas.plot.PlotAccessor object at 0x7f32cdffd750>
image
image

Error are handled, same as before

kf.plot(kind="pie", legend=False)
kf.plot(kind="pie",  backend="plotly", width=400, height=400).show()

image

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-13-9597ff26ce78> in <module>
      1 kf.plot(kind="pie", legend=False)
----> 2 kf.plot(kind="pie",  backend="plotly", width=400, height=400).show()

~/work/koalas/databricks/koalas/plot.py in __call__(self, kind, backend, **kwargs)
   1322 
   1323         if plot_backend.__name__ != "databricks.koalas.plot":
-> 1324             return plot_backend.plot(plot_data, kind=kind, **kwds)
   1325 
   1326         if kind not in self._all_kinds:

~/anaconda3/lib/python3.7/site-packages/plotly/__init__.py in plot(data_frame, kind, **kwargs)
    101         return histogram(data_frame, **new_kwargs)
    102     raise NotImplementedError(
--> 103         "kind='%s' not yet supported for plotting.backend='plotly'" % kind
    104     )
    105 

NotImplementedError: kind='pie' not yet supported for plotting.backend='plotly'

And finally

# exists for dataframe like
kf.plot(kind="scatter", x='A', y='B')
# does not exist for series like
kf.A.plot(kind="scatter")

image

---------------------------------------------------------------------------
PandasNotImplementedError                 Traceback (most recent call last)
<ipython-input-6-674f15f3badc> in <module>
----> 1 kf.A.new_plot(kind="scatter")

~/work/koalas/databricks/koalas/new_plot.py in __call__(self, kind, backend, **kwargs)
    157         if isinstance(self.data, Series):
    158             if kind not in self._series_kinds:
--> 159                 return  unsupported_function(class_name="pd.Series", method_name=kind)()
    160             return plot_series(data=self.data, kind=kind, **kwds)
    161         elif isinstance(self.data, DataFrame):

~/work/koalas/databricks/koalas/missing/__init__.py in unsupported_function(*args, **kwargs)
     21     def unsupported_function(*args, **kwargs):
     22         raise PandasNotImplementedError(
---> 23             class_name=class_name, method_name=method_name, reason=reason
     24         )
     25 

PandasNotImplementedError: The method `pd.Series.scatter()` is not implemented yet.

@DumbMachine
Copy link
Contributor Author

DumbMachine commented Jul 22, 2020

Could anybody help me with the PIP tests error? I do not get any error during local testing.

@ueshin
Copy link
Collaborator

ueshin commented Jul 22, 2020

I submitted a PR to you branch DumbMachine#2.

databricks/koalas/plot.py Outdated Show resolved Hide resolved
@ueshin
Copy link
Collaborator

ueshin commented Jul 22, 2020

@DumbMachine Could you push an empty commit to run tests again?
Seems like the current build doesn't include the new changes for some reason.

@DumbMachine DumbMachine changed the title [wip] Unified PlotAccessor for DataFrame and Series Unified PlotAccessor for DataFrame and Series Jul 28, 2020
Copy link
Collaborator

@ueshin ueshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, LGTM.
@DumbMachine Great work!

return unsupported_function(class_name="pd.DataFrame", method_name=kind)()
return plot_frame(data=self.data, kind=kind, **kwds)

__call__.__doc__ = plot_frame.__doc__
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HyukjinKwon What do you think the doc should be if we consolidate the plotting classes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't built it and checked by myself but I think it's okay to consolidate in general.

if isinstance(self.data, Series):
return self(kind="box", **kwds)
elif isinstance(self.data, DataFrame):
return unsupported_function(class_name="pd.DataFrame", method_name="box")()

def hist(self, bins=10, **kwds):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the example for DataFrame?

databricks/koalas/plot.py Show resolved Hide resolved
databricks/koalas/plot.py Show resolved Hide resolved
databricks/koalas/plot.py Outdated Show resolved Hide resolved
databricks/koalas/plot.py Outdated Show resolved Hide resolved
@ueshin
Copy link
Collaborator

ueshin commented Aug 6, 2020

@DumbMachine Could you address my comments above?

@DumbMachine
Copy link
Contributor Author

Sorry for the delay, have made the changes. I will modify if any further changes are required,

databricks/koalas/plot.py Outdated Show resolved Hide resolved
databricks/koalas/plot.py Outdated Show resolved Hide resolved
@ueshin
Copy link
Collaborator

ueshin commented Aug 12, 2020

Thanks! merging.

@ueshin ueshin merged commit ebbd09c into databricks:master Aug 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simplify the Plot Assessors
3 participants