feat(python): add plot namespace (which defers to hvplot) #13238

Merged
merged 22 commits into from
Dec 31, 2023
Changes from 20 commits
Commits
2619c8a
add plot namespace which defers to hvplot
MarcoGorelli Dec 24, 2023
4805791
plotting -> plot
MarcoGorelli Dec 27, 2023
648c16f
Merge remote-tracking branch 'upstream/main' into hvplot-backend
MarcoGorelli Dec 27, 2023
fe8729d
lint
MarcoGorelli Dec 27, 2023
4123a73
Merge remote-tracking branch 'upstream/main' into hvplot-backend
MarcoGorelli Dec 28, 2023
ff7aaa6
add tests for series.hist and unsupported dtype
MarcoGorelli Dec 28, 2023
019bcb9
add Series.plot to docs
MarcoGorelli Dec 28, 2023
0506a16
Merge remote-tracking branch 'upstream/main' into hvplot-backend
MarcoGorelli Dec 28, 2023
cde1728
set 0.9.1 as minimum hvplot version
MarcoGorelli Dec 28, 2023
335c8c4
fixup docs build
MarcoGorelli Dec 28, 2023
2d99d1a
typo + missing docs pages
MarcoGorelli Dec 28, 2023
af60d9a
fix docs build
MarcoGorelli Dec 28, 2023
28865de
no need to require hvplot for docs
MarcoGorelli Dec 28, 2023
c3ab5e6
skip plot doctest
MarcoGorelli Dec 28, 2023
e0af814
Merge remote-tracking branch 'upstream/main' into hvplot-backend
MarcoGorelli Dec 29, 2023
56612b3
raise if hvplot not installed or isnt >=0.9.1
MarcoGorelli Dec 30, 2023
0a4ae04
Merge remote-tracking branch 'upstream/main' into hvplot-backend
MarcoGorelli Dec 30, 2023
3696df4
use hvplot post_patch
MarcoGorelli Dec 31, 2023
ab84eb1
lint
MarcoGorelli Dec 31, 2023
19aca75
simplify, remove holoviews from dependencies.py
MarcoGorelli Dec 31, 2023
30e4e01
remove final holoviews
MarcoGorelli Dec 31, 2023
bba5c5a
link to PR comment, add TODO
MarcoGorelli Dec 31, 2023
8 changes: 6 additions & 2 deletions README.md
@@ -192,10 +192,13 @@ Install Polars with all optional dependencies.

```sh
pip install 'polars[all]'
pip install 'polars[numpy,pandas,pyarrow]' # install a subset of all optional dependencies
```

You can also install the dependencies directly.
You can also install a subset of all optional dependencies.

```sh
pip install 'polars[numpy,pandas,pyarrow]'
```

| Tag | Description |
| ---------- | ---------------------------------------------------------------------------- |
@@ -209,6 +212,7 @@ You can also install the dependencies directly.
| openpyxl | Support for reading from Excel files with native types |
| deltalake | Support for reading from Delta Lake Tables |
| pyiceberg | Support for reading from Apache Iceberg tables |
| plot | Support for plot functions on DataFrames |
| timezone | Timezone support, only needed if you are on Python<3.9 or on Windows |

Releases happen quite often (weekly / every few days) at the moment, so updating polars regularly to get the latest bugfixes / features might not be a bad idea.
1 change: 1 addition & 0 deletions docs/user-guide/installation.md
@@ -55,6 +55,7 @@ pip install 'polars[numpy,fsspec]'
| connectorx | Support for reading from SQL databases |
| xlsx2csv | Support for reading from Excel files |
| deltalake | Support for reading from Delta Lake Tables |
| plot | Support for plotting DataFrames |
| timezone | Timezone support, only needed if 1. you are on Python < 3.9 and/or 2. you are on Windows, otherwise no dependencies will be installed |

### Rust
1 change: 1 addition & 0 deletions py-polars/docs/source/reference/dataframe/attributes.rst
@@ -10,6 +10,7 @@ Attributes
DataFrame.dtypes
DataFrame.flags
DataFrame.height
DataFrame.plot
DataFrame.schema
DataFrame.shape
DataFrame.width
1 change: 1 addition & 0 deletions py-polars/docs/source/reference/dataframe/index.rst
@@ -16,6 +16,7 @@ This page gives an overview of all public DataFrame methods.
group_by
modify_select
miscellaneous
plot

.. currentmodule:: polars

44 changes: 44 additions & 0 deletions py-polars/docs/source/reference/dataframe/plot.rst
@@ -0,0 +1,44 @@
====
Plot
====

Polars does not implement plotting logic itself, but instead defers to
hvplot. Please see the `hvplot reference gallery <https://hvplot.holoviz.org/reference/index.html>`_
for more information and documentation.

Examples
--------
Scatter plot:

.. code-block:: python

    df = pl.DataFrame(
        {
            "length": [1, 4, 6],
            "width": [4, 5, 6],
            "species": ["setosa", "setosa", "versicolor"],
        }
    )
    df.plot.scatter(x="length", y="width", by="species")

Line plot:

.. code-block:: python

    from datetime import date
    df = pl.DataFrame(
        {
            "date": [date(2020, 1, 2), date(2020, 1, 3), date(2020, 1, 3)],
            "stock_1": [1, 4, 6],
            "stock_2": [1, 5, 2],
        }
    )
    df.plot.line(x="date", y=["stock_1", "stock_2"])

For more info on what you can pass, you can use ``hvplot.help``:

.. code-block:: python

    import hvplot
    hvplot.help("scatter")

1 change: 1 addition & 0 deletions py-polars/docs/source/reference/series/attributes.rst
@@ -15,3 +15,4 @@ Attributes
Series.shape
Series.str
Series.flags
Series.plot
1 change: 1 addition & 0 deletions py-polars/docs/source/reference/series/index.rst
@@ -20,6 +20,7 @@ This page gives an overview of all public Series methods.
list
modify_select
miscellaneous
plot
string
struct
temporal
29 changes: 29 additions & 0 deletions py-polars/docs/source/reference/series/plot.rst
@@ -0,0 +1,29 @@
====
Plot
====

Polars does not implement plotting logic itself, but instead defers to
hvplot. Please see the `hvplot reference gallery <https://hvplot.holoviz.org/reference/index.html>`_
for more information and documentation.

Examples
--------
Histogram:

.. code-block:: python

    s = pl.Series([1, 4, 2])
    s.plot.hist()

KDE plot (note: in addition to ``hvplot``, this one also requires ``scipy``):

.. code-block:: python

    s.plot.kde()

For more info on what you can pass, you can use ``hvplot.help``:

.. code-block:: python

    import hvplot
    hvplot.help("hist")
50 changes: 49 additions & 1 deletion py-polars/polars/dataframe/frame.py
@@ -48,12 +48,14 @@
py_type_to_dtype,
)
from polars.dependencies import (
_HVPLOT_AVAILABLE,
_PANDAS_AVAILABLE,
_PYARROW_AVAILABLE,
_check_for_numpy,
_check_for_pandas,
_check_for_pyarrow,
dataframe_api_compat,
hvplot,
)
from polars.dependencies import numpy as np
from polars.dependencies import pandas as pd
@@ -348,7 +350,7 @@ class DataFrame:

"""

_accessors: ClassVar[set[str]] = set()
_accessors: ClassVar[set[str]] = {"plot"}

def __init__(
self,
@@ -1116,6 +1118,52 @@ def _replace(self, column: str, new_column: Series) -> Self:
self._df.replace(column, new_column._s)
return self

    @property
    def plot(self) -> Any:
        """
        Create a plot namespace.

        Polars does not implement plotting logic itself, but instead defers to
        hvplot. Please see the `hvplot reference gallery <https://hvplot.holoviz.org/reference/index.html>`_
        for more information and documentation.

        Examples
        --------
        Scatter plot:

        >>> df = pl.DataFrame(
        ...     {
        ...         "length": [1, 4, 6],
        ...         "width": [4, 5, 6],
        ...         "species": ["setosa", "setosa", "versicolor"],
        ...     }
        ... )
        >>> df.plot.scatter(x="length", y="width", by="species")  # doctest: +SKIP

        Line plot:

        >>> from datetime import date
        >>> df = pl.DataFrame(
        ...     {
        ...         "date": [date(2020, 1, 2), date(2020, 1, 3), date(2020, 1, 3)],
        ...         "stock_1": [1, 4, 6],
        ...         "stock_2": [1, 5, 2],
        ...     }
        ... )
        >>> df.plot.line(x="date", y=["stock_1", "stock_2"])  # doctest: +SKIP

        For more info on what you can pass, you can use ``hvplot.help``:

        >>> import hvplot  # doctest: +SKIP
        >>> hvplot.help("scatter")  # doctest: +SKIP
        """
        if not _HVPLOT_AVAILABLE or parse_version(hvplot.__version__) < parse_version(
            "0.9.1"
        ):
            raise ModuleUpgradeRequired("hvplot>=0.9.1 is required for `.plot`")
        hvplot.post_patch()
        return hvplot.plotting.core.hvPlotTabularPolars(self)
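
A note on the version gate in `plot`: comparing raw version strings would sort `"0.10.0"` before `"0.9.1"`, which is presumably why both sides go through `parse_version` first. The following is a minimal standalone sketch of that idea, not the Polars implementation. `parse_version` here is a simplified stand-in written for this note, and `require_min_version` and the two `SimpleNamespace` backends are hypothetical.

```python
from __future__ import annotations

import re
from types import SimpleNamespace


def parse_version(version: str) -> tuple[int, ...]:
    """Convert a dotted version string into a tuple of ints for comparison."""
    parts = []
    for piece in version.split("."):
        digits = re.match(r"\d+", piece)
        if digits is None:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(digits.group()))
    return tuple(parts)


def require_min_version(module, available: bool, minimum: str):
    """Return `module` if it is installed and recent enough, else raise."""
    # `available` is checked first, so `__version__` is never read
    # from a module that is not installed.
    if not available or parse_version(module.__version__) < parse_version(minimum):
        raise ImportError(f"version {minimum} or newer is required")
    return module


# Hypothetical stand-ins for an installed backend at two different versions.
old_backend = SimpleNamespace(__version__="0.9.0")
new_backend = SimpleNamespace(__version__="0.10.0")

checked = require_min_version(new_backend, True, "0.9.1")
```

With plain string comparison, `"0.10.0" < "0.9.1"` is true, so the newer backend would be rejected; comparing integer tuples avoids that.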

@property
def shape(self) -> tuple[int, int]:
"""
5 changes: 5 additions & 0 deletions py-polars/polars/dependencies.py
@@ -12,6 +12,7 @@
_DELTALAKE_AVAILABLE = True
_FSSPEC_AVAILABLE = True
_GEVENT_AVAILABLE = True
_HVPLOT_AVAILABLE = True
_HYPOTHESIS_AVAILABLE = True
_NUMPY_AVAILABLE = True
_PANDAS_AVAILABLE = True
@@ -158,6 +159,7 @@ def _lazy_import(module_name: str) -> tuple[ModuleType, bool]:
import deltalake
import fsspec
import gevent
import hvplot
import hypothesis
import numpy
import pandas
@@ -183,6 +185,7 @@ def _lazy_import(module_name: str) -> tuple[ModuleType, bool]:
)
deltalake, _DELTALAKE_AVAILABLE = _lazy_import("deltalake")
fsspec, _FSSPEC_AVAILABLE = _lazy_import("fsspec")
hvplot, _HVPLOT_AVAILABLE = _lazy_import("hvplot")
hypothesis, _HYPOTHESIS_AVAILABLE = _lazy_import("hypothesis")
numpy, _NUMPY_AVAILABLE = _lazy_import("numpy")
pandas, _PANDAS_AVAILABLE = _lazy_import("pandas")
@@ -243,6 +246,7 @@ def _check_for_pydantic(obj: Any, *, check_type: bool = True) -> bool:
"deltalake",
"fsspec",
"gevent",
"hvplot",
"numpy",
"pandas",
"pydantic",
@@ -260,6 +264,7 @@ def _check_for_pydantic(obj: Any, *, check_type: bool = True) -> bool:
"_PYICEBERG_AVAILABLE",
"_FSSPEC_AVAILABLE",
"_GEVENT_AVAILABLE",
"_HVPLOT_AVAILABLE",
"_HYPOTHESIS_AVAILABLE",
"_NUMPY_AVAILABLE",
"_PANDAS_AVAILABLE",
35 changes: 35 additions & 0 deletions py-polars/polars/series/series.py
@@ -52,11 +52,13 @@
supported_numpy_char_code,
)
from polars.dependencies import (
_HVPLOT_AVAILABLE,
_PYARROW_AVAILABLE,
_check_for_numpy,
_check_for_pandas,
_check_for_pyarrow,
dataframe_api_compat,
hvplot,
)
from polars.dependencies import numpy as np
from polars.dependencies import pandas as pd
@@ -237,6 +239,7 @@ class Series:
"str",
"bin",
"struct",
"plot",
}

def __init__(
@@ -7486,6 +7489,38 @@ def struct(self) -> StructNameSpace:
"""Create an object namespace of all struct related methods."""
return StructNameSpace(self)

    @property
    def plot(self) -> Any:
        """
        Create a plot namespace.

        Polars does not implement plotting logic itself, but instead defers to
        hvplot. Please see the `hvplot reference gallery <https://hvplot.holoviz.org/reference/index.html>`_
        for more information and documentation.

        Examples
        --------
        Histogram:

        >>> s = pl.Series([1, 4, 2])
        >>> s.plot.hist()  # doctest: +SKIP

        KDE plot (note: in addition to ``hvplot``, this one also requires ``scipy``):

        >>> s.plot.kde()  # doctest: +SKIP

        For more info on what you can pass, you can use ``hvplot.help``:

        >>> import hvplot  # doctest: +SKIP
        >>> hvplot.help("hist")  # doctest: +SKIP
        """
        if not _HVPLOT_AVAILABLE or parse_version(hvplot.__version__) < parse_version(
            "0.9.1"
        ):
            raise ModuleUpgradeRequired("hvplot>=0.9.1 is required for `.plot`")
        hvplot.post_patch()
        return hvplot.plotting.core.hvPlotTabularPolars(self)


def _resolve_temporal_dtype(
dtype: PolarsDataType | None,
7 changes: 6 additions & 1 deletion py-polars/polars/series/utils.py
@@ -39,7 +39,12 @@ def expr_dispatch(cls: type[T]) -> type[T]:
    expr_lookup = _expr_lookup(namespace)

    for name in dir(cls):
        if not name.startswith("_"):
        if (
            # private
            not name.startswith("_")
            # `.plot` not available on Expr
            and name != "plot"
Member

This shouldn't be required - plot is not an empty function so it's skipped. Everything runs fine when I comment this out - am I missing something?

Collaborator Author

did you try building the docs?

Member

@stinodego stinodego Dec 31, 2023

Hm no, indeed it fails on building the docs. But the problem isn't that .plot is not available on Expr. The problem is that it calls hvPlotTabularPolars which imports polars, which doesn't exist while building the docs.

I'm not sure of the best solution here. I would ideally want to avoid hardcoding random stuff here as the functionality is already complex enough. But I'm not sure what the best solution would be 🤔

I think we should probably update sphinx_accessor in some way to avoid this from happening. Looking into it as we speak...

Collaborator Author

True, but my understanding is that expr_dispatch is only needed for methods available on Expr

you're right about complexity, I just couldn't think of a solution which doesn't add any. maybe there's a more generic solution which would mean that this isn't necessary

cc @alexander-beedie in case you have ideas here

        ):
            attr = getattr(cls, name)
            if callable(attr):
                attr = _undecorated(attr)
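
To make the thread above concrete: the sketch below shows why filtering by name before `getattr` can matter when an attribute's getter runs on class-level access and tries to import something. This is an assumption-laden illustration, not the Polars code. `eager_accessor`, `Example`, and `public_callables` are hypothetical, and whether `expr_dispatch` hits exactly this path during the docs build is not confirmed here.

```python
class eager_accessor:
    """Hypothetical descriptor that runs its getter even on class-level access."""

    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner):
        return self.fget(owner if instance is None else instance)


class Example:
    def head(self):
        return "head"

    def _private(self):
        return "private"

    @eager_accessor
    def plot(self):
        # Stands in for a getter that needs an optional backend.
        raise ImportError("plotting backend is not importable here")


def public_callables(cls, skip=frozenset()):
    """Collect public callable attribute names, never touching names in `skip`."""
    names = []
    for name in dir(cls):
        if name.startswith("_") or name in skip:
            continue  # filtered by name, so getattr is not called
        if callable(getattr(cls, name)):
            names.append(name)
    return names


dispatchable = public_callables(Example, skip={"plot"})
```

Without the `skip` entry, `getattr(Example, "plot")` runs the getter and the `ImportError` propagates out of the loop. A more generic fix, as discussed above, would move that tolerance into the accessor machinery instead of hardcoding a name.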
2 changes: 2 additions & 0 deletions py-polars/polars/utils/show_versions.py
@@ -24,6 +24,7 @@ def show_versions() -> None:
connectorx: 0.3.2
deltalake: 0.13.0
fsspec: 2023.10.0
hvplot: 0.9.1
gevent: 23.9.1
matplotlib: 3.8.2
numpy: 1.26.2
@@ -66,6 +67,7 @@ def _get_dependency_info() -> dict[str, str]:
"deltalake",
"fsspec",
"gevent",
"hvplot",
"matplotlib",
"numpy",
"openpyxl",
2 changes: 1 addition & 1 deletion py-polars/polars/utils/various.py
@@ -399,7 +399,7 @@ def __get__( # type: ignore[override]
return self.fget( # type: ignore[misc]
instance if isinstance(instance, cls) else cls
)
except AttributeError:
except (AttributeError, ImportError):
Collaborator Author

adding this to avoid an importerror if building the docs without hvplot installed. I don't think hvplot should be required for building docs, and it's probably not worth showing plots in the docs themselves

return None # type: ignore[return-value]
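
A small sketch of the behaviour this change aims for, assuming a descriptor whose getter may need an optional package: catching `ImportError` alongside `AttributeError` makes attribute access return `None` instead of failing. `tolerant_accessor`, `Frame`, and the module name are hypothetical and only loosely modelled on the `__get__` shown in the hunk.

```python
class tolerant_accessor:
    """Hypothetical descriptor that returns None when its getter cannot import."""

    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner):
        try:
            return self.fget(owner if instance is None else instance)
        except (AttributeError, ImportError):
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # optional dependency lands here as well.
            return None


class Frame:
    @tolerant_accessor
    def plot(self):
        # Made-up module name standing in for an optional plotting backend.
        import hypothetical_missing_plot_backend

        return hypothetical_missing_plot_backend


class_level = Frame.plot
instance_level = Frame().plot
```

The trade-off is that a genuine import problem inside the getter is silenced too, so this kind of catch is best kept to tooling paths such as documentation builds.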


5 changes: 4 additions & 1 deletion py-polars/pyproject.toml
@@ -45,6 +45,7 @@ connectorx = ["connectorx >= 0.3.2"]
deltalake = ["deltalake >= 0.14.0"]
fsspec = ["fsspec"]
gevent = ["gevent"]
plot = ["hvplot >= 0.9.1"]
matplotlib = ["matplotlib"]
numpy = ["numpy >= 1.16.0"]
openpyxl = ["openpyxl >= 3.0.0"]
@@ -58,7 +59,7 @@ timezone = ["backports.zoneinfo; python_version < '3.9'", "tzdata; platform_syst
xlsx2csv = ["xlsx2csv >= 0.8.0"]
xlsxwriter = ["xlsxwriter"]
all = [
"polars[pyarrow,pandas,numpy,fsspec,connectorx,xlsx2csv,deltalake,timezone,matplotlib,pydantic,pyiceberg,sqlalchemy,xlsxwriter,adbc,cloudpickle,gevent]",
"polars[pyarrow,pandas,numpy,fsspec,plot,connectorx,xlsx2csv,deltalake,timezone,pydantic,pyiceberg,sqlalchemy,xlsxwriter,adbc,cloudpickle,gevent]",
]

[tool.maturin]
@@ -88,6 +89,8 @@ module = [
"ezodf.*",
"fsspec.*",
"gevent",
"holoviews.*",
stinodego marked this conversation as resolved.
"hvplot.*",
"matplotlib.*",
"moto.server",
"openpyxl",
4 changes: 3 additions & 1 deletion py-polars/requirements-dev.txt
@@ -47,8 +47,10 @@ dataframe-api-compat >= 0.1.6
pyiceberg >= 0.5.0
# Csv
zstandard
# Other
# Plotting
hvplot>=0.9.1
matplotlib
# Other
gevent

# -------