API: added array #23581

Merged 57 commits on Dec 28, 2018
Changes from 52 commits
Commits
bfefc96
added array
TomAugspurger Nov 8, 2018
51480a3
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 9, 2018
dcb7931
update registry test
TomAugspurger Nov 9, 2018
a635649
update doc examples
TomAugspurger Nov 9, 2018
fb0d8bc
wip
TomAugspurger Nov 9, 2018
d58a320
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 9, 2018
fe06de4
inference
TomAugspurger Nov 9, 2018
72f7f06
ia updates
TomAugspurger Nov 9, 2018
c02e183
test fixup
TomAugspurger Nov 10, 2018
a2d3146
isort
TomAugspurger Nov 10, 2018
37901b0
fixups
TomAugspurger Nov 10, 2018
4403010
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 12, 2018
9401dd3
wip
TomAugspurger Nov 12, 2018
838ce5e
dtype from ea
TomAugspurger Nov 12, 2018
5260b99
series, index tests
TomAugspurger Nov 12, 2018
248e9e0
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 12, 2018
cf07c80
added ndarray case
TomAugspurger Nov 12, 2018
22490a8
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 15, 2018
5e0dc62
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 17, 2018
fe40189
added test for a 2d array
TomAugspurger Nov 17, 2018
7eb9d08
TST: test for Series[EA]
TomAugspurger Nov 17, 2018
fa7b200
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 20, 2018
1ca14fe
Added test for period -> category
TomAugspurger Nov 20, 2018
4473899
copy
TomAugspurger Nov 20, 2018
382f57d
prefix for arrays
TomAugspurger Nov 20, 2018
dd76a2b
Added arrays
TomAugspurger Nov 20, 2018
159d3a2
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 21, 2018
5366950
update docstring
TomAugspurger Nov 21, 2018
c818a8f
docstring order
TomAugspurger Nov 21, 2018
ba8b807
Revert "docstring order"
TomAugspurger Nov 21, 2018
77cd782
Updates
TomAugspurger Nov 21, 2018
dfada7b
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 27, 2018
5eff701
Add docs for the types we infer
TomAugspurger Nov 27, 2018
9406400
API: disallow string alias for NumPy
TomAugspurger Nov 27, 2018
8eb07c3
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 28, 2018
ea3a118
Wrap long error message
TomAugspurger Nov 28, 2018
ecae340
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 29, 2018
fb814fc
updates
TomAugspurger Nov 29, 2018
a6f6d29
removed old test
TomAugspurger Nov 29, 2018
6c243f3
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Nov 29, 2018
86b81b5
formatting
TomAugspurger Nov 29, 2018
2c6cf97
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 8, 2018
50d4206
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 8, 2018
9e1b4e6
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 10, 2018
000967d
Raise on scalars
TomAugspurger Dec 10, 2018
bf829c3
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 11, 2018
faf114d
docs on raising
TomAugspurger Dec 11, 2018
3186ded
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 12, 2018
1c4da0e
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 28, 2018
36c6f00
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 28, 2018
932e119
Updates for PandasArray
TomAugspurger Dec 28, 2018
45d07eb
update docstring
TomAugspurger Dec 28, 2018
d1aba73
Updates
TomAugspurger Dec 28, 2018
981f735
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 28, 2018
1f3bb50
fixed test expected
TomAugspurger Dec 28, 2018
c8d3960
doc lint
TomAugspurger Dec 28, 2018
1b9e251
Merge remote-tracking branch 'upstream/master' into pd.array
TomAugspurger Dec 28, 2018
75 changes: 75 additions & 0 deletions doc/source/api.rst
@@ -720,6 +720,19 @@ strings and apply several methods to it. These can be accessed like
Series.dt
Index.str


.. _api.arrays:

Arrays
------

Pandas and third-party libraries can extend NumPy's type system (see :ref:`extending.extension-types`).

.. autosummary::
   :toctree: generated/

   array

.. _api.categorical:

Categorical
@@ -808,6 +821,65 @@ following usable methods and properties:
Series.cat.as_ordered
Series.cat.as_unordered

.. _api.arrays.integerna:

Integer-NA
~~~~~~~~~~

:class:`arrays.IntegerArray` can hold integer data, potentially with missing
values.

.. autosummary::
   :toctree: generated/

   arrays.IntegerArray
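
As an illustration of integer data with a missing value, the following is a minimal sketch that assumes pandas and NumPy are installed. The variable names are illustrative only, and the reduction is run through a ``Series``, which is one of several possible ways to aggregate.

```python
import numpy as np
import pandas as pd

# Build a nullable integer array; np.nan marks the missing entry.
values = pd.array([1, 2, np.nan], dtype="Int64")

# isna() gives a boolean mask that is True at the missing position.
missing_mask = values.isna()

# A reduction run through a Series skips the missing value by default.
total = pd.Series(values).sum()

print(values.dtype)
print(missing_mask.tolist())
print(total)
```

The capitalized ``"Int64"`` alias selects the nullable extension dtype rather than NumPy's ``int64``.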

.. _api.arrays.interval:

Interval
~~~~~~~~

:class:`IntervalArray` is an array for storing data representing intervals.
The scalar type is an :class:`Interval`. These may be stored in a :class:`Series`
or as an :class:`IntervalIndex`. An :class:`IntervalArray` can be closed on the
``'left'`` side, the ``'right'`` side, ``'both'`` sides, or ``'neither'`` side.
See :ref:`indexing.intervallindex` for more.

.. currentmodule:: pandas

.. autosummary::
   :toctree: generated/

   IntervalArray
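
To make the ``closed`` option concrete, here is a small sketch that builds left-closed intervals from break points. It assumes the class is reachable as ``pd.arrays.IntervalArray``, the namespace this pull request adds it to. The break points are arbitrary example data.

```python
import pandas as pd

# Four break points give three left-closed intervals: [0, 1), [1, 2), [2, 3).
intervals = pd.arrays.IntervalArray.from_breaks([0, 1, 2, 3], closed="left")

# Indexing returns the scalar type, pd.Interval.
first = intervals[0]

# to_tuples() converts each interval to a (left, right) tuple.
as_tuples = intervals.to_tuples()

print(intervals.closed)
print(0 in first, 1 in first)
print(list(as_tuples))
```

Because the intervals are closed on the left only, the left endpoint ``0`` belongs to the first interval while the right endpoint ``1`` does not.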

.. _api.arrays.period:

Period
~~~~~~

Periods represent a span of time (e.g. the year 2000, or the hour from 11:00 to 12:00
on January 1st, 2000). A sequence of :class:`Period` objects with a common frequency
can be stored in a :class:`PeriodArray`. See :ref:`timeseries.periods` for more.

.. autosummary::
   :toctree: generated/

   arrays.PeriodArray
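
A short sketch of collecting periods with a common frequency follows. It relies on ``pd.array`` inferring the array type from a list of ``Period`` scalars. The monthly frequency and the dates are example values only.

```python
import pandas as pd

# Two monthly periods that share the same frequency.
months = pd.array(
    [pd.Period("2000-01", freq="M"), pd.Period("2000-02", freq="M")]
)

# Each element is a span of time, so it has a start and an end.
first_start = months[0].start_time
first_end = months[0].end_time

print(type(months).__name__)
print(months.dtype)
print(first_start, first_end)
```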

Sparse
~~~~~~

Sparse data may be stored and operated on more efficiently when there is a single value
that's often repeated. :class:`SparseArray` is a container for this type of data.
See :ref:`sparse` for more.

.. _api.arrays.sparse:

.. autosummary::
   :toctree: generated/

   SparseArray
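
The storage saving for an often-repeated value can be sketched as follows. The example assumes the class is reachable as ``pd.arrays.SparseArray`` and uses ``0`` as the repeated fill value. The data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Five values, four of which equal the fill value 0.
sparse = pd.arrays.SparseArray([0, 0, 1, 0, 0], fill_value=0)

# Only the values that differ from the fill value are physically stored.
stored = sparse.sp_values

# density is the fraction of entries that are stored explicitly.
density = sparse.density

# Converting back to a dense NumPy array recovers the original data.
dense = np.asarray(sparse)

print(stored.tolist(), density, dense.tolist())
```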

Plotting
~~~~~~~~

@@ -1701,6 +1773,7 @@ IntervalIndex Components
IntervalIndex.get_indexer
IntervalIndex.set_closed
IntervalIndex.overlaps
IntervalArray.to_tuples


.. _api.multiindex:
@@ -1933,6 +2006,8 @@ Methods
PeriodIndex.strftime
PeriodIndex.to_timestamp

.. _api.scalars:

Scalars
-------

33 changes: 33 additions & 0 deletions doc/source/whatsnew/v0.24.0.rst
@@ -161,6 +161,39 @@ Reduction and groupby operations such as 'sum' work.

The Integer NA support currently uses the capitalized dtype version, e.g. ``Int8`` as compared to the traditional ``int8``. This may be changed at a future date.

.. _whatsnew_0240.enhancements.array:

A new top-level function :func:`array` has been added for creating 1-dimensional arrays (:issue:`22860`).
This can be used to create any :ref:`extension array <extending.extension-types>`, including
extension arrays registered by :ref:`3rd party libraries <ecosystem.extensions>`.
Review comment (contributor): don't you now have a ref in basics where this should point?


.. ipython:: python

   pd.array([1, 2, np.nan], dtype='Int64')
   pd.array(['a', 'b', 'c'], dtype='category')

Passing data for which there isn't a dedicated extension type (e.g. float, integer, etc.)
will return a new :class:`arrays.PandasArray`, which is just a thin (no-copy)
wrapper around a :class:`numpy.ndarray` that satisfies the extension array interface.

.. ipython:: python

   pd.array([1, 2, 3])

On its own, an :class:`arrays.PandasArray` isn't a very useful object.
But if you need to write low-level code that works generically for any
:class:`~pandas.api.extensions.ExtensionArray`, :class:`arrays.PandasArray`
satisfies that need.
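
What "working generically" might look like is sketched below. ``count_missing`` is a hypothetical helper, not part of pandas, and it only calls ``isna()``, which every extension array provides. The NumPy-backed array is requested by passing a NumPy dtype, which is an assumption about how to obtain the wrapper. Its class name is deliberately not checked.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray


def count_missing(values: ExtensionArray) -> int:
    """Hypothetical helper: count missing entries via the shared interface."""
    return int(values.isna().sum())


nullable_ints = pd.array([1, None, 3], dtype="Int64")
categories = pd.array(["a", None, "a"], dtype="category")
# Passing a NumPy dtype asks pd.array for the NumPy-backed wrapper.
floats = pd.array([1.0, np.nan, np.nan], dtype=np.dtype("float64"))

arrays = (nullable_ints, categories, floats)
counts = [count_missing(arr) for arr in arrays]
print(counts)
```

The helper never inspects the concrete array class, so the same function serves all three inputs.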

Notice that by default, if no ``dtype`` is specified, the dtype of the returned
array is inferred from the data. In particular, note that the first example of
``[1, 2, np.nan]`` would have returned a floating-point array had
``dtype='Int64'`` not been passed, since ``NaN`` is a float.

.. ipython:: python

   pd.array([1, 2, np.nan])
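
One way to check what was inferred is to compare the same data with and without an explicit ``dtype``, as in this sketch. Which dtype the first call infers depends on the installed pandas version, so the sketch prints it instead of assuming a value.

```python
import numpy as np
import pandas as pd

data = [1, 2, np.nan]

# dtype left to pandas: the result depends on the inference rules.
inferred = pd.array(data)

# dtype chosen by the caller: always the nullable integer type.
explicit = pd.array(data, dtype="Int64")

print("inferred:", inferred.dtype)
print("explicit:", explicit.dtype)
```

Passing ``dtype`` explicitly is the safer choice when downstream code depends on a particular array type.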

.. _whatsnew_0240.enhancements.read_html:

``read_html`` Enhancements
12 changes: 10 additions & 2 deletions pandas/arrays/__init__.py
@@ -3,9 +3,17 @@

See :ref:`extending.extension-types` for more.
"""
from pandas.core.arrays import PandasArray
from pandas.core.arrays import (
    IntervalArray, PeriodArray, Categorical, SparseArray, IntegerArray,
    PandasArray
)


__all__ = [
    'PandasArray'
    'Categorical',
    'IntegerArray',
    'IntervalArray',
    'PandasArray',
    'PeriodArray',
    'SparseArray',
]
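
A quick way to confirm that these classes are importable from the public ``pandas.arrays`` namespace is sketched below. ``PandasArray`` is left out of the check on the assumption that its name may differ between pandas releases. The list of names is taken from the ``__all__`` shown in this diff.

```python
import pandas as pd

# Names exported by pandas/arrays/__init__.py in this change (minus PandasArray).
expected = [
    "Categorical",
    "IntegerArray",
    "IntervalArray",
    "PeriodArray",
    "SparseArray",
]

# Map each name to whether the public namespace exposes it.
available = {name: hasattr(pd.arrays, name) for name in expected}
print(available)
```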
19 changes: 18 additions & 1 deletion pandas/core/api.py
@@ -4,9 +4,26 @@

import numpy as np

from pandas.core.arrays import IntervalArray
from pandas.core.arrays.integer import (
    Int8Dtype,
    Int16Dtype,
    Int32Dtype,
    Int64Dtype,
    UInt8Dtype,
    UInt16Dtype,
    UInt32Dtype,
    UInt64Dtype,
)
from pandas.core.algorithms import factorize, unique, value_counts
from pandas.core.dtypes.missing import isna, isnull, notna, notnull
from pandas.core.arrays import Categorical
from pandas.core.dtypes.dtypes import (
    CategoricalDtype,
    PeriodDtype,
    IntervalDtype,
    DatetimeTZDtype,
)
from pandas.core.arrays import Categorical, array
from pandas.core.groupby import Grouper
from pandas.io.formats.format import set_eng_float_format
from pandas.core.index import (Index, CategoricalIndex, Int64Index,
1 change: 1 addition & 0 deletions pandas/core/arrays/__init__.py
@@ -1,3 +1,4 @@
from .array_ import array # noqa
from .base import (ExtensionArray, # noqa
                   ExtensionOpsMixin,
                   ExtensionScalarOpsMixin)