ENH Add MultiIndex.from_product convenience function #6055

Merged
13 changes: 11 additions & 2 deletions doc/source/indexing.rst
@@ -1539,8 +1539,9 @@ The ``MultiIndex`` object is the hierarchical analogue of the standard
``Index`` object which typically stores the axis labels in pandas objects. You
can think of ``MultiIndex`` as an array of tuples where each tuple is unique. A
``MultiIndex`` can be created from a list of arrays (using
-``MultiIndex.from_arrays``) or an array of tuples (using
-``MultiIndex.from_tuples``).
+``MultiIndex.from_arrays``), an array of tuples (using
+``MultiIndex.from_tuples``), or a crossed set of iterables (using
+``MultiIndex.from_product``).

.. ipython:: python

@@ -1552,6 +1553,14 @@ can think of ``MultiIndex`` as an array of tuples where each tuple is unique. A
   s = Series(randn(8), index=index)
   s

When you want every pairing of the elements in two iterables, it can be easier
to use the ``MultiIndex.from_product`` function:

.. ipython:: python

   iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
   MultiIndex.from_product(iterables, names=['first', 'second'])
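The pairing described above can be cross-checked against the standard library. The following is a minimal sketch, assuming a pandas installation that provides `MultiIndex.from_product` and is imported as `pd` (the `pd` alias and the variable names `via_product`, `via_tuples`, and `same_entries` are illustrative, not part of this PR). It builds the same index by hand with `itertools.product` and `MultiIndex.from_tuples` and compares the entries.

```python
import itertools

import pandas as pd

iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
names = ['first', 'second']

# Index built by the convenience constructor added in this PR.
via_product = pd.MultiIndex.from_product(iterables, names=names)

# The same pairs spelled out by hand with the standard library.
pairs = list(itertools.product(*iterables))
via_tuples = pd.MultiIndex.from_tuples(pairs, names=names)

# Both indexes should contain the same tuples in the same order.
same_entries = list(via_product) == list(via_tuples)
```

Under these assumptions `from_product` saves the intermediate list of tuples; the resulting entries should be identical.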

As a convenience, you can pass a list of arrays directly into Series or
DataFrame to construct a MultiIndex automatically:

10 changes: 10 additions & 0 deletions doc/source/v0.13.1.txt
@@ -73,6 +73,16 @@ Enhancements
improves parsing perf in many cases. Thanks to @lexual for suggesting and @danbirken
for rapidly implementing. (:issue:`5490`, :issue:`6021`)

- ``MultiIndex.from_product`` convenience function for creating a MultiIndex from
  the cartesian product of a set of iterables (:issue:`6055`):

  .. ipython:: python

     shades = ['light', 'dark']
     colors = ['red', 'green', 'blue']

     MultiIndex.from_product([shades, colors], names=['shade', 'color'])

- The ``ArrayFormatter`` for ``datetime`` and ``timedelta64`` now intelligently
limit precision based on the values in the array (:issue:`3401`)

43 changes: 43 additions & 0 deletions pandas/core/index.py
@@ -2491,6 +2491,8 @@ def from_arrays(cls, arrays, sortorder=None, names=None):
        See Also
        --------
        MultiIndex.from_tuples : Convert list of tuples to MultiIndex
        MultiIndex.from_product : Make a MultiIndex from cartesian product
                                  of iterables
        """
        from pandas.core.categorical import Categorical

@@ -2534,6 +2536,8 @@ def from_tuples(cls, tuples, sortorder=None, names=None):
        See Also
        --------
        MultiIndex.from_arrays : Convert list of arrays to MultiIndex
        MultiIndex.from_product : Make a MultiIndex from cartesian product
                                  of iterables
        """
        if len(tuples) == 0:
            # I think this is right? Not quite sure...
@@ -2552,6 +2556,45 @@ def from_tuples(cls, tuples, sortorder=None, names=None):
        return MultiIndex.from_arrays(arrays, sortorder=sortorder,
                                      names=names)

    @classmethod
    def from_product(cls, iterables, sortorder=None, names=None):
        """
        Make a MultiIndex from the cartesian product of multiple iterables

        Parameters
        ----------
        iterables : list / sequence of iterables
            Each iterable has unique labels for each level of the index.
        sortorder : int or None
            Level of sortedness (must be lexicographically sorted by that
            level).
        names : list / sequence of strings or None
            Names for the levels in the index.

        Returns
        -------
        index : MultiIndex

        Examples
        --------
        >>> numbers = [0, 1, 2]
        >>> colors = [u'green', u'purple']
        >>> MultiIndex.from_product([numbers, colors],
        ...                         names=['number', 'color'])
        MultiIndex(levels=[[0, 1, 2], [u'green', u'purple']],
                   labels=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]],
                   names=[u'number', u'color'])

        See Also
        --------
        MultiIndex.from_arrays : Convert list of arrays to MultiIndex
        MultiIndex.from_tuples : Convert list of tuples to MultiIndex
        """
        from pandas.tools.util import cartesian_product
        product = cartesian_product(iterables)
        return MultiIndex.from_arrays(product, sortorder=sortorder,
                                      names=names)

    @property
    def nlevels(self):
        return len(self.levels)
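The ``labels`` layout in the `from_product` docstring example can be reproduced with the standard library alone. This is a minimal sketch, not the pandas implementation: `product_labels` is a hypothetical helper introduced for illustration, and it assumes every input iterable is non-empty, already sorted, and free of duplicates (so each input can stand in directly for a level).

```python
import itertools


def product_labels(iterables):
    """Return (levels, labels) for the Cartesian product of ``iterables``.

    Hypothetical helper for illustration only; it is not a pandas function.
    Assumes each iterable is non-empty, sorted, and has unique values.
    """
    levels = [list(it) for it in iterables]
    # Each element of the product is a tuple of positions, one per level.
    positions = itertools.product(*[range(len(level)) for level in levels])
    # Transpose: one list of positions per level instead of one tuple per row.
    labels = [list(column) for column in zip(*positions)]
    return levels, labels


levels, labels = product_labels([[0, 1, 2], ['green', 'purple']])
```

With the docstring's inputs, `labels` comes out as `[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]]`: the first level changes slowest and the last level cycles fastest.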
14 changes: 14 additions & 0 deletions pandas/tests/test_index.py
@@ -1561,6 +1561,20 @@ def test_from_arrays(self):
        result = MultiIndex.from_arrays(arrays)
        self.assertEquals(list(result), list(self.index))

    def test_from_product(self):
        first = ['foo', 'bar', 'buz']
        second = ['a', 'b', 'c']
        names = ['first', 'second']
        result = MultiIndex.from_product([first, second], names=names)

        tuples = [('foo', 'a'), ('foo', 'b'), ('foo', 'c'),
                  ('bar', 'a'), ('bar', 'b'), ('bar', 'c'),
                  ('buz', 'a'), ('buz', 'b'), ('buz', 'c')]
        expected = MultiIndex.from_tuples(tuples, names=names)

        assert_array_equal(result, expected)
        self.assertEquals(result.names, names)

    def test_append(self):
        result = self.index[:3].append(self.index[3:])
        self.assert_(result.equals(self.index))
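Beyond the two-level case covered by `test_from_product`, two properties seem worth checking by hand: the length of the result and the order in which levels vary. The sketch below is not part of the PR's test suite; it assumes pandas is importable as `pd`, and the third iterable `third` and the names `expected_len` and `head` are made up for illustration.

```python
import pandas as pd

first = ['foo', 'bar', 'buz']
second = ['a', 'b', 'c']
third = [1, 2]  # extra level, not in the original test

result = pd.MultiIndex.from_product([first, second, third])

# One entry per combination, so the length is the product of the input lengths.
expected_len = len(first) * len(second) * len(third)

# The last iterable cycles fastest; the first one changes slowest.
head = list(result)[:3]
```

If the constructor behaves as in the two-level test, `len(result)` should equal `expected_len` and `head` should start with the three tuples for `'foo'` paired with `'a'` and `'b'`.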