BUG: MultiIndex .loc fails with all numpy arrays #19686

Closed
innominate227 opened this Issue Feb 13, 2018 · 3 comments

innominate227 commented Feb 13, 2018

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

a = [10,20,30]
b = [1,2,3]
index = pd.MultiIndex.from_product([a,b])
df = pd.DataFrame(np.arange(len(index)), index=index, columns=['Data'])

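# slice1-3 mix lists and arrays in the key tuple and succeed.
slice1 = df.loc[([10,20],            [2,3]),           'Data']
slice2 = df.loc[(np.array([10,20]),  [2,3]),           'Data']
slice3 = df.loc[([10,20],            np.array([2,3])), 'Data']
# slice4 uses numpy arrays for every level and raises TypeError.
slice4 = df.loc[(np.array([10,20]),  np.array([2,3])), 'Data']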

Problem description

slice4 in the sample fails with TypeError: '(array([10, 20]), array([2, 3]))' is an invalid key.
slice1-3 all work as expected.
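
Since the all-list key (slice1) works, one possible workaround is to convert the arrays to lists before indexing. This is only a sketch under that assumption, not a fix for the underlying bug; the variable names `first`, `second`, and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

a = [10, 20, 30]
b = [1, 2, 3]
index = pd.MultiIndex.from_product([a, b])
df = pd.DataFrame(np.arange(len(index)), index=index, columns=["Data"])

# Hypothetical inputs that arrive as numpy arrays.
first = np.array([10, 20])
second = np.array([2, 3])

# Convert each array to a plain list so the key has the same form as slice1,
# which the report says works.
result = df.loc[(first.tolist(), second.tolist()), "Data"]
print(result)
```

The cost is one extra conversion per level, which should be negligible for keys of this size.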

Expected Output

I would expect slicing with NumPy arrays to work the same as slicing with lists.
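
The expectation could be written as a small check along these lines. It is a rough sketch, not an actual pandas test; the helper name `array_key_matches_list_key` is invented here. It reports whether the all-array key returns the same result as the all-list key, and treats the TypeError described above as a mismatch.

```python
import numpy as np
import pandas as pd


def array_key_matches_list_key(frame):
    """Return True if an all-numpy-array key selects the same rows as lists."""
    expected = frame.loc[([10, 20], [2, 3]), "Data"]
    try:
        actual = frame.loc[(np.array([10, 20]), np.array([2, 3])), "Data"]
    except TypeError:
        # The failure reported in this issue.
        return False
    return bool(actual.equals(expected))


index = pd.MultiIndex.from_product([[10, 20, 30], [1, 2, 3]])
df = pd.DataFrame(np.arange(len(index)), index=index, columns=["Data"])

check = array_key_matches_list_key(df)
print(check)
```

On a pandas version with this bug the function returns False; once array keys behave like list keys, it should return True.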

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 45 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0
pytest: 2.9.2
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.2
openpyxl: 2.3.2
xlrd: None
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: 0.9999999
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: 0.1.2
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jorisvandenbossche jorisvandenbossche changed the title from MultiIndex .loc fails with all numpy arrays to BUG: MultiIndex .loc fails with all numpy arrays Feb 15, 2018

@jorisvandenbossche

Member

jorisvandenbossche commented Feb 15, 2018

@innominate227 Thanks for the report! That indeed looks like a bug (I at least cannot think of a reason why we would do it like this)

@PoppyBagel

Contributor

PoppyBagel commented Feb 16, 2018

I'd like to work on it if nobody minds. May I?

@jorisvandenbossche

Member

jorisvandenbossche commented Feb 16, 2018

Go ahead! If you have any questions, feel free to ask here or on the gitter channel (https://gitter.im/pydata/pandas).
(Note: it's a bit hard to predict without looking into the details, but this might not be the easiest issue.)

@jreback jreback added this to the 0.23.0 milestone Feb 19, 2018

@jreback jreback added the Compat label Feb 19, 2018
