DataFrame.ix[idx, :] = value sets wrong values when idx is a MultiIndex and DataFrame.columns is also a MultiIndex #11372
Comments
|
Indexing with a specific set of columns also gives the error:
Code sample:
import pandas as pd
import numpy as np
np.random.seed(1)
from itertools import product
from pandas.util.testing import assert_frame_equal
pd.show_versions()
idx = pd.MultiIndex.from_tuples(
    list(
        product(['A', 'B', 'C'],
                pd.date_range('2015-01-01', '2015-04-01', freq='MS'))
    )
)
cols = pd.MultiIndex.from_tuples(
    list(
        product(['foo', 'bar'],
                pd.date_range('2016-01-01', '2016-02-01', freq='MS'))
    )
)
# if cols = ['foo', 'bar', 'baz', 'quux'], there is no error.
test = pd.DataFrame(np.random.random((12, 4)), index=idx, columns=cols)
subidx = pd.MultiIndex.from_tuples(
    [('A', pd.Timestamp('2015-01-01')), ('A', pd.Timestamp('2015-02-01'))]
)
subcols = pd.MultiIndex.from_tuples(
    [('foo', pd.Timestamp('2016-01-01')), ('foo', pd.Timestamp('2016-02-01'))]
)
vals = pd.DataFrame(np.random.random((2, 2)), index=subidx, columns=subcols)
test.ix[subidx, subcols] = vals
print test.ix[subidx, subcols]
print vals
assert_frame_equal(test.ix[subidx, subcols], vals)
0.17.0
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.17.0
nose: 1.3.7
pip: 7.1.0
setuptools: 18.0.1
Cython: 0.22
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.2.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.1
pytz: 2015.4
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.7
pymysql: None
psycopg2: None
                     foo
              2016-01-01 2016-02-01
A 2015-01-01    0.287775   0.130029
  2015-02-01    0.287775   0.130029
                     foo
              2016-01-01 2016-02-01
A 2015-01-01    0.287775   0.130029
  2015-02-01    0.019367   0.678836
Traceback (most recent call last):
  File "c:\dev\code\sandbox\multiindex.py", line 48, in <module>
    assert_frame_equal(test.ix[subidx, subcols], vals)
  File "c:\python\envs\pd017\lib\site-packages\pandas\util\testing.py", line 1028, in assert_frame_equal
    obj='DataFrame.iloc[:, {0}]'.format(i))
  File "c:\python\envs\pd017\lib\site-packages\pandas\util\testing.py", line 925, in assert_series_equal
    check_less_precise, obj='{0}'.format(obj))
  File "pandas\src\testing.pyx", line 58, in pandas._testing.assert_almost_equal (pandas\src\testing.c:3809)
  File "pandas\src\testing.pyx", line 147, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2685)
  File "c:\python\envs\pd017\lib\site-packages\pandas\util\testing.py", line 798, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame.iloc[:, 0] are different
DataFrame.iloc[:, 0] values are different (50.0 %)
[left]:  [0.287775338586, 0.287775338586]
[right]: [0.287775338586, 0.0193669578703] |
|
(Deleted: I misread something, and my previous suggestion was not really a fix.) |
|
Hmm, surprised that broke. There is not much testing on that sub-section, actually. The issue is here: https://github.com/pydata/pandas/blob/master/pandas/core/indexing.py#L450
So we could probably pass in an additional parameter which would determine this. |
jreback added the Bug, Indexing, and MultiIndex labels on Oct 19, 2015
jreback added this to the 0.17.1 milestone on Oct 19, 2015
jreback changed the title from `DataFrame.ix[idx, :] = value` sets wrong values when `idx` is a `MultiIndex` and `DataFrame.columns` is also a `MultiIndex` to DataFrame.ix[idx, :] = value sets wrong values when idx is a MultiIndex and DataFrame.columns is also a `MultiIndex` on Oct 19, 2015
jreback added the Difficulty Intermediate and Effort Medium labels on Oct 19, 2015
jreback changed the title from DataFrame.ix[idx, :] = value sets wrong values when idx is a MultiIndex and DataFrame.columns is also a `MultiIndex` to DataFrame.ix[idx, :] = value sets wrong values when idx is a MultiIndex and DataFrame.columns is also a MultiIndex on Oct 19, 2015
|
Since I've already got two test cases, I'd be happy to have a go if I can be pointed in the right direction. I'll start by looking at the history of |
|
The pointer above is to the relevant issues. The way to do this is to set up the test cases and the expected results (in test_indexing); they should fail before a fix. Then you can step through to see where to put a fix and go from there. |
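A minimal sketch of what such a test case with expected results might look like. It is an illustration, not the test that was eventually added to test_indexing. The function name `check_multiindex_setitem` is hypothetical, and the sketch uses `.loc` instead of `.ix` on the assumption that it runs against a recent pandas (where `.ix` has been removed). The sub-indexes are taken as slices of the full indexes, which select the same labels as the `subidx` and `subcols` in the report.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal


def check_multiindex_setitem():
    """Hypothetical regression check: assign a block, then read the same block back."""
    rng = np.random.RandomState(1)
    idx = pd.MultiIndex.from_product(
        [["A", "B", "C"], pd.date_range("2015-01-01", "2015-04-01", freq="MS")]
    )
    cols = pd.MultiIndex.from_product(
        [["foo", "bar"], pd.date_range("2016-01-01", "2016-02-01", freq="MS")]
    )
    df = pd.DataFrame(rng.random_sample((12, 4)), index=idx, columns=cols)

    # First two rows: ('A', 2015-01-01) and ('A', 2015-02-01).
    subidx = idx[:2]
    # First two columns: ('foo', 2016-01-01) and ('foo', 2016-02-01).
    subcols = cols[:2]

    # Case 1: assign to an explicit subset of columns.
    expected = pd.DataFrame(rng.random_sample((2, 2)), index=subidx, columns=subcols)
    df.loc[subidx, subcols] = expected
    assert_frame_equal(df.loc[subidx, subcols], expected)

    # Case 2: assign to all columns, as in the issue title.
    expected_all = pd.DataFrame(rng.random_sample((2, 4)), index=subidx, columns=cols)
    df.loc[subidx, :] = expected_all
    assert_frame_equal(df.loc[subidx, :], expected_all)

    return df, expected_all


if __name__ == "__main__":
    check_multiindex_setitem()
    print("both assignments round-trip")
```

On a build affected by the bug, the first `assert_frame_equal` would be expected to fail in the same way as the traceback in the report; on a fixed build both checks should pass.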
|
OK, here is my first attempt: pydata#11400
I added a test for pydata#5206 as well, to check that I hadn't broken that existing functionality.
C:\dev\code\opensource\pandas-rekcahpassyla [multiindex_setitem +2 ~0 -0 !]> C:\python\envs\pandasdev\scripts\nosetests .\pandas\tests\test_indexing.py
...........................................................................................................................................
----------------------------------------------------------------------
Ran 139 tests in 68.094s
OK
Attempted to run the whole test suite, but |
rekcahpassyla commented Oct 19, 2015
This code is broken in 0.17.0 but not in 0.15.2:
0.17.0
0.15.2