assignment to slice #5641

Closed
aharoon123 opened this issue Dec 4, 2013 · 6 comments · Fixed by #6301
Labels
Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex Usage Question
Milestone

Comments

@aharoon123

In pandas version 0.8.1, this used to work:

import pandas as pd
import numpy as np
from pandas.tseries.offsets import *
from datetime import date


M_ASSETS=50
K_FACTORS=3

dates = pd.bdate_range(date(2013, 1, 1), date(2013, 1, 10))
factor_names=['F'+str(y) for y in xrange(1, K_FACTORS+1)]
asset_names = ['A'+str(y) for y in xrange(1, M_ASSETS+1)]

frames=[]
for dt in dates: 
    f=pd.DataFrame(np.random.randn(M_ASSETS, K_FACTORS),
                   index=pd.MultiIndex.from_tuples(zip([dt]*M_ASSETS, asset_names)),
                   columns=factor_names)
    frames.append(f)


f=pd.concat(frames)

print "DataFrame for asset1\n", f.ix[:, "A1"]
print "DataFrame for factor1\n", f["F1"].unstack()
print "DataFrame for first date\n", f.ix[f.index.get_level_values(0)[0], :]

In pandas 0.11, however, this same code gives the following error:


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-1-09af79b5dd36> in <module>()
     22 f=pd.concat(frames)
     23 
---> 24 print "DataFrame for asset1\n", f.ix[:, "A1"]
     25 print "DataFrame for factor1\n", f["F1"].unstack()
     26 print "DataFrame for first date\n", f.ix[f.index.get_level_values(0)[0], :]

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/indexing.pyc in __getitem__(self, key)
     32                 pass
     33 
---> 34             return self._getitem_tuple(key)
     35         else:
     36             return self._getitem_axis(key, axis=0)

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/indexing.pyc in _getitem_tuple(self, tup)
    222                 continue
    223 
--> 224             retval = retval.ix._getitem_axis(key, axis=i)
    225 
    226         return retval

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/indexing.pyc in _getitem_axis(self, key, axis)
    362             if com.is_integer(key) and not _is_integer_index(labels):
    363                 return self._get_loc(key, axis=axis)
--> 364             return self._get_label(lab, axis=axis)
    365 
    366     def _getitem_iterable(self, key, axis=0):

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/indexing.pyc in _get_label(self, label, axis)
     46             return self.obj.xs(label, axis=axis, copy=False)
     47         except Exception:
---> 48             return self.obj.xs(label, axis=axis, copy=True)
     49 
     50     def _get_loc(self, key, axis=0):

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/frame.pyc in xs(self, key, axis, level, copy)
   2285 
   2286         if axis == 1:
-> 2287             data = self[key]
   2288             if copy:
   2289                 data = data.copy()

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/frame.pyc in __getitem__(self, key)
   1969                 raise ValueError('Cannot index using non-boolean DataFrame')
   1970         else:
-> 1971             return self._get_item_cache(key)
   1972 
   1973     def _getitem_array(self, key):

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/generic.pyc in _get_item_cache(self, item)
    532             return cache[item]
    533         except Exception:
--> 534             values = self._data.get(item)
    535             res = self._box_item_values(item, values)
    536             cache[item] = res

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/internals.pyc in get(self, item)
    882 
    883     def get(self, item):
--> 884         _, block = self._find_block(item)
    885         return block.get(item)
    886 

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/internals.pyc in _find_block(self, item)
   1014 
   1015     def _find_block(self, item):
-> 1016         self._check_have(item)
   1017         for i, block in enumerate(self.blocks):
   1018             if item in block:

/mnt/bos-netapp01/epd/7.3-2_pandas0.10/lib/python2.7/site-packages/pandas/core/internals.pyc in _check_have(self, item)
   1021     def _check_have(self, item):
   1022         if item not in self.items:
-> 1023             raise KeyError('no item named %s' % com.pprint_thing(item))
   1024 
   1025     def reindex_axis(self, new_axis, method=None, axis=0, copy=True):

KeyError: u'no item named A1'

DataFrame for asset1

Now, I understand that the slicing behavior changed, so if I use:

print "DataFrame for asset1\n", f.xs("A1", level=1)

it works:


DataFrame for asset1
                  F1        F2        F3
2013-01-01 -0.327087 -0.898145 -1.726459
2013-01-02  1.276976 -0.410213  1.124507
2013-01-03  0.575263  1.132212  0.172895
2013-01-04  0.704949 -0.681410  0.384713
2013-01-07 -0.033519 -0.598248  0.629885
2013-01-08 -0.065769  0.245913 -0.941165
2013-01-09  0.337986 -0.715003 -0.093851
2013-01-10  0.303433 -0.870952 -1.397974

Questions:

  1. Why does ix no longer work for this?
  2. How do I assign to a slice? This does not work:

view = f.xs("A1", level=1, copy = False)
view = 10
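
For readers following along: the second line above rebinds the Python name `view` to the integer 10; it never writes into the frame. A minimal sketch of that behavior, assuming a recent pandas on Python 3 and a small stand-in frame (the sizes and the level names `date` and `asset` are illustrative, not from the original report):

```python
import numpy as np
import pandas as pd

# Small stand-in for the frame in the report: (date, asset) rows, factor columns.
dates = pd.bdate_range("2013-01-01", "2013-01-04")
assets = ["A1", "A2", "A3"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
f = pd.DataFrame(np.zeros((len(index), 2)), index=index, columns=["F1", "F2"])

view = f.xs("A1", level="asset")  # cross-section: one row per date
view = 10                         # rebinds the name `view`; f is not modified

# The frame still holds its original zeros for asset A1.
unchanged = bool((f.xs("A1", level="asset") == 0).all().all())
print(unchanged)
```

Writing into the frame needs an indexing assignment on `f` itself, such as `f.loc[...] = value`, rather than an assignment to a name that happens to hold a cross-section.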
@jreback
Contributor

jreback commented Dec 4, 2013

  1. Your use of ix is not clear: you want a cross-section, but you are selecting on a column. If it worked that way prior to current versions, then that was a bug.
  2. You cannot directly assign to a slice. There are many ways to do this; the easiest would be to swap the levels, then assign and swap back.

@aharoon123
Author

Thank you for the quick response.

  1. I am not selecting on a column, but on the second level of the 2-level MultiIndex. I am not sure why I should expect that type of functionality to be a bug. It seems like an elegant way to slice the data, and it works in 0.8.1.

  2. Can you show me the different ways to do this, please? Is this the swaplevel version?

f = f.swaplevel(0, 1)
f.ix["A1"] = 10
f = f.swaplevel(0, 1)

That looks like a workaround to me. Are there more elegant ways to do this?
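
The swap, assign, swap-back sequence can be sketched against a recent pandas as follows. This is an illustration under assumptions: it uses `.loc` in place of `.ix`, a small stand-in frame with made-up level names, and an extra `sort_index()` after each swap so that the partial-key lookup runs on a sorted index.

```python
import numpy as np
import pandas as pd

# Stand-in frame: (date, asset) rows, two factor columns, all zeros.
dates = pd.bdate_range("2013-01-01", "2013-01-04")
assets = ["A1", "A2", "A3"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
f = pd.DataFrame(np.zeros((len(index), 2)), index=index, columns=["F1", "F2"])

f = f.swaplevel(0, 1).sort_index()  # asset becomes the outer level
f.loc["A1"] = 10                    # partial key on the outer level selects every A1 row
f = f.swaplevel(0, 1).sort_index()  # restore the (date, asset) order

a1 = f.xs("A1", level="asset")
a2 = f.xs("A2", level="asset")
print(a1)
```

Only the A1 rows are overwritten; the rows for the other assets keep their original values.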

@jreback
Contributor

jreback commented Dec 4, 2013

  1. I know you are not selecting on the column; that's why your use of the syntax is in effect a bug. 0.8.1 is quite an old version (and there have been quite a few API changes and bug fixes since then), so you should consider upgrading.
  2. This looks fine. It is quite non-trivial to do a multi-level assignment to the non-zero level. You are welcome to try to come up with a nice syntax for that if you would like.

@ghost

ghost commented Jan 27, 2014

@jreback, I'm sure I knew how to do this once with loc. What's the proper way?
There's the usual __getitem__ ambiguity of foo[bar,baz] vs. foo[(bar,baz)],
and the usual syntactical wart of not being able to use the : slice operator sugar
in a tuple.

But I believe the following is unambiguous on which slices apply to which axis:

f.loc[(slice(None),'A1'),:]

Did this ever work? It does seem like a reasonable thing to want to do.

@jreback
Contributor

jreback commented Jan 27, 2014

Your syntax is fine, along with f.loc[(None,'A'),:] (since you can't use the more useful ':' where None is),

but it is not implemented; I don't believe it ever was. Related: #3057
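
For later readers: this issue is marked as fixed by #6301, and recent pandas releases accept the `slice(None)` spelling for both selection and assignment. A minimal sketch, assuming a recent pandas on Python 3, a sorted index, and an illustrative stand-in frame (the level names are made up); `pd.IndexSlice` is shown as a way to write `:` inside the tuple:

```python
import numpy as np
import pandas as pd

# Stand-in frame: (date, asset) rows, two factor columns, all zeros.
dates = pd.bdate_range("2013-01-01", "2013-01-04")
assets = ["A1", "A2", "A3"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
f = pd.DataFrame(np.zeros((len(index), 2)), index=index, columns=["F1", "F2"])
f = f.sort_index()  # keep the MultiIndex lexsorted before slicing on it

# Select every date for asset A1; both index levels are kept in the result.
selected = f.loc[(slice(None), "A1"), :]

# Assign to the same slice in place.
f.loc[(slice(None), "A1"), :] = 10

# pd.IndexSlice allows the ':' sugar inside the row key.
idx = pd.IndexSlice
f.loc[idx[:, "A2"], :] = 20

print(f.xs("A1", level="asset"))
```

The outer tuple separates the row key from the column key, which avoids the `foo[bar, baz]` versus `foo[(bar, baz)]` ambiguity mentioned earlier in the thread.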

@aharoon123
Author

Thanks, this works.

