cut/qcut "retbins" broken in 0.15.0 #8589

Closed
rosnfeld opened this issue Oct 20, 2014 · 2 comments · Fixed by #8622

Labels
Bug Categorical Categorical Data Type

@rosnfeld
Contributor

Works in 0.14.1:

In [11]: s = pd.Series(np.random.randn(100))

In [12]: pd.cut(s, 10, retbins=True)
Out[12]: 
(   (0.993, 1.376]
  (-0.924, -0.541]
   (-0.158, 0.226]
  (-1.695, -1.308]
  (-0.924, -0.541]
  (-0.541, -0.158]
   (-0.158, 0.226]
  (-0.541, -0.158]
    (0.609, 0.993]
    (0.226, 0.609]
    (0.226, 0.609]
    (0.609, 0.993]
    (0.609, 0.993]
 ...
    (0.993, 1.376]
  (-0.541, -0.158]
    (0.226, 0.609]
    (0.226, 0.609]
   (-0.158, 0.226]
    (0.226, 0.609]
   (-0.158, 0.226]
  (-0.924, -0.541]
  (-1.695, -1.308]
    (0.993, 1.376]
   (-0.158, 0.226]
  (-0.924, -0.541]
    (0.993, 1.376]
 Levels (10): Index(['(-1.695, -1.308]', '(-1.308, -0.924]',
                     '(-0.924, -0.541]', '(-0.541, -0.158]',
                     '(-0.158, 0.226]', '(0.226, 0.609]',
                     '(0.609, 0.993]', '(0.993, 1.376]', '(1.376, 1.759]',
                     '(1.759, 2.143]'], dtype=object)
 Length: 100,
 array([-1.69479925, -1.30760325, -0.92424086, -0.54087848, -0.1575161 ,
         0.22584629,  0.60920867,  0.99257106,  1.37593344,  1.75929582,
         2.14265821]))

but does not work in 0.15.0/current master:

In [13]: pd.cut(s, 10, retbins=True)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-958e04d69222> in <module>()
----> 1 pd.cut(s, 10, retbins=True)

/home/andrew/git/pandas-rosnfeld/pandas/tools/tile.pyc in cut(x, bins, right, labels, retbins, precision, include_lowest)
    113                         include_lowest=include_lowest)
    114     if isinstance(x, Series):
--> 115         res = Series(res, index=x.index)
    116     return res
    117 

/home/andrew/git/pandas-rosnfeld/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
    210                                        raise_cast_failure=True)
    211 
--> 212                 data = SingleBlockManager(data, index, fastpath=True)
    213 
    214         generic.NDFrame.__init__(self, data, fastpath=True)

/home/andrew/git/pandas-rosnfeld/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
   3324             block = make_block(block,
   3325                                placement=slice(0, len(axis)),
-> 3326                                ndim=1, fastpath=True)
   3327 
   3328         self.blocks = [block]

/home/andrew/git/pandas-rosnfeld/pandas/core/internals.pyc in make_block(values, placement, klass, ndim, dtype, fastpath)
   2058 
   2059     return klass(values, ndim=ndim, fastpath=fastpath,
-> 2060                  placement=placement)
   2061 
   2062 

/home/andrew/git/pandas-rosnfeld/pandas/core/internals.pyc in __init__(self, values, ndim, fastpath, placement)
   1372         super(ObjectBlock, self).__init__(values, ndim=ndim,
   1373                                           fastpath=fastpath,
-> 1374                                           placement=placement)
   1375 
   1376     @property

/home/andrew/git/pandas-rosnfeld/pandas/core/internals.pyc in __init__(self, values, placement, ndim, fastpath)
     73             raise ValueError('Wrong number of items passed %d,'
     74                              ' placement implies %d' % (
---> 75                                  len(self.values), len(self.mgr_locs)))
     76 
     77     @property

ValueError: Wrong number of items passed 2, placement implies 100

The same goes for qcut. These work if retbins=False.
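One plausible reading of the error message above (an inference from the traceback, not something confirmed in this thread): with retbins=True the result is a 2-tuple of (binned values, bin edges), and wrapping that whole tuple in a Series against a 100-element index would explain "items passed 2, placement implies 100". A minimal sketch of that reading, using ndarray input, which the thread reports as working; the variable names are illustrative only:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(100))

# With retbins=True, cut returns two things: the binned values and the bin edges.
# ndarray input is used here because the thread reports that it works.
binned, bins = pd.cut(s.values, 10, retbins=True)

# The tuple has length 2, while the index has length 100.
# Those are the two numbers quoted in the ValueError.
print(len((binned, bins)), len(s.index))

# Only the binned values (not the whole tuple) line up with the original index.
wrapped = pd.Series(binned, index=s.index)
print(len(wrapped), len(bins))
```

Until a fix lands, passing `s.values` and re-wrapping the first element as above may serve as a workaround, assuming the index only needs to be restored on the binned values.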

I found this while testing out the new release; I wanted to take a look at the Categorical type. I should have done this with the RC and caught it sooner...

@jreback added the Bug and Categorical (Categorical Data Type) labels on Oct 20, 2014
@jreback added this to the 0.15.1 milestone on Oct 20, 2014
@jreback
Contributor

jreback commented Oct 20, 2014

Hmm, this probably wasn't updated (and isn't well tested).

Would you like to submit a pull request to fix it?

@rosnfeld
Contributor Author

I'll take a look at it. Interestingly, pandas/tools/tests/test_tile.py covers many similar cases, but both it and the API docs always use numpy.ndarray inputs (which work), not Series.
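A test along the following lines could close the gap described above by exercising Series input alongside ndarray input. This is only a sketch: the helper name `check_retbins_with_series` is hypothetical, it is not the test added by the eventual fix, and it assumes a pandas version in which cut/qcut return a categorical Series for Series input.

```python
import numpy as np
import pandas as pd


def check_retbins_with_series(func, bins_arg):
    """Hypothetical helper: compare Series input against ndarray input."""
    s = pd.Series(np.arange(100, dtype=float))

    # Series input: expect a (Series, bin edges) pair rather than an error.
    result, bins = func(s, bins_arg, retbins=True)
    # ndarray input: the case the thread reports as already working.
    expected, expected_bins = func(s.values, bins_arg, retbins=True)

    assert isinstance(result, pd.Series)
    assert result.index.equals(s.index)
    np.testing.assert_allclose(bins, expected_bins)
    # Each value should land in the same bin regardless of the input type.
    assert np.array_equal(result.cat.codes.to_numpy(), expected.codes)


check_retbins_with_series(pd.cut, 10)
check_retbins_with_series(pd.qcut, 4)
```

Running both cut and qcut through the same helper reflects the report that both functions fail in the same way.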
