quantile throws error if not convertible to float #2625

Closed
hayd opened this Issue · 2 comments

3 participants

@hayd
Collaborator

Calling quantile on a DataFrame (here through a groupby) that contains string entries which cannot be converted to float raises a ValueError. Should this behave like mean and ignore those non-numeric columns instead? (Taken from this StackOverflow question.)

In [1]: df = DataFrame({'col1':['A','A','B','B'], 'col2':[1,2,3,4]})

In [2]: df
Out[2]:
  col1  col2
0    A     1
1    A     2
2    B     3
3    B     4


In [3]: g = df.groupby('col1')

In [4]: g.mean()
Out[4]: 
      col2
col1      
A      1.5
B      3.5

In [5]: g.quantile()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/andy/<ipython-input-70-8b0757805794> in <module>()
----> 1 g.quantile()

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in wrapper(*args, **kwargs)
    258                 return self.apply(curried_with_axis)
    259             except Exception:
--> 260                 return self.apply(curried)
    261 
    262         return wrapper

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in apply(self, func, *args, **kwargs)
    319         func = _intercept_function(func)
    320         f = lambda g: func(g, *args, **kwargs)
--> 321         return self._python_apply_general(f)
    322 
    323     def _python_apply_general(self, f):

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in _python_apply_general(self, f)
    322 
    323     def _python_apply_general(self, f):
--> 324         keys, values, mutated = self.grouper.apply(f, self.obj, self.axis)
    325 
    326         return self._wrap_applied_output(keys, values,

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in apply(self, f, data, axis, keep_internal)
    594             # group might be modified
    595             group_axes = _get_axes(group)
--> 596             res = f(group)
    597             if not _is_indexed_like(res, group_axes):
    598                 mutated = True

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in <lambda>(g)
    318         """
    319         func = _intercept_function(func)
--> 320         f = lambda g: func(g, *args, **kwargs)
    321         return self._python_apply_general(f)
    322 

/usr/lib/pymodules/python2.7/pandas/core/groupby.pyc in curried(x)
    253 
    254             def curried(x):
--> 255                 return f(x, *args, **kwargs)
    256 
    257             try:

/usr/lib/pymodules/python2.7/pandas/core/frame.pyc in quantile(self, q, axis)
   4946                 return _quantile(arr, per)
   4947 
-> 4948         return self.apply(f, axis=axis)
   4949 
   4950     def clip(self, upper=None, lower=None):

/usr/lib/pymodules/python2.7/pandas/core/frame.pyc in apply(self, func, axis, broadcast, raw, args, **kwds)
   4079                     return self._apply_raw(f, axis)
   4080                 else:
-> 4081                     return self._apply_standard(f, axis)
   4082             else:
   4083                 return self._apply_broadcast(f, axis)

/usr/lib/pymodules/python2.7/pandas/core/frame.pyc in _apply_standard(self, func, axis, ignore_failures)
   4154                     # no k defined yet
   4155                     pass
-> 4156                 raise e
   4157 
   4158         if len(results) > 0 and _is_sequence(results[0]):

ValueError: ('could not convert string to float: A', u'occurred at index col1')

@wesm
Owner

Agreed. This should select only the numeric data and compute the quantiles on that.
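
For anyone hitting this before picking up the fix, the behaviour described here (restrict to numeric data, then take quantiles) can be approximated by hand. This is only a sketch, not the change made in the commit that closed this issue. It assumes a pandas release recent enough to have `DataFrame.select_dtypes`, which is newer than the version in the traceback above, and the names `numeric_cols` and `result` are just illustrative.

```python
import pandas as pd

# Same frame as in the report: one string column, one numeric column.
df = pd.DataFrame({"col1": ["A", "A", "B", "B"], "col2": [1, 2, 3, 4]})

# Assumption: select_dtypes is available in the installed pandas.
# Collect the numeric column names so the string column is never
# passed to the quantile computation.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Group by the string key, keep only the numeric columns, then take
# the median (q=0.5, the default quantile).
result = df.groupby("col1")[numeric_cols].quantile(0.5)
print(result)
```

On this data the per-group medians are 1.5 for A and 3.5 for B, the same values `g.mean()` returns in the session above.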

@changhiskhan
Collaborator

closed via 8d02903
