New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

df.sort(col) fails if col is not unique. #2488

gerigk opened this Issue Dec 11, 2012 · 1 comment


None yet
2 participants

gerigk commented Dec 11, 2012

one more trouble case with duplicate column names
the exception could be nicer (sort ambiguous because of duplicate column or similar) or it could sort by the first? column with this name (although this would not be nice in case the column is not duplicate but has in fact different content with the same name and the user not being aware of this).

import pandas as pd
df = pd.DataFrame([(1,2), (3,4)], columns = ['a', 'a'])

ValueError                                Traceback (most recent call last)
<ipython-input-1-251abbb5c77e> in <module>()
      1 import pandas as pd
      2 df = pd.DataFrame([(1,2), (3,4)], columns = ['a', 'a'])
----> 3 df.sort('a')

/usr/local/lib/python2.7/dist-packages/pandas-0.10.0.dev_74ab638-py2.7-linux-x86_64.egg/pandas/core/frame.pyc in sort(self, columns, column, axis, ascending, inplace)
   3062             columns = column
   3063         return self.sort_index(by=columns, axis=axis, ascending=ascending,
-> 3064                                inplace=inplace)
   3066     def sort_index(self, axis=0, by=None, ascending=True, inplace=False):

/usr/local/lib/python2.7/dist-packages/pandas-0.10.0.dev_74ab638-py2.7-linux-x86_64.egg/pandas/core/frame.pyc in sort_index(self, axis, by, ascending, inplace)
   3125             self._clear_item_cache()
   3126         else:
-> 3127             return self.take(indexer, axis=axis)
   3129     def sortlevel(self, level=0, axis=0, ascending=True, inplace=False):

/usr/local/lib/python2.7/dist-packages/pandas-0.10.0.dev_74ab638-py2.7-linux-x86_64.egg/pandas/core/frame.pyc in take(self, indices, axis)
   2853             new_values = com.take_2d(self.values,
   2854                                      com._ensure_int64(indices),
-> 2855                                      axis=axis)
   2856             if axis == 0:
   2857                 new_columns = self.columns

/usr/local/lib/python2.7/dist-packages/pandas-0.10.0.dev_74ab638-py2.7-linux-x86_64.egg/pandas/core/common.pyc in take_2d(arr, indexer, out, mask, needs_masking, axis, fill_value)
    414                 out = np.empty(out_shape, dtype=arr.dtype)
    415             take_f = _get_take2d_function(dtype_str, axis=axis)
--> 416             take_f(arr, _ensure_int64(indexer), out=out, fill_value=fill_value)
    417             return out
    418     elif dtype_str in ('float64', 'object', 'datetime64[ns]'):

/usr/local/lib/python2.7/dist-packages/pandas-0.10.0.dev_74ab638-py2.7-linux-x86_64.egg/pandas/ in pandas.algos.take_2d_axis0_int64 (pandas/algos.c:73957)()

ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

This comment has been minimized.


wesm commented Dec 11, 2012

A better error message would be nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment