pd.to_numeric produces misleading results on DataFrame #11776

Closed
mortada opened this issue Dec 6, 2015 · 2 comments · Fixed by #11780
Labels: API Design, Error Reporting (Incorrect or improved errors from pandas)

@mortada
Contributor

mortada commented Dec 6, 2015

When pd.to_numeric is called with errors='coerce' on a DataFrame, it doesn't raise; it silently returns the original DataFrame unchanged.

This may be related to the discussion in #11221, as this function currently doesn't support anything beyond 1-d input.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': [1, 2, 'foo'], 'b': [2.3, -1, 'bar']})

In [3]: df
Out[3]:
     a    b
0    1  2.3
1    2   -1
2  foo  bar

In [4]: pd.to_numeric(df)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-9febd95a7c0a> in <module>()
----> 1 pd.to_numeric(df)

/Users/mortada_mehyar/code/github/pandas/pandas/tools/util.py in to_numeric(arg, errors)
     94         conv = lib.maybe_convert_numeric(arg,
     95                                          set(),
---> 96                                          coerce_numeric=coerce_numeric)
     97     except:
     98         if errors == 'raise':

/Users/mortada_mehyar/code/github/pandas/pandas/src/inference.pyx in pandas.lib.maybe_convert_numeric (pandas/lib.c:52369)()
    518 cdef int64_t iINT64_MIN = <int64_t> INT64_MIN
    519
--> 520 def maybe_convert_numeric(object[:] values, set na_values,
    521                           bint convert_empty=True, bint coerce_numeric=False):
    522     '''

ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

In [5]: pd.to_numeric(df, errors='coerce')
Out[5]:
     a    b
0    1  2.3
1    2   -1
2  foo  bar

Note that the last expression doesn't raise but the previous one does.
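
The difference looks consistent with the `except:` branch visible in the traceback, which re-raises only when `errors == 'raise'`. Below is a minimal pure-Python sketch of that pattern, not the actual pandas code. `strict_1d_convert` and `to_numeric_sketch` are hypothetical stand-ins, and the "return the input untouched" fallback is an assumption inferred from Out[5], since the traceback does not show what follows the `if`.

```python
def strict_1d_convert(values, coerce=False):
    """Hypothetical stand-in for a converter that only accepts 1-d input."""
    if any(isinstance(v, (list, tuple)) for v in values):
        # Mirrors the message in the traceback; nested input counts as 2-d here.
        raise ValueError("Buffer has wrong number of dimensions (expected 1, got 2)")
    out = []
    for v in values:
        try:
            out.append(float(v))
        except (TypeError, ValueError):
            if not coerce:
                raise
            out.append(float("nan"))  # coerce unparseable items to NaN
    return out


def to_numeric_sketch(arg, errors="raise"):
    """Hypothetical wrapper with the same try/except shape as the traceback."""
    try:
        return strict_1d_convert(arg, coerce=(errors == "coerce"))
    except Exception:
        if errors == "raise":
            raise
        # Assumed fallback: any other mode hands back the input unchanged,
        # so the dimension error is swallowed along with parse errors.
        return arg


nested = [[1, 2, "foo"], [2.3, -1, "bar"]]  # stands in for the 2-d DataFrame
print(to_numeric_sketch(nested, errors="coerce") is nested)  # prints True
```

In this sketch the dimensionality error and ordinary parse errors go through the same handler, which is why a non-raising mode returns 2-d input as if nothing had happened.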

Seems like we should either

  1. make pd.to_numeric work with DataFrame or NDFrame in general
  2. simply raise here too if a DataFrame or something more than 1-d is passed
@jreback
Contributor

jreback commented Dec 6, 2015

Best to raise for non-1-d input
(and check pd.to_datetime / pd.to_timedelta for the same issue).
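
One way to raise for non-1-d input is to check dimensionality up front, before any conversion is attempted and independently of the `errors` mode. The following is a rough sketch under stated assumptions, not the change made in the eventual fix: `ensure_1d` and `FakeFrame` are hypothetical names, and the guard relies only on an `ndim` attribute, which DataFrame, Series, and NumPy arrays expose.

```python
def ensure_1d(arg):
    """Hypothetical guard: reject input that reports more than one dimension."""
    # Plain lists and tuples have no `ndim`; this sketch treats them as 1-d.
    ndim = getattr(arg, "ndim", 1)
    if ndim > 1:
        raise TypeError(
            f"expected 1-d input, got {ndim}-d {type(arg).__name__}"
        )
    return arg


class FakeFrame:
    """Hypothetical stand-in for a 2-d container such as a DataFrame."""
    ndim = 2


ensure_1d([1, 2, "foo"])      # passes through unchanged
try:
    ensure_1d(FakeFrame())    # rejected regardless of any errors= setting
except TypeError as exc:
    print(exc)
```

Because the check runs before the conversion's try/except, it cannot be swallowed by a non-raising `errors` mode, and the same guard could in principle be shared by other 1-d-only converters.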

jreback added the API Design and Error Reporting (Incorrect or improved errors from pandas) labels Dec 6, 2015
jreback added this to the 0.18.0 milestone Dec 6, 2015
@mortada
Contributor Author

mortada commented Dec 7, 2015

sounds good, I'll send a PR
