BUG: Conversion of Series dtype from object to Int16 etc. fails #41060
Labels
API - Consistency
Internal Consistency of API/Behavior
Bug
Dtype Conversions
Unexpected or buggy dtype conversions
NA - MaskedArrays
Related to pd.NA and nullable extension arrays
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Problem description
Converting an object series that contains some text to one of the nullable integer dtypes fails, even though every element of the series is convertible to an integer (or to pd.NA).
This issue seems to be related to #40729, but the workaround described there for Floatxx doesn't work in all cases here:
The detour via dtype string is not enough if an element in the object series is a float, because int("3.0") doesn't work (but int(3.0) does). A detour via string and then Float64 is necessary for all examples given above to work (for some cases, but not all, the string step can be omitted).
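The difference between int("3.0") and int(3.0) can be checked in plain Python, independent of pandas. This is only an illustration of the built-in behaviour described above, not of the pandas internals:

```python
# int() accepts a float, but rejects a string that holds a float literal.
from_float = int(3.0)
print(from_float)  # 3

try:
    int("3.0")
    string_literal_ok = True
except ValueError as exc:
    string_literal_ok = False
    print(f"int('3.0') failed: {exc}")

# Parsing the string as a float first, then converting, succeeds.
via_float = int(float("3.0"))
print(via_float)  # 3
```

This mirrors why a string -> Float64 -> Int16 chain gets further than a plain string -> Int16 chain.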
But even the detour via string and Float64 to Int16 is not guaranteed to always work, e.g. if an element of the series is an object with an __int__() method (returning a number) and a __str__() method (returning a description, not an integer literal).
I think the topic of this issue has also been mentioned in the discussion of #39616.
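As a minimal sketch of the __int__()/__str__() case: the class name Reading and its attributes are made up for illustration and do not come from pandas or from the original report.

```python
class Reading:
    """Hypothetical element type: convertible to int, but str() gives a description."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value

    def __str__(self):
        return f"reading of {self.value} units"


r = Reading(7)

# Direct conversion uses __int__() and works.
direct = int(r)
print(direct)  # 7

# Any detour through str() loses the number, so parsing the text fails.
try:
    int(str(r))
    string_detour_ok = True
except ValueError:
    string_detour_ok = False
print(string_detour_ok)  # False
```

Any conversion path that goes through the string representation cannot recover the integer for such an object, while calling int() on the element itself can.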
Expected Output
The conversion of a series of dtype object to one of the nullable integer dtypes should always work if all elements of the series are convertible to the target dtype.
At least something along the lines of
I'd even prefer something like
such that string literals like "0x7f" work.
As the latter doesn't work with the current string -> Int16 conversion though, that would be more like an enhancement than a bugfix.
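The original snippets are not reproduced here. As a rough sketch of what an element-wise conversion of this kind could look like in plain Python, assuming a hypothetical helper named to_int_or_none that uses None as a stand-in for the missing value:

```python
import math


def to_int_or_none(value):
    """Hypothetical helper: convert one element to int, or None if it is missing."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, str):
        # Base 0 lets Python infer the base from the prefix, so "0x7f" is accepted.
        return int(value, 0)
    # Everything else goes through int(), which also honours __int__().
    return int(value)


values = ["12", 3.0, "0x7f", None, float("nan")]
converted = [to_int_or_none(v) for v in values]
print(converted)  # [12, 3, 127, None, None]
```

Two caveats of this sketch: int() silently truncates non-integral floats such as 3.5, which a real implementation would probably want to reject, and int("3.0", 0) still raises ValueError, so strings holding float literals would need separate handling.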
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2cb9652
python : 3.9.4.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.18362
machine : AMD64
...
pandas : 1.2.4
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
...