dtype=object not obeyed and integer converted to float #2255

@ruidc

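Reading http://pandas.pydata.org/pandas-docs/dev/gotchas.html, I understood that passing dtype=object should preserve integers. It does not: the column dtype is reported as object, but the integer is still converted to float. Passing the data via a numpy object array does preserve the type, as I would like: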

In [1]: import pandas

In [2]: pandas.__version__
Out[2]: '0.9.1'

In [3]: data = [(6260L, 20302010L), (6262L, None)]

In [4]: df = pandas.DataFrame(data, dtype=object)

In [5]: df
Out[5]:
      0             1
0  6260  2.030201e+07
1  6262           NaN

In [6]: df.dtypes
Out[6]:
0    object
1    object

In [7]: type(df.values[0][1])
Out[7]: float

In [8]: df = pandas.DataFrame(data)

In [9]: df.dtypes
Out[9]:
0      int64
1    float64

In [10]: type(df.values[0][1])
Out[10]: numpy.float64

In [11]: df = pandas.DataFrame(pandas.np.array(data, dtype=object))

In [12]: df.dtypes
Out[12]:
0    object
1    object

In [13]: type(df.values[0][1])
Out[13]: long

The conversion appears to happen in the call to maybe_convert_objects from _convert_object_array, in pandas/core/frame.py, line 5221 (for 0.9.1):

In [3]: type(pandas.lib.maybe_convert_objects(pandas.np.array([20302010, None], dtype=object), try_float=False)[0])
Out[3]: numpy.float64
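
For anyone who needs the integers preserved in the meantime, the object-array route from In [11] above can be wrapped up as a small check. This is only a minimal sketch, not a fix: it is written in Python 3 syntax (plain integers instead of the `L` literals), it imports numpy directly rather than through `pandas.np`, and `build_object_frame` is a hypothetical helper name used for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the report, written with Python 3 integers
# (the original session used Python 2 long literals such as 6260L).
data = [(6260, 20302010), (6262, None)]


def build_object_frame(rows):
    """Hypothetical helper: build a DataFrame whose cells keep their Python types.

    The rows are converted to a 2-D object ndarray first, so pandas receives
    an array that already has dtype=object. This is the route the report
    found to preserve the integer.
    """
    return pd.DataFrame(np.array(rows, dtype=object))


df = build_object_frame(data)
value = df.values[0][1]

print(df.dtypes)           # both columns are reported as object
print(type(value), value)  # the cell is still a Python integer
```

The design choice here is simply to hand pandas an array that is already of object dtype instead of relying on the `dtype=object` argument, since the argument is the part that is reported not to be obeyed for a list of tuples.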
