Dtype inconsistency when appending to empty dataframe #22621

Open
jonathanrocher opened this issue Sep 6, 2018 · 3 comments
Labels
Bug; Dtype Conversions (Unexpected or buggy dtype conversions); Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

Comments

@jonathanrocher

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> pd.DataFrame().append({"a": 1}, ignore_index=True)
     a
0  1.0

Problem description

I would expect append not to change the dtype of column "a" from int to float, and I would expect the output to be identical to that of pd.concat([pd.DataFrame(), pd.DataFrame([{"a": 1}])]).

Expected Output

>>> pd.DataFrame().append({"a": 1}, ignore_index=True)
     a
0  1

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 4612312
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.24.0.dev0+547.g4612312
pytest: None
pip: 18.0
setuptools: 34.3.3
Cython: 0.25.2
numpy: 1.13.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@TomAugspurger
Contributor

I suppose it's because the default dtype for a reindex is float:

In [28]: pd.DataFrame().reindex(columns=['a']).dtypes
Out[28]:
a    float64
dtype: object

Then we append an int Series, and you end up with float.

I'm not sure how feasible changing Out[28] to be anything other than float is.
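
A minimal sketch of the mechanism described above, assuming NumPy's standard promotion rules are what decide the combined dtype. The names placeholder, promoted, and upcast are illustrative only; the dtype printed for the reindexed frame is the one reported in this thread and may differ in other pandas versions.

```python
import numpy as np
import pandas as pd

# Placeholder column created by reindexing a frame with no rows and no columns.
placeholder = pd.DataFrame().reindex(columns=["a"])
print(placeholder.dtypes)  # float64 is what this thread reports; version dependent

# NumPy's promotion rule when float64 and int64 data are combined.
promoted = np.result_type(np.float64, np.int64)
print(promoted)  # float64

# Casting an integer value to the promoted dtype turns 1 into 1.0.
upcast = pd.Series([1], dtype="int64").astype(promoted)
print(upcast.iloc[0])  # 1.0
```

Once the placeholder column is float, combining it with integer data resolves to float64, which matches the 1.0 shown in the report.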

@jonathanrocher
Author

Ah right. But pd.concat does the right thing, so I wonder if we could let concat handle more of the work so that this use case behaves better. I need to spend more time with the method's implementation...
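
As a user-side illustration of routing the row through pd.concat, here is a hedged sketch. append_record is a hypothetical helper, not part of pandas, and the short-circuit for a completely empty frame is an assumption added so that the result does not depend on how a given pandas version treats empty inputs. It also avoids DataFrame.append, which was removed in pandas 2.0.

```python
import pandas as pd


def append_record(df, record):
    """Hypothetical helper: add one dict as a new row via pd.concat."""
    row = pd.DataFrame([record])
    # A frame with no rows and no columns contributes nothing, so return the
    # new row directly rather than creating a float placeholder column first.
    if df.shape == (0, 0):
        return row
    return pd.concat([df, row], ignore_index=True)


result = append_record(pd.DataFrame(), {"a": 1})
result = append_record(result, {"a": 2})
print(result.dtypes)
```

With this approach the integer values are never combined with a float placeholder, so column "a" keeps an integer dtype.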

@gfyoung added the Dtype Conversions label on Sep 7, 2018
@TomAugspurger added the Reshaping and Difficulty Intermediate labels on Sep 7, 2018
@TomAugspurger added this to the Contributions Welcome milestone on Sep 7, 2018
@mroeschke added the Bug label on Jun 22, 2021
@devamin

devamin commented Jan 17, 2022

This could lead to unexpected bugs, especially if an aggregation is performed afterwards.
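
One way such a bug might surface, as a hedged sketch: the frame named upcast below is a stand-in built directly with float data to mimic the upcast result, and all variable names are illustrative.

```python
import pandas as pd

# Stand-in for the upcast result: an integer count stored as float64.
upcast = pd.DataFrame({"a": [1.0]})
# The same data with the integer dtype the reporter expected.
expected = pd.DataFrame({"a": [1]})

total_float = upcast["a"].sum()
total_int = expected["a"].sum()

# A float total cannot be used where an integer is required, e.g. in range().
try:
    list(range(total_float))
    range_error = None
except TypeError as exc:
    range_error = exc

print(type(total_float).__name__, type(total_int).__name__)
print(range_error)
```

The aggregation itself succeeds in both cases, so the problem only appears later, when the total is passed to code that requires an integer.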

@mroeschke removed this from the Contributions Welcome milestone on Oct 13, 2022

6 participants