BUG: Data types are not preserved while concatenating DataFrames with nullable integers #27692

vss888 · 2019-08-01T12:57:07Z

Copy-pastable example

# input data
import pandas as pd
t1 = pd.DataFrame(index=[0], data={'x':[1]}, dtype='UInt8')
t2 = pd.DataFrame(index=[1], data={'y':[1]}, dtype='UInt8')
t3 = pd.concat([t1,t2], join='outer', sort=False)

'''actual result'''
print(t3.dtypes)
# x    object
# y    object
# dtype: object

Problem description

Data types are not preserved when concatenating DataFrames with nullable integers. Instead, the concatenated result holds values of mixed types, so the column dtypes become object:

>>> type(t3.at[0,'x'])
<class 'int'>
>>> type(t3.at[1,'x'])
<class 'float'>

Expected Output

'''expected result'''
print(t3.dtypes)
# x    UInt8
# y    UInt8
# dtype: object
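A possible workaround, offered only as a sketch: cast the concatenated frame back to the nullable dtype explicitly. This assumes `DataFrame.astype('UInt8')` accepts the mixed int/NaN object columns shown above; on pandas versions where `concat` already preserves the dtype, the cast is simply a no-op. The names `restored` and `dtype_names` are illustrative and not part of the original report.

```python
import pandas as pd

# Same input data as in the report.
t1 = pd.DataFrame(index=[0], data={'x': [1]}, dtype='UInt8')
t2 = pd.DataFrame(index=[1], data={'y': [1]}, dtype='UInt8')
t3 = pd.concat([t1, t2], join='outer', sort=False)

# Assumed workaround: cast back to the nullable integer dtype.
# Missing entries stay missing rather than turning into floats.
restored = t3.astype('UInt8')

# Collect the resulting dtype names per column for inspection.
dtype_names = {col: str(dt) for col, dt in restored.dtypes.items()}
print(dtype_names)
print(restored['x'].isna().tolist())
```

The cast has to be repeated after every affected `concat`, so it is a stopgap rather than a fix.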

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit : None
python : 3.6.3.final.0
python-bits : 64
OS : Linux
OS-release : 3.10.0-862.11.6.el7.x86_64
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.utf-8
LANG : en_US.utf-8
LOCALE : en_US.UTF-8

pandas : 0.25.0
numpy : 1.16.3
pytz : 2018.4
dateutil : 2.7.3
pip : 19.1.1
setuptools : 39.0.1
Cython : 0.29.12
pytest : 3.3.2
hypothesis : None
sphinx : 1.6.6
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.4
html5lib : 0.9999999
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : 6.2.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.3.4
matplotlib : 3.0.2
numexpr : 2.6.4
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.11.1
pytables : None
s3fs : None
scipy : 1.1.0
sqlalchemy : None
tables : 3.5.2
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None


TomAugspurger · 2019-08-02T18:17:06Z

Tangentially related to #22994, which proposes a solution for getting the right dtypes in this situation.


vss888 changed the title ~~Data types are not preserved data type while concatenating DataFrames with nullable integers~~ Data types are not preserved while concatenating DataFrames with nullable integers Aug 2, 2019

TomAugspurger added the ExtensionArray and Reshaping labels Aug 2, 2019

WillAyd mentioned this issue Mar 19, 2020

BUG: Nullable integer type cols become 'object' dtype by concatenation #29588

Closed

jorisvandenbossche added the Bug label Mar 19, 2020

jorisvandenbossche changed the title ~~Data types are not preserved while concatenating DataFrames with nullable integers~~ BUG: Data types are not preserved while concatenating DataFrames with nullable integers Mar 19, 2020

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Apr 13, 2020

BUG: Fixed concat with reindex and extension types

bd316c3

Closes pandas-dev#27692 Closes pandas-dev#33027

TomAugspurger mentioned this issue Apr 13, 2020

BUG: Fixed concat with reindex and extension types #33522

Merged

jreback added this to the 1.1 milestone Apr 17, 2020

jreback closed this as completed in #33522 Jul 1, 2020

jreback pushed a commit that referenced this issue Jul 1, 2020

BUG: Fixed concat with reindex and extension types (#33522)

bf12604

* BUG: Fixed concat with reindex and extension types
  Closes #27692
  Closes #33027
* rebase
* fixup
* cleanup
* fixups
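To check whether an installed pandas includes this fix (the issue was assigned to the 1.1 milestone), a small check along the following lines may help. It is a sketch rather than the test added in #33522; the helper name `concat_dtypes` is hypothetical.

```python
import pandas as pd


def concat_dtypes(dtype="UInt8"):
    """Concatenate two single-column frames with disjoint columns and
    disjoint indexes, and return the resulting column dtypes as strings."""
    left = pd.DataFrame({"x": [1]}, index=[0], dtype=dtype)
    right = pd.DataFrame({"y": [1]}, index=[1], dtype=dtype)
    result = pd.concat([left, right], join="outer", sort=False)
    return {col: str(dt) for col, dt in result.dtypes.items()}


# On a version with the fix, both columns should report the nullable dtype;
# on affected versions (such as 0.25.0 in the report) they report object.
observed = concat_dtypes("UInt8")
print(pd.__version__, observed)
```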
