BUG: Wrong column order after inner merge operation on empty dataframes. #51929

@anmyachev

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

on = [f"col_{x}" for x in range(5)]
columns1 = ["col_0", "col_1", "col_2", "col_3", "col_4", "col_5"]
columns2 = ["col_0", "col_1", "col_2", "col_3", "col_4", "col_6", "col_7"]

last_wide_df = pd.DataFrame([[0] * 6], columns=columns1)
last_pivot_df = pd.DataFrame([[0] * 7], columns=columns2)

print(pd.merge(last_wide_df, last_pivot_df, on=on).columns)  # 1st case
print(pd.merge(last_wide_df[:0], last_pivot_df[:0], on=on).columns)  # 2nd case

Issue Description

The two cases return the same columns in a different order. In the 2nd case (empty inputs), col_5 is placed before the join keys:

Index(['col_0', 'col_1', 'col_2', 'col_3', 'col_4', 'col_5', 'col_6', 'col_7'], dtype='object')
Index(['col_5', 'col_0', 'col_1', 'col_2', 'col_3', 'col_4', 'col_6', 'col_7'], dtype='object')
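To make the difference easier to see, here is a small diagnostic sketch that lists which columns change position between the two results. The names left, right, same_labels and moved are illustrative only and not part of the original report; on a pandas version where the orders agree, moved would simply be empty.

```python
import pandas as pd

on = [f"col_{x}" for x in range(5)]
left = pd.DataFrame([[0] * 6], columns=on + ["col_5"])
right = pd.DataFrame([[0] * 7], columns=on + ["col_6", "col_7"])

non_empty_cols = list(pd.merge(left, right, on=on).columns)
empty_cols = list(pd.merge(left[:0], right[:0], on=on).columns)

# Both results should carry the same labels; only positions may differ.
same_labels = sorted(non_empty_cols) == sorted(empty_cols)

# Columns whose position differs between the two results.
moved = [
    c for c in non_empty_cols
    if non_empty_cols.index(c) != empty_cols.index(c)
]

print("same labels:", same_labels)
print("moved columns:", moved)
```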

Expected Behavior

I believe the column order should be the same in both cases.
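As a possible workaround, the result can be reindexed to an explicit column order after the merge. This is only a sketch: merge_with_stable_columns is a hypothetical helper, not a pandas function, and it assumes the non-key columns of the two frames do not share names (so no suffixes are added). The order it enforces, left columns followed by the right-only columns, matches the 1st case above.

```python
import pandas as pd


def merge_with_stable_columns(left, right, on):
    """Inner-merge on `on`, then force a deterministic column order.

    Hypothetical helper, not part of pandas. Assumes the non-key column
    names of `left` and `right` do not overlap.
    """
    merged = pd.merge(left, right, on=on)
    # Left columns in their original order, then the right-only columns.
    expected = list(left.columns) + [c for c in right.columns if c not in on]
    return merged[expected]


on = [f"col_{x}" for x in range(5)]
left = pd.DataFrame([[0] * 6], columns=on + ["col_5"])
right = pd.DataFrame([[0] * 7], columns=on + ["col_6", "col_7"])

non_empty = merge_with_stable_columns(left, right, on)
empty = merge_with_stable_columns(left[:0], right[:0], on)

# Compare the column order of the non-empty and empty results.
print(list(non_empty.columns) == list(empty.columns))
```

Selecting columns by an explicit list does not depend on the order merge itself returns, so the helper behaves the same whether or not the underlying ordering is fixed.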

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 2e218d10984e9919f0296931d92ea851c6a6faf5
python           : 3.9.15.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
Version          : 10.0.22000
machine          : AMD64
processor        : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : English_United States.1252

pandas           : 1.5.3
numpy            : 1.23.4
pytz             : 2022.6
dateutil         : 2.8.2
setuptools       : 65.5.1
pip              : 22.3.1
Cython           : None
pytest           : 7.2.0
hypothesis       : None
sphinx           : 5.3.0
blosc            : None
feather          : 0.4.1
xlsxwriter       : None
lxml.etree       : 4.9.1
html5lib         : None
pymysql          : None
psycopg2         : 2.9.3
jinja2           : 3.1.2
IPython          : 8.6.0
pandas_datareader: None
bs4              : 4.11.1
bottleneck       : None
brotli           :
fastparquet      : 2022.11.0
fsspec           : 2022.11.0
gcsfs            : None
matplotlib       : 3.6.2
numba            : None
numexpr          : 2.7.3
odfpy            : None
openpyxl         : 3.0.10
pandas_gbq       : 0.17.9
pyarrow          : 9.0.0
pyreadstat       : None
pyxlsb           : None
s3fs             : 2022.11.0
scipy            : 1.9.3
snappy           : None
sqlalchemy       : 1.4.46
tables           : 3.7.0
tabulate         : None
xarray           : 2022.11.0
xlrd             : 2.0.1
xlwt             : None
zstandard        : None
tzdata           : None


Labels: Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
