
[SPARK-43453][PS] Ignore the names of MultiIndex when axis=1 for concat #42991

Closed
itholic wants to merge 2 commits into apache:master from itholic:SPARK-43453

Conversation

itholic (Contributor) commented Sep 19, 2023

What changes were proposed in this pull request?

This PR updates ps.concat to match the pandas behavior when axis=1, where the names of MultiIndex columns are ignored, and enables the corresponding tests.

Why are the changes needed?

To match the behavior of the latest pandas.

Does this PR introduce any user-facing change?

Yes. Given the following DataFrames with MultiIndex columns:

>>> psdf3
X   X
AB  A  B
1   0  1
2   2  3
3   4  5
>>> psdf4
Y   X
CD  C  D
1   1  4
3   2  5
5   3  6

The behavior of ps.concat with axis=1 is changed:

Before

>>> ps.concat([psdf3, psdf4], axis=1)
X     X
AB    A    B    C    D
1   0.0  1.0  1.0  4.0
2   2.0  3.0  NaN  NaN
3   4.0  5.0  2.0  5.0
5   NaN  NaN  3.0  6.0

After (the names of the MultiIndex columns are ignored, to match the latest pandas)

>>> ps.concat([psdf3, psdf4], axis=1)
     X
     A    B    C    D
1  0.0  1.0  1.0  4.0
2  2.0  3.0  NaN  NaN
3  4.0  5.0  2.0  5.0
5  NaN  NaN  3.0  6.0
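
For reference, the "After" result can be reproduced with plain pandas. This is a minimal sketch, not the code changed by this PR: it uses pandas rather than ps.concat, the frames pdf3 and pdf4 are rebuilt by hand from the output shown above, and the helper concat_ignoring_column_names is a hypothetical name introduced only for illustration. It clears the column level names explicitly before concatenating, so the result does not depend on how a given pandas version resolves mismatched names.

```python
import pandas as pd

# Rebuild the two example frames from the PR description (assumed layout).
pdf3 = pd.DataFrame(
    [[0, 1], [2, 3], [4, 5]],
    index=[1, 2, 3],
    columns=pd.MultiIndex.from_tuples([("X", "A"), ("X", "B")], names=["X", "AB"]),
)
pdf4 = pd.DataFrame(
    [[1, 4], [2, 5], [3, 6]],
    index=[1, 3, 5],
    columns=pd.MultiIndex.from_tuples([("X", "C"), ("X", "D")], names=["Y", "CD"]),
)


def concat_ignoring_column_names(frames):
    """Hypothetical helper: drop MultiIndex column names, then concat on axis=1."""
    stripped = []
    for frame in frames:
        out = frame.copy()
        # Clear every column level name so mismatched names cannot leak through.
        out.columns = out.columns.set_names([None] * out.columns.nlevels)
        stripped.append(out)
    return pd.concat(stripped, axis=1)


result = concat_ignoring_column_names([pdf3, pdf4])
print(result)
print(list(result.columns.names))
```

Rows 2 and 5 exist in only one of the inputs, so the missing cells become NaN and every column is upcast to float, as in the output above.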

How was this patch tested?

By enabling the existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.

dongjoon-hyun (Member) left a comment

+1, LGTM.

itholic deleted the SPARK-43453 branch on November 20, 2023, 01:36