BUG: Bug in concatenation with duplicate columns across dtypes, GH4975 #4989

Merged: 1 commit merged on Sep 26, 2013

jreback commented Sep 25, 2013

partially fixed in #4771

closes #4975

In [1]: w = DataFrame(np.random.randn(4,2), columns=["x", "y"])

In [2]: x = DataFrame(np.random.randn(4,2), columns=["x", "y"])

In [3]: y = DataFrame(np.random.randn(4,2), columns=["x", "y"])

In [4]: z = DataFrame(np.random.randn(4,2), columns=["x", "y"])

In [5]: dta = x.merge(y, left_index=True, right_index=True).merge(z, left_index=True, right_index=True, how="outer")

In [6]: dta = dta.merge(w, left_index=True, right_index=True)

In [7]: dta
Out[7]: 
        x_x       y_x       x_y       y_y       x_x       y_x       x_y       y_y
0  0.393625 -0.340291  0.035043 -0.195235 -0.892856 -0.357269  0.820424  0.142803
1 -1.600176  0.737261 -0.571140  1.352393  0.201634  1.403633  0.590919 -1.003057
2 -1.046113  2.148139  2.406527 -1.460300  0.881712  0.949246 -0.061758 -0.386265
3 -0.472761 -0.055612 -0.449152 -0.209876  1.076689  0.294275 -0.684433  0.925683

The same frames via a direct concat:

In [11]: concat([x,y,z,w],axis=1)
Out[11]: 
          x         y         x         y         x         y         x         y
0  0.393625 -0.340291  0.035043 -0.195235 -0.892856 -0.357269  0.820424  0.142803
1 -1.600176  0.737261 -0.571140  1.352393  0.201634  1.403633  0.590919 -1.003057
2 -1.046113  2.148139  2.406527 -1.460300  0.881712  0.949246 -0.061758 -0.386265
3 -0.472761 -0.055612 -0.449152 -0.209876  1.076689  0.294275 -0.684433  0.925683

In [8]: expected = concat([x,y,z,w],axis=1)

In [9]: expected.columns=['x_x','y_x','x_y','y_y','x_x','y_x','x_y','y_y']

In [10]: expected
Out[10]: 
        x_x       y_x       x_y       y_y       x_x       y_x       x_y       y_y
0  0.393625 -0.340291  0.035043 -0.195235 -0.892856 -0.357269  0.820424  0.142803
1 -1.600176  0.737261 -0.571140  1.352393  0.201634  1.403633  0.590919 -1.003057
2 -1.046113  2.148139  2.406527 -1.460300  0.881712  0.949246 -0.061758 -0.386265
3 -0.472761 -0.055612 -0.449152 -0.209876  1.076689  0.294275 -0.684433  0.925683
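
For anyone who wants to check the concat path on their own install, here is a minimal self-contained sketch. It is not the test added by this PR. The fixed seed and the integer-dtype frame are assumptions added for illustration (the latter to exercise the "across dtypes" part of the title), and the merge chain from the session above is left out because newer pandas versions may reject suffixes that produce duplicate column names.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed: an assumption, for reproducibility

# Three float frames plus one int frame, all sharing the column names "x" and "y".
frames = [pd.DataFrame(rng.randn(4, 2), columns=["x", "y"]) for _ in range(3)]
frames.append(pd.DataFrame(np.arange(8).reshape(4, 2), columns=["x", "y"]))

# Column-wise concat keeps the duplicate labels as they are.
result = pd.concat(frames, axis=1)

# Reference built positionally with NumPy, independent of the column labels.
stacked = np.hstack([f.values for f in frames])

assert result.shape == (4, 8)
assert list(result.columns) == ["x", "y"] * 4
assert np.array_equal(result.values, stacked)
```

Comparing by position rather than by label sidesteps the ambiguity of selecting a duplicated column name, which returns every column carrying that name.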

jreback commented Sep 25, 2013

@jseabold the issue you reported in #4975 was an actual bug (one that was only 'partially' fixed in #4771)

thanks!

…rging with axis=0 (originally GH4771), fixed again is (GH4975)
jreback added a commit that referenced this pull request Sep 26, 2013
BUG: Bug in concatenation with duplicate columns across dtypes, GH4975
@jreback jreback merged commit b891d1b into pandas-dev:master Sep 26, 2013

Successfully merging this pull request may close these issues.

BUG: merge fails to rename when name already exists