combinatorial explosion when merging dataframes #2690

Closed
gdraps opened this Issue Jan 14, 2013 · 1 comment

@gdraps

Hi Wes, I'm not sure whether this is a real bug, but I'm opening it to follow up on a comment in http://stackoverflow.com/questions/14199168/combinatorial-explosion-when-merging-dataframes-in-pandas. Here's a fabricated example that reproduces MergeError: Combinatorial explosion! (boom) in tools/merge.py:

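import numpy as np
import pandas as pd

# Two frames that share six float key columns (A-F) and differ only in the last column.
df1 = pd.DataFrame(np.random.randn(1000, 7),
                   columns=list('ABCDEF') + ['G1'])
df2 = pd.DataFrame(np.random.randn(1000, 7),
                   columns=list('ABCDEF') + ['G2'])
df1.merge(df2)  # Boom: MergeError: Combinatorial explosion! (boom)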

The value of group_sizes at the time of the exception is:

[2000, 2000, 2000, 2000, 2000, 2000]
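
To put those numbers in perspective, here is a small arithmetic sketch in plain Python. It multiplies the six reported group sizes together and compares the result with the largest signed 64-bit integer. Two things here are assumptions rather than facts from this thread: that each 2000 is the 1000 distinct random values from df1 plus the 1000 from df2 for that column, and that the error is raised because the combined key space no longer fits in an int64. The names combined_key_space and INT64_MAX are illustrative only and do not come from tools/merge.py.

```python
# Group sizes reported in the issue: one entry per shared key column (A-F).
group_sizes = [2000, 2000, 2000, 2000, 2000, 2000]

# Python integers are arbitrary precision, so this product is exact.
combined_key_space = 1
for size in group_sizes:
    combined_key_space *= size

# Largest value a signed 64-bit integer can hold.
# Assumption: this is the limit the merge code is guarding against.
INT64_MAX = 2**63 - 1

print(combined_key_space)              # 64000000000000000000
print(combined_key_space > INT64_MAX)  # True
```

The product is 6.4e19, roughly seven times larger than the int64 maximum of about 9.22e18, which would be consistent with six mostly-unique float key columns being enough to trigger the error.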

The exception does not occur if the common columns are set as the index, though.

df1 = df1.set_index(list('ABCDEF'))
df2 = df2.set_index(list('ABCDEF'))
df1.merge(df2, left_index=True, right_index=True, how='outer') # OK

@wesm

Thanks. I'll have a look when I can (later this week).

wesm closed this in 78d090a Jan 19, 2013
wesm was assigned Jan 20, 2013