combinatorial explosion when merging dataframes #2690

gdraps opened this Issue Jan 14, 2013 · 1 comment

Hi Wes, not sure if this is real, but I'm opening this to follow up on an earlier comment. Here's a fabricated example to reproduce MergeError: Combinatorial explosion! (boom) in tools/

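import numpy as np
import pandas as pd

# Two frames that share six random-float key columns, A through F
df1 = pd.DataFrame(np.random.randn(1000, 7),
                   columns=list('ABCDEF') + ['G1'])
df2 = pd.DataFrame(np.random.randn(1000, 7),
                   columns=list('ABCDEF') + ['G2'])
df1.merge(df2)  # Boom: raises MergeError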

The value of group_sizes at the time of exception is:

[2000, 2000, 2000, 2000, 2000, 2000]
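
For what it's worth, here is a rough sketch of why those sizes might trip the check. It assumes (I haven't confirmed this against the merge code) that the per-column group sizes get multiplied together into one signed 64-bit key space. The names `n_combinations` and `INT64_MAX` are made up for illustration and are not pandas identifiers.

```python
import math

# group_sizes as reported above: each key column has 2000 distinct values
# (1000 random floats from df1 plus 1000 from df2, none shared).
group_sizes = [2000, 2000, 2000, 2000, 2000, 2000]

# Assumption: the key space is the product of the per-column group sizes.
# Python ints are arbitrary precision, so this product cannot overflow here.
n_combinations = math.prod(group_sizes)

# Largest value a signed 64-bit integer can hold.
INT64_MAX = 2**63 - 1

exceeds_int64 = n_combinations > INT64_MAX

print(n_combinations)   # 64000000000000000000
print(INT64_MAX)        # 9223372036854775807
print(exceeds_int64)    # True
```

If that assumption holds, six columns of 2000 groups each already need about seven times more key combinations than an int64 can represent, even though the frames only have 1000 rows apiece.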

The exception does not occur if the common columns are set as the index, though.

df1 = df1.set_index(list('ABCDEF'))
df2 = df2.set_index(list('ABCDEF'))
df1.merge(df2, left_index=True, right_index=True, how='outer') # OK

Thanks. I'll have a look when I can (later this week).

wesm closed this in 78d090a Jan 19, 2013
wesm was assigned Jan 20, 2013
