combinatorial explosion when merging dataframes #2690

Closed
gdraps opened this Issue Jan 14, 2013 · 1 comment

Contributor

gdraps commented Jan 14, 2013

Hi Wes, I'm not sure whether this is a real bug, but I'm opening it to follow up on a comment in http://stackoverflow.com/questions/14199168/combinatorial-explosion-when-merging-dataframes-in-pandas. Here's a fabricated example that reproduces MergeError: Combinatorial explosion! (boom) in tools/merge.py:

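import numpy as np
import pandas as pd

# Two frames that share six random float columns (A-F)
df1 = pd.DataFrame(np.random.randn(1000, 7),
                   columns=list('ABCDEF') + ['G1'])
df2 = pd.DataFrame(np.random.randn(1000, 7),
                   columns=list('ABCDEF') + ['G2'])
df1.merge(df2)  # Boom: with no "on" argument, merges on the common columns A-F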

The value of group_sizes at the time of the exception is:

[2000, 2000, 2000, 2000, 2000, 2000]
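
A rough sketch of why these sizes could trip an overflow guard. It assumes (I haven't confirmed the exact condition in tools/merge.py) that the check compares the product of group_sizes against the signed 64-bit integer range. It also assumes each 2000 is the number of distinct values in one key column across both frames, i.e. 1000 random floats from each side. The names max_groups, int64_max, and overflows are just for illustration:

```python
import math

# group_sizes as reported above: one entry per common key column (A-F)
group_sizes = [2000] * 6

# Number of possible key combinations if every column's values could be
# combined with every other column's values: 2000 ** 6
max_groups = math.prod(group_sizes)

# Largest value a signed 64-bit integer can hold
int64_max = 2**63 - 1

# True when the combined key space cannot be encoded in a single int64
overflows = max_groups > int64_max

print(max_groups)  # 64000000000000000000
print(int64_max)   # 9223372036854775807
print(overflows)   # True
```

So even though each frame has only 1000 rows, six high-cardinality key columns give a nominal key space of about 6.4e19, which is larger than the roughly 9.2e18 an int64 can represent.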

The exception does not occur if the common columns are set as the index, though.

df1 = df1.set_index(list('ABCDEF'))
df2 = df2.set_index(list('ABCDEF'))
df1.merge(df2, left_index=True, right_index=True, how='outer') # OK
Owner

wesm commented Jan 14, 2013

Thanks. I'll have a look when I can (later this week)

wesm closed this in 78d090a Jan 19, 2013

wesm was assigned Jan 20, 2013
