ENH: DataFrame.append is slow when the DataFrame is huge #35710

Closed
junjunjunk opened this issue Aug 13, 2020 · 2 comments

Comments

@junjunjunk
Contributor

Is your feature request related to a problem?

pandas.DataFrame.append is slow when it is called repeatedly.
Users often call DataFrame.append many times in a loop, and the slowdown becomes more pronounced as the data grows.
Combining the data this way takes a long time.
As a workaround, some users build the data with DataFrame.from_dict, which is faster than repeated DataFrame.append calls (see the additional reference).

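import pandas as pd
other = pd.DataFrame({"a": [1], "b": [2]})  # example frame to append
df = pd.DataFrame()
N = 10000
for i in range(N):
    df = df.append(other, ignore_index=True)  # returns a new DataFrame each call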
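For comparison, the DataFrame.from_dict workaround mentioned above might look like the following minimal sketch. The column names and values are made up for illustration; the idea is only that rows accumulate in a plain dict and the DataFrame is built once at the end.

```python
import pandas as pd

N = 1000

# Accumulate rows in a plain dict keyed by row position.
# The columns "a" and "b" are hypothetical placeholders.
rows = {}
for i in range(N):
    rows[i] = {"a": i, "b": i * 2}

# Build the DataFrame in a single step; orient="index" treats
# each dict key as a row label.
df = pd.DataFrame.from_dict(rows, orient="index")
```
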

Describe the solution you'd like

I suspect DataFrame.append is slow because the DataFrame gets copied somewhere internally,
but I haven't located where yet.
The solution would be to rewrite that code path so it is more efficient and avoids making copies.

Additional reference

https://stackoverflow.com/questions/27929472/improve-row-append-performance-on-pandas-dataframes

@junjunjunk added the Enhancement and Needs Triage labels Aug 13, 2020
@mroeschke
Member

Of note, append may be deprecated in the future: #35407

We recommend collecting the DataFrames in a list first and then calling concat once on that list.
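A minimal sketch of that list-then-concat pattern, assuming the same small `other` frame as in the issue's loop (its contents here are hypothetical):

```python
import pandas as pd

# Hypothetical example frame; stands in for `other` from the issue.
other = pd.DataFrame({"a": [1], "b": [2]})
N = 1000

# Collect the pieces in a plain Python list, which is cheap to grow.
frames = []
for i in range(N):
    frames.append(other)

# Concatenate once at the end instead of once per iteration.
df = pd.concat(frames, ignore_index=True)
```

With `ignore_index=True` the result gets a fresh 0..N-1 index, matching what the original loop asked for.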

@jreback
Contributor

jreback commented Aug 13, 2020

there is a large warning in the docs about this

@jreback added the Usage Question label and removed the Enhancement and Needs Triage labels Aug 13, 2020
@jreback added this to the No action milestone Aug 13, 2020
@jreback closed this as completed Aug 13, 2020