Description
Problem description
Currently, appending to a DataFrame looks like this:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(5, 3), columns=list('abc'))
df = df.append(pd.DataFrame(np.random.rand(5, 3), columns=list('abc')))
append is a DataFrame or Series method, and as such it should be able to modify the DataFrame or Series in place. If in-place modification is not required, one may use concat or set the inplace kwarg to False. In-place modification would avoid an explicit assignment operation, which is quite slow in Python, as we all know. Further, it would make the behavior match what users expect from Python lists, and avoid questions such as these: 1, 2...
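To make the list comparison concrete, here is a minimal sketch of the mismatch in semantics. The variable names are illustrative only, and pd.concat stands in for any DataFrame operation that returns a new object rather than mutating its input:

```python
import pandas as pd

# list.append mutates the list in place and returns None.
items = [1, 2]
result = items.append(3)

# Combining DataFrames returns a new object; the original is unchanged
# unless the result is assigned back to the same name.
df = pd.DataFrame({"a": [1, 2]})
extra = pd.DataFrame({"a": [3]})
combined = pd.concat([df, extra], ignore_index=True)

print(items, result)          # the list grew; result is None
print(len(df), len(combined))  # df keeps its rows; combined has one more
```

Users who carry the list habit over and drop the assignment end up with an unchanged DataFrame, which is the confusion the linked questions describe.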
Additionally, at present append is a full subset of concat, and as such it need not exist at all. Given the vast number of functions for appending a DataFrame or Series to another in Pandas, it makes sense that each should have its own merits and demerits. Gaining an inplace kwarg would clearly distinguish append from concat, and simplify code.
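For reference, the overlap with concat can be sketched as follows. This is an illustration of the existing pd.concat call using the same shapes as the example above, not a statement about how append is implemented internally; the name other is introduced here for readability:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 3), columns=list("abc"))
other = pd.DataFrame(np.random.rand(5, 3), columns=list("abc"))

# pd.concat stacks the frames row-wise and returns a new DataFrame;
# the original index labels are kept, so 0..4 appears twice.
combined = pd.concat([df, other])

# ignore_index=True discards the old labels and renumbers the rows 0..9.
combined_reset = pd.concat([df, other], ignore_index=True)
```

Either way, both inputs are left untouched and the caller has to bind the result to a name, which is the same assignment step the proposal wants to remove.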
I understand that this issue was raised in #2801 a long time ago. However, the conversation there drifted from the simplification offered by the inplace kwarg to performance enhancement. I (and many like me) am looking for ease of use, and not so much for performance. Also, we expect the data to fit in memory (which is a limitation even with the current version of append).
Expected Code
df = pd.DataFrame(np.random.rand(5,3), columns=list('abc'))
df.append(pd.DataFrame(np.random.rand(5,3), columns=list('abc')), inplace=True)