Hello,
I haven't tested append() yet, and I was wondering whether duplicates are removed when an append is performed.
I had a look at the collection.py script, and the following pandas code is used:
combined = dd.concat([current.data, new]).drop_duplicates(keep="last")
After a look at the pandas documentation, I understand that duplicate rows are removed and only the last occurrence is kept.
I think it would be helpful to simply say so in the tutorial.
You write:
Let's append the last day (row) to our item:
Wouldn't it be worth adding something like:
Let's append the last day (row) to our item. With the current data there are obviously no duplicate rows, but if you append a dataframe containing rows that duplicate those already in the item, the duplicates will be removed by the drop_duplicates() method of the pandas dataframe.
Thanks again for bringing pystore!
Best,
Pierrot