Filtering by exclusion of duplicate rows does not preserve column list for an empty dataframe #25184
Code Sample, a copy-pastable example if possible
import pandas as pd

x_df = pd.DataFrame(columns=['a', 'b'])
series = x_df.duplicated(subset=['a'])
list(x_df[~series])
# Expected output on Pandas 0.23.4: ['a', 'b']
# But, Pandas 0.24.1 returns: 
We have been using this approach to remove duplicate rows from a dataframe, comparing rows by one column only. It worked perfectly until we upgraded Pandas to the latest version and found that, if the original dataframe is empty, the resulting dataframe loses its column list.
We would expect the column list to be preserved.
Just ran into this myself today.
Looks like the root issue may be that duplicated is now returning an empty series of dtype=float64, whereas it used to return an empty series of dtype=bool.
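If the dtype change is indeed the cause, one possible workaround is to cast the mask to bool before inverting it, so the dataframe is always indexed with a boolean mask. This is only a sketch under that assumption: the names `mask` and `result` are mine, and I have not confirmed the behavior on every affected version.

```python
import pandas as pd

# Empty dataframe with a known column list, as in the report above.
x_df = pd.DataFrame(columns=['a', 'b'])

# Force a boolean mask regardless of the dtype that duplicated() returns
# for an empty frame (assumption: the dtype is what breaks the filtering).
mask = x_df.duplicated(subset=['a']).astype(bool)

# Boolean indexing selects rows, so the columns should be preserved.
result = x_df[~mask]
print(list(result.columns))
```

On versions where `duplicated` already returns a bool series, the cast is a no-op, so it should be safe to leave in place.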
I've never contributed to pandas before, but I think fixing this would be as simple as changing this line to