BUG: Different initialization methods lead to different dtypes (DataFrame) #42971
Comments
Hi, thanks for your report. This is actually quite straightforward and has nothing to do with groupby itself.
The inconsistency appears to be related to the 'float' data type that you have in
The dtype inference might be wrong here, but I don't know the history behind it or whether this is intended.
@phofl thanks for the quick reply! I had no idea that these two methods of instantiating empty dataframes led to different dtypes
Reopening since the differing dtypes look strange.
take
On further investigation, there is a mismatch in the .index as well
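To see both mismatches side by side, a small diagnostic along these lines may help. The original code sample is not preserved in this thread, so the two constructions below (`columns=` versus a dict of empty lists) are an assumption about what was compared, and `describe_empty` is a hypothetical helper, not a pandas API. The values printed depend on the pandas version, so no expected output is given.

```python
import pandas as pd


def describe_empty(df):
    """Summarise the inferred column dtypes and the index of a frame.

    Hypothetical helper for illustration only; not part of pandas.
    """
    return {
        "dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
        "index_type": type(df.index).__name__,
        "index_dtype": str(df.index.dtype),
    }


# Assumed constructions: two common ways of creating an empty frame.
from_columns = pd.DataFrame(columns=["a", "b", "c"])
from_dict = pd.DataFrame({"a": [], "b": [], "c": []})

for label, frame in [("columns=", from_columns), ("dict of lists", from_dict)]:
    print(label, describe_empty(frame))
```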
This has too many moving parts, and there seem to be a lot of dependent tests.
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Problem description
groupby-ing and summing an empty dataframe led to dropped columns (df1 above); this doesn't occur with non-empty dataframes. Changing the columns of a dataframe based on its content is counter-intuitive and leads to key errors. The expected behaviour is shown above with df2, and the fact that two empty dataframes show different behaviours when grouped and summed suggests that this isn't intended behaviour.
Expected Output
Output of the above snippet:
Index([], dtype='object')
Index(['b', 'c'], dtype='object')
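Until the inference question is settled, one way to sidestep it is to declare the dtypes explicitly when creating the empty frame. This is a minimal sketch, not the reporter's code: it assumes all three columns are meant to be numeric, and the column names `a`, `b`, `c` are taken from the output above.

```python
import pandas as pd

columns = ["a", "b", "c"]

# Assumption: every column should be float64. Building each column from an
# explicitly typed empty Series avoids relying on dtype inference.
typed = pd.DataFrame({name: pd.Series(dtype="float64") for name in columns})

# With numeric dtypes, the non-key columns survive the aggregation.
summed = typed.groupby("a").sum()

print(typed.dtypes)
print(summed.columns)
```

With explicit numeric dtypes the grouped result keeps columns `b` and `c`, so downstream code that indexes by those names does not raise a key error on empty input.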