BUG: pd.Series.mode failed to give expected result when it is embedded inside groupby.transform #47094

@Fanghaozhi-Liang

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar'],
                   'B' : ['one', 'one', 'two', 'three',
                          'two', 'two'],
                   'C' : [1, 5, 5, 2, 5, 5],
                   'D' : [2.0, 5., 8., 1., 2., 9.]})

df.groupby(['A'])['C'].transform(pd.Series.mode)

Issue Description

The returned Series contains unexpected NA values instead of the mode of each group.
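
One difference that may be relevant here, offered as an assumption rather than a confirmed diagnosis: `pd.Series.mode` always returns a Series (there can be several modes), whereas `statistics.mode` returns a single scalar. A minimal sketch of that difference, using the `C` values of the `foo` group from the example above (the variable names are illustrative only):

```python
import statistics

import pandas as pd

# Values of column C for group 'foo' in the reproducible example.
foo_c = pd.Series([1, 5, 5])

# pandas: the result is itself a Series with its own fresh index,
# even when there is only one mode.
pandas_mode = foo_c.mode()

# standard library: the result is a single scalar value.
stdlib_mode = statistics.mode(foo_c.tolist())

print(type(pandas_mode).__name__, len(pandas_mode))
print(stdlib_mode)
```

Because `transform` expects either a scalar or a result the same length as the group, a length-1 Series per group is a plausible source of the mismatch.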

Expected Behavior

import statistics
import pandas as pd

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar'],
                   'B' : ['one', 'one', 'two', 'three',
                          'two', 'two'],
                   'C' : [1, 5, 5, 2, 5, 5],
                   'D' : [2.0, 5., 8., 1., 2., 9.]})

df.groupby(['A'])['C'].transform(statistics.mode)
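
A possible workaround, sketched here under stated assumptions and not as an official recommendation, is to reduce the pandas mode to a scalar inside the transform. Taking the first entry with `.iloc[0]` is an assumption of this sketch: `Series.mode` returns modes in sorted order, so on ties this picks the smallest value, which can differ from what `statistics.mode` returns.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'C': [1, 5, 5, 2, 5, 5]})

# Series.mode() returns a Series; .iloc[0] reduces it to one scalar per
# group, which transform then broadcasts to every row of that group.
# Note: .iloc[0] raises IndexError if a group has no non-NA values.
result = df.groupby(['A'])['C'].transform(lambda s: s.mode().iloc[0])

print(result.tolist())
```

Both groups in this data have 5 as their most frequent value, so every row receives 5.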

Installed Versions

INSTALLED VERSIONS

commit : 4bfe3d0
python : 3.8.8.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22000
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Chinese (Simplified)_China.1252

pandas : 1.4.2
numpy : 1.21.5
pytz : 2021.1
dateutil : 2.8.1
pip : 22.1.1
setuptools : 62.1.0
Cython : 0.29.23
pytest : 6.2.3
hypothesis : None
...
xarray : 0.21.1
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : None
