ENH: Accept no fields for groupby by #61160

simonaubertbd · 2025-03-21T18:19:54Z

Feature Type

Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas

Problem Description

Hello,

Sometimes you have no fields to group by when aggregating. I know there is an aggregate function for that case, but it would make dynamic code easier if groupby could be used without any grouping field, instead of raising this error:

Best regards,

Simon

Feature Description

Just the ability to select no fields in the by argument

Alternative Solutions

A conditional function that uses groupby or aggregate

Additional Context

No response

udit710 · 2025-03-23T01:04:02Z

take

udit710 · 2025-03-23T01:05:52Z

Hi @simonaubertbd what's the expected output here? Assuming you have kept the alternative solution to use aggregate, do we need it to work the same as aggregate?

simonaubertbd · 2025-03-23T08:22:35Z

Hello @udit710 and thanks for your answer.

Let's say I have:

bird	year	weight	age
Wilbur	1992	3	50
Donald Duck	1991	2	70
Scrooge McDuck	1993	2	100

aggregate1 = inlineInput1.groupby([]).agg(weight_max=('weight', 'max')).reset_index()

would be simply 3

snitish · 2025-03-24T00:54:12Z

Any thoughts on this @rhshadrach? I concur with OP that this feature would be helpful in cases where the grouping columns are dynamically determined.

Delengowski · 2025-03-24T02:21:46Z

So you want the groupby to be a no-op and just return the DataFrame if no grouping columns are specified? I sort of get it: you don't want to split the DataFrame on anything, so it just passes through.

I don't see how overloading the method and muddying the API here is worthwhile. Just do the check yourself. I think it would be strange for groupby to return a DataFrame in some cases and a GroupBy object in others.

snitish · 2025-03-24T02:43:24Z

The resulting object can still be a GroupBy object, as @udit710 implemented in #61168.

rhshadrach · 2025-03-26T14:38:47Z

If I'm understanding the request right, @simonaubertbd desires for df.groupby([]).agg(...) to behave the same as df.agg(...). Here I am -1; the work it would take to get these to agree, and the ongoing maintenance to support, seems to me to be a non-starter.

On the other hand, I'm a bit more receptive to df.groupby([]) behaving the same as df.groupby(pd.Series(0, index=df.index)) (perhaps it would be better to group by np.zeros? I would have to run some benchmarks). Even so, it seems to me that keeping this in user code rather than in pandas is more explicit and readable.
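As a rough illustration of the constant-key idea in this comment, the sketch below groups the sample table from earlier in the thread by a constant Series and by a constant NumPy array. The names `birds`, `by_series`, and `by_array` are made up for this example, and nothing here says which variant is faster.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table earlier in the thread.
birds = pd.DataFrame(
    {
        "bird": ["Wilbur", "Donald Duck", "Scrooge McDuck"],
        "year": [1992, 1991, 1993],
        "weight": [3, 2, 2],
        "age": [50, 70, 100],
    }
)

# A constant Series aligned to the frame's index puts every row in one group.
by_series = birds.groupby(pd.Series(0, index=birds.index)).agg(
    weight_max=("weight", "max")
)

# A constant NumPy array of the same length works the same way.
by_array = birds.groupby(np.zeros(len(birds), dtype=np.int64)).agg(
    weight_max=("weight", "max")
)

print(by_series)
print(by_array)
```

Both results are one-row DataFrames whose index holds the constant key, so a caller who does not want that key would still need to drop it, for example with reset_index(drop=True).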

simonaubertbd · 2025-03-26T15:37:13Z

Hello @rhshadrach

My bad, I may have been unclear. Behaving exactly like df.agg(...) would mean one row per aggregation, if I'm right, and that is precisely what I don't want. What I have in mind is more like:

aggregate1 = inlineInput1.assign(d=0).groupby('d').agg(Age_max=('Age', 'max'), FirstName_count=('FirstName', 'count'), LastName_count=('LastName', 'count')).reset_index(drop=True)

About having this in user code: it took me a lot of time to deal with it, in development as well as in testing.
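To make the difference between the two shapes concrete, here is a minimal sketch. The `people` frame and its values are invented for illustration, since the thread does not show the real `inlineInput1` data; only the column names come from the snippet above.

```python
import pandas as pd

# Hypothetical data; only the column names are taken from the thread.
people = pd.DataFrame(
    {
        "FirstName": ["Ada", "Alan", "Grace"],
        "LastName": ["Lovelace", "Turing", "Hopper"],
        "Age": [36, 41, 85],
    }
)

# DataFrame.agg with named aggregations: one row per aggregation name,
# one column per source column, with NaN where a pair does not apply.
per_agg = people.agg(
    Age_max=("Age", "max"),
    FirstName_count=("FirstName", "count"),
)

# Constant-key groupby: a single row with one column per aggregation name.
one_row = (
    people.assign(d=0)
    .groupby("d")
    .agg(
        Age_max=("Age", "max"),
        FirstName_count=("FirstName", "count"),
        LastName_count=("LastName", "count"),
    )
    .reset_index(drop=True)
)

print(per_agg)
print(one_row)
```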

rhshadrach · 2025-03-26T15:54:05Z

Thanks @simonaubertbd - then I believe your comment in #61160 (comment) should not be "simply 3" (the scalar), but rather a DataFrame with 3 as the value.

"About having this in user code: it took me a lot of time to deal with it, in development as well as in testing."

I'm sympathetic, but still think adding a call to assign and reset_index is not onerous.
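For code that builds the key list dynamically, the user-side check could be wrapped once in a small helper. The sketch below is only one possible shape: `aggregate_by` and the temporary column name `_all` are hypothetical, not part of pandas, and the helper assumes the frame has no existing column called `_all`.

```python
import pandas as pd


def aggregate_by(df, keys, **named_aggs):
    """Hypothetical helper: aggregate per group, or over all rows if `keys` is empty."""
    if keys:
        return df.groupby(list(keys)).agg(**named_aggs).reset_index()
    # No keys: add a temporary constant column so all rows form one group,
    # then drop that helper key from the result.
    return (
        df.assign(_all=0)
        .groupby("_all")
        .agg(**named_aggs)
        .reset_index(drop=True)
    )


# Sample data copied from the table earlier in the thread.
birds = pd.DataFrame(
    {
        "bird": ["Wilbur", "Donald Duck", "Scrooge McDuck"],
        "year": [1992, 1991, 1993],
        "weight": [3, 2, 2],
        "age": [50, 70, 100],
    }
)

overall = aggregate_by(birds, [], weight_max=("weight", "max"))
per_year = aggregate_by(birds, ["year"], weight_max=("weight", "max"))

print(overall)
print(per_year)
```

With an empty key list the result is a one-row DataFrame, which matches the "DataFrame with 3 as the value" reading discussed above; with keys it is an ordinary grouped aggregation.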

simonaubertbd · 2025-03-26T16:05:07Z

@rhshadrach This is a little more complex than that ;) The project I'm on is a Python code generator, so I had to deal with some conditional TypeScript, and even finding the solution wasn't that easy. (I'm obviously not talking about days, more about hours. I must also acknowledge that I'm something of a newbie with pandas, but when I asked more experienced devs, it wasn't obvious to them either.)

That said, Yes, it was a dataframe with 3 as a value. Thanks for your remark, I should have been more specific.

Best regards,

Simon

simonaubertbd added Enhancement Needs Triage labels Mar 21, 2025

github-actions bot assigned udit710 Mar 23, 2025

udit710 linked a pull request Mar 23, 2025 that will close this issue

ENH: Accept no fields for groupby by #61168

Open

5 tasks

simonaubertbd mentioned this issue Mar 24, 2025

Aggregate rows does not work without group key amphi-ai/amphi-etl#215

Closed

rhshadrach added Groupby Needs Discussion and removed Needs Triage labels Mar 26, 2025
