DOC: axis=None does not aggregate along both axes for sum method #54547

Closed
1 task done
Tracked by #5
Ishticode opened this issue Aug 14, 2023 · 8 comments · Fixed by selehadin-cyber/pandas#1 or #55240

@Ishticode

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sum.html

Documentation problem

The description of the axis parameter states:

For DataFrames, specifying axis=None will apply the aggregation across both axes.

However, the following code suggests otherwise.

import numpy as np
import pandas as pd

a = np.arange(6).reshape((3, 2))
df = pd.DataFrame(a, index=list("abc"), columns=list("xy"))
print(df.sum(axis=None))  # or df.sum()

which returns

x    6
y    9
dtype: int64

and seems to sum only down the rows, giving one total per column, as if we had specified axis="index".

In addition to this, the Returns section of the sum docs linked above says scalar or Series. If the above case does not return a scalar, then I can't see when it ever would.

Note:

  1. This behaviour is different from numpy.ndarray.sum, which would return a scalar value of 15.
  2. This behaviour is also found in other similar DataFrame methods such as mean.
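To make the comparison in the notes concrete, here is a minimal sketch that reuses the array and DataFrame from the report. It deliberately avoids passing axis=None to pandas, so it should behave the same whichever way a given pandas version interprets that argument. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

a = np.arange(6).reshape((3, 2))
df = pd.DataFrame(a, index=list("abc"), columns=list("xy"))

# NumPy reduces over every axis when axis=None (its default).
numpy_total = a.sum(axis=None)

# Explicit per-axis reductions in pandas.
per_column = df.sum(axis="index")    # one value per column: x, y
per_row = df.sum(axis="columns")     # one value per row: a, b, c

# Two ways to get the grand total without relying on axis=None.
total_via_numpy = df.to_numpy().sum()
total_via_chained = df.sum(axis="index").sum()

print(numpy_total, total_via_numpy, total_via_chained)
```

All three totals should agree with the NumPy result of 15 mentioned in note 1.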

Suggested fix for documentation

Either remove the sentence
For DataFrames, specifying axis=None will apply the aggregation across both axes.
or
make the methods match NumPy behaviour, which is perhaps a breaking change.

@Ishticode added the Docs and Needs Triage labels Aug 14, 2023
@Ishticode changed the title from "axis=None does not aggregate along both axes for sum method" to "DOC: axis=None does not aggregate along both axes for sum method" Aug 14, 2023
@jbrockmendel
Member

The current behavior is deprecated; in 3.0, explicitly passing axis=None will sum across both axes. A PR making the documentation clearer in the interim would be welcome.
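For code that has to run both before and after that change, one option is to normalise the result. The sketch below uses a hypothetical helper, sum_all, which is not part of pandas. It assumes that versions with the deprecated behaviour return a Series of per-column sums (possibly with a FutureWarning), and that versions with the new behaviour return a scalar.

```python
import warnings

import numpy as np
import pandas as pd


def sum_all(df: pd.DataFrame):
    """Hypothetical helper: grand total however axis=None is interpreted."""
    with warnings.catch_warnings():
        # Versions that deprecate the old behaviour may emit a FutureWarning here.
        warnings.simplefilter("ignore", FutureWarning)
        result = df.sum(axis=None)
    # Old behaviour: a Series of per-column sums, so reduce once more.
    if isinstance(result, pd.Series):
        return result.sum()
    # New behaviour: already a scalar.
    return result


df = pd.DataFrame(np.arange(6).reshape((3, 2)), index=list("abc"), columns=list("xy"))
total = sum_all(df)
print(total)
```

Passing axis=0 explicitly is the simpler choice when the per-column sums are what is wanted, since that call does not depend on the default.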

@rhshadrach added the Reduction Operations and good first issue labels and removed the Needs Triage label Aug 19, 2023
@nox1134

nox1134 commented Aug 19, 2023

Hi @rhshadrach! I'm new to open source and would like to work on this issue. How do I get started?

@selehadin-cyber

Hi @rhshadrach

Can you assign this to me? I can do a quick PR.
Thanks in advance.

@Brooklynn29

take

@nox1134

nox1134 commented Aug 21, 2023

@rhshadrach I've set up the environment and installed all dependencies, but I'm unsure how to edit docstrings. Should I create a new function and override the existing sum method?

@salujaditi14

@Ishticode, @jbrockmendel, @Brooklynn29, would you like to take a look at this PR?

#54681

@haleematallat

take

@haleematallat removed their assignment Sep 7, 2023
@natmokval
Contributor

I would like to work on it.
