[SPARK-39907][PS] Implement axis and skipna of Series.argmin #37328

Closed
wants to merge 4 commits into from

zhengruifeng
Contributor

What changes were proposed in this pull request?

1. Implement the axis and skipna parameters of Series.argmin.
2. Compute argmin in a single pass, like argmax.

Why are the changes needed?

To add the missing parameters.
After this change, the underlying implementations of argmax and argmin are almost the same.
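
As a rough illustration of why a single-pass argmin and argmax can share almost all of their code, here is a minimal plain-Python sketch. It is not the pandas-on-Spark implementation: `_arg_extreme`, `argmin`, and `argmax` are hypothetical names, the input is an ordinary in-memory sequence, and NA handling is deliberately left out.

```python
import operator
from typing import Callable, Sequence


def _arg_extreme(
    values: Sequence[float],
    is_better: Callable[[float, float], bool],
) -> int:
    """Hypothetical helper: scan once and return the position of the extreme value."""
    if len(values) == 0:
        raise ValueError("cannot take the position of an extreme of an empty sequence")
    best_pos = 0
    for pos in range(1, len(values)):
        # A strict comparison keeps the earliest position when values tie.
        if is_better(values[pos], values[best_pos]):
            best_pos = pos
    return best_pos


def argmin(values: Sequence[float]) -> int:
    # Only the comparison differs from argmax.
    return _arg_extreme(values, operator.lt)


def argmax(values: Sequence[float]) -> int:
    return _arg_extreme(values, operator.gt)
```

With this shape, the two public functions differ only in the comparison they pass in, which is the sense in which the implementations end up "almost the same".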

Does this PR introduce any user-facing change?

Yes, new parameters.

How was this patch tested?

Added tests.

@HyukjinKwon
Member

cc @itholic @xinrong-meng @ueshin FYI

axis : {{None}}
Dummy argument for consistency with Series.
skipna : bool, default True
Exclude NA/null values. If the entire Series is NA, the result
Member

If the entire Series is NA, the result

This sentence seems to be an addition relative to the pandas (1.4.3) docs [1]. Do we want to add a test for it?

[1] https://pandas.pydata.org/docs/reference/api/pandas.Series.argmin.html

Contributor Author

It returns -1, not NA.

This behavior is the same as in pandas.

Good catch!

@Yikun
Member

Yikun commented Jul 29, 2022

Otherwise LGTM! Thanks.

self.assert_eq(pser2.argmax(), psser2.argmax())
self.assert_eq(pser2.argmin(skipna=False), psser2.argmin(skipna=False))
self.assert_eq(pser2.argmax(skipna=False), psser2.argmax(skipna=False))

# Null Series
self.assert_eq(pd.Series([np.nan]).argmin(), ps.Series([np.nan]).argmin())
Contributor Author

Both pd.Series([np.nan]).argmin() and ps.Series([np.nan]).argmin() return -1.
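
The -1 result for an all-NA Series can be sketched in plain Python as follows. This is only an illustration of the semantics discussed in this thread, not the code in the PR: `argmin_with_na`, `NOT_FOUND`, and `_is_na` are hypothetical names, and the `skipna=False` branch returning -1 when any NA is present is an assumption about the pandas behavior at the time (1.4.x), which the thread does not state explicitly.

```python
import math
from typing import Optional, Sequence

NOT_FOUND = -1  # sentinel position reported in the thread for an all-NA Series


def _is_na(value: Optional[float]) -> bool:
    # Treat both None and float NaN as missing.
    return value is None or (isinstance(value, float) and math.isnan(value))


def argmin_with_na(values: Sequence[Optional[float]], skipna: bool = True) -> int:
    """Hypothetical single-pass argmin that returns -1 instead of NA."""
    if len(values) == 0:
        raise ValueError("attempt to get argmin of an empty sequence")
    best_pos = NOT_FOUND
    for pos, value in enumerate(values):
        if _is_na(value):
            if not skipna:
                # Assumption: with skipna=False, any NA makes the result -1.
                return NOT_FOUND
            continue
        # Strict comparison keeps the first position on ties.
        if best_pos == NOT_FOUND or value < values[best_pos]:
            best_pos = pos
    return best_pos
```

Under these assumptions, `argmin_with_na([float("nan")])` gives -1, matching the behavior described for both libraries above.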

"""
Return int position of the smallest value in the Series.

If the minimum is achieved in multiple locations,
the first row position is returned.

Parameters
----------
axis : {{None}}
Member

Seems like a mistake? {{None}}

Member

@HyukjinKwon HyukjinKwon left a comment

LGTM2 otherwise

@HyukjinKwon
Member

Merged to master.

@zhengruifeng zhengruifeng deleted the ps_update_argmin branch August 1, 2022 01:51
@zhengruifeng
Contributor Author

Thank you @Yikun @HyukjinKwon for the review!
