New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEAT-#6144: Stop defaulting at API layer for a bunch of methods #6145
FEAT-#6144: Stop defaulting at API layer for a bunch of methods #6145
Conversation
…bunch of Series/DataFrame methods Co-authored-by: mvashishtha <mahesh@ponder.io> Co-authored-by: Naren Krishna <naren@ponder.io> Signed-off-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
Signed-off-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
Signed-off-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
Signed-off-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
Signed-off-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few quick questions but overall LGTM!
df = df.squeeze(axis=1) | ||
return df.mask(cond, **kwargs) | ||
|
||
return DataFrameDefault.register(mask)(self, cond, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we pass in pandas.DataFrame.mask
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably can, I think I've copy-pasted some of my other fix where the behaviour was different... I had already implemented the change locally before your comment, but haven't tested to it wasn't pushed. Should be fixed now.
@doc_utils.add_refer_to("Series.argsort") | ||
def argsort(self, **kwargs): # noqa: PR02 | ||
""" | ||
Return the integer indices that would sort the Series values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit but why have we added a docstring here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our doc checker insists on having docstrings, so we'll have to either silence it with pragmas or fill in at least some docstrings... and for methods which are a direct match for pandas I feel it useful to copy their docstrings for clarity.
"mask", | ||
cond, | ||
other=other, | ||
return self._create_or_update_from_compiler( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using this instead of just the constructor? Or will this method just use the constructor if in place is false, and otherwise do an in place update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. It handles the inplace
properly.
modin/pandas/series.py
Outdated
return self._default_to_pandas(pandas.Series.nlargest, n=n, keep=keep) | ||
if len(self._query_compiler.columns) == 0: | ||
raise NotImplementedError( | ||
"Series.nlargest is not implemented for empty Series." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true for pandas as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, just checked, and in pandas it returns Series([], dtype: float64)
thingy... should be fixed now.
**kwargs, | ||
): # noqa: PR01, RT01, D200 | ||
return self._create_or_update_from_compiler( | ||
self._query_compiler.interpolate( | ||
method=method, | ||
axis=axis, | ||
limit=limit, | ||
inplace=False, |
Check notice
Code scanning / CodeQL
Mismatch between signature and use of an overridden method Note
call
method Series.value_counts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left 2 quick questions - lgtm otherwise!
modin/pandas/base.py
Outdated
@@ -1905,7 +1905,6 @@ def mask( | |||
level=level, | |||
errors=errors, | |||
try_cast=try_cast, | |||
squeeze_self=not self._is_dataframe, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this argument removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming this is because mask
and pct_change
are equivalent in pandas whether applied to a Series or a 1 col DataFrame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, they seem to behave identically in pandas
return super().mask(cond, other, inplace, axis, level, errors, try_cast) | ||
# This method exists because `errors` has different default value for Series :( | ||
return super(Series, self).mask( | ||
cond, other, inplace, axis, level, "raise", try_cast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we pass in errors
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also seems this argument has been deprecated since 1.5? So we should be coercing types anyways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That deprecation is the reason why I'm not passing the errors
through, as it is expected to always behave as if raise
is given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What do these changes do?
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst
is up-to-date