Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: Centralised _check_percentile #27584

Merged
merged 21 commits into from
Oct 3, 2019

Conversation

hedonhermdev
Copy link
Contributor

@hedonhermdev hedonhermdev commented Jul 25, 2019

- Fixes GH27559.
- Moved the _check_percentile method on NDFrame to algorithms as
check_percentile.
- Changed the references to _check_percentile in pandas/core/series.py
and pandas/core/frame.py
@WillAyd
Copy link
Member

WillAyd commented Jul 25, 2019

Can you also update the groupby reference from the OP?

@WillAyd WillAyd added the Refactor Internal refactoring of code label Jul 25, 2019
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment

@hedonhermdev
Copy link
Contributor Author

See above comment

Sorry I couldn't figure out what it was referring to. I didnt find any references to _check_percentile in groupby. Can you help me figure it out?

@WillAyd
Copy link
Member

WillAyd commented Jul 25, 2019

Oh sorry misread that on my end - not there yet for groupby so I think OK for now

@@ -1105,6 +1105,22 @@ def _get_score(at):
return result


def check_percentile(q):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you annotate this function? I think q should just be float, Iterable[float]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return should be an ndarray

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"""

msg = "percentiles should all be in the interval [0, 1]. " "Try {0} instead."
q = np.asarray(q)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just create a new variable here q_arr instead of reusing the argument; should help with some of the typing errors

pandas/core/algorithms.py Outdated Show resolved Hide resolved
Co-Authored-By: William Ayd <william.ayd@icloud.com>
@jreback jreback added this to the 0.25.1 milestone Jul 25, 2019
@jreback
Copy link
Contributor

jreback commented Jul 25, 2019

ok will have to backport this as #27473 depends

pandas/core/algorithms.py Outdated Show resolved Hide resolved
@pep8speaks
Copy link

pep8speaks commented Jul 26, 2019

Hello @hedonhermdev! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-10-03 15:22:43 UTC

@TomAugspurger
Copy link
Contributor

@hedonhermdev seems like there's still a linting issue, and maybe some CI failures.

@TomAugspurger
Copy link
Contributor

@hedonhermdev can you update?

@TomAugspurger
Copy link
Contributor

Pushing to 1.0

@TomAugspurger TomAugspurger modified the milestones: 0.25.1, 1.0 Aug 20, 2019
@jbrockmendel
Copy link
Member

@hedonhermdev can you rebase

@WillAyd
Copy link
Member

WillAyd commented Sep 13, 2019

Nice idea but I think this has gone stale. @hedonhermdev please ping if you'd like to pick this back up

@WillAyd WillAyd closed this Sep 13, 2019
@hedonhermdev
Copy link
Contributor Author

Hey sorry for the disappearance, if possible I would like to work on it again. Can you reopen the pull request?

@WillAyd WillAyd reopened this Sep 14, 2019
@WillAyd
Copy link
Member

WillAyd commented Sep 14, 2019

Sure - thanks for the contribution

- Fixes GH27559.
- Moved the _check_percentile method on NDFrame to algorithms as
check_percentile.
- Changed the references to _check_percentile in pandas/core/series.py
and pandas/core/frame.py

Annotated check_percentile function.

Update pandas/core/algorithms.py

Co-Authored-By: William Ayd <william.ayd@icloud.com>

Fixed typing error in check_percentile.

Refactored docstring of check_percentile function.

Fixed PEP8 issues.
@@ -1102,6 +1102,37 @@ def _get_score(at):
return result


def check_percentile(q: Union[float, Iterable[float]]) -> np.ndarray:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this make more sense in validators?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's reasonable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I shift it to utilts/_validators.py ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea let's do that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would renaming it to validate_percentile be better?

@@ -1864,7 +1860,7 @@ def searchsorted(arr, value, side="left", sorter=None):
else:
value = array(value, dtype=dtype)
elif not (
is_object_dtype(arr) or is_numeric_dtype(arr) or is_categorical_dtype(arr)
is_object_dtype(arr) or is_numeric_dtype(arr) or is_categorical_dtype(arr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you not add unrelated changes (all of this whitespace)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for that.

pandas/util/_validators.py Outdated Show resolved Hide resolved
pandas/util/_validators.py Outdated Show resolved Hide resolved
otherwise raises a ValueError.

Parameters
-------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be the length of Parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, okay.

or is_period_dtype(dtype)
or is_datetime64_any_dtype(dtype)
or is_timedelta64_dtype(dtype)
needs_i8_conversion(values)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for these

@TomAugspurger
Copy link
Contributor

Looks like a linting failure: https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=17752

Try running black pandas. Instructions are in the contributing guide.

@TomAugspurger
Copy link
Contributor

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm (small comments). pls merge master and ping on green

@@ -1165,7 +1163,6 @@ def compute(self, method):

# slow method
if n >= len(self.obj):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you revert these files as they are not actually changed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isort fails if I don't commit them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that's odd

------
ValueError if percentiles are not in given interval([0, 1]).
"""
msg = "percentiles should all be in the interval [0, 1]. " "Try {0} instead."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra quotation marks before 'Try'

@hedonhermdev
Copy link
Contributor Author

@jbrockmendel should I modify excel.py too? As I'm getting a linting issue because of it.

@jbrockmendel
Copy link
Member

It looks like it just wants you to revert the whitespace change in the excel.py file

@jreback jreback merged commit 5686e9a into pandas-dev:master Oct 3, 2019
@jreback
Copy link
Contributor

jreback commented Oct 3, 2019

thanks @hedonhermdev

galuhsahid added a commit to galuhsahid/pandas that referenced this pull request Oct 4, 2019
* master: (22 commits)
  DOC: fix PR09,PR08 errors for pandas.Timestamp (pandas-dev#28739)
  WEB: Add diversity note to team.md (pandas-dev#28630)
  DOC: Minor fixes in pandas/testing.py docstring. (pandas-dev#28752)
  TST: port maybe_promote tests from pandas-dev#23982 (pandas-dev#28764)
  Bugfix/groupby datetime issue (pandas-dev#28569)
  reenable codecov (pandas-dev#28750)
  CLN: Centralised _check_percentile (pandas-dev#27584)
  DEPR: Deprecate Index.set_value (pandas-dev#28621)
  CLN: Fix typo in contributing.rst (pandas-dev#28761)
  Fixed docstring errors in pandas.period range and pandas.PeriodIndex (pandas-dev#28756)
  BUG: Fix TypeError raised in libreduction (pandas-dev#28643)
  DOC: Pandas.Series.drop docstring PR02 (pandas-dev#27976) (pandas-dev#28742)
  DOC: Fixed doctring errors PR08, PR09 in pandas.io (pandas-dev#28748)
  TST: Fix broken test cases where Timedelta/Timestamp raise (pandas-dev#28729)
  REF: Consolidate alignment calls in DataFrame ops (pandas-dev#28638)
  BUG: Fix dep generation (pandas-dev#28734)
  Added doctstring to fixture (pandas-dev#28727)
  DOC: Fixed PR06 docstrings errors in pandas.timedelta_range (pandas-dev#28719)
  replaced safe_import with a corresponding test decorator (pandas-dev#28731)
  BUG: Fix RangeIndex.get_indexer for decreasing RangeIndex (pandas-dev#28680)
  ...
jbrockmendel pushed a commit to jbrockmendel/pandas that referenced this pull request Oct 8, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
bongolegend pushed a commit to bongolegend/pandas that referenced this pull request Jan 1, 2020
@hedonhermdev hedonhermdev deleted the small-refactor branch September 7, 2020 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CLN: centrailize _check_percentile
6 participants