New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: tolerance now takes list-like argument for reindex and get_indexer. #17367

Merged
merged 6 commits into from Oct 14, 2017

Conversation

Projects
None yet
6 participants
@buntwo
Contributor

buntwo commented Aug 29, 2017

Enable use of list-like values for tolerance argument in DataFrame.reindex(), Series.reindex(), Index.get_indexer().

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
@pep8speaks

This comment has been minimized.

Show comment
Hide comment
@pep8speaks

pep8speaks Aug 29, 2017

Hello @buntwo! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on October 14, 2017 at 19:10 Hours UTC

pep8speaks commented Aug 29, 2017

Hello @buntwo! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on October 14, 2017 at 19:10 Hours UTC
@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Aug 29, 2017

Contributor

does this have an associated issue?

can u show an example use case

Contributor

jreback commented Aug 29, 2017

does this have an associated issue?

can u show an example use case

@buntwo

This comment has been minimized.

Show comment
Hide comment
@buntwo

buntwo Aug 29, 2017

Contributor

Use case: suppose you have a timeseries of values for a stock, with some days with missing data in between, and you want to select some date range that includes the missing data. You want to have some fill backward (forward, nearest) N-many-days logic when doing so, but you want N to depend on some other property of the stock that varies as time. Currently pandas only supports N constant. This features enables this use case, by allowing the user to construct a list-like object of the same size that contains the tolerance per point and pass that in.

Contributor

buntwo commented Aug 29, 2017

Use case: suppose you have a timeseries of values for a stock, with some days with missing data in between, and you want to select some date range that includes the missing data. You want to have some fill backward (forward, nearest) N-many-days logic when doing so, but you want N to depend on some other property of the stock that varies as time. Currently pandas only supports N constant. This features enables this use case, by allowing the user to construct a list-like object of the same size that contains the tolerance per point and pass that in.

@gfyoung

This comment has been minimized.

Show comment
Hide comment
@gfyoung

gfyoung Aug 29, 2017

Member

@buntwo : It seems like you want the tolerance time-varying. I suppose that could be useful, but I haven't seen a use-case like this really.

I would suggest you open an issue for this and point your PR to it so we can discuss.

In the future, for enhancements like this, it would be preferable to open an issue first so that we can discuss rather than in the PR.

Member

gfyoung commented Aug 29, 2017

@buntwo : It seems like you want the tolerance time-varying. I suppose that could be useful, but I haven't seen a use-case like this really.

I would suggest you open an issue for this and point your PR to it so we can discuss.

In the future, for enhancements like this, it would be preferable to open an issue first so that we can discuss rather than in the PR.

@codecov

This comment has been minimized.

Show comment
Hide comment
@codecov

codecov bot Sep 1, 2017

Codecov Report

❗️ No coverage uploaded for pull request base (master@36dadd7). Click here to learn what that means.
The diff coverage is 95.91%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #17367   +/-   ##
=========================================
  Coverage          ?   91.02%           
=========================================
  Files             ?      162           
  Lines             ?    49603           
  Branches          ?        0           
=========================================
  Hits              ?    45150           
  Misses            ?     4453           
  Partials          ?        0
Flag Coverage Δ
#multiple 88.8% <95.91%> (?)
#single 40.22% <10.2%> (?)
Impacted Files Coverage Δ
pandas/core/generic.py 91.99% <ø> (ø)
pandas/core/indexes/datetimelike.py 96.75% <100%> (ø)
pandas/core/indexes/numeric.py 97.24% <100%> (ø)
pandas/core/indexes/datetimes.py 95.35% <100%> (ø)
pandas/core/indexes/base.py 95.89% <100%> (ø)
pandas/core/indexes/period.py 92.5% <77.77%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 36dadd7...d6ec1d6. Read the comment docs.

codecov bot commented Sep 1, 2017

Codecov Report

❗️ No coverage uploaded for pull request base (master@36dadd7). Click here to learn what that means.
The diff coverage is 95.91%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #17367   +/-   ##
=========================================
  Coverage          ?   91.02%           
=========================================
  Files             ?      162           
  Lines             ?    49603           
  Branches          ?        0           
=========================================
  Hits              ?    45150           
  Misses            ?     4453           
  Partials          ?        0
Flag Coverage Δ
#multiple 88.8% <95.91%> (?)
#single 40.22% <10.2%> (?)
Impacted Files Coverage Δ
pandas/core/generic.py 91.99% <ø> (ø)
pandas/core/indexes/datetimelike.py 96.75% <100%> (ø)
pandas/core/indexes/numeric.py 97.24% <100%> (ø)
pandas/core/indexes/datetimes.py 95.35% <100%> (ø)
pandas/core/indexes/base.py 95.89% <100%> (ø)
pandas/core/indexes/period.py 92.5% <77.77%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 36dadd7...d6ec1d6. Read the comment docs.

@codecov

This comment has been minimized.

Show comment
Hide comment
@codecov

codecov bot Sep 1, 2017

Codecov Report

Merging #17367 into master will decrease coverage by 0.02%.
The diff coverage is 92.68%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17367      +/-   ##
==========================================
- Coverage   91.24%   91.21%   -0.03%     
==========================================
  Files         163      163              
  Lines       50080    50099      +19     
==========================================
+ Hits        45693    45700       +7     
- Misses       4387     4399      +12
Flag Coverage Δ
#multiple 89.02% <92.68%> (-0.01%) ⬇️
#single 40.31% <21.95%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 92.2% <ø> (ø) ⬆️
pandas/core/indexes/timedeltas.py 91.19% <100%> (ø) ⬆️
pandas/core/tools/timedeltas.py 98.38% <100%> (+0.05%) ⬆️
pandas/core/indexes/datetimelike.py 97.1% <100%> (ø) ⬆️
pandas/core/indexes/numeric.py 97.23% <100%> (+0.05%) ⬆️
pandas/core/indexes/base.py 96.42% <80%> (-0.06%) ⬇️
pandas/core/indexes/datetimes.py 95.5% <83.33%> (-0.08%) ⬇️
pandas/core/indexes/period.py 92.68% <91.66%> (-0.11%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e001500...7e7051a. Read the comment docs.

codecov bot commented Sep 1, 2017

Codecov Report

Merging #17367 into master will decrease coverage by 0.02%.
The diff coverage is 92.68%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17367      +/-   ##
==========================================
- Coverage   91.24%   91.21%   -0.03%     
==========================================
  Files         163      163              
  Lines       50080    50099      +19     
==========================================
+ Hits        45693    45700       +7     
- Misses       4387     4399      +12
Flag Coverage Δ
#multiple 89.02% <92.68%> (-0.01%) ⬇️
#single 40.31% <21.95%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 92.2% <ø> (ø) ⬆️
pandas/core/indexes/timedeltas.py 91.19% <100%> (ø) ⬆️
pandas/core/tools/timedeltas.py 98.38% <100%> (+0.05%) ⬆️
pandas/core/indexes/datetimelike.py 97.1% <100%> (ø) ⬆️
pandas/core/indexes/numeric.py 97.23% <100%> (+0.05%) ⬆️
pandas/core/indexes/base.py 96.42% <80%> (-0.06%) ⬇️
pandas/core/indexes/datetimes.py 95.5% <83.33%> (-0.08%) ⬇️
pandas/core/indexes/period.py 92.68% <91.66%> (-0.11%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e001500...7e7051a. Read the comment docs.

@buntwo buntwo closed this Sep 1, 2017

@buntwo buntwo reopened this Sep 1, 2017

@jreback

comments in-line. you are doing quite a bit of extra checking, I think you can pare this down substantially. you may need to adjust your tests slightly as well.

Show outdated Hide outdated doc/source/whatsnew/v0.21.0.txt Outdated
Show outdated Hide outdated pandas/core/generic.py Outdated
Show outdated Hide outdated pandas/core/generic.py Outdated
Show outdated Hide outdated pandas/core/indexes/base.py Outdated
Show outdated Hide outdated pandas/core/indexes/base.py Outdated
Show outdated Hide outdated pandas/core/indexes/datetimes.py Outdated
Show outdated Hide outdated pandas/core/indexes/numeric.py Outdated
Show outdated Hide outdated pandas/core/indexes/period.py Outdated
Show outdated Hide outdated pandas/core/indexes/period.py Outdated
Show outdated Hide outdated pandas/core/indexes/period.py Outdated

jreback added a commit to jreback/pandas that referenced this pull request Sep 23, 2017

@buntwo

This comment has been minimized.

Show comment
Hide comment
@buntwo

buntwo Oct 14, 2017

Contributor

The failed test in appveyor I think has nothing to do with my change, didn't fail on the last commit.

Contributor

buntwo commented Oct 14, 2017

The failed test in appveyor I think has nothing to do with my change, didn't fail on the last commit.

Show outdated Hide outdated doc/source/whatsnew/v0.21.0.txt Outdated
Show outdated Hide outdated pandas/core/tools/timedeltas.py Outdated
Show outdated Hide outdated pandas/tests/indexes/test_base.py Outdated
Show outdated Hide outdated pandas/core/indexes/period.py Outdated
offset = frequencies.to_offset(self.freq.rule_code)
if isinstance(offset, offsets.Tick):
nanos = tslib._delta_to_nanoseconds(other)
if isinstance(other, np.ndarray):
nanos = np.vectorize(tslib._delta_to_nanoseconds)(other)

This comment has been minimized.

@jreback

jreback Oct 14, 2017

Contributor

the array version of this function is almost trivial
if u can add it alongside the other and call here
(u just need to type the input as ndarray i think)

@jreback

jreback Oct 14, 2017

Contributor

the array version of this function is almost trivial
if u can add it alongside the other and call here
(u just need to type the input as ndarray i think)

This comment has been minimized.

@jreback

jreback Oct 14, 2017

Contributor

actually ignore the above this is ok here

@jreback

jreback Oct 14, 2017

Contributor

actually ignore the above this is ok here

@jreback jreback added the Enhancement label Oct 14, 2017

@jreback jreback added this to the 0.21.0 milestone Oct 14, 2017

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Oct 14, 2017

Contributor

@buntwo ok parametrized the tests, ping on green.

Contributor

jreback commented Oct 14, 2017

@buntwo ok parametrized the tests, ping on green.

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Oct 14, 2017

Contributor

going to fail from some flake errors

git diff master -u "*.py" | flake8 --diff
will show you them

Contributor

jreback commented Oct 14, 2017

going to fail from some flake errors

git diff master -u "*.py" | flake8 --diff
will show you them

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Oct 14, 2017

Contributor

also rebase on master (you don't need to squash, though its easier to do so)

Contributor

jreback commented Oct 14, 2017

also rebase on master (you don't need to squash, though its easier to do so)

@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Oct 14, 2017

Contributor

your rebasing is removing things. just leave it and restore code (in a new commit) that was removed. making sure all tests pass.

Contributor

jreback commented Oct 14, 2017

your rebasing is removing things. just leave it and restore code (in a new commit) that was removed. making sure all tests pass.

@buntwo

This comment has been minimized.

Show comment
Hide comment
@buntwo

buntwo Oct 14, 2017

Contributor

Apologies for the mistake in rebasing. I used git diff to cherry pick the unintended changes and git apply -R to revert.

Contributor

buntwo commented Oct 14, 2017

Apologies for the mistake in rebasing. I used git diff to cherry pick the unintended changes and git apply -R to revert.

@buntwo

This comment has been minimized.

Show comment
Hide comment
@buntwo

buntwo Oct 14, 2017

Contributor

Ping

Contributor

buntwo commented Oct 14, 2017

Ping

@jreback jreback merged commit 5c0b20a into pandas-dev:master Oct 14, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jreback

This comment has been minimized.

Show comment
Hide comment
@jreback

jreback Oct 14, 2017

Contributor

thanks @buntwo

Contributor

jreback commented Oct 14, 2017

thanks @buntwo

@buntwo

This comment has been minimized.

Show comment
Hide comment
@buntwo

buntwo Oct 14, 2017

Contributor

thank you @jreback !

Contributor

buntwo commented Oct 14, 2017

thank you @jreback !

@buntwo buntwo deleted the buntwo:ndarray_tolerance branch Oct 14, 2017

kchomski-reef added a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017

kchomski-reef added a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017

Merge remote-tracking branch 'upstream/master'
* upstream/master: (76 commits)
  CategoricalDtype construction: actually use fastpath (pandas-dev#17891)
  DEPR: Deprecate tupleize_cols in to_csv (pandas-dev#17877)
  BUG: Fix wrong column selection in drop_duplicates when duplicate column names (pandas-dev#17879)
  DOC: Adding examples to update docstring (pandas-dev#16812) (pandas-dev#17859)
  TST: Skip if no openpyxl in test_excel (pandas-dev#17883)
  TST: Catch read_html slow test warning (pandas-dev#17874)
  flake8 cleanup (pandas-dev#17873)
  TST: remove moar warnings (pandas-dev#17872)
  ENH: tolerance now takes list-like argument for reindex and get_indexer. (pandas-dev#17367)
  ERR: Raise ValueError when week is passed in to_datetime format witho… (pandas-dev#17819)
  TST: remove some deprecation warnings (pandas-dev#17870)
  Refactor index-as-string groupby tests and fix spurious warning (Bug 17383) (pandas-dev#17843)
  BUG: merging with a boolean/int categorical column (pandas-dev#17841)
  DEPR: Deprecate read_csv arguments fully (pandas-dev#17865)
  BUG: to_json - prevent various segfault conditions (GH14256) (pandas-dev#17857)
  CLN: Use pandas.core.common for None checks (pandas-dev#17816)
  BUG: set tz on DTI from fixed format HDFStore (pandas-dev#17844)
  RLS: v0.21.0rc1
  Whatsnew cleanup (pandas-dev#17858)
  DEPR: Deprecate the convert parameter completely (pandas-dev#17831)
  ...

wooyekim added a commit to wooyekim/pandas that referenced this pull request Oct 29, 2017

Dobatymo added a commit to Dobatymo/pandas that referenced this pull request Oct 30, 2017

alanbato added a commit to alanbato/pandas that referenced this pull request Nov 10, 2017

No-Stream added a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment