BUG: Implement interpolating NaT values in datetime series #17709

Closed
wants to merge 4 commits into from

@s-celles s-celles commented Sep 28, 2017

  • closes Interpolate NaT #11701
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

pep8speaks commented Sep 28, 2017

Hello @scls19fr! Thanks for updating the PR.

Line 1225:80: E501 line too long (82 > 79 characters)

Comment last updated on September 28, 2017 at 19:41 Hours UTC

@s-celles
Contributor Author

If I change the expected value in the test from

    expected = pd.Series(pd.date_range('2015-01-01', '2015-01-30'))

to

    expected = pd.Series(pd.date_range('2015-01-01', '2015-01-30', tz="UTC"))

the unit test no longer passes.

This is because of the check

    if is_datetime64_dtype(self) and self.isnull().any():
        ...

Any idea?
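
For reference, a minimal sketch of why the tz-aware variant skips that branch. It assumes the public helpers in `pandas.api.types` behave like the internal ones used in the PR; the variable names are only illustrative.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_datetime64_dtype

naive = pd.Series(pd.date_range("2015-01-01", "2015-01-30"))
aware = pd.Series(pd.date_range("2015-01-01", "2015-01-30", tz="UTC"))

# A tz-naive series has a NumPy datetime64 dtype, so the check matches.
naive_matches = is_datetime64_dtype(naive)

# A tz-aware series has a DatetimeTZDtype instead, so the same check fails.
aware_matches = is_datetime64_dtype(aware)
aware_is_tz = isinstance(aware.dtype, pd.DatetimeTZDtype)

# This helper accepts both tz-naive and tz-aware datetime dtypes.
aware_any = is_datetime64_any_dtype(aware)

print(naive_matches, aware_matches, aware_is_tz, aware_any)
```

So a guard that is meant to cover both cases probably needs either an extra tz-aware check or `is_datetime64_any_dtype`.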

codecov bot commented Sep 28, 2017

Codecov Report

Merging #17709 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #17709      +/-   ##
==========================================
+ Coverage   91.24%   91.26%   +0.01%     
==========================================
  Files         163      163              
  Lines       49766    49773       +7     
==========================================
+ Hits        45411    45423      +12     
+ Misses       4355     4350       -5

| Flag | Coverage Δ |
|---|---|
| #multiple | 89.05% <100%> (+0.02%) ⬆️ |
| #single | 40.33% <37.5%> (-0.07%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| pandas/core/series.py | 95.06% <100%> (+0.12%) ⬆️ |
| pandas/io/gbq.py | 25% <0%> (-58.34%) ⬇️ |
| pandas/core/frame.py | 97.73% <0%> (-0.1%) ⬇️ |
| pandas/core/generic.py | 92.13% <0%> (+0.05%) ⬆️ |
| pandas/plotting/_converter.py | 65.2% <0%> (+1.81%) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 074b485...3a90ea7. Read the comment docs.

is_datetime64tz_dtype(self)) and self.isnull().any():
s2 = self.astype('i8').astype('f8')
s2[self.isnull()] = np.nan
return to_datetime(s2.interpolate(*args, **kwargs))

I think you'll need to handle timezone information here. Once you do the .astype('i8') all TZ info is gone.

If someone has timezones, should it be tz_convert('UTC') -> interpolate -> tz_localize("UTC") -> tz_convert(original)?

We'll want to test this around DST transitions...
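
A rough sketch of that round trip, not the PR's implementation. `interpolate_nat` is a hypothetical helper, not pandas API, and it assumes the default linear interpolation is acceptable. Because float64 cannot represent every nanosecond timestamp exactly, interpolated values may be off by a few hundred nanoseconds.

```python
import numpy as np
import pandas as pd


def interpolate_nat(s, **kwargs):
    """Hypothetical helper (not pandas API): fill NaT by numeric interpolation."""
    tz = getattr(s.dtype, "tz", None)  # None for tz-naive datetime64
    # Interpolate on the UTC timeline so DST shifts in the local zone do not
    # distort the spacing between points.
    naive = s.dt.tz_convert("UTC").dt.tz_localize(None) if tz is not None else s
    mask = naive.isna().to_numpy()

    # Nanoseconds since the epoch as floats, with NaN where the input had NaT.
    ns = naive.to_numpy().astype("datetime64[ns]").view("int64").astype("float64")
    ns[mask] = np.nan
    filled = pd.Series(ns, index=s.index).interpolate(**kwargs).to_numpy()

    # Back to datetime64[ns]; positions that are still NaN stay NaT.
    out = np.full(len(filled), np.datetime64("NaT", "ns"))
    ok = ~np.isnan(filled)
    out[ok] = np.round(filled[ok]).astype("int64").view("datetime64[ns]")
    result = pd.Series(out, index=s.index, name=s.name)
    if tz is not None:
        # Restore the original timezone after working in UTC.
        result = result.dt.tz_localize("UTC").dt.tz_convert(tz)
    return result


demo = pd.Series(pd.date_range("2015-01-01", periods=5, freq="D", tz="Europe/Paris"))
demo.iloc[2] = pd.NaT
demo_filled = interpolate_nat(demo)
print(demo_filled)
```

The same helper could be pointed at a series that spans a DST change to check that the filled values land where expected in local time.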

this should not be done here at all, rather in pandas.core.missing.py

@sinhrks sinhrks added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Datetime Datetime data dtype Bug labels Sep 29, 2017
jreback commented Nov 10, 2017

Can you rebase and respond to the comments?

@s-celles
Contributor Author

I do not have time right now, sorry.
Especially because I think that @TomAugspurger is right:
we need to handle timezones and test carefully around DST transitions.
So I'm closing this PR. Please go ahead!

@s-celles s-celles closed this Nov 11, 2017