Interpolate NaT #11701

Closed
scls19fr opened this issue Nov 25, 2015 · 7 comments
@scls19fr
Contributor

Hello,

Series.interpolate() does not fill NaT values in a datetime64 Series;
see http://stackoverflow.com/questions/33921795/fill-timestamp-nat-with-a-linear-interpolation/33922824#33922824

Here is a trivial example to show the situation:

import pandas as pd

s = pd.Series(pd.date_range('2015-01-01', '2015-01-30'), name='t')

s[3], s[4], s[5] = pd.NaT, pd.NaT, pd.NaT
s[13], s[14], s[15] = pd.NaT, pd.NaT, pd.NaT
print(s)

0    2015-01-01
1    2015-01-02
2    2015-01-03
3           NaT
4           NaT
5           NaT
6    2015-01-07
7    2015-01-08
8    2015-01-09
9    2015-01-10
10   2015-01-11
11   2015-01-12
12   2015-01-13
13          NaT
14          NaT
15          NaT
16   2015-01-17
17   2015-01-18
18   2015-01-19
19   2015-01-20
20   2015-01-21
21   2015-01-22
22   2015-01-23
23   2015-01-24
24   2015-01-25
25   2015-01-26
26   2015-01-27
27   2015-01-28
28   2015-01-29
29   2015-01-30
Name: t, dtype: datetime64[ns]

print(s.interpolate())
0    2015-01-01
1    2015-01-02
2    2015-01-03
3           NaT
4           NaT
5           NaT
6    2015-01-07
7    2015-01-08
8    2015-01-09
9    2015-01-10
10   2015-01-11
11   2015-01-12
12   2015-01-13
13          NaT
14          NaT
15          NaT
16   2015-01-17
17   2015-01-18
18   2015-01-19
19   2015-01-20
20   2015-01-21
21   2015-01-22
22   2015-01-23
23   2015-01-24
24   2015-01-25
25   2015-01-26
26   2015-01-27
27   2015-01-28
28   2015-01-29
29   2015-01-30
Name: t, dtype: datetime64[ns]

assert s.interpolate().isnull().sum() == 0
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-150-8a59e397a174> in <module>()
----> 1 assert s.interpolate().isnull().sum() == 0

AssertionError:

Kind regards

@jreback
Contributor

jreback commented Nov 26, 2015

This is not implemented for datetimes at the moment. Pull requests are welcome.

@jreback jreback added Enhancement Timeseries Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Difficulty Intermediate labels Nov 26, 2015
@jreback jreback added this to the Next Major Release milestone Nov 26, 2015
@scls19fr
Contributor Author

What is your opinion of the last solution proposed by CT Zhu on Stack Overflow?

df.ix[df.t.isnull(), 't'] = pd.to_datetime(pd.to_numeric(df.t).interpolate())[df.t.isnull()]

Isn't there a way to support NaN with integers without converting to float (which leads to precision issues)?
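
To make the precision concern concrete, here is a small sketch (not taken from the thread; the timestamp value is just an illustrative choice) showing that a nanosecond-resolution epoch value near 2015 does not survive a round trip through float64:

```python
import numpy as np

# 2015-01-01 00:00:00.000000001 expressed as nanoseconds since the epoch
# (illustrative value, chosen so the last nanosecond matters).
ns = np.int64(1420070400000000001)

# Round trip through float64, as a float-based interpolation would do.
roundtrip = np.int64(np.float64(ns))

# Number of nanoseconds lost in the round trip.
lost = int(ns - roundtrip)

# Gap between adjacent float64 values at this magnitude, in nanoseconds.
spacing = np.spacing(np.float64(ns))

print(lost, spacing)
```

At this magnitude adjacent float64 values are a few hundred nanoseconds apart, so timestamps that fall on whole seconds or whole days are typically unaffected, while sub-microsecond detail can be lost.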

@scls19fr
Contributor Author

Shouldn't we look at, for example,

np.int64(pd.NaT)

which is -9223372036854775808
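
As a hedged illustration of this point (a sketch using plain NumPy, not pandas internals): NaT is stored as the smallest int64, which is an ordinary integer once the array is viewed as int64. Any numeric interpolation on the raw integers would therefore treat it as data unless a missing-value mask is built first.

```python
import numpy as np

# A small datetime64[ns] array with one missing entry.
values = np.array(["2015-01-03", "NaT", "2015-01-07"], dtype="datetime64[ns]")

# Reinterpret the same bytes as int64 (nanoseconds since the epoch).
as_int = values.view("int64")

# NaT shows up as the minimum int64 value.
sentinel = np.iinfo(np.int64).min
is_sentinel = as_int == sentinel

# np.isnat gives the same mask without relying on the sentinel directly.
missing = np.isnat(values)

print(as_int[1], missing)
```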

@axelv

axelv commented Sep 27, 2017

I have the impression that interpolating NaT is still not possible in v0.20.3.
Any updates on this issue?

@jreback
Contributor

jreback commented Sep 28, 2017

This is not very hard to do directly (and it is basically what .interpolate() should do; PRs welcome):

In [12]: s2 = s.astype('i8').astype('f8')

In [13]: s2[s.isnull()] = np.nan

In [14]: pd.to_datetime(s2.interpolate())
Out[14]: 
0    2015-01-01
1    2015-01-02
2    2015-01-03
3    2015-01-04
4    2015-01-05
5    2015-01-06
6    2015-01-07
7    2015-01-08
8    2015-01-09
9    2015-01-10
10   2015-01-11
11   2015-01-12
12   2015-01-13
13   2015-01-14
14   2015-01-15
15   2015-01-16
16   2015-01-17
17   2015-01-18
18   2015-01-19
19   2015-01-20
20   2015-01-21
21   2015-01-22
22   2015-01-23
23   2015-01-24
24   2015-01-25
25   2015-01-26
26   2015-01-27
27   2015-01-28
28   2015-01-29
29   2015-01-30
Name: t, dtype: datetime64[ns]
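
The workaround above can be wrapped in a small helper. This is only a sketch under stated assumptions: the name interpolate_datetimes is hypothetical (it is not a pandas API), the input is assumed to be a timezone-naive datetime64 Series, and the interpolation still goes through float64, so sub-microsecond precision may be lost.

```python
import numpy as np
import pandas as pd


def interpolate_datetimes(s):
    """Linearly interpolate NaT gaps in a tz-naive datetime64 Series.

    Hypothetical helper, not part of pandas.
    """
    # Work on a nanosecond-resolution copy of the underlying values.
    values = s.to_numpy().astype("datetime64[ns]")
    missing = np.isnat(values)

    # View as int64 nanoseconds, convert to float, and mark gaps as NaN
    # so that the numeric interpolate() recognises them.
    as_float = values.view("int64").astype("float64")
    as_float[missing] = np.nan
    filled = pd.Series(as_float, index=s.index).interpolate()

    # Leading gaps are not filled by the default settings; keep them as NaT.
    still_missing = filled.isna().to_numpy()
    as_int = np.where(still_missing, 0.0, filled.to_numpy()).round().astype("int64")
    out = as_int.view("datetime64[ns]")
    out[still_missing] = np.datetime64("NaT")
    return pd.Series(out, index=s.index, name=s.name)


# Usage with the example from the original report.
s = pd.Series(pd.date_range("2015-01-01", "2015-01-30"), name="t")
s.iloc[[3, 4, 5, 13, 14, 15]] = pd.NaT
result = interpolate_datetimes(s)
print(result.isna().sum())
```

Building the mask with np.isnat and converting back by hand avoids depending on how a particular pandas version casts NaT to integers or parses floats in to_datetime.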

@scls19fr
Contributor Author

scls19fr commented Sep 28, 2017

@rinoc did some work on this issue in https://github.com/rinoc/pandas/commit/e77e4c8566db68c0ec144f9aeb01dc5225c971d6
But no PR has been sent.
Any news?

@mroeschke
Member

Looks to be a duplicate of #11312

@mroeschke mroeschke added Duplicate Report Duplicate issue or pull request and removed Enhancement Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Timeseries labels Mar 31, 2020