Question: unequally spaced timeseries #60

Closed
ajdapretnar opened this issue Feb 4, 2020 · 5 comments
Labels
question Further information is requested

Comments

@ajdapretnar
Contributor

First, big thanks for a nice library! Very useful!

I am trying to follow your Quick Start for SeasonalAD, but I am running into a problem. My time series seems to be unequally spaced (e.g. 09:05, 09:15, 09:30, 09:55), so SeasonalAD complains:
RuntimeError: Series does not follow any known frequency (e.g. second, minute, hour, day, week, month, year, etc.
How can I overcome this?
I have tried rounding my series to 15min, removing duplicates and resampling.

s_train.index = s_train.index.round('15min')
s_train = s_train[~s_train.index.duplicated()]
s_train = s_train.asfreq('15min')

Obviously nothing worked. Any ideas how to solve this? I wish to retain as much granularity as possible.

@tailaiw
Contributor

tailaiw commented Feb 4, 2020

@ajdapretnar Seasonal decomposition in ADTK requires the input to be an equally spaced time series, as you noticed. The series should therefore be resampled at a constant frequency, for example 15 min.

If you got a ValueError regarding NaN values, the reason is that resampling can introduce NaN values into the new time series (in your example, 9:45 will be NaN). Currently, SeasonalAD does not support time series containing NaN, except at the start or end of the series, where NaN values are ignored.

The adtk.data module offers a resample function, which resamples a time series at a user-given spacing using linear interpolation. You can try it as follows.

from adtk.data import resample
s_train = resample(s_train, dT="15 min")

If you want to fill NaN with forward or backward filling instead of interpolation, I believe you can also use the fillna method of Pandas.
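
To make the NaN gap concrete, here is a minimal pandas-only sketch. It assumes a made-up series with the timestamps from the question (the date and the values 1.0 to 4.0 are invented for illustration), repeats the rounding steps from the question, and then fills the gap with forward filling. It does not use any ADTK function.

```python
import pandas as pd

# Hypothetical data: timestamps from the question, values invented for illustration.
idx = pd.to_datetime(
    ["2020-02-04 09:05", "2020-02-04 09:15", "2020-02-04 09:30", "2020-02-04 09:55"]
)
s_train = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# Steps from the question: round to 15 min, drop duplicate timestamps, enforce frequency.
s_rounded = s_train.copy()
s_rounded.index = s_rounded.index.round("15min")
s_rounded = s_rounded[~s_rounded.index.duplicated()]
s_regular = s_rounded.asfreq("15min")  # inserts a NaN row for the missing 09:45 slot

# Fill the gap by carrying the previous observation forward.
s_filled = s_regular.ffill()

print(s_regular)
print(s_filled)
```

Replacing `ffill()` with `bfill()` gives backward filling, and `interpolate(method="time")` gives linear interpolation between neighbouring observations instead.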

@tailaiw tailaiw added the question Further information is requested label Feb 4, 2020
@ajdapretnar
Contributor Author

Thanks, this is useful! Will try it.

@abhimanyu3-zz

My df does not have any null values, but I still get this error when I use a larger sample.
I have minute-by-minute data, and the error appears when I go above 7 days. How many data points can it handle? When I run it on more than 10 days, it does not report any anomalies.

@zhj0513

zhj0513 commented Dec 24, 2020

Try s_train.resample('15min').ffill() if your version of adtk.data doesn't have resample.

@taroyutao

adtk.data.resample was removed because its functionality largely overlaps with the pandas resampler module.
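
For anyone who wants interpolation-based resampling without that function, here is a rough pandas-only sketch. `resample_linear` is a hypothetical helper name, not part of ADTK or pandas, and it is not claimed to reproduce the removed function exactly. It interpolates linearly in time onto a regular grid that lies inside the observed range, so no rounding of the original timestamps is needed. The sample data is invented.

```python
import pandas as pd


def resample_linear(s, freq="15min"):
    """Hypothetical helper: time-weighted linear interpolation onto a regular grid."""
    # Regular grid restricted to the observed range, so nothing is extrapolated.
    grid = pd.date_range(s.index.min().ceil(freq), s.index.max().floor(freq), freq=freq)
    # Add the grid points to the original index; new points start as NaN.
    combined = s.reindex(s.index.union(grid))
    # Interpolate using the actual time distance between observations.
    combined = combined.interpolate(method="time")
    # Keep only the equally spaced grid points.
    return combined.reindex(grid)


# Invented example data with the irregular timestamps from the question.
idx = pd.to_datetime(
    ["2020-02-04 09:05", "2020-02-04 09:15", "2020-02-04 09:30", "2020-02-04 09:55"]
)
s_train = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

s_even = resample_linear(s_train, "15min")
print(s_even)
```

The input index must be sorted and free of duplicate timestamps for this to work. The result starts at the first grid point on or after the first observation, so a few minutes at each end of the series may be dropped.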
