This repository has been archived by the owner on Nov 13, 2021. It is now read-only.

period problem with AnomalyDetectionTs #45

Open
blatoo opened this issue Jul 5, 2015 · 11 comments

Comments

@blatoo

blatoo commented Jul 5, 2015

Hi everybody,

After successfully running the example, I created my own data set, myData, which has the same structure as raw_data. There are two differences, though:

  • It contains missing values in the second column (raw_data has no missing values).
  • The timestamps cover just one day, at 15-second intervals (raw_data has five days of history at one-minute intervals).

It looks like:
1 1970-01-01 01:00:55 NA
2 1970-01-01 01:00:10 NA
3 1970-01-01 01:00:25 2.871
4 1970-01-01 01:00:40 2.654
5 1970-01-01 01:00:55 3.060
6 1970-01-01 01:00:10 9.074

After running the same command as in the example:

res = AnomalyDetectionTs(myData, max_anoms=0.02, direction='both', plot=TRUE)

I got the error message:

Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, : must supply period length for time series decomposition

How can I fix this problem?

If I don't know the period, can I still find the anomalies?

Thanks very much for the great work!

Best Regards

Conny

@owenvallis

Hi Conny,

I would suggest trying the AnomalyDetectionVec function instead of the TS function. At the moment, the TS function aggregates per-second data into per-minute data. The Vec function simply takes a vector of values and treats it as a time series, without a timestamp column. A few things might help when using the Vec function:

  • We suggest replacing all non-leading NAs with interpolated values (see na.approx in the zoo package).
  • Make a best estimate of the period. If there isn't a strong seasonal component, I would recommend simply removing the trend and applying generalized ESD to the residual.

Hope that helps.
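
The NA handling suggested above can be sketched in base R. This is only an illustration: it uses `approx()` as a stand-in for `zoo::na.approx`, and both the helper name `fill_interior_na` and the sample values are hypothetical.

```r
# Hypothetical sample with two leading NAs and one interior NA.
values <- c(NA, NA, 2.871, 2.654, NA, 9.074)

fill_interior_na <- function(x) {
  idx <- seq_along(x)
  ok <- !is.na(x)
  # approx() with rule = 1 interpolates only inside the observed range,
  # so leading and trailing NAs are left as NA.
  filled <- approx(idx[ok], x[ok], xout = idx, rule = 1)$y
  # Drop the leading NAs, which cannot be interpolated.
  first_ok <- which(ok)[1]
  filled[first_ok:length(filled)]
}

clean <- fill_interior_na(values)
print(clean)
```

The cleaned vector could then be passed to AnomalyDetectionVec together with an explicit `period`. Trailing NAs, if any, are not handled by this sketch.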

@blatoo
Author

blatoo commented Jul 6, 2015

Hi Owenvallis,

Thanks very much for the answer! But I still have another stupid question: what is ESD?

@terrytangyuan
Contributor

@blatoo ESD stands for Extreme Studentized Deviate. The generalized ESD test is the basis of Seasonal Hybrid ESD (S-H-ESD), which is the primary algorithm of this package.
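
For readers new to the term, here is a minimal sketch of the classic two-sided generalized ESD test on a plain numeric vector. It is not the package's implementation: it uses the mean and standard deviation, has no seasonal decomposition, and the function name `generalized_esd` and the test data are assumptions made for illustration.

```r
# Generalized ESD test: repeatedly remove the most extreme point and
# compare its studentized deviation R_i against a critical value lambda_i.
generalized_esd <- function(x, max_outliers, alpha = 0.05) {
  n <- length(x)
  idx <- seq_len(n)        # original positions of the points still in play
  removed <- integer(0)    # positions removed so far, in removal order
  n_found <- 0             # largest i for which R_i > lambda_i
  for (i in seq_len(max_outliers)) {
    s <- sd(x[idx])
    if (s == 0) break
    dev <- abs(x[idx] - mean(x[idx]))
    j <- which.max(dev)
    r_i <- dev[j] / s
    # Critical value for the i-th removal.
    p <- 1 - alpha / (2 * (n - i + 1))
    t_crit <- qt(p, df = n - i - 1)
    lambda_i <- (n - i) * t_crit /
      sqrt((n - i - 1 + t_crit^2) * (n - i + 1))
    removed <- c(removed, idx[j])
    idx <- idx[-j]
    if (r_i > lambda_i) n_found <- i
  }
  removed[seq_len(n_found)]
}

# Hypothetical data: a bounded signal with two injected spikes.
x <- sin(seq_len(100))
x[c(20, 70)] <- c(8, -9)
outliers <- generalized_esd(x, max_outliers = 5)
print(sort(outliers))
```

The function returns the positions of the flagged points. Applied to the residual of a detrended series, this corresponds roughly to the "remove the trend and apply ESD" suggestion made earlier in the thread.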

@blatoo
Author

blatoo commented Aug 20, 2015

Hi @terrytangyuan , Thanks very much!!!

@terrytangyuan
Contributor

Could anyone close this? Thanks.

@tintojames

Hey, I also ran into the same issue, even though my input time series has a regular interval. Please note that my data contains no NA values. It looks like this issue is still open.

@evanhenry

Hello, I am experiencing a similar error with 1 Hz data. Have there been any developments on this issue since February?

`> str(data)
'data.frame': 3600 obs. of 2 variables:
$ V1: POSIXct, format: "2016-10-29 07:00:00" "2016-10-29 07:00:01" "2016-10-29 07:00:02" ...
$ V2: num 28.7 28.7 28.7 28.7 28.7 ...
> head(data)
V1 V2
1 2016-10-29 07:00:00 28.69
2 2016-10-29 07:00:01 28.69
3 2016-10-29 07:00:02 28.70
4 2016-10-29 07:00:03 28.70
5 2016-10-29 07:00:04 28.70
6 2016-10-29 07:00:05 28.71

> data_anomaly = AnomalyDetectionTs(data, max_anoms=0.02, direction="pos", plot=TRUE, e_value = T)
Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, :
must supply period length for time series decomposition`
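
An earlier reply in this thread notes that the TS function aggregates per-second data into per-minute data. The sketch below reproduces that kind of aggregation in base R on a hypothetical one-hour, 1 Hz series, to show how few per-minute rows remain. Aggregating with `mean` is an assumption, and the constant values are made up.

```r
# Hypothetical 1 Hz series covering one hour, shaped like the str() output above.
start <- as.POSIXct("2016-10-29 07:00:00", tz = "UTC")
data <- data.frame(V1 = start + 0:3599, V2 = 28.7)

# Truncate each timestamp to its minute, then aggregate within each minute.
minute <- format(data$V1, "%Y-%m-%d %H:%M:00", tz = "UTC")
per_minute <- aggregate(data$V2, by = list(minute = minute), FUN = mean)
names(per_minute) <- c("timestamp", "value")

# Number of per-minute rows left after aggregation.
n_rows <- nrow(per_minute)
# Would that cover two daily periods of per-minute data (2 * 1440 rows)?
covers_two_days <- n_rows >= 2 * 1440
print(c(rows = n_rows, covers_two_days = covers_two_days))
```

One hour of data collapses to far fewer rows than a daily period of 1440, so a shorter, explicitly supplied period with AnomalyDetectionVec may be a better fit for a series like this.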

@jj7353

jj7353 commented Nov 1, 2016

This worked for me.

res = AnomalyDetectionVec(group_prof_10252016[,2], max_anoms=0.02,
period=1440, direction='both', only_last=FALSE, plot=TRUE)

On Sat, Oct 29, 2016 at 3:38 PM, Evan Henry wrote:

I also noticed that changing the time period in the sample data and code here results in the same error: https://github.com/pablo14/anomaly_detection_post



@aaishaosman

Hi all, I am quite new to this package and would like to use it for some analysis I am doing. My data is not regular (it is trading data). Would I be able to use AnomalyDetection to identify, say, irregular prices charged? If so, what would I set the "period" to, given that on some days there might be a trade every second or every hour, and on some days none? I have data for roughly a year.

Any help will be greatly appreciated!
Thanks!

@guiyang882

Hi @jj7353,
I would like to understand the period parameter. Why did you choose period = 1440, and how should this parameter be chosen correctly?

Thanks.

@Maryoda2

In the raw_data example, the data is sampled every minute.
So the period is 1440, because 24 hours * 60 minutes equals 1440 observations per day.
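
The same arithmetic can be applied to other sampling intervals. The helper below is a hypothetical illustration, not part of AnomalyDetection, and the daily cycle is only a default assumption. The right period depends on the seasonality actually present in the data.

```r
# Observations per seasonal period = period length / sampling interval.
# `obs_per_period` is a hypothetical helper for illustration only.
obs_per_period <- function(interval_secs, period_secs = 24 * 60 * 60) {
  stopifnot(interval_secs > 0, period_secs %% interval_secs == 0)
  period_secs / interval_secs
}

per_minute_daily <- obs_per_period(60)                     # raw_data: one point per minute
per_15s_daily    <- obs_per_period(15)                     # 15-second data, daily cycle
per_second_hour  <- obs_per_period(1, period_secs = 3600)  # 1 Hz data, hourly cycle
print(c(per_minute_daily, per_15s_daily, per_second_hour))
```

Whichever period is chosen, the series should be long enough to contain that cycle more than once. A single day of data cannot reveal a daily pattern.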


9 participants