
Asking for parameters setting tricks #5

Closed

miazoin opened this issue Aug 14, 2018 · 4 comments
Labels
enhancement New feature or request

Comments

@miazoin commented Aug 14, 2018

Hi Hundman,

I am trying to apply 'telemanom' to my own data, and after a few experiments I have some questions about it. Could you share some intuition about tuning the parameters?

  1. Does 'telemanom' fit better on seasonal streaming data (the series we are going to predict)? Also, should I delete the known anomalies from the training data, or denoise the data?

  2. Do I need to reset the anomaly labels when I use a different `l_s` (the number of previous timesteps provided to the model to predict future values)?
    I did this in my experiments; otherwise the results were not as expected.

  3. How can I find a set of parameters that works well across multiple different time series? Would you consider the "score" derived from the unsupervised anomaly detection part?

  4. Will you add the code for supervised anomaly detection using the labels (which you mentioned in the paper) to the 'telemanom' open source code?

Thanks a lot.

@miazoin changed the title from "asking for parameter setting tricks" to "Asking for parameter setting tricks" on Aug 14, 2018
@miazoin changed the title from "Asking for parameter setting tricks" to "Asking for parameters setting tricks" on Aug 14, 2018
@miazoin (Author) commented Aug 21, 2018

@khundman

@khundman (Owner) commented:

Hi @miazoin,

See comments below:

  1. This depends on the context of your problem. If you want to identify anomalies independent of seasonality effects, you may want to consider differencing your data to make it stationary (see https://people.duke.edu/~rnau/411diff.htm, for example; a minimal differencing sketch follows this list). In response to the second part: the method seems to be pretty robust to known anomalies (they generally aren't useful for the model to learn during training), but removing them is probably good practice.

  2. Yes, shortening or lengthening `l_s` effectively shortens/lengthens the training and test sets, so the labeled ranges should be updated whenever `l_s` changes (see the range-shifting sketch after this list). A pull request to automate this is welcome, or I may get around to it eventually.

  3. This is a challenging question. If you don't have labeled anomalies for your various sets, another option is to look at the normalized prediction accuracy (a normalized-error sketch follows this list). If you can curate datasets that you at least know are nominal or don't contain anomalies, then you can try to optimize the tradeoff between computation/complexity and prediction accuracy across your different time series to find a reasonable set of parameters. This method is highly dependent on the quality of predictions: if the models can identify normal behavior well, you will get better differentiation when abnormal behavior occurs. I would be interested to hear about your findings as you explore different sets of parameters.

  4. I'm not sure what you're referring to. The master branch demonstrates evaluation on a labeled set, but I don't think we mention a fully-supervised setting (it isn't applicable in our work currently).
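
A minimal sketch of the differencing idea from item 1, assuming numpy; the `difference` helper, the lag, and the data are illustrative, not part of telemanom:

```python
import numpy as np

def difference(series, lag=1):
    """Subtract the value `lag` steps back from each point.

    lag=1 removes trend; setting lag to the seasonal period
    (e.g. 24 for hourly data with a daily cycle) removes that cycle.
    """
    series = np.asarray(series, dtype=float)
    return series[lag:] - series[:-lag]

# Illustrative hourly channel with a daily cycle plus noise.
t = np.arange(1000)
hourly = np.sin(2 * np.pi * t / 24) + 0.1 * np.random.randn(1000)
stationary = difference(hourly, lag=24)  # seasonal difference at the daily period
```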
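
For item 2, a hypothetical helper for shifting labeled ranges when `l_s` changes, assuming the ranges are indexed relative to the start of the predicted region (which begins `l_s` steps into the test set); flip the sign if your labels use a different origin:

```python
def shift_anomaly_ranges(ranges, old_ls, new_ls):
    """Shift (start, end) anomaly ranges after changing l_s.

    Increasing l_s by d moves the origin of the predicted region
    forward by d timesteps, so labeled indices decrease by d
    (and increase by d when l_s shrinks).
    """
    d = new_ls - old_ls
    return [(max(start - d, 0), max(end - d, 0)) for start, end in ranges]

# Moving from l_s=250 to l_s=300 shifts each range back by 50 steps.
print(shift_anomaly_ranges([(1200, 1350), (4000, 4100)], 250, 300))
# [(1150, 1300), (3950, 4050)]
```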
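
And for item 3, one way to make prediction error comparable across channels of different scales; this range-normalized MAE is an illustrative metric, not something telemanom computes:

```python
import numpy as np

def normalized_mae(y_true, y_pred):
    """Mean absolute error divided by the true series' value range."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    value_range = np.ptp(y_true)  # max - min
    return float(np.mean(np.abs(y_true - y_pred)) / value_range) if value_range else 0.0

# Average this over known-nominal channels for each candidate parameter
# set, then prefer the cheapest set whose error is close to the best.
```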

Thanks.

@khundman khundman added the enhancement New feature or request label Aug 28, 2018
@miazoin (Author) commented Aug 29, 2018

Hi @khundman,
Thank you for your reply. For the 4th question, this is the reference:
[image: excerpt from the paper]

@khundman (Owner) commented:

I see what you mean. Implementing this would make more sense in a fully-deployable Telemanom system that collects anomalies over time and stores them in a database. We are working on open sourcing a version like this, but it is pretty straightforward to implement on your own too. If your rate of anomalies for a channel is higher than some minimum rate determined by your context (e.g. more than one anomaly per week), then you can set a minimum anomaly score that new anomalies for that channel must exceed to be flagged. One simple way to set this minimum is to use prior scores for a channel and set some percentile threshold (e.g. new anomalies must have a score in the top 25% of all prior scores for a channel). A sketch of this rule follows.
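
A minimal sketch of that percentile rule, assuming numpy; the scores and the 75th-percentile cutoff are illustrative:

```python
import numpy as np

def min_score_threshold(prior_scores, percentile=75.0):
    """Score a new anomaly must exceed to be flagged: here, the 75th
    percentile of prior scores (i.e. it must land in the top 25%)."""
    return float(np.percentile(prior_scores, percentile))

prior_scores = [0.4, 0.7, 0.9, 1.1, 1.6, 2.3]  # illustrative prior scores for one channel
threshold = min_score_threshold(prior_scores)

new_score = 1.8
if new_score > threshold:
    print("flag anomaly")  # only fires for scores in the top 25%
```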
