Which smoother is the best to detect and remove outliers? #8

Closed
muhlbach opened this issue Apr 29, 2021 · 2 comments

Comments

@muhlbach

muhlbach commented Apr 29, 2021

Hi Marco! Thank you for an awesome package!

I have a quick question for you. Since you're obviously well-versed in time-series smoothing, which smoother would you recommend as a default option?

In particular, I have a training series y_train (potentially very short, fewer than 50 observations), and I use a univariate forecasting model to forecast H periods ahead, which gives an H-dimensional vector y_hat. Because the training series is not always very long, some flexible methods give me crazy results for y_hat, which I want to reset to some sensible value.

I could do, for instance,

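import pandas as pd
from tsmoothie.smoother import ConvolutionSmoother

# Instantiate smoother (window_len is a number of observations, so cast it to an int >= 1)
window_len = max(1, int(0.1 * len(y_train)))
smoother = ConvolutionSmoother(window_len=window_len, window_type='ones')
smoother.smooth(pd.concat([y_train, y_hat], axis=0))

# Get threshold
threshold_lower, threshold_upper = smoother.get_intervals('sigma_interval', n_sigma=2)

# Subset to match length (keep only the last len(y_hat) points of the first series)
threshold_lower = threshold_lower[0, -len(y_hat):]
threshold_upper = threshold_upper[0, -len(y_hat):]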

and then use these thresholds. Do you have any recommendations in this setup?
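
For reference, here is a minimal plain-Python sketch of what "use these thresholds" could look like: smooth the concatenated series, build a band around it, and clip each forecast into the band. This is not tsmoothie's implementation. The helper names (moving_average, sigma_band, clip_forecast), the band rule (smoothed value plus or minus n_sigma times the standard deviation of the residuals), the truncated edge windows, and the data are all assumptions made for illustration.

```python
from statistics import pstdev


def moving_average(values, window_len):
    """Centered moving average (odd window_len); windows are truncated at the edges."""
    half = window_len // 2
    smooth = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smooth.append(sum(window) / len(window))
    return smooth


def sigma_band(values, window_len, n_sigma=2.0):
    """Assumed band rule: smoothed series +/- n_sigma * std of the residuals."""
    smooth = moving_average(values, window_len)
    sigma = pstdev([v - s for v, s in zip(values, smooth)])
    lower = [s - n_sigma * sigma for s in smooth]
    upper = [s + n_sigma * sigma for s in smooth]
    return lower, upper


def clip_forecast(y_hat, lower, upper):
    """Reset each forecast to the nearest band edge if it falls outside the band."""
    return [min(max(y, lo), hi) for y, lo, hi in zip(y_hat, lower, upper)]


# Made-up data: a short training series and a forecast with one wild value.
y_train = [10, 11, 10, 12, 11, 10, 11, 12, 11, 10]
y_hat = [11, 50, 10]

# Band over the concatenated series, then keep only the forecast part.
lower, upper = sigma_band(y_train + y_hat, window_len=3)
lower_hat, upper_hat = lower[-len(y_hat):], upper[-len(y_hat):]

clipped = clip_forecast(y_hat, lower_hat, upper_hat)
print(clipped)
```

One thing this sketch makes visible: because the band is computed on y_train and y_hat together, a wild forecast also drags the smoothed line toward itself and widens the band, so it is only pulled part of the way back. Computing the band on y_train alone may be worth trying as an alternative.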

@cerlymarco
Owner

Hi, thanks for your support.
I don't think there is a single correct answer here. The choice depends on your needs and on the domain of analysis. If it's not computationally expensive, you can try different smoothers on a part of your data and see what happens.

My suggestion is always to start with ConvolutionSmoother (as you did) for its simplicity and adaptability.
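
As a rough illustration of "try different smoothers on a part of your data", the sketch below runs two simple smoothers over the same short series and prints which points each one would flag. The smoothers here are plain-Python stand-ins (a rolling median and simple exponential smoothing), not tsmoothie classes, and the flagging rule (residual larger than n_sigma times the residual standard deviation), the parameter values, and the data are assumptions for the example.

```python
from statistics import median, pstdev


def rolling_median(values, window_len=3):
    """Centered rolling median; windows are truncated at the edges."""
    half = window_len // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def exponential_smoothing(values, alpha=0.3):
    """Simple exponential smoothing, initialised at the first observation."""
    smooth = [values[0]]
    for v in values[1:]:
        smooth.append(alpha * v + (1 - alpha) * smooth[-1])
    return smooth


def flag_outliers(values, smooth, n_sigma=2.0):
    """Indices whose residual exceeds n_sigma * std of all residuals (assumed rule)."""
    residuals = [v - s for v, s in zip(values, smooth)]
    sigma = pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) > n_sigma * sigma]


# Made-up series with one spike at index 5.
series = [10, 11, 10, 12, 11, 30, 11, 10, 12, 11, 10, 11]

candidates = {
    "rolling_median": rolling_median,
    "exponential": exponential_smoothing,
}

# Apply every candidate to the same data and collect the flagged indices.
flags = {
    name: flag_outliers(series, smoother(series))
    for name, smoother in candidates.items()
}
for name, indices in flags.items():
    print(name, indices)
```

The same loop could be pointed at real smoother objects and a slice of real data; the point is only to compare candidates side by side on the same points before settling on one.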

@tanhhkhcndn

Hi Marco! Thank you for an awesome package!
I read your article "Real-Time Time Series Anomaly Detection". Could you please advise on a recommended window_len for real-time detection on a univariate integer series of about 1000 data points? Thanks.
