For Exponential Smoothing with seasonality, the initial Level (if not provided by the user) is set as follows:
statsmodels/statsmodels/tsa/holtwinters.py, line 769 in da9d7e9:
l0 = y[np.arange(self.nobs) % m == 0].mean() if l0 is None else l0
In other words, it takes the mean of every m-th observation (indices 0, m, 2m, ...), that is, all the points corresponding to the first season of each period.
This is an unusual way to initialize, and it produces strange results for time series with a strongly non-zero slope: the initial Level ends up close to the mean of the whole series, which may be nowhere near the level at the start of the series, and the initial seasonality then has to be huge to compensate. The result is a forecast that looks fine in total but whose components are weird. Ideally, the mean of the seasonal components should be zero (for additive seasonality) or one (for multiplicative), but this method results in values far from that.
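To make the effect concrete, here is a minimal sketch on a synthetic series. The series, its trend of 2 per step, and the season length m = 12 are all made up for illustration; only the indexing expression is taken from the line quoted above.

```python
import numpy as np

m = 12                       # season length (assumed: monthly data)
nobs = 10 * m                # ten full periods (assumed)
t = np.arange(nobs)

# Synthetic data: strong linear trend plus a zero-mean seasonal pattern.
seasonal = 10 * np.sin(2 * np.pi * t / m)
y = 100 + 2.0 * t + seasonal

# Current initialization: mean of every m-th observation (0, m, 2m, ...).
l0_current = y[np.arange(nobs) % m == 0].mean()

# Level actually present at the start of the series: mean of the first period.
first_period_mean = y[:m].mean()

print("current l0:        ", l0_current)
print("first-period mean: ", first_period_mean)
print("gap:               ", l0_current - first_period_mean)
```

With these made-up numbers the current rule gives a Level of roughly 208, while the first period averages roughly 111, so the initial seasonal terms would have to absorb a gap of about 97 that has nothing to do with seasonality.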
Hyndman (whose text is mentioned in the documentation) provides some suggestions. His simplest method is to take the mean of the first period (not the first season of each period, as is currently implemented above).
NIST's method is even simpler: just use y[0], identical to the non-seasonal version. (They don't explicitly mention it, but the example on the next page shows that result.)
I think we should use Hyndman's 1998 method, and change line 769 to be:
l0 = y[:m].mean() if l0 is None else l0
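For comparison, the three candidate initializations discussed here can be put side by side. This is only a sketch: `initial_level` and its `method` names are hypothetical helpers for this issue, not part of statsmodels.

```python
import numpy as np

def initial_level(y, m, method="first_period"):
    """Hypothetical helper: compute a candidate initial Level for season length m."""
    y = np.asarray(y, dtype=float)
    if method == "current":
        # Existing behaviour: mean of observations 0, m, 2m, ...
        return y[np.arange(len(y)) % m == 0].mean()
    if method == "first_period":
        # Proposed change: mean of the first m observations.
        return y[:m].mean()
    if method == "first_obs":
        # Simplest option: the first observation, as in the non-seasonal case.
        return y[0]
    raise ValueError(f"unknown method: {method!r}")

# A pure upward trend with m = 4, to show how far apart the three values can be.
y = np.arange(24.0)
for method in ("current", "first_period", "first_obs"):
    print(method, initial_level(y, 4, method))
```

On this toy trend the existing rule returns 10.0, the first-period mean returns 1.5, and the first observation is 0.0; the latter two stay near the start of the series, which is the behaviour argued for above.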
I can put up a PR if you wish.
Thanks!