Confusion with se_mean and standard deviation #8699
Comments
For a simple state space model (of which SARIMAX is a special case), we have: Here we will assume that the matrices By default,
(Aside: @josef-pkt pointed out that this actually doesn't match the intended/typical Statsmodels usage of the …)
I'm not sure if that answers your question or not, but please feel free to follow up.
"mean" here sounds fine, it's a conditional expectation of y
In OLS I used se_mean would be the uncertainty of the conditional expectations (coming from parameter uncertainty) aside:
Thanks @josef-pkt!
I would like to check that I have understood the concepts of se_mean and standard deviation in statsmodels correctly. Could you help me with this?
The documentation for statsmodels.tsa.base.prediction.PredictionResults.se_mean describes se_mean as the standard deviation of the predicted mean. At the same time, the Release 0.8.0 notes contain a passage saying that get_forecast provides standard errors. As far as I know, the standard deviation and the standard error of the mean are not the same thing.
As a rookie in statistics, I found in the wiki that the standard deviation (std) describes the variation in measurements, while the standard error of the mean is a probabilistic statement about how the sample size provides a better bound on estimates of the population mean, in light of the central limit theorem. However, a standard error can also be described as an estimate of that standard deviation. Does this mean that statespace.sarimax estimates possible future values of the standard deviation, and that the std the model outputs depends on the number of points in the time series (more time points, better prediction of the std)?
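To make the textbook distinction in the paragraph above concrete, here is a minimal sketch using only the Python standard library. The simulated data, the seed, and the helper name `std_and_sem` are illustrative assumptions and are not part of statsmodels. It shows that the sample standard deviation stays roughly constant as the sample grows, while the standard error of the mean shrinks like 1/sqrt(n).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible


def std_and_sem(sample):
    """Return (sample standard deviation, standard error of the mean)."""
    s = statistics.stdev(sample)          # spread of individual measurements
    return s, s / math.sqrt(len(sample))  # uncertainty of the sample mean


# Two simulated samples from the same distribution (mean 10, std 2);
# only the sample size differs.
small = [random.gauss(10.0, 2.0) for _ in range(25)]
large = [random.gauss(10.0, 2.0) for _ in range(2500)]

std_small, sem_small = std_and_sem(small)
std_large, sem_large = std_and_sem(large)

print(f"n=25:   std={std_small:.3f}  sem={sem_small:.3f}")
print(f"n=2500: std={std_large:.3f}  sem={sem_large:.3f}")
```

This is the classical i.i.d. setting only. Whether a forecast from a time series model follows the same sqrt(n) relationship is a separate matter and depends on how the library defines its reported quantity.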
I built a SARIMAX model and want to construct a confidence interval for the forecast that reflects the variation in measurements. Can I use mean_se for this, or do I need to convert these values to a std by multiplying the SE by sqrt(n)? And is n the number of data points in the time series before forecasting?
Thank you in advance for your reply!