SUMM/ENH: inference for variance, variance estimation #8261
Comments
after reading around, it looks like we want to have at least 3 to 5 kurtosis estimators for use in the variance confint. standard kurtosis estimate is unbiased AFAICS (not reading everything)
Minitab uses the SJ test to classify samples as low, medium, or high kurtosis, but only uses it to add warnings and minimum sample size recommendations to the Bonett test.
default method for confint_variance?
Bonett is a good default except for highly skewed and/or heavy-tailed distributions. In those cases some adjustment or transformation would be more accurate.
dataplot: https://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/sdconfli.htm
two sample comparison:
Suwan, Sirima, and Sa-aat Niwitpong. 2013. “Interval Estimation for a Linear Function of Variances of Nonnormal Distributions That Utilize the Kurtosis.” Applied Mathematical Sciences 7: 4909–18. https://doi.org/10.12988/ams.2013.37366.
found an article by Yuan and Bentler that makes the connection to the multivariate case, #4144
Yuan, Ke-Hai, Peter M. Bentler, and Wei Zhang. 2005. “The Effect of Skewness and Kurtosis on Mean and Covariance Structure Analysis: The Univariate Case and Its Multivariate Implication.” Sociological Methods & Research 34 (2): 240–58.
looks good based on very brief skimming
my current selected reading list

one sample variance or standard deviation
Bonett, Douglas G. 2006. “Approximate Confidence Interval for Standard Deviation of Nonnormal Distributions.” Computational Statistics & Data Analysis 50 (3): 775–82. https://doi.org/10.1016/j.csda.2004.10.003.
Minitab documentation, white paper on 1-Sample Standard Deviation Test
Searls, Donald T., and Pichai Intarapanich. 1990. “A Note on an Estimator for the Variance That Utilizes the Kurtosis.” The American Statistician 44 (4): 295–96. https://doi.org/10.1080/00031305.1990.10475745.
Akyüz, Hayriye Esra, and Hamza Gamgam. 2017. “Interval Estimation for Nonnormal Population Variance with Kurtosis Coefficient Based on Trimmed Mean.” Turkiye Klinikleri Journal of Biostatistics 9 (3): 213–21. https://doi.org/10.5336/biostatic.2017-57348.
Banik, Shipra, Ahmed N. Albatineh, Moustafa Omar Ahmed Abu-Shawiesh, and B. M. Golam Kibria. 2014. “Estimating the Population Standard Deviation with Confidence Interval: A Simulation Study under Skewed and Symmetric Conditions.” International Journal of Statistics in Medical Research 3 (4): 356–67. https://doi.org/10.6000/1929-6029.2014.03.04.4.
Burch, Brent D. 2014. “Estimating Kurtosis and Confidence Intervals for the Variance under Nonnormality.” Journal of Statistical Computation and Simulation 84 (12): 2710–20. https://doi.org/10.1080/00949655.2013.840628.
Burch, Brent D. 2017. “Distribution-Dependent and Distribution-Free Confidence Intervals for the Variance.” Statistical Methods & Applications 26 (4): 629–48. https://doi.org/10.1007/s10260-017-0385-z.
Niwitpong, Sa-aat, and Pianpool Kirdwichai. 2008. “Adjusted Bonett Confidence Interval for Standard Deviation of Non-Normal Distributions.” Thailand Statistician 6 (1): 1–16.
Wencheko, Eshetu, and Honest W. Chipoyera. 2009. “Estimation of the Variance When Kurtosis Is Known.” Statistical Papers 50 (3): 455–64. https://doi.org/10.1007/s00362-007-0084-1.
Yuan, Ke-Hai, Peter M. Bentler, and Wei Zhang. 2005. “The Effect of Skewness and Kurtosis on Mean and Covariance Structure Analysis: The Univariate Case and Its Multivariate Implication.” Sociological Methods & Research 34 (2): 240–58.

two samples
I didn't check the references systematically yet
Bonett, Douglas G. 2006. “Robust Confidence Interval for a Ratio of Standard Deviations.” Applied Psychological Measurement 30 (5): 432–39. https://doi.org/10.1177/0146621605279551.
Jan, Show-Li, and Gwowen Shieh. 2022. “A Comparative Study of TOST and UMPT Procedures for Evaluating Dispersion Equivalence.” Statistics in Biopharmaceutical Research 14 (2): 162–67. https://doi.org/10.1080/19466315.2020.1821762.
Shoemaker, Lewis H. 2003. “Fixing the F Test for Equal Variances.” The American Statistician 57 (2): 105–14. https://doi.org/10.1198/0003130031441.
Suwan, Sirima, and Sa-aat Niwitpong. 2013. “Interval Estimation for a Linear Function of Variances of Nonnormal Distributions That Utilize the Kurtosis.” Applied Mathematical Sciences 7: 4909–18. https://doi.org/10.12988/ams.2013.37366.

estimating kurtosis and skew
Joanes, D. N., and C. A. Gill. 1998. “Comparing Measures of Sample Skewness and Kurtosis.” Journal of the Royal Statistical Society: Series D (The Statistician) 47 (1): 183–89. https://doi.org/10.1111/1467-9884.00122.
An, Lihua, and S. Ejaz Ahmed. 2008. “Improving the Performance of Kurtosis Estimator.” Computational Statistics & Data Analysis 52 (5): 2669–81. https://doi.org/10.1016/j.csda.2007.09.024.
Guo, Yawen, and B. M. Golam Kibria. 2017. “Testing the Population Kurtosis Parameter: An Empirical Study with Applications.” International Journal of Computational and Theoretical Statistics 04 (01): 45–63. https://doi.org/10.12785/IJCTS/040104.

others
(I didn't read yet in this direction)
Bonett also has articles on regression residuals, and other spread/dispersion measures like MAD
Bonett, Douglas G. 2005a. “Confidence Interval for Residual Mean Absolute Deviation in Regression Models.” Journal of Statistical Computation and Simulation 75 (8): 673–78. https://doi.org/10.1080/00949650412331299148.
Bonett, Douglas G. 2005b. “Robust Confidence Interval for a Residual Standard Deviation.” Journal of Applied Statistics 32 (10): 1089–94. https://doi.org/10.1080/02664760500165339.
Bonett, Douglas G., and Edith Seier. 2003. “Confidence Intervals for Mean Absolute Deviations.” The American Statistician 57 (4): 233–36. https://doi.org/10.1198/0003130032323.
I just realized that the references use different "contrasts" for the variance confidence interval in the univariate case. All are based on the distribution of the sum of squares S = (nobs - ddof) * var
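As a baseline for the sum-of-squares form, here is a minimal sketch of the textbook normal-theory interval, where S / sigma^2 follows a chi-square distribution with nobs - ddof degrees of freedom. The function name `confint_variance_chi2` is hypothetical and not an existing statsmodels function; the interval is only valid under normality, which is exactly the assumption the kurtosis-based references relax.

```python
import numpy as np
from scipy import stats


def confint_variance_chi2(x, alpha=0.05, ddof=1):
    """Normal-theory confidence interval for the variance (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    nobs = x.shape[0]
    df = nobs - ddof
    # sum of squares S = (nobs - ddof) * var
    ssq = df * x.var(ddof=ddof)
    # under normality S / sigma^2 ~ chi2(df), so invert the chi2 quantiles
    low = ssq / stats.chi2.isf(alpha / 2, df)
    upp = ssq / stats.chi2.ppf(alpha / 2, df)
    return low, upp


x = np.arange(10.0)
low, upp = confint_variance_chi2(x)
print(x.var(ddof=1), low, upp)
```

The interval is not symmetric around the sample variance, because the chi-square distribution is skewed.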
It looks like Bonett is mainly available in Minitab. Minitab has one-sample and two-sample ratio versions.
R package
table 2 footer contains formula for variance of variance
check variance of log-normal distribution
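A quick way to do this check is to compare the closed-form log-normal moments with what `scipy.stats.lognorm` reports. This is only a sketch: the helper name `lognormal_moments` is made up for illustration, and the last value uses the standard result Var(s^2) / sigma^4 = 2 / (n - 1) + excess kurtosis / n to show how quickly the kurtosis term dominates.

```python
import numpy as np
from scipy import stats


def lognormal_moments(mu=0.0, sigma=1.0):
    """Closed-form mean, variance and excess kurtosis of a log-normal (hypothetical helper)."""
    w = np.exp(sigma**2)
    mean = np.exp(mu + sigma**2 / 2)
    var = (w - 1) * np.exp(2 * mu + sigma**2)
    kurt_excess = w**4 + 2 * w**3 + 3 * w**2 - 6
    return mean, var, kurt_excess


mu, sigma, nobs = 0.0, 0.5, 50
mean, var, kurt_excess = lognormal_moments(mu, sigma)

# scipy parameterization: s = sigma, scale = exp(mu); kurtosis is reported as excess
m_sp, v_sp, _, k_sp = stats.lognorm.stats(s=sigma, scale=np.exp(mu), moments="mvsk")

# relative variance of the sample variance, normal part plus kurtosis part
rel_var_s2 = 2 / (nobs - 1) + kurt_excess / nobs
print(var, float(v_sp), kurt_excess, float(k_sp), rel_var_s2)
```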
I don't know where to put hypothesis tests and confidence intervals for variance and standard deviation.
It should have the usual test_, confint_, tost_, power functions for one and two sample cases as in rates and proportions.
update: preliminary decision on module name: variance_moments
I didn't really like any of the shorter names.
The oneway case is handled in oneway with standard tests. This is mainly equality of scale and dispersion measures, not necessarily of variance.
I want something specific to variance (and not other dispersion measures) that can be used, for example, in functions like the zscore standardized mean or the coefficient of variation. One usage: MOVER confint for those.
related issue: where do we put skew and kurtosis functions? (e.g. extensions to skew and kurtosis tests, we need one-sided alternatives)
Can we have a module for variance and higher moments?
related:
#2765 organizing cov and corr, which is now in stats.multivariate
(outlier) robust scale is in robust and some open PRs (note, those can only be converted to a variance measure under specific distributional assumptions)
related robust definitions of skew and kurtosis: #6790 and in stattools
weightstats has descriptive statistics in the class, but only inference on one and two sample means.
inferential statistics for a single correlation coefficient
I'm not sure where that goes
specific to variance
I would like the kurtosis corrected version of Bonett, and similar variation on it.
supporting code will need kurtosis estimators
(aside: standard kurtosis estimate is downward (!) biased for heavy tailed distributions)
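The downward bias is easy to see in a small Monte Carlo. The sketch below uses a t distribution with 5 degrees of freedom (Pearson kurtosis 3 + 6 / (df - 4) = 9) and `scipy.stats.kurtosis`; the sample size, number of replications and seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)
df, nobs, n_rep = 5, 50, 2000

# population Pearson kurtosis of t(df), valid for df > 4
kurt_true = 3 + 6 / (df - 4)

x = rng.standard_t(df, size=(n_rep, nobs))
# plain moment estimator and the small-sample corrected version
kurt_moment = stats.kurtosis(x, axis=1, fisher=False, bias=True)
kurt_corrected = stats.kurtosis(x, axis=1, fisher=False, bias=False)

print(kurt_true, kurt_moment.mean(), kurt_corrected.mean())
```

The `bias=False` correction is derived for normal data, so it does not remove the bias in the heavy-tailed case.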
(I didn't see any reference that uses score confidence intervals or score test for variance. Score test for one variance should be easy. )
Note, Bonett is still liberal for heavy-tailed skewed distributions.
Stata also has a Bonett confint. Minitab has a good working paper for Bonett, and a kurtosis test.
other ideas:
Can we use a transformation to normality (#3224) of the underlying data to get better variance inference, e.g. for data that are closer to log-normal? (I have not looked for references for this.)
Can we use a "working 4th moment or kurtosis function" with a 4th moment robust variance estimate to improve inference, e.g. based on GLM Gamma (corresponds to a quadratic 4th moment if endog is variance/squared residuals)?
Right now, I mainly want the Bonett test, which looks easy to implement but needs a location.
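To illustrate the general idea, here is a minimal sketch of a kurtosis-adjusted interval on the log-variance scale. It is not Bonett's exact interval: it uses the plain delta-method standard error from Var(s^2) / sigma^4 = (kurt - (n - 3) / (n - 1)) / n and leaves out Bonett's small-sample adjustments (such as the trimmed mean in the kurtosis estimate), which would need to be taken from the paper. The function name `confint_variance_kurt` and its `kurt` option are hypothetical.

```python
import numpy as np
from scipy import stats


def confint_variance_kurt(x, alpha=0.05, kurt=None):
    """Log-scale delta-method interval for the variance (hypothetical sketch).

    kurt is the Pearson kurtosis (normal = 3). If None, the plain moment
    estimate is used, which is downward biased for heavy tails.
    """
    x = np.asarray(x, dtype=float)
    nobs = x.shape[0]
    var = x.var(ddof=1)
    if kurt is None:
        kurt = stats.kurtosis(x, fisher=False, bias=True)
    # approximate standard error of log(var): sqrt(Var(s^2)) / sigma^2
    se_log = np.sqrt((kurt - (nobs - 3) / (nobs - 1)) / nobs)
    z = stats.norm.isf(alpha / 2)
    return var * np.exp(-z * se_log), var * np.exp(z * se_log)


rng = np.random.default_rng(0)
x = rng.standard_t(5, size=200)
print(confint_variance_kurt(x))            # estimated kurtosis
print(confint_variance_kurt(x, kurt=3.0))  # normal kurtosis imposed
print(confint_variance_kurt(x, kurt=9.0))  # true kurtosis of t(5) imposed
```

Passing `kurt` explicitly corresponds to the "given kurtosis" case. Imposing the normal value 3 gives an interval that is too narrow when the data are heavy tailed.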
related multivariate version, covariance, correlation matrix PR #6696
e.g.
https://github.com/statsmodels/statsmodels/pull/6696/files#diff-fc0daeb7f971edb0e8a8866ed966f86ced2c81767eb394f3a81a6a4761184595R180
has an option for given kurt(osis) and a general option; the latter uses a 4th moment estimate for cov_cov
I doubt those functions work for the univariate case, i.e. cov is a single variance.