There are several ways to test whether a variable is normally distributed.
- Graphical methods
  - See, for example, Schucany & Tony Ng (2006), who argue for their use.
  - Decisions based on them are subjective.
  - They cannot be automated, which is a drawback in software that makes decisions automatically.
- Numerical methods with descriptives
  - E.g., skewness and kurtosis
  - They do not take sampling error into account.
- Hypothesis tests
  - See a few of them here. There are more than 40 hypothesis tests of normality.
  - Several studies have investigated the power of normality tests, e.g., Razali & Wah (2011), Noughabi & Arghami (2011), Farrell & Rogers-Stewart (2006), Romão, Delgado & Costa (2010), and Yap & Sim (2011).
    - The Shapiro-Wilk test is the most recommended test.
    - The Jarque-Bera, Anderson-Darling, and D'Agostino tests also perform well.
    - Use of the Kolmogorov-Smirnov test is mostly discouraged.
    - The power of a specific test depends on the distribution; therefore, in some cases it is impossible to tell which test is the best choice.
    - (Thanks to Ákos Laczkó for the summary.)
  - They lack power for small samples. See the simulation studies cited above for how specific tests perform for specific sample sizes and distributions.
  - As an assumption test, a normality test could influence the overall alpha level of the main test.
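
The contrast between descriptives and hypothesis tests can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not CogStat's implementation. The helper names `skewness_and_excess_kurtosis` and `jarque_bera` and the toy data are assumptions made for this example. The p-value relies on the asymptotic chi-square approximation (2 degrees of freedom), which is rough for small samples.

```python
import math
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Return sample skewness and excess kurtosis from central moments."""
    n = len(values)
    m = fmean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurt


def jarque_bera(values):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(values)
    skew, excess_kurt = skewness_and_excess_kurtosis(values)
    statistic = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Under normality the statistic is asymptotically chi-square with 2 df,
    # whose survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical toy data: the same right-skewed shape at two sample sizes.
small_sample = [1.0, 1.0, 1.0, 2.0, 10.0]
large_sample = small_sample * 20

for label, data in (("n=5", small_sample), ("n=100", large_sample)):
    skew, kurt = skewness_and_excess_kurtosis(data)
    stat, p = jarque_bera(data)
    print(f"{label}: skewness={skew:.3f}, excess kurtosis={kurt:.3f}, "
          f"JB={stat:.2f}, p={p:.4f}")
```

Repeating the same values 20 times leaves skewness and kurtosis unchanged, but the test statistic grows with the sample size, so only the larger sample leads to rejecting normality at the 0.05 level. The descriptives alone cannot express that difference.
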
Chosen method for CogStat
- Graphical methods are subjective, which conflicts with CogStat's aim of using consensual methods. Still, CogStat displays the data and the results of the analyses. Therefore, histograms with an overlaid normal distribution curve and Q-Q plots are displayed, but decisions (e.g., for assumption checks) are not based on them.
- It is hard to tell whether numerical methods or hypothesis tests would cause more problems. Because hypothesis tests seem to be more widely accepted, CogStat uses hypothesis tests.
- Simulations demonstrate that the Shapiro-Wilk test has the highest power in most cases.
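
The kind of simulation behind such power comparisons can be sketched as follows. This is a rough Monte Carlo illustration under stated assumptions, not a replication of the cited studies and not CogStat's code. It assumes NumPy and SciPy are installed. The exponential alternative, the sample sizes, the number of repetitions, and the helper names (`estimate_power`, `shapiro_p`, `ks_p`, `exponential`) are all choices made for this example.

```python
import numpy as np
from scipy import stats


def estimate_power(p_value_of, sampler, n, reps=500, alpha=0.05, seed=0):
    """Share of simulated non-normal samples for which normality is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        sample = sampler(rng, n)
        if p_value_of(sample) < alpha:
            rejections += 1
    return rejections / reps


def shapiro_p(sample):
    """p-value of the Shapiro-Wilk test."""
    return stats.shapiro(sample).pvalue


def ks_p(sample):
    """p-value of the Kolmogorov-Smirnov test against a fitted normal."""
    # Estimating the mean and SD from the same sample makes the plain
    # KS test conservative; the Lilliefors variant corrects for this.
    fitted = (sample.mean(), sample.std(ddof=1))
    return stats.kstest(sample, "norm", args=fitted).pvalue


def exponential(rng, n):
    """Draw n values from a clearly non-normal (exponential) distribution."""
    return rng.exponential(scale=1.0, size=n)


for n in (10, 30, 100):
    sw = estimate_power(shapiro_p, exponential, n)
    ks = estimate_power(ks_p, exponential, n)
    print(f"n={n:3d}  Shapiro-Wilk={sw:.2f}  Kolmogorov-Smirnov={ks:.2f}")
```

Because both tests are run on the same seeded samples, the printed rejection rates are directly comparable. The exact values depend on the alternative distribution, the sample size, and the seed, which is why the cited studies compare many distributions rather than one.
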
Farrell, P. J., & Rogers-Stewart, K. (2006). Comprehensive study of tests for normality and symmetry: extending the Spiegelhalter test. Journal of Statistical Computation and Simulation, 76(9), 803–816. https://doi.org/10.1080/10629360500109023
Noughabi, H. A., & Arghami, N. R. (2011). Monte Carlo comparison of seven normality tests. Journal of Statistical Computation and Simulation, 81(8), 965–972. https://doi.org/10.1080/00949650903580047
Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2, 21–33.
Romão, X., Delgado, R., & Costa, A. (2010). An empirical power comparison of univariate goodness-of-fit tests for normality. Journal of Statistical Computation and Simulation, 80(5), 545–591. https://doi.org/10.1080/00949650902740824
Schucany, W. R., & Tony Ng, H. K. (2006). Preliminary Goodness-of-Fit Tests for Normality do not Validate the One-Sample Student t. Communications in Statistics - Theory and Methods, 35(12), 2275–2286. https://doi.org/10.1080/03610920600853308
Yap, B. W., & Sim, C. H. (2011). Comparisons of various types of normality tests. Journal of Statistical Computation and Simulation, 81(12), 2141–2155. https://doi.org/10.1080/00949655.2010.520163