What's the status and roadmap for multivariate statistics?
main methods
either available or started in statsmodels (or related packages)
Models
_MultivariateOLS
SUR
...
GMM (can we get more explicit or direct multivariate, multi-equation support?)
panel data, GEE, system of equations, VAR, VECM, multivariate statespace models,
two-part or multi-part models (e.g. heckman type, bivariate Probit)
many related models, not in narrow definition of multivariate statistics
large, sparse versions of models
Principal Components Analysis (PCA)
Canonical Correlation Analysis (CCA)
Factor Analysis
other methods
some are available in scikit-learn; those are not high priority unless specific results statistics are important.
Partial Least Squares (PLS)
Classical Multidimensional Scaling (MDS)
Linear Discriminant Analysis (LDA)
Multiclass LDA
Independent Component Analysis (ICA), FastICA
Probabilistic PCA
Kernel PCA
Correspondence Analysis
Cluster Analysis (scipy)
analogues for ordered variables: ?, e.g. item response models
Redundancy Analysis
standalone and supporting hypothesis tests
hypothesis tests and related statistics including confidence intervals
tests on mean, Hotelling, multivariate control charts
MANOVA
tests of covariances and correlation, covariance structure analysis
goodness-of-fit (gof) tests for distributions
rank tests
tests for non-continuous data, e.g. multinomial, multivariate GLM
non-Pearson correlation and association measures (polychoric, robust, regularized, ...)
(kendalltau and spearman are in scipy)
supporting code
factor rotation, procrustes
matrix tools
multivariate transformation, e.g. compositional
correlation_tools, closest positive definite
missing values, imputation (what and where does it belong?)
R taskview also lists multivariate distributions including copulas
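To make the "tests on mean, Hotelling" item concrete, here is a minimal sketch of a one-sample Hotelling T² statistic using only numpy. The function name `hotelling_t2_one_sample` and its return layout are hypothetical, chosen for illustration; this is not an existing statsmodels API. It assumes multivariate normal data and a non-singular sample covariance.

```python
import numpy as np


def hotelling_t2_one_sample(x, mean_null):
    """One-sample Hotelling T^2 statistic (hypothetical helper, not a statsmodels API).

    x : (n, p) array of observations
    mean_null : length-p mean vector under the null hypothesis
    Returns (t2, f_stat, (df_num, df_denom)).
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    diff = x.mean(axis=0) - np.asarray(mean_null, dtype=float)
    # Unbiased sample covariance; atleast_2d keeps the p == 1 case a matrix.
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    # T^2 = n * diff' S^{-1} diff, computed with a solve instead of an inverse.
    t2 = n * diff @ np.linalg.solve(cov, diff)
    # Under the null with normal data, f_stat follows an F(p, n - p) distribution.
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, (p, n - p)


# Usage: simulated data, so the values depend on the seed.
rng = np.random.default_rng(0)
sample = rng.normal(size=(50, 3))
t2, f_stat, df = hotelling_t2_one_sample(sample, np.zeros(3))
print(t2, f_stat, df)
```

A p-value could be obtained from the F distribution, e.g. with `scipy.stats.f.sf(f_stat, *df)`. A version that returns a results object with confidence regions would be closer to what the list above has in mind.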
list of methods partially taken from https://github.com/JuliaStats/MultivariateStats.jl, TOC of Stata mv, and R taskview https://cran.r-project.org/web/views/Multivariate.html
Todo, Priorities
???
whatever someone is working on and contributes