ESTYP (Extended Statistical Toolkit Yet Practical) is a Python library for statistical analysis, organized into three modules. The testing module covers a range of statistical tests, including t-tests, chi-squared tests, and correlation tests, for comparing and validating data. The linear_model module extends traditional logistic regression analysis with variable selection techniques and methods for calculating confidence intervals and p-values. The cluster module assists in clustering analysis, offering tools to identify the optimal number of clusters with the elbow or silhouette methods. Together, these modules form a practical statistical toolkit for a variety of analytical needs.
Actually, the name comes from what my friends call me (Esti), plus "p", the initial of Python.
To install this library, you can use PyPI:
pip install estyp
Also, you can install it from the source code:
git clone https://github.com/estebanrucan/estyp.git
cd estyp
pip install -e .
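After either install route, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of ESTYP.

```python
from importlib import metadata


def installed_version(package="estyp"):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("estyp is not installed in this environment")
else:
    print(f"estyp {version} is installed")
```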
You can find a friendly introduction to this library in the documentation.
You can see the full changelog here.
testing.CheckModel(): This class provides methods to test the assumptions of the linear regression model. It is inspired by the performance::check_model() function of the R software.
testing.t_test(): Performs one- and two-sample t-tests on groups of data. This function is inspired by the t.test() function of the R software.
testing.var_test(): Performs an F test to compare the variances of two samples from normal populations. This function is inspired by the var.test() function of the R software.
testing.prop_test(): Tests the null hypothesis that the proportions (probabilities of success) in several groups are the same, or that they equal certain given values. This function is inspired by the prop.test() function of the R software.
testing.chisq_test(): Performs a chi-squared test of independence of variables in a contingency table. This function is inspired by the chisq.test() function of the R software.
testing.cor_test(): Performs a correlation test with the Pearson, Spearman or Kendall method. This function is inspired by the cor.test() function of the R software.
testing.nested_models_test(): Performs a nested models test to compare two nested models using the deviance criterion.
testing.dw_test(): Performs the Durbin-Watson test for autocorrelation of disturbances (includes a p-value). This function is inspired by the lmtest::dwtest() function of the R software.
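To make the list above more concrete, here is a rough sketch of what a two-sample t-test computes, written with the standard library only. It is not ESTYP's implementation or API: the function name `welch_t` and the sample data are hypothetical, and it assumes the Welch (unequal-variance) form, which is the default in R's t.test(). It returns only the statistic and the degrees of freedom; the p-value requires the t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom.

    Hypothetical helper for illustration; not part of ESTYP.
    """
    nx, ny = len(x), len(y)
    # Per-sample contribution to the squared standard error of the mean difference
    sx = variance(x) / nx
    sy = variance(y) / ny
    t_stat = (mean(x) - mean(y)) / math.sqrt(sx + sy)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (sx + sy) ** 2 / (sx ** 2 / (nx - 1) + sy ** 2 / (ny - 1))
    return t_stat, df


# Made-up samples, only to exercise the function
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.4f}, df = {df:.4f}")
```

A library function such as testing.t_test() would additionally turn the statistic and degrees of freedom into a p-value and a confidence interval.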
linear_model.LogisticRegression(): This class implements a logistic regression model. It is like the LogisticRegression() class from scikit-learn, but adds methods for calculating confidence intervals, p-values, and model summaries, like the Logit class in statsmodels.
linear_model.Stepwise(): Provides an implementation to add or remove predictors based on their significance, AIC or BIC in a model.
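As an illustration of the confidence intervals and p-values mentioned above, the following sketch fits a one-predictor logistic regression by Newton-Raphson and derives Wald statistics from the inverse Fisher information, which is the usual approach behind statsmodels-style summaries. Whether ESTYP computes them exactly this way is an assumption. All function names and the data are hypothetical, and only the standard library is used.

```python
import math
from statistics import NormalDist


def score_and_information(b0, b1, x, y):
    """Log-likelihood gradient and Fisher information for logit(p) = b0 + b1*x."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)


def invert_2x2(h00, h01, h11):
    """Inverse of the symmetric matrix [[h00, h01], [h01, h11]]."""
    det = h00 * h11 - h01 * h01
    return h11 / det, -h01 / det, h00 / det


def fit_logistic(x, y, n_iter=50, tol=1e-10):
    """Newton-Raphson fit; returns (b0, b1) and the covariance entries (v00, v01, v11)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        (g0, g1), info = score_and_information(b0, b1, x, y)
        c00, c01, c11 = invert_2x2(*info)
        step0 = c00 * g0 + c01 * g1
        step1 = c01 * g0 + c11 * g1
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    # Covariance of the estimates: inverse information at the final coefficients
    _, info = score_and_information(b0, b1, x, y)
    return (b0, b1), invert_2x2(*info)


def wald_summary(estimate, var, level=0.95):
    """Wald z statistic, two-sided normal p-value, and confidence interval."""
    se = math.sqrt(var)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z, p_value, (estimate - crit * se, estimate + crit * se)


# Made-up, non-separable data so that the maximum likelihood estimate is finite
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
(b0, b1), (v00, v01, v11) = fit_logistic(x, y)
z, p_value, ci = wald_summary(b1, v11)
print(f"slope = {b1:.3f}, z = {z:.3f}, p = {p_value:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The sketch has no safeguards for perfectly separated data, where the estimates diverge; a real implementation would need to handle that case.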
cluster.NClusterSearch: A helper class to identify the optimal number of clusters for clustering algorithms with the elbow or silhouette methods.
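The silhouette approach can be sketched as follows: cluster the data for several candidate values of k, compute the mean silhouette score for each, and keep the k with the highest score. This is a generic, standard-library illustration on one-dimensional data, not the NClusterSearch API. The functions `kmeans_1d` and `silhouette_score`, the deterministic initialization, and the data are all assumptions made for the example.

```python
def kmeans_1d(points, k, n_iter=50):
    """Lloyd's k-means on 1-D data with a deterministic start; returns one label per point."""
    ordered = sorted(points)
    n = len(ordered)
    # Start from evenly spaced order statistics so the result is reproducible
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: abs(p - centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(abs(p - q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(p - q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Made-up data with three visibly separated groups
points = [1.0, 0.9, 1.1, 5.1, 4.8, 5.3, 9.0, 8.9, 9.1]
scores = {k: silhouette_score(points, kmeans_1d(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(f"silhouette by k: {scores}")
print(f"best k: {best_k}")
```

The elbow method follows the same loop but tracks the within-cluster sum of squares for each k and looks for the point where further increases in k stop yielding large reductions.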
This library is under the MIT license.
If you have any questions about this library, you can contact me at LinkedIn.