
With this Python notebook algorithm, you can use SPSS Model notebook to build machine learning pipelines that you can use to iterate rapidly during the model building process in data analysis. Whether you're trying to find the right algorithm or experimenting with different ways of preparing your data, you can create reproducible research that's…


schijioke-uche/data-analysis-with-python-an-spss-model


IBM SPSS® Statistics Modeler with Python

  by Jeffrey Chijioke-Uche, MSIS, MSIT, CPDS (IBM Sr. Solution Architect, Hybrid Cloud & Multicloud)
  IBM Information Technology PhD Scholar at Harvard University & Walden University 

Usage:

This notebook is typically used to analyse collected data that is not textual or thematic. For example, it can be used for a dependent-samples t-test, where you evaluate two hypotheses: the null hypothesis (H0), that the true mean difference between the paired observations is equal to zero, and the alternative hypothesis (H1), that the true mean difference is not equal to zero (two-tailed). With SPSS Modeler statistics flows, you can quickly develop predictive models that use business expertise to improve decision making at the hypothesis level.
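
As a minimal sketch of the dependent-samples t-test described above, the statistic can be computed from the paired differences using only the Python standard library. The function name and the sample values are hypothetical and are not taken from the notebooks; a p-value would normally come from a statistics package such as `scipy.stats.ttest_rel`.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a dependent-samples t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    # Under H0 the true mean difference is zero, so t = mean / standard error.
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical paired measurements, made up for illustration only.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [14.0, 17.0, 12.0, 17.0, 15.0]
t_stat, dof = paired_t_statistic(before, after)
print(f"t = {t_stat:.4f}, df = {dof}")
```

For a two-tailed test, H0 is rejected when the absolute value of t exceeds the critical value of the t distribution for the given degrees of freedom.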

Basics

  1. [CDF]: CDF plots for a random distribution
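
A minimal sketch of how the points of an empirical CDF could be computed, assuming a normally distributed random sample. The `ecdf` helper is a hypothetical name and the notebook may build its plots differently.

```python
import random

def ecdf(sample):
    """Return the sorted values and the empirical CDF evaluated at each one."""
    xs = sorted(sample)
    n = len(xs)
    # The i-th smallest value (1-based) has a fraction i / n of the data at or below it.
    ps = [(i + 1) / n for i in range(n)]
    return xs, ps

rng = random.Random(0)  # fixed seed so the run is reproducible
sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
xs, ps = ecdf(sample)
```

Plotting `xs` against `ps` as a step plot gives the CDF curve.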

More advanced topics

  1. [Confidence intervals]
    1.1 Standard confidence intervals for normal distribution
    1.2 Bootstrapped confidence intervals
    1.3 Bayesian estimates

  2. [Rejection sampling] A method to sample a random distribution

  3. [Binomial distribution] Binomial distribution and Bayes' theorem.

  4. [Power estimation]
    4.1 Standard solver
    4.2 Bootstrapping
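
To illustrate the confidence-interval topics listed above, here is a minimal sketch that compares a normal-approximation interval for the mean with a percentile bootstrap interval. The function names, the resample count, and the simulated data are assumptions chosen for illustration, not the notebook's own code.

```python
import math
import random
import statistics

def normal_ci(sample, level=0.95):
    """Normal-approximation confidence interval for the mean."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - z * se, m + z * se

def bootstrap_ci(sample, level=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Simulated data, for illustration only.
rng = random.Random(1)
data = [rng.gauss(10.0, 2.0) for _ in range(200)]
ci_normal = normal_ci(data)
ci_boot = bootstrap_ci(data)
print("normal:", ci_normal)
print("bootstrap:", ci_boot)
```

For roughly normal data the two intervals should be close; the bootstrap version makes no distributional assumption about the sample.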

Testing hypotheses

Test normality on a distribution

  1. [Normality tests]
    1.1 Q-Q plots
    1.2 Skew and Kurtosis tests
    1.3 Kolmogorov-Smirnov test
    1.4 Shapiro-Wilk test
    1.5 Anderson-Darling test
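
As a small illustration of the skew and kurtosis checks listed above, the following sketch computes the moment-based sample skewness and excess kurtosis, both of which are close to zero for normally distributed data. The helper name and the sample values are hypothetical; the formal tests themselves are usually run with a statistics package.

```python
import statistics

def skew_and_excess_kurtosis(sample):
    """Moment-based (biased) sample skewness and excess kurtosis."""
    n = len(sample)
    mean = statistics.fmean(sample)
    m2 = sum((x - mean) ** 2 for x in sample) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in sample) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in sample) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Made-up samples for illustration.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 2.0, 10.0]
skew_sym, kurt_sym = skew_and_excess_kurtosis(symmetric)
skew_right, _ = skew_and_excess_kurtosis(right_skewed)
```

A symmetric sample gives a skewness of zero, while a long right tail gives a positive value.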

Goodness of fit

  1. For categorical data
    [chi square test]
  2. For 2 sample distributions
    [Kolmogorov-Smirnov test]
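
For the categorical case, a minimal sketch of the chi-square goodness-of-fit statistic is shown below. The die-roll counts are made up for illustration, and the helper name is hypothetical; in practice a function such as `scipy.stats.chisquare` would also return the p-value.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 120 rolls of a die, tested against a fair die.
observed = [18, 22, 20, 25, 15, 20]
expected = [20] * 6
stat = chi_square_statistic(observed, expected)

# Critical value of the chi-square distribution for df = 5 at alpha = 0.05.
CRITICAL_DF5_ALPHA_05 = 11.070
reject_h0 = stat > CRITICAL_DF5_ALPHA_05
print(f"chi-square = {stat:.2f}, reject H0: {reject_h0}")
```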

Test difference between means

  1. [Parametric tests & Bootstrapping]
    1.1 t-test
    1.2 Cohen's d (effect size)
    1.3 Bootstrapping

  2. [Non-parametric tests]
    2.1 Wilcoxon rank-sum test
    2.2 Mann-Whitney test
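
Two of the quantities listed above can be sketched with the standard library: Cohen's d with a pooled standard deviation, and the Mann-Whitney U statistic computed by direct pair counting. The function names and the two small groups are hypothetical and only illustrate the definitions.

```python
import math
import statistics

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)

def mann_whitney_u(a, b):
    """U statistic for sample a: each pair with a > b counts 1, each tie 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b
    )

# Made-up groups for illustration.
group_a = [5.0, 6.0, 7.0, 8.0, 9.0]
group_b = [3.0, 4.0, 5.0, 6.0, 7.0]
d = cohens_d(group_a, group_b)
u = mann_whitney_u(group_a, group_b)
print(f"Cohen's d = {d:.4f}, U = {u}")
```

The pair-counting form of U is quadratic in the sample size, which is fine for small samples; rank-based formulas are the usual choice for larger ones.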

Test difference between means for dependent groups (repeated measures)

  1. Parametric tests
    1.1 [Paired t-test]
    1.2 [Repeated Measures ANOVA]
  2. Non-parametric tests
    2.1 [Friedman chi square test]
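
For the non-parametric repeated-measures case, the Friedman chi-square statistic can be sketched as follows. This version assumes there are no ties within a subject's measurements; the function name and the scores are hypothetical. A library routine such as `scipy.stats.friedmanchisquare` handles ties and returns a p-value.

```python
def friedman_statistic(rows):
    """Friedman chi-square for rows of repeated measures (no ties assumed).

    Each row holds one subject's measurements under k conditions.
    """
    n = len(rows)      # number of subjects
    k = len(rows[0])   # number of conditions
    rank_sums = [0.0] * k
    for row in rows:
        # Rank the conditions within this subject, smallest value gets rank 1.
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    total = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * total - 3.0 * n * (k + 1)

# Made-up scores: 4 subjects measured under 3 conditions.
scores = [
    [5.1, 6.3, 7.2],
    [4.8, 5.9, 6.5],
    [5.5, 6.1, 7.9],
    [4.9, 6.0, 6.8],
]
q = friedman_statistic(scores)
print(f"Friedman chi-square = {q}")
```

Here every subject ranks the conditions in the same order, so the statistic reaches its maximum of n(k - 1). The chi-square approximation with k - 1 degrees of freedom is rough for so few subjects.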

Test percentage change

  1. [Delta method (A/B testing)]
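
A minimal sketch of the delta method for a percentage change between two independent groups: the relative lift is the ratio of the group means minus one, and its variance is approximated to first order. The function name and the data are hypothetical, and the independence of the two groups is an assumption of this sketch.

```python
import math
import statistics

def relative_lift_ci(control, treatment, level=0.95):
    """Return (lift, lower, upper) for mean(treatment) / mean(control) - 1."""
    mean_c = statistics.fmean(control)
    mean_t = statistics.fmean(treatment)
    var_mean_c = statistics.variance(control) / len(control)
    var_mean_t = statistics.variance(treatment) / len(treatment)
    lift = mean_t / mean_c - 1.0
    # First-order delta-method variance of the ratio mean_t / mean_c,
    # assuming the two groups are independent.
    var_ratio = var_mean_t / mean_c**2 + mean_t**2 * var_mean_c / mean_c**4
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(var_ratio)
    return lift, lift - half_width, lift + half_width

# Made-up A/B test measurements.
control = [10.0, 12.0, 11.0, 9.0, 13.0, 10.0, 11.0, 12.0]
treatment = [12.0, 13.0, 12.0, 11.0, 14.0, 12.0, 13.0, 13.0]
lift, lower, upper = relative_lift_ci(control, treatment)
print(f"lift = {lift:.2%}, CI = ({lower:.2%}, {upper:.2%})")
```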
