Stats

Description

This is a prototype of a statistics library for Ruby. To start with, the library aims to be readable (for people studying statistics), well-tested (against R and Python statistical functions), and useful for Small Data. Big Data can come later, if I have enough fun. With stats, I aim to create an API that makes statistics intuitive and harder to mess up. For example, I'd like to take a stab at an assumption framework that tags specific functions with their assumptions and emits warnings when those assumptions are not met.
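
To make the assumption idea concrete, here is a minimal sketch of how a function could be tagged with a check that warns when its input violates an assumption. Everything in it is hypothetical: the `Assumptions` module, the `assume` method, and the `SampleStats` class are illustrative names, not part of the stats API. It assumes Ruby 2.4 or later for `Array#sum`.

```ruby
# Hypothetical sketch: none of these names come from the stats library itself.
module Assumptions
  # Wraps an existing instance method so that the given check runs first.
  # If the check returns false, a warning is printed and the method still runs.
  def assume(method_name, description, &check)
    original = instance_method(method_name)
    define_method(method_name) do |*args|
      unless check.call(*args)
        warn "Assumption not met for #{method_name}: #{description}"
      end
      original.bind(self).call(*args)
    end
  end
end

class SampleStats
  extend Assumptions

  # Geometric mean computed through logarithms.
  def geometric_mean(values)
    Math.exp(values.sum { |v| Math.log(v) } / values.size)
  end

  # Tag the method: the geometric mean is only meaningful for positive values.
  assume(:geometric_mean, "all values must be positive") do |values|
    values.all? { |v| v > 0 }
  end
end

puts SampleStats.new.geometric_mean([1.0, 4.0, 16.0]) # close to 4
```

Calling `geometric_mean([0.0, 2.0])` on this sketch prints the warning to standard error and still returns a value, which matches the "warn, don't fail" behaviour described above.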


Try it out

Once the library is stable and fully tested (so far, every function listed below is), I'll consider publishing it as a gem. Until then, you can play around with master:

brew install gsl
git clone https://github.com/davejacobs/stats.git
cd stats
bundle

To implement

For developers

  • Get Ruby GSL bindings (gem install gsl) to work on Ruby 2.0/OS X
  • Implement gemspec so this is installable via git URL

Distribution functions

I've added a wrapper around GSL distribution functions, for more intuitive access and testing.

  • Normal distribution - PDF & CDF
  • Chi-square distribution - PDF & CDF
  • t distribution - PDF & CDF
  • F distribution - PDF & CDF
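
As a reference point for what a PDF and a CDF return, here is a small pure-Ruby sketch of the normal distribution. It is not the library's GSL wrapper: the `NormalSketch` module and its keyword arguments are made up for illustration, and it relies only on `Math.erf` from Ruby's standard `Math` module.

```ruby
# Illustrative only: closed-form formulas in plain Ruby, not the GSL wrapper.
module NormalSketch
  module_function

  # Probability density of a normal distribution at x.
  def pdf(x, mean: 0.0, sd: 1.0)
    z = (x - mean) / sd
    Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math::PI))
  end

  # Cumulative probability P(X <= x), expressed through the error function.
  def cdf(x, mean: 0.0, sd: 1.0)
    z = (x - mean) / sd
    0.5 * (1 + Math.erf(z / Math.sqrt(2)))
  end
end

puts NormalSketch.pdf(0.0)  # about 0.3989
puts NormalSketch.cdf(1.96) # about 0.975
```

Values like these are the kind of thing that can be checked against R's `dnorm` and `pnorm`.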

Basic functions

  • Mean, arithmetic
  • Mean, geometric
  • Median
  • Mode
  • Variance
  • Standard deviation
  • Standard error of the mean (for samples only)
  • Relative standard error of the mean (for samples only)
  • Coefficient of variation
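
For readers studying the definitions, the following sketch shows how several of these quantities relate to one another for a sample. The `BasicSketch` module and its method names are assumptions for illustration, not the library's API, and the variance uses the sample form (dividing by n - 1). It assumes Ruby 2.4 or later for `Array#sum`.

```ruby
# Illustrative only: method names are hypothetical, not the stats API.
module BasicSketch
  module_function

  def mean(values)
    values.sum.to_f / values.size
  end

  # Sample variance with Bessel's correction (divide by n - 1).
  def variance(values)
    m = mean(values)
    values.sum { |v| (v - m)**2 } / (values.size - 1)
  end

  def standard_deviation(values)
    Math.sqrt(variance(values))
  end

  # Standard error of the mean: sample standard deviation over sqrt(n).
  def standard_error(values)
    standard_deviation(values) / Math.sqrt(values.size)
  end

  # Coefficient of variation: standard deviation relative to the mean.
  def coefficient_of_variation(values)
    standard_deviation(values) / mean(values)
  end
end

sample = [2, 4, 4, 4, 5, 5, 7, 9]
puts BasicSketch.mean(sample)     # 5.0
puts BasicSketch.variance(sample) # 32 / 7, about 4.571
```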

Significance tests

  • Chi-square
  • T-test, single sample
  • T-test, two-sample
  • T-test, repeated measures
  • Wilcoxon rank sum test
  • Wilcoxon signed rank test
  • Median test
  • Kruskal-Wallis H test
  • Friedman test
  • ANOVA, one-way
  • Factorial ANOVA, two-way
  • Factorial ANOVA, three-way
  • ANOVA, repeated measures
  • MANOVA
  • ANCOVA
  • Welch's ANOVA
  • Fisher's least significant difference
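
As one worked example from this list, the single-sample t-test statistic can be sketched in a few lines. The `TTestSketch` module and the returned hash shape are hypothetical, not the library's interface. The sketch stops at the statistic and degrees of freedom; turning them into a p-value needs the t distribution CDF, which is left out here. It assumes Ruby 2.4 or later for `Array#sum`.

```ruby
# Illustrative only: names and return shape are assumptions, not the stats API.
module TTestSketch
  module_function

  # t statistic and degrees of freedom for H0: population mean == mu0.
  def one_sample(values, mu0)
    n = values.size
    mean = values.sum.to_f / n
    variance = values.sum { |v| (v - mean)**2 } / (n - 1) # sample variance
    standard_error = Math.sqrt(variance / n)
    { t: (mean - mu0) / standard_error, df: n - 1 }
  end
end

result = TTestSketch.one_sample([2, 4, 4, 4, 5, 5, 7, 9], 4)
puts result[:t]  # about 1.32
puts result[:df] # 7
```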

Regressions

  • Linear regression
  • Multiple linear regression
  • Pearson's correlation
  • Spearman correlation
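
Pearson's correlation and simple linear regression share the same sums of squares, which the sketch below makes explicit. The `RegressionSketch` module is a hypothetical name and the code is plain Ruby rather than anything backed by GSL. It assumes Ruby 2.4 or later for `Array#sum`.

```ruby
# Illustrative only: names are hypothetical, not the stats API.
module RegressionSketch
  module_function

  def mean(values)
    values.sum.to_f / values.size
  end

  # Pearson's r: co-deviation scaled by the two sums of squared deviations.
  def pearson(xs, ys)
    mx = mean(xs)
    my = mean(ys)
    sxy = xs.zip(ys).sum { |x, y| (x - mx) * (y - my) }
    sxx = xs.sum { |x| (x - mx)**2 }
    syy = ys.sum { |y| (y - my)**2 }
    sxy / Math.sqrt(sxx * syy)
  end

  # Ordinary least squares fit of y = intercept + slope * x.
  def linear_regression(xs, ys)
    mx = mean(xs)
    my = mean(ys)
    sxy = xs.zip(ys).sum { |x, y| (x - mx) * (y - my) }
    sxx = xs.sum { |x| (x - mx)**2 }
    slope = sxy / sxx
    { slope: slope, intercept: my - slope * mx }
  end
end

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9] # exactly y = 1 + 2x
fit = RegressionSketch.linear_regression(xs, ys)
puts fit[:slope]     # 2.0
puts fit[:intercept] # 1.0
```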

Support & other

  • Basic assumption framework
  • Confidence intervals (general idea)
  • Basic data structures
  • Significance methods on data structures
  • Test using R integration and something like Rantly

Resources