Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
A suite for basic and advanced statistics on Ruby.

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
bin
data
doc_latex/manual
examples
lib
po
test
.gitignore
History.txt
LICENSE.txt
Manifest.txt
README.txt
Rakefile
setup.rb

README.txt

= Statsample

http://ruby-statsample.rubyforge.org/


== DESCRIPTION:

A suite for basic and advanced statistics on Ruby. Tested on Ruby 1.8.7, 1.9.1, 1.9.2 (April, 2010) and JRuby 1.4 (Ruby 1.8.7 compatible).

Include:
* Descriptive statistics: frequencies, median, mean, standard error, skew, kurtosis (and many others).
* Imports and exports datasets from and to Excel, CSV and plain text files.
* Correlations: Pearson's r, Spearman's rank correlation (rho), Tetrachoric, Polychoric.
* Anova: generic and vector-based One-way ANOVA
* Tests: F, T, Levene, U-Mannwhitney.
* Regression: Simple, Multiple (OLS), Probit  and Logit
* Factorial Analysis: Extraction (PCA and Principal Axis), Rotation (Varimax, Equimax, Quartimax) and Parallel Analysis, for estimation of number of factors.
* Dominance Analysis, with multivariate dependent and bootstrap (Azen & Budescu)
* Sample calculation related formulas
* Creates reports on text, html and rtf, using ReportBuilder gem

== FEATURES:

* Classes for manipulation and storage of data:
  * Statsample::Vector: An extension of an array, with statistical methods like sum, mean and standard deviation
  * Statsample::Dataset: a group of Statsample::Vector, analog to a excel spreadsheet or a dataframe on R. The base of almost all operations on statsample. 
  * Statsample::Multiset: multiple datasets with same fields and type of vectors
* Anova module provides generic Statsample::Anova::OneWay and vector based Statsample::Anova::OneWayWithVectors
* Module Statsample::Bivariate provides covariance and pearson, spearman, point biserial, tau a, tau b, gamma, tetrachoric (see Bivariate::Tetrachoric) and polychoric (see Bivariate::Polychoric) correlations. Include methods to create correlation and covariance matrices
* Multiple types of regression.
  * Simple Regression :  Statsample::Regression::Simple
  * Multiple Regression: Statsample::Regression::Multiple
  * Logit Regression:    Statsample::Regression::Binomial::Logit
  * Probit Regression:    Statsample::Regression::Binomial::Probit
* Factorial Analysis algorithms on Statsample::Factor module.
  * Classes for Extraction of factors: 
    * Statsample::Factor::PCA
    * Statsample::Factor::PrincipalAxis
  * Classes for Rotation of factors: 
    * Statsample::Factor::Varimax
    * Statsample::Factor::Equimax
    * Statsample::Factor::Quartimax
  * Statsample::Factor::ParallelAnalysis performs Horn's 'parallel analysis' to a principal components analysis to adjust for sample bias in the retention of components. 
* Dominance Analysis. Based on Budescu and Azen papers, Statsample::DominanceAnalysis class can report dominance analysis for a sample, using uni or multivariate dependent variables and DominanceAnalysisBootstrap can execute bootstrap analysis to determine dominance stability, as recomended by  Azen & Budescu (2003) link[http://psycnet.apa.org/journals/met/8/2/129/]. 
* Module Statsample::Codification, to help to codify open questions
* Converters to import and export data:
  * Statsample::Database : Can create sql to create tables, read and insert data
  * Statsample::CSV : Read and write CSV files
  * Statsample::Excel : Read and write Excel files
  * Statsample::Mx    : Write Mx Files
  * Statsample::GGobi : Write Ggobi files
* Module Statsample::Crosstab provides function to create crosstab for categorical data
* Reliability analysis provides functions to analyze scales. Class ItemAnalysis provides statistics like mean, standard deviation for a scale, Cronbach's alpha and standarized Cronbach's alpha, and for each item: mean, correlation with total scale, mean if deleted, Cronbach's alpha is deleted. With HtmlReport, graph the histogram of the scale and the Item Characteristic Curve for each item
* Module Statsample::SRS (Simple Random Sampling) provides a lot of functions to estimate standard error for several type of samples
* Module Statsample::Test provides several methods and classes to perform inferencial statistics
  * Statsample::Test::Levene
  * Statsample::Test::UMannWhitney
  * Statsample::Test::T
  * Statsample::Test::F  
* Interfaces to gdchart, gnuplot and SVG::Graph 


== Examples of use:

=== Correlation matrix

    require 'statsample'
    a=1000.times.collect {rand}.to_scale
    b=1000.times.collect {rand}.to_scale
    c=1000.times.collect {rand}.to_scale
    d=1000.times.collect {rand}.to_scale
    ds={'a'=>a,'b'=>b,'c'=>c,'d'=>d}.to_dataset
    cm=Statsample::Bivariate.correlation_matrix(ds)
    puts cm.summary

=== Tetrachoric correlation

    require 'statsample'
    a=40
    b=10
    c=20
    d=30
    tetra=Statsample::Bivariate::Tetrachoric.new(a,b,c,d)
    puts tetra.summary
    
=== Polychoric correlation

    require 'statsample'
    ct=Matrix[[58,52,1],[26,58,3],[8,12,9]]
    
    poly=Statsample::Bivariate::Polychoric.new(ct)
    puts poly.summary

== REQUIREMENTS:

Optional: 

* Plotting: gnuplot and rbgnuplot, SVG::Graph
* Factorial analysis and polychorical correlation(joint estimate and polychoric series): gsl library and rb-gsl (http://rb-gsl.rubyforge.org/). You should install it using <tt>gem install gsl</tt>. 

<b>Note</b>: Use gsl 1.12.109 or later.

== RESOURCES

* Source code on github: http://github.com/clbustos/statsample
* API: http://ruby-statsample.rubyforge.org/statsample/
* Bug report and feature request: http://code.google.com/p/ruby-statsample/issues/list


== INSTALL:

  sudo gem install ruby-statsample

For optimization on *nix env

  sudo gem install gsl ruby-statsample-optimization

Available setup.rb file

  sudo gem ruby setup.rb

== LICENSE:

GPL-2 (See LICENSE.txt)
Something went wrong with that request. Please try again.