Commit
March 18, 2010
clbustos committed Mar 18, 2010
1 parent 4f87528 commit b6871d7
Showing 33 changed files with 1,576 additions and 1,475 deletions.
42 changes: 30 additions & 12 deletions README.txt
@@ -10,25 +10,43 @@ A suite for basic and advanced statistics on Ruby. Tested on Ruby 1.8.7, Ruby 1.
Includes:
* Descriptive statistics: frequencies, median, mean, standard error, skew, kurtosis (and many others).
* Imports and exports datasets from and to Excel, CSV and plain text files.
* Correlations: Pearson (r), Rho, Tetrachoric, Polychoric
* Regression: Simple, Multiple, Probit and Logit
* Correlations: Pearson's r, Spearman's rank correlation (rho), Tetrachoric, Polychoric
* Regression: Simple, Multiple, Probit and Logit
* Factorial Analysis: Extraction (PCA and Principal Axis) and Rotation (Varimax and relatives)
* Dominance Analysis, with multivariate dependent and bootstrap (Azen & Budescu)
* Sample calculation related formulas

== FEATURES:

* Factorial Analysis. Principal Component Analysis and Principal Axis extraction, with orthogonal rotations (Varimax, Equimax, Quartimax)
* Multiple Regression. Listwise analysis optimized with use of Alglib library. Pairwise analysis is executed on pure ruby with matrixes and reports same values as SPSS
* Module Bivariate provides covariance and pearson, spearman, point biserial, tau a, tau b, gamma, tetrachoric and polychoric correlation correlations. Include methods to create correlation (pearson and tetrachoric) and covariance matrices
* Regression module provides linear regression methods
* Dominance Analysis. Based on Budescu and Azen papers, <strong>DominanceAnalysis</strong> class can report dominance analysis for a sample, using uni or multivariate dependent variables and <strong>DominanceAnalysisBootstrap</strong> can execute bootstrap analysis to determine dominance stability, as recomended by Azen & Budescu (2003) link[http://psycnet.apa.org/journals/met/8/2/129/].
* Classes for Vector, Datasets (set of Vectors) and Multisets (multiple datasets with same fields and type of vectors), and multiple methods to manipulate them
* Module Codification, to help to codify open questions
* Converters to and from database and csv files, and to output Mx and GGobi files
* Module Crosstab provides function to create crosstab for categorical data
* Classes for manipulation and storage of data:
* Statsample::Vector: An extension of an array, with statistical methods like sum, mean and standard deviation
* Statsample::Dataset: a group of Statsample::Vector objects, analogous to an Excel spreadsheet or a data frame in R. The base of almost all operations on statsample.
* Statsample::Multiset: multiple datasets with same fields and type of vectors
* Module Statsample::Bivariate provides covariance and pearson, spearman, point biserial, tau a, tau b, gamma, tetrachoric (see Bivariate::Tetrachoric) and polychoric (see Bivariate::Polychoric) correlations. Include methods to create correlation and covariance matrices
* Multiple types of regression.
* Simple Regression : Statsample::Regression::Simple
* Multiple Regression: Statsample::Regression::Multiple
* Logit Regression: Statsample::Regression::Binomial::Logit
* Probit Regression: Statsample::Regression::Binomial::Probit
* Factorial Analysis algorithms on Statsample::Factor module.
* Classes for Extraction of factors:
* Statsample::Factor::PCA
* Statsample::Factor::PrincipalAxis
* Classes for Rotation of factors:
* Statsample::Factor::Varimax
* Statsample::Factor::Equimax
* Statsample::Factor::Quartimax
* Dominance Analysis. Based on Budescu and Azen papers, Statsample::DominanceAnalysis class can report dominance analysis for a sample, using uni or multivariate dependent variables and DominanceAnalysisBootstrap can execute bootstrap analysis to determine dominance stability, as recommended by Azen & Budescu (2003) link[http://psycnet.apa.org/journals/met/8/2/129/].
* Module Statsample::Codification, to help to codify open questions
* Converters to import and export data:
* Statsample::Database : Can create sql to create tables, read and insert data
* Statsample::CSV : Read and write CSV files
* Statsample::Excel : Read and write Excel files
* Statsample::Mx : Write Mx Files
* Statsample::GGobi : Write Ggobi files
* Module Statsample::Crosstab provides function to create crosstab for categorical data
* Reliability analysis provides functions to analyze scales. Class ItemAnalysis provides statistics like mean, standard deviation for a scale, Cronbach's alpha and standardized Cronbach's alpha, and for each item: mean, correlation with total scale, mean if deleted, Cronbach's alpha if deleted. With HtmlReport, graph the histogram of the scale and the Item Characteristic Curve for each item
* Module SRS (Simple Random Sampling) provides a lot of functions to estimate standard error for several type of samples
* Module Statsample::SRS (Simple Random Sampling) provides a lot of functions to estimate standard error for several type of samples
* Interfaces to gdchart, gnuplot and SVG::Graph
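
The README lists Pearson's r and Spearman's rank correlation (rho) among the Bivariate features. As a reminder of what those two statistics compute, here is a minimal plain-Ruby sketch. It does not call or mirror the Statsample API; the method names `mean`, `pearson_r`, `ranks` and `spearman_rho` are illustrative assumptions only.

```ruby
# Plain-Ruby sketch; helper names are hypothetical, not Statsample methods.

def mean(values)
  values.sum.to_f / values.size
end

# Pearson's r: sum of cross-products of deviations divided by the
# square root of the product of the sums of squared deviations.
def pearson_r(x, y)
  raise ArgumentError, "vectors must have the same size" unless x.size == y.size
  mx = mean(x)
  my = mean(y)
  dx = x.map { |v| v - mx }
  dy = y.map { |v| v - my }
  numerator = dx.zip(dy).sum { |a, b| a * b }
  denominator = Math.sqrt(dx.sum { |a| a * a } * dy.sum { |b| b * b })
  numerator / denominator
end

# 1-based ranks; tied values share the average of their positions.
def ranks(values)
  sorted = values.sort
  values.map do |v|
    first = sorted.index(v)
    last = sorted.rindex(v)
    (first + last) / 2.0 + 1
  end
end

# Spearman's rho is Pearson's r computed on the ranks.
def spearman_rho(x, y)
  pearson_r(ranks(x), ranks(y))
end
```

Computing rho as Pearson's r on average ranks keeps ties handled consistently, at the cost of an O(n^2) rank lookup that is only acceptable for small vectors.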


2 changes: 1 addition & 1 deletion Rakefile
@@ -18,7 +18,7 @@ desc "Ruby Lint"
task :lint do
executable=Config::CONFIG['RUBY_INSTALL_NAME']
Dir.glob("lib/**/*.rb") {|f|
if !system %{#{executable} -cw -W2 "#{f}"}
if !system %{#{executable} -w -c "#{f}"}
puts "Error on: #{f}"
end
}
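
The lint task above shells out to the interpreter with `-w -c` for every file under `lib/`. The same check can be sketched as a standalone helper. This is an assumption-laden illustration rather than the project's Rakefile: it uses `RbConfig.ruby` (current Ruby versions no longer provide the `Config` constant used in the task) and `Open3.capture2e`, and the name `lint_files` is hypothetical.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: runs `ruby -w -c` on each path and returns the
# paths whose syntax check exited with a non-zero status.
def lint_files(paths)
  paths.reject do |path|
    # capture2e merges stdout and stderr; only the exit status is used here.
    _output, status = Open3.capture2e(RbConfig.ruby, "-w", "-c", path)
    status.success?
  end
end
```

A task body could then call `lint_files(Dir.glob("lib/**/*.rb")).each { |f| puts "Error on: #{f}" }`. Passing the arguments as a list avoids building a shell string, so file names containing quotes or spaces need no escaping.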
8 changes: 1 addition & 7 deletions demo/multiple_regression.rb
@@ -17,20 +17,14 @@

rb=ReportBuilder.new("Multiple Regression Engines")

if HAS_GSL
if Statsample.has_gsl?
x.report("GSL:") {
lr=Statsample::Regression::Multiple::GslEngine.new(ds,'y',:name=>"Multiple Regression using GSL")
rb.add(lr.summary)
}
end


if HAS_ALGIB
x.report("Alglib:") {
lr=Statsample::Regression::Multiple::AlglibEngine.new(ds,'y', :name=>"Multiple Regression using Alglib")
rb.add(lr.summary)
}
end
x.report("Ruby:") {
lr=Statsample::Regression::Multiple::RubyEngine.new(ds,'y',:name=>"Multiple Regression using RubyEngine")
rb.add(lr.summary)
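
The demo now asks `Statsample.has_gsl?` instead of reading a `HAS_GSL` constant. How that method is implemented is not shown in this diff; one common way to detect an optional dependency is to attempt the `require` and rescue `LoadError`. The sketch below illustrates that pattern only. `library_available?` and `engines_to_run` are hypothetical names, not part of statsample.

```ruby
# Hypothetical helper: true when the named library can be loaded.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

# Returns the engine labels to benchmark: each optional engine only if
# its library loads, followed by the pure-Ruby engine, which always runs.
def engines_to_run(optional = { "gsl" => :gsl })
  available = optional.select { |lib, _engine| library_available?(lib) }.values
  available + [:ruby]
end
```

Wrapping the check in a method rather than a constant lets callers query availability lazily and makes the check easy to stub in tests.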
198 changes: 99 additions & 99 deletions lib/spss.rb
@@ -7,114 +7,114 @@
# Claudio Bustos mailto:clbustos@gmail.com

module SPSS # :nodoc: all
module Dictionary
class Element
def add(a)
@elements.push(a)
end
def parse_elements(func=:to_s)
@elements.collect{|e| " "+e.send(func)}.join("\n")
end
def init_with config
config.each {|key,value|
self.send(key.to_s+"=",value) if methods.include? key.to_s
}
end
def initialize(config={})
@config=config
@elements=[]
end
end
class Dictionary < Element
attr_accessor :locale, :date_time, :row_count
def initialize(config={})
super
init_with ({
:locale=>"en_US",
:date_time=>Time.new().strftime("%Y-%m-%dT%H:%M:%S"),
:row_count=>1
})
init_with config
end

def to_xml
"<dictionary locale='#{@locale}' creationDateTime='#{@date_time}' rowCount='#{@row_count}' xmlns='http://xml.spss.com/spss/data'>\n"+parse_elements(:to_xml)+"\n</dictionary>"

end
def to_spss
parse_elements(:to_spss)
end
end
module Dictionary
class Element
def add(a)
@elements.push(a)
end
def parse_elements(func=:to_s)
@elements.collect{|e| " "+e.send(func)}.join("\n")
end
def init_with config
config.each {|key,value|
self.send(key.to_s+"=",value) if methods.include? key.to_s
}
end
def initialize(config={})
@config=config
@elements=[]
end
end
class Dictionary < Element
attr_accessor :locale, :date_time, :row_count
def initialize(config={})
super
init_with ({
:locale=>"en_US",
:date_time=>Time.new().strftime("%Y-%m-%dT%H:%M:%S"),
:row_count=>1
})
init_with config
end

def to_xml
"<dictionary locale='#{@locale}' creationDateTime='#{@date_time}' rowCount='#{@row_count}' xmlns='http://xml.spss.com/spss/data'>\n"+parse_elements(:to_xml)+"\n</dictionary>"

class MissingValue < Element
attr_accessor :data, :type, :from, :to
def initialize(data,type=nil)
@data=data
if type.nil? or type=="lowerBound" or type=="upperBound"
@type=type
else
raise Exception,"Incorrect value for type"
end
end
def to_xml
"<missingValue data='#{@data}' "+(type.nil? ? "":"type='#{type}'")+"/>"
end
end
class LabelSet
attr_accessor
def initialize(labels)
@labels=labels
end
def parse_xml(name)
"<valueLabelSet>\n "+@labels.collect{|key,value| "<valueLabel label='#{key}' value='#{value}' />"}.join("\n ")+"\n <valueLabelVariable name='#{name}' />\n</valueLabelSet>"
end
def parse_spss()
@labels.collect{|key,value| "#{key} '#{value}'"}.join("\n ")
end
end
def to_spss
parse_elements(:to_spss)
end
end

class MissingValue < Element
attr_accessor :data, :type, :from, :to
def initialize(data,type=nil)
@data=data
if type.nil? or type=="lowerBound" or type=="upperBound"
@type=type
else
raise Exception,"Incorrect value for type"
end
class Variable < Element
attr_accessor :aligment, :display_width, :label, :measurement_level, :name, :type, :decimals, :width, :type_format, :labelset, :missing_values
def initialize(config={})
super
@@var_number||=1
init_with({
:aligment => "left",
:display_width => 8,
:label => "Variable #{@@var_number}",
:measurement_level => "SCALE",
:name => "var#{@@var_number}",
:type => 0,
:decimals => 2,
:width => 10,
:type_format => "F",
:labelset => nil
})
init_with config
@missing_values=[]
@@var_number+=1
end
def to_xml
labelset_s=(@labelset.nil?) ? "":"\n"+@labelset.parse_xml(@name)
missing_values=(@missing_values.size>0) ? @missing_values.collect {|m| m.to_xml}.join("\n"):""
"<variable aligment='#{@aligment}' displayWidth='#{@display_width}' label='#{@label}' measurementLevel='#{@measurement_level}' name='#{@name}' type='#{@type}'>\n<variableFormat decimals='#{@decimals}' width='#{@width}' type='#{@type_format}' />\n"+parse_elements(:to_xml)+missing_values+"</variable>"+labelset_s
end
def to_spss
out=<<HERE
end
def to_xml
"<missingValue data='#{@data}' "+(type.nil? ? "":"type='#{type}'")+"/>"
end
end
class LabelSet
attr_accessor
def initialize(labels)
@labels=labels
end
def parse_xml(name)
"<valueLabelSet>\n "+@labels.collect{|key,value| "<valueLabel label='#{key}' value='#{value}' />"}.join("\n ")+"\n <valueLabelVariable name='#{name}' />\n</valueLabelSet>"
end
def parse_spss()
@labels.collect{|key,value| "#{key} '#{value}'"}.join("\n ")
end
end
class Variable < Element
attr_accessor :aligment, :display_width, :label, :measurement_level, :name, :type, :decimals, :width, :type_format, :labelset, :missing_values
def initialize(config={})
super
@@var_number||=1
init_with({
:aligment => "left",
:display_width => 8,
:label => "Variable #{@@var_number}",
:measurement_level => "SCALE",
:name => "var#{@@var_number}",
:type => 0,
:decimals => 2,
:width => 10,
:type_format => "F",
:labelset => nil
})
init_with config
@missing_values=[]
@@var_number+=1
end
def to_xml
labelset_s=(@labelset.nil?) ? "":"\n"+@labelset.parse_xml(@name)
missing_values=(@missing_values.size>0) ? @missing_values.collect {|m| m.to_xml}.join("\n"):""
"<variable aligment='#{@aligment}' displayWidth='#{@display_width}' label='#{@label}' measurementLevel='#{@measurement_level}' name='#{@name}' type='#{@type}'>\n<variableFormat decimals='#{@decimals}' width='#{@width}' type='#{@type_format}' />\n"+parse_elements(:to_xml)+missing_values+"</variable>"+labelset_s
end
def to_spss
out=<<HERE
VARIABLE LABELS #{@name} '#{label}' .
VARIABLE ALIGMENT #{@name} (#{@aligment.upcase}) .
VARIABLE WIDTH #{@name} (#{@display_width}) .
VARIABLE LEVEL #{@name} (#{@measurement_level.upcase}) .
HERE
if !@labelset.nil?
out << "VALUE LABELS #{@name} "+labelset.parse_spss()+" ."
end
if @missing_values.size>0
out << "MISSING VALUES #{@name} ("+@missing_values.collect{|m| m.data}.join(",")+") ."
end
out
end
if !@labelset.nil?
out << "VALUE LABELS #{@name} "+labelset.parse_spss()+" ."
end
if @missing_values.size>0
out << "MISSING VALUES #{@name} ("+@missing_values.collect{|m| m.data}.join(",")+") ."
end
out
end
end
end
end
n=SPSS::Dictionary::Dictionary.new
ls=SPSS::Dictionary::LabelSet.new({1=>"Si",2=>"No"})
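
A note on `init_with` in the hunk above: it guards each assignment with `methods.include? key.to_s`. On Ruby 1.9 and later `Object#methods` returns symbols, so a string comparison like this should never match and the defaults would be silently skipped. A minimal sketch of the same defaults-then-overrides pattern using `respond_to?` follows. The class names `ConfigurableElement` and `ExampleVariable` are hypothetical and are not proposed replacements for the SPSS classes.

```ruby
# Hypothetical base class illustrating config-driven initialization.
class ConfigurableElement
  # Calls the setter for each key that has one; unknown keys are ignored.
  def init_with(config)
    config.each do |key, value|
      setter = "#{key}="
      public_send(setter, value) if respond_to?(setter)
    end
  end
end

class ExampleVariable < ConfigurableElement
  attr_accessor :name, :width

  def initialize(config = {})
    init_with(name: "var1", width: 10) # defaults first
    init_with(config)                  # caller overrides second
  end
end
```

`respond_to?` accepts either a string or a symbol, so the check behaves the same regardless of how the interpreter represents method names.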