Skip to content

Commit

Permalink
math.statistics: add Spearman's correlation, and rank-by-{avg,min,max}
Browse files Browse the repository at this point in the history
  • Loading branch information
mrjbq7 committed Aug 20, 2023
1 parent 30df45c commit b607962
Show file tree
Hide file tree
Showing 3 changed files with 31 additions and 11 deletions.
4 changes: 4 additions & 0 deletions basis/math/statistics/statistics-docs.factor
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,10 @@ HELP: population-corr
{ $values { "x-seq" sequence } { "y-seq" sequence } { "corr" "a real number" } }
{ $description "Computes the correlation of two sequences, " { $snippet "x-seq" } " and " { $snippet "y-seq" } "." } ;

HELP: spearman-corr
{ $values { "x-seq" sequence } { "y-seq" sequence } { "corr" "a real number" } }
{ $description "Computes the Spearman's correlation of two sequences, " { $snippet "x-seq" } " and " { $snippet "y-seq" } "." $nl "For more information see " { $url "https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient" } "." } ;

HELP: histogram
{ $values
{ "seq" sequence }
Expand Down
14 changes: 7 additions & 7 deletions basis/math/statistics/statistics-tests.factor
Original file line number Diff line number Diff line change
Expand Up @@ -221,17 +221,17 @@ IN: math.statistics

{ { 0 1/4 1/2 3/4 1 } } [ 5 <iota> rescale ] unit-test


{
{ 2 2 2 1 0 5 6 7 7 7 7 }
{ 3 3 3 2 1 6 7 8 8 8 8 }
} [
{ 30 30 30 20 10 40 50 60 60 60 60 } rank-values
{ 30 30 30 20 10 40 50 60 60 60 60 } rank-by-min
] unit-test

{
{ 1 0 2 3 4 }
}
[ { 3 1 4 15 92 } rank-values ] unit-test
{ { 2 1 3 4 5 } } [ { 3 1 4 15 92 } rank ] unit-test

{ { 1 1 1 4 5 6 } } [ { 1 1 1 2 3 4 } rank-by-min ] unit-test
{ { 2 2 2 4 5 6 } } [ { 1 1 1 2 3 4 } rank-by-avg ] unit-test
{ { 3 3 3 4 5 6 } } [ { 1 1 1 2 3 4 } rank-by-max ] unit-test

{ { 1 1 2 3 3 4 } }
[ { 1 2 3 3 2 3 } [ odd? ] cum-count ] unit-test
Expand Down
24 changes: 20 additions & 4 deletions basis/math/statistics/statistics.factor
Original file line number Diff line number Diff line change
Expand Up @@ -403,11 +403,27 @@ PRIVATE>
: rescale ( u -- v )
dup minmax over - [ v-n ] [ v/n ] bi* ;

: rankings ( histogram -- assoc )
sort-keys 0 swap [ rot [ + ] keep swapd ] H{ } assoc-map-as nip ;
<PRIVATE

: rankings ( histogram method: ( min max -- rank ) -- assoc )
[ sort-keys 0 swap ] dip
'[ swapd dupd + _ keep -rot ] H{ } assoc-map-as nip ; inline

: rank-by ( seq method: ( min max -- rank ) -- seq' )
[ dup histogram ] [ rankings ] bi* '[ _ at ] map ; inline

PRIVATE>

: rank-by-avg ( seq -- seq' ) [ + 1 + 2 / ] rank-by ;

: rank-by-min ( seq -- seq' ) [ drop 1 + ] rank-by ;

: rank-by-max ( seq -- seq' ) [ nip ] rank-by ;

ALIAS: rank rank-by-avg

: rank-values ( seq -- seq' )
dup histogram rankings '[ _ at ] map ;
: spearman-corr ( x-seq y-seq -- corr )
[ rank ] bi@ population-corr ;

: z-score ( seq -- n )
[ demean ] [ sample-std ] bi v/n ;
Expand Down

0 comments on commit b607962

Please sign in to comment.