Commit

#81
Adding H-Test and Documentation
&
Correction of chi-squared-distribution (PDF&CDF)
zieglerSe committed Sep 24, 2020
1 parent 60baf52 commit 09e5fcf
Showing 4 changed files with 75 additions and 25 deletions.
29 changes: 16 additions & 13 deletions HTest.fsx
@@ -14,18 +14,6 @@ let myAxisRange title range = LinearAxis.init(Title=title,Range=Range.MinMax ran
let styleChart x y chart = chart |> Chart.withX_Axis (myAxis x) |> Chart.withY_Axis (myAxis y)
let styleChartRange x y rx ry chart = chart |> Chart.withX_Axis (myAxisRange x rx) |> Chart.withY_Axis (myAxisRange y ry)

(**
#Statistical testing
FSharp.Stats provides hypothesis tests for different applications.
A hypothesis test is a statistical test that is used to determine whether there is enough evidence
in a sample of data to infer that a certain condition is true for the entire population.
A hypothesis test examines two opposing hypotheses about a population: the null hypothesis and the alternative hypothesis.
<a name="TestStatistics"></a>
##Test Statistics
<a name="Anova"></a>
##Anova
*)

open FSharp.Stats
open FSharp.Stats.Testing

@@ -56,4 +44,19 @@ let groupB = seq {45.;55.;60.;70.;72.}
let groupC = seq {18.;30.;34.;40.;44.}
let samples = seq{groupA;groupB;groupC}

// calculation of p-Value
// calculation of p-Value
HTest.createHTest samples

(*
{ Statistic = 6.72
DegreesOfFreedom = 2.0
PValueLeft = 0.9652647411
PValueRight = 0.03473525894
PValue = 0.06947051789 }
*)
// PValueRight is the p-value to compare against the chosen significance level (alpha)

(**
The implemented H-test checks the data for duplicate values.
Duplicate values lead to ties in the ranking, which are compensated for with a correction factor.
*)
50 changes: 50 additions & 0 deletions docsrc/content/Testing.fsx
@@ -325,6 +325,56 @@ let fTestFromParameters = FTest.testVariancesFromVarAndDof sampleF1 sampleF2
Using a significance level of 0.05 the sample variances do differ significantly.
*)

(**
##H-Test
The H-test, also known as the Kruskal-Wallis one-way analysis of variance by ranks, is the nonparametric equivalent of one-way ANOVA.
It compares more than two independent samples (of equal or different sizes) on the basis of their ranks rather than their means, and is therefore an extension of the Wilcoxon-Mann-Whitney two-sample test.
The H-test indicates whether the samples come from the same distribution.
A benefit of the H-test is that it does not require the samples to be normally distributed.
The downside is that it gives no information about which samples differ from each other, or how many differences occur. For further investigation a post hoc test is recommended.
The distribution of the H test statistic is approximated by a chi-square distribution with (number of groups - 1) degrees of freedom.
Prerequisites:
- random and independent samples
- observations are from populations with the same shape of distribution
- ordinal, interval, or ratio scale data (the observations must be rankable)
References:
- E. Ostertagová, Methodology and Application of the Kruskal-Wallis Test (2014)
- Y. Chan, RP Walmsley, Learning and understanding the Kruskal-Wallis one-way analysis-of-variance-by-ranks test for differences among three or more independent groups (1997)
*H-test*
*)
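
As a worked illustration of the statistic described above, here is the textbook Kruskal-Wallis form for tie-free data, applied to the groupA, groupB, and groupC values used in this file. The symbols are notation introduced here, not names from the library: $N$ is the total number of observations, $k$ the number of groups, $n_i$ the size of group $i$, and $R_i$ the sum of the pooled ranks in group $i$.

```latex
H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)

% Example data: N = 15, k = 3, n_i = 5, R_A = 44, R_B = 56, R_C = 20
H = \frac{12}{15 \cdot 16} \left( \frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5} \right) - 3 \cdot 16
  = 0.05 \cdot 1094.4 - 48
  = 6.72,
\qquad \text{degrees of freedom} = k - 1 = 2
```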

// input : seq{seq<float>}
let groupA = seq {23.;41.;54.;66.;78.}
let groupB = seq {45.;55.;60.;70.;72.}
let groupC = seq {18.;30.;34.;40.;44.}
let samples = seq{groupA;groupB;groupC}

// calculation of p-Value
HTest.createHTest samples

(*
{ Statistic = 6.72
DegreesOfFreedom = 2.0
PValueLeft = 0.9652647411
PValueRight = 0.03473525894
PValue = 0.06947051789 }
*)
// PValueRight is the p-value to compare against the chosen significance level (alpha)
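
The result above can be cross-checked without the library. The following is a minimal sketch that assumes tie-free data and non-empty groups; `kruskalWallisH` is a hypothetical helper written for this illustration and is not part of FSharp.Stats. It also uses the fact that for 2 degrees of freedom the chi-squared upper tail has the closed form exp(-h/2).

```fsharp
// Hypothetical helper (not part of FSharp.Stats): Kruskal-Wallis H for tie-free data.
let kruskalWallisH (samples: float list list) =
    // Pool all observations, remembering which group each one came from.
    let pooled =
        samples
        |> List.mapi (fun groupIndex values -> values |> List.map (fun v -> groupIndex, v))
        |> List.concat
        |> List.sortBy snd
    let totalCount = float (List.length pooled)
    // Sum of ranks per group; ranks start at 1 and assume there are no ties.
    let rankSums =
        pooled
        |> List.mapi (fun i (groupIndex, _) -> groupIndex, float (i + 1))
        |> List.groupBy fst
        |> List.map (fun (groupIndex, ranks) -> groupIndex, ranks |> List.sumBy snd)
        |> Map.ofList
    // Sum over groups of (rank sum)^2 / (group size).
    let weightedSquares =
        samples
        |> List.mapi (fun groupIndex values ->
            let r = rankSums.[groupIndex]
            r * r / float (List.length values))
        |> List.sum
    12.0 / (totalCount * (totalCount + 1.0)) * weightedSquares - 3.0 * (totalCount + 1.0)

let h =
    kruskalWallisH
        [ [23.; 41.; 54.; 66.; 78.]
          [45.; 55.; 60.; 70.; 72.]
          [18.; 30.; 34.; 40.; 44.] ]

// Closed-form upper tail of the chi-squared distribution, valid for 2 degrees of freedom only.
let pValueRight = exp (-h / 2.0)

printfn "H = %.2f, right-tailed p = %.5f" h pValueRight
// prints: H = 6.72, right-tailed p = 0.03474
```

With a significance level of 0.05, the right-tailed p-value of about 0.0347 would lead to rejecting the hypothesis that all three groups come from the same distribution.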

(**
The implemented H-test checks the data for duplicate values.
Duplicate values lead to ties in the ranking, which are compensated for with a correction factor.
*)
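
For reference, the standard Kruskal-Wallis tie correction has the form below. Whether HTest.fs implements exactly this expression is an assumption, since its body is not shown in this commit view. Here $t_j$ is the number of observations sharing the $j$-th tied value and $N$ is the total number of observations; both symbols are notation introduced for this note.

```latex
H_{\text{corrected}} = \frac{H}{1 - \dfrac{\sum_{j} \left( t_j^3 - t_j \right)}{N^3 - N}}
```

Without ties every $t_j$ equals 1, the sum vanishes, and the corrected statistic equals $H$.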


(**
<a name="ChiSquareTest"></a>
19 changes: 8 additions & 11 deletions src/FSharp.Stats/Distributions/Continuous.fs
@@ -49,16 +49,13 @@ module Continuous =
             if x < 0.0 || dof < 1. then
                 0.0
             else
-                let k = float dof * 0.5
-                let x = x * 0.5
-                if dof = 2. then
-                    exp (-1. * x)
-                else
-                    let pValue = SpecialFunctions.Gamma.lowerIncomplete k x // incGamma -> gamma lower incomplete
-                    if (isNan pValue) || (isInf pValue) || (pValue <= 1e-8) then
-                        1e-14
-                    else
-                        1.- pValue / (SpecialFunctions.Gamma.gamma k)
+                let gammaF = Gamma.gamma (dof/2.)
+                let k = 2.**(dof/2.)
+                let fraction = (1./((k)*gammaF))
+                let ex1 = (x**((dof/2.)-1.))
+                let ex2 = exp(-x/2.)
+                let pdffunction = fraction*(ex1*ex2)
+                pdffunction

         /// Computes the logarithm of probability density function.
         static member PDFLn dof x =
@@ -73,7 +70,7 @@ module Continuous =
             if dof = 0. then
                 if x > 0. then 1.
                 else 0.
-            else (Gamma.lowerIncomplete (dof /2.0) (x/2.0) )/ (Gamma.gamma (dof/2.0))
+            else Gamma.lowerIncomplete (dof/2.) (x/2.)

         /// Returns the support of the exponential distribution: [0, Positive Infinity).
         static member Support dof =
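
For orientation, the corrected PDF and CDF in the two hunks above correspond to the standard chi-squared formulas below, with $k$ standing for `dof`. The symbols $\gamma$ (lower incomplete gamma function) and $P$ (its regularized form) are notation introduced here. The new CDF line no longer divides by `Gamma.gamma`, which matches the formula only if `Gamma.lowerIncomplete` already returns the regularized function $P$; that reading is an assumption based on this change.

```latex
f(x; k) = \frac{x^{k/2 - 1} \, e^{-x/2}}{2^{k/2} \, \Gamma(k/2)}, \qquad x \ge 0

F(x; k) = P\!\left( \tfrac{k}{2}, \tfrac{x}{2} \right) = \frac{\gamma\!\left( \tfrac{k}{2}, \tfrac{x}{2} \right)}{\Gamma\!\left( \tfrac{k}{2} \right)}
```

For $k = 2$ the density reduces to $f(x; 2) = \tfrac{1}{2} e^{-x/2}$, whereas the removed `dof = 2.` branch returned $e^{-x/2}$, which is the upper-tail probability $1 - F(x; 2)$ rather than the density.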
2 changes: 1 addition & 1 deletion src/FSharp.Stats/HTest.fs
@@ -6,7 +6,7 @@ module HTest =
     open FSharp.Stats
     // H-test / one-way ANOVA of ranks
     // input : seq{seq<float>}
-    let htest (samples : seq<#seq<float>>) =
+    let createHTest (samples : seq<#seq<float>>) =
         // calculating n for each group
         let n = Seq.map Seq.length samples |> Seq.map float
