# Student's *t*-test variants
Student's *t*-test is used to test the null hypothesis that the means of two samples (A and B) are equal (the two-sample version). Given this null hypothesis, there are three possible alternative hypotheses:
*   $H_1$: The means of groups A and B differ ($\mu_A \neq \mu_B$).
*   $H_2$: The mean in group A is higher than in group B ($\mu_A > \mu_B$).
*   $H_3$: The mean in group A is lower than in group B ($\mu_A < \mu_B$).
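
As a rough illustration of how the three alternatives map onto tail probabilities of the *t* distribution, the sketch below uses a made-up statistic `t_stat` and degrees of freedom `df` (both hypothetical values, chosen only for this example) together with base R's `pt`.

```r
# Hypothetical values, chosen only for illustration.
t_stat <- 1.8
df <- 20

# H1 (two-sided): probability mass in both tails beyond |t_stat|.
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

# H2 (greater): upper tail only.
p_greater <- pt(t_stat, df, lower.tail = FALSE)

# H3 (less): lower tail only.
p_less <- pt(t_stat, df)

print(c(two.sided = p_two_sided, greater = p_greater, less = p_less))
```

The two one-sided p-values always sum to 1, so a statistic that supports one direction is necessarily weak evidence for the other.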

There is also a one-sample version of Student's *t*-test. It tests the null hypothesis that the population mean equals a predetermined expected value $\mu_0$. The alternative hypotheses take the same form as above, with $\mu_0$ in place of the second group mean.
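
For reference, the one-sample statistic can be written as follows, where $\bar{X}$ is the sample mean, $s$ the sample standard deviation and $n$ the sample size (notation introduced here for illustration). Under the null hypothesis, and assuming normally distributed data, it follows a *t* distribution with $n - 1$ degrees of freedom:

$$
T = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \sim t_{n-1}
$$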

Student's *t*-test is a parametric test, which means that the following assumptions must hold:
*   the data come from a normal distribution (in both group A and group B),
*   the variances of the groups are homogeneous (for the unpaired two-sample version).









## Checking normality and homogeneity of variance

In [1]:
check.normal <- function(X){
    # TRUE when the Shapiro-Wilk test does not reject normality at the 5% level.
    shapiro.test(X)$p.value >= 0.05
}

check.var <- function(X, Y){
    # TRUE when the F test does not reject equality of variances at the 5% level.
    var.test(X, Y)$p.value >= 0.05
}
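
To give a feel for how these two checks behave, here is a small sketch that calls `shapiro.test` and `var.test` directly. The "samples" are built from theoretical quantiles rather than random draws, purely so that the example is deterministic; the variable names are made up for illustration.

```r
# Deterministic "samples" built from theoretical quantiles (no random draws).
normal_like <- qnorm(ppoints(50))   # close to an ideal normal sample
skewed      <- qexp(ppoints(50))    # strongly right-skewed

# Shapiro-Wilk: a small p-value is evidence against normality.
p_normal <- shapiro.test(normal_like)$p.value
p_skewed <- shapiro.test(skewed)$p.value

# F test: a small p-value is evidence against equal variances.
p_same_var <- var.test(normal_like, normal_like + 1)$p.value  # shifted only
p_diff_var <- var.test(normal_like, 5 * normal_like)$p.value  # 25 times the variance

print(c(p_normal = p_normal, p_skewed = p_skewed))
print(c(p_same_var = p_same_var, p_diff_var = p_diff_var))
```

Note that a p-value above 0.05 only means the test found no evidence against the assumption; it does not prove that the assumption holds.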

## One-sample *t*-test

In [2]:
t.one <- function(X, mu0, alternative){
    cat('One sample T-Test (mu0 =)', mu0, '\n')
    n <- length(X)
    mean.X <- mean(X)
    sd.X <- sd(X)
    # Under H0 the statistic follows a t distribution with n - 1 degrees of freedom.
    T.stat <- sqrt(n) * (mean.X - mu0) / sd.X
    if(alternative == 'two.sided'){
        p.value <- 2 * pt(abs(T.stat), n - 1, lower.tail = FALSE)
    }
    else if(alternative == 'greater'){
        p.value <- pt(T.stat, n - 1, lower.tail = FALSE)
    }
    else if(alternative == 'less'){
        # Lower tail directly; avoids the precision loss of 1 - (upper tail).
        p.value <- pt(T.stat, n - 1)
    }
    else{
        stop('Incorrect value "alternative".')
    }
    result <- list(T.value = T.stat,
                   p.value = p.value,
                   mean.X = mean.X)
    return(result)
}
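
One way to sanity-check a hand-computed statistic like the one above is to compare it with R's built-in `t.test`. The sketch below does this for the two-sided case; the data vector `x` and the value `mu0` are made up for illustration.

```r
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 4.7, 5.5)  # made-up measurements
mu0 <- 5                                         # hypothetical expected value

# Manual computation, mirroring the formula used in the function above.
n <- length(x)
t_manual <- sqrt(n) * (mean(x) - mu0) / sd(x)
p_manual <- 2 * pt(abs(t_manual), n - 1, lower.tail = FALSE)

# Reference result from base R.
ref <- t.test(x, mu = mu0, alternative = "two.sided")

print(c(manual = t_manual, reference = unname(ref$statistic)))
print(c(manual = p_manual, reference = ref$p.value))
```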

## Unpaired two-sample *t*-test

In [3]:
t.two.unpaired <- function(X, Y, alternative){
    cat("Unpaired two-sample t-test\n")
    n <- length(X)
    m <- length(Y)
    mean.X <- mean(X)
    mean.Y <- mean(Y)
    var.X <- var(X)
    var.Y <- var(Y)
    df <- n + m - 2
    # Pooled-variance statistic, rearranged so the scaling factor is computed once.
    z.znorm <- sqrt((n * m * df) / (n + m))
    T.stat <- (mean.X - mean.Y) / sqrt((n - 1) * var.X + (m - 1) * var.Y) * z.znorm
    if(alternative == 'two.sided'){
        p.value <- 2 * pt(abs(T.stat), df, lower.tail = FALSE)
    }
    else if(alternative == 'greater'){
        p.value <- pt(T.stat, df, lower.tail = FALSE)
    }
    else if(alternative == 'less'){
        # Lower tail directly; avoids the precision loss of 1 - (upper tail).
        p.value <- pt(T.stat, df)
    }
    else{
        stop('Incorrect value "alternative".')
    }
    result <- list(T.value = T.stat,
                   p.value = p.value)
    return(result)
}
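
The statistic in `t.two.unpaired` is an algebraic rearrangement of the more familiar pooled-standard-deviation form. The sketch below, using made-up data, computes both forms and compares them with `t.test(..., var.equal = TRUE)` as a cross-check.

```r
x <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.3)        # made-up group A
y <- c(10.8, 11.9, 11.2, 10.5, 11.6, 11.0, 11.3)  # made-up group B
n <- length(x)
m <- length(y)
df <- n + m - 2

# Textbook form: difference in means over pooled SD times sqrt(1/n + 1/m).
s_pooled <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / df)
t_pooled <- (mean(x) - mean(y)) / (s_pooled * sqrt(1 / n + 1 / m))

# Rearranged form, as written in t.two.unpaired.
t_rearranged <- (mean(x) - mean(y)) /
    sqrt((n - 1) * var(x) + (m - 1) * var(y)) * sqrt(n * m * df / (n + m))

# Reference result from base R with equal variances assumed.
ref <- t.test(x, y, var.equal = TRUE)

print(c(pooled = t_pooled, rearranged = t_rearranged,
        reference = unname(ref$statistic)))
```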

## Paired two-sample *t*-test

In [4]:
t.two.paired <- function(X, Y, alternative){
    n <- length(X)
    m <- length(Y)
    if(n != m){
        stop("The lengths of the vectors (groups) X and Y are not equal.")
    }
    cat('Paired two-sample t-test.\n')
    # A paired test is a one-sample test on the pairwise differences against 0.
    D <- X - Y
    mean.D <- mean(D)
    sd.D <- sd(D)
    T.stat <- sqrt(n) * mean.D / sd.D
    if(alternative == 'two.sided'){
        p.value <- 2 * pt(abs(T.stat), n - 1, lower.tail = FALSE)
    }
    else if(alternative == 'greater'){
        p.value <- pt(T.stat, n - 1, lower.tail = FALSE)
    }
    else if(alternative == 'less'){
        # Lower tail directly; avoids the precision loss of 1 - (upper tail).
        p.value <- pt(T.stat, n - 1)
    }
    else{
        stop('Incorrect value "alternative".')
    }
    result <- list(T.value = T.stat,
                   p.value = p.value)
    return(result)
}
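
Because the paired statistic depends only on the differences `X - Y`, a paired test should agree with a one-sample test of those differences against zero. The following sketch checks this with base R's `t.test` on made-up before/after measurements.

```r
before <- c(72, 68, 75, 80, 66, 71, 78)  # made-up paired measurements
after  <- c(70, 69, 71, 76, 65, 70, 73)

# Paired two-sample test.
paired <- t.test(before, after, paired = TRUE)

# One-sample test on the pairwise differences, with mu = 0.
one_sample <- t.test(before - after, mu = 0)

print(c(paired = unname(paired$statistic),
        one.sample = unname(one_sample$statistic)))
```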

## Student's *t*-test (aggregate function)

In [5]:
t_test <- function(X, Y, mu0 = 0, test = 'unpaired', alternative = 'two.sided'){
    stopifnot(is.numeric(X))
    if(missing(Y)){
        # Only one sample supplied: one-sample test against mu0.
        result <- t.one(X, mu0, alternative)
    }
    else if(test == 'unpaired'){
        stopifnot(is.numeric(Y))
        if(!check.normal(X) || !check.normal(Y)){
            stop('The samples do not appear to come from a normal distribution; ',
                 'consider the Mann-Whitney U test.')
        }
        if(!check.var(X, Y)){
            stop('The variances of the groups are not homogeneous; ',
                 'consider the Welch t-test.')
        }
        result <- t.two.unpaired(X, Y, alternative)
    }
    else if(test == 'paired'){
        stopifnot(is.numeric(Y))
        if(length(X) != length(Y)){
            stop('The lengths of the vectors (groups) X and Y are not equal.')
        }
        # The paired test assumes normality of the differences, not of each sample.
        if(!check.normal(X - Y)){
            stop('The differences do not appear to come from a normal distribution; ',
                 'consider the Wilcoxon signed-rank test.')
        }
        result <- t.two.paired(X, Y, alternative)
    }
    else{
        stop('Unknown test type: use "unpaired" or "paired".')
    }
    return(result)
}
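
The error messages above point to alternative tests when an assumption fails. As a minimal sketch of what those fallbacks might look like in base R (the data are made up, and treating the first eight values of `y` as pairs of `x` is done only for illustration):

```r
x <- c(4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.8)        # made-up group A
y <- c(7.9, 2.1, 9.4, 1.3, 8.8, 3.0, 10.2, 0.6, 6.7)  # made-up group B, wider spread

# Unequal variances: Welch's t-test (the default of t.test).
welch <- t.test(x, y, var.equal = FALSE)

# Non-normal unpaired samples: Mann-Whitney U (Wilcoxon rank-sum) test.
mann_whitney <- wilcox.test(x, y)

# Non-normal paired differences: Wilcoxon signed-rank test.
signed_rank <- wilcox.test(x, y[seq_along(x)], paired = TRUE)

print(c(welch = welch$p.value,
        mann.whitney = mann_whitney$p.value,
        signed.rank = signed_rank$p.value))
```

Welch's test adjusts the degrees of freedom downward instead of pooling the variances, which is why `welch$parameter` is generally not an integer.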

### Case: One-sample *t*-test

In [6]:
set.seed(7)

X <- rnorm(30)
Y <- rnorm(30)
Z <- rnorm(20)

t_test(X, mu0 = 2, alternative = "less")
t.test(X, mu = 2, alternative = "less")

One sample T-Test (mu0 =) 2


	One Sample t-test

data:  X
t = -7.8005, df = 29, p-value = 6.653e-09
alternative hypothesis: true mean is less than 2
95 percent confidence interval:
      -Inf 0.7451256
sample estimates:
mean of x 
0.3956669 


### Case: Unpaired two-sample *t*-test

In [7]:
t_test(X, Y)
t.test(X, Y, paired = FALSE, var.equal = TRUE)

Unpaired two-sample t-test


	Two Sample t-test

data:  X and Y
t = 1.4253, df = 58, p-value = 0.1594
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1453284  0.8639501
sample estimates:
 mean of x  mean of y 
0.39566687 0.03635604 


### Case: Paired two-sample *t*-test

In [8]:
t_test(X, Y, test = "paired")
t.test(X, Y, paired = TRUE)

Paired two-sample t-test.


	Paired t-test

data:  X and Y
t = 1.2769, df = 29, p-value = 0.2118
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2161853  0.9348069
sample estimates:
mean of the differences 
              0.3593108 


### Case: The lengths of the variables (groups) X and Z are not equal

In [9]:
t_test(X, Z, test = "paired")

ERROR: ignored