# Statistical hypotheses in R. Example 6
## Two-sample paired test
**Task.** A new surgical method was tested in animal models for its influence on blood leukocyte levels. Samples were taken from six subjects before and after surgery. Determine whether the surgery affects the blood leukocyte level.

Before surgery|After surgery
-|-
10.8|10.6
12.9|16.6
9.59|17.2
8.81|14
12|10.6
6.07|8.6


### 1. Define the type of the variable
The variable is a physical quantity (a concentration), so the data are numeric (continuous).

**Load the data into R**

In [1]:
Input = ("Before surgery 	After surgery
10.8 	10.6
12.9 	16.6
9.59 	17.2
8.81 	14
12 	10.6
6.07 	8.6")
# Parse the tab-separated text; read.table() already returns a data frame
Data <- read.table(text = Input, header = TRUE, sep = "\t")
Data

Before.surgery,After.surgery
10.8,10.6
12.9,16.6
9.59,17.2
8.81,14.0
12.0,10.6
6.07,8.6


### 2. Check if the samples follow a normal distribution
Let's automate the decision of whether a sample follows a normal distribution. To do so, we'll write a custom function `Shapiro.Test()`.

It takes a single argument: the sample we want to test. Inside the function body, we compare the `p-value` returned by the Shapiro-Wilk test with the significance level of 0.05. If the `p-value` exceeds the significance level, we have no evidence against normality and print a message that the sample is treated as normally distributed. Otherwise, we print a message that the sample does not follow a normal distribution.

In [2]:
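Shapiro.Test <- function(x) {
    # Run the Shapiro-Wilk normality test and show the full result
    test_result <- shapiro.test(x)
    print(test_result)
    # Compare the p-value with the 0.05 significance level
    if (test_result$p.value > 0.05) {
        print("Normal distribution")
    } else {
        print("Non-normal distribution")
    }
}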

Let's put our newly created function to use and check whether the leukocyte levels before surgery, and then after surgery, follow a normal distribution:

In [3]:
Shapiro.Test(Data$Before.surgery)


	Shapiro-Wilk normality test

data:  x
W = 0.96736, p-value = 0.8742

[1] "Normal distribution"


In [4]:
Shapiro.Test(Data$After.surgery)


	Shapiro-Wilk normality test

data:  x
W = 0.90401, p-value = 0.3982

[1] "Normal distribution"


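Neither test rejects normality (both p-values exceed 0.05), so we can treat both samples as normally distributed and use parametric methods. With only six observations per sample the Shapiro-Wilk test has low power, so this is an absence of evidence against normality rather than proof of it.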
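Strictly speaking, the paired t-test assumes that the per-subject *differences* are normally distributed, not each sample separately. The sketch below repeats the Shapiro-Wilk check on the differences. The vectors `before` and `after` are retyped from the task table, and the names `differences` and `diff_normality` are chosen here for illustration; they are not part of the original notebook.

```r
# Leukocyte levels from the task table (same six subjects, same order)
before <- c(10.8, 12.9, 9.59, 8.81, 12, 6.07)
after  <- c(10.6, 16.6, 17.2, 14, 10.6, 8.6)

# Per-subject change: the quantity a paired t-test actually analyses
differences <- after - before
print(differences)

# Shapiro-Wilk test applied to the differences
diff_normality <- shapiro.test(differences)
print(diff_normality)

# Same decision rule as in Shapiro.Test() above, at the 0.05 level
if (diff_normality$p.value > 0.05) {
    print("No evidence against normality of the differences")
} else {
    print("Differences deviate from normality")
}
```

If the differences did deviate from normality, a non-parametric paired method would be the safer choice.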

### 3. Check if the samples are paired
To decide whether the samples are paired, we take another look at the task description. The samples were taken from the *same* subjects *before* and *after* the surgery. Because each subject contributes one value to each sample, the observations form natural pairs, so the samples are paired. Hence we use the *paired t-test*.

### 4. Formulate the statistical hypotheses

**Null hypothesis (H0):** The mean blood leukocyte levels before and after the surgery are the same (the mean of the paired differences is 0)

**Alternative hypothesis (H1):** The mean blood leukocyte levels before and after the surgery are different (the mean of the paired differences is not 0)

### 5. Test the hypotheses

In [5]:
t.test(Data$Before.surgery, Data$After.surgery, paired = TRUE)


	Paired t-test

data:  Data$Before.surgery and Data$After.surgery
t = -2.1205, df = 5, p-value = 0.08745
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.426633  0.616633
sample estimates:
mean of the differences 
                 -2.905 
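
To make the output above less of a black box, the sketch below recomputes the paired t-test by hand from the per-subject differences, using the standard formula t = mean(d) / (sd(d) / sqrt(n)) with n - 1 degrees of freedom. The data are retyped from the task table, and the variable names (`before`, `after`, `d`, `se_d`, `t_stat`, `p_value`, `ci`) are chosen here for illustration; they are not part of the original notebook.

```r
# Leukocyte levels from the task table (same six subjects, same order)
before <- c(10.8, 12.9, 9.59, 8.81, 12, 6.07)
after  <- c(10.6, 16.6, 17.2, 14, 10.6, 8.6)

# Differences in the same direction as t.test(before, after, paired = TRUE)
d <- before - after
n <- length(d)

mean_d  <- mean(d)            # mean of the differences
se_d    <- sd(d) / sqrt(n)    # standard error of that mean
t_stat  <- mean_d / se_d      # t statistic
df      <- n - 1              # degrees of freedom

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95 percent confidence interval for the mean difference
ci <- mean_d + c(-1, 1) * qt(0.975, df) * se_d

print(c(t = t_stat, df = df, p.value = p_value))
print(ci)
```

If the arithmetic is right, `t_stat`, `p_value` and `ci` should agree with the values printed by `t.test()`.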


The test results show that the `p-value` (0.08745) exceeds the significance level of 0.05, so we fail to reject the null hypothesis: the data do not provide sufficient evidence for the alternative hypothesis. (Task for the reader: automate the hypothesis decision.)
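One possible solution to that exercise, as a minimal sketch in the spirit of `Shapiro.Test()`: the helper `Decide.Hypothesis()` and its `alpha` argument are hypothetical names introduced here, not part of the original notebook. It accepts any test result that carries a `p.value` element, which includes the object returned by `t.test()`.

```r
# Hypothetical helper: report the decision for a test result with a p.value
Decide.Hypothesis <- function(test_result, alpha = 0.05) {
    if (test_result$p.value < alpha) {
        decision <- "Reject H0"
    } else {
        decision <- "Fail to reject H0"
    }
    print(decision)
    invisible(decision)   # return the decision without printing it twice
}

# Leukocyte levels from the task table (same six subjects, same order)
before <- c(10.8, 12.9, 9.59, 8.81, 12, 6.07)
after  <- c(10.6, 16.6, 17.2, 14, 10.6, 8.6)

result <- t.test(before, after, paired = TRUE)
Decide.Hypothesis(result)
# [1] "Fail to reject H0"
```

The wording "fail to reject" is deliberate: a p-value above the significance level does not show that H0 is true, only that the data are insufficient to reject it.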

### Conclusion

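The test did not detect a statistically significant difference between the blood leukocyte levels of the test subjects before and after the surgery (p = 0.087; the 95 percent confidence interval for the mean difference, -6.43 to 0.62, includes 0). From these data we cannot conclude that the considered type of surgery affects the blood leukocyte level. This is not proof that the surgery has no effect, particularly with only six subjects.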