# Data Analysis for Origanum vulgare on Human Blood Samples
#### Date: April 18, 2018
Data Analysts:
- Engr. Michael James C. Quidilla, CE
- Ms. Queeny Ann Añora
- Ms. Sarah Anwar

Software used: R

References: 

 - Elementary Statistics, Eighth Edition, by Bluman
 - Modern Business Statistics 4th Edition by Anderson, Sweeney and Williams
 - Data Analysis Basics for Managers by Harvard Press Review

## 1.0. Objectives of the Analysis
### 1.1. General Objective

- To determine the anticoagulant activity of different concentrations and different types of extract of Origanum vulgare on human blood samples.

### 1.2. Specific Objectives

  - To determine whether there is a significant increase in the mean plasma recalcification time using 200 µg/ml of O. vulgare aqueous extract versus the positive control on human blood samples.

  - To determine whether there is a significant increase in the mean plasma recalcification time using 400 µg/ml of O. vulgare aqueous extract versus the positive control on human blood samples.

  - To determine whether there is a significant increase in the mean plasma recalcification time using 200 µg/ml of O. vulgare ethanolic extract versus the positive control on human blood samples.

  - To determine whether there is a significant increase in the mean plasma recalcification time using 400 µg/ml of O. vulgare ethanolic extract versus the positive control on human blood samples.

  - To compare the mean plasma recalcification time between the aqueous extract and the ethanolic extract of O. vulgare.

## 2.0 Data Results and Data Exploration

Step 1: Importing the Results data

In [1]:
library(readr)
Research_results <- read_csv("./Research_results.csv")
Research_results

Parsed with column specification:
cols(
  Subjects = col_character(),
  `Positive_Control_(s)` = col_integer(),
  `Negative_Control_(s)` = col_integer(),
  `Untreated_(s)` = col_integer(),
  `Aqueous_Extract_200microg/ml_(s)` = col_integer(),
  `Aqueous_Extract_400microg/ml_(s)` = col_integer(),
  `Ethanol_Extract_200microg/ml_(s)` = col_integer(),
  `Ethanol_Extract_400microg/ml_(s)` = col_integer()
)


Subjects,Positive_Control_(s),Negative_Control_(s),Untreated_(s),Aqueous_Extract_200microg/ml_(s),Aqueous_Extract_400microg/ml_(s),Ethanol_Extract_200microg/ml_(s),Ethanol_Extract_400microg/ml_(s)
A,534,160,140,485,640,600,738
B,602,230,270,583,638,622,890
C,628,333,230,640,742,690,930
D,583,268,142,526,562,633,848
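
If the CSV file is not at hand, the same table can be typed in directly. The following is a minimal sketch that assumes the values printed above are the complete data set; it keeps the column names of the import so that later cells can run unchanged.

```r
# Rebuild the results table by hand from the values printed above.
# check.names = FALSE keeps the non-syntactic column names as they are.
Research_results <- data.frame(
  Subjects = c("A", "B", "C", "D"),
  `Positive_Control_(s)` = c(534L, 602L, 628L, 583L),
  `Negative_Control_(s)` = c(160L, 230L, 333L, 268L),
  `Untreated_(s)` = c(140L, 270L, 230L, 142L),
  `Aqueous_Extract_200microg/ml_(s)` = c(485L, 583L, 640L, 526L),
  `Aqueous_Extract_400microg/ml_(s)` = c(640L, 638L, 742L, 562L),
  `Ethanol_Extract_200microg/ml_(s)` = c(600L, 622L, 690L, 633L),
  `Ethanol_Extract_400microg/ml_(s)` = c(738L, 890L, 930L, 848L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Quick structural check: 4 subjects, 8 columns
str(Research_results)
```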


Step 2: Showing the descriptive statistics summary of the Data

In [2]:
summary(Research_results)

   Subjects         Positive_Control_(s) Negative_Control_(s) Untreated_(s)  
 Length:4           Min.   :534.0        Min.   :160.0        Min.   :140.0  
 Class :character   1st Qu.:570.8        1st Qu.:212.5        1st Qu.:141.5  
 Mode  :character   Median :592.5        Median :249.0        Median :186.0  
                    Mean   :586.8        Mean   :247.8        Mean   :195.5  
                    3rd Qu.:608.5        3rd Qu.:284.2        3rd Qu.:240.0  
                    Max.   :628.0        Max.   :333.0        Max.   :270.0  
 Aqueous_Extract_200microg/ml_(s) Aqueous_Extract_400microg/ml_(s)
 Min.   :485.0                    Min.   :562.0                   
 1st Qu.:515.8                    1st Qu.:619.0                   
 Median :554.5                    Median :639.0                   
 Mean   :558.5                    Mean   :645.5                   
 3rd Qu.:597.2                    3rd Qu.:665.5                   
 Max.   :640.0                    Max.   :742.0     

As shown above, the means of the Positive Control, Aqueous Extract 200 microgram/ml, Aqueous Extract 400 microgram/ml and Ethanol Extract 200 microgram/ml populations are close to one another. A statistical test such as the t-test will determine whether there is a significant difference between two populations.
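
To put the group means side by side together with their spread, a short sketch such as the following can help. The vector and list names are illustrative (they are not part of the original notebook), and the values are copied from the results table in Step 1.

```r
# Recalcification times in seconds, copied from the results table
treatments <- list(
  Positive_Control = c(534, 602, 628, 583),
  Aqueous_200      = c(485, 583, 640, 526),
  Aqueous_400      = c(640, 638, 742, 562),
  Ethanol_200      = c(600, 622, 690, 633),
  Ethanol_400      = c(738, 890, 930, 848)
)

# Mean and sample standard deviation of each group
group_summary <- data.frame(
  mean = sapply(treatments, mean),
  sd   = sapply(treatments, sd),
  row.names = names(treatments)
)

print(round(group_summary, 2))
```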

## 3.0. Inferential Statistics

### 3.1. Comparing the sample means of two populations with sample sizes less than 30

- To compare the sample means of two populations, the analyst chose Student's t-test because the following criteria apply:

    a. Sample size is less than 30.

    b. Each population is assumed to be normally distributed.

    c. Samples are random samples.

    d. Sample data are independent of one another.
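
Criterion (b) is an assumption rather than a result. As a rough, hedged check, the Shapiro-Wilk test from base R (`shapiro.test`) can be run on each group. This is only a sketch: with four observations the test has very little power, so a large p-value does not confirm normality. The vector name `positive` is illustrative.

```r
# Positive control recalcification times (s), copied from the results table
positive <- c(534, 602, 628, 583)

# Shapiro-Wilk test; null hypothesis: the data come from a normal distribution
normality_check <- shapiro.test(positive)
print(normality_check)
```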

### 3.2. Positive Control Population VS Aqueous Extract 200 microgram/milliliter

#### 3.2.1 Hypothesis

Null Hypothesis $ H_0 $ : There is no difference between the mean of the Positive Control population and the mean of the Aqueous Extract 200 microgram/milliliter population.

Alternative Hypothesis $H_a$ : There is a difference between the mean of the Positive Control population and the mean of the Aqueous Extract 200 microgram/milliliter population.

#### 3.2.2 Assumptions

- The variances of the two populations are not assumed to be equal

- Each population is assumed to be normally distributed

- 95% confidence level ($ \alpha = 0.05 $)

In [11]:
t.test(x=Research_results$`Positive_Control_(s)`, # data for the Positive Control as x
       y=Research_results$`Aqueous_Extract_200microg/ml_(s)`, # data for the Aqueous Extract 200 as y
       var.equal = FALSE, # Assume Variance of the population is not equal
       mu = 0) # Null hypothesis M1 - M2 = 0


	Welch Two Sample t-test

data:  Research_results$`Positive_Control_(s)` and Research_results$`Aqueous_Extract_200microg/ml_(s)`
t = 0.72082, df = 4.851, p-value = 0.5042
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -73.43284 129.93284
sample estimates:
mean of x mean of y 
   586.75    558.50 
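
The Welch statistic and its degrees of freedom reported above can be reproduced by hand. The following is a minimal sketch; the variable names are illustrative and the values are copied from the results table in section 2.0.

```r
positive    <- c(534, 602, 628, 583)
aqueous_200 <- c(485, 583, 640, 526)

# Squared standard error contributed by each sample: s^2 / n
se2_positive <- var(positive) / length(positive)
se2_aqueous  <- var(aqueous_200) / length(aqueous_200)

# Welch t statistic: difference in means over the combined standard error
t_stat <- (mean(positive) - mean(aqueous_200)) / sqrt(se2_positive + se2_aqueous)

# Welch-Satterthwaite approximation of the degrees of freedom
df_welch <- (se2_positive + se2_aqueous)^2 /
  (se2_positive^2 / (length(positive) - 1) +
   se2_aqueous^2  / (length(aqueous_200) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

print(round(c(t = t_stat, df = df_welch, p = p_value), 4))
```

The three numbers should agree with the `t`, `df` and `p-value` lines printed by `t.test` above.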


Criteria for rejection of the Null Hypothesis at the 95% confidence level ($ \alpha = 0.05 $, two-tailed test):
- critical value approach: $ |t| \ge t_{\alpha/2} $
- p-value approach: $ p\text{-value} \le \alpha $

Using the p-value approach, since $ 0.5042 > 0.05 $, do not reject the Null Hypothesis.

**At the 95% confidence level, there is no significant difference between the means of the Positive Control population and the Aqueous Extract 200 microgram/milliliter population.**
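
The critical value approach leads to the same decision. As a sketch, the two-tailed critical value can be taken from `qt()` using the Welch degrees of freedom stored in the test result; the variable names below are illustrative.

```r
positive    <- c(534, 602, 628, 583)
aqueous_200 <- c(485, 583, 640, 526)

result <- t.test(positive, aqueous_200, var.equal = FALSE)

alpha  <- 0.05
# Two-tailed critical value at the Welch degrees of freedom
t_crit <- qt(1 - alpha / 2, df = unname(result$parameter))

# Reject H0 only if |t| reaches the critical value
reject_h0 <- abs(unname(result$statistic)) >= t_crit

print(c(t = unname(result$statistic), t_crit = t_crit))
print(reject_h0)
```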

### 3.3. Positive Control Population VS Aqueous Extract 400 microgram/milliliter

#### 3.3.1 Hypothesis

Null Hypothesis $ H_0 $ : There is no difference between the mean of the Positive Control population and the mean of the Aqueous Extract 400 microgram/milliliter population.

Alternative Hypothesis $H_a$ : There is a difference between the mean of the Positive Control population and the mean of the Aqueous Extract 400 microgram/milliliter population.

#### 3.3.2 Assumptions

- The variances of the two populations are not assumed to be equal

- Each population is assumed to be normally distributed

- 95% confidence level ($ \alpha = 0.05 $)

In [12]:
t.test(x=Research_results$`Positive_Control_(s)`, # data for the Positive Control as x
       y=Research_results$`Aqueous_Extract_400microg/ml_(s)`, # data for the Aqueous Extract 400 as y
       var.equal = FALSE, # Assume Variance of the population is not equal
       mu = 0) # Null hypothesis M1 - M2 = 0


	Welch Two Sample t-test

data:  Research_results$`Positive_Control_(s)` and Research_results$`Aqueous_Extract_400microg/ml_(s)`
t = -1.401, df = 4.6002, p-value = 0.2249
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -169.42501   51.92501
sample estimates:
mean of x mean of y 
   586.75    645.50 


Criteria for rejection of the Null Hypothesis at the 95% confidence level ($ \alpha = 0.05 $, two-tailed test):
- critical value approach: $ |t| \ge t_{\alpha/2} $
- p-value approach: $ p\text{-value} \le \alpha $

Using the p-value approach, since $ 0.2249 > 0.05 $, do not reject the Null Hypothesis.

**At the 95% confidence level, there is no significant difference between the means of the Positive Control population and the Aqueous Extract 400 microgram/milliliter population.**

### 3.4. Positive Control Population VS Ethanol Extract 200 microgram/milliliter

#### 3.4.1 Hypothesis

Null Hypothesis $ H_0 $ : There is no difference between the mean of the Positive Control population and the mean of the Ethanol Extract 200 microgram/milliliter population.

Alternative Hypothesis $H_a$ : There is a difference between the mean of the Positive Control population and the mean of the Ethanol Extract 200 microgram/milliliter population.

#### 3.4.2 Assumptions

- The variances of the two populations are not assumed to be equal

- Each population is assumed to be normally distributed

- 95% confidence level ($ \alpha = 0.05 $)

In [14]:
t.test(x=Research_results$`Positive_Control_(s)`, # data for the Positive Control as x
       y=Research_results$`Ethanol_Extract_200microg/ml_(s)`, # data for the Ethanol Extract 200 as y
       var.equal = FALSE, # Assume Variance of the population is not equal
       mu = 0) # Null hypothesis M1 - M2 = 0


	Welch Two Sample t-test

data:  Research_results$`Positive_Control_(s)` and Research_results$`Ethanol_Extract_200microg/ml_(s)`
t = -1.7929, df = 5.9929, p-value = 0.1232
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -117.07763   18.07763
sample estimates:
mean of x mean of y 
   586.75    636.25 


Criteria for rejection of the Null Hypothesis at the 95% confidence level ($ \alpha = 0.05 $, two-tailed test):
- critical value approach: $ |t| \ge t_{\alpha/2} $
- p-value approach: $ p\text{-value} \le \alpha $

Using the p-value approach, since $ 0.1232 > 0.05 $, do not reject the Null Hypothesis.

**At the 95% confidence level, there is no significant difference between the means of the Positive Control population and the Ethanol Extract 200 microgram/milliliter population.**

### 3.5. Positive Control Population VS Ethanol Extract 400 microgram/milliliter

#### 3.5.1 Hypothesis

Null Hypothesis $ H_0 $ : There is no difference between the mean of the Positive Control population and the mean of the Ethanol Extract 400 microgram/milliliter population.

Alternative Hypothesis $H_a$ : There is a difference between the mean of the Positive Control population and the mean of the Ethanol Extract 400 microgram/milliliter population.

#### 3.5.2 Assumptions

- The variances of the two populations are not assumed to be equal

- Each population is assumed to be normally distributed

- 95% confidence level ($ \alpha = 0.05 $)

In [15]:
t.test(x=Research_results$`Positive_Control_(s)`, # data for the Positive Control as x
       y=Research_results$`Ethanol_Extract_400microg/ml_(s)`, # data for the Ethanol Extract 400 as y
       var.equal = FALSE, # Assume Variance of the population is not equal
       mu = 0) # Null hypothesis M1 - M2 = 0


	Welch Two Sample t-test

data:  Research_results$`Positive_Control_(s)` and Research_results$`Ethanol_Extract_400microg/ml_(s)`
t = -5.7693, df = 4.3124, p-value = 0.003546
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -388.6003 -140.8997
sample estimates:
mean of x mean of y 
   586.75    851.50 


Criteria for rejection of the Null Hypothesis at the 95% confidence level ($ \alpha = 0.05 $, two-tailed test):
- critical value approach: $ |t| \ge t_{\alpha/2} $
- p-value approach: $ p\text{-value} \le \alpha $

Using the p-value approach, since $ 0.003546 < 0.05 $, reject the Null Hypothesis.

**At the 95% confidence level, there is a significant difference between the means of the Positive Control population and the Ethanol Extract 400 microgram/milliliter population.**
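
The specific objectives in section 1.2 ask about an increase in recalcification time, which corresponds to a one-sided alternative rather than the two-sided test used above. As a hedged sketch (not part of the original analysis), `t.test` accepts `alternative = "less"` to test whether the mean of `x` is below the mean of `y`. The variable names are illustrative.

```r
positive    <- c(534, 602, 628, 583)
ethanol_400 <- c(738, 890, 930, 848)

# H0: mean(positive) - mean(ethanol_400) >= 0
# Ha: mean(positive) - mean(ethanol_400) <  0  (the extract increases the time)
one_sided <- t.test(positive, ethanol_400,
                    alternative = "less",
                    var.equal = FALSE)

print(one_sided$p.value)
```

Because the observed difference is in the direction of the alternative, the one-sided p-value is half of the two-sided p-value reported above.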

## 4.0 Summary of Analysis

From the computations above, the results are summarized as follows:

1. Only the mean of the **Ethanol Extract 400 microgram/milliliter population differs significantly from the mean of the Positive Control population**.

2. The means of the other extracts do not differ significantly from the mean of the Positive Control population.
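
The four comparisons can also be collected in one table. The following is a sketch under the same settings as section 3 (Welch test, two-sided, $ \alpha = 0.05 $); the list and variable names are illustrative.

```r
positive <- c(534, 602, 628, 583)

extracts <- list(
  Aqueous_200 = c(485, 583, 640, 526),
  Aqueous_400 = c(640, 638, 742, 562),
  Ethanol_200 = c(600, 622, 690, 633),
  Ethanol_400 = c(738, 890, 930, 848)
)

# Welch two-sample t-test of each extract against the positive control
p_values <- sapply(extracts, function(y) {
  t.test(positive, y, var.equal = FALSE)$p.value
})

summary_table <- data.frame(
  p_value     = signif(p_values, 4),
  significant = p_values <= 0.05,
  row.names   = names(extracts)
)

print(summary_table)
```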

Ways to improve result:

1. **Increase the size of the sample population.** A larger sample brings the sampling distribution of the mean closer to the normal curve, so the results are likely to be more accurate. It also captures the variance of the data better. Four data points per sample population is quite low.
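
As a rough guide to how many subjects might be needed, base R's `power.t.test` can estimate a per-group sample size. This is only a sketch under stated assumptions: the observed difference between the Positive Control and Aqueous Extract 200 means is taken as the true effect, the pooled standard deviation of the two groups is taken as the true spread, 80% power is chosen arbitrarily, and `power.t.test` assumes equal variances (unlike the Welch test used above).

```r
positive    <- c(534, 602, 628, 583)
aqueous_200 <- c(485, 583, 640, 526)

# Assumed effect size and spread, both estimated from only four points per group
delta     <- abs(mean(positive) - mean(aqueous_200))
pooled_sd <- sqrt((var(positive) + var(aqueous_200)) / 2)

plan <- power.t.test(delta = delta,
                     sd = pooled_sd,
                     sig.level = 0.05,
                     power = 0.80,
                     type = "two.sample",
                     alternative = "two.sided")

# Subjects needed per group, rounded up
print(ceiling(plan$n))
```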