# Task 2: Does the test environment (independent variable), i.e. lab (ID=1) or home (ID=0), have a significant influence on speech quality ratings (dependent variable)? Use the quality ratings of condition 3 provided in the file (lab_crowd_speech_quality).

### Step 1: Include libraries

In [1]:
# install.packages('dplyr')      # processing 
# install.packages('readxl')     # file reading
# install.packages('effsize')    # Cohen's d
# install.packages('car')        # homogeneity of variances

In [2]:
library(dplyr)     # processing
library(readxl)    # reading in data
library(effsize)   # Cohen's d
library(car)       # homogeneity of variances

"package 'dplyr' was built under R version 3.6.2"
Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

"package 'car' was built under R version 3.6.2"Loading required package: carData

Attaching package: 'car'

The following object is masked from 'package:dplyr':

    recode



### Step 2: Load data sets

In [3]:
# read in separate data sets for both environments (Environment: 0 = home, 1 = lab)
get_data_for_environment <- function(home=TRUE) {
    data <- read_excel("datasets/lab_crowd_speech_quality.xlsx")
    
    if(home) {
        data <- data %>% filter(Environment == 0) %>% mutate(Quality_Home = Quality) %>% select(Quality_Home)
    } else {
        data <- data %>% filter(Environment == 1) %>% mutate(Quality_Lab = Quality) %>% select(Quality_Lab)
    }
    
    # return the filtered ratings explicitly instead of relying on the invisible value of the if/else
    data
}

home_data <- get_data_for_environment(home=TRUE)
lab_data <- get_data_for_environment(home=FALSE)
head(home_data)
head(lab_data)

Quality_Home
4
3
4
4
4
3


Quality_Lab
3
3
3
3
3
2


### Step 3: Decide on which test to use for the research question

### The ratings in the two environments were most likely given by different people, so the two groups of ratings can be assumed to be independent of each other => use the Independent Samples t-test!

### Step 4: Check if conditions for Independent Samples t-test are met

#### 1.) Dependent variable [= Quality] measured at least at interval / ratio level => OK, provided the Quality ratings are treated as interval-scaled (the recorded values are whole numbers, so they are not strictly continuous)
#### 2.) No significant outliers => Don't know, need to check that in the next step!
#### 3.) Independent variable [= Environment: lab or home] should consist of two categorical, independent groups => OK, as lab & home are categorical and one can assume independence between lab and home raters [as they are most likely different people]
#### 4.) Independence of observations => OK, as lab and home ratings were given individually without any known source of dependency
#### 5.) Dependent variable [= Quality] approximately normally distributed in both groups of the independent variable => Don't know, need to check that in the next step!
#### 6.) Homogeneity of variances => Don't know, need to check that in the next step!

### Step 5: Check the requirements not yet confirmed => no significant outliers in the data set, ratings of both groups approximately normally distributed & homogeneity of variances

In [4]:
# Check for outliers using Z-Score method
get_significant_outliers <- function(data, home=TRUE) {
    
    if(home) {
        data %>% 
            mutate(Std_Dev_Quality_Home = sd(Quality_Home), 
                   Mean_Quality_Home = mean(Quality_Home)) %>%
            mutate(Z_Score_Home = (Quality_Home - Mean_Quality_Home) / Std_Dev_Quality_Home) %>%
            select(Quality_Home, Z_Score_Home) %>% 
            filter(abs(Z_Score_Home) > 3.29)
    } else {
        data %>% 
            mutate(Std_Dev_Quality_Lab = sd(Quality_Lab), 
                   Mean_Quality_Lab = mean(Quality_Lab)) %>%
            mutate(Z_Score_Lab = (Quality_Lab - Mean_Quality_Lab) / Std_Dev_Quality_Lab) %>%
            select(Quality_Lab, Z_Score_Lab) %>% 
            filter(abs(Z_Score_Lab) > 3.29)
    }
}

get_significant_outliers(home_data, home=TRUE)
get_significant_outliers(lab_data, home=FALSE)

Quality_Home,Z_Score_Home


Quality_Lab,Z_Score_Lab


#### No significant outliers found in either the home or the lab data => condition 2.) met!
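The z-score rule used above can be sketched for a plain numeric vector as follows. The helper name `flag_z_outliers` and the ratings are made up for illustration and are not part of the data set; the threshold of 3.29 is the one used in the cell above.

```r
# Hypothetical helper: return the values whose absolute z-score exceeds a threshold.
flag_z_outliers <- function(x, threshold = 3.29) {
    z <- (x - mean(x)) / sd(x)   # sd() is the sample standard deviation (n - 1 denominator)
    x[abs(z) > threshold]
}

# Made-up ratings on a 1-5 scale (NOT the study data)
ratings <- c(3, 4, 3, 2, 4, 3, 5, 1, 3, 4)
flag_z_outliers(ratings)         # empty result: no rating lies more than 3.29 SDs from the mean
```

An empty result, as in the two output tables above, means that no rating was flagged.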

In [5]:
# Conduct Shapiro-Wilk-Test & just to be sure also Kolmogorov-Smirnov test for normality check for home ratings
shapiro.test(home_data[['Quality_Home']])
ks.test(home_data[['Quality_Home']], "pnorm", mean=mean(home_data[['Quality_Home']]), sd=sd(home_data[['Quality_Home']]))


	Shapiro-Wilk normality test

data:  home_data[["Quality_Home"]]
W = 0.88359, p-value = 9.368e-12


"ties should not be present for the Kolmogorov-Smirnov test"


	One-sample Kolmogorov-Smirnov test

data:  home_data[["Quality_Home"]]
D = 0.24924, p-value = 6.429e-12
alternative hypothesis: two-sided


In [6]:
# Conduct Shapiro-Wilk-Test & just to be sure also Kolmogorov-Smirnov test for normality check for lab ratings
shapiro.test(lab_data[['Quality_Lab']])
ks.test(lab_data[['Quality_Lab']], "pnorm", mean=mean(lab_data[['Quality_Lab']]), sd=sd(lab_data[['Quality_Lab']]))


	Shapiro-Wilk normality test

data:  lab_data[["Quality_Lab"]]
W = 0.86034, p-value = 2.772e-12


"ties should not be present for the Kolmogorov-Smirnov test"


	One-sample Kolmogorov-Smirnov test

data:  lab_data[["Quality_Lab"]]
D = 0.25398, p-value = 3.496e-11
alternative hypothesis: two-sided


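#### As the p-values of both the Kolmogorov-Smirnov test & the Shapiro-Wilk test are < 0.05 => a normal distribution of the quality ratings CANNOT be assumed (neither in the home nor in the lab environment), hence condition 5.) is NOT met! We continue anyway: with roughly 200 ratings per group the t-test is generally considered fairly robust to this violation, but the result has to be read with that caveat in mind.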
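Because the normality condition is violated, one possible sanity check (not part of the original task) would be a rank-based test that does not assume normality. The sketch below uses base R's `wilcox.test` on simulated ratings; the vectors `home` and `lab` are made up, and only their sizes mirror the two groups.

```r
# Simulated 1-5 ratings (NOT the study data); the probabilities are arbitrary assumptions
set.seed(1)
home <- sample(1:5, size = 213, replace = TRUE, prob = c(0.05, 0.20, 0.40, 0.25, 0.10))
lab  <- sample(1:5, size = 192, replace = TRUE, prob = c(0.10, 0.25, 0.40, 0.20, 0.05))

# Wilcoxon rank-sum (Mann-Whitney U) test; exact = FALSE because the ratings contain ties
rank_test <- wilcox.test(home, lab, alternative = "two.sided", exact = FALSE)
rank_test$p.value
```

If such a test on the real ratings pointed in the same direction as the t-test, that would support continuing despite the failed normality check.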

In [7]:
# Check for homogeneity of the groups' rating variances (Levene's test)
get_levene_test_results <- function() {
    entire_data <- read_excel("datasets/lab_crowd_speech_quality.xlsx") %>% 
                mutate(Environment = case_when(Environment == 0 ~ "Home", Environment != 0 ~ "Lab"))

    # center = mean requests the original Levene test (car's default is the median)
    test_results <- leveneTest(Quality ~ Environment, data = entire_data, center = mean)
    
    # leveneTest returns an anova table: column 1 = Df, column 2 = F value, column 3 = Pr(>F)
    # row 1 holds the between-groups df (k - 1), row 2 the within-groups df (N - k)
    df_between <- test_results[1,1]
    df_within <- test_results[2,1]
    f_value <- test_results[1,2]
    p_value <- test_results[1,3]
    
    result <- paste0('F(', df_between, ', ', df_within, ') = ', 
               round(f_value, digits=3), 
               ' | p-value = ', 
               round(p_value, digits=3))
    
    if(p_value > 0.05) {
        result <- paste0(result, ' => homogeneity of variance CAN be assumed')
    } else {
        result <- paste0(result, ' => homogeneity of variance CANNOT be assumed')
    }
    
    result
}

get_levene_test_results()

"group coerced to factor."

#### As Levene's test delivers a p-value > 0.05 => homogeneity of variance can be assumed, thus condition 6.) is met!
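To make clearer what Levene's test with `center = mean` measures: to our understanding it amounts to a one-way ANOVA on the absolute deviations of each rating from its own group mean. The sketch below reproduces that idea in base R on made-up ratings; `quality`, `environment` and `levene_table` are illustrative names, not objects from the cells above.

```r
# Made-up ratings for two groups of eight (NOT the study data)
quality <- c(4, 3, 4, 4, 4, 3, 5, 2,   3, 3, 3, 3, 3, 2, 4, 1)
environment <- factor(rep(c("Home", "Lab"), each = 8))

# Absolute deviation of every rating from the mean of its own group
abs_dev <- abs(quality - ave(quality, environment, FUN = mean))

# One-way ANOVA on these deviations: a large F means the spreads differ between groups
levene_table <- anova(lm(abs_dev ~ environment))
levene_table
```

The degrees of freedom of this F statistic are k - 1 between groups and N - k within groups, i.e. 1 and 14 for this toy example.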

### Step 6: Conduct Independent Sample t-test

#### We wish to check whether the mean quality rating in the lab environment differs from the mean quality rating in the home environment. Let m denote the difference between the two population means:

#### H0: m = 0, H1: m != 0
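For reference, the test statistic of the Independent Samples t-test with equal variances (which is what `var.equal = TRUE` requests) can be written as below. The symbols are notation introduced here for illustration: $\bar{x}$, $s$ and $n$ are the sample mean, sample standard deviation and size of each group, and $s_p$ is the pooled standard deviation.

$$
t = \frac{\bar{x}_{home} - \bar{x}_{lab}}{s_p \sqrt{\dfrac{1}{n_{home}} + \dfrac{1}{n_{lab}}}},
\qquad
s_p^2 = \frac{(n_{home} - 1)\, s_{home}^2 + (n_{lab} - 1)\, s_{lab}^2}{n_{home} + n_{lab} - 2}
$$

Under H0, $t$ follows a t-distribution with $n_{home} + n_{lab} - 2$ degrees of freedom.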

In [8]:
# Conduct Independent Sample t-test
t.test(home_data[['Quality_Home']], lab_data[['Quality_Lab']], alternative = "two.sided", var.equal = TRUE)


	Two Sample t-test

data:  home_data[["Quality_Home"]] and lab_data[["Quality_Lab"]]
t = 3.4419, df = 403, p-value = 0.0006382
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.1240392 0.4544526
sample estimates:
mean of x mean of y 
 3.117371  2.828125 


#### As p-value < 0.05 => null hypothesis rejected => H1 (true difference in means is not equal to 0) accepted
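As a quick plausibility check, the reported 95 percent confidence interval can be reconstructed from the numbers printed above: the difference of the two means, divided by the t-score, gives the standard error of the difference. The variable names are made up for this sketch; the values are copied from the t-test output.

```r
mean_home <- 3.117371   # "mean of x" from the output
mean_lab  <- 2.828125   # "mean of y" from the output
t_value   <- 3.4419
deg_free  <- 403

mean_diff <- mean_home - mean_lab
std_error <- mean_diff / t_value          # since t = difference / standard error
ci <- mean_diff + c(-1, 1) * qt(0.975, deg_free) * std_error
round(ci, 3)
```

Up to rounding, this should agree with the interval from 0.1240392 to 0.4544526 reported by `t.test`.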

### Step 7: Compute effect size

In [9]:
# Compute Cohen's d
paste0('Already implemented Cohens d function:')
cohen.d(home_data[['Quality_Home']], lab_data[['Quality_Lab']], paired=FALSE)

compute_cohens_d_alternatively <- function(home_data, lab_data) {
    
    mean_home <- mean(home_data[['Quality_Home']])
    mean_lab <- mean(lab_data[['Quality_Lab']])
    sd_home <- sd(home_data[['Quality_Home']])   # SD of the home ratings
    sd_lab <- sd(lab_data[['Quality_Lab']])      # SD of the lab ratings
    
    # unweighted average of both variances; this matches the pooled SD only for equal group sizes,
    # so the result can differ slightly from cohen.d()
    return(round((mean_home - mean_lab) / sqrt((sd_home * sd_home + sd_lab * sd_lab) / 2), digits=3))
    
}

paste0('Self-implemented Cohens d function => Effect size = ', compute_cohens_d_alternatively(home_data, lab_data))


Cohen's d

d estimate: 0.3425161 (small)
95 percent confidence interval:
    lower     upper 
0.1454580 0.5395742 

### Step 8: Interpretation of results

#### With a t-score (test statistic value) of 3.4419 at n_home + n_lab - 2 = 213 + 192 - 2 = 403 degrees of freedom, a p-value of 0.0006382 & a Cohen's d effect size of around 0.34 [= small], we were able to show at a significance level of alpha = 0.05 that the difference between the average quality ratings of the two independent samples (n_home = 213, n_lab = 192) is not equal to 0. The home ratings (mean = 3.117) are on average higher than the lab ratings (mean = 2.828), with a 95 percent confidence interval for the difference of [0.124, 0.454]. Therefore, we can answer the above stated research question with yes: the test environment has a statistically significant, albeit small, influence on the speech quality ratings. As the normality condition 5.) was not met, this conclusion relies on the robustness of the t-test for samples of this size.