# Effect of Gamification on Learning Outcomes

## Questions
Does gamification increase learning growth, as measured by the difference between the final exam score and the practice exam (pre-test) score? 
Do students who participated in gamified learning have higher final exam scores than students who did not?

In [None]:
# This R environment comes with many helpful analytics packages installed
# It is defined by the kaggle/rstats Docker image: https://github.com/kaggle/docker-rstats
# For example, here's a helpful package to load

library(tidyverse) # metapackage of all tidyverse packages
library(ggplot2) # package for data visualization
library(RColorBrewer) # package with color palettes for visualization

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

list.files(path = "../input")

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

## Load Data

In [None]:
#load data and view the first rows to ensure it was read correctly
gamification_data <- read_csv("/kaggle/input/gamification-students-grades/Students_gamification_grades.csv")
head(gamification_data)

## Prepare and Clean Data

While the data was mostly usable, a few steps were needed before further analysis:

* Remove students with missing practice exam scores, since growth could not be calculated for these students
* Calculate growth and create a new column with these values
* Create a new column that converts the binary values in the User column to a more readable string variable
* Sort the tibble by whether or not the user received a gamified learning experience, to increase readability
* Split the data by user treatment status to make running the t-tests simpler

In [None]:
#clean and prepare data for further analysis
gamification_data <- gamification_data %>%
    drop_na(Practice_Exam) %>% #filter out students with missing data
    mutate(Growth = Final_Exam - Practice_Exam, .after = Final_Exam) %>% #compute growth
    mutate(User_Group = ifelse(User == 1, 'gamified', 'control'), .after = User) %>% #create a string field that is easier for a human to understand than the binary
    arrange(User) #sort data by treatment of the user

#View() is called separately: ending the pipeline with it would assign View()'s return value, not the tibble, to gamification_data
View(gamification_data) #shows tibble in a spreadsheet-style viewer

#split tibble by user treatment
gamification_data_split <- split(gamification_data, f = gamification_data$User) 
#rename split tibbles
gamification_data_control <- gamification_data_split$`0` 
gamification_data_gamified <- gamification_data_split$`1` 

### Effect on Final Exam Scores

In [None]:
#Summary statistics for final exam 
print("Control")
summary(gamification_data_control$Final_Exam)
print("Gamified")
summary(gamification_data_gamified$Final_Exam)
#t-test to analyze whether there was a statistically significant difference between the groups
t.test(gamification_data_gamified$Final_Exam, gamification_data_control$Final_Exam, alternative='greater')
#create a box plot to visualize the differences between groups
ggplot(gamification_data, aes(x=User_Group, y=Final_Exam, fill=User_Group)) +
    geom_boxplot() +
    labs (
        title = "Effect of Gamification",
        subtitle= "on Final Exam Scores",
        y = "Final Exam Score",
        x = "User Group",
        fill = "User Group"
        ) +
    scale_fill_brewer(palette = "Accent") +
    theme_bw()


#### Final exam scores are higher for the students who participated in the gamified learning experience.

The t-test shows that the average final exam score was significantly higher for students who participated in gamified learning. The p-value is below 0.01: if gamified learning had no effect (the null hypothesis), a difference at least this large would be expected less than 1% of the time.
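
To make the p-value reading concrete, here is a minimal sketch that runs the same kind of one-sided Welch t-test on made-up scores and pulls the p-value out of the result object. The vectors `gamified_scores` and `control_scores` are illustrative assumptions, not values from the dataset.

```r
# Illustrative scores only; these are NOT values from the dataset
gamified_scores <- c(78, 85, 90, 72, 88, 95, 81, 79)
control_scores  <- c(65, 70, 82, 60, 75, 68, 73, 71)

# One-sided Welch two-sample t-test: H1 is mean(gamified) > mean(control)
result <- t.test(gamified_scores, control_scores, alternative = "greater")

# t.test() returns an "htest" list; the p-value is stored in $p.value
p_value <- result$p.value
significant_at_1pct <- p_value < 0.01

print(p_value)
print(significant_at_1pct)
```

Storing the result in a variable, rather than only printing it, makes it possible to compare `p_value` against a chosen threshold in code.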

### Effect on Growth

Summary statistics for growth were computed for both the control and treatment groups.

In [None]:
print("Control")
summary(gamification_data_control$Growth)
print("Gamified")
summary(gamification_data_gamified$Growth)
t.test(gamification_data_gamified$Growth, gamification_data_control$Growth, alternative='greater')
ggplot(gamification_data, aes(x=User_Group, y=Growth, fill=User_Group)) +
    geom_boxplot() +
    labs (
    title = "Effect of Gamification",
    subtitle= "on Growth",
    x = "User Group",
    fill = "User Group"
    ) +
    scale_fill_brewer(palette = "Accent") +
    theme_bw()

#### There is not a significant difference in growth between groups.

There are several ways this could hold even though final exam scores were higher for the gamified group. Average growth was higher for students who participated in gamified learning, but not significantly so. It could be that the gamified group started with higher practice exam scores, which would raise final scores without raising growth. Or it could be due to an unknown factor.
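
Because growth is defined as final score minus practice score, the between-group gap in mean growth equals the gap in mean final scores minus the gap in mean practice scores. The sketch below checks that identity on made-up numbers (all four vectors are illustrative assumptions, not values from the dataset) to show how a higher starting score can cancel out a higher final score.

```r
# Illustrative scores only; these are NOT values from the dataset
practice_gamified <- c(60, 70, 65, 75)
final_gamified    <- c(80, 85, 82, 90)
practice_control  <- c(50, 62, 58, 66)
final_control     <- c(68, 78, 74, 83)

# Growth is defined as final exam score minus practice exam score
growth_gamified <- final_gamified - practice_gamified
growth_control  <- final_control - practice_control

# Between-group gaps in the means
final_gap    <- mean(final_gamified) - mean(final_control)
practice_gap <- mean(practice_gamified) - mean(practice_control)
growth_gap   <- mean(growth_gamified) - mean(growth_control)

# The growth gap equals the final-score gap minus the practice-score gap
print(c(final = final_gap, practice = practice_gap, growth = growth_gap))
print(isTRUE(all.equal(growth_gap, final_gap - practice_gap)))
```

In this toy data the gamified group finishes 8.5 points higher on average but also started 8.5 points higher, so the growth gap is zero.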

### Comparing Practice Exam (Pre-test) Scores

In [None]:
print("Control")
summary(gamification_data_control$Practice_Exam)
print("Gamified")
summary(gamification_data_gamified$Practice_Exam)
#compare practice exam scores between the two groups
t.test(gamification_data_gamified$Practice_Exam, gamification_data_control$Practice_Exam, alternative='greater')
ggplot(gamification_data, aes(x=User_Group, y=Practice_Exam, fill=User_Group)) +
    geom_boxplot() +
    labs (
    title = "Practice Exam Scores",
    y = "Practice Exam Score",
    x = "User Group",
    fill = "User Group"
    ) +
    scale_fill_brewer(palette = "Accent") +
    theme_bw()

#### Practice exam scores were not different between the groups.

More exploration of other factors is needed to explain why growth was not significantly different between the groups while the final exam scores were.