Does the gender of a student affect their final math grade?
This project uses the UCI Student Performance Data Set (https://archive.ics.uci.edu/ml/datasets/Student+Performance) to evaluate the relationship between a student's gender (Female/Male) and their final math grade. The data set contains math and Portuguese grades of high school students attending two Portuguese schools, Gabriel Pereira (GP) and Mousinho da Silveira (MS), as well as demographic, social and school-related features.
We report the results of a two-tailed hypothesis test that determines whether there is a statistically significant difference in the mean math grade between male and female students. Additionally, a visualization of the data shows the mean, confidence intervals and distribution for each sample (Female and Male).
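As a rough illustration of the kind of two-tailed test described above, here is a minimal sketch in base R. It runs on a small made-up data frame, not on values from the data set. The column names `sex` and `G3` are assumed to match the data set's fields for gender and final grade, and the choice of Welch's test (R's default) is an assumption, since the project's own script may use a different variant.

```r
# Toy data for illustration only; these are NOT values from the UCI data set.
# Assumption: `sex` holds "F"/"M" and `G3` holds the final grade.
toy <- data.frame(
  sex = rep(c("F", "M"), each = 5),
  G3  = c(10, 12, 9, 14, 11, 13, 8, 15, 12, 10)
)

# Two-tailed two-sample t-test; t.test() uses Welch's unequal-variance
# version unless var.equal = TRUE is passed.
result <- t.test(G3 ~ sex, data = toy, alternative = "two.sided")

result$estimate  # mean grade in each group
result$conf.int  # confidence interval for the difference in means
result$p.value   # two-sided p-value
```

A small p-value would indicate a difference in mean grades that is unlikely under the null hypothesis of equal means; with toy data like this, the output carries no meaning beyond showing the mechanics.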
The data analysis is carried out in four scripts saved in the ./src folder and run in the following order:
1- clean_student_perf_data.R: cleans the original data and saves the transformed data
2- explore_student_perf.R: creates a visualization of the data distribution with a violin/jitter plot
3- analysis_t-test_estimates.R: performs a t-test and calculates the estimate and confidence intervals for each sample
4- report_mean_CI.R: creates a visualization of the mean and confidence intervals for each sample
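The plotting and estimation steps listed above could look roughly like the following sketch. It is not the code from the project's scripts: the toy data frame is made up, the column names `sex` and `G3` are assumed, and the 95% t-based interval per group is one plausible way to compute the confidence intervals.

```r
library(ggplot2)

# Toy data for illustration only; these are NOT values from the UCI data set.
toy <- data.frame(
  sex = rep(c("F", "M"), each = 5),
  G3  = c(10, 12, 9, 14, 11, 13, 8, 15, 12, 10)
)

# Mean and 95% t-based confidence interval for each group (assumed method).
groups <- split(toy$G3, toy$sex)
ci_by_group <- data.frame(
  sex   = names(groups),
  mean  = sapply(groups, mean),
  lower = sapply(groups, function(x) t.test(x)$conf.int[1]),
  upper = sapply(groups, function(x) t.test(x)$conf.int[2])
)

# Violin plot with jittered points, overlaid with each group's mean and CI.
p <- ggplot(toy, aes(x = sex, y = G3)) +
  geom_violin() +
  geom_jitter(width = 0.1, alpha = 0.5) +
  geom_pointrange(
    data = ci_by_group,
    aes(y = mean, ymin = lower, ymax = upper),
    colour = "red"
  )
```

Printing `p` draws the figure. Keeping the interval computation in its own data frame means the same estimates can feed both the plot and a summary table.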
The analysis depends on the following R packages:
- tidyverse (version 1.2.1)
- ggplot2 (version 3.1.0)
You can reproduce our analysis with the following steps:
- Clone this repo
- Using the command line, navigate to the root of this project
- Run the Makefile by typing the following commands in the terminal:
| File | Commands |
|---|---|
| Makefile | `make clean`, then `make all` |
The Makefile runs the entire data analysis pipeline for our project by executing the four scripts listed above in order.
To run the analysis using Docker:
- Clone this repo
- Using the command line, navigate to the root of this project
- Type the following (replacing PATH_ON_YOUR_COMPUTER with the absolute path to the root of this project on your computer):
docker run --rm -it -e PASSWORD=stuperf -v PATH_ON_YOUR_COMPUTER:/home/ellognea-smwatts-student-performance ellognea/ellognea-smwatts-student-performance make -C /home/ellognea-smwatts-student-performance all
To clean up the analysis, type:
docker run --rm -it -e PASSWORD=stuperf -v PATH_ON_YOUR_COMPUTER:/home/ellognea-smwatts-student-performance ellognea/ellognea-smwatts-student-performance make -C /home/ellognea-smwatts-student-performance clean
The final report is saved in the student_perf_report.Rmd file, found in the ./doc folder. It presents the original data, a statistical summary, and figures.