This repository applies two statistical techniques from the General Linear Model (GLM) family: Multiple Linear Regression and Two Way ANOVA. We use these methods to explore and make inferences about two data sets of interest:
- kmL Data Set: this data set is used for our Two Way Analysis of Variance test and is about how different drivers and car types influence the fuel consumption/efficiency.
- surg Data Set: this data set is used for our Multiple Linear Regression and is about the survival time of patients following a specific liver surgery, along with several variables describing those patients.
Languages Used: R and LaTeX.
Packages Used: ggplot2, dplyr, GGally, knitr.
Format/Structure of Analysis: The PDF document has 2 distinct parts, a multiple regression analysis followed by a Two Way ANOVA, each with its own subsections:
- Exploratory data analysis + checking assumptions for a multiple regression model
- Creating an initial multiple regression model
- Performing backward model selection (i.e. variable selection) to create a final model
- Validating the final model
- Applying a log transformation to the response variable
- Exploratory data analysis + checking assumptions for a Two Way ANOVA
- Performing a Two Way ANOVA
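The steps above can be sketched roughly as follows in base R. This is a minimal illustration, not the code from the PDF: it uses the built-in `mtcars` and `warpbreaks` data sets as stand-ins for surg and kmL, the choice of predictors is arbitrary, and `step()` selects by AIC, which may differ from the selection criterion used in the actual analysis.

```r
# --- Part 1: multiple linear regression (mtcars is a stand-in for surg) ---

# Initial model with several candidate predictors (chosen for illustration only)
full_model <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)

# Backward selection: step() removes one predictor at a time
# for as long as doing so lowers the AIC
final_model <- step(full_model, direction = "backward", trace = 0)
summary(final_model)

# One simple validation check: normality of the residuals
shapiro.test(residuals(final_model))

# Log-transform the response and refit with the same selected predictors
log_model <- update(final_model, log(mpg) ~ .)
summary(log_model)

# --- Part 2: two way ANOVA (warpbreaks is a stand-in for kmL) ---

# Two factors (wool, tension) plus their interaction
anova_model <- aov(breaks ~ wool * tension, data = warpbreaks)
summary(anova_model)
```

In the ANOVA table, the `wool:tension` row tests the interaction between the two factors; if it is significant, the main effects should be interpreted with care.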
If you have any questions about the topics in this repo, feel free to reach out (contact details in profile).