This project is written in R.
This is a course project in which groups were ranked on Kaggle. Our predictions ranked 2nd out of 6 groups.
The main analysis file is a Quarto document (.qmd). We tried two feature engineering approaches because the first did not give satisfactory results.
The models we tried include Naive Bayes, Decision Tree, KNN, LDA, Logistic Regression, SVM, Bagging, and Ranger (a random forest implementation). Ranger appeared to perform best.
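
For readers unfamiliar with Ranger, the following is a minimal sketch of fitting a random forest classifier with the `ranger` package. It is not the project's actual pipeline: the data is synthetic, and the `students` data frame, its columns (`gpa`, `credits`, `retained`), the 80/20 split, and the `num.trees` value are all assumptions chosen for illustration.

```r
# Minimal sketch: random forest classification with the ranger package.
# All data and column names below are hypothetical, not the project's data.
library(ranger)

set.seed(42)
n <- 500
students <- data.frame(
  gpa     = runif(n, 0, 4),
  credits = sample(6:18, n, replace = TRUE)
)
# Synthetic label: a higher GPA makes retention more likely.
students$retained <- factor(
  ifelse(students$gpa + rnorm(n, sd = 0.5) > 2, "yes", "no")
)

# Hold out 20% of the rows for evaluation (assumed split).
train_idx <- sample(n, size = 0.8 * n)
train <- students[train_idx, ]
test  <- students[-train_idx, ]

# Fit the forest; a factor response makes ranger do classification.
fit <- ranger(retained ~ ., data = train, num.trees = 500, seed = 42)

# Predict classes on the held-out rows and compute accuracy.
pred <- predict(fit, data = test)$predictions
accuracy <- mean(pred == test$retained)
print(accuracy)
```

The same train/test split could be reused to compare the other models listed above on equal terms.
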
Details can be found in [Student Retention Analysis Presentation.pdf]. The data dictionary can be found in /Student Retention Challenge Data/Data Dictionary/.