The goal of this project is to predict student success and failure in online courses, and to identify students at risk of failure or withdrawal, using the anonymized Open University Learning Analytics Dataset (OULAD, https://analyse.kmi.open.ac.uk/open_dataset).
For this project, we collected the partial behavioral data of students from the
Six supervised learning algorithms were used to predict the success or failure of students based on their data from the
The following tables present the accuracy and F1 score of six supervised learning models for the
Model | Accuracy | F1 Score |
---|---|---|
Logistic Regression (Lasso) | 0.743 | 0.752 |
Support Vector Machine | 0.674 | 0.741 |
Decision Tree | 0.740 | 0.748 |
Random Forests | 0.754 | 0.761 |
Naive Bayes Classifier | 0.707 | 0.721 |
Bayesian Neural Network | 0.659 | 0.709 |

Model | Accuracy | F1 Score |
---|---|---|
Logistic Regression (Lasso) | 0.783 | 0.790 |
Support Vector Machine | 0.716 | 0.769 |
Decision Tree | 0.771 | 0.779 |
Random Forests | 0.790 | 0.795 |
Naive Bayes Classifier | 0.745 | 0.759 |
Bayesian Neural Network | 0.701 | 0.763 |
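
As a reminder of how the two reported metrics relate, here is a minimal sketch that computes accuracy and F1 from raw predictions. The labels are made up for illustration and are not taken from the project; treating `1` as the "pass" class is an assumption, since the text does not say which class was scored as positive.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred, positive=1):
    """Fraction of all predictions that are correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels: 1 = pass (assumed positive class), 0 = fail/withdraw.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # accuracy = 0.700
print(f"F1       = {f1_score(y_true, y_pred):.3f}")  # F1       = 0.800
```

In this toy case the classifier over-predicts the positive class, so it recovers every true positive and F1 ends up above accuracy. That is one way a gap between the two columns can arise; whether it explains the gaps in the tables above would need the models' confusion matrices.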

Notably, the most influential feature across these intervals is the average assessment score, which suggests that a student's average assessment score is a reliable indicator of whether they will pass or fail the module by the final period. The timing of assignment submissions is another significant predictor: submitting before the deadline is associated with a higher probability of passing the module.
These features align with educational psychology theories that emphasize the importance of timely feedback and the positive effects of continuous assessment on student performance. Engagement metrics, such as the number of attempts on the learning platform, also contribute valuable insights into student success rates. This underscores the multifaceted nature of learning analytics and the potential of machine learning models to synthesize complex data into actionable predictions.
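
The features discussed above (average assessment score, submission timing, and platform engagement) could be derived along the following lines. This is only a sketch under stated assumptions, not the project's actual pipeline: `build_features` and `cutoff_day` are hypothetical names, the field names are modelled on the OULAD tables and should be checked against the dataset, the records are invented, and summed clicks stand in for the engagement metric.

```python
from collections import defaultdict
from statistics import mean


def build_features(submissions, deadlines, clicks, cutoff_day):
    """Aggregate per-student features from records dated up to cutoff_day.

    Dates are days relative to the module start. A positive
    avg_days_early means work was submitted before the deadline.
    """
    scores = defaultdict(list)
    days_early = defaultdict(list)
    for row in submissions:
        if row["date_submitted"] > cutoff_day:
            continue  # ignore behaviour after the observation window
        sid = row["id_student"]
        scores[sid].append(row["score"])
        days_early[sid].append(
            deadlines[row["id_assessment"]] - row["date_submitted"]
        )

    total_clicks = defaultdict(int)
    for row in clicks:
        if row["date"] <= cutoff_day:
            total_clicks[row["id_student"]] += row["sum_click"]

    features = {}
    for sid in sorted(set(scores) | set(total_clicks)):
        has_scores = sid in scores
        features[sid] = {
            "avg_score": mean(scores[sid]) if has_scores else None,
            "avg_days_early": mean(days_early[sid]) if has_scores else None,
            "total_clicks": total_clicks.get(sid, 0),
        }
    return features


# Invented records for illustration only.
deadlines = {1: 19, 2: 54}  # id_assessment -> deadline day
submissions = [
    {"id_student": 101, "id_assessment": 1, "date_submitted": 17, "score": 80},
    {"id_student": 101, "id_assessment": 2, "date_submitted": 50, "score": 70},
    {"id_student": 102, "id_assessment": 1, "date_submitted": 22, "score": 40},
    {"id_student": 102, "id_assessment": 2, "date_submitted": 60, "score": 55},
]
clicks = [
    {"id_student": 101, "date": 3, "sum_click": 12},
    {"id_student": 101, "date": 40, "sum_click": 8},
    {"id_student": 102, "date": 5, "sum_click": 4},
    {"id_student": 102, "date": 70, "sum_click": 9},
    {"id_student": 103, "date": 10, "sum_click": 2},
]

features = build_features(submissions, deadlines, clicks, cutoff_day=55)
for sid, row in features.items():
    print(sid, row)
```

Restricting every aggregate to `cutoff_day` keeps the features limited to behaviour observed up to a given point, which matters when the aim is to flag at-risk students before the module ends. Students with no submissions in the window get `None` for the assessment features so the downstream model can decide how to handle them.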