This Project Has Been Confirmed As Successful By A Udacity Reviewer.
What I Did
I developed a model that uses classification to predict the likelihood that a given student will pass, which helps diagnose whether an intervention is necessary. I tested three classifiers on the data: Naive Bayes, Gradient Boosting, and Support Vector Machines. I chose the best classifier by analyzing their results, then tuned it with a grid search, using the F1 score to find the best parameters for prediction. The details of the project can be seen in the Python notebook provided in this repository.
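The workflow described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code from the notebook. The data is a synthetic stand-in generated with `make_classification`, the parameter grid is made up, and the choice of the SVM as the classifier to tune is an assumption made only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for the student data (NOT the project's dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three classifier families named above, with default settings.
candidates = {
    "naive_bayes": GaussianNB(),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "svm": SVC(random_state=0),
}

# Fit each candidate and record its F1 score on the held-out test set.
baseline_f1 = {}
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    baseline_f1[name] = f1_score(y_test, clf.predict(X_test))

# Tune one candidate with a grid search scored by F1.
# Tuning the SVM, and this grid, are assumptions for illustration only.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(SVC(random_state=0), param_grid, scoring="f1", cv=5)
search.fit(X_train, y_train)

# GridSearchCV refits the best parameter combination on the full training set.
tuned_f1 = f1_score(y_test, search.predict(X_test))

print(baseline_f1)
print(search.best_params_, tuned_f1)
```

Scoring the grid search with F1 rather than accuracy keeps the tuning objective consistent with the metric used to compare the classifiers in the first place.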
What I Learned
From this unit of the Nanodegree program and its project, I learned about the various classification algorithms used in Machine Learning. By analyzing multiple classifiers on the dataset, I came to understand their strengths and weaknesses. I reinforced my understanding by performing model fitting and data preparation, and by using the F1 score alongside grid search to optimize classifier parameters.
Things I learned from this project:
- The strengths and weaknesses of various Machine Learning classifiers (e.g. Naive Bayes, Decision Trees, SVM, etc.)
- General applications of multiple Machine Learning classifiers (Spam Detection, Student Intervention, etc.)
- Evaluating performance of various ML classifiers to find the best model for the situation (training time, testing time, F1 scores, etc.)
- Reinforced concepts such as model fitting, data preparation, splitting data into training & testing sets, and model tuning.
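For reference, the F1 score mentioned throughout is the harmonic mean of precision and recall. The sketch below computes it by hand for a binary problem. The function name `f1_score_binary`, the "yes"/"no" labels, and the toy predictions are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def f1_score_binary(y_true, y_pred, positive="yes"):
    """Compute the F1 score for one positive class from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: "yes" means the student passed.
y_true = ["yes", "yes", "yes", "no", "no", "yes"]
y_pred = ["yes", "no", "yes", "yes", "no", "yes"]

# 3 true positives, 1 false positive, 1 false negative.
score = f1_score_binary(y_true, y_pred)
print(score)
```

Because F1 ignores true negatives, it is less likely than plain accuracy to look good when one class dominates the data.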