20220518
This is my submission for the qualifying challenge of the hackathon for Juniors, which takes place on 20220531.
I built two ML classification models, Logistic Regression & Random Forest Classifier, as well as a classification pipeline which considered six classifiers: Logistic Regression, Random Forest Classifier, Decision Tree Classifier, KNeighbors Classifier, Gaussian NB, & SVM.
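A minimal sketch of what such a pipeline could look like, assuming scikit-learn. The synthetic dataset from `make_classification`, the `StandardScaler` step, the 5-fold cross-validation, and the `candidates` and `scores` names are all assumptions for illustration, not the actual submission code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the real challenge dataset is not shown here.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# The six classifiers named above, with default-ish settings (assumption).
candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest Classifier": RandomForestClassifier(random_state=0),
    "Decision Tree Classifier": DecisionTreeClassifier(random_state=0),
    "KNeighbors Classifier": KNeighborsClassifier(),
    "Gaussian NB": GaussianNB(),
    "SVM": SVC(),
}

scores = {}
for name, model in candidates.items():
    # Scaling lives inside the pipeline so it is re-fit on each training fold only.
    pipe = make_pipeline(StandardScaler(), model)
    # Mean F1 over 5 cross-validation folds.
    scores[name] = cross_val_score(pipe, X, y, cv=5, scoring="f1").mean()

# Print the candidates from best to worst mean F1.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean F1 = {score:.3f}")
```

Cross-validating every candidate inside the same pipeline keeps the comparison fair, since each model sees identically preprocessed folds.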
The construction of the models & the pipeline seems to be "not too bad", but I still need to figure out how to improve the quality of the predictions. The quantity of the predictions is already fairly accurate. On the train-test split of the initial dataset, accuracy & F1 scores are above 80% 💟, but when completely new data is introduced, those scores drop to 40%! 😱 A gap that large suggests the models may be overfitting to the initial dataset, or that the new data differs from it in some way.
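To make the two scores mentioned above concrete, here is a small plain-Python sketch of how accuracy & F1 are computed for binary labels. The helper names (`accuracy`, `f1_binary`) and the toy label lists are hypothetical and only illustrate the arithmetic; they are not results from the challenge.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """F1 score for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    # No positives in either the truth or the predictions: report 0.0 by convention.
    return 2 * tp / denom if denom else 0.0


# Toy labels (made up for illustration only).
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]

print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_binary(y_true, y_pred))
```

Running the same two functions on the held-out test split and on the completely new data gives directly comparable numbers, which is one way to track whether the gap is shrinking.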