For now, we are only dividing into training and testing sets. Adding a validation set will give us "one more opportunity" to test the model on unseen data!
Suggestion:
- We split into training and testing sets.
- The training set we split again (into K folds) and find the best hyper-parameters using these splits.
- As soon as we have the best model here (and the K accuracies collected from the grid/random search), we test it once more on the never-seen test set. That gives us one more accuracy number.
- In the end, we have 10 accuracy numbers for the model tuning, and one for the unseen data (see the sketch below).
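For context, here is a minimal scikit-learn sketch of that flow. The dataset (`load_breast_cancer`), classifier, and `param_grid` are just placeholders, not the project's actual data or models:

```python
# Minimal sketch of the proposed split, assuming scikit-learn; the dataset,
# classifier, and param_grid below are placeholders for the project's own.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, GridSearchCV

X, y = load_breast_cancer(return_X_y=True)
param_grid = {"n_estimators": [50, 100]}

# Hold out a never-seen test set first.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# K-fold search (K = 10) over the training portion only.
search = GridSearchCV(RandomForestClassifier(), param_grid, cv=10)
search.fit(X_train, y_train)

# The 10 accuracy numbers from tuning: per-fold scores of the winning params...
best_idx = search.best_index_
tuning_accuracies = [search.cv_results_[f"split{i}_test_score"][best_idx]
                     for i in range(10)]

# ...plus the one accuracy number from the unseen data.
unseen_accuracy = search.best_estimator_.score(X_test, y_test)
```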
1. Split the data 80%-20%. (We discussed doing it inside `_run_single_model`, but that's not a good idea; we want all the different models to be trained on the same training and test set, so this needs to be done one layer before... Who calls this method?)
2. Call the search with k=10. (From here on, all changes are in the `_run_single_model` method.)
3. Extract the best estimator as well as its scores on the 10 folds.
4. Use the best estimator to predict results on the unseen test set.
5. Collect precision, recall, accuracy, and the confusion matrix (TP, TN, FP, FN).
6. Re-train the model on the full 100% dataset, so that we have a "production-ready" model (see the sketch after this list).
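A hedged sketch of how steps 2-6 could look inside a helper shaped like `_run_single_model`. The function signature, classifier, `param_grid`, and binary labels are illustrative assumptions, not the repository's actual API; step 1 (the 80/20 split) stays in the caller so every model sees the same train/test split:

```python
# Hedged sketch of steps 2-6 above; the function shape, param_grid, and
# binary labels are illustrative assumptions, not the project's actual API.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, confusion_matrix)


def run_single_model(model, param_grid, x_train, y_train, x_test, y_test):
    # Step 2: grid search with k=10 folds on the training data only.
    search = GridSearchCV(model, param_grid, cv=10)
    search.fit(x_train, y_train)

    # Step 3: best estimator plus its scores on the 10 folds.
    best = search.best_estimator_
    fold_scores = [
        search.cv_results_[f"split{i}_test_score"][search.best_index_]
        for i in range(10)
    ]

    # Steps 4-5: predict on the unseen test set and collect the metrics
    # (binary classification assumed for precision/recall and TP/TN/FP/FN).
    y_pred = best.predict(x_test)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
    }

    # Step 6: re-train on 100% of the data for a "production-ready" model.
    production_model = best.fit(np.concatenate([x_train, x_test]),
                                np.concatenate([y_train, y_test]))
    return production_model, fold_scores, metrics
```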