- Data Type
- Information
- Rows and Columns
- Numerical or Categorical
- Find Missing Data
- Find Relation between Features and Labels
- Visualize Data
- Fill or Drop Missing Data
- Encode Categorical Data
- Use K Fold Cross Validation to get a More Reliable Accuracy Estimate and Observe the Cross Validation Score
- Apply Grid Search Cross Validation to Find the Hyperparameters of a Model that give the most Accurate Predictions
- Find Best Parameters
- Evaluate the Results on Validation Set using the Best Performing Parameters
- Create more than one Model to Find the Best Performing Model for the Test Set
- Select the Final Best Performing Model and Evaluate it on the Test Set
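
The checklist above can be sketched end to end with pandas and scikit-learn. This is a minimal illustration, not a definitive implementation: the toy DataFrame, its column names (`age`, `income`, `city`, `label`), the random forest, and the parameter grid are all assumptions made for the example. `GridSearchCV` uses its cross-validation folds as the validation sets, and the held-out split is scored only once at the end.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: two numerical features, one categorical feature, a binary label.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n).astype(float),
    "income": rng.normal(50_000, 15_000, size=n),
    "city": rng.choice(["A", "B", "C"], size=n),
})
df["label"] = (df["income"] + 500 * df["age"] > 70_000).astype(int)
df.loc[rng.choice(n, size=20, replace=False), "age"] = np.nan  # inject missing values

# Data type, information, rows and columns.
print(df.dtypes)
df.info()
print(df.shape)

# Numerical or categorical features.
features = df.drop(columns="label")
numerical = features.select_dtypes(include="number").columns.tolist()
categorical = features.select_dtypes(exclude="number").columns.tolist()

# Missing data per column, and correlation of numerical features with the label.
missing = df.isna().sum()
print(missing)
print(df[numerical + ["label"]].corr()["label"])

# Fill missing numerical values and one-hot encode categorical ones inside a
# pipeline, so both steps are re-fitted on the training part of every fold.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numerical),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(random_state=0)),
])

X, y = features, df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# K-fold cross-validation score on the training data.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("CV accuracy:", cv_scores.mean())

# Grid search over a small, illustrative hyperparameter grid.
param_grid = {"clf__n_estimators": [50, 100], "clf__max_depth": [3, None]}
grid = GridSearchCV(model, param_grid, cv=5)
grid.fit(X_train, y_train)
print("Best parameters:", grid.best_params_)
print("Best CV accuracy:", grid.best_score_)

# Final evaluation of the refitted best model on the held-out test set.
test_accuracy = grid.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```

Keeping imputation and encoding inside the `Pipeline` means the statistics used to fill and encode are learned only from the training folds, which avoids leaking information from the validation or test data.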

Model | Type | Train Speed | Predict Speed | Performance |
---|---|---|---|---|
Logistic Regression | Classification | Fast | Fast | Low |
Support Vector Machine | Both | Slow | Moderate | Medium |
Multi Layer Perceptron | Both | Slow | Moderate | High |
Random Forest | Both | Moderate | Moderate | Medium |
Boosted Tree | Both | Slow | Fast | High |
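
The ratings in the table are qualitative and depend on the dataset, so it can help to measure them directly. The sketch below times training and prediction for one scikit-learn estimator per row and reports test accuracy. The synthetic dataset, the choice of `GradientBoostingClassifier` as the "Boosted Tree", and all hyperparameters are assumptions made for illustration; timings will vary by machine and data size.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic classification data (illustrative only).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One estimator per table row; settings are assumptions, not tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "Multi Layer Perceptron": MLPClassifier(max_iter=500, random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Boosted Tree": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # Time the fit call.
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    # Time the predict call on the held-out split.
    start = time.perf_counter()
    predictions = model.predict(X_test)
    predict_seconds = time.perf_counter() - start

    accuracy = accuracy_score(y_test, predictions)
    results[name] = (train_seconds, predict_seconds, accuracy)
    print(f"{name}: train {train_seconds:.3f}s, "
          f"predict {predict_seconds:.4f}s, accuracy {accuracy:.3f}")
```

A single train/test split is used here to keep the timing loop short; swapping `accuracy_score` for a cross-validation score would give a steadier comparison at the cost of longer runs.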