## Split dataset
- If you evaluate your model on the same data you used to train it, the model could be badly overfit and you would have no way of knowing. A model should be judged on its ability to predict new, unseen data.
- Therefore, you should have separate training and test subsets of your dataset.
- **Training sets** are used to fit and tune your models. 
- **Test sets** are put aside as "unseen" data to evaluate your models.
    - You should always split your data before doing anything else.
    - This is the best way to get reliable estimates of your models’ performance.
    - After splitting your data, don’t touch your test set until you’re ready to choose your final model!
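
As a minimal sketch of the split described above, the snippet below shuffles a dataset and sets aside a fraction of it as the test set. The helper name `train_test_split`, the 20% test fraction, and the fixed seed are assumptions made for illustration; the notes do not prescribe a particular ratio or tool.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    shuffled = list(rows)                      # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(100))                        # stand-in for 100 observations
train, test = train_test_split(data, test_fraction=0.2, seed=42)
print(len(train), len(test))                   # 80 20
```

Every observation lands in exactly one of the two subsets, so nothing in the test set is ever seen during fitting or tuning.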

## Hyperparameters
- When we talk of tuning models, we specifically mean tuning hyperparameters.
- **The key distinction is that model parameters can be learned directly from the training data while hyperparameters cannot.**


- **MODEL PARAMETERS**
    - Model parameters are learned attributes that define individual models.
    - e.g. regression coefficients
    - e.g. decision tree split locations
    - They can be learned directly from the training data
- **HYPERPARAMETERS**
    - Hyperparameters express "higher-level" structural settings for algorithms.
    - e.g. strength of the penalty used in regularized regression
    - e.g. the number of trees to include in a random forest
    - They are decided before fitting the model because they can't be learned from the data
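
To make the distinction concrete, here is a small sketch using a one-feature regularized regression with no intercept. The function `fit_ridge_1d` and the toy data are hypothetical; the point is only that the coefficient `w` is computed from the data, while `penalty` has to be supplied before fitting.

```python
def fit_ridge_1d(xs, ys, penalty):
    """Fit y = w * x with an L2 penalty and return the learned coefficient w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)               # closed-form ridge solution in 1-D

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# penalty is a hyperparameter: chosen by us before fitting.
# w is a model parameter: learned from xs and ys.
for penalty in (0.0, 1.0, 10.0):
    w = fit_ridge_1d(xs, ys, penalty)
    print(f"penalty={penalty}: learned w={w:.3f}")
```

A larger penalty shrinks the learned coefficient toward zero, which is why its value has to be tuned rather than read off the training data.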

## Cross-validation
- Cross-validation is a method for getting a reliable estimate of model performance using only your training data.
- There are several ways to cross-validate. The most common one, 10-fold cross-validation, breaks your training data into 10 equal parts (a.k.a. folds), essentially creating 10 miniature train/test splits.
- These are the steps for 10-fold cross-validation:
    1. Split your data into 10 equal parts, or "folds".
    2. Train your model on 9 folds (e.g. the first 9 folds).
    3. Evaluate it on the 1 remaining "hold-out" fold.
    4. Perform steps (2) and (3) 10 times, each time holding out a different fold.
    5. Average the performance across all 10 hold-out folds.
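
The steps above can be sketched in plain Python as follows. The "model" here is deliberately trivial (it predicts the mean of the training folds) and the names `k_fold_indices` and `cross_val_mse` are assumptions for illustration. In practice the training data would usually be shuffled before being cut into folds.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, holdout_idx) pairs for k roughly equal, contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        holdout = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, holdout
        start += size

def cross_val_mse(ys, k=10):
    """Cross-validated MSE of a model that always predicts the training mean."""
    scores = []
    for train_idx, holdout_idx in k_fold_indices(len(ys), k):
        mean = sum(ys[i] for i in train_idx) / len(train_idx)          # fit on k-1 folds
        mse = sum((ys[i] - mean) ** 2 for i in holdout_idx) / len(holdout_idx)
        scores.append(mse)                                             # score the hold-out fold
    return sum(scores) / len(scores)                                   # average over all folds

ys = [float(i % 7) for i in range(50)]         # toy training targets
print(cross_val_mse(ys, k=10))
```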

## Fit and tune models
- Now that we've split our dataset into training and test sets, and we've learned about hyperparameters and cross-validation, we're ready to fit and tune our models.
- Basically, all we need to do is perform the entire cross-validation loop detailed above on each set of hyperparameter values we'd like to try.
- At the end of this process, you will have a cross-validated score for each set of hyperparameter values... for each algorithm.
- Insert Image1
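
A rough sketch of this tuning loop for a single algorithm is shown below. It reuses a one-feature regularized regression as the model, with a hypothetical grid of penalty values and synthetic data; the helper names, the grid, and the interleaved fold assignment are all assumptions for illustration, and the code assumes at least as many observations as folds.

```python
import random

def fit_ridge_1d(xs, ys, penalty):
    """Fit y = w * x with an L2 penalty and return the coefficient w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)

def cv_score(xs, ys, penalty, k=10):
    """Mean hold-out MSE across k folds for one hyperparameter value."""
    n = len(xs)
    fold_mses = []
    for fold in range(k):
        holdout = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], penalty)
        fold_mses.append(sum((ys[i] - w * xs[i]) ** 2 for i in holdout) / len(holdout))
    return sum(fold_mses) / k

# Synthetic training data: y is roughly 3 * x plus noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(100)]
ys = [3.0 * x + rng.gauss(0, 0.5) for x in xs]

grid = [0.0, 0.1, 1.0, 10.0, 100.0]            # hyperparameter values to try
scores = {penalty: cv_score(xs, ys, penalty) for penalty in grid}
best_penalty = min(scores, key=scores.get)     # lower MSE is better
print(scores)
print("best penalty:", best_penalty)
```

Repeating the same loop for each candidate algorithm yields one cross-validated score per hyperparameter setting per algorithm, all computed without touching the test set.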

## Conclusion
- There are a variety of performance metrics you could choose from. We won't spend too much time on them here, but in general:
    - For regression tasks, we recommend Mean Squared Error (MSE) or Mean Absolute Error (MAE). Lower values are better.
    - For classification tasks, we recommend Area Under ROC Curve (AUROC). Higher values are better.
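
For reference, the three metrics can be written out directly, as in the sketch below. The function names are illustrative, and the AUROC here uses the pairwise definition (the fraction of positive/negative pairs in which the positive example gets the higher score, with ties counted as half), which is simple but quadratic in the number of examples.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; penalizes large errors heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))    # (1 + 4) / 2 = 2.5
print(mean_absolute_error([3.0, 5.0], [2.0, 7.0]))   # (1 + 2) / 2 = 1.5
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))    # 3 of 4 pairs correct = 0.75
```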

**STEPS**
- Finally, use these questions to help you pick the winning model:
    - Which model had the best performance on the test set? (performance)
    - Does it perform well across various performance metrics? (robustness)
    - Did it also have (one of) the best cross-validated scores from the training set? (consistency)
    - Does it solve the original business problem? (win condition)