# Model Evaluation

* split the data set into training and testing sets
* a common choice is a 70-30 split (70% for training, 30% for testing)
* after testing, all the data is used to train the final model to maximize performance
* we use `train_test_split` from `sklearn.model_selection`, which randomly splits a given dataset into train and test sets
![image.png](attachment:image.png)
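
A minimal sketch of the split described above. The synthetic data, the `LinearRegression` model and the variable names are assumptions made for illustration; only `train_test_split` comes from these notes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration): y is linear in x plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=100)

# Hold out 30% of the rows for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit on the training rows only, then score on rows the model has not seen.
model = LinearRegression().fit(X_train, y_train)
print("train R^2:", model.score(X_train, y_train))
print("test R^2:", model.score(X_test, y_test))

# Once we are happy with the evaluation, refit on all of the data.
final_model = LinearRegression().fit(X, y)
```

The test score is the one to report, since the training score is measured on data the model has already seen.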

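* we should try to minimize the "Generalization Error", which is the error we get when we apply the model (built on a training set) to data it has not seen, such as the test set.
* using most of the data for training (and little for testing) => a more accurate model, but an imprecise estimate of its performance: with few test points the score changes a lot from one random split to another
* using less of the data for training (and more for testing) => a less accurate model, but a more precise (more repeatable) estimate of its performance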

#### To overcome this, we use Cross-Validation

## Cross Validation
* out-of-sample evaluation metric
* more effective use of data
1. we split the data set into *folds*, so N folds means the data is divided into N equal-sized parts
2. then we use N-1 folds for training and 1 fold for testing, and repeat this so that each fold is used for testing exactly once
3. we then use `cross_val_score` (the score of each fold, which we average) and `cross_val_predict` (the prediction for each sample, made while it was in the held-out fold) from `sklearn.model_selection` to evaluate these models
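
The steps above can be sketched as follows. The synthetic data, the `LinearRegression` model and the choice of 4 folds are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict, cross_val_score

# Synthetic data (assumed for illustration): y is linear in x plus noise.
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(120, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=120)

model = LinearRegression()

# cv=4 means 4 folds: each fold is the test fold once while the other 3 train.
# For a regressor the default score is R^2.
scores = cross_val_score(model, X, y, cv=4)
print("R^2 per fold:", scores)
print("mean R^2:", scores.mean(), "std:", scores.std())

# For every row, the prediction made by the model that did NOT train on that row.
y_hat = cross_val_predict(model, X, y, cv=4)
print("first out-of-fold predictions:", y_hat[:5])
```

To get errors instead of R^2, `cross_val_score` also accepts a `scoring` argument such as `"neg_mean_squared_error"`.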


## How to pick the best Polynomial order?

1. When the function is too simple to fit the training data => underfitting
2. When the function is too flexible (and complex) and fits the noise more than the data points => overfitting
3. ![image.png](attachment:image.png)
4. in the plot, the left side is underfitting and the right side is overfitting
5. we pick the order that gives the best score (lowest error or highest R^2) on data that was not used for training
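
One way to compare polynomial orders is to fit each one on the training set and score it on the test set. This is only a sketch: the cubic synthetic data, the range of orders and the `PolynomialFeatures` pipeline are assumptions, not something stated in these notes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumed for illustration): a cubic curve plus noise.
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 3 - 2.0 * X[:, 0] + rng.normal(0, 1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

test_r2 = {}
for order in range(1, 9):
    # Expand x into [1, x, x^2, ..., x^order], then fit a linear model on it.
    model = make_pipeline(PolynomialFeatures(degree=order), LinearRegression())
    model.fit(X_train, y_train)
    test_r2[order] = model.score(X_test, y_test)
    print(order, "train R^2:", round(model.score(X_train, y_train), 3),
          "test R^2:", round(test_r2[order], 3))

# The order with the highest R^2 on the held-out data.
best_order = max(test_r2, key=test_r2.get)
print("best order on the test set:", best_order)
```

Order 1 should score clearly worse here (underfitting). Orders above 3 may score about the same as 3, so when scores are close the lowest such order is usually the safer pick.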


## Ridge Regression

1. to help in choosing a better, more suitable model for our data we can use hyperparameter tuning
2. one such hyperparameter is alpha, which in Ridge regression controls how strongly large coefficients are penalized
3. lower values of alpha (weak penalty) tend to overfit, while higher values reduce flexibility and tend to underfit
4. hence choosing a proper alpha can help us build a more accurate model
5. syntax :-<br>
from sklearn.linear_model import Ridge<br>
RidgeModel = Ridge(alpha=0.1)  # can be any non-negative value, here we've chosen 0.1<br>
RidgeModel.fit(X, Y)<br>
Yhat = RidgeModel.predict(X)
6. to help in hyperparameter tuning, we use a **validation set** in addition to train & test.
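
A rough sketch of choosing alpha with a validation set. The synthetic data, the 60/20/20 split and the list of candidate alphas are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic data (assumed for illustration): 10 features, linear target plus noise.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 10))
true_coef = rng.normal(size=10)
y = X @ true_coef + rng.normal(0, 0.5, size=300)

# 60% train, 20% validation, 20% test.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0
)

alphas = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
val_r2 = {}
for alpha in alphas:
    # Train on the training set, compare alphas on the validation set.
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    val_r2[alpha] = model.score(X_val, y_val)

best_alpha = max(val_r2, key=val_r2.get)

# The test set is touched only once, after alpha has been chosen.
final_model = Ridge(alpha=best_alpha).fit(X_train, y_train)
print("best alpha:", best_alpha)
print("test R^2:", final_model.score(X_test, y_test))
```

Keeping the test set out of the alpha comparison means the final score is still an honest estimate of the generalization error.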

## Grid Search

1. Grid Search helps us compare the performance of different values of hyperparameters (which are thus different models)
2. we can tune multiple hyperparameters at once by passing a dictionary in which the keys are the hyperparameter names and the values are lists of the hyperparameter values that we wish to compare
3. The process looks like this :-
* the data is split into training, validation and test sets (in sklearn's `GridSearchCV` the training/validation split is done by cross-validation on the data we pass in, so we hold out the test set ourselves)
* we train the model for each hyperparameter combination and compute the R^2 or mean squared error of each model
* we select the hyperparameters that minimize the mean squared error or maximize the R^2 on the validation data
* we finally test our model performance using the test data
* as an output we get, for each hyperparameter combination, the R^2/MSE scores on the training and validation data as well as the mean score
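
The process above might look like this with sklearn's `GridSearchCV`. The synthetic data, the `Ridge` estimator, the grid values and the use of `fit_intercept` as a second hyperparameter are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data (assumed for illustration): 10 features, linear target plus noise.
rng = np.random.default_rng(4)
X = rng.normal(size=(300, 10))
y = X @ rng.normal(size=10) + rng.normal(0, 0.5, size=300)

# Hold out the test set ourselves; the search only sees the training part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Keys are hyperparameter names, values are the candidates to compare.
param_grid = {"alpha": [0.001, 0.1, 1, 10, 100], "fit_intercept": [True, False]}

# cv=4: every combination is trained and validated on 4 different folds.
search = GridSearchCV(
    Ridge(), param_grid, cv=4, scoring="r2", return_train_score=True
)
search.fit(X_train, y_train)

print("best hyperparameters:", search.best_params_)
# In cv_results_, "test" means the validation folds, not our held-out test set.
print("mean validation R^2:", search.cv_results_["mean_test_score"])
print("mean training R^2:", search.cv_results_["mean_train_score"])

# best_estimator_ is refit on all of X_train with the best hyperparameters.
print("test R^2:", search.best_estimator_.score(X_test, y_test))
```

`search.cv_results_` is a dictionary that can be passed to `pandas.DataFrame` for easier inspection of every combination.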