# Regression on Diamonds Price Dataset with SVM

The **Diamonds dataset** from Kaggle contains the physical and pricing attributes of nearly 54,000 diamonds. It is commonly used for regression analysis, feature engineering, and exploratory data analysis.

We will consider a **reduced version** of the dataset, containing 4000 samples, and without categorical features.

### Key Features:
- **Carat**: The weight of the diamond.
- **Depth**: The total depth percentage (z / mean(x, y)).
- **Table**: Width of the diamond's top as a percentage of its widest point.
- **Price**: Price in US dollars.
- **X, Y, Z**: Dimensions of the diamond in mm (length, width, depth).
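
As a quick check of the depth definition above, the following sketch recomputes the depth percentage from the dimensions of the first sample shown further down in this notebook (x = 4.43, y = 4.46, z = 2.74, listed depth 61.7). The helper name `depth_percentage` is hypothetical, and a small mismatch with the listed value is expected because the dimensions are rounded to two decimals.

```python
def depth_percentage(x, y, z):
    """Total depth percentage: 100 * z / mean(x, y)."""
    return 100.0 * z / ((x + y) / 2.0)

# Dimensions (in mm) of the first sample of the reduced dataset
depth = depth_percentage(4.43, 4.46, 2.74)
print(round(depth, 2))
```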

This dataset is useful for exploring relationships between physical attributes and pricing, and for building predictive models to estimate diamond prices based on their features.

For more information see: https://www.kaggle.com/datasets/shivam2503/diamonds.

# Overview

In this notebook you will build a complete machine learning pipeline for a regression task. First, you will:
- split the data into training, validation, and test;
- standardize the data.

You will then be asked to learn various SVM models, in particular:
- for each of the kernels *linear*, *poly*, *rbf*, and *sigmoid*, you will learn the best model, choosing among some fixed values of the considered hyperparameters. In particular, the choice of hyperparameters must be done with **5-fold cross-validation**, as we have seen in the labs.

Then, from the models trained with the best hyperparameters selected as above, you will:
- choose the best kernel, using a validation approach (not cross-validation), and
- learn the best SVM model overall.

Furthermore, you will then be asked to estimate the generalization error of the best SVM model you report. 

At the end, just for comparison, you will also be asked to learn a standard linear regression model (with squared loss), and estimate its generalization error.

### IMPORTANT
- Note that in each of the above steps you will have to choose the appropriate split of the data (see the first bullet point above);
- The code should run without modifications even if the best choice of hyperparameters changes; for example, you should not pass the best hyperparameter values "manually" (i.e., as literal input parameters to the models). The only exception is the TO DO titled 'ANSWER THE FOLLOWING'.
- $\texttt{epsilon}$ parameter: For SVM, since the values to be predicted are all in the thousands of dollars, you will need to always set $\texttt{epsilon} = 100$
- Do not change the printing instructions (other than adding the correct variable name for your code), and do not add printing instructions!
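
To see why `epsilon` matters at this price scale: SVR uses an epsilon-insensitive loss, which ignores residuals smaller than `epsilon` and penalizes larger ones linearly. The sketch below is only an illustration of that loss (the function name `epsilon_insensitive_loss` is made up here and is not part of scikit-learn); it compares scikit-learn's default value of 0.1 with the value 100 used in this notebook.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# A 60-dollar error: ignored with epsilon = 100, penalized with the default 0.1
loss_wide = epsilon_insensitive_loss(5000.0, 5060.0, epsilon=100)
loss_narrow = epsilon_insensitive_loss(5000.0, 5060.0, epsilon=0.1)

# A 250-dollar error: only the part exceeding the tube is penalized
loss_large = epsilon_insensitive_loss(5000.0, 5250.0, epsilon=100)

print(loss_wide, loss_narrow, loss_large)
```

With prices in the thousands, a tube of width 0.1 would treat almost every sample as an error, so nearly all training points would become support vectors.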

## TO DO - INSERT YOUR NUMERO DI MATRICOLA BELOW

In [1]:
# -- put here your ID number (numero di matricola)
numero_di_matricola = 2095665 # COMPLETE

The following code loads all required packages.

In [2]:
# -- import all packages needed
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn import svm
from sklearn import model_selection
from sklearn import linear_model
from sklearn.model_selection import KFold
from itertools import product

The code below loads the data and removes samples with missing values. It also prints the number of samples and a brief description of our dataset.

In [3]:
# -- load the data - do not change the path below!
df = pd.read_csv('diamonds.csv', sep = ',')

# -- remove the data samples with missing values (NaN)
df = df.dropna()
# -- let's drop the column containing the id of the data
df = df.drop(columns=['Unnamed: 0'])

In [4]:
print('Dataset shape:', df.shape)
# -- description of dataset
print(df.describe())

Dataset shape: (4000, 7)
             carat        depth       table         price            x  \
count  4000.000000  4000.000000  4000.00000   4000.000000  4000.000000   
mean      0.797945    61.776925    57.44035   3920.239250     5.735810   
std       0.462251     1.468899     2.26052   3935.292841     1.106897   
min       0.210000    52.200000    52.00000    339.000000     0.000000   
25%       0.400000    61.100000    56.00000    936.000000     4.720000   
50%       0.710000    61.900000    57.00000   2468.000000     5.710000   
75%       1.050000    62.500000    59.00000   5297.500000     6.550000   
max       3.010000    70.600000    79.00000  18730.000000     9.100000   

                 y            z  
count  4000.000000  4000.000000  
mean      5.736307     3.540002  
std       1.099129     0.691834  
min       0.000000     0.000000  
25%       4.730000     2.910000  
50%       5.730000     3.540000  
75%       6.550000     4.040000  
max       8.970000     5.670000  


In [5]:
print('First 5 samples of the dataset:\n\n', df.head(5))

First 5 samples of the dataset:

    carat  depth  table  price     x     y     z
0   0.33   61.7   55.0    564  4.43  4.46  2.74
1   1.20   62.1   57.0   5914  6.78  6.71  4.19
2   0.62   61.0   57.0   2562  5.51  5.54  3.37
3   0.34   63.1   56.0    537  4.41  4.46  2.80
4   1.20   62.5   55.0   5964  6.77  6.84  4.25


In the following cell, we convert our (pandas) dataframe into the set X (containing our features) and the set Y (containing our target, i.e., the price).

In [6]:
m = df.shape[0]

# -- let's compute X and Y sets
X = df.drop(columns=['price'])
Y = df['price']

print("Total number of samples:", m)

X = X.values
Y = Y.values

# -- print shapes
print('X shape: ', X.shape)
print('Y shape: ', Y.shape)

Total number of samples: 4000
X shape:  (4000, 6)
Y shape:  (4000,)


# Data preprocessing

## TO DO - SPLIT DATA INTO TRAINING, VALIDATION, AND TESTING, WITH THE FOLLOWING PERCENTAGES: 60%, 20%, 20%

Use the $\texttt{train\_test\_split}$ function from sklearn.model_selection to do it; in every call fix $\texttt{random\_state}$ to your numero_di_matricola. 
At the end, you should store the data in the following variables:
- X_train, Y_train: training data;
- X_val, Y_val: validation data;
- X_train_val, Y_train_val: training and validation data;
- X_test, Y_test: test data.

The code then prints the number of samples in X_train, X_val, X_train_val, and X_test.

**IMPORTANT:**
- first split the data into training+validation and test; the first part of the data in output from $\texttt{train\_test\_split}$ must correspond to the training+validation;
- then split training+validation into training and validation; the first part of the data in output from $\texttt{train\_test\_split}$ must correspond to the training


In [7]:
# -- split the data into training + validation and test
X_train_val, X_test, Y_train_val, Y_test = train_test_split(X, Y, test_size=0.2, random_state=numero_di_matricola)

# -- split the training + validation data into training and validation
X_train, X_val, Y_train, Y_val = train_test_split(X_train_val, Y_train_val, test_size=0.25, random_state=numero_di_matricola) # 0.25 x 0.8 = 0.2

print("Training size:", X_train.shape[0])
print("Validation size:", X_val.shape[0])
print("Training and validation size:", X_train_val.shape[0])
print("Test size:", X_test.shape[0])

Training size: 2400
Validation size: 800
Training and validation size: 3200
Test size: 800


## TO DO - STANDARDIZE THE DATA

Standardize the data using the $\texttt{preprocessing.StandardScaler}$ from scikit learn.

If V is the name of the variable storing part of the data, the corresponding standardized version should be stored in V_scaled. For example, the scaled version of X_train should be stored in X_train_scaled.

In [8]:
# -- Create a StandardScaler object and fit it on the training set
scaler = preprocessing.StandardScaler().fit(X_train)

# -- Apply the scaler on the training and validation sets
X_train_scaled = scaler.transform(X_train)
X_val_scaled = scaler.transform(X_val)
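
The scaler above is fitted on the training set only, and the same statistics are then reused for the validation set. As a hedged illustration of what that means, the sketch below reproduces the transformation by hand on synthetic data (the arrays `X_tr` and `X_va` are stand-ins, not the diamonds data) and checks that it matches `StandardScaler`, which uses the population standard deviation.

```python
import numpy as np
from sklearn import preprocessing

rng = np.random.default_rng(0)
# Synthetic stand-ins for X_train and X_val (illustrative only)
X_tr = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
X_va = rng.normal(loc=5.0, scale=2.0, size=(40, 3))

# Statistics come from the training data only
mu = X_tr.mean(axis=0)
sigma = X_tr.std(axis=0)  # population std (ddof=0), as StandardScaler uses

scaler_demo = preprocessing.StandardScaler().fit(X_tr)
manual_val = (X_va - mu) / sigma

# The manual transform and the fitted scaler should agree on the validation data
same = np.allclose(scaler_demo.transform(X_va), manual_val)
print(same)
```

Note that the scaled validation data will generally not have exactly zero mean and unit variance, since the statistics were estimated on a different sample.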

# SVM models: learning the best model for each kernel

The following function, i.e., $\texttt{k\_fold\_cross\_validation}$, performs $k$-fold cross-validation (with $k$ = 5 by default). Look carefully at its signature: it takes as input the sets X and Y, the number of folds, and variable-length keyword arguments, which hold the candidate values of the parameters used to train the SVM model in the cross-validation phase. If you are not familiar with this notation, look up kwargs in the Python documentation.

In the first lines of the function, the unpacked parameters (i.e., the input parameter $\texttt{param\_grid}$) are converted into a Python list by means of a Cartesian product. The resulting list (i.e., $\texttt{param\_list}$) is the one you need to iterate over to perform $k$-fold cross-validation, using the $\texttt{KFold}$ object from scikit-learn.

At the end, note that you need to return $\texttt{best\_param}$, that is the best set of parameters you found with the cross-validation procedure. 
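
To make the keyword-argument mechanism concrete, here is a minimal standalone sketch of the grid expansion step. The function name `expand_grid` is chosen for illustration only; the logic mirrors the first lines of the function described above.

```python
from itertools import product

def expand_grid(**param_grid):
    """Turn keyword arguments holding candidate lists into a list of dicts."""
    keys = list(param_grid.keys())
    values = list(param_grid.values())
    # One dictionary per element of the Cartesian product of the value lists
    return [dict(zip(keys, combo)) for combo in product(*values)]

grid = expand_grid(kernel=['rbf'], gamma=[0.01, 0.05], C=[1, 10])
for params in grid:
    print(params)
```

Each dictionary can be unpacked directly into the model constructor, e.g. `svm.SVR(**params)`. The last keyword varies fastest, so the order of the list follows the order in which the keywords are passed.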

In [9]:
def k_fold_cross_validation(X, Y, num_folds = 5, **param_grid):

    # -- grid of hyperparams into list
    param_keys = list(param_grid.keys())
    param_values = list(param_grid.values())

    # Generate Cartesian product of values
    combinations = product(*param_values)

    # Create a list of dictionaries from combinations
    param_list = [dict(zip(param_keys, combination)) for combination in combinations]

    best_param = None
    kf = KFold(n_splits=num_folds)
    err_train_kfold = np.zeros(len(param_list),)
    err_val_kfold = np.zeros(len(param_list),)

    for i, param in enumerate(param_list):
        model = svm.SVR(**param)  # create a model with the current parameter

        for train_index, val_index in kf.split(X):

            # -- split the data into training and validation according to the current fold
            X_train_kfold, X_val_kfold = X[train_index], X[val_index]
            Y_train_kfold, Y_val_kfold = Y[train_index], Y[val_index]

            # -- standardize the data
            scaler_kfold = preprocessing.StandardScaler().fit(X_train_kfold)
            X_train_kfold_scaled = scaler_kfold.transform(X_train_kfold)
            X_val_kfold_scaled = scaler_kfold.transform(X_val_kfold)

            # -- model training
            model.fit(X_train_kfold_scaled, Y_train_kfold)

            # -- (1 - R^2) as error
            err_train_kfold[i] += (1 - model.score(X_train_kfold_scaled, Y_train_kfold))
            err_val_kfold[i] += (1 - model.score(X_val_kfold_scaled, Y_val_kfold))

    # -- compute the mean for each parameter
    err_train_kfold /= num_folds
    err_val_kfold /= num_folds

    # -- find the best parameter (the one that minimizes the error on the validation set)
    best_param = param_list[np.argmin(err_val_kfold)]

    return best_param
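
As an optional cross-check (not the method required by this notebook), scikit-learn's `GridSearchCV` combined with a `Pipeline` performs a comparable search: the pipeline refits the scaler inside every fold, and maximizing the mean $R^2$ is equivalent to minimizing the mean of $1 - R^2$. The sketch below runs on small synthetic data; `X_demo` and `y_demo` are illustrative stand-ins.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 3))                               # synthetic features
y_demo = 1000 * X_demo[:, 0] + rng.normal(scale=50, size=120)    # synthetic target

# make_pipeline names each step after its lowercased class name ('svr' here)
pipe = make_pipeline(StandardScaler(), SVR(kernel='linear', epsilon=100))
search = GridSearchCV(pipe,
                      param_grid={'svr__C': [0.1, 1, 10, 100, 1000]},
                      cv=KFold(n_splits=5),
                      scoring='r2')
search.fit(X_demo, y_demo)
print(search.best_params_)
```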

## TO DO - CHOOSE THE BEST HYPERPARAMETERS FOR LINEAR KERNEL

For the SVM, consider $\texttt{svm.SVR}$ class. We will begin by training the SVM with linear kernel. For the latter, consider the following hyperparameters and their values:

- $C: [0.1, 1, 10, 100, 1000]$

Remember that both the $\texttt{kernel}$ type and the value of $\texttt{epsilon}$ are considered as parameters to pass to the above method. Leave all other input parameters to default. 

Find the best value of the hyperparameters using 5-fold cross validation. Use the function defined above to perform the cross-validation.

Print the best value of the hyperparameters.

In [10]:
print("\nLinear SVM:")

linear_best_param = k_fold_cross_validation(X_train_val,
                                            Y_train_val,
                                            num_folds=5,
                                            kernel = ['linear'],
                                            C=[0.1, 1, 10, 100, 1000],
                                            epsilon = [100]
                                            )

print("Best value for hyperparameters: ", linear_best_param)


Linear SVM:
Best value for hyperparameters:  {'kernel': 'linear', 'C': 1000, 'epsilon': 100}


## TO DO - LEARN A MODEL WITH LINEAR KERNEL AND BEST CHOICE OF HYPERPARAMETERS

This model will be compared with the best models with other kernels using validation (not cross validation).

DO NOT PASS PARAMETERS BY HARD-CODING THEM IN THE CODE.

Print the **training score** (that is, the $R^2$ coefficient) of the model trained with the best parameters found in the cell above.

In [11]:
linearKernel_model = svm.SVR(kernel=linear_best_param['kernel'],
                            C=linear_best_param['C'],
                            epsilon=linear_best_param['epsilon']
                            )  # create a model with the current best parameter

linearKernel_model.fit(X_train_scaled, Y_train)
linear_training_score = linearKernel_model.score(X_train_scaled, Y_train)
print("Training score:", linear_training_score)

Training score: 0.8463423306879191
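
For reference, the score returned by the `score` method of `svm.SVR` is the coefficient of determination, $R^2 = 1 - SS_{res} / SS_{tot}$. The sketch below computes it by hand on toy prices (made-up values, not taken from the dataset); the function name `r2_score_manual` is illustrative.

```python
def r2_score_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_toy = [500.0, 1500.0, 3000.0, 6000.0]
mean_toy = sum(y_toy) / len(y_toy)

perfect = r2_score_manual(y_toy, y_toy)              # exact predictions
baseline = r2_score_manual(y_toy, [mean_toy] * 4)    # always predict the mean
print(perfect, baseline)
```

A model that always predicts the mean price scores 0, a perfect model scores 1, and a model worse than the mean baseline gets a negative score.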


## TO DO - CHOOSE THE BEST HYPERPARAMETERS FOR POLY KERNEL

Now, let's consider $\texttt{svm.SVR}$ with polynomial kernel. Consider the following hyperparameters and their values:
- $C: [0.1, 1, 10, 100, 1000]$
- $degree: [2, 3, 4]$

Leave all other input parameters to default. 

Find the best value of the hyperparameters using 5-fold cross validation. Use the function defined above to perform the cross-validation.

Print the best value of the hyperparameters.

In [12]:
print("\nPoly SVM")

poly_best_param = k_fold_cross_validation(X_train_val,
                                          Y_train_val,
                                          num_folds=5,
                                          kernel = ['poly'],
                                          degree = [2, 3, 4],
                                          C=[0.1, 1, 10, 100, 1000],
                                          epsilon = [100]
                                          )

print("Best value for hyperparameters: ", poly_best_param)


Poly SVM
Best value for hyperparameters:  {'kernel': 'poly', 'degree': 3, 'C': 1000, 'epsilon': 100}


## TO DO - LEARN A MODEL WITH POLY KERNEL AND BEST CHOICE OF HYPERPARAMETERS

This model will be compared with the best models with other kernels using validation (not cross validation).

DO NOT PASS PARAMETERS BY HARD-CODING THEM IN THE CODE.

Print the **training score** (that is, the $R^2$ coefficient) of the model trained with the best parameters found in the cell above.

In [13]:
poly_model = svm.SVR(kernel=poly_best_param['kernel'],
                     degree=poly_best_param['degree'],
                     C=poly_best_param['C'],
                     epsilon=poly_best_param['epsilon']
                     )  # create a model with the current best parameter

poly_model.fit(X_train_scaled, Y_train)
poly_training_score = poly_model.score(X_train_scaled, Y_train)
print("Training score:", poly_training_score)

Training score: 0.7045297720322465


## TO DO - CHOOSE THE BEST HYPERPARAMETERS FOR RBF KERNEL

Consider $\texttt{svm.SVR}$ with RBF kernel. Consider the following hyperparameters and their values:
- $C: [0.1, 1, 10, 100, 1000]$
- $gamma: [0.01, 0.03, 0.04, 0.05]$

Leave all other input parameters to default. 

Find the best value of the hyperparameters using 5-fold cross validation. Use the function defined above to perform the cross-validation.

Print the best value of the hyperparameters.

In [14]:
print("\nRBF SVM")

rbf_best_param = k_fold_cross_validation(X_train_val,
                                         Y_train_val,
                                         num_folds=5,
                                         kernel = ['rbf'],
                                         gamma=[0.01, 0.03, 0.04, 0.05],
                                         C=[0.1, 1, 10, 100, 1000],
                                         epsilon = [100]
                                         )

print("Best value for hyperparameters: ", rbf_best_param)


RBF SVM
Best value for hyperparameters:  {'kernel': 'rbf', 'gamma': 0.05, 'C': 1000, 'epsilon': 100}


## TO DO - LEARN A MODEL WITH RBF KERNEL AND BEST CHOICE OF HYPERPARAMETERS

This model will be compared with the best models with other kernels using validation (not cross validation).

DO NOT PASS PARAMETERS BY HARD-CODING THEM IN THE CODE.

Print the **training score** (that is, the $R^2$ coefficient) of the model trained with the best parameters found in the cell above.

In [15]:
rbf_model = svm.SVR(kernel=rbf_best_param['kernel'],
                    gamma=rbf_best_param['gamma'],
                    C=rbf_best_param['C'],
                    epsilon=rbf_best_param['epsilon']
                    )  # create a model with the current best parameter

rbf_model.fit(X_train_scaled, Y_train)
rbf_training_score = rbf_model.score(X_train_scaled, Y_train)
print("Training score:", rbf_training_score)

Training score: 0.8658074662664135


## TO DO - CHOOSE THE BEST HYPERPARAMETERS FOR SIGMOID KERNEL

Consider $\texttt{svm.SVR}$ with sigmoid kernel. Consider the following hyperparameters and their values:
- $C: [0.1, 1, 10, 100, 1000]$
- $gamma: [0.01, 0.05, 0.1]$
- $coef0: [0, 1]$

Leave all other input parameters to default. 

Find the best value of the hyperparameters using 5-fold cross validation. Use the function defined above to perform the cross-validation.

Print the best value of the hyperparameters.

In [16]:
print("\nSigmoid SVM")

sigmoid_best_param = k_fold_cross_validation(X_train_val,
                                             Y_train_val,
                                             num_folds=5,
                                             kernel = ['sigmoid'],
                                             gamma=[0.01, 0.05, 0.1],
                                             coef0=[0, 1],
                                             C=[0.1, 1, 10, 100, 1000],
                                             epsilon = [100]
                                             )

print("Best value for hyperparameters: ", sigmoid_best_param)


Sigmoid SVM
Best value for hyperparameters:  {'kernel': 'sigmoid', 'gamma': 0.01, 'coef0': 0, 'C': 1000, 'epsilon': 100}


## TO DO - LEARN A MODEL WITH SIGMOID KERNEL AND BEST CHOICE OF HYPERPARAMETERS

This model will be compared with the best models with other kernels using validation (not cross validation).

DO NOT PASS PARAMETERS BY HARD-CODING THEM IN THE CODE.

Print the **training score** (that is, the $R^2$ coefficient) of the model trained with the best parameters found in the cell above.

In [17]:
sigmoid_model = svm.SVR(kernel=sigmoid_best_param['kernel'],
                        gamma=sigmoid_best_param['gamma'],
                        coef0=sigmoid_best_param['coef0'],
                        C=sigmoid_best_param['C'],
                        epsilon=sigmoid_best_param['epsilon']
                        )  # create a model with the current best parameter

sigmoid_model.fit(X_train_scaled, Y_train)
sigmoid_training_score = sigmoid_model.score(X_train_scaled, Y_train)
print("Training score:", sigmoid_training_score)

Training score: 0.7845478483700403


## TO DO - USE VALIDATION TO CHOOSE THE BEST MODEL AMONG THE ONES LEARNED FOR THE VARIOUS KERNELS

Use validation to choose the best model among the four ones (one for each kernel) you have learned above.

Print, following exactly the order described here, with 1 value for each line:
- the validation score of SVM with linear kernel (the template below does not include such print)
- the validation score of SVM with polynomial kernel (the template below does not include such print)
- the validation score of SVM with rbf kernel (the template below does not include such print)
- the validation score of SVM with sigmoid kernel (the template below does not include such print)
- the best kernel (e.g., sigmoid) 
- the validation score of the best kernel 

For the first 4 prints, use the format: "*kernel* validation score: ". For example, for linear kernel "linear validation score: ", for rbf "rbf validation score: "

In [18]:
print("\nVALIDATION TO CHOOSE SVM KERNEL:")

# -- Validation score for each kernel

# -- Linear Kernel
linear_val_score = linearKernel_model.score(X_val_scaled, Y_val)
print("linear validation score:", linear_val_score)

# -- Poly Kernel
poly_val_score = poly_model.score(X_val_scaled, Y_val)
print("poly validation score:", poly_val_score)

# -- RBF Kernel
rbf_val_score = rbf_model.score(X_val_scaled, Y_val)
print("rbf validation score:", rbf_val_score)

# -- Sigmoid Kernel
sigmoid_val_score = sigmoid_model.score(X_val_scaled, Y_val)
print("sigmoid validation score:", sigmoid_val_score)

# -- Best kernel
max_val_score = max(linear_val_score, poly_val_score, rbf_val_score, sigmoid_val_score)

# -- Save the hyperparameters of the best kernel in best_hyperparam for later use
if max_val_score == linear_val_score:
    best_hyperparam = linear_best_param
elif max_val_score == poly_val_score:
    best_hyperparam = poly_best_param
elif max_val_score == rbf_val_score:
    best_hyperparam = rbf_best_param
elif max_val_score == sigmoid_val_score:
    best_hyperparam = sigmoid_best_param
else:
    best_hyperparam = None

# -- Using best_hyperparam to get the best kernel
best_kernel = best_hyperparam['kernel']
print("\n---\nBest kernel: ", best_kernel)
print("Validation score of best kernel: ", max_val_score)


VALIDATION TO CHOOSE SVM KERNEL:
linear validation score: 0.8368843457712244
poly validation score: 0.7409124543450272
rbf validation score: 0.8640033989740028
sigmoid validation score: 0.7807720338438254

---
Best kernel:  rbf
Validation score of best kernel:  0.8640033989740028
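
The if/elif chain in the cell above can also be expressed as a dictionary lookup, which avoids comparing floats for equality. A minimal sketch follows; the scores are copied from the output above only to keep the example self-contained (in the notebook the dictionary would hold the score variables), and the names `val_scores` and `best_kernel_demo` are chosen for illustration.

```python
# Validation scores copied from the output above (illustration only)
val_scores = {
    'linear': 0.8368843457712244,
    'poly': 0.7409124543450272,
    'rbf': 0.8640033989740028,
    'sigmoid': 0.7807720338438254,
}

# The key whose value is largest is the best kernel
best_kernel_demo = max(val_scores, key=val_scores.get)
print(best_kernel_demo, val_scores[best_kernel_demo])
```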


## TO DO - LEARN THE FINAL MODEL FOR WHICH YOU WANT TO ESTIMATE THE GENERALIZATION SCORE

Learn the final model (i.e., the one you would use to make predictions about future data).

Print the **final model hyperparameters** and the **score** of the model on the data used to learn it.

In [19]:
print("\nBEST MODEL:")

# -- Select the model of the best kernel; it is refit below on training + validation data
# and later used to estimate the generalization score
if best_kernel == 'linear':
    best_model = linearKernel_model
elif best_kernel == 'poly':
    best_model = poly_model
elif best_kernel == 'rbf':
    best_model = rbf_model
elif best_kernel == 'sigmoid':
    best_model = sigmoid_model
else:
    best_model = None
    best_model_training_score = None

### IMPORTANT! ###
# The final model is trained on the training + validation data, not only on the training set

# -- Standardize the data using statistics from the training + validation set
scaler = preprocessing.StandardScaler().fit(X_train_val)
X_train_val_scaled = scaler.transform(X_train_val)
X_test_scaled = scaler.transform(X_test)

# -- Train the best model on the training + validation set
best_model.fit(X_train_val_scaled, Y_train_val)
best_model_training_score = best_model.score(X_train_val_scaled, Y_train_val)

print("Best model hyperparameters:", best_hyperparam)
print("Score of the best model on the data used to learn it: ", best_model_training_score)


BEST MODEL:
Best model hyperparameters: {'kernel': 'rbf', 'gamma': 0.05, 'C': 1000, 'epsilon': 100}
Score of the best model on the data used to learn it:  0.8668471192888929


## TO DO - PRINT THE ESTIMATE OF THE GENERALIZATION SCORE FOR THE FINAL MODEL

Print the estimate of the generalization **score** for the final model. The generalization "score" is the score computed on the data used to estimate the generalization error.

In [20]:
print("\nGENERALIZATION SCORE BEST MODEL:")

# -- Generalization score of the best model
best_model_gen_score = best_model.score(X_test_scaled, Y_test)

print("Estimate of the generalization score for best SVM model: ", best_model_gen_score)


GENERALIZATION SCORE BEST MODEL:
Estimate of the generalization score for best SVM model:  0.8673884659579472


## TO DO - ANSWER THE FOLLOWING

Print the **training score** (score on data used to train the model) and the **generalization score** (score on data used to assess generalization) of the final SVM model THAT YOU OBTAIN WHEN YOU RUN THE CODE, one per line, printing the smallest one first. 

NOTE: THE VALUES HERE SHOULD BE HARDCODED.

Print your answer (YES/NO) to the following question: does the relation (i.e., smaller, larger) between the training score and the generalization score agree with the theory?

Print your motivation for the YES/NO answer above, using at most 500 characters.

In [21]:
print("\nANSWER")

# -- note that you may have to invert the order of the following 2 lines, print the smallest one first
# -- values hardcoded from the run above, as required by this TO DO
print("Training score: ", 0.8668471192888929)
print("Generalization score: ", 0.8673884659579472)

# -- the following is a string with your answer
motivation_yes = ("\n \n-YES \n"
"The generalization score being lower than the training score is consistent with the theory. \n"
"This occurs because the model is trained to minimize error on the training data, resulting in a high score there. \n"
"In contrast, the generalization score reflects the model's ability to perform on unseen data. \n"
"This is more challenging because here we're generalizing our model on new and unseen data, leading to a lower score. \n"
"Furthermore, the high values and the small difference between the scores suggest that the model is not overfitting and is generalizing well. \n")

motivation_no = ("\n \n-NO\n"
"In theory, the training score should be higher, as it is computed on the data used to train the model,\n"
"while the generalization score is computed on unseen data.\n"
"A higher generalization score is unusual but can be explained by random split effects.\n"
"The dataset's random division might produce a test set that better represents the data distribution than the training set,\n"
"slightly increasing the generalization score.\n"
"Repeating evaluations across multiple splits is advised to ensure consistency.")


print(motivation_no)


ANSWER
Training score:  0.8668471192888929
Generalization score:  0.8673884659579472

 
-NO
In theory, the training score should be higher, as it is computed on the data used to train the model,
while the generalization score is computed on unseen data.
A higher generalization score is unusual but can be explained by random split effects.
The dataset's random division might produce a test set that better represents the data distribution than the training set,
slightly increasing the generalization score.
Repeating evaluations across multiple splits is advised to ensure consistency.
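
The advice above about repeating the evaluation across multiple splits can be sketched as follows. This is a hedged illustration on synthetic data with a plain linear regression as a fast stand-in for the SVR model; the arrays `X_demo`, `y_demo` and the list `gaps` are assumptions of this example, not part of the assignment.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3))
y_demo = X_demo @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=1.0, size=500)

# Gap between training and test R^2 for 20 different random splits
gaps = []
for seed in range(20):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_demo, y_demo, test_size=0.2, random_state=seed)
    model = LinearRegression().fit(X_tr, y_tr)
    gaps.append(model.score(X_tr, y_tr) - model.score(X_te, y_te))

gaps = np.array(gaps)
print("mean gap:", gaps.mean())
print("splits where the test score exceeds the training score:", int((gaps < 0).sum()))
```

If the gap changes sign from one split to another, a single split in which the test score is slightly higher says little about the model itself.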


## TO DO: LEARN A STANDARD LINEAR MODEL
Learn a standard linear model using scikit learn.

Print the **score** of the model on the data used to learn it.

Print the **generalization score** of the model.

In [22]:
print("\nLR MODEL")

# -- Linear Regression model
lr_model = linear_model.LinearRegression()
lr_model.fit(X_train_val_scaled, Y_train_val)
print("Score of LR model on data used to learn it: ", lr_model.score(X_train_val_scaled, Y_train_val))
print("Generalization score of LR model: ", lr_model.score(X_test_scaled, Y_test))


LR MODEL
Score of LR model on data used to learn it:  0.8537716635083005
Generalization score of LR model:  0.8440963478424226
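
For context on what "squared loss" means here: `LinearRegression` solves an ordinary least-squares problem, which can also be solved directly with `np.linalg.lstsq`. The sketch below compares the two on synthetic data (`X_demo` and `y_demo` are illustrative stand-ins, not the diamonds data).

```python
import numpy as np
from sklearn import linear_model

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

# Append a column of ones so that the intercept is fitted as well
A = np.hstack([X_demo, np.ones((200, 1))])
w, *_ = np.linalg.lstsq(A, y_demo, rcond=None)

lr_demo = linear_model.LinearRegression().fit(X_demo, y_demo)

# Both approaches minimize the same sum of squared residuals
same_coef = np.allclose(w[:3], lr_demo.coef_)
same_intercept = np.isclose(w[3], lr_demo.intercept_)
print(same_coef, same_intercept)
```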
