# ML with Ridge Regression (8 models)

In this notebook, we use the functions in the file ridge_regression.py. This time, we train one model on each of the 8 data sets and check whether the predictions improve.

In [6]:
# Useful starting lines
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
%load_ext autoreload
%autoreload 2
from IPython import display
# Import everything in the functions folder
from functions.costs import *
from functions.helpers import *
from functions.split import *
from functions.ridge_regression import *
from functions.least_squares_GD import *

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Jet 0 (with mass)

In [20]:
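# NOTE: this path points to the "wout_mass" file, while this section and the
# test file loaded below refer to the "with mass" set; check the path before running.
DATA_TRAIN_PATH = 'data/train_jet_0_wout_mass.csv'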
y_0_m, tX_0_m, ids_0_m = load_csv_data(DATA_TRAIN_PATH)
#tX_0_m, _, _ = standardize(tX_0_m)

We run a 5-fold cross-validation to find the best lambda and the best degree.
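The search over degrees and powers of lambda can be sketched as follows. This is a minimal illustration, not the `cross_validation` function from the functions folder: the names `build_k_indices` and `grid_search_cv` are hypothetical, the ridge penalty is assumed to be scaled by 2N, the labels are assumed to be -1/1, and the digit-by-digit refinement of lambda that shows up in the log is left out.

```python
import numpy as np

def build_k_indices(n_samples, k_fold, seed=1):
    """Shuffle the row indices and cut them into k_fold equal-sized groups."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    fold_size = n_samples // k_fold
    return [indices[k * fold_size:(k + 1) * fold_size] for k in range(k_fold)]

def grid_search_cv(y, x, lambda_powers, degrees, k_fold=5):
    """Return (lowest mean error rate, its degree, its lambda) over the grid."""
    folds = build_k_indices(len(y), k_fold)
    best = (np.inf, None, None)
    for degree in degrees:
        # Polynomial expansion: a column of ones, then x, x**2, ..., x**degree
        tx = np.hstack([np.ones((len(y), 1))] + [x ** d for d in range(1, degree + 1)])
        for power in lambda_powers:
            lambda_ = 10.0 ** power  # float base, so negative integer powers work
            errors = []
            for k in range(k_fold):
                test_idx = folds[k]
                train_idx = np.concatenate([folds[j] for j in range(k_fold) if j != k])
                # Ridge fit on the k-1 training folds (assumed 2N scaling of the penalty)
                n, d = tx[train_idx].shape
                a = tx[train_idx].T @ tx[train_idx] + 2 * n * lambda_ * np.eye(d)
                w = np.linalg.solve(a, tx[train_idx].T @ y[train_idx])
                # Fraction of wrong predictions on the held-out fold
                pred = np.where(tx[test_idx] @ w >= 0, 1, -1)
                errors.append(np.mean(pred != y[test_idx]))
            mean_error = float(np.mean(errors))
            if mean_error < best[0]:
                best = (mean_error, degree, lambda_)
    return best
```

Each (degree, lambda) pair is scored by the mean fraction of wrong predictions over the held-out folds, and the pair with the lowest score is kept.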

In [21]:
degrees = np.arange(1, 10) 
deg_lambdas = np.arange(-10, 5)

min_loss_0_m, degree_star_0_m, lambda_star_0_m = cross_validation(y_0_m, tX_0_m, 
                                            deg_lambdas, degrees, k_fold=5, digits=3)

print("Min Loss = %f"%min_loss_0_m)
print("Lambda* = %10.3e"%lambda_star_0_m)
print("Degree* = %i"%degree_star_0_m)    

Start the 5-fold Cross Validation!
Start degree 1
  Start for digit 1
    Power of lambda: -10
    Power of lambda: -9
    Power of lambda: -8
    Power of lambda: -7
    Power of lambda: -6
    Power of lambda: -5
    Power of lambda: -4
    Power of lambda: -3
    Power of lambda: -2
    Power of lambda: -1
    Power of lambda: 0
    Power of lambda: 1
    Power of lambda: 2
    Power of lambda: 3
    Power of lambda: 4
  Start for digit 2
  Start for digit 3
Finished Degree 1. Best lambda is  1.000e-10 with percentage wrong pred 0.058576
--------------------
Start degree 2
  Start for digit 1
    Power of lambda: -10
    Power of lambda: -9
    Power of lambda: -8
    Power of lambda: -7
    Power of lambda: -6
    Power of lambda: -5
    Power of lambda: -4
    Power of lambda: -3
    Power of lambda: -2
    Power of lambda: -1
    Power of lambda: 0
    Power of lambda: 1
    Power of lambda: 2
    Power of lambda: 3
    Power of lambda: 4
  Start for digit 2
  Start for digit 3
...

In [22]:
# Hard-coded values, to avoid rerunning the cross-validation
#lambda_star_0_m = 6.5e-1
#degree_star_0_m = 7

We split the data into a training part and a test part (ratio = 0.8) to check the quality of the prediction.
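A split of this kind can be sketched as below. The function name `split_data_sketch` and the `seed` argument are hypothetical, and the shuffling is an assumption: the actual `split_data` in the functions folder may behave differently. The return order mirrors the call used in this notebook.

```python
import numpy as np

def split_data_sketch(x, y, ratio, seed=1):
    """Shuffle the rows, then send the first `ratio` fraction to the training set."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))
    n_train = int(np.floor(ratio * len(y)))
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    # Features and labels are indexed with the same permutation, so rows stay aligned
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]
```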

In [23]:
ratio = 0.8
x_train_0, y_train_0, x_test_0, y_test_0 = split_data(tX_0_m, y_0_m, ratio)

Now that we have the best degree and the best lambda, we can run the ridge regression and get the best weights.
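For reference, ridge regression has a closed-form solution, which the sketch below illustrates together with a polynomial feature expansion. Both functions are hypothetical stand-ins for `build_poly` and `ridge_regression`: the column order of the expansion, the 2N scaling of the penalty, and the mean-squared-error loss are assumptions, and the code in the functions folder may differ.

```python
import numpy as np

def build_poly_sketch(x, degree):
    """Column of ones followed by x, x**2, ..., x**degree for every feature."""
    powers = [x ** d for d in range(1, degree + 1)]
    return np.hstack([np.ones((x.shape[0], 1))] + powers)

def ridge_regression_sketch(y, tx, lambda_):
    """Solve (X^T X + 2 N lambda I) w = X^T y and return (loss, w)."""
    n, d = tx.shape
    gram = tx.T @ tx + 2 * n * lambda_ * np.eye(d)
    # Solving the linear system is more stable than inverting the matrix
    w = np.linalg.solve(gram, tx.T @ y)
    residual = y - tx @ w
    loss = residual @ residual / (2 * n)  # mean squared error with a 1/2 factor
    return loss, w
```

A larger lambda shrinks the weights toward zero, which is what limits overfitting at high polynomial degrees.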

In [24]:
# Build poly first
tX_train_0 = build_poly(x_train_0, degree_star_0_m)
tX_test_0 = build_poly(x_test_0, degree_star_0_m)
print("Polynomials done")

# Ridge Regression
loss_0, w_star_0 = ridge_regression(y_train_0, tX_train_0, lambda_star_0_m)
print("Loss = %f"%(loss_0))

Polynomials done
Loss = 0.420310


In [25]:
prediction(y_test_0, tX_test_0, w_star_0)

Good prediction: 4970/5225 (95.119617%)
Wrong prediction: 255/5225 (4.880383%)
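Counts like the ones above can be obtained by thresholding the linear scores and comparing them with the true labels. The sketch below assumes labels in {-1, 1} and a threshold at 0; the name `count_predictions` is hypothetical and the actual `prediction` helper may be implemented differently.

```python
import numpy as np

def count_predictions(y_true, tx, w):
    """Return (good, wrong, total) for a linear model thresholded at 0."""
    scores = tx @ w
    y_pred = np.where(scores >= 0, 1, -1)
    good = int(np.sum(y_pred == y_true))
    total = len(y_true)
    return good, total - good, total

# Tiny usage example with made-up numbers
tx_demo = np.array([[1.0, 2.0], [1.0, -3.0], [1.0, 0.5]])
w_demo = np.array([0.0, 1.0])
y_demo = np.array([1, -1, -1])
good, wrong, total = count_predictions(y_demo, tx_demo, w_demo)
print("Good prediction: %d/%d (%f%%)" % (good, total, 100 * good / total))
print("Wrong prediction: %d/%d (%f%%)" % (wrong, total, 100 * wrong / total))
```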


Retrain on all the training data.

In [None]:
tX_poly_0_m = build_poly(tX_0_m, degree_star_0_m)
loss_0_m, w_star_0_m = ridge_regression(y_0_m, tX_poly_0_m, lambda_star_0_m)
print("Loss = %f"%(loss_0_m))

Load the test data and predict.

In [None]:
DATA_TEST_PATH = 'data/test_jet_0_with_mass.csv' # TODO: download test data and supply path here
_, tX_test_0_m, ids_test_0_m = load_csv_data(DATA_TEST_PATH)
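# NOTE: the training features were not standardized (the call is commented out
# above); the same preprocessing should be applied to both sets.
tX_test_0_m, _, _ = standardize(tX_test_0_m)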
tX_test_poly_0_m = build_poly(tX_test_0_m, degree_star_0_m)

y_pred_0_m = predict_labels(w_star_0_m, tX_test_poly_0_m)