#### General guidance

This notebook is a template that guides you through the implementation of this task. Read the whole
template first to get a sense of the overall structure of the code before filling in any of the TODO gaps.
This is the Jupyter notebook version of the template. For the Python file version, see `template_solution.py`.

First, we import the necessary libraries:

In [10]:
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Add any additional imports here (however, the task is solvable without using 
# any additional imports)
# import ...

#### Loading data

In [11]:
data = pd.read_csv("train.csv")
y = data["y"].to_numpy()
data = data.drop(columns=["Id", "y"])
# print a few data samples
print(data.head())
X = data.to_numpy()

     x1    x2    x3    x4    x5
0  0.02  0.05 -0.09 -0.43 -0.08
1 -0.13  0.11 -0.08 -0.29 -0.03
2  0.08  0.06 -0.07 -0.41 -0.03
3  0.02 -0.12  0.01 -0.43 -0.02
4 -0.14 -0.12 -0.08 -0.02 -0.08


#### Transform data

In [12]:
"""
Transform the 5 input features of matrix X (x_i denoting the i-th component of X) 
into 21 new features phi(X) in the following manner:
5 linear features: phi_1(X) = x_1, phi_2(X) = x_2, phi_3(X) = x_3, phi_4(X) = x_4, phi_5(X) = x_5
5 quadratic features: phi_6(X) = x_1^2, phi_7(X) = x_2^2, phi_8(X) = x_3^2, phi_9(X) = x_4^2, phi_10(X) = x_5^2
5 exponential features: phi_11(X) = exp(x_1), phi_12(X) = exp(x_2), phi_13(X) = exp(x_3), phi_14(X) = exp(x_4), phi_15(X) = exp(x_5)
5 cosine features: phi_16(X) = cos(x_1), phi_17(X) = cos(x_2), phi_18(X) = cos(x_3), phi_19(X) = cos(x_4), phi_20(X) = cos(x_5)
1 constant feature: phi_21(X)=1

Parameters
----------
X: matrix of floats, dim = (700,5), inputs with 5 features

Compute
----------
X_transformed: array of floats, dim = (700,21), transformed input with 21 features
"""
#X_transformed = np.zeros((700, 21))
# TODO: Enter your code here

# Stack the feature blocks column-wise: linear, quadratic, exponential, cosine, constant
feature_blocks = (X, X**2, np.exp(X), np.cos(X), np.ones((X.shape[0], 1)))
X_transformed = np.concatenate(feature_blocks, axis=1)

assert X_transformed.shape == (700, 21)
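
As a quick sanity check on the column layout described in the docstring, the same transformation can be wrapped in a function and applied to a tiny input. This is only a minimal sketch: the name `transform_features` and the 2-row example matrix `X_demo` are illustrative assumptions and are not part of the template.

```python
import numpy as np

def transform_features(X):
    """Map (n, 5) inputs to (n, 21) features: linear, quadratic, exp, cos, constant."""
    ones = np.ones((X.shape[0], 1))
    return np.concatenate((X, X**2, np.exp(X), np.cos(X), ones), axis=1)

# Hypothetical 2-sample input, used only to inspect the layout
X_demo = np.array([[0.0, 1.0, 2.0, 3.0, 4.0],
                   [0.5, -0.5, 0.0, 1.0, -1.0]])
Phi = transform_features(X_demo)

print(Phi.shape)     # (2, 21)
print(Phi[0, 5:10])  # quadratic block of the first row: 0, 1, 4, 9, 16
print(Phi[:, 20])    # constant feature: all ones
```

Columns 0-4 hold the linear features, 5-9 the squares, 10-14 the exponentials, 15-19 the cosines, and column 20 the constant.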

In [13]:
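def calculate_RMSE(w, X, y):
    # Root mean squared error of the linear predictions X @ w against y
    y_hat = X @ w
    RMSE = np.sqrt(np.sum((y - y_hat)**2) / len(y))
    return RMSE

def fit(X, y, lam):
    # Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y
    # (solving the system avoids forming the explicit inverse)
    w = np.linalg.solve(X.T @ X + lam * np.identity(X.shape[1]), X.T @ y)
    return w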

# Initialize the candidate lambdas and the number of folds
lambdas = np.linspace(0, 50, 1000)
n_folds = 5
RMSE_mat = np.zeros((n_folds, len(lambdas)))

# Split the training set into n_folds folds and record the validation RMSE of
# every lambda on every fold, to find the lambda with the smallest average RMSE
kf = KFold(n_splits=n_folds)
for m, (train_index, test_index) in enumerate(kf.split(X_transformed)):
    for n, lam in enumerate(lambdas):
        w = fit(X_transformed[train_index], y[train_index], lam)
        RMSE = calculate_RMSE(w, X_transformed[test_index], y[test_index])
        RMSE_mat[m, n] = RMSE

avg_RMSE = np.mean(RMSE_mat, axis=0) 
lam = lambdas[np.argmin(avg_RMSE)]
print(lam)


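# Retrain the model on the full training set with the chosen lambda
w = fit(X_transformed, y, lam)
# Note: test_index still holds the last fold's indices from the loop above, and
# those rows are part of the data w was just fit on, so this RMSE is an
# in-sample value for that subset, not a held-out estimate.
RMSE = calculate_RMSE(w, X_transformed[test_index], y[test_index])
print(RMSE)
print(w)
assert w.shape == (21,)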

11.061061061061062
2.035035284875038
[ 0.12724652 -0.29733336 -0.43613793  0.21885479  0.08247578 -0.14783658
  0.07962164  0.08215889 -0.1135829   0.03001243 -0.51418645 -0.82454909
 -0.96425633 -0.40101277 -0.47139526 -0.49168687 -0.60472175 -0.60594929
 -0.50947163 -0.57946951 -0.56513949]
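
The `fit` function above implements the closed-form ridge solution w = (XᵀX + λI)⁻¹Xᵀy. The sketch below, run on synthetic data, illustrates that solving the linear system with `np.linalg.solve` yields the same weights as multiplying by the explicit inverse. The function names `fit_ridge_inv` and `fit_ridge_solve`, the random data, and the seed are assumptions made for illustration only.

```python
import numpy as np

def fit_ridge_inv(X, y, lam):
    # Closed form via an explicit matrix inverse
    A = X.T @ X + lam * np.identity(X.shape[1])
    return np.linalg.inv(A) @ X.T @ y

def fit_ridge_solve(X, y, lam):
    # Same closed form, obtained by solving A w = X^T y
    A = X.T @ X + lam * np.identity(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

# Synthetic regression problem (not the task data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=50)

w_inv = fit_ridge_inv(X_demo, y_demo, lam=1.0)
w_solve = fit_ridge_solve(X_demo, y_demo, lam=1.0)
print(np.allclose(w_inv, w_solve))  # True
```

Solving the system is generally preferred numerically, since it avoids forming the inverse, but for a well-conditioned problem like this one both routes agree to floating-point precision.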


In [14]:
# Save results in the required format
np.savetxt("./resultsRidge.csv", w, fmt="%.12f")
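
To confirm that the file has the expected layout (one weight per line, written with 12 decimal places), it can be read back with `np.loadtxt`. The sketch below uses a placeholder weight vector `w_demo` and a temporary file, both of which are assumptions for illustration, so it does not touch `resultsRidge.csv`.

```python
import os
import tempfile
import numpy as np

w_demo = np.linspace(-1.0, 1.0, 21)  # placeholder weights, not the fitted ones

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights_demo.csv")
    np.savetxt(path, w_demo, fmt="%.12f")   # same format string as above
    with open(path) as f:
        n_lines = sum(1 for _ in f)          # one weight per line
    w_loaded = np.loadtxt(path)

print(n_lines)         # 21
print(w_loaded.shape)  # (21,)
print(np.allclose(w_demo, w_loaded, atol=1e-11))  # True
```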