### 1.2 Filtered Boston housing and kernels

Downloaded csv in command line using:
`curl -o boston-filter.csv http://www0.cs.ucl.ac.uk/staff/M.Herbster/boston-filter/Boston-filtered.csv`


In [16]:
import numpy as np
import python_functions as py_func

# dataset_np_headers_dropped
ds = np.genfromtxt('boston-filter.csv', delimiter=',', skip_header=1)

#### Naive Regression
a. Using polynomial basis for k=1 only i.e. $y = b$.

b. The constant function effectively determines the bias ($y$-intercept) of the linear regression.
It is the lowest predicted value of the dependent variable, the median house price.

In [3]:
MSEs_train_part_a, MSEs_test_part_a = py_func.split_dataset_and_compute_20_MSEs_with_ones(ds)

c. For each of the 12 attributes, perform a linear regression using only the single attribute but incorporating a bias term so that the inputs are augmented with an additional 1 entry, (xi , 1), so that we learn a weight vector w
 ∈ R2.

(Does this mean they want all 12 weights or the average of the 12 weights or MSEs .. ?)

In [14]:
MSEs_train_part_c, MSEs_test_part_c = py_func.split_dataset_and_compute_20_MSEs_with_single_attr(ds)

means_train, means_test = [], []
for MSEs_train_part_c_per_attr, MSEs_test_part_c_per_attr in zip(MSEs_train_part_c, MSEs_test_part_c):
    means_train.append(np.mean(MSEs_train_part_c_per_attr))
    means_test.append(np.mean(MSEs_test_part_c_per_attr))

print(f'Means for each of the 12 attributes in train ds = \n{means_train}\n')
print(f'Means for each of the 12 attributes in test ds = \n{means_test}')

Means for each of the 12 attributes in train ds= 
[70.7041647837489, 72.03692002816041, 64.05057020382704, 80.34603836491502, 68.1197348390914, 43.675830130679536, 71.36935623649204, 78.04456954554622, 71.55957656203262, 65.32423821973977, 62.090915760503876, 37.47723015145918]

Means for each of the 12 attributes in test ds= 
[74.4874685503509, 76.79322964603105, 66.16293396472426, 85.56934661294534, 71.1069732261131, 43.808498207167375, 74.93678917645552, 81.79524781858404, 73.52533896219495, 67.23687830792827, 64.18149020991304, 40.87963802805747]


d. Perform linear regression using all of the data attributes at once.
Perform linear regression on the training set using this regressor, and incorporate a bias term as above.

Calculate the MSE on the training and test sets and note down the results.
You should find that this method outperforms any of the individual regressors.

In [5]:
MSEs_train_part_d, MSEs_test_part_d = py_func.split_dataset_and_compute_20_MSEs_with_all_12_attr(ds)

In [15]:
print(f'Mean MSE for train dataset, using all 12 attributes = {np.mean(MSEs_train_part_d)}')  # gives 65.39992873551633
print(f'Mean MSE for test dataset, using all 12 attributes = {np.mean(MSEs_test_part_d)}')  # gives 68.3736527258721

Mean MSE for train dataset, using all 12 attributes = 21.70373755394515
Mean MSE for test dataset, using all 12 attributes = 25.273765249937956


#### 1.3 Kernelised ridge regression

A Kernel function is given as an element-wise product ?:
$K_{i,j} = K(x_i, x_j)$
$l$ is the size of the training set.
$\gamma$ is the regularisation parameter.
$\sigma$ is a parameter of the Gaussian kernel.


In [64]:
def gaussian_kernel(X, sigma: float):
    num_of_rows_of_x = X.shape[0]
    kernel_values = np.empty((num_of_rows_of_x, num_of_rows_of_x))
    for i in range(num_of_rows_of_x):
        for j in range(num_of_rows_of_x):
            pairwise_difference = X[i] - X[j]
            sqrd_norm = np.square(np.linalg.norm(pairwise_difference))
            kernel_values[i][j] = np.exp(-1 * sqrd_norm / 2 * np.square(sigma))
    return kernel_values

In [65]:
g_dataset_30x, g_dataset_30y = py_func.generate_dataset_about_g(num_of_data_pairs=30)


In [53]:
# Create a vector of gamma values [2^-40, 2^-39,...,2^-26]
gammas = [2**pow for pow in list(range(-40, -25))]
# Create vector of sigma values [2^7, 2^7.5, . . . , 2^12.5, 2^13]
sigmas = []
for pow in list(range(7, 14)):
    sigmas.append(2**pow)
    sigmas.append(2**(pow+0.5))
sigmas = sigmas[:-1]

res = []
for sigma in sigmas:

    res.append(gaussian_kernel(g_dataset_30x, sigma))
res = np.array(res)
print(res.shape)

(13, 30)


In [66]:
res = gaussian_kernel(g_dataset_30x, sigmas[0])
res.shape

(30, 30)

In [73]:
def a_star(sigma, dataset_x, gamma, dataset_y):
    kernel_matrix = gaussian_kernel(dataset_x, sigma)
    l = dataset_x.shape[0]
    return (np.linalg.inv(kernel_matrix + gamma * l * np.identity(l))) @ dataset_y

In [76]:
astar = a_star(sigma=128, dataset_x=g_dataset_30x, gamma=2**-40, dataset_y=g_dataset_30y)
astar[0]

-0.38638126724692334

In [None]:
def evaluation_of_regression(alpha_star, X_train, X_test_data_point):
    y_test = []
    for i in range(X_train.shape[0]):
        bla = alpha_star[i]
        rint(bla.shape)

    return 0



In [63]:
from sklearn.model_selection import train_test_split
ds = np.genfromtxt('boston-filter.csv', delimiter=',', skip_header=1)
train_dataset, test_dataset = train_test_split(ds, test_size=1 / 5)
def get_x_train_y_train_x_test_y_test(m_train: int, train_ds, m_test: int, test_ds) -> tuple:
    X_train_all_attr = train_ds[:, 0: 12]
    ones_train = np.ones((m_train, 1))
    X_train = np.column_stack((ones_train, X_train_all_attr))
    y_train = train_ds[:, -1]
    X_test_all_attr = test_ds[:, 0: 12]
    ones_test = np.ones((m_test, 1))
    X_test = np.column_stack((ones_test, X_test_all_attr))
    y_test = test_ds[:, -1]
    return X_train, y_train, X_test, y_test

m_train, m_test = train_dataset.shape[0], test_dataset.shape[0]
X_train, y_train, X_test, y_test = get_x_train_y_train_x_test_y_test(m_train=m_train, train_ds=train_dataset,
                                                                     m_test=m_test, test_ds=test_dataset)

print(X_train.shape)
print(X_test.shape)

(404, 13)
(102, 13)


In [None]:
gaussian_kernel_of_test_point(X_test, X_test_point, sigma):
    num_of_rows_of_x = X_test.shape[0]
    sqrd_norm = np.empty(num_of_rows_of_x)
    for i in range(num_of_rows_of_x):
        pairwise_difference = X_test[i] - X_test_point
        sqrd_norm[i] = np.square(np.linalg.norm(pairwise_difference))
    return np.exp(-1 * sqrd_norm / 2 * sigma ** 2)

In [78]:
import numpy as np

bla = np.ones(2)
bla

array([1., 1.])

In [81]:
np.sum(bla, axis=0)

2.0