# Linear basis function regression

You have labeled data $\left(x_i, y_i\right), i=1, \ldots, n$ and you want to fit a linear basis function regression model of the form

$$
\hat{y}_i=\sum_{j=0}^d w_j e^{-j x_i}
$$

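where $x_i$ and $y_i$ are scalars. To do this, you will form a design matrix $\Phi(\mathbf{x})=\left[\phi_0(\mathbf{x}), \ldots, \phi_d(\mathbf{x})\right]$, whose column $\phi_j(\mathbf{x})$ is $e^{-j \mathbf{x}}$ applied elementwise to the vector of inputs, and then solve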

$$
\hat{\mathbf{y}}=\Phi \mathbf{w}
$$

for the weights that minimize the mean squared error.
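
For reference, the minimizer mentioned above has a standard closed form. The sketch below writes the mean squared error in matrix form and sets its gradient to zero; it assumes $\Phi^\top \Phi$ is invertible, which is an extra assumption and not something the problem statement guarantees.

$$
\mathrm{MSE}(\mathbf{w})=\frac{1}{n}\left\|\mathbf{y}-\Phi \mathbf{w}\right\|^2, \qquad
\nabla_{\mathbf{w}} \mathrm{MSE}=-\frac{2}{n} \Phi^\top\left(\mathbf{y}-\Phi \mathbf{w}\right)=0
\;\Longrightarrow\;
\Phi^\top \Phi\, \mathbf{w}=\Phi^\top \mathbf{y}
$$

Under that assumption, $\mathbf{w}=\left(\Phi^\top \Phi\right)^{-1} \Phi^\top \mathbf{y}$. In practice a least-squares solver is usually preferred over forming the inverse explicitly.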

**Part 1: Entries of the design matrix**

Given training data
$$
((x_1, y_1), (x_2, y_2), (x_3, y_3), (x_4, y_4))
$$
write out the entries of the design matrix $\Phi$ you would use to fit the model above, for _d=2_.

$$
\Phi =
\left[
\begin{matrix}
1 & e^{-x_1} & e^{-2x_1}\\
1 & e^{-x_2} & e^{-2x_2}\\
1 & e^{-x_3} & e^{-2x_3}\\
1 & e^{-x_4} & e^{-2x_4}
\end{matrix}
\right]
$$

**Part 2: Get transformed _X_**

In this question, you will write a Python function that returns the design matrix described in the previous part of the question, but for arbitrary _d_ and _n_.

Your code should define a function **transform** that accepts a 1D numpy array named **x** with shape **(n,)** (for arbitrary **n**) and a positive integer argument **d**, and returns the design matrix as a 2D numpy array with shape **(n,d+1)**.

For full credit, your code should not use any explicit **for** or **while** loop.

Then, you will train a linear regression on the transformed version of the data, and save the model predictions in **y_hat**.

| Name | Type | Description |
| --- | --- | --- |
| **transform** | function | function that accepts a 1D numpy array and a positive integer argument and returns a 2D numpy array |
| **y_hat** | 1D numpy array | 1D numpy array containing the predictions of the fitted model |

In [1]:
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LinearRegression

First, load in data from a file:

In [2]:
x, y = np.genfromtxt('data.csv',delimiter=',', unpack=True)

In [3]:
x.shape

(100,)

Now, write a `transform` function to create the matrix you described in the previous part of this question, but for arbitrary `n` and `d`.

In [4]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
def transform(x, d):
    exponents = np.arange(d + 1)
    return np.exp(-np.outer(x, exponents))
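
As a quick sanity check on the vectorized function above, the following sketch compares it against a loop-based version on a few made-up inputs. The names `transform_loop`, `x_demo`, and `Phi_demo` are hypothetical and used only for illustration; they are not part of the graded solution.

```python
import numpy as np

def transform(x, d):
    # Same vectorized construction as in the cell above:
    # entry (i, j) is exp(-j * x_i) for j = 0, ..., d.
    exponents = np.arange(d + 1)
    return np.exp(-np.outer(x, exponents))

def transform_loop(x, d):
    # Hypothetical loop-based reference, used only to check the result.
    Phi = np.empty((len(x), d + 1))
    for j in range(d + 1):
        Phi[:, j] = np.exp(-j * x)
    return Phi

x_demo = np.array([0.0, 0.5, 1.0, 2.0])  # made-up inputs, not from data.csv
Phi_demo = transform(x_demo, d=2)

print(Phi_demo.shape)                                       # (4, 3)
print(np.allclose(Phi_demo, transform_loop(x_demo, d=2)))   # True
print(np.allclose(Phi_demo[:, 0], 1.0))                     # True: j = 0 gives the ones column
```

The `np.outer` call builds the full table of products $j x_i$ in one step, which is what lets the function avoid an explicit loop.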

Then, generate the "transformed" version of the data for `d=3` by calling your function. 

In [5]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
x_trans = transform(x, d=3)

Verify that the "transformed" version of the data has 100 rows and 4 columns:

In [6]:
x_trans.shape

(100, 4)

Finally, fit a linear regression (using either `scikit-learn` or `numpy`) on the transformed data.
Note that the transformed data already includes a column of ones (the $j=0$ basis function), so if you use `numpy`, do not add another ones column; if you use `scikit-learn`, pass `fit_intercept=False` when creating the model.
Then, get the predictions of the model on the data, and save the result in `y_hat`.
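
For the `numpy` route, one possible sketch uses `np.linalg.lstsq` directly on the design matrix. It runs on synthetic data, since `data.csv` is not reproduced here; the names `x_demo`, `y_demo`, `w_true`, and the noise level are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (assumed values, not from data.csv).
x_demo = rng.uniform(0.0, 2.0, size=50)
Phi_demo = np.exp(-np.outer(x_demo, np.arange(4)))  # design matrix for d = 3
w_true = np.array([1.0, -2.0, 0.5, 3.0])
noise = 0.1 * rng.standard_normal(50)
y_demo = Phi_demo @ w_true + noise

# Least-squares fit. No extra ones column is added, because the first
# column of Phi_demo (j = 0) is already all ones.
w_hat, residuals, rank, singular_values = np.linalg.lstsq(
    Phi_demo, y_demo, rcond=None
)

# Predictions on the training inputs and the resulting mean squared error.
y_hat_demo = Phi_demo @ w_hat
mse = np.mean((y_demo - y_hat_demo) ** 2)
print(w_hat.shape, y_hat_demo.shape)
print(mse)
```

This should produce the same predictions as `LinearRegression(fit_intercept=False)` fitted on the same matrix, up to floating-point error, since both solve the same least-squares problem.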


In [7]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
# fit model here... 
model = LinearRegression(fit_intercept=False)
model.fit(x_trans, y)

# then get model predictions and save in y_hat
y_hat = model.predict(x_trans)
y_hat

array([24.91113745,  8.3242551 ,  6.52256034, 14.78174122,  7.13584298,
        7.45235793,  7.2051708 ,  8.32482926,  6.33540023,  7.08435974,
        9.13552576, 13.48119484,  6.07993674,  5.4544498 , 11.26552176,
        5.46149085, 13.20819077, 44.78075501,  7.41997901, 17.0125213 ,
       17.79134049,  5.47184038,  6.90016478,  8.30038473, 48.66316938,
        7.32318745, 14.97445691,  7.16545138,  5.97777459,  5.6634383 ,
        7.87376748,  5.5286746 ,  6.36290931,  8.907893  ,  6.43945551,
       13.44784809,  8.72952091,  6.91041898,  6.20791788, 37.05157306,
       47.85376456,  9.17664343,  7.27515906, 36.10296017,  6.89200952,
       22.82878146,  7.68811071,  5.90143578,  7.21425668,  9.90277307,
        8.76272592,  5.90903295, 19.94958417,  5.80320185, 20.76586316,
        5.48749427,  6.74152854,  5.42935205,  8.26886539, 10.96034444,
       10.84442355,  5.90075725,  9.01574197,  7.05710245,  9.4701328 ,
        7.7850996 ,  6.45002779,  7.72396596, 24.25079983, 46.57