# Week 3: Nonlinear Regression Solutions


Written by Dr Sara Wade

### Exercise 3

#### (a)
In one dimension (to keep things simple), plot the regression curve for the RBF basis function expansion:
$$ \text{E}[y|x,w] = \sum_{m=1}^M w_m \exp \left( -\frac{1}{2 \ell^2} (x - \mu_m)^2\right),$$
for fixed values of $\mu_1,\ldots, \mu_M$ and different values of the bandwidth $\ell$.

In [None]:
# Data libraries
import numpy as np

# Plotting libraries
import matplotlib.pyplot as plt
import seaborn as sns

# Plotting defaults
plt.rcParams['figure.figsize'] = (10,6)
plt.rcParams['figure.dpi'] = 80

# sklearn modules
import sklearn

First generate some inputs:

In [None]:
# Generate some inputs
N = 100
x = np.random.rand(N,1)*10
x = np.sort(x,0)

Define the RBF kernel and extract features

In [None]:
# Define RBF kernel in one dimension
def rbf(x,mu,bw):
    return np.exp(-1/(2*bw**2)*(x-mu)**2)

In [None]:
# Define parameters: number of basis functions, centroids and bandwidth
M = 11
mu = np.arange(0, M, 1)  # centroids at 0, 1, ..., 10
bw = 1 # change the bandwidth parameter to see how it changes the curve

In [None]:
# Transform x using RBF basis functions
from sklearn.preprocessing import FunctionTransformer
rbftransformer = FunctionTransformer(rbf,kw_args= {"mu": mu, "bw": bw},validate=False)
Phi = rbftransformer.transform(x)
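
The transformation above relies on NumPy broadcasting: `x` has shape `(N, 1)` and `mu` has shape `(M,)`, so `x - mu` expands to an `(N, M)` array with one column per basis function. The following is a minimal sketch to check this on a small example; the names `x_demo`, `mu_demo` and `Phi_demo` are illustrative and not part of the solution.

```python
import numpy as np

def rbf(x, mu, bw):
    # Same RBF basis function as in the solution
    return np.exp(-1 / (2 * bw**2) * (x - mu) ** 2)

x_demo = np.linspace(0, 10, 5).reshape(-1, 1)  # column vector, shape (5, 1)
mu_demo = np.arange(0, 11, 1)                  # centroids, shape (11,)

# (5, 1) - (11,) broadcasts to (5, 11): row n, column m holds the
# m-th basis function evaluated at the n-th input
Phi_demo = rbf(x_demo, mu_demo, bw=1)
print(Phi_demo.shape)  # (5, 11)
```

If `x` were a flat array of shape `(N,)` instead, the subtraction would fail (or silently misalign when `N == M`), which is why the inputs are kept as a column vector.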


Simulate values of the regression weights and plot the regression function

In [None]:
# Set values of the regression weights
w = np.random.normal(0,1,M)

# Compute and plot the regression function
rf = Phi @ w
plt.plot(x, rf, 'k-')
plt.axis([0, 10, -3.5, 3.5]);
plt.legend(['bandwidth = 1']);

Plot the function for different bandwidths, $\ell = 0.1, 1, 100$:

In [None]:
# Extract feature matrix for different bandwidths
rbftransformer = FunctionTransformer(rbf,kw_args= {"mu": mu, "bw": 0.1},validate=False)
Phi_small = rbftransformer.transform(x)
rbftransformer = FunctionTransformer(rbf,kw_args= {"mu": mu, "bw": 100},validate=False)
Phi_big = rbftransformer.transform(x)

# Compute and plot the regression function
rf_small = Phi_small @ w
rf_big = Phi_big @ w
plt.plot(x, rf, 'k-')
plt.plot(x, rf_small, 'b-')
plt.plot(x, rf_big, 'g-')
plt.axis([0, 10, -3.5, 3.5]);
plt.legend(['bandwidth = 1','bandwidth = 0.1', 'bandwidth = 100']);
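
One way to read the large-bandwidth curve: when $\ell$ is much larger than the range of $x$, every basis function is close to 1 over the whole interval, so the regression function is approximately the constant $\sum_m w_m$ (which may fall outside the plotted axis limits). The sketch below checks this numerically under assumed demo values; `x_demo`, `mu_demo`, `w_demo` and the seed are illustrative choices.

```python
import numpy as np

def rbf(x, mu, bw):
    # Same RBF basis function as in the solution
    return np.exp(-1 / (2 * bw**2) * (x - mu) ** 2)

rng = np.random.default_rng(0)                  # assumed seed, for reproducibility
x_demo = np.linspace(0, 10, 50).reshape(-1, 1)
mu_demo = np.arange(0, 11, 1)
w_demo = rng.normal(0, 1, mu_demo.size)

# With bandwidth 100, each feature is at least exp(-100 / 20000) on [0, 10]
rf_big_demo = rbf(x_demo, mu_demo, 100) @ w_demo

# Largest deviation of the curve from the constant sum of the weights
max_dev = np.max(np.abs(rf_big_demo - w_demo.sum()))
print(max_dev)
```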

**Bonus**: let's fit the RBF kernel machine to some data!
First, we will generate data assuming
$$y = \sin(x)\exp(-x/5)+\epsilon, \quad \epsilon \sim \text{N}(0, 0.05^2),$$
that is, Gaussian noise with standard deviation 0.05. We then use the `LinearRegression` model in `sklearn` to train the RBF kernel machine with a fixed number of basis functions, fixed centroids, and a fixed bandwidth.

In [None]:
## Bonus: Let's fit to some simulated data

# Generate some outputs
y = np.sin(x)*np.exp(-x/5) + np.random.normal(0,0.05, N).reshape(-1,1)
plt.plot(x, y, 'o');

In [None]:
from sklearn.linear_model import LinearRegression

# Transform inputs
rbftransformer = FunctionTransformer(rbf,kw_args= {"mu": mu, "bw": 1},validate=False)
Phi = rbftransformer.transform(x)

# Estimate the regression weights
l = LinearRegression(fit_intercept=False).fit(X = Phi, y = y)
print(l.coef_)

# Predict
yhat = l.predict(X = Phi)
ftrue = np.sin(x)*np.exp(-x/5)
plt.plot(x, y, 'o');
plt.plot(x,yhat, 'k-');
plt.plot(x,ftrue, 'r--');


 
How does changing the bandwidth parameter affect the fit?
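
A rough way to explore this question is to refit with several bandwidths and compare the training error. The sketch below is self-contained and makes some assumptions that differ from the cells above: inputs are placed on an even grid, the seed is fixed, and the weights are estimated with `np.linalg.lstsq`, which solves the same least-squares problem as `LinearRegression(fit_intercept=False)`. All `*_demo` names are illustrative.

```python
import numpy as np

def rbf(x, mu, bw):
    # Same RBF basis function as in the solution
    return np.exp(-1 / (2 * bw**2) * (x - mu) ** 2)

rng = np.random.default_rng(0)                        # assumed seed
x_demo = np.linspace(0, 10, 100).reshape(-1, 1)       # assumed even grid of inputs
y_demo = np.sin(x_demo) * np.exp(-x_demo / 5) + rng.normal(0, 0.05, size=x_demo.shape)
mu_demo = np.arange(0, 11, 1)

mse = {}
for bw_demo in [0.1, 1.0, 100.0]:
    Phi_demo = rbf(x_demo, mu_demo, bw_demo)
    # Least-squares weights without an intercept
    w_hat, *_ = np.linalg.lstsq(Phi_demo, y_demo, rcond=None)
    resid = y_demo - Phi_demo @ w_hat
    mse[bw_demo] = float(np.mean(resid**2))
    print(f"bandwidth = {bw_demo}: training MSE = {mse[bw_demo]:.4f}")
```

With a very small bandwidth the basis functions are nearly zero between centroids, so the fitted curve tends to collapse towards zero there; with a very large bandwidth the columns of the feature matrix become nearly collinear and the model loses flexibility. Training error alone does not measure generalization, so a held-out set would be needed to choose the bandwidth properly.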

#### (b)
In one dimension (to keep things simple), plot the regression curve for the (reparametrized) logistic-sigmoid basis function expansion:
$$ \text{E}[y|x,w] = \sum_{m=1}^M w_m \frac{1}{1+\exp(-\gamma(x - \mu_m))},$$
for fixed values of $\mu_1,\ldots, \mu_M$ and different values of the steepness parameter $\gamma$.

Define the logistic-sigmoid kernel and extract features

In [None]:
# Define logistic-sigmoid kernel in one dimension
def ls(x,mu,gam):
    return 1/(1+np.exp(-gam*(x-mu)))

# Set the steepness parameter
gam = 2

# Transform x using logistic sigmoid basis functions
lstransformer = FunctionTransformer(ls,kw_args= {"mu": mu, "gam": gam},validate=False)
Phi = lstransformer.transform(x)

Simulate values of the regression weights and plot the regression function

In [None]:
# Set values of the regression weights
w = np.random.normal(0,1,M)

# Compute and plot the regression function
rf = Phi @ w
plt.plot(x, rf, 'k-')
plt.axis([0, 10, -3.5, 3.5]);
plt.legend(['gamma = 2']);

Plot the function for different values of the steepness, $\gamma = 0.01, 2, 100$:
 

In [None]:
# Extract feature matrix for different gamma
lstransformer = FunctionTransformer(ls,kw_args= {"mu": mu, "gam": 0.01},validate=False)
Phi_small = lstransformer.transform(x)
lstransformer = FunctionTransformer(ls,kw_args= {"mu": mu, "gam": 100},validate=False)
Phi_big = lstransformer.transform(x)

# Compute and plot the regression function
rf_small = Phi_small @ w
rf_big = Phi_big @ w
plt.plot(x, rf, 'k-')
plt.plot(x, rf_small, 'b-')
plt.plot(x, rf_big, 'g-')
plt.legend(['gamma=2','gamma = 0.01', 'gamma = 100']);
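
The two extreme curves can be understood from the limits of a single basis function: for large $\gamma$ it approaches a step function that jumps from 0 to 1 at $\mu_m$, while for small $\gamma$ it is nearly constant at 0.5. A minimal numerical check, using an illustrative centroid at 5 and two illustrative inputs on either side of it:

```python
import numpy as np

def ls(x, mu, gam):
    # Same logistic-sigmoid basis function as in the solution
    return 1 / (1 + np.exp(-gam * (x - mu)))

x_demo = np.array([[4.0], [6.0]])   # one point below and one above the centroid
mu_demo = np.array([5.0])           # a single illustrative centroid

step_like = ls(x_demo, mu_demo, 100)   # close to 0 below the centroid, close to 1 above
flat = ls(x_demo, mu_demo, 0.01)       # both values close to 0.5
print(step_like.ravel())
print(flat.ravel())
```

So with $\gamma = 100$ the regression function is approximately piecewise constant with jumps at the centroids, and with $\gamma = 0.01$ it is approximately the constant $0.5 \sum_m w_m$.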

**Bonus:** let's fit the logistic-sigmoid kernel machine to the same data.

In [None]:
from sklearn.linear_model import LinearRegression

# Transform inputs
lstransformer = FunctionTransformer(ls,kw_args= {"mu": mu, "gam": 2},validate=False)
Phi = lstransformer.transform(x)

# Estimate the regression weights
l = LinearRegression(fit_intercept=False).fit(X = Phi, y = y)
print(l.coef_)

# Predict
yhat = l.predict(X = Phi)
ftrue = np.sin(x)*np.exp(-x/5)
plt.plot(x, y, 'o');
plt.plot(x,yhat, 'k-');
plt.plot(x,ftrue, 'r--');

### Exercise 4

The sigmoid kernel is defined as:
$$ k(\mathbf{x}, \mathbf{x}') = \tanh(\gamma \mathbf{x}^T\mathbf{x}' +b) = \frac{\exp \left( 2(\gamma \mathbf{x}^T\mathbf{x}' +b)\right)-1}{\exp \left( 2(\gamma \mathbf{x}^T\mathbf{x}' +b)\right)+1}$$
In one dimension, plot the curve as a function of $x$ (with $x'=1$), for different values of $\gamma$ and $b$.
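
As a quick sanity check on the identity above, the explicit exponential form can be compared with NumPy's built-in `np.tanh`. The helper `sigmoid_kernel_1d` and its default arguments are hypothetical names introduced for this sketch only.

```python
import numpy as np

def sigmoid_kernel_1d(x, x_prime=1.0, gam=1.0, b=0.0):
    # One-dimensional sigmoid kernel, evaluated with np.tanh
    return np.tanh(gam * x * x_prime + b)

x_grid = np.arange(-4, 4, 0.1)
z = 1.0 * x_grid * 1.0 + 0.0   # gamma = 1, x' = 1, b = 0

# Explicit form from the definition
explicit = (np.exp(2 * z) - 1) / (np.exp(2 * z) + 1)

same = np.allclose(sigmoid_kernel_1d(x_grid), explicit)
print(same)  # True
```

Both forms agree on this grid. For very large arguments the explicit form can overflow in `np.exp` and return `nan`, so `np.tanh` is the safer choice in general.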

In [None]:
# Plot for different choices of gamma
x = np.arange(-4, 4, 0.1) # grid of input values
gam = np.array([0.25,1,4]) # different values of gamma
b = 0 # fix b
#Compute sigmoid kernel for different gamma
kf = np.zeros((len(x),len(gam)))
for g in range(len(gam)):
    kf[:,g] = (np.exp(2*(gam[g]*x+b))-1)/(np.exp(2*(gam[g]*x+b))+1)
    
plt.plot(x,kf[:,0],'k-')
plt.plot(x,kf[:,1],'b-');
plt.plot(x,kf[:,2],'g-');
plt.legend(['gamma ='+str(gam[0]), 'gamma ='+str(gam[1]),'gamma ='+str(gam[2])]);

In [None]:
# Plot for different choices of b
x = np.arange(-4, 4, 0.1) # grid of input values
gam = 1 # fix gamma
b = np.array([-2,0,2]) # different values of b
#Compute sigmoid kernel for different b
kf = np.zeros((len(x),len(b)))
for bind in range(len(b)):
    kf[:,bind] = (np.exp(2*(gam*x+b[bind]))-1)/(np.exp(2*(gam*x+b[bind]))+1)

plt.plot(x,kf[:,0],'k-')
plt.plot(x,kf[:,1],'b-');
plt.plot(x,kf[:,2],'g-');
plt.legend(['b ='+str(b[0]), 'b ='+str(b[1]),'b ='+str(b[2])]);
