**Lifting:** Lifting means augmenting your data with higher-order features, such as squares and products of the original inputs.

Why do we do this? The main reason is to let a model work in a feature space that it can't otherwise represent. The best way to illustrate this is with a circular dataset:

In [None]:
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline
np.random.seed(2)

In [None]:
#all points within 0.5 of (0,0) are labeled 1, all others -1
def gen_circle(n):
    x = 2*np.random.rand(n,2) - 1
    y = []
    for point in x:
        if np.sqrt(point[0]**2+point[1]**2) < .5:
            y.append(1)
        else:
            y.append(-1)
    y = np.asarray(y)
    return x, y

#and let's plot to visualise
def plot_data(x, y):

    colors = ["purple", "red"]
    plt.scatter(x[:,0], x[:,1], c=y, cmap=matplotlib.colors.ListedColormap(colors))
    circle = plt.Circle((0, 0), 0.5, color='r', fill=False)
    plt.gcf().gca().add_artist(circle)
    plt.gca().set_aspect('equal', adjustable='box')
    plt.show()
    
def lstsq(x,y):
    #solve the normal equations (x^T x) w = x^T y without forming an explicit inverse
    return np.linalg.solve(x.T @ x, x.T @ y)

def rmse(x, y, w):
    return np.sqrt(np.mean((y - x @ w)**2))

def prediction_accuracy(x,y,w):
    #fraction of points whose predicted sign matches the sign of the label
    return np.mean(np.sign(x @ w) == np.sign(y))

def prediction_labels(x,w):
    return np.sign(x @ w)

In [None]:
x, y = gen_circle(500)

In [None]:
plot_data(x,y)

#now let's run least squares! The labels contain no noise, so this should be an easy fit
w = lstsq(x,y)

In [None]:
#and measure our accuracy
acc = prediction_accuracy(x,y,w)
print(acc)

So, even with noise-free labels, we labeled only 53% of the data correctly, which is barely better than a coin flip.
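
To put that number in context, it helps to know how unbalanced the labels are. The sketch below estimates the fraction of points that fall inside the circle and compares it with the exact area ratio. The seed and sample size are arbitrary choices, and `pts`, `inside`, `frac_inside`, and `exact` are names introduced here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, not the notebook's

# Sample uniformly from the square [-1, 1] x [-1, 1], as gen_circle does
pts = 2 * rng.random((100_000, 2)) - 1

# A point gets label 1 when it lies within 0.5 of the origin
inside = np.hypot(pts[:, 0], pts[:, 1]) < 0.5
frac_inside = inside.mean()

# Exact ratio: area of the circle (pi * 0.5**2) over area of the square (4)
exact = np.pi * 0.5**2 / 4

print(f"sampled fraction inside: {frac_inside:.3f}")
print(f"exact area ratio:        {exact:.3f}")
```

Roughly one point in five carries the label 1, so a model that always answered -1 would already score about 80%. An accuracy near 50% is therefore worse than that trivial baseline.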

So what happened? To find out, let's take a look at w.

In [None]:
print(w)

w has two entries, one per input column (think of it as a column vector of shape (2,1)). What does that mean?

The equation we are trying to solve is $Xw = y$. Let's break that down by the sizes of the matrices: $(n,2) \times (2,1) = (n,1)$

Now write out that multiplication for a single row $i$: $y_i = x_i[0] \cdot w_0 + x_i[1] \cdot w_1$
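
As a small check of the shape bookkeeping, the sketch below multiplies a made-up `X` by a made-up column vector `w` and recomputes every row by hand. The values, the seed, and the names `y_hat` and `by_hand` are illustrative assumptions, not taken from the notebook's fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed for the illustration
X = 2 * rng.random((4, 2)) - 1   # four made-up points, shape (4, 2)
w = np.array([[0.3], [-0.7]])    # made-up weights as a (2, 1) column

y_hat = X @ w                    # matrix product, shape (4, 1)

# Recompute each row by hand: x_i[0]*w_0 + x_i[1]*w_1
by_hand = np.array([[xi[0] * w[0, 0] + xi[1] * w[1, 0]] for xi in X])

print(X.shape, w.shape, y_hat.shape)
print(np.allclose(y_hat, by_hand))
```

When `y` is a 1-D array, as returned by `gen_circle`, NumPy reports `w` with shape `(2,)` and the product with shape `(n,)`; the per-row arithmetic is the same.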


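So if y is the label, x[:,0] is the x-axis coordinate, and x[:,1] is the y-axis coordinate, we get the equation

$label = w_0x + w_1y$

Setting $label = 0$ gives a straight line through the origin, so this model can only split the plane into two half-planes. It can never enclose a circle.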
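
One consequence of having no constant term in $label = w_0x + w_1y$ is that mirroring a point through the origin always flips the sign of the prediction, while the true circle label stays the same. The sketch below checks this for random weights; the seed, the helper `circle_label`, and the other names are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)          # arbitrary seed
w = rng.normal(size=2)                  # any weights will do
pts = 2 * rng.random((1000, 2)) - 1     # points in the square [-1, 1]^2

def circle_label(p):
    # Same rule as gen_circle: 1 inside radius 0.5, else -1
    return np.where(np.hypot(p[:, 0], p[:, 1]) < 0.5, 1, -1)

# The linear score flips sign when a point is mirrored through the origin...
flips = np.all(np.sign(pts @ w) == -np.sign((-pts) @ w))
# ...but the true circle label does not change.
same = np.all(circle_label(pts) == circle_label(-pts))

print(flips, same)
```

For any mirrored pair of points, the model is right on one and wrong on the other (unless the score is exactly zero). On a perfectly symmetric dataset that pins the accuracy at 50% for every choice of weights; the sampled data is only approximately symmetric, which is a plausible reason the measured accuracy lands just above half.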

In [None]:
#generate new labels
labels = prediction_labels(x,w)
plot_data(x,labels)

In [None]:
#Time to lift
def lift(x):
    lifted_x = []
    
    for dp in x:
        #dp is [x,y]
        #we will expand to [x,y,x^2,y^2,xy,1]; the trailing 1 is a constant (bias) term
        new_datapoint = [dp[0], dp[1], dp[0]**2, dp[1]**2, dp[0]*dp[1], 1]
        lifted_x.append(new_datapoint)
    return np.asarray(lifted_x)

In [None]:
new_x = lift(x)
print(x.shape)
print(new_x.shape)

With this lifted x, our new equation is 

$label = w_0x + w_1y + w_2x^2 + w_3 y^2 + w_4xy + w_5$

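This equation can describe a circle! The boundary $label = 0$ is now a general quadratic curve (a conic) rather than a straight line.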
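
As a quick sanity check that the lifted features are expressive enough, the sketch below skips the fitting step and plugs in hand-picked weights: $w_2 = w_3 = -1$, $w_5 = 0.25$, and all others zero, so that $label = 0.25 - x^2 - y^2$. These weights are an assumption chosen for illustration, not what least squares returns, and the names `lifted` and `w_circle` are introduced here.

```python
import numpy as np

rng = np.random.default_rng(2)          # arbitrary seed
pts = 2 * rng.random((1000, 2)) - 1
px, py = pts[:, 0], pts[:, 1]

# Lifted features in the same order as lift(): [x, y, x^2, y^2, xy, 1]
lifted = np.column_stack([px, py, px**2, py**2, px * py, np.ones(len(pts))])

# Hand-picked (not fitted) weights: label = 0.25 - x^2 - y^2
w_circle = np.array([0.0, 0.0, -1.0, -1.0, 0.0, 0.25])

# True labels follow the gen_circle rule: 1 inside radius 0.5, else -1
true_labels = np.where(np.sqrt(px**2 + py**2) < 0.5, 1, -1)
pred_labels = np.sign(lifted @ w_circle)

accuracy = np.mean(pred_labels == true_labels)
print(accuracy)
```

The score is positive exactly when $x^2 + y^2 < 0.25$, which is the inside of the circle of radius 0.5, so a linear model on the lifted features can represent the true boundary. Least squares then only has to find weights close to a scaled version of these.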

In [None]:
new_w = lstsq(new_x,y)

correct = prediction_accuracy(new_x, y, new_w)
print(correct)

In [None]:
new_labels = prediction_labels(new_x, new_w)

plot_data(x, new_labels)

98% is much better, and as we can see, the model basically learned the correct shape!
