# Summary

We're going to work up to logistic regression, starting from projections and least squares in Strang, and then run some simulations on random data.

## References

* [Strang, Introduction to Linear Algebra](https://math.mit.edu/~gs/linearalgebra/)

In [1]:
import numpy as np

np.random.seed(42)

# Strang 4.2 Projections

We start with matrix projections by reviewing Example 3 (p. 211).

In [2]:
A = np.array([[1,0],[1,1],[1,2]]); b = np.reshape(np.array([6,0,0]),(-1,1))


In [3]:
# A^T A
np.dot(np.transpose(A),A)

array([[3, 3],
       [3, 5]])

In [4]:
# A^T b
np.dot(np.transpose(A),b)

array([[6],
       [0]])

In [5]:
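# P here is (A^T A)^{-1} A^T, the matrix that maps b to xhat.
# The projection matrix in (7) is A times this: A (A^T A)^{-1} A^T.
P = np.dot(np.linalg.inv(np.dot(np.transpose(A),A)),np.transpose(A))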

In [6]:
# Now solve the normal equation A^T A xhat = A^T b

xhat = np.dot(P,b) # (8)
xhat

array([[ 5.],
       [-3.]])

In [7]:
# The combination p = A * xhat is the projection of b onto the column space of A
p = np.dot(A,xhat)
p

array([[ 5.],
       [ 2.],
       [-1.]])

In [8]:
# The error is
e = b - p
e # (9)

array([[ 1.],
       [-2.],
       [ 1.]])

In [9]:
# Two checks on the calculation.
all([
    # First, the error is perpendicular to both columns (up to rounding).
    np.allclose(np.dot(np.transpose(A),e), 0),
    # Second, A times (P b) reproduces p.
    np.allclose(np.dot(A,np.dot(P,b)), p),
])

True
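
The cells above compute $(A^TA)^{-1}A^T$ and then multiply by $A$ separately. As a minimal sketch, the full 3 by 3 projection matrix $A(A^TA)^{-1}A^T$ can also be formed directly and checked for the two defining properties of a projection, symmetry and idempotence. The name `P_full` is introduced here for illustration only and is not part of the original notebook.

```python
import numpy as np

# Same A and b as in Example 3
A = np.array([[1, 0], [1, 1], [1, 2]], dtype=float)
b = np.array([[6], [0], [0]], dtype=float)

# Full projection matrix onto the column space of A (3 by 3).
# P_full is an illustrative name, not from the original notebook.
P_full = A @ np.linalg.inv(A.T @ A) @ A.T

p = P_full @ b  # projection of b onto the column space
e = b - p       # error, perpendicular to the column space

print(np.allclose(P_full, P_full.T))         # symmetric
print(np.allclose(P_full @ P_full, P_full))  # idempotent: projecting twice changes nothing
print(np.allclose(A.T @ e, 0))               # error is orthogonal to both columns
```

Forming `P_full` explicitly is mainly useful for checking the algebra; for larger problems it is usually cheaper to solve for `xhat` and compute `A @ xhat`.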

# Strang 4.3 Least Squares Approximations

When $Ax=b$ has no solution, $\hat{x}$ is the "least-squares solution": it makes $\|b - A\hat{x}\|^2$ a minimum.

## Example 1

A crucial application of least squares is fitting a straight line to $m$ points. This 3 by 2 system has *no solution*.

In [17]:
A = np.array([[1,0],[1,1],[1,2]]); b = np.reshape(np.array([6,0,0]),(-1,1))
A, b # Ax = b is not solvable

(array([[1, 0],
        [1, 1],
        [1, 2]]),
 array([[6],
        [0],
        [0]]))

## Minimizing the Error

The best $\hat{x}$ comes from the normal equations $A^TA \hat{x} = A^Tb$. At this $\hat{x}$ the squared error $E = \|Ax - b\|^2$ is a minimum.

In [20]:
xhat = np.dot(np.linalg.inv(np.dot(np.transpose(A),A)), np.dot(np.transpose(A),b))
xhat # (1)

array([[ 5.],
       [-3.]])

In [23]:
p = np.dot(A,xhat) # (2)
p

array([[ 5.],
       [ 2.],
       [-1.]])

In [27]:
e = b - p
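# The errors sum to zero because e is perpendicular to the column of ones.
# np.isclose avoids an exact floating-point comparison.
all([np.isclose(np.sum(e), 0)])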

True
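
One way to see that the normal equations really give a minimum of $E = \|Ax - b\|^2$ is to perturb $\hat{x}$ and confirm the error never goes down. The sketch below does this for the same 3 by 2 example. It uses `np.linalg.solve` on the normal equations rather than an explicit inverse; the helper `squared_error`, the perturbation scale, and the seed are illustrative choices, not part of the original notebook.

```python
import numpy as np

A = np.array([[1, 0], [1, 1], [1, 2]], dtype=float)
b = np.array([[6], [0], [0]], dtype=float)

def squared_error(x):
    """E(x) = ||Ax - b||^2 (illustrative helper)."""
    r = A @ x - b
    return float(np.sum(r ** 2))

# Solve A^T A xhat = A^T b without forming the inverse
xhat = np.linalg.solve(A.T @ A, A.T @ b)
E_min = squared_error(xhat)

# Evaluate E at 100 randomly perturbed copies of xhat
rng = np.random.default_rng(0)
perturbed = [
    squared_error(xhat + 0.1 * rng.standard_normal((2, 1)))
    for _ in range(100)
]

print(E_min)
print(min(perturbed) >= E_min)
```

Here the minimum error is the squared length of $e = (1, -2, 1)$, and every perturbed $x$ adds $\|A(x - \hat{x})\|^2$ on top of it.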

In [49]:
# X is the features
m = 100; n = 10
X = np.random.rand(m,n)

# y is the target
y = np.random.rand(m,1)

print("X shape {}, y shape {}".format(X.shape, y.shape))

X shape (100, 10), y shape (100, 1)


# Least Squares Approximation with a Random Matrix

In [52]:
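# Analytic (closed-form) solution to linear regression
# via the normal equations A^T A xhat = A^T b (Strang Ch. 4)

def analyticlm(A,b):
    # Assumes A has independent columns, so that A^T A is invertible
    xhat = np.dot(np.linalg.inv(np.dot(np.transpose(A),A)), np.dot(np.transpose(A),b))
    return xhat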

In [56]:
bhat = analyticlm(X,y)
bhat

array([[ 0.28228163],
       [ 0.1556087 ],
       [ 0.07891688],
       [ 0.18083481],
       [ 0.17747304],
       [ 0.047676  ],
       [ 0.06236389],
       [ 0.03891797],
       [ 0.03090966],
       [-0.07421763]])
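
As a sanity check on the closed-form solution, the result can be compared with NumPy's built-in least-squares solver. The sketch below regenerates random data with the `default_rng` Generator API, so the numbers will not match the output above (which used `np.random.seed(42)` with the legacy functions); the variable names `beta_inv` and `beta_lstsq` are illustrative.

```python
import numpy as np

# Generator API; values differ from the legacy np.random.seed(42) stream
rng = np.random.default_rng(42)
m, n = 100, 10
X = rng.random((m, n))
y = rng.random((m, 1))

# Normal equations with an explicit inverse, mirroring analyticlm
beta_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)

# Library least-squares solver (SVD-based under the hood)
beta_lstsq, residuals, rank, sing_vals = np.linalg.lstsq(X, y, rcond=None)

print(rank)                               # number of independent columns found
print(np.allclose(beta_inv, beta_lstsq))  # the two solutions should agree
```

The two approaches agree when the columns of `X` are independent. `np.linalg.lstsq` is generally the safer choice when $X^TX$ is close to singular, since it does not invert the matrix explicitly.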

# Logistic Regression with a Random Matrix

The sigmoid function is $S(x) = \frac{1}{1 + e^{-x}}$.

In [57]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
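
Three properties of the sigmoid are useful when working toward logistic regression: $S(0) = \tfrac12$, $S(-x) = 1 - S(x)$, and $S'(x) = S(x)\,(1 - S(x))$. The following is a small numerical check of these identities, assuming the same `sigmoid` definition; the grid of test points and the step size `h` are arbitrary choices.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.linspace(-5, 5, 11)

# Central finite difference as a numerical stand-in for S'(x)
h = 1e-6
numeric_grad = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic_grad = sigmoid(x) * (1 - sigmoid(x))

print(sigmoid(0.0))                                         # S(0)
print(np.allclose(sigmoid(-x), 1 - sigmoid(x)))             # symmetry
print(np.allclose(numeric_grad, analytic_grad, atol=1e-8))  # derivative identity
```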

In [58]:
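def analytic_logit(A,b):
    # Note: this applies the sigmoid to the least-squares coefficients.
    # It is not a maximum-likelihood logistic regression fit, which has no closed form.
    xhat = analyticlm(A,b)
    return sigmoid(xhat)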

In [60]:
bhat = analytic_logit(X,y)
bhat

array([[0.57010551],
       [0.53882387],
       [0.51971899],
       [0.5450859 ],
       [0.54425217],
       [0.51191674],
       [0.51558592],
       [0.50972826],
       [0.5077268 ],
       [0.4814541 ]])
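
The `analytic_logit` function above passes the least-squares coefficients through the sigmoid, which is not the same as fitting a logistic regression. Logistic regression models $P(y = 1 \mid x) = S(x^Tw)$ and chooses $w$ to minimize the mean negative log-likelihood, whose gradient is $\frac{1}{m} X^T\big(S(Xw) - y\big)$. There is no closed-form solution, so the fit is usually iterative. Below is a minimal gradient-descent sketch under stated assumptions: the labels are synthetic binary draws, and `fit_logistic`, `w_true`, the learning rate, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iter=5000):
    """Gradient descent on the mean negative log-likelihood (illustrative)."""
    m, n = X.shape
    w = np.zeros((n, 1))
    for _ in range(n_iter):
        # Gradient: X^T (S(Xw) - y) / m
        grad = X.T @ (sigmoid(X @ w) - y) / m
        w -= lr * grad
    return w

# Synthetic data: binary labels drawn from a known (hypothetical) model
rng = np.random.default_rng(0)
m, n = 500, 3
X = rng.standard_normal((m, n))
w_true = np.array([[1.5], [-2.0], [0.5]])  # hypothetical coefficients
y = (rng.random((m, 1)) < sigmoid(X @ w_true)).astype(float)

w_hat = fit_logistic(X, y)

# At a minimum the gradient should be close to zero
grad_at_fit = X.T @ (sigmoid(X @ w_hat) - y) / m
print(w_hat.ravel())
print(np.abs(grad_at_fit).max())
```

Unlike the uniform random `y` used earlier, the target here is binary, which is what the logistic likelihood assumes. The estimated `w_hat` should land near `w_true` but will not match it exactly, because the labels are random draws.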