# Practice-Wk13
## Linear Regression - Using SVD
- The young adults data set (**`YoungAdults.csv`**) contains five columns: **`Height`**, **`Weight`**, **`HairLength`**, **`Age`**, and **`Sex`**.
- Let's develop a linear regression model using __SVD__. 


In [None]:
# Read the data from the CSV file
import csv
with open('YoungAdults.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    next(reader, None)  # skip the header row
    # Keep only the first four columns; build real lists, since map() is lazy in Python 3
    data = matrix(QQ, [[float(v) for v in row[0:4]] for row in reader])
print("Number of observations in the data file YoungAdults.csv:", data.dimensions()[0])

## Prepare the _design matrix_
Use **`Height`** and **`Weight`** as the independent variables and **`HairLength`** as the dependent variable. Prepare the **design matrix** ($\boldsymbol X$)  and the **dependent column vector** ($\boldsymbol y$). 
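
As a sketch of what the next cell builds, the design matrix and the dependent vector have the shape below. The row count $n$ and the symbols $h_i$, $w_i$, $\ell_i$ (height, weight and hair length of observation $i$) are notation introduced here for illustration, not names from the data file.

$$\boldsymbol X = \begin{bmatrix} 1 & h_1 & w_1 \\ \vdots & \vdots & \vdots \\ 1 & h_n & w_n \end{bmatrix}, \qquad \boldsymbol y = \begin{bmatrix} \ell_1 \\ \vdots \\ \ell_n \end{bmatrix}, \qquad \ell_i \approx \beta_0 + \beta_1 h_i + \beta_2 w_i$$

The leading column of ones is what lets the model carry an intercept $\beta_0$.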

In [None]:
# Split the data into independent variables (height = data.column(0), weight = data.column(1))
# and the dependent variable (hair length = data.column(2)).
# This assumes the CSV columns follow the order listed above:
# Height, Weight, HairLength, Age, Sex.
# Build the design matrix [1, height, weight]
dim = data.dimensions()
print("dim", dim)
X = ones_matrix(RDF, dim[0], 3)  # intercept column plus two predictors
X[:, 1] = data.column(0)
X[:, 2] = data.column(1)
Y = matrix(data.column(2)).transpose()
print("")
print("Ten rows of Design Matrix X:\n", X[0:10,:])
print("")
print("Ten elements of Dependent Variable Y:\n",Y[0:10,:])

### Compute the parameters ($\boldsymbol\beta$) of linear regression using `projection`
$$\boldsymbol{X^TX\hat\beta = X^Ty}$$

$$\boldsymbol{\hat\beta = (X^TX)^{-1}X^Ty}$$
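
To make the normal equations concrete, here is a minimal sketch in plain Python that solves $X^TX\hat\beta = X^Ty$ with exact fractions. The helper name `solve_normal_equations` and the five data points are made up for illustration; they are not taken from `YoungAdults.csv`, and the Sage cell below remains the notebook's actual computation.

```python
from fractions import Fraction

def solve_normal_equations(X, y):
    """Solve (X^T X) beta = X^T y by Gauss-Jordan elimination with exact fractions."""
    n_obs, n_par = len(X), len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(Fraction(X[k][i]) * Fraction(X[k][j]) for k in range(n_obs))
          for j in range(n_par)]
         + [sum(Fraction(X[k][i]) * Fraction(y[k]) for k in range(n_obs))]
         for i in range(n_par)]
    for col in range(n_par):
        # Swap a row with a nonzero pivot into place
        pivot = next(r for r in range(col, n_par) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        # Scale the pivot row so the pivot becomes 1
        A[col] = [v / A[col][col] for v in A[col]]
        # Eliminate this column from every other row
        for r in range(n_par):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [row[-1] for row in A]

# Hypothetical (height, weight) observations, chosen so that y = 1 + 2*h + 3*w exactly
points = [(150, 50), (160, 55), (170, 70), (180, 65), (175, 80)]
X = [[1, h, w] for h, w in points]
y = [1 + 2 * h + 3 * w for h, w in points]

beta = solve_normal_equations(X, y)
print(beta)
```

Because the made-up `y` lies exactly in the column space of `X`, the solver recovers the coefficients 1, 2 and 3 with no rounding error. Real data will not fit exactly, and $\hat\beta$ is then the least-squares compromise.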

In [None]:
# Solve the normal equations: Beta = (X^T X)^(-1) X^T y
Beta = ( X.transpose() * X ).inverse() * X.transpose() * Y
print("Dimensions of Beta: ", Beta.dimensions())
# Make Beta a vector
Beta = vector(Beta)
print("Parameter Estimates:\n", Beta)

### Compute the parameters ($\boldsymbol\beta$) of linear regression using `SVD`
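- __Compute $X^{+}$__ (the pseudo-inverse of $X$)
    - $X = U {\Sigma} V{^T}$ (similar to $A = U {\Sigma} V{^T}$)
    - $\Rightarrow X^{+} = (U {\Sigma} V{^T})^{+}$ ($X$ is not square, so it has a pseudo-inverse rather than an ordinary inverse)
    - $\Rightarrow X^{+} = V {\Sigma}^{+} U{^T}$, because $U$ and $V$ are orthogonal:
       - $U^{-1} = U{^T}$
       - $V^{-1} = V{^T}$
    - ${\Sigma}^{+}$ is ${\Sigma}^T$ with every nonzero singular value replaced by its reciprocal
- $\boldsymbol{X\beta \approx y}$ (similar to $Ax=b$, but with more equations than unknowns)
    - The least-squares solution is ${\hat\beta = X^{+}y}$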
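
The list above can be tied back to the projection formula with a short derivation. This is a sketch that assumes $X$ has full column rank, so that $\Sigma^T\Sigma$ is invertible:

$$X^TX = V\Sigma^TU^TU\Sigma V^T = V\,\Sigma^T\Sigma\,V^T$$

$$(X^TX)^{-1}X^T = V(\Sigma^T\Sigma)^{-1}V^T\,V\Sigma^TU^T = V(\Sigma^T\Sigma)^{-1}\Sigma^TU^T = V\Sigma^{+}U^T = X^{+}$$

So, under that assumption, the SVD route and the projection route should produce the same parameter estimates up to rounding.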

In [None]:
# Compute SVD
U, Sigma, V = X.SVD()
# SVD() works only with RDF, CDF

print("Shapes of the matrices:")
print("X: ", X.nrows(), X.ncols())
print("U: ", U.nrows(), U.ncols())
print("Sigma: ",Sigma.nrows(), Sigma.ncols())
print("V: ",V.nrows(), V.ncols())
print("Y: ", Y.nrows(), Y.ncols())

# Compute the pseudo-inverse of Sigma: transpose it, then invert the nonzero singular values
pinv_sigma = Sigma.transpose()

for m in range(pinv_sigma.nrows()):
    for n in range(pinv_sigma.ncols()):
        # Off-diagonal entries are zero and stay zero
        if pinv_sigma[m,n] > 0:
            pinv_sigma[m,n] = 1 / pinv_sigma[m,n]
print("pinv_sigma: ", pinv_sigma.nrows(), pinv_sigma.ncols())


# Compute pseudo inverse of X
pinv_x = V * pinv_sigma * U.transpose()
print("pinv_x: ", pinv_x.nrows(), pinv_x.ncols())
print("")


#Compute beta = X^+ * y
print("Parameter estimates: ")
svd_beta = pinv_x * Y
# Make beta a vector
SVD_Beta = vector(svd_beta)
print(SVD_Beta)

In [None]:
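# Dimensionality reduction: rank-2 approximation of X from the two largest singular values
print("Sigma = ")
show(Sigma[0:3, 0:3])
Sigma1 = Sigma[0:2, 0:2]
show(Sigma1)
U1 = U[:, 0:2]  # all rows, first two left singular vectors
V1 = V[:, 0:2]  # all rows, first two right singular vectors
X1 = U1 * Sigma1 * V1.transpose()
diff = X - X1
print("Diff = ", diff.norm('frob'))
# Relative difference, measured against the mean of the two norms
print("Rel. Diff (%) = ", 2 * 100 * diff.norm('frob') / (X.norm('frob') + X1.norm('frob')))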
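
A quick consistency check on the cell above, stated as a sketch: `X` has three columns and therefore three singular values $\sigma_1 \ge \sigma_2 \ge \sigma_3$, with corresponding singular vectors $u_i$ and $v_i$ (this notation is introduced here). `X1` keeps only the first two terms of the SVD, so the part that is dropped is the third term:

$$X - X_1 = \sigma_3\,u_3v_3^T \quad\Rightarrow\quad \|X - X_1\|_F = \sigma_3$$

The printed `Diff` should therefore agree with `Sigma[2,2]` up to rounding.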

## An example: How to use Python libraries to compute the pseudo-inverse in one step

In [None]:
# You can use libraries such as NumPy to calculate the pseudo-inverse directly
import numpy
numpy_pinv_x= matrix(numpy.linalg.pinv(X))
print("Parameter estimates computed using numpy: ")
print(vector(numpy_pinv_x * Y))
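
The same cross-check can be sketched in plain NumPy, outside Sage. The data here is synthetic (random heights, weights and responses generated only for illustration, not read from `YoungAdults.csv`), and the variable names are hypothetical. The sketch builds the pseudo-inverse by hand from the thin SVD and compares it with `numpy.linalg.pinv` and `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the design matrix: 20 rows of [1, height, weight]
height = rng.uniform(150, 190, size=20)
weight = rng.uniform(45, 90, size=20)
X = np.column_stack([np.ones(20), height, weight])
y = rng.normal(size=20)  # synthetic responses

# Thin SVD: X = U @ diag(s) @ Vt, with s holding the singular values
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Pseudo-inverse assembled by hand: V * Sigma^+ * U^T
# (all singular values are assumed nonzero, i.e. X has full column rank)
pinv_from_svd = Vt.T @ np.diag(1.0 / s) @ U.T

beta_svd = pinv_from_svd @ y
beta_pinv = np.linalg.pinv(X) @ y
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(beta_svd, beta_pinv), np.allclose(beta_svd, beta_lstsq))
```

For larger problems `numpy.linalg.lstsq` is usually the more practical choice, since it solves the least-squares problem without forming the pseudo-inverse explicitly.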

## Visualize Our Linear Regression

In [None]:
# x = Height, y = Weight and z = HairLength
var('Height, Weight, HairLength')
HairLength = SVD_Beta[0] + SVD_Beta[1] * Height + SVD_Beta[2] * Weight
show(plot3d(HairLength, (Height, 0, 200), (Weight, 0, 80)))