<a href="https://colab.research.google.com/github/kursataker/cng562-machine-learning-spring-19/blob/master/Matrices.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Vector Spaces
--

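A nonempty set $V$ endowed with 
- an addition operation $+$ that sums two elements of $V$
- a scalar multiplication operation $\cdot$ that multiplies elements of $V$ by a real (resp. complex) number

and satisfying the usual axioms (associativity and commutativity of addition, existence of a zero vector and of additive inverses, and compatibility and distributivity of scalar multiplication) is called a real (resp. complex) vector space.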

Examples
--
- $\mathbb{R}^D$
- $C[0,1]$, the set of continuous real-valued functions on the interval $[0,1]$
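
As a small illustration, vectors in $\mathbb{R}^D$ can be modelled as NumPy arrays, with `+` and scalar `*` playing the role of the two operations. The sketch below spot-checks a few vector space axioms on random vectors; the variable names are illustrative, and a numerical check like this is a sanity test, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4
u, v, w = rng.normal(size=(3, D))  # three random vectors in R^D
a, b = 2.0, -0.5                   # two real scalars

# Spot-check some of the vector space axioms numerically.
commutative = np.allclose(u + v, v + u)
associative = np.allclose((u + v) + w, u + (v + w))
distributive = np.allclose(a * (u + v), a * u + a * v)
scalar_compatible = np.allclose((a * b) * u, a * (b * u))
has_zero = np.allclose(u + np.zeros(D), u)

print(commutative, associative, distributive, scalar_compatible, has_zero)
```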


Linear Transformations
--

A function $L:V \to W$ from the vector space $V$ to the vector space $W$ is called a *linear transformation* if 
$$ L(cv_1+v_2) = cL(v_1)+L(v_2)$$
for all $v_1, v_2 \in V$ and $c\in \mathbb{R}$.

Examples
--
- $L(x_1, x_2, x_3)=(0, 3x_1-x_2+\pi x_3)$ is a linear transformation from $\mathbb{R}^3$ to $\mathbb{R}^2$.
- The transformation $T$ defined by $Tf=\int_0^1 f(t)\,dt$ is a linear transformation from $C[0,1]$ to $\mathbb{R}$.
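
The first example can be checked numerically. The sketch below implements $L(x_1, x_2, x_3)=(0, 3x_1-x_2+\pi x_3)$ as a Python function, tests the defining identity on random inputs, and compares it with multiplication by a $2\times 3$ matrix `M`. The names `L`, `M`, `v1`, `v2` and `c` are chosen here for illustration.

```python
import numpy as np

def L(x):
    """L(x1, x2, x3) = (0, 3*x1 - x2 + pi*x3), a map from R^3 to R^2."""
    x1, x2, x3 = x
    return np.array([0.0, 3 * x1 - x2 + np.pi * x3])

# The same map written as a 2x3 matrix acting on column vectors.
M = np.array([[0.0, 0.0, 0.0],
              [3.0, -1.0, np.pi]])

rng = np.random.default_rng(1)
v1, v2 = rng.normal(size=(2, 3))
c = 2.5

# Defining property: L(c*v1 + v2) == c*L(v1) + L(v2)
linear_ok = np.allclose(L(c * v1 + v2), c * L(v1) + L(v2))
# Agreement with the matrix representation
matrix_ok = np.allclose(L(v1), M @ v1)

print(linear_ok, matrix_ok)
```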


Inner Product
--

A symmetric bilinear function $\langle\cdot\,,\cdot\rangle: V\times V\to\mathbb{R}$ is called an *inner product* if $\langle v\,,v\rangle > 0$ for all nonzero vectors $v\in V$. Here symmetric and bilinear mean that
- $\langle cv_1+v_2\,, v_3\rangle=c\langle v_1\,, v_3\rangle+\langle v_2\,, v_3\rangle$
- $\langle v_1\,, v_2\rangle = \langle v_2\,, v_1\rangle$

for all $v_1, v_2, v_3 \in V$ and $c\in \mathbb{R}$.

Examples
--
- Usual dot product on $\mathbb{R}^D$: $\langle v_1\,, v_2\rangle = v_1^T v_2$.
- $\langle f\,, g\rangle = \int_0^1 f(x)g(x)dx$ for $f, g \in C[0,1]$.
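
Both examples can be sketched in NumPy. The dot product is computed directly. The integral inner product is approximated with a midpoint rule, which is an assumption made here for illustration and not the only way to approximate the integral. The functions `f` and `g` are arbitrary choices.

```python
import numpy as np

# Dot product on R^3
v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, -5.0, 6.0])
dot = v1 @ v2  # 1*4 + 2*(-5) + 3*6 = 12

# Inner product on C[0,1], approximated by a midpoint rule
def f(t):
    return t

def g(t):
    return t ** 2

n = 10_000
x = (np.arange(n) + 0.5) / n      # midpoints of n equal subintervals of [0, 1]
inner_fg = np.mean(f(x) * g(x))   # approximates the integral of t^3 on [0, 1], which is 1/4

print(dot, inner_fg)
```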


Matrix
--
- Rows: Observations (e.g. people)
- Columns: Features (e.g. weight, height, hair color)

Matrix Operations
--
- Addition
- Multiplication

Determinant and Trace
--

Matrix Inverse
--
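
The determinant, trace and inverse named in the headings above are available in NumPy. This is a minimal sketch on an arbitrary $2\times 2$ matrix `M` chosen for illustration. It also checks two standard facts: the determinant is the product of the eigenvalues, and the trace is their sum.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])

det_M = np.linalg.det(M)   # 2*3 - 1*1 = 5
trace_M = np.trace(M)      # 2 + 3 = 5
M_inv = np.linalg.inv(M)   # exists because det_M != 0

eigenvalues = np.linalg.eigvals(M)

inverse_ok = np.allclose(M @ M_inv, np.eye(2))
det_is_product = np.isclose(det_M, np.prod(eigenvalues))
trace_is_sum = np.isclose(trace_M, np.sum(eigenvalues))

print(det_M, trace_M, inverse_ok, det_is_product, trace_is_sum)
```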


Matrix Problems
--
- Matrix Equation $A\mathbf{x}=\mathbf{y}$
>- Models *Linear Systems*
>- Describes a *Static System* (*System is time-independent*)
>- Matrix $A$ is not necessarily a *square* matrix
>- Vector $\mathbf{y}$ represents the *known response* of the system 
>- Vector $\mathbf{x}$ represents the *unknown input* to the system 
>- Typically *exact solutions* do not exist, so one looks for approximate (e.g. least-squares) solutions

- Eigenvalue Equation $A\mathbf{x}=\lambda \mathbf{x}$
>- Models *Linear Systems*
>- Describes a *Dynamical System* (*System is time-dependent*)
>>- Discrete time system: $\mathbf{x}_n = A^n \mathbf{x}_0.$
>>- Continuous time system: $\mathbf{x}(t)=e^{At}\mathbf{x}(0).$
>- Matrix $A$ is necessarily a *square* matrix
>- Vector $\mathbf{x}$ represents a special state of the system, an *eigenvector*
>- Number $\lambda$, called an *eigenvalue*, is the factor by which $A$ scales the eigenvector $\mathbf{x}$
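
The eigenvalue equation and the discrete time system can be sketched as follows. The matrix `A`, the initial state `x0` and the step count `n` are arbitrary choices for illustration. `np.linalg.eig` returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # eigenvalues are 1 and 3

eigenvalues, eigenvectors = np.linalg.eig(A)

# Column j of `eigenvectors` satisfies A x = lambda_j x.
eig_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)

# Discrete time system: x_n = A^n x_0
x0 = np.array([1.0, 0.0])
n = 5
x_n = np.linalg.matrix_power(A, n) @ x0

# Same state via the eigendecomposition A^n = P D^n P^{-1}
P = eigenvectors
x_n_eig = P @ np.diag(eigenvalues ** n) @ np.linalg.inv(P) @ x0

print(eig_ok, x_n, x_n_eig)
```

Writing $A^n$ through the eigendecomposition shows why eigenvalues govern the long-term behavior: each eigen-direction is simply scaled by $\lambda^n$.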



In [0]:
import numpy as np

In [0]:
A = np.array([[1,2,3],[4,5,6]])

In [0]:
B = np.ones((2,3))

In [0]:
A

array([[1, 2, 3],
       [4, 5, 6]])

In [0]:
B

array([[1., 1., 1.],
       [1., 1., 1.]])

In [0]:
A+B

array([[2., 3., 4.],
       [5., 6., 7.]])

In [0]:
A-B

array([[0., 1., 2.],
       [3., 4., 5.]])

In [0]:
A+2*B

array([[3., 4., 5.],
       [6., 7., 8.]])

In [0]:
A*(1+B)

array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]])

In [0]:
# Recall that `*` above is the element-wise product and `1+B` adds 1 to every entry; these are not matrix products.

In [0]:
np.matmul(A, B)

ValueError: ignored

In [0]:
B.transpose()

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [0]:
A.shape, B.transpose().shape

((2, 3), (3, 2))

In [0]:
np.matmul(A, B.transpose())

array([[ 6.,  6.],
       [15., 15.]])

In [0]:
A @ B.transpose()

array([[ 6.,  6.],
       [15., 15.]])

In [0]:
A

array([[1, 2, 3],
       [4, 5, 6]])

In [0]:
b = np.array([1,1])
b

array([1, 1])

In [0]:
A @ b

ValueError: ignored

### Solve $Ax=b$

In [0]:
np.linalg.solve(A,b)

LinAlgError: ignored

`numpy.linalg.solve` needs matrix $A$ to be a square matrix.

Solve $A\mathbf{x}=\mathbf{b}$ approximately.

In [0]:
x = np.linalg.lstsq(A, b, rcond=None)  # returns (solution, residuals, rank, singular values)
x

(array([-5.00000000e-01,  8.32667268e-17,  5.00000000e-01]),
 array([], dtype=float64),
 2,
 array([9.508032  , 0.77286964]))

In [0]:
len(x)

4

In [0]:
x[0]

array([-5.00000000e-01,  8.32667268e-17,  5.00000000e-01])

In [0]:
x[1]

array([], dtype=float64)

In [0]:
A @ x[0], b

(array([1., 1.]), array([1, 1]))
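
Because $A$ here has more columns than rows, the system has infinitely many exact solutions, and `lstsq` returns the one with the smallest Euclidean norm. The sketch below compares it with the pseudoinverse solution and with another exact solution obtained by adding a null space vector. The vector `null_vec` was picked by hand for this particular `A`.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([1, 1])

# Unpack the four return values of lstsq.
solution, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

# The Moore-Penrose pseudoinverse gives the same minimum-norm solution.
x_pinv = np.linalg.pinv(A) @ b

# Any null space vector can be added without changing A @ x.
null_vec = np.array([1.0, -2.0, 1.0])  # A @ null_vec == 0 for this A
other = solution + null_vec

print(solution, x_pinv)
print(A @ other, np.linalg.norm(solution), np.linalg.norm(other))
```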

In [0]:
np.eye(2)

array([[1., 0.],
       [0., 1.]])

In [0]:
np.exp(np.eye(2))

array([[2.71828183, 1.        ],
       [1.        , 2.71828183]])

In [0]:
import scipy.linalg  # import the submodule explicitly; plain `import scipy` may not load it

In [0]:
scipy.linalg.expm(np.eye(2))

array([[2.71828183, 0.        ],
       [0.        , 2.71828183]])
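
The two outputs above differ because `np.exp` exponentiates each entry, while the matrix exponential is defined by the power series $e^{M}=\sum_k M^k/k!$. The sketch below uses a truncated series written only for illustration (the helper `expm_taylor` is hypothetical and much less robust than `scipy.linalg.expm`) and applies it to a matrix whose exponential is a known rotation.

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Truncated power series sum_{k < terms} M^k / k! (illustrative only)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k      # term now holds M^k / k!
        result = result + term
    return result

M = np.array([[0.0, 1.0],
              [-1.0, 0.0]])      # M @ M == -I

E = expm_taylor(M)
# Since M^2 = -I, the series sums to cos(1) * I + sin(1) * M.
expected = np.cos(1.0) * np.eye(2) + np.sin(1.0) * M

elementwise = np.exp(M)          # entry-wise exponential, a different object

print(np.allclose(E, expected), np.allclose(elementwise, E))
```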

In [0]:
A

array([[1, 2, 3],
       [4, 5, 6]])

In [0]:
A.transpose(), A.transpose().shape

(array([[1, 4],
        [2, 5],
        [3, 6]]), (3, 2))

Families of Square Matrices
--
- Diagonal: The only non-zero entries of $A$ are along the diagonal.
- Diagonalizable: There are an invertible matrix $P$ and a diagonal matrix $D$ such that $A=PDP^{-1}$.
- Symmetric: $A^T=A$.
- Positive Definite: $\mathbf{x}^TA\mathbf{x}>0$ for all $\mathbf{x}\neq 0$.
- Positive Semi-Definite: $\mathbf{x}^TA\mathbf{x}\geq 0$ for all $\mathbf{x}$.
- Orthogonal: $AA^T=Id=A^TA.$
- Positive: All entries of $A$ are $>0$.
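
Several of these families can be tested numerically. The helper functions below are hypothetical names introduced for this sketch. The definiteness checks assume a symmetric input and use the fact that a symmetric matrix is positive (semi-)definite exactly when all of its eigenvalues are positive (non-negative). The tolerance `tol` is an arbitrary choice.

```python
import numpy as np

def is_symmetric(M, tol=1e-10):
    return np.allclose(M, M.T, atol=tol)

def is_orthogonal(M, tol=1e-10):
    identity = np.eye(M.shape[0])
    return (np.allclose(M @ M.T, identity, atol=tol)
            and np.allclose(M.T @ M, identity, atol=tol))

def is_positive_definite(M, tol=1e-10):
    # Assumes M is symmetric; eigvalsh returns its real eigenvalues.
    return is_symmetric(M) and bool(np.all(np.linalg.eigvalsh(M) > tol))

def is_positive_semidefinite(M, tol=1e-10):
    return is_symmetric(M) and bool(np.all(np.linalg.eigvalsh(M) >= -tol))

def is_positive(M):
    return bool(np.all(M > 0))

S = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3
Z = np.array([[1.0, 1.0], [1.0, 1.0]])   # eigenvalues 0 and 2
Q = np.array([[0.0, -1.0], [1.0, 0.0]])  # rotation by 90 degrees

print(is_positive_definite(S), is_positive_definite(Z),
      is_positive_semidefinite(Z), is_orthogonal(Q), is_positive(S))
```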

Examples
--

### Diagonal/Diagonalizable

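- The $N\times N$ zero matrix and the $N\times N$ identity matrix are diagonal.
- A randomly chosen square matrix (entries drawn from a continuous distribution) has distinct eigenvalues with probability 1, and is therefore diagonalizable over $\mathbb{C}$.
- All real symmetric matrices are diagonalizable.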

### Symmetric

- $B+B^T$
- $BB^T$ and $B^TB$ (positive semi-definite in general; positive definite if $B$ is square and $\det B\neq 0$)
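
A quick numerical check of these constructions, using a random square matrix `B` and a hand-picked singular matrix `B_sing` (both are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(3, 3))

S_sum = B + B.T
S_left = B @ B.T
S_right = B.T @ B

all_symmetric = all(np.allclose(S, S.T) for S in (S_sum, S_left, S_right))

# Smallest eigenvalues of the two products (non-negative up to rounding).
min_eig_left = np.linalg.eigvalsh(S_left).min()
min_eig_right = np.linalg.eigvalsh(S_right).min()

# A singular B gives a product that is only positive semi-definite.
B_sing = np.array([[1.0, 2.0],
                   [2.0, 4.0]])              # det = 0
min_eig_sing = np.linalg.eigvalsh(B_sing @ B_sing.T).min()

print(all_symmetric, min_eig_left, min_eig_right, min_eig_sing)
```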


### Orthogonal
- Rotation Matrices
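
A minimal sketch of a planar rotation matrix, checking the orthogonality condition $AA^T=Id=A^TA$. The helper `rotation` and the angle are chosen here for illustration.

```python
import numpy as np

def rotation(theta):
    """2x2 matrix rotating the plane counter-clockwise by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s, c]])

R = rotation(np.pi / 6)

orthogonal_ok = (np.allclose(R @ R.T, np.eye(2))
                 and np.allclose(R.T @ R, np.eye(2)))
det_R = np.linalg.det(R)  # cos^2 + sin^2 = 1

# Orthogonal matrices preserve lengths.
v = np.array([3.0, 4.0])
length_preserved = np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))

print(orthogonal_ok, det_R, length_preserved)
```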


## Matrix Decompositions

Important Algorithms | Matrix Factorization
---|---
Row Reduction | LU decomposition, Cholesky Decomposition
Diagonalization | Eigendecomposition
Generalization of Diagonalization | Singular Value Decomposition
Orthogonalization | QR decomposition
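
Most of the factorizations in the table are available in `numpy.linalg`. The sketch below computes each on small random matrices and checks that the factors multiply back to the original. LU decomposition is omitted because NumPy itself does not provide it (SciPy offers `scipy.linalg.lu`). The matrices `B` and `S` are arbitrary test inputs.

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.normal(size=(4, 3))
S = B.T @ B + np.eye(3)  # symmetric positive definite by construction

# Cholesky: S = C C^T with C lower triangular
C = np.linalg.cholesky(S)
# Eigendecomposition of a symmetric matrix: S = V diag(evals) V^T
evals, V = np.linalg.eigh(S)
# Singular value decomposition: B = U diag(s) Vt
U, s, Vt = np.linalg.svd(B, full_matrices=False)
# QR: B = Q R with orthonormal columns in Q and upper triangular R
Q, R = np.linalg.qr(B)

chol_ok = np.allclose(C @ C.T, S)
eig_ok = np.allclose(V @ np.diag(evals) @ V.T, S)
svd_ok = np.allclose(U @ np.diag(s) @ Vt, B)
qr_ok = np.allclose(Q @ R, B)

print(chol_ok, eig_ok, svd_ok, qr_ok)
```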

In [0]:
from sklearn import datasets

In [0]:
diabetes = datasets.load_diabetes()

In [0]:
print(diabetes['DESCR'])

.. _diabetes_dataset:

Diabetes dataset
----------------

Ten baseline variables, age, sex, body mass index, average blood
pressure, and six blood serum measurements were obtained for each of n =
442 diabetes patients, as well as the response of interest, a
quantitative measure of disease progression one year after baseline.

**Data Set Characteristics:**

  :Number of Instances: 442

  :Number of Attributes: First 10 columns are numeric predictive values

  :Target: Column 11 is a quantitative measure of disease progression one year after baseline

  :Attribute Information:
      - Age
      - Sex
      - Body mass index
      - Average blood pressure
      - S1
      - S2
      - S3
      - S4
      - S5
      - S6

Note: Each of these 10 feature variables have been mean centered and scaled by the standard deviation times `n_samples` (i.e. the sum of squares of each column totals 1).

Source URL:
http://www4.stat.ncsu.edu/~boos/var.select/diabetes.html

For more information see:
Brad

In [0]:
diabetes.keys()

dict_keys(['data', 'target', 'DESCR', 'feature_names', 'data_filename', 'target_filename'])

In [0]:
diabetes['data'].shape

(442, 10)

In [0]:
diabetes['target'].shape

(442,)
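
The dataset description printed above notes that each feature column is mean centered and scaled so that its sum of squares totals 1. The sketch below applies that kind of normalization to synthetic data of the same shape. The array `raw` is a made-up stand-in and not the diabetes data, and dividing by the column's Euclidean norm is an assumption about how such a scaling can be reproduced.

```python
import numpy as np

rng = np.random.default_rng(4)
raw = rng.normal(loc=50.0, scale=10.0, size=(442, 10))  # synthetic stand-in data

centered = raw - raw.mean(axis=0)                     # mean center each column
scaled = centered / np.linalg.norm(centered, axis=0)  # unit Euclidean norm per column

col_means = scaled.mean(axis=0)
col_sum_sq = (scaled ** 2).sum(axis=0)

print(np.allclose(col_means, 0.0), np.allclose(col_sum_sq, 1.0))
```

With the real data loaded as `X`, the analogous check would be to inspect `X.mean(axis=0)` and `(X ** 2).sum(axis=0)`.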

In [0]:
X = diabetes['data']

In [0]:
X[0,:]

array([ 0.03807591,  0.05068012,  0.06169621,  0.02187235, -0.0442235 ,
       -0.03482076, -0.04340085, -0.00259226,  0.01990842, -0.01764613])

In [0]:
X[:, 0]

array([ 0.03807591, -0.00188202,  0.08529891, -0.08906294,  0.00538306,
       -0.09269548, -0.04547248,  0.06350368,  0.04170844, -0.07090025,
       -0.09632802,  0.02717829,  0.01628068,  0.00538306,  0.04534098,
       -0.05273755, -0.00551455,  0.07076875, -0.0382074 , -0.02730979,
       -0.04910502, -0.0854304 , -0.0854304 ,  0.04534098, -0.06363517,
       -0.06726771, -0.10722563, -0.02367725,  0.05260606,  0.06713621,
       -0.06000263, -0.02367725,  0.03444337,  0.03081083,  0.01628068,
        0.04897352,  0.01264814, -0.00914709, -0.00188202, -0.00188202,
        0.00538306, -0.09996055, -0.06000263,  0.01991321,  0.04534098,
        0.02717829, -0.05637009, -0.07816532,  0.06713621, -0.04183994,
        0.03444337,  0.05987114, -0.05273755, -0.00914709, -0.04910502,
       -0.04183994, -0.04183994, -0.02730979,  0.04170844,  0.06350368,
       -0.07090025, -0.04183994, -0.02730979, -0.03457486,  0.06713621,
       -0.04547248, -0.00914709,  0.04170844,  0.03807591,  0.01

In [0]:
diabetes['target']

array([151.,  75., 141., 206., 135.,  97., 138.,  63., 110., 310., 101.,
        69., 179., 185., 118., 171., 166., 144.,  97., 168.,  68.,  49.,
        68., 245., 184., 202., 137.,  85., 131., 283., 129.,  59., 341.,
        87.,  65., 102., 265., 276., 252.,  90., 100.,  55.,  61.,  92.,
       259.,  53., 190., 142.,  75., 142., 155., 225.,  59., 104., 182.,
       128.,  52.,  37., 170., 170.,  61., 144.,  52., 128.,  71., 163.,
       150.,  97., 160., 178.,  48., 270., 202., 111.,  85.,  42., 170.,
       200., 252., 113., 143.,  51.,  52., 210.,  65., 141.,  55., 134.,
        42., 111.,  98., 164.,  48.,  96.,  90., 162., 150., 279.,  92.,
        83., 128., 102., 302., 198.,  95.,  53., 134., 144., 232.,  81.,
       104.,  59., 246., 297., 258., 229., 275., 281., 179., 200., 200.,
       173., 180.,  84., 121., 161.,  99., 109., 115., 268., 274., 158.,
       107.,  83., 103., 272.,  85., 280., 336., 281., 118., 317., 235.,
        60., 174., 259., 178., 128.,  96., 126., 28