# [Linear Algebra for Data Scientists](https://www.kaggle.com/mjbahmani/linear-algebra-for-data-scientists)

Let's go over the basic linear algebra concepts that data scientists need!
![](https://camo.githubusercontent.com/e42ea0e40062cc1e339a6b90054bfbe62be64402/68747470733a2f2f63646e2e646973636f72646170702e636f6d2f6174746163686d656e74732f3339313937313830393536333530383733382f3434323635393336333534333331383532382f7363616c61722d766563746f722d6d61747269782d74656e736f722e706e67)

In [1]:
# Import
import matplotlib.patches as patch
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy import linalg
from numpy import poly1d
from sklearn import svm
import tensorflow as tf
import pandas as pd
import numpy as np
import glob
import sys
import os

In [2]:
# Setup
%matplotlib inline
%precision 4
plt.style.use('ggplot')
np.set_printoptions(suppress=True)

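### What is Linear Algebra?
![](https://wikimedia.org/api/rest_v1/media/math/render/svg/f4f0f2986d54c01f3bccf464d266dfac923c80f3)Linear algebra is the branch of mathematics that deals with linear equations such as the one above, and it is central to almost every area of mathematics.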
![](https://upload.wikimedia.org/wikipedia/commons/thumb/2/2f/Linear_subspaces_with_shading.svg/800px-Linear_subspaces_with_shading.svg.png)

In [3]:
# 3-D array (rank-3 tensor) of zeros with shape (2, 3, 4)
a = np.zeros((2,3,4))
a

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [4]:
# vectors stored as plain Python lists
x = [1,2,3]
y = [4,5,6]
print(type(x),x)

<class 'list'> [1, 2, 3]


In [10]:
# list concatenation: + on Python lists joins them, it does not add element-wise
print(type(x+y),x+y)

<class 'list'> [1, 2, 3, 4, 5, 6]


In [11]:
# vector add
print(type(np.add(x,y)),np.add(x,y))

<class 'numpy.ndarray'> [5 7 9]


In [12]:
# vector Cross Product
print(type(np.cross(x,y)),np.cross(x,y))

<class 'numpy.ndarray'> [-3  6 -3]


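### What is Vectorization?
![](https://wikimedia.org/api/rest_v1/media/math/render/svg/30ca6a8b796fd3a260ba3001d9875e990baad5ab)
* A linear transformation that converts a matrix into a column vector.
* Vectorizing an m × n matrix A (stacking its columns) gives an mn × 1 vector.
* A vector of length n can be treated as a point in n-dimensional space.
    * Euclidean distance: measures the distance between two points
    * Cosine similarity: measures how similar two vectors are in direction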
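The points above can be sketched in NumPy. This is a minimal illustration rather than part of the original notebook: the matrix `A` and the vectors `p` and `q` are made-up examples, and column-major flattening (`order='F'`) is assumed because vec(A) stacks columns.

```python
import numpy as np

# Hypothetical 2 x 3 matrix used only for illustration
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# vec(A): stack the columns of A into one (m*n) x 1 column vector
vec_A = A.flatten(order='F').reshape(-1, 1)
print(vec_A.ravel())   # [1 4 2 5 3 6]
print(vec_A.shape)     # (6, 1)

# Two hypothetical vectors treated as points in 3-dimensional space
p = np.array([1.0, 2.0, 3.0])
q = np.array([4.0, 5.0, 6.0])

# Euclidean distance: length of the difference vector
euclidean = np.linalg.norm(p - q)

# Cosine similarity: dot product divided by the product of the lengths
cosine = np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

print(euclidean)
print(cosine)
```

Here the two points are about 5.2 units apart, yet the cosine similarity is close to 1 because the vectors point in nearly the same direction.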

### Notation
![](http://s8.picofile.com/file/8349058626/la.png)

### Matrix Operations
![](https://cdn.britannica.com/06/77706-004-31EE92F3.jpg)

In [20]:
# initializing matrices 
x = np.array([[1, 2], [4, 5]]) 
y = np.array([[7, 8], [9, 10]])

In [14]:
# Add matrices (element-wise)
print(np.add(x,y)) 

[[ 8 10]
 [13 15]]


In [17]:
# Subtract matrices (element-wise)
print(np.subtract(x,y))

[[-6 -6]
 [-5 -5]]


In [18]:
# Divide matrices (element-wise)
print(np.divide(x,y))

[[0.1429 0.25  ]
 [0.4444 0.5   ]]


In [21]:
# Multiply matrices element-wise (Hadamard product, not the matrix product)
print(np.multiply(x,y))

[[ 7 16]
 [36 50]]
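
`np.multiply` works entry by entry, which is not the same as the matrix product. The short sketch below reuses the same `x` and `y` values to contrast the two; it is an added illustration, not a cell from the original notebook.

```python
import numpy as np

x = np.array([[1, 2], [4, 5]])
y = np.array([[7, 8], [9, 10]])

# Element-wise (Hadamard) product: each entry is x[i, j] * y[i, j]
hadamard = np.multiply(x, y)   # same as x * y for ndarrays
print(hadamard)

# Matrix product: each entry is a row of x dotted with a column of y
product = x @ y                # same as np.matmul(x, y)
print(product)
```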


### Dot Product Between Vectors
![](http://gamedevelopertips.com/wp-content/uploads/2017/11/image8.png)

In [35]:
# Vector cross product
x = [1,2,3]
y = [4,5,6]
np.cross(x,y)

array([-3,  6, -3])

In [47]:
# Vector dot product (inner product)
x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])
np.dot(x,y)

70
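
As a quick check on what the two products mean, the sketch below recomputes the dot product as a sum of element-wise products and confirms that the cross product is perpendicular to both of its inputs. The names `manual_dot`, `u`, `v`, and `w` are introduced here only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])

# The dot product is the sum of the element-wise products
manual_dot = sum(xi * yi for xi, yi in zip(x, y))
print(manual_dot, np.dot(x, y))

# The cross product of two 3-D vectors is perpendicular to both inputs,
# so its dot product with either input is zero
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
w = np.cross(u, v)
print(np.dot(w, u), np.dot(w, v))
```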

In [55]:
# Matrix transpose: three equivalent ways to turn a 1x4 row into a 4x1 column
x = np.array([[1, 2, 3, 4]])
y = np.array([[5, 6, 7, 8]])
print(y.T)
print(np.reshape(y,(4,1)))
print(y.transpose())

[[5]
 [6]
 [7]
 [8]]
[[5]
 [6]
 [7]
 [8]]
[[5]
 [6]
 [7]
 [8]]


In [59]:
# Matrix Dot Product 
print(np.dot(x,y.T))
print(np.dot(x,y.T)[0][0])

[[70]]
70


In [57]:
# Matrix Outer Product
print(x.T * y)
print(np.outer(x,y))

[[ 5  6  7  8]
 [10 12 14 16]
 [15 18 21 24]
 [20 24 28 32]]
[[ 5  6  7  8]
 [10 12 14 16]
 [15 18 21 24]
 [20 24 28 32]]


In [60]:
# Matrix-vector product
a = np.array([[5,1,3],[1,1,1],[1,2,1]])
b = np.array([1,2,3])
print(a.dot(b))

[16  6  8]
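
One way to read a matrix-vector product is as a weighted sum of the matrix's columns, with the vector's entries as the weights. The sketch below, added for illustration, rebuilds the result above that way; `combo` is a name introduced here.

```python
import numpy as np

a = np.array([[5, 1, 3], [1, 1, 1], [1, 2, 1]])
b = np.array([1, 2, 3])

# Weighted sum of the columns of a, with the entries of b as weights
combo = sum(b[j] * a[:, j] for j in range(a.shape[1]))
print(combo)

# Matches the built-in matrix-vector product
print(np.array_equal(combo, a.dot(b)))
```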


In [61]:
# Matrix-Matrix Product
a = [[1,0],[0,1]]
b = [[4,1],[2,2]]
np.matmul(a,b)

array([[4, 1],
       [2, 2]])

In [62]:
# Matrix-matrix addition (np.matrix is used here; NumPy recommends regular arrays for new code)
matrix1 = np.matrix(a)
matrix2 = np.matrix(b)
matrix1 + matrix2

matrix([[5, 1],
        [2, 3]])

In [63]:
# Matrix-Matrix Sub
matrix1 - matrix2

matrix([[-3, -1],
        [-2, -1]])

In [65]:
# Matrix-matrix product (for np.matrix objects, * is the matrix product, not element-wise)
print(np.dot(matrix1, matrix2))
print(matrix1 * matrix2)

[[4 1]
 [2 2]]
[[4 1]
 [2 2]]


### Identity Matrix
numpy.identity(n, dtype=None)

In [66]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [68]:
identy = np.array([[21,5,7],[9,8,16]])
print(identy.shape)
print(np.identity(identy.shape[1], dtype="int"))
print(np.identity(identy.shape[0], dtype="int"))

(2, 3)
[[1 0 0]
 [0 1 0]
 [0 0 1]]
[[1 0]
 [0 1]]
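
The two identity matrices printed above matter because a non-square matrix has a different identity on each side. A minimal sketch, assuming the same 2 x 3 values and using the hypothetical names `M`, `I_right`, and `I_left`:

```python
import numpy as np

M = np.array([[21, 5, 7], [9, 8, 16]])        # same 2 x 3 values as above

I_right = np.identity(M.shape[1], dtype=int)  # 3 x 3, multiplies on the right
I_left = np.identity(M.shape[0], dtype=int)   # 2 x 2, multiplies on the left

# Multiplying by the identity of the matching size leaves M unchanged
print(np.array_equal(M @ I_right, M))
print(np.array_equal(I_left @ M, M))
```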


In [75]:
# Inverse matrix: the matrix that, when multiplied by X, gives the identity matrix
inverse = np.linalg.inv(matrix2)
print(inverse)
print(matrix2)

[[ 0.3333 -0.1667]
 [-0.3333  0.6667]]
[[4 1]
 [2 2]]
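
The definition can be checked directly: multiplying a matrix by its inverse, in either order, should give the identity up to floating-point rounding. This sketch uses a plain `ndarray` named `B` with the same values as `matrix2`; that substitution is an assumption made to keep the example self-contained.

```python
import numpy as np

B = np.array([[4, 1], [2, 2]])   # same values as matrix2
B_inv = np.linalg.inv(B)

# Both products should equal the 2 x 2 identity, up to rounding error
print(np.allclose(B @ B_inv, np.identity(2)))
print(np.allclose(B_inv @ B, np.identity(2)))
```

`np.allclose` is used instead of an exact comparison because the inverse is computed in floating point.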


## Diagonal Matrix

In [77]:
A = np.array([[0,   1,  2,  3],
              [4,   5,  6,  7],
              [8,   9, 10, 11],
              [12, 13, 14, 15]])
print(np.diag(A))
print(np.diag(A,k=1))
print(np.diag(A,k=-1))

[ 0  5 10 15]
[ 1  6 11]
[ 4  9 14]
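
The cell above uses `np.diag` to extract diagonals from a 2-D array. Given a 1-D array, the same function builds a diagonal matrix instead. The sketch below is an added illustration; `d` and `D` are names introduced here.

```python
import numpy as np

d = np.array([0, 5, 10, 15])   # the main diagonal extracted above

D = np.diag(d)                 # 1-D input: build a 4 x 4 diagonal matrix
print(D)

# Applying np.diag twice to a square matrix zeroes its off-diagonal entries
A = np.arange(16).reshape(4, 4)
print(np.array_equal(np.diag(np.diag(A)), D))
```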


## Transpose of a Matrix
Swaps the rows and columns of a matrix.

In [78]:
a = np.array([[1, 2], [3, 4]])
print(a)
print(a.transpose())

[[1 2]
 [3 4]]
[[1 3]
 [2 4]]
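
Two standard transpose identities can be verified numerically: transposing twice returns the original matrix, and the transpose of a product reverses the order of the factors. The second matrix `b` below is a made-up example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [5, 6]])   # hypothetical second matrix

print(np.array_equal(a.T.T, a))               # (A^T)^T = A
print(np.array_equal((a @ b).T, b.T @ a.T))   # (AB)^T = B^T A^T
```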


## Symmetric Matrix
![](https://wikimedia.org/api/rest_v1/media/math/render/svg/ad8a5a3a4c95de6f7f50b0a6fb592d115fe0e95f)

In [80]:
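N = 100
# randint excludes the upper bound, so 2001 keeps the range at [-2000, 2000];
# it replaces the deprecated np.random.random_integers
b = np.random.randint(-2000, 2001, size=(N, N))
# Averaging b with its transpose gives a symmetric matrix
b_symm = (b + b.T)/2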

In [81]:
b_symm

array([[  939. ,   -31. ,  -257. , ...,  -610. ,  1153. ,   525.5],
       [  -31. ,  1738. ,  -327.5, ...,  -173.5,  -963.5,  -593.5],
       [ -257. ,  -327.5,  1470. , ...,  -211. ,    82. ,   714. ],
       ...,
       [ -610. ,  -173.5,  -211. , ..., -1883. ,    91.5,  -566.5],
       [ 1153. ,  -963.5,    82. , ...,    91.5, -1429. ,  -874. ],
       [  525.5,  -593.5,   714. , ...,  -566.5,  -874. ,  -505. ]])
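
A matrix is symmetric when it equals its own transpose, which is easy to confirm in code. The sketch below repeats the construction on a small 5 x 5 matrix; the seeded `np.random.default_rng` generator is an assumption chosen for reproducibility and is not what the original cell uses.

```python
import numpy as np

rng = np.random.default_rng(0)                # seeded only for reproducibility
b = rng.integers(-2000, 2001, size=(5, 5))    # upper bound is exclusive

b_symm = (b + b.T) / 2                        # symmetric part of b

# Symmetric means b_symm[i, j] == b_symm[j, i] for every i, j
print(np.array_equal(b_symm, b_symm.T))
```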

## The Trace
The sum of the elements on a matrix's main diagonal.

In [83]:
print(np.eye(3))
print(np.trace(np.eye(3)))

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
3.0
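
For a less trivial check than the identity matrix, the sketch below compares `np.trace` with an explicit sum of the diagonal and illustrates the identity tr(AB) = tr(BA). The matrices `A` and `B` are made-up examples.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [5, 6]])   # hypothetical second matrix

# The trace is the sum of the main diagonal
print(np.trace(A), np.diag(A).sum())

# tr(AB) = tr(BA), even though AB and BA are different matrices
print(np.trace(A @ B), np.trace(B @ A))
```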


In [84]:
# Determinant of matrix2: 4*2 - 1*2 = 6
det = np.linalg.det(matrix2)
print(det)
print(matrix2)

6.0
[[4 1]
 [2 2]]
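
The cell above computes a determinant. For a 2 x 2 matrix [[a, b], [c, d]] the determinant is ad - bc, and a determinant of zero means the matrix has no inverse. The sketch below checks both points; `B` is a plain array with the same values as `matrix2`, and `S` is a made-up singular matrix.

```python
import numpy as np

B = np.array([[4, 1], [2, 2]])   # same values as matrix2

# 2 x 2 determinant by the formula ad - bc
manual_det = B[0, 0] * B[1, 1] - B[0, 1] * B[1, 0]
print(manual_det)
print(np.isclose(np.linalg.det(B), manual_det))

# A matrix with linearly dependent rows has determinant 0 and no inverse
S = np.array([[1, 2], [2, 4]])   # hypothetical singular matrix
print(np.isclose(np.linalg.det(S), 0.0))
```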


## Norms
