## 2. Gram-Schmidt 

### 2.1. Overview
  
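  - Orthogonalization is the process of finding an orthonormal basis, that is, a set of mutually orthogonal unit vectors, for a vector space. It is used frequently in most fields that rely on numerical linear algebra.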
  
  - The Gram-Schmidt process is one orthogonalization method; it proceeds by the following algorithm.
  
    <img src = "images/image12.png" width="70%" height="70%">

    <img src = "images/image13.png" width="70%" height="70%">

  - Written out term by term, this gives:

  $$\bf{u} _1 = \bf{v} _1$$
  
  $$\bf{u} _2 = \bf{v} _2 - \frac{\langle \bf{v _2}, \bf{u}_{1} \rangle}{\left\| \bf{u}_{1} \right\| ^{2}} \bf{u}_{1} $$

  $$\bf{u} _3 = \bf{v} _3 - \frac{\langle \bf{v _3}, \bf{u}_{2} \rangle}{\left\| \bf{u}_{2} \right\| ^{2}} \bf{u}_{2}-\frac{\langle \bf{v _3}, \bf{u}_{1} \rangle}{\left\| \bf{u}_{1} \right\| ^{2}} \bf{u}_{1}$$
    
  $$ \vdots $$
  
  $$ \bf{u} _n 
    = \bf{v} _n - \frac{\langle \bf{v _n}, \bf{u}_{n-1} \rangle}{\left\| \bf{u}_{n-1} \right\| ^{2}} \bf{u}_{n-1}
    - \frac{\langle \bf{v _n}, \bf{u}_{n-2} \rangle}{\left\| \bf{u}_{n-2} \right\| ^{2}} \bf{u}_{n-2}
    - \cdots 
    - \frac{\langle \bf{v _n}, \bf{u}_{1} \rangle}{\left\| \bf{u}_{1} \right\| ^{2}} \bf{u}_{1}$$
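The expanded formulas translate almost line by line into NumPy. The following is a minimal sketch, not the notebook's own implementation: the function name `gram_schmidt` and the random test data are assumptions made for illustration, and the vectors are stored as rows. It returns the unnormalized $\bf{u}_k$, exactly as in the formulas.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize the rows of `vectors` using the expanded formulas.

    Hypothetical helper for illustration; returns the unnormalized
    u_1, ..., u_n as rows.
    """
    V = np.asarray(vectors, dtype=np.float64)
    U = np.zeros_like(V)
    for n in range(V.shape[0]):
        U[n] = V[n]
        for k in range(n):
            # Subtract the projection of v_n onto u_k:
            # <v_n, u_k> / ||u_k||^2 * u_k
            U[n] -= np.dot(V[n], U[k]) / np.dot(U[k], U[k]) * U[k]
    return U

# Assumed test data: 4 random vectors of dimension 6
rng = np.random.default_rng(0)
V = rng.random((4, 6))
U = gram_schmidt(V)

# All pairwise inner products <u_i, u_j> with i != j should vanish
G = U @ U.T
off_diag = G - np.diag(np.diag(G))
print(np.allclose(off_diag, 0.0))
```

Dividing each row of `U` by its norm afterwards would give an orthonormal set.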

  - Gram-Schmidt process for two-dimensional vectors
  
  <img src = "images/image14.png">
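The two-dimensional case can also be worked through numerically. The vectors `v1` and `v2` below are assumed values chosen for this sketch; they are not taken from the figure.

```python
import numpy as np

# Assumed example vectors (not from the figure)
v1 = np.array([3.0, 1.0])
v2 = np.array([2.0, 2.0])

# u1 = v1
u1 = v1
# u2 = v2 - <v2, u1> / ||u1||^2 * u1; here the coefficient is 8 / 10
u2 = v2 - np.dot(v2, u1) / np.dot(u1, u1) * u1

# Normalizing gives an orthonormal pair
e1 = u1 / np.linalg.norm(u1)
e2 = u2 / np.linalg.norm(u2)

print(u2)              # approximately (-0.4, 1.2)
print(np.dot(u1, u2))  # approximately 0
```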


### 2.2. QR decomposition

  - Decompose $\bf{A}$ as follows:

  $$ \bf{A} = \bf{QR} $$

  - Here the column vectors of $\bf{Q}$ form an orthonormal basis of the space $S$ spanned by the column vectors of $\bf{A}$, and $\bf{R}$ is an upper triangular matrix.
  
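  - Let ${ (\bf{u}_1, \bf{u}_2, \cdots, \bf{u}_n)}$ be the orthonormal vectors obtained by applying the Gram-Schmidt process to the columns of $\bf{A}$ and normalizing each result to unit length. Each column vector ${ \bf{a}_i}$ of $\bf{A}$ then satisfies the following. Because ${ \bf{a}_i}$ lies in the span of the first $i$ vectors, the coefficients with index greater than $i$ are zero, which is why $\bf{R}$ is upper triangular.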
  
  $$ \bf{a} _i
  = \langle \bf{a} _i, \bf{u}_{1} \rangle \bf{u}_{1} + 
  \langle \bf{a} _i, \bf{u}_{2} \rangle \bf{u}_{2} + \cdots +
  \langle \bf{a} _i, \bf{u}_{n} \rangle \bf{u}_{n} $$
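The relation between this expansion and $\bf{R}$ can be checked numerically. The sketch below uses an assumed random $5 \times 3$ matrix and `np.linalg.qr` in its default reduced mode. It checks that the columns of `Q` are orthonormal, that the inner products $\langle \bf{a}_i, \bf{q}_k \rangle$ reproduce `R`, and that `R` is upper triangular.

```python
import numpy as np

# Assumed test matrix: 3 column vectors of dimension 5
rng = np.random.default_rng(0)
A = rng.random((5, 3))

# Reduced QR: Q is 5x3 with orthonormal columns, R is 3x3
Q, R = np.linalg.qr(A)

# Entry (k, i) of Q^T A is the inner product <a_i, q_k>
R_from_inner = Q.T @ A

print(np.allclose(Q.T @ Q, np.eye(3)))   # columns of Q are orthonormal
print(np.allclose(R_from_inner, R))      # the coefficients are the entries of R
print(np.allclose(np.tril(R, -1), 0.0))  # R is upper triangular
print(np.allclose(Q @ R, A))             # A = QR
```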

In [21]:
import numpy as np

# 10 column vectors of dimension 20
N = 10
D = 20
v = np.random.rand(D, N)

np.save("v", v)

In [23]:
# QR decomposition: the columns of Q are orthonormal basis vectors
import numpy as np
np.set_printoptions(linewidth=np.inf)

v = np.load("v.npy")
print("v :")
print(v)

Q, R = np.linalg.qr(v)

print("Q :")
print(Q)

print("R :")
print(R)

# The basis vectors are the columns of Q, so check the norm of a column
print("Norm Q[:, 0] :")
print(np.linalg.norm(Q[:, 0]))

v :
[[0.61448168 0.07235556 0.4389395  0.42730344 0.94697236 0.04387906 0.52548623 0.21785031 0.17991593 0.56850956]
 [0.92986266 0.46505684 0.19566946 0.7762281  0.98902623 0.87480525 0.54677156 0.50492006 0.13778494 0.866273  ]
 [0.30793052 0.39547751 0.16468631 0.07224246 0.19774108 0.26347682 0.12051584 0.77731454 0.16580076 0.44427551]
 [0.96550275 0.39391238 0.61150515 0.13110477 0.14339658 0.02383516 0.0907576  0.27775854 0.82333924 0.91207443]
 [0.46645976 0.63141524 0.90411531 0.69596225 0.18017131 0.57235858 0.71536311 0.57107034 0.58262199 0.12221966]
 [0.50915522 0.15115154 0.96388314 0.36635454 0.98525873 0.6555545  0.4427118  0.84264701 0.8772291  0.19321914]
 [0.24059734 0.29576895 0.07977821 0.46558081 0.09223574 0.39204769 0.37072194 0.08074988 0.01277274 0.77276581]
 [0.85429808 0.9798355  0.96245775 0.02371648 0.2182037  0.98055483 0.80745034 0.77889045 0.53841057 0.88293821]
 [0.55086635 0.60137301 0.26206061 0.74399306 0.74278049 0.47858085 0.42840647 0.27572742 0.

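1. Sequential code
   
  - Convert the column vectors to row vectors.
  - This makes memory access efficient, because each vector then occupies contiguous memory in NumPy's default row-major layout.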

In [25]:
import numpy as np

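# Store each vector as a row so that V[j] is contiguous in memory
V = np.transpose(v).copy()
for j in range(N):
    # Remove the components of V[j] along the already orthonormalized V[0..j-1]
    for i in range(j):
        coef = -np.dot(V[i], V[j])
        V[j] = V[j] + coef * V[i]
    # Normalize V[j] to unit length
    coef = np.linalg.norm(V[j])
    V[j] = V[j] / coef

np.set_printoptions(linewidth=np.inf)
print("V transpose : ")
print(V)

# The orthonormalized vectors are the rows of V, so check the norm of a row
print("Norm ( V[0] ) : ")
print(np.linalg.norm(V[0]))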

V transpose : 
[[ 0.23544719  0.3562898   0.11798786  0.3699458   0.17873054  0.1950899   0.09218822  0.32733618  0.2110721   0.12105036  0.23745621  0.32619763  0.15919549  0.15222175  0.10799139  0.03186767  0.14950309  0.2818894   0.30566781  0.05249678]
 [-0.24667318 -0.15797181  0.09390456 -0.2179003   0.16182996 -0.1491683   0.06535864  0.1894647   0.10373653  0.4043465  -0.0235409  -0.19256324  0.01127239 -0.09968802  0.32790184  0.50868291 -0.02851126  0.16621913  0.04042528  0.3885604 ]
 [ 0.07978279 -0.36205536 -0.14463915 -0.04407944  0.24251831  0.46455595 -0.14313909  0.01248916 -0.23965375 -0.05691462  0.02877124  0.19003568 -0.25938004  0.20904565 -0.24222294  0.25049448  0.361977   -0.1416913   0.02518605  0.25032944]
 [ 0.06208642  0.13049737 -0.14305212 -0.3751843   0.24151523  0.04906853  0.21082054 -0.51160987  0.2441057  -0.06619053  0.08455435  0.09169425  0.2222337   0.3123466   0.22179421 -0.14066351  0.18462567  0.1508432  -0.27517353  0.15148376]
 [ 0.31344672
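One way to validate a loop of this kind is to check that the resulting rows are orthonormal and that they agree with the columns of `Q` from `np.linalg.qr` up to sign. The sketch below is self-contained and uses freshly generated random data rather than the saved `v.npy`, so its numbers will differ from the output shown above.

```python
import numpy as np

# Assumed test data with the same shape as in the notebook
rng = np.random.default_rng(0)
N, D = 10, 20
v = rng.random((D, N))

# Same orthonormalization loop, with the vectors stored as rows
V = v.T.copy()
for j in range(N):
    for i in range(j):
        V[j] -= np.dot(V[i], V[j]) * V[i]
    V[j] /= np.linalg.norm(V[j])

# Rows are orthonormal if and only if V V^T is the identity
gram = V @ V.T
print(np.allclose(gram, np.eye(N)))

# For a full-rank matrix the reduced QR factor is unique up to the sign
# of each column, so compare absolute values
Q, _ = np.linalg.qr(v)
print(np.allclose(np.abs(V), np.abs(Q.T)))
```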

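2. Parallel code

  - Dependency analysis
    - The $u_i$ are computed sequentially: $u_i$ requires $u_{i-1}, \cdots, u_{1}$.
    - Within each $u_i$ and $v_i$, the components can be processed independently, so the components are divided among the processes, as in vector partitioning.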
  
    <img src = "images/image03.png">

  - Decompose along columns instead of rows: split each vector into chunks.
  
    <img src = "images/image25.png">
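The chunked decomposition works because an inner product is a sum over components, so each process can compute a partial inner product on its own chunk and the partial results can be added. The sketch below simulates this without MPI. `para_range_sketch` is a hypothetical stand-in for a block-partition helper and is assumed to return inclusive start and end indices; the actual `para_range` in `tools` may be implemented differently.

```python
import numpy as np

def para_range_sketch(n, size, rank):
    """Hypothetical block partition of indices 0..n-1 among `size` ranks.

    Returns inclusive (ista, iend) for the given rank; the first
    n % size ranks receive one extra element.
    """
    base, rem = divmod(n, size)
    ista = rank * base + min(rank, rem)
    iend = ista + base - 1 + (1 if rank < rem else 0)
    return ista, iend

# Assumed sizes: vectors of dimension 20 split among 3 simulated ranks
D, size = 20, 3
rng = np.random.default_rng(0)
a = rng.random(D)
b = rng.random(D)

# Each simulated rank computes the inner product of its own chunk
partials = []
for rank in range(size):
    ista, iend = para_range_sketch(D, size, rank)
    partials.append(np.dot(a[ista:iend + 1], b[ista:iend + 1]))

# Summing the partial results (the role of a sum reduction) gives the
# full inner product
print(np.isclose(sum(partials), np.dot(a, b)))
```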

In [30]:
%%writefile gs.py
import numpy as np
from mpi4py import MPI
from tools import para_range

np.set_printoptions(linewidth=np.inf)

N = 10
D = 20

comm = MPI.COMM_WORLD

size = comm.Get_size()
rank = comm.Get_rank()

if rank == 0 : 
    v = np.load("v.npy")
    V = np.transpose(v).copy()
else :
    V = np.empty((N, D), dtype = np.float64)

ista, iend = para_range(D, size, rank)

chunk = iend - ista + 1

# Compute the cnts and disp needed for Scatterv
cnts = comm.allgather(chunk)
disp = []
for i in range(size):
    disp.append(sum(cnts[:i]))

V_chunk = np.empty([N, chunk], dtype = np.float64)

# Partition V: every V[i] has to be scattered
for i in range(N):
    comm.Scatterv((V[i], cnts), V_chunk[i], root=0)
    
# Compute on the local chunks, then reduce the partial inner products
for j in range(N):
    for i in range(j):
        coef = -np.dot(V_chunk[i], V_chunk[j])
        coef_all = comm.allreduce(coef)
        V_chunk[j] = V_chunk[j] + coef_all * V_chunk[i]
    coef = np.dot(V_chunk[j], V_chunk[j])
    coef_all = comm.allreduce(coef)
    V_chunk[j] = V_chunk[j] / np.sqrt(coef_all)

# print(V_chunk)

for i in range (N) :
    comm.Gatherv( V_chunk[i], (V[i],cnts), root = 0)

if rank == 0 :
    print(V)
    


Overwriting gs.py


In [31]:
! mpirun -np 4 python gs.py

[[ 0.23544719  0.3562898   0.11798786  0.3699458   0.17873054  0.1950899   0.09218822  0.32733618  0.2110721   0.12105036  0.23745621  0.32619763  0.15919549  0.15222175  0.10799139  0.03186767  0.14950309  0.2818894   0.30566781  0.05249678]
 [-0.24667318 -0.15797181  0.09390456 -0.2179003   0.16182996 -0.1491683   0.06535864  0.1894647   0.10373653  0.4043465  -0.0235409  -0.19256324  0.01127239 -0.09968802  0.32790184  0.50868291 -0.02851126  0.16621913  0.04042528  0.3885604 ]
 [ 0.07978279 -0.36205536 -0.14463915 -0.04407944  0.24251831  0.46455595 -0.14313909  0.01248916 -0.23965375 -0.05691462  0.02877124  0.19003568 -0.25938004  0.20904565 -0.24222294  0.25049448  0.361977   -0.1416913   0.02518605  0.25032944]
 [ 0.06208642  0.13049737 -0.14305212 -0.3751843   0.24151523  0.04906853  0.21082054 -0.51160987  0.2441057  -0.06619053  0.08455435  0.09169425  0.2222337   0.3123466   0.22179421 -0.14066351  0.18462567  0.1508432  -0.27517353  0.15148376]
 [ 0.31344672  0.28628083  0