# Gradient Descent

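- In practice, gradient descent is not written out by hand the way it is below.
- As in earlier examples, it is normally pulled in through a ready-made function or module.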
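As a rough sketch of what "through a module" might look like, the snippet below hands the update step to `torch.optim.SGD`. The choice of `torch.optim.SGD`, the fixed seed, and the `max_iters` cap are assumptions made for illustration; the notes do not say which module is meant.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

target = torch.FloatTensor([[.1, .2, .3],
                            [.4, .5, .6],
                            [.7, .8, .9]])

# x is the tensor being optimized, so it must track gradients.
x = torch.rand_like(target).requires_grad_(True)

# The optimizer applies the update x <- x - lr * x.grad for us.
optimizer = torch.optim.SGD([x], lr=1.)

threshold = 1e-5
max_iters = 1000  # assumption: safety cap, not part of the hand-written loop

for iter_cnt in range(1, max_iters + 1):
    loss = F.mse_loss(x, target)
    if loss.item() <= threshold:
        break
    optimizer.zero_grad()  # clear the gradient left over from the previous step
    loss.backward()        # fill x.grad
    optimizer.step()       # in-place update of x, so no detach is needed

final_loss = F.mse_loss(x, target).item()
print(iter_cnt, final_loss)
```

Because `optimizer.step()` updates `x` in place, `x` stays a leaf tensor and the manual `detach_()` / `requires_grad_(True)` bookkeeping used in the hand-written loop is not required here.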

In [1]:
import torch
import torch.nn.functional as F

In [2]:
target = torch.FloatTensor([[.1, .2, .3],
                            [.4, .5, .6],
                            [.7, .8, .9]])

![image](https://user-images.githubusercontent.com/105966480/210149672-119345e2-ce8c-48a1-beb8-e63405c783a4.png)
- Note: the `.1`, `.2` in the image above are a notation mistake.

In [3]:
x = torch.rand_like(target)
# The final scalar (the loss) will be differentiated with respect to x.
x.requires_grad = True
# After differentiation, the gradient can be read from x.grad.

x

tensor([[0.5931, 0.8655, 0.9750],
        [0.9080, 0.1375, 0.1968],
        [0.9310, 0.0709, 0.2003]], requires_grad=True)
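
To make the `requires_grad` comments in the cell above concrete, here is a small sketch showing that `x.grad` stays empty until `backward()` runs. The function `y = sum(x ** 2)` and the input values are arbitrary choices for illustration, not something used in these notes.

```python
import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
grad_before = x.grad          # None: nothing has been differentiated yet

y = (x ** 2).sum()            # scalar built from x
y.backward()                  # computes dy/dx and stores it in x.grad

print(grad_before)            # None
print(x.grad)                 # dy/dx = 2 * x, i.e. tensor([2., 4., 6.])
```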

---
- The closer x gets to target, the smaller the loss becomes.

![image](https://user-images.githubusercontent.com/105966480/210149713-b257cd21-9bb4-4d1b-ad3b-922c50a62252.png)

In [4]:
loss = F.mse_loss(x, target)

loss

tensor(0.3076, grad_fn=<MseLossBackward0>)
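
With its default settings, `F.mse_loss` averages the squared element-wise differences. As a quick check, the sketch below computes the same value by hand; the seed and the variable names `builtin` and `manual` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # assumption: fixed seed for repeatability

target = torch.FloatTensor([[.1, .2, .3],
                            [.4, .5, .6],
                            [.7, .8, .9]])
x = torch.rand_like(target)

builtin = F.mse_loss(x, target)        # default reduction='mean'
manual = ((x - target) ** 2).mean()    # mean of the squared differences

print(builtin.item(), manual.item())
print(torch.allclose(builtin, manual))
```

The loss is zero only when `x` equals `target` element for element, which is why driving the loss down moves `x` toward `target`.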

- Logic of the loop below
    - Overall flow
        - ![image](https://user-images.githubusercontent.com/105966480/210149762-73ea714d-2b9d-4ebe-b360-1e0ee709036a.png)
    ---
    - while loss > threshold:
        - Repeat for as long as loss is greater than threshold; stop once it is no longer above threshold
    ---
    - loss.backward()
        - ![image](https://user-images.githubusercontent.com/105966480/210149830-498a5c6c-812a-4666-9572-abc43bd5e56b.png)
        - Differentiate loss with respect to x; the result is stored in x.grad
    ---
    - x = x - learning_rate * x.grad
        - ![image](https://user-images.githubusercontent.com/105966480/210149932-8edd8e98-fefc-4049-960a-7b13717e9215.png)
    ---
    - x.detach_()
        - ![image](https://user-images.githubusercontent.com/105966480/210150061-03f86369-8aaa-4a73-812a-c3ef353b7895.png)
        - detach is not fully understood at this point; it will be covered later
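
The steps listed above can be checked one at a time. The sketch below runs a single update by hand: it compares `x.grad` with the closed-form MSE gradient `2 * (x - target) / N`, then shows that the reassigned tensor is no longer a leaf until `detach_()` and `requires_grad_(True)` are applied. The seed and the names `analytic`, `x_new`, `was_leaf`, and `shrunk` are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # assumption: fixed seed for repeatability

target = torch.FloatTensor([[.1, .2, .3],
                            [.4, .5, .6],
                            [.7, .8, .9]])
x = torch.rand_like(target).requires_grad_(True)
learning_rate = 1.

loss = F.mse_loss(x, target)
loss.backward()

# For MSE, d(loss)/dx = 2 * (x - target) / N, where N is the element count.
analytic = 2 * (x.detach() - target) / x.numel()
print(torch.allclose(x.grad, analytic))

# The update builds a new tensor from x, so it belongs to the old graph.
x_new = x - learning_rate * x.grad
was_leaf = x_new.is_leaf           # False: x_new carries a grad_fn

# Cut x_new out of the old graph and make it a fresh leaf that tracks gradients.
x_new.detach_()
x_new.requires_grad_(True)
print(was_leaf, x_new.is_leaf, x_new.grad)

# With learning_rate = 1 and N = 9, every error term shrinks by 1 - 2/9 = 7/9.
shrunk = torch.allclose(x_new.detach() - target,
                        (7 / 9) * (x.detach() - target),
                        atol=1e-6)
print(shrunk)
```

Since each error term shrinks by 7/9 per step, the loss should shrink by about (7/9)^2 ≈ 0.605 per step, which agrees with the losses printed by the loop (for example 3.076e-01 to 1.8607e-01).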

In [5]:
threshold = 1e-5
learning_rate = 1.
iter_cnt = 0

while loss > threshold: # keep iterating while loss is still above threshold
    iter_cnt += 1

    loss.backward() # Calculate gradients: differentiate the scalar loss with respect to x.

    x = x - learning_rate * x.grad # backward() has run, so x.grad (the stored gradient) is available

    # You don't need to understand this yet.
    x.detach_()
    x.requires_grad_(True)
    
    loss = F.mse_loss(x, target)
    
    print('%d-th Loss: %.4e' % (iter_cnt, loss))
    print(x)

1-th Loss: 1.8607e-01
tensor([[0.4836, 0.7176, 0.8250],
        [0.7951, 0.2180, 0.2864],
        [0.8796, 0.2330, 0.3558]], requires_grad=True)
2-th Loss: 1.1256e-01
tensor([[0.3983, 0.6026, 0.7083],
        [0.7073, 0.2807, 0.3561],
        [0.8397, 0.3590, 0.4767]], requires_grad=True)
3-th Loss: 6.8091e-02
tensor([[0.3320, 0.5131, 0.6176],
        [0.6390, 0.3294, 0.4103],
        [0.8087, 0.4570, 0.5708]], requires_grad=True)
4-th Loss: 4.1191e-02
tensor([[0.2805, 0.4435, 0.5470],
        [0.5859, 0.3673, 0.4525],
        [0.7845, 0.5332, 0.6439]], requires_grad=True)
5-th Loss: 2.4918e-02
tensor([[0.2404, 0.3894, 0.4921],
        [0.5446, 0.3968, 0.4852],
        [0.7657, 0.5925, 0.7008]], requires_grad=True)
6-th Loss: 1.5074e-02
tensor([[0.2092, 0.3473, 0.4494],
        [0.5125, 0.4197, 0.5107],
        [0.7511, 0.6386, 0.7451]], requires_grad=True)
7-th Loss: 9.1187e-03
tensor([[0.1849, 0.3146, 0.4162],
        [0.4875, 0.4376, 0.5306],
        [0.7398, 0.6745, 0.7795]], requi