Source

Deep Learning from Scratch 3: https://www.hanbit.co.kr/store/books/look.php?p_code=B6627606922

Source code: https://github.com/WegraLee/deep-learning-from-scratch-3

[DeZero] 1. Automatic Differentiation

1. Variables

Variable

  • A variable and its data are separate things.
  • Data is substituted into, or assigned to, a variable.
  • Looking inside a variable reveals the data it refers to (the variable references the data).

Multidimensional Arrays (Tensors)

  • Dimension, axis: the directions along which elements are ordered in a multidimensional array
  • Scalar: 0-dimensional array
  • Vector: 1-dimensional array
  • Matrix: 2-dimensional array
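The terms above map directly onto NumPy's `ndim` attribute; a quick illustration (this snippet is my own addition, not from the book):

```python
import numpy as np

s = np.array(1.0)                # scalar: 0-dimensional array
v = np.array([1.0, 2.0, 3.0])    # vector: 1-dimensional array
m = np.array([[1, 2], [3, 4]])   # matrix: 2-dimensional array

print(s.ndim, v.ndim, m.ndim)    # 0 1 2
```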

2. Functions

Function

  • Definition: a fixed correspondence that maps one variable to another
  • Composite function: a function built by chaining several functions together
  • Computational graph: a diagram that represents a computation by connecting nodes with arrows; using a computational graph, the derivative with respect to each variable can be computed efficiently.
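A composite function and its computational graph can be sketched in plain Python; the function names A, B, C below are illustrative assumptions, not names from the book:

```python
def A(x):
    return x ** 2

def B(x):
    return x + 1

def C(x):
    return 2 * x

def composite(x):
    # Computational graph: x -> A -> a -> B -> b -> C -> y
    a = A(x)
    b = B(a)
    y = C(b)
    return y

print(composite(3))  # 2 * (3**2 + 1) = 20
```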

3. Differentiation

Numerical Differentiation

  • Derivative: the rate of change (over an infinitesimally short interval)
  • Forward difference
  • Centered difference
    • Reduces the approximation error (at least somewhat) compared to the forward difference
  • Numerical differentiation: computing a function's rate of change using approximate values
    • Example: evaluate f'(x) ≈ (f(x + h) − f(x − h)) / 2h by substituting a small value for h (e.g. h = 1e-4)
    • Advantage: easy to implement
    • Disadvantage: high computational cost; accuracy problems (loss of significant digits)
  • Gradient checking: using the results of numerical differentiation to verify that backpropagation is implemented correctly
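A minimal centered-difference sketch of the idea above (the helper name `numerical_diff` and its exact signature are my own assumptions):

```python
def numerical_diff(f, x, eps=1e-4):
    # Centered difference: (f(x + h) - f(x - h)) / (2h)
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def f(x):
    return x ** 2

print(numerical_diff(f, 2.0))  # ~4.0; the exact derivative 2x at x = 2 is 4
```

A gradient check simply compares this approximation against the derivative produced by backpropagation and asserts they agree within a small tolerance.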

Backpropagation

  • Backpropagation: an algorithm that computes the derivative with respect to each variable
  • Chain rule: the derivative of a composite function equals the product of the derivatives of its constituent functions.
    • When y = f(u) and u = g(x): dy/dx = (dy/du)(du/dx)
  • Used to compute the derivative of the loss function with respect to each parameter
  • Correspondence between forward and backward propagation
    • Ordinary values, ordinary computation (forward propagation)
    • Derivative values, the computation that produces them (backward propagation)
    • Computing a derivative requires values from the forward pass (e.g., the derivative of y = x² is 2x, which needs the input value x).
    • To implement backpropagation, therefore, run the forward pass first and have each function remember the values of its input variables.
  • Automating backpropagation
    • Define-by-Run
    • Wengert list (or tape)
    • Dynamic computational graph
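For the composition used in the implementation below, y = (e^(x²))², the chain rule unrolls as follows (a worked derivation added here; the intermediate names a and b are my own):

```latex
a = x^2, \qquad b = e^{a}, \qquad y = b^2
\frac{dy}{dx}
  = \frac{dy}{db}\,\frac{db}{da}\,\frac{da}{dx}
  = 2b \cdot e^{a} \cdot 2x
  = 4x\,e^{2x^2}
\left.\frac{dy}{dx}\right|_{x=0.5} = 2\sqrt{e} \approx 3.2974
```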

4. Implementation

```python
import numpy as np

class Variable:
    def __init__(self, data):
        if data is not None:
            if not isinstance(data, np.ndarray):
                raise TypeError('{} is not supported'.format(type(data)))

        self.data = data     # ordinary value
        self.grad = None     # derivative value
        self.creator = None  # link to the function that created it (the key idea!)

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:
            self.grad = np.ones_like(self.data)  # dy/dy = 1

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            x, y = f.input, f.output
            x.grad = f.backward(y.grad)
            if x.creator is not None:
                funcs.append(x.creator)

def as_array(x):
    '''NumPy operations can turn 0-dimensional ndarrays into scalar types,
    so the result's data type must be adjusted back to ndarray.'''
    if np.isscalar(x):
        return np.array(x)
    return x

class Function:
    def __call__(self, input):
        x = input.data
        y = self.forward(x)
        output = Variable(as_array(y))
        output.set_creator(self)  # record the link to this function
        self.input = input        # remember the input variable
        self.output = output      # remember the output variable
        return output

    def forward(self, x):
        raise NotImplementedError()

    def backward(self, gy):
        raise NotImplementedError()

# ================ Example ================================
# Define example computation classes
class Square(Function):
    def forward(self, x):
        y = x ** 2
        return y

    def backward(self, gy):
        x = self.input.data
        gx = 2 * x * gy
        return gx

class Exp(Function):
    def forward(self, x):
        y = np.exp(x)
        return y

    def backward(self, gy):
        x = self.input.data
        gx = np.exp(x) * gy
        return gx

# Wrap the classes as plain Python functions
def square(x):
    return Square()(x)

def exp(x):
    return Exp()(x)

# Run the example
x = Variable(np.array(0.5))
y = square(exp(square(x)))  # composite function
y.backward()  # backward is called only on the final variable

print(x.grad)  # 3.297442541400256
```
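That result can be sanity-checked with the gradient-checking idea from section 3: approximate the same derivative by centered difference and compare. This standalone sketch (my own addition) re-expresses the composition square(exp(square(x))) in plain NumPy:

```python
import numpy as np

def numerical_diff(f, x, eps=1e-4):
    # Centered difference approximation
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def composite(x):
    # Same composition as the example above: square(exp(square(x)))
    return np.exp(x ** 2) ** 2

print(numerical_diff(composite, 0.5))  # ~3.2974, matching x.grad from backprop
```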

5. Tests

  • test.py
  • How to run
    • Add unittest.main() as the last line of test.py, then run:
      python test.py
    • Or run test.py through the unittest module:
      python -m unittest test.py
    • Or discover and run every tests/test*.py:
      python -m unittest discover tests
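A minimal sketch of what test.py might contain. In a real project you would import DeZero's square from your module; here a plain stand-in is defined so the sketch is self-contained:

```python
import unittest
import numpy as np

def square(x):
    # Stand-in for DeZero's square(); import the real one in practice.
    return x ** 2

def numerical_diff(f, x, eps=1e-4):
    # Centered difference, used here for gradient checking
    return (f(x + eps) - f(x - eps)) / (2 * eps)

class SquareTest(unittest.TestCase):
    def test_forward(self):
        self.assertEqual(square(np.array(2.0)), np.array(4.0))

    def test_gradient_check(self):
        x = np.array(3.0)
        analytic = 2 * x  # d/dx x^2 = 2x
        self.assertTrue(np.allclose(analytic, numerical_diff(square, x)))
```

Run it with `python -m unittest test.py`, or append `unittest.main()` as the last line and run `python test.py` directly, as described above.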