
# PyTorch: Defining new autograd functions
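Under the hood, each primitive autograd operator is really two functions that operate on Tensors.
The `forward` function computes output Tensors from input Tensors.
The `backward` function receives the gradient of some scalar value (typically the loss) with respect to the output Tensors, and computes the gradient of that same scalar value with respect to the input Tensors.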

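In PyTorch, we can easily define our own `autograd` operator by defining a subclass of `torch.autograd.Function` and implementing the `forward` and `backward` functions as static methods.
We can then use the new autograd operator by calling the class's `apply` method like a function, passing in Tensors that contain the input data.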
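
To make the `forward`/`backward` contract concrete before the ReLU example, here is a minimal sketch. The `Square` operator is a hypothetical illustration, not part of the original notebook; it computes `y = x * x` and applies the chain rule by hand in `backward`.

```python
import torch

class Square(torch.autograd.Function):
  """Hypothetical example operator: y = x * x."""

  @staticmethod
  def forward(ctx, x):
    # Save the input; backward needs it to compute dy/dx = 2 * x.
    ctx.save_for_backward(x)
    return x * x

  @staticmethod
  def backward(ctx, grad_output):
    x, = ctx.saved_tensors
    # Chain rule: dL/dx = dL/dy * dy/dx = grad_output * 2 * x
    return grad_output * 2 * x

x = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
y = Square.apply(x)  # forward pass through the custom operator
loss = y.sum()       # scalar value to differentiate
loss.backward()      # autograd calls Square.backward for us
print(x.grad)        # gradient of loss w.r.t. x, i.e. 2 * x
```

Note that the operator is invoked through `Square.apply` rather than by instantiating the class, and that `backward` returns one gradient per input of `forward`.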

In this example, we define our own custom autograd function that performs the ReLU nonlinearity, and use it to implement a two-layer network.





import torch

class MyReLU(torch.autograd.Function):
  """
  We can implement our custom autograd Functions by subclassing
  torch.autograd.Function and implementing the forward and backward passes
  which operate on Tensors.
  """

  @staticmethod
  def forward(ctx, input):
    """
    In the forward pass we receive a Tensor containing the input and return
    a Tensor containing the output. ctx is a context object that can be used
    to stash information for backward computation. You can cache Tensors for
    use in the backward pass using the ctx.save_for_backward method; other
    objects can be stored as attributes on ctx.
    """
    ctx.save_for_backward(input)
    return input.clamp(min=0)

  @staticmethod
  def backward(ctx, grad_output):
    """
    In the backward pass we receive a Tensor containing the gradient of the loss
    with respect to the output, and we need to compute the gradient of the loss
    with respect to the input.
    """
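    input, = ctx.saved_tensors
    # ReLU passes the gradient through unchanged where input >= 0
    # and blocks it (sets it to zero) where input < 0.
    grad_input = grad_output.clone()
    grad_input[input < 0] = 0
    return grad_input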

dtype = torch.float
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold input and outputs.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Create random Tensors for weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)


learning_rate = 1e-6

# To apply our Function, we use the Function.apply method. We alias this as 'relu'.
relu = MyReLU.apply

for t in range(500):

  # Forward pass: compute predicted y using operations; we compute
  # ReLU using our custom autograd operation.
  y_pred = relu(x.mm(w1)).mm(w2)

  # Compute and print loss
  loss = (y_pred - y).pow(2).sum()
  if t % 100 == 99:
    print(t, loss.item())

  # Use autograd to compute the backward pass.
  loss.backward()

  # Update weights using gradient descent
  with torch.no_grad():
    w1 -= learning_rate * w1.grad
    w2 -= learning_rate * w2.grad

    # Manually zero the gradients after updating weights
    w1.grad.zero_()
    w2.grad.zero_()

99 349.2152099609375
199 1.0389724969863892
299 0.004940866492688656
399 0.00012938988220412284
499 2.5338729756185785e-05
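
A falling loss suggests the hand-written `backward` is right, but it does not prove it. One way to check a custom Function more directly is `torch.autograd.gradcheck`, which compares the analytical gradient against finite differences. The sketch below is not part of the original notebook. It repeats the `MyReLU` definition so that it runs on its own, and it assumes double-precision inputs kept away from 0, where ReLU is not differentiable.

```python
import torch
from torch.autograd import gradcheck

class MyReLU(torch.autograd.Function):
  @staticmethod
  def forward(ctx, input):
    ctx.save_for_backward(input)
    return input.clamp(min=0)

  @staticmethod
  def backward(ctx, grad_output):
    input, = ctx.saved_tensors
    grad_input = grad_output.clone()
    grad_input[input < 0] = 0
    return grad_input

# gradcheck uses finite differences, so it needs double precision.
# 20 evenly spaced points in [-2, 2] never land on 0, which keeps the
# test inputs away from the kink of ReLU.
x = torch.linspace(-2, 2, 20, dtype=torch.double).requires_grad_()

# Returns True when analytical and numerical gradients agree;
# raises an error otherwise.
ok = gradcheck(MyReLU.apply, (x,), eps=1e-6, atol=1e-4)
print(ok)
```

If `backward` had the wrong sign or mask, `gradcheck` would raise an error describing the mismatch instead of returning.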
