# Tensor 

## Tensor in PyTorch

NumPy cannot use GPUs to accelerate its numerical computations, which are widely used in modern deep neural networks. GPUs can often speed up training by 50x or more, so NumPy alone is not enough for deep learning.

A PyTorch Tensor is conceptually identical to a NumPy array. Unlike a NumPy array, a PyTorch Tensor can use GPUs to accelerate numeric computations.
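
As a minimal sketch of that relationship, the snippet below converts a NumPy array to a Tensor, moves it to a GPU if PyTorch can see one, and brings the result back to NumPy. The variable names (`a_np`, `a_t`, `b_t`, `c_t`, `c_np`) and the CPU fallback are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import torch

# Assumption for this sketch: use the first GPU if available, else the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# A small NumPy array with an explicit float32 dtype.
a_np = np.arange(6, dtype=np.float32).reshape(2, 3)

# from_numpy wraps the array as a CPU Tensor without copying the data.
a_t = torch.from_numpy(a_np)

# .to(device) returns a Tensor on the chosen device (a copy if it is a GPU).
b_t = a_t.to(device)

# The matrix product runs on whichever device holds b_t.
c_t = b_t.mm(b_t.t())

# Move the result back to the CPU before converting it to NumPy.
c_np = c_t.cpu().numpy()
print(c_np.shape, c_t.device)
```

The same Tensor code runs on either device; only the `device` object changes.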

Here we use PyTorch Tensors to fit a two-layer network to random data:

In [None]:
# -*- coding: utf-8 -*-

In [None]:
import torch

In [None]:
dtype = torch.float

# device = torch.device("cpu") # Uncomment this to run on CPU
device = torch.device("cuda:0") # Runs on GPU; comment this out to run on CPU


In [None]:
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension;

N, D_in, H, D_out = 64, 1000, 100, 10

In [None]:
# Create random input and output data

x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

In [None]:
# Randomly init weights

w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)

In [None]:
# init learning rate

learning_rate = 1e-6

In [None]:
for t in range(500):
    
    # Forward pass: compute predicted y
    
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    
    # Compute and print loss
    
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)
        
    # Backprop to compute gradients of loss with respect to w1 and w2
    
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)
    
    # Update weights using gradient descent
    
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
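
The hand-written backward pass is easy to get wrong, so it can help to check it once. The sketch below is not part of the original notebook: it assumes PyTorch's autograd (`requires_grad=True` and `loss.backward()`) as a reference, uses small sizes and double precision chosen only for this check, and compares the manual gradients with the ones autograd computes for a single forward pass on the CPU.

```python
import torch

torch.manual_seed(0)

# Small sizes picked only for this check, not the sizes used in training.
N, D_in, H, D_out = 4, 6, 5, 3
dtype = torch.double

x = torch.randn(N, D_in, dtype=dtype)
y = torch.randn(N, D_out, dtype=dtype)

# requires_grad=True asks autograd to track operations on the weights.
w1 = torch.randn(D_in, H, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, dtype=dtype, requires_grad=True)

# Forward pass, identical to the training loop.
h = x.mm(w1)
h_relu = h.clamp(min=0)
y_pred = h_relu.mm(w2)
loss = (y_pred - y).pow(2).sum()

# Reference gradients from autograd, stored in w1.grad and w2.grad.
loss.backward()

# Manual gradients, computed with the same formulas as the training loop.
with torch.no_grad():
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

w1_match = torch.allclose(grad_w1, w1.grad)
w2_match = torch.allclose(grad_w2, w2.grad)
print("w1 gradients match:", w1_match)
print("w2 gradients match:", w2_match)
```

If either comparison fails after the backward pass is edited, the manual formulas no longer match the forward pass.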

-- by HanaRo, 2020/09/07