Lecture 1: Basic Knowledge of PyTorch

Zhang Huaizheng
huaizhen001@e.ntu.edu.sg
http://zhanghuaizheng.info/PyTorch4DL

Comparison: NumPy and PyTorch

In [2]:
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])
y = np.array([[2, 2, 2],
              [2, 2, 2]])
z = np.array([[3, 3, 3],
              [3, 3, 3]])

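# Forward pass: c = sum(x * y + z)
a = x * y
b = a + z
c = np.sum(b)

# Backward pass: apply the chain rule by hand, from c back to the inputs
grad_c = 1.0
grad_b = grad_c * np.ones(b.shape)  # c = sum(b), so every element of b gets gradient 1
grad_a = grad_b.copy()              # b = a + z, so db/da = 1
grad_z = grad_b.copy()              # b = a + z, so db/dz = 1
grad_x = grad_a * y                 # a = x * y, so da/dx = y
grad_y = grad_a * x                 # a = x * y, so da/dy = x

print(grad_x)
print(grad_y)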

[[ 2.  2.  2.]
 [ 2.  2.  2.]]
[[ 1.  2.  3.]
 [ 4.  5.  6.]]
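
The hand-derived gradients above can be sanity-checked numerically. The sketch below is not part of the original lecture: the helper name `numerical_grad` and the step size `eps` are assumptions made for illustration. It perturbs one element at a time and compares central differences against the analytic results (grad_x = y, grad_y = x, grad_z = 1).

```python
import numpy as np

x = np.array([[1., 2., 3.],
              [4., 5., 6.]])
y = np.full((2, 3), 2.0)
z = np.full((2, 3), 3.0)

def forward(x, y, z):
    # Same computation as the cell above: c = sum(x * y + z)
    return np.sum(x * y + z)

def numerical_grad(f, w, eps=1e-6):
    # Hypothetical helper: central-difference estimate of df/dw.
    # w is modified in place and restored after each probe.
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        orig = w[idx]
        w[idx] = orig + eps
        f_plus = f()
        w[idx] = orig - eps
        f_minus = f()
        w[idx] = orig
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

num_grad_x = numerical_grad(lambda: forward(x, y, z), x)
num_grad_y = numerical_grad(lambda: forward(x, y, z), y)
num_grad_z = numerical_grad(lambda: forward(x, y, z), z)

# Compare against the analytic gradients derived by hand
print(np.allclose(num_grad_x, y, atol=1e-4))
print(np.allclose(num_grad_y, x, atol=1e-4))
print(np.allclose(num_grad_z, np.ones_like(z), atol=1e-4))
```

A check like this is slow (one forward pass per element per direction), so it is useful for verifying a derivation on small inputs rather than for training.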


In [1]:
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([[1, 2, 3],
                           [4, 5, 6]]), requires_grad=True)
y = Variable(torch.Tensor([[2, 2, 2],
                           [2, 2, 2]]), requires_grad=True)
z = Variable(torch.Tensor([[3, 3, 3],
                           [3, 3, 3]]), requires_grad=True)

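# Forward pass: identical expressions to the NumPy version
a = x * y
b = a + z
c = torch.sum(b)

# Autograd walks the recorded graph and fills in .grad for x, y and z
c.backward()

print(x.grad.data)
print(y.grad.data)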


 2  2  2
 2  2  2
[torch.FloatTensor of size 2x3]


 1  2  3
 4  5  6
[torch.FloatTensor of size 2x3]
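
The cell above uses the `Variable` wrapper from early PyTorch releases. Since PyTorch 0.4, `Variable` and `Tensor` have been merged, so the same computation can be written on plain tensors. The following is a minimal sketch assuming a PyTorch version of 0.4 or later; it is not the lecture's original code.

```python
import torch

# requires_grad=True asks autograd to track operations on these tensors
x = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]], requires_grad=True)
y = torch.full((2, 3), 2.0, requires_grad=True)
z = torch.full((2, 3), 3.0, requires_grad=True)

# Same forward pass as above: c = sum(x * y + z)
c = (x * y + z).sum()

# Populate x.grad, y.grad and z.grad
c.backward()

print(x.grad)  # equals the values of y
print(y.grad)  # equals the values of x
print(z.grad)  # all ones
```

The gradients match the hand-derived NumPy results: dc/dx = y, dc/dy = x and dc/dz = 1.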



Comparison: TensorFlow and PyTorch

In [3]:
import tensorflow as tf
import numpy as np

N, D_in, H, D_out = 64, 1000, 100, 10

x = tf.placeholder(tf.float32, shape=(None, D_in))
y = tf.placeholder(tf.float32, shape=(None, D_out))

w1 = tf.Variable(tf.random_normal((D_in, H)))
w2 = tf.Variable(tf.random_normal((H, D_out)))

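# Forward pass: two-layer network with a ReLU in between
h = tf.matmul(x, w1)
h_relu = tf.maximum(h, tf.zeros(1))
y_pred = tf.matmul(h_relu, w2)

# Sum of squared errors
loss = tf.reduce_sum((y - y_pred) ** 2.0)

# Add gradient nodes for the loss with respect to both weight matrices
grad_w1, grad_w2 = tf.gradients(loss, [w1, w2])

# Gradient-descent updates; the assign ops only execute when run in a session
learning_rate = 1e-6
new_w1 = w1.assign(w1 - learning_rate * grad_w1)
new_w2 = w2.assign(w2 - learning_rate * grad_w2)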


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    x_value = np.random.randn(N, D_in)
    y_value = np.random.randn(N, D_out)
    
    for i in range(500):
        
        # Running new_w1 and new_w2 applies the weight updates; their returned values are not needed
        loss_value, _, _ = sess.run([loss, new_w1, new_w2],
                                    feed_dict={x: x_value, y: y_value})
        print("Iteration " + str(i) + "th: the loss is " + str(loss_value))

Iteration 0th: the loss is 2.6601e+07
Iteration 1th: the loss is 2.04535e+07
Iteration 2th: the loss is 1.81108e+07
Iteration 3th: the loss is 1.69335e+07
Iteration 4th: the loss is 1.56103e+07
Iteration 5th: the loss is 1.36054e+07
Iteration 6th: the loss is 1.10008e+07
Iteration 7th: the loss is 8.25942e+06
Iteration 8th: the loss is 5.86117e+06
Iteration 9th: the loss is 4.0136e+06
Iteration 10th: the loss is 2.71823e+06
Iteration 11th: the loss is 1.85464e+06
Iteration 12th: the loss is 1.29425e+06
Iteration 13th: the loss is 930967.0
Iteration 14th: the loss is 693112.0
Iteration 15th: the loss is 533728.0
Iteration 16th: the loss is 423590.0
Iteration 17th: the loss is 344818.0
Iteration 18th: the loss is 286715.0
Iteration 19th: the loss is 242350.0
Iteration 20th: the loss is 207456.0
Iteration 21th: the loss is 179340.0
Iteration 22th: the loss is 156249.0
Iteration 23th: the loss is 136985.0
Iteration 24th: the loss is 120713.0
Iteration 25th: the loss is 106843.0
...

Iteration 328th: the loss is 0.00568513
Iteration 329th: the loss is 0.0054427
Iteration 330th: the loss is 0.0052148
Iteration 331th: the loss is 0.00499167
Iteration 332th: the loss is 0.00478122
Iteration 333th: the loss is 0.00457983
Iteration 334th: the loss is 0.00438425
Iteration 335th: the loss is 0.00420082
Iteration 336th: the loss is 0.00402088
Iteration 337th: the loss is 0.00385536
Iteration 338th: the loss is 0.00369365
Iteration 339th: the loss is 0.003541
Iteration 340th: the loss is 0.00339648
Iteration 341th: the loss is 0.00325588
Iteration 342th: the loss is 0.00312562
Iteration 343th: the loss is 0.00299736
Iteration 344th: the loss is 0.00287562
Iteration 345th: the loss is 0.00275953
Iteration 346th: the loss is 0.00264764
Iteration 347th: the loss is 0.00254055
Iteration 348th: the loss is 0.00244003
Iteration 349th: the loss is 0.00234414
Iteration 350th: the loss is 0.00225141
Iteration 351th: the loss is 0.00216482
Iteration 352th: the loss is 0.00207677
...
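
To see what `tf.gradients` computes for this network, the same two-layer model can be trained in NumPy with the backward pass written out by hand, in the style of the first comparison. This is an illustrative sketch rather than the lecture's code: the seeded generator `rng` and the `losses` list are assumptions added so the run is reproducible and inspectable.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
N, D_in, H, D_out = 64, 1000, 100, 10

x = rng.standard_normal((N, D_in))
y = rng.standard_normal((N, D_out))
w1 = rng.standard_normal((D_in, H))
w2 = rng.standard_normal((H, D_out))

learning_rate = 1e-6
losses = []
for i in range(500):
    # Forward pass
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)
    loss = np.square(y_pred - y).sum()
    losses.append(loss)

    # Backward pass: chain rule applied by hand
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu * (h > 0)  # ReLU passes gradient only where h > 0
    grad_w1 = x.T.dot(grad_h)

    # Gradient-descent update
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

print("first loss:", losses[0])
print("last loss:", losses[-1])
```

In the TensorFlow version the five gradient lines are replaced by a single `tf.gradients` call, which derives the equivalent operations from the graph.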

In [6]:
import torch
from torch.autograd import Variable

N, D_in, H, D_out = 64, 1000, 100, 10

x = Variable(torch.randn(N, D_in), requires_grad=False)
y = Variable(torch.randn(N, D_out), requires_grad=False)

w1 = Variable(torch.randn(D_in, H), requires_grad=True)
w2 = Variable(torch.randn(H, D_out), requires_grad=True)

learning_rate = 1e-6

for i in range(500):
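    # Forward pass
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    
    # Sum of squared errors, then backpropagate to populate w1.grad and w2.grad
    loss = (y_pred - y).pow(2).sum()
    loss.backward()

    # Update the weights through .data so the update itself is not tracked by autograd
    w1.data -= learning_rate * w1.grad.data
    w2.data -= learning_rate * w2.grad.data

    # Gradients accumulate across backward() calls, so reset them every iteration
    w1.grad.data.zero_()
    w2.grad.data.zero_()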
    
    print("Iteration " + str(i) + "th: the loss is " + str(loss))

Iteration 0th: the loss is Variable containing:
 3.0351e+07
[torch.FloatTensor of size 1]

Iteration 1th: the loss is Variable containing:
 2.3290e+07
[torch.FloatTensor of size 1]

Iteration 2th: the loss is Variable containing:
 2.0736e+07
[torch.FloatTensor of size 1]

Iteration 3th: the loss is Variable containing:
 1.9158e+07
[torch.FloatTensor of size 1]

Iteration 4th: the loss is Variable containing:
 1.7107e+07
[torch.FloatTensor of size 1]

Iteration 5th: the loss is Variable containing:
 1.4143e+07
[torch.FloatTensor of size 1]

Iteration 6th: the loss is Variable containing:
 1.0756e+07
[torch.FloatTensor of size 1]

Iteration 7th: the loss is Variable containing:
 7.5778e+06
[torch.FloatTensor of size 1]

Iteration 8th: the loss is Variable containing:
 5.1011e+06
[torch.FloatTensor of size 1]

Iteration 9th: the loss is Variable containing:
1.00000e+06 *
  3.3720
[torch.FloatTensor of size 1]

Iteration 10th: the loss is Variable containing:
1.00000e+06 *
  2.2544
[torch.FloatTensor of size 1]

...

Iteration 122th: the loss is Variable containing:
 53.1746
[torch.FloatTensor of size 1]

Iteration 123th: the loss is Variable containing:
 50.0298
[torch.FloatTensor of size 1]

Iteration 124th: the loss is Variable containing:
 47.0818
[torch.FloatTensor of size 1]

Iteration 125th: the loss is Variable containing:
 44.3189
[torch.FloatTensor of size 1]

Iteration 126th: the loss is Variable containing:
 41.7269
[torch.FloatTensor of size 1]

Iteration 127th: the loss is Variable containing:
 39.2963
[torch.FloatTensor of size 1]

Iteration 128th: the loss is Variable containing:
 37.0145
[torch.FloatTensor of size 1]

Iteration 129th: the loss is Variable containing:
 34.8742
[torch.FloatTensor of size 1]

Iteration 130th: the loss is Variable containing:
 32.8632
[torch.FloatTensor of size 1]

Iteration 131th: the loss is Variable containing:
 30.9750
[torch.FloatTensor of size 1]

Iteration 132th: the loss is Variable containing:
 29.2015
[torch.FloatTensor of size 1]

...

Iteration 251th: the loss is Variable containing:
1.00000e-02 *
  6.1374
[torch.FloatTensor of size 1]

Iteration 252th: the loss is Variable containing:
1.00000e-02 *
  5.8501
[torch.FloatTensor of size 1]

Iteration 253th: the loss is Variable containing:
1.00000e-02 *
  5.5734
[torch.FloatTensor of size 1]

Iteration 254th: the loss is Variable containing:
1.00000e-02 *
  5.3146
[torch.FloatTensor of size 1]

Iteration 255th: the loss is Variable containing:
1.00000e-02 *
  5.0639
[torch.FloatTensor of size 1]

Iteration 256th: the loss is Variable containing:
1.00000e-02 *
  4.8271
[torch.FloatTensor of size 1]

Iteration 257th: the loss is Variable containing:
1.00000e-02 *
  4.5994
[torch.FloatTensor of size 1]

Iteration 258th: the loss is Variable containing:
1.00000e-02 *
  4.3853
[torch.FloatTensor of size 1]

Iteration 259th: the loss is Variable containing:
1.00000e-02 *
  4.1792
[torch.FloatTensor of size 1]

Iteration 260th: the loss is Variable containing:
1.00000e-02 *


Iteration 394th: the loss is Variable containing:
1.00000e-04 *
  2.1739
[torch.FloatTensor of size 1]

Iteration 395th: the loss is Variable containing:
1.00000e-04 *
  2.1249
[torch.FloatTensor of size 1]

Iteration 396th: the loss is Variable containing:
1.00000e-04 *
  2.0693
[torch.FloatTensor of size 1]

Iteration 397th: the loss is Variable containing:
1.00000e-04 *
  2.0214
[torch.FloatTensor of size 1]

Iteration 398th: the loss is Variable containing:
1.00000e-04 *
  1.9747
[torch.FloatTensor of size 1]

Iteration 399th: the loss is Variable containing:
1.00000e-04 *
  1.9237
[torch.FloatTensor of size 1]

Iteration 400th: the loss is Variable containing:
1.00000e-04 *
  1.8857
[torch.FloatTensor of size 1]

Iteration 401th: the loss is Variable containing:
1.00000e-04 *
  1.8430
[torch.FloatTensor of size 1]

Iteration 402th: the loss is Variable containing:
1.00000e-04 *
  1.7994
[torch.FloatTensor of size 1]

Iteration 403th: the loss is Variable containing:
1.00000e-04 *
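
The verbose "Variable containing:" output above comes from converting the loss `Variable` to a string. On PyTorch 0.4 or later, the same training loop can be written on plain tensors, with `loss.item()` giving a Python float and `torch.no_grad()` replacing the `.data` updates. The sketch below assumes such a version; the seed and the `losses` list are additions for illustration, not part of the lecture.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for reproducibility
N, D_in, H, D_out = 64, 1000, 100, 10

x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

learning_rate = 1e-6
losses = []
for i in range(500):
    # Forward pass and sum of squared errors
    y_pred = x.mm(w1).clamp(min=0).mm(w2)
    loss = (y_pred - y).pow(2).sum()

    # Populate w1.grad and w2.grad
    loss.backward()

    # Update weights without recording the update in the autograd graph
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        w1.grad.zero_()
        w2.grad.zero_()

    losses.append(loss.item())  # .item() extracts a plain Python float
    if i % 100 == 0:
        print("Iteration %d: loss = %.4g" % (i, losses[-1]))
```

Printing a float keeps each iteration on a single line, which makes the loss curve much easier to scan than the multi-line tensor repr.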
