# 01 Introduction to PyTorch

### Date: Jan 8 2018
### Author: Farahana

Sources:
    1. https://github.com/jcjohnson/pytorch-examples
    2. https://github.com/yunjey/pytorch-tutorial/tree/master/tutorials 

In [1]:
# Initialization
import torch as tc
import numpy as np

### Tensors
The array type in NumPy is np.ndarray, while in PyTorch it is tc.Tensor. To create an array of uniform random numbers in NumPy, use np.random.rand; the PyTorch counterpart is tc.rand:

In [2]:
x1 = np.random.rand(5,3)
print(x1)

[[ 0.10524064  0.55441554  0.63890575]
 [ 0.31140732  0.92061839  0.09061959]
 [ 0.52697401  0.84787463  0.4340737 ]
 [ 0.81852176  0.73008139  0.98265122]
 [ 0.84791687  0.51823401  0.25201279]]


In [3]:
x2 = tc.rand(5,3)  # tc.rand() samples from a uniform distribution on [0, 1)
print(x2)


 0.7258  0.5079  0.5913
 0.2510  0.9606  0.9861
 0.4332  0.7519  0.6741
 0.9888  0.6890  0.5213
 0.4498  0.7290  0.2245
[torch.FloatTensor of size 5x3]
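
Since the two array types are so similar, it can help to see how one converts into the other. The following is a minimal sketch, not part of the original notebook; the variable names a, t and b are chosen only for illustration. It relies on tc.from_numpy() and Tensor.numpy(), which share memory with the source array on the CPU.

```python
import numpy as np
import torch as tc

a = np.random.rand(5, 3)   # np.ndarray with dtype float64
t = tc.from_numpy(a)       # torch DoubleTensor that shares memory with a
b = t.numpy()              # back to np.ndarray, still the same memory

# Because the memory is shared, a change to the array is visible in the tensor.
a[0, 0] = 42.0
print(t.type())            # torch.DoubleTensor
print(float(t[0, 0]))      # 42.0
```

Note that np.random.rand produces float64 values, so the converted tensor is a DoubleTensor rather than the FloatTensor that tc.rand returns.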



Unlike a NumPy array, a Torch tensor can also run its computations on the GPU. To do so, simply change the tensor type:

In [4]:
dtype = tc.cuda.FloatTensor
# dtype = tc.FloatTensor

In [5]:
x = tc.randn(5,3).type(dtype)  # tc.randn() samples from a standard normal distribution
print(x)


 0.1063 -0.2758  0.9599
 1.1637 -1.0765  0.0800
 1.2271 -0.4760  0.1590
-1.9362 -0.0719 -0.0749
-0.8377 -1.5127 -1.7474
[torch.cuda.FloatTensor of size 5x3 (GPU 0)]



Alternatively, use a condition so that the tensors move to the GPU only when one is available:

In [6]:
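x = tc.rand(5,3)
y = tc.rand(5,3)

# Move both tensors to the GPU only when CUDA is available
if tc.cuda.is_available():
    x = x.cuda()
    y = y.cuda()

# Add outside the condition so that total is also defined on a CPU-only machine
total = x + y
print(total)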


 1.1888  0.4248  1.0202
 0.6308  1.1899  1.6353
 1.1508  0.7552  0.5693
 1.2280  0.9363  1.0905
 1.1289  1.4109  1.5023
[torch.cuda.FloatTensor of size 5x3 (GPU 0)]
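
The same conditional idea can be written with a device object. This is only a sketch and assumes PyTorch 0.4 or later, where tc.device and the device= keyword exist; the notebook itself was run on an older release, so treat it as an alternative rather than the method used above.

```python
import torch as tc

# Pick the GPU when CUDA is available, otherwise fall back to the CPU.
device = tc.device("cuda" if tc.cuda.is_available() else "cpu")

# Create the tensors directly on the chosen device.
x = tc.rand(5, 3, device=device)
y = tc.rand(5, 3, device=device)

total = x + y              # runs on whichever device was selected
print(total.device.type)   # "cuda" or "cpu", depending on the machine
```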



Torch provides many other tensor types, such as DoubleTensor and LongTensor. Only FloatTensor, and how to use it for GPU computation, is shown here.
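
As a brief illustration of other tensor types, the sketch below converts a FloatTensor on the CPU with the .double() and .long() methods and inspects the result with .type(). The variable names are illustrative and do not come from the original notebook.

```python
import torch as tc

x_float = tc.rand(2, 3)            # default type: torch.FloatTensor (32-bit float)
x_double = x_float.double()        # 64-bit float
x_long = (x_float * 10).long()     # 64-bit integer; the fractional part is dropped

print(x_float.type())
print(x_double.type())
print(x_long.type())
```

Each conversion returns a new tensor and leaves x_float unchanged.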

---------------------------------------------

### Variable
A Variable is a thin wrapper around a Torch Tensor. It records the operations applied to the Tensor so that gradients can be computed automatically.

In [7]:
from torch.autograd import Variable
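
To make the idea of recorded operations concrete, here is a minimal sketch, not taken from the original notebook: a Variable is squared and summed, and calling backward() fills in its gradient. The names x and y are illustrative and are reassigned in the cells that follow.

```python
import torch as tc
from torch.autograd import Variable

# A 2x2 Variable filled with 3s; requires_grad=True asks autograd to track it.
x = Variable(tc.ones(2, 2) * 3, requires_grad=True)

# The multiplication and the sum are recorded as they are applied.
y = (x * x).sum()

# Walk back through the recorded operations: d(sum(x^2))/dx = 2x.
y.backward()

grad = x.grad.data
print(grad)   # every entry equals 2 * 3 = 6
```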

Let us create input and output variables.

In [8]:
N, D_in, H, D_out = 24, 1000, 100, 4

In [9]:
x = Variable(tc.randn(N, D_in).type(dtype), requires_grad=False) # input
y = Variable(tc.randn(N, D_out).type(dtype), requires_grad=False) # output

Next, build a simple loss function and use its gradients to learn the weights that produce y_pred.

In [10]:
w1 = Variable(tc.randn(D_in, H).type(dtype), requires_grad=True) # weights_1
w2 = Variable(tc.randn(H, D_out).type(dtype), requires_grad=True) # weights_2

In [11]:
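learning_rate = 1e-6
for t in range(200):
    # Forward pass: two linear layers with a ReLU (clamp at 0) in between
    y_pred = x.mm(w1).clamp(min=0).mm(w2)
    # Loss: sum of squared errors
    loss = (y_pred - y).pow(2).sum()
    print(t, loss.data[0])  # on PyTorch 0.4 and later, use loss.item() instead

    # Backward pass: compute the gradients of the loss with respect to w1 and w2
    loss.backward()

    # Gradient descent step on the weights
    w1.data -= learning_rate * w1.grad.data
    w2.data -= learning_rate * w2.grad.data

    # Reset the gradients, because backward() accumulates them
    w1.grad.data.zero_()
    w2.grad.data.zero_()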

0 4394089.5
1 1786397.25
2 1225609.125
3 863376.0625
4 619790.75
5 451727.875
6 333100.59375
7 248202.625
8 186722.3125
9 141566.671875
10 108027.453125
11 82892.1640625
12 63915.59765625
13 49500.7265625
14 38515.0703125
15 30122.875
16 23647.798828125
17 18619.017578125
18 14697.9609375
19 11630.669921875
20 9224.2080078125
21 7332.38037109375
22 5844.6279296875
23 4676.1005859375
24 3748.11083984375
25 3010.537841796875
26 2421.616943359375
27 1951.6900634765625
28 1575.3828125
29 1273.6552734375
30 1030.9921875
31 835.557861328125
32 677.9508666992188
33 550.6835327148438
34 447.78271484375
35 364.4840393066406
36 296.96429443359375
37 242.5496826171875
38 198.32374572753906
39 162.31854248046875
40 132.96420288085938
41 109.01695251464844
42 89.44876098632812
43 73.45408630371094
44 60.36873245239258
45 49.64921951293945
46 40.86111831665039
47 33.65228271484375
48 27.733482360839844
49 22.86996078491211
50 18.871492385864258
51 15.579492568969727
52 12.87059497833252
53 10.636991
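
To see what loss.backward() computes in the loop above, the gradients can be derived by hand and compared with the ones autograd produces. The following is a sketch under stated assumptions: it uses small, hypothetical sizes on the CPU, float64 for a tight comparison, and the PyTorch 0.4 or later API (requires_grad= on tc.randn, .detach(), tc.allclose), so it is not the code the notebook ran.

```python
import torch as tc

tc.manual_seed(0)
N, D_in, H, D_out = 4, 6, 5, 3   # small illustrative sizes

x = tc.randn(N, D_in, dtype=tc.float64)
y = tc.randn(N, D_out, dtype=tc.float64)
w1 = tc.randn(D_in, H, dtype=tc.float64, requires_grad=True)
w2 = tc.randn(H, D_out, dtype=tc.float64, requires_grad=True)

# Forward pass, with the intermediate results kept for the manual gradients.
h = x.mm(w1)
h_relu = h.clamp(min=0)
y_pred = h_relu.mm(w2)
loss = (y_pred - y).pow(2).sum()

# Gradients from autograd.
loss.backward()

# Gradients by hand, applying the chain rule layer by layer.
grad_y_pred = 2.0 * (y_pred - y).detach()
grad_w2 = h_relu.detach().t().mm(grad_y_pred)
grad_h_relu = grad_y_pred.mm(w2.detach().t())
grad_h = grad_h_relu.clone()
grad_h[h.detach() < 0] = 0       # the ReLU blocks the gradient where h < 0
grad_w1 = x.t().mm(grad_h)

w1_matches = tc.allclose(w1.grad, grad_w1)
w2_matches = tc.allclose(w2.grad, grad_w2)
print(w1_matches, w2_matches)
```

If both comparisons hold, the update in the loop is ordinary gradient descent on the sum of squared errors.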

In [12]:
print(y,y_pred)

Variable containing:
 0.5374  1.1459  0.5080  0.2277
 1.0326 -1.5230 -1.2499 -0.7892
-0.4449 -2.1612  1.0032  1.3698
 0.2738 -0.0921 -0.7533 -0.8807
 0.0855  0.1675  0.2585 -0.6481
 0.4286  0.0183  1.0202 -1.8127
-0.7092  1.6531 -2.0067 -1.0229
-0.1900  0.4437 -2.5814 -0.3400
-1.3135  0.8108  1.4490 -1.3611
-0.0259  0.8712  0.7864  0.1123
-0.2024 -0.6729 -1.9330 -0.4568
 0.3092  0.2651  0.1062  1.2880
 0.7365  2.1592 -0.3872  0.5021
-0.2911  0.5525 -1.0399 -1.0132
-1.4623  0.8358  0.7077  1.6616
 0.0307  0.7101 -0.6586 -1.9216
 3.5645 -0.6676 -0.0563 -1.7623
 1.9317 -0.5456  0.1401 -0.8864
-0.4066 -1.0957  2.1662  0.0614
 0.9462  0.2138  0.4221  0.8907
-0.9819 -1.8990 -0.6634 -0.4846
-0.4006 -1.3969 -0.6194 -1.2223
-1.0976 -1.9377 -1.3791 -0.3372
 1.4102  0.2370 -1.0349  0.2238
[torch.cuda.FloatTensor of size 24x4 (GPU 0)]
 Variable containing:
 0.5374  1.1461  0.5080  0.2279
 1.0326 -1.5230 -1.2499 -0.7892
-0.4448 -2.1612  1.0033  1.3698
 0.2739 -0.0920 -0.7533 -0.8807
 0.0855  0.1674

In [13]:
loss

Variable containing:
1.00000e-06 *
  1.3889
[torch.cuda.FloatTensor of size 1 (GPU 0)]

We will learn about the Autograd module in the next section.