# PyTorch's Tensor Library
Most PyTorch operations run on **tensors**. A tensor is a multidimensional array. Let's have a look at the basic tensor operations. But first, let's import some important PyTorch libraries:
* **torch** - a tensor library similar to NumPy, with strong GPU support
* **torch.autograd** - a "tape-based" automatic differentiation library
* **torch.nn** - a neural network library deeply integrated with autograd
* **torch.optim** - an optimizer package to be used with torch.nn, with standard optimization methods such as SGD, RMSProp, LBFGS, Adam, etc.

In [6]:
import torch
import torch.autograd as autograd
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(123)

<torch._C.Generator at 0x7f09e23548b0>

# Creating Tensors
Tensors can be created from Python lists with the **torch.Tensor()** function.

In [7]:
# Create a torch.Tensor object from a Python list
v = [1,2,3]
print(type(v))
v_tensor = torch.Tensor(v)
print(type(v_tensor))

<class 'list'>
<class 'torch.Tensor'>
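
As a side note, `torch.Tensor()` always produces the default floating-point dtype (`float32` unless the default has been changed), so integer inputs are stored as floats. The factory function `torch.tensor()` infers the dtype from the data instead. A minimal sketch of the difference, where the names `values`, `from_constructor` and `from_factory` are illustrative and not part of the original notebook:

```python
import torch

values = [1, 2, 3]

from_constructor = torch.Tensor(values)  # always the default float dtype
from_factory = torch.tensor(values)      # dtype inferred from the data

print(from_constructor.dtype)  # torch.float32 (assuming the default dtype is unchanged)
print(from_factory.dtype)      # torch.int64
```

Which one to use depends on whether the integer type should be preserved; the rest of this notebook sticks with `torch.Tensor()`.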


In [8]:
# Create a torch.Tensor object of size 2x3 from a pair of Python lists
m1 = [1,2,3],[4,5,6]
m1_tensor = torch.Tensor(m1)
print(m1_tensor)
print(type(m1_tensor))

tensor([[1., 2., 3.],
        [4., 5., 6.]])
<class 'torch.Tensor'>


In [9]:
# Create a 3D torch.Tensor object of size 3x3x3.
m2 = [[[1,2,3],[2,3,4],[8,5,4]],
      [[1,3,2],[3,4,5],[9,0,1]],
      [[1,4,5],[5,4,6],[8,7,6]]]

m2_tensor = torch.Tensor(m2)
m2_tensor

tensor([[[1., 2., 3.],
         [2., 3., 4.],
         [8., 5., 4.]],

        [[1., 3., 2.],
         [3., 4., 5.],
         [9., 0., 1.]],

        [[1., 4., 5.],
         [5., 4., 6.],
         [8., 7., 6.]]])

In [11]:
# Similarly, create a 4D tensor of random data with torch.randn

m3_tensor = torch.randn((4,3,3,3))
m3_tensor.shape
print(m3_tensor)

tensor([[[[-2.7202,  0.5421, -1.1541],
          [ 0.7763, -0.2582, -2.0407],
          [-0.8016, -0.8183, -0.0480]],

         [[ 0.5349,  1.1031,  1.3334],
          [-1.4053, -0.5922, -0.2548],
          [ 1.1517,  0.8138,  0.6532]],

         [[ 0.6557, -1.4056, -1.2743],
          [ 0.4513, -0.2280,  0.9224],
          [ 0.8566,  0.6465,  1.2782]]],


        [[[ 2.5501, -0.3018, -0.6703],
          [-0.6171, -0.8334,  0.5663],
          [ 1.0306, -0.3047,  1.6873]],

         [[ 0.6851,  2.0024, -0.5469],
          [ 1.6014, -0.3016, -0.7074],
          [-0.1465, -0.4943, -1.1766]],

         [[-2.0524,  0.1132,  1.4353],
          [-1.1454, -1.3316,  0.2230],
          [ 0.6463,  0.1538, -0.4452]]],


        [[[ 0.5503,  0.0658,  0.2225],
          [-0.1689, -0.5455,  0.2487],
          [ 0.1343,  0.7662,  2.2760]],

         [[-1.3255, -1.0590,  0.0801],
          [ 0.3531, -0.1207, -0.9797],
          [-2.1126, -0.2721, -0.3510]],

         [[-1.6483,  0.1536, -0.1807],
     

# Multidimensional tensors

Since we frequently use higher-dimensional tensors (n > 3 dimensions), it is important to understand them. The best way to think of a higher (n) dimensional object (and a tensor in particular) is as a container which keeps a series of (n-1)-dimensional objects "inside" of it. We can "pull out" these "inner" objects by indexing into the higher-dimensional tensor container. Let's have a look at some examples:
* For a vector v (dim(v) = 1), indexing into it ("pulling out" of it) returns its "slice" - a scalar s (dim(s) = 0).
* For a matrix, indexing into it returns its "slice" - a (row or column) vector.
* A 3D tensor can be seen as a cube or rectangular box consisting of "stacked" matrices. So if we index into such a tensor, it will give us its "slice", which is a matrix!
* We can't easily visualize 5D (or n-D) tensors, but the idea is the same: if we index into them, we will pull out an object of dimension n-1.

In [15]:
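# m1 and m2 are the plain Python containers defined earlier; m3_tensor is a torch.Tensor.
# In each case, indexing removes one dimension.
print(m1[0])
print(m2[0])
print(m3_tensor[0])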

[1, 2, 3]
[[1, 2, 3], [2, 3, 4], [8, 5, 4]]
tensor([[[-2.7202,  0.5421, -1.1541],
         [ 0.7763, -0.2582, -2.0407],
         [-0.8016, -0.8183, -0.0480]],

        [[ 0.5349,  1.1031,  1.3334],
         [-1.4053, -0.5922, -0.2548],
         [ 1.1517,  0.8138,  0.6532]],

        [[ 0.6557, -1.4056, -1.2743],
         [ 0.4513, -0.2280,  0.9224],
         [ 0.8566,  0.6465,  1.2782]]])
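
The "container" rule described above can also be checked directly on tensors. The following is a small sketch, using an illustrative tensor named `cube` that is not part of the original notebook, showing that each indexing step lowers the number of dimensions by one:

```python
import torch

# A 3D tensor: 2 matrices, each with 3 rows and 4 columns
cube = torch.arange(24).reshape(2, 3, 4)

matrix = cube[0]    # indexing a 3D tensor gives a 2D matrix
vector = matrix[0]  # indexing a matrix gives a 1D (row) vector
scalar = vector[0]  # indexing a vector gives a 0D scalar tensor

for name, t in [("cube", cube), ("matrix", matrix),
                ("vector", vector), ("scalar", scalar)]:
    print(name, t.dim(), tuple(t.shape))

# A column of the first matrix is also a 1D tensor
column = matrix[:, 1]
print(column.dim())
```

Plain integer indexing always selects along the first dimension; slicing with `:` is needed to pull out a column instead of a row.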


# Operations with Tensors
We can operate on tensors in the ways we would expect. Below are some operations:

In [16]:
x = torch.Tensor([1,2,3])
y = torch.Tensor([4,5,6])
print(x)
print(y)

tensor([1., 2., 3.])
tensor([4., 5., 6.])


In [18]:
# For two 1D tensors, matmul computes the dot product: 1*4 + 2*5 + 3*6 = 32
w = torch.matmul(x,y)
print(w)

tensor(32.)
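
For comparison, here is a short sketch of a few other common operations on the same kind of 1D data. The names `a` and `b` are illustrative; the values mirror the tensors used above:

```python
import torch

a = torch.tensor([1., 2., 3.])
b = torch.tensor([4., 5., 6.])

print(a + b)               # element-wise sum: 5, 7, 9
print(a * b)               # element-wise product: 4, 10, 18
print(torch.dot(a, b))     # dot product: 32
print(torch.matmul(a, b))  # same value as torch.dot for 1D inputs
```

Note that `*` is element-wise multiplication, not a matrix product.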


**Concatenation**

In [19]:
# By default, torch.cat concatenates along axis 0 (rows). It's "stacking" the rows.

x_1 = torch.randn(2,5)
print(x_1)

tensor([[ 0.8552,  0.7492, -1.7119,  0.6025, -0.7018],
        [-1.3130,  0.1574,  2.0114,  0.1004,  0.8222]])


In [21]:
y_1 = torch.randn(3,5)
print(y_1)

tensor([[ 0.8985,  0.6210, -0.9679,  0.6740, -1.2828],
        [-0.5097,  0.1464, -0.4860, -0.7529,  1.6989],
        [ 0.4991, -2.1702,  0.5130, -1.9029,  0.8260]])


In [25]:
z_1 = torch.cat([x_1,y_1])
z_1.shape

torch.Size([5, 5])

In [30]:
# The second argument specifies which axis to concatenate along. Here we select 1 (columns), so the columns are attached side by side.
x_2 = torch.randn(2,3)
y_2 = torch.randn(2,5)
z_2 = torch.cat([x_2,y_2] ,1)
z_2.shape

# If Tensors are not compatible, torch will complain.

torch.Size([2, 8])
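
To illustrate what "complaining" looks like, the sketch below tries to concatenate two tensors whose shapes do not line up. The tensor names and the `cat_failed` flag are illustrative only:

```python
import torch

rows_2x3 = torch.randn(2, 3)
rows_3x5 = torch.randn(3, 5)

try:
    # Along dim 0 every other dimension must match, but 3 != 5
    torch.cat([rows_2x3, rows_3x5], dim=0)
    cat_failed = False
except RuntimeError as err:
    cat_failed = True
    print("torch.cat failed:", err)
```

All dimensions except the one being concatenated along have to agree.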

# Reshaping Tensors
We can use the `.view()` method to reshape a tensor. Often we will need to reshape our data before passing it to a neural network.

Let's assume we have 64000 RGB images with a size of 28x28 pixels. We can define a tensor of shape (64000, 3, 28, 28) to hold them, where 3 is the number of color channels.

In [31]:
x = torch.randn(64000,3,28,28)
# Now we add a leading dimension of size 32 and let PyTorch infer the second one by passing -1 (64000 / 32 = 2000):
x_reshaped = x.view(32,-1, 3, 28,28)
print(x_reshaped.shape)

torch.Size([32, 2000, 3, 28, 28])
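
The same `-1` inference can be seen on a much smaller tensor. The sketch below also shows one caveat worth knowing: `.view()` needs a compatible memory layout, while `.reshape()` falls back to copying. All names here are illustrative and not from the original notebook:

```python
import torch

small = torch.arange(12)   # 12 elements: 0..11

grid = small.view(3, -1)   # -1 is inferred as 12 / 3 = 4
print(grid.shape)          # torch.Size([3, 4])

# A transposed tensor is not contiguous in memory, so .view() refuses it
transposed = grid.t()
try:
    transposed.view(12)
    view_failed = False
except RuntimeError:
    view_failed = True

# .reshape() works here because it copies the data when it has to
flat = transposed.reshape(12)
print(view_failed, flat.shape)
```

Only one dimension may be given as `-1`, and the total number of elements has to stay the same.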


# Computation Graphs and Automatic Differentiation

A computation graph is a specification of how the parameters involved in a computation are combined to give the output.

The fundamental PyTorch class `autograd.Variable` keeps track of how it was created.

In [32]:
# Variables wrap tensor objects
x = autograd.Variable(torch.Tensor([1,2,3]),requires_grad = True)
# You can access the data with the .data attribute

print(x.data)

tensor([1., 2., 3.])
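
Note that since PyTorch 0.4 `Variable` and `Tensor` have been merged, and the `autograd.Variable` wrapper is deprecated (it still works and simply returns a tensor). Assuming such a version, the same tracking can be requested directly on a tensor. A minimal sketch, with the illustrative name `t`:

```python
import torch

# requires_grad=True asks autograd to track operations on this tensor
t = torch.tensor([1., 2., 3.], requires_grad=True)

print(t.requires_grad)              # True
print(t.grad_fn)                    # None: t was created by the user, not by an operation
print((t * 2).grad_fn is not None)  # True: the product remembers how it was created
```

The rest of this notebook keeps the `autograd.Variable` spelling; the behavior is the same.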


In [34]:
y = autograd.Variable(torch.Tensor([4,5,6]),requires_grad = True)
z = x+y
print(z)

tensor([5., 7., 9.], grad_fn=<AddBackward0>)


In [35]:
operation = z.grad_fn
print(operation)

<AddBackward0 object at 0x7f0a1c081970>


The autograd.Variable knows which operation created it. But how does that help **compute a gradient?**

In [36]:
# Let's sum up all the entries in z
s = z.sum()
print(s)

tensor(21., grad_fn=<SumBackward0>)


In [38]:
print(s.grad_fn)

<SumBackward0 object at 0x7f0a1da9e550>


# Gradient
So now, what is the derivative of this sum with respect to the first component of x? Remember that x is a tensor of 3 elements: $x = (x_0, x_1, x_2)$.

In math terms, we want the partial derivative of $s$ with respect to $x_0$: $\frac{\partial s}{\partial x_0}$

Well, $s$ knows that it was created as the sum of the elements of the tensor $z$: $(z_0, z_1, z_2)$. $z$ knows that it was the sum $x + y$. So

$$s = \overbrace{x_0 + y_0}^{z_0} + \overbrace{x_1 + y_1}^{z_1} + \overbrace{x_2 + y_2}^{z_2}$$

And so $s$ contains enough information to determine that the derivative of $s$ with respect to $x_0$ is 1!

First we need to run **backpropagation** and calculate gradients with respect to every variable. Note: if you run `backward` multiple times, the gradients will accumulate. That is because PyTorch accumulates the gradients into the **.grad property**, since for many models this is very convenient. Let's now have PyTorch compute the gradient and see whether we were right with our guess of 1:

In [39]:
# Calling .backward() on any variable will run backprop, starting from it.
s.backward(retain_graph = True)

In [45]:
print(x)
print(x.grad)
print(y.grad)

tensor([1., 2., 3.], requires_grad=True)
tensor([1., 1., 1.])
tensor([1., 1., 1.])


In [48]:
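# Each further call to backward adds 1 to every gradient entry.
# The values below are 4 rather than 2, which suggests this cell was run several times
# (note the jump in the cell counter).
s.backward(retain_graph = True)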
print(x.grad)
print(y.grad)

tensor([4., 4., 4.])
tensor([4., 4., 4.])
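
Because of this accumulation, gradients usually have to be reset between passes. The following sketch shows the accumulation and one way to clear it, using the in-place `zero_()` method on `.grad`. The names `p`, `q` and `grad_after_*` are illustrative and not part of the original notebook:

```python
import torch

p = torch.tensor([1., 2., 3.], requires_grad=True)
q = torch.tensor([4., 5., 6.], requires_grad=True)

# First backward pass: d(sum(p + q)) / dp_i = 1 for every i
(p + q).sum().backward()
grad_after_one = p.grad.clone()

# Second backward pass on a freshly built graph: gradients are added up
(p + q).sum().backward()
grad_after_two = p.grad.clone()

# Reset the accumulated gradients in place before the next pass
p.grad.zero_()
q.grad.zero_()
(p + q).sum().backward()
grad_after_reset = p.grad.clone()

print(grad_after_one)    # all ones
print(grad_after_two)    # all twos
print(grad_after_reset)  # all ones again
```

When training with `torch.optim`, the optimizer's `zero_grad()` method serves the same purpose for all parameters it manages.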


# How NOT to break the computational graph

Let's create two tensors and add them up:

In [49]:
x = torch.randn((2,2))
y = torch.randn((2,2))
z = x + y # These are plain tensors that do not require gradients, so backprop is not possible

print(z)

tensor([[ 2.0077, -0.0804],
        [-0.5274,  0.0947]])


Now we wrap the torch tensors in `autograd.Variable`. `var_z` contains the information needed for backpropagation:

In [52]:
var_x = autograd.Variable(x,requires_grad = True)
var_y = autograd.Variable(y, requires_grad  = True)
# var_z contains enough information to compute gradients, as we saw above

var_z = var_x + var_y
print(var_z.grad_fn)

<AddBackward0 object at 0x7f09d7bfd460>


But what happens if we extract the wrapped tensor object out of var_z and re-wrap the tensor in a new autograd.Variable?

In [55]:
var_z_data = var_z.data
new_var_z = autograd.Variable(var_z_data)
print(new_var_z.grad_fn)

None


The variable chain no longer exists, since we extracted only the data and the record of the operations was lost. If we now try to call `backward` on `new_var_z`, it will throw an error:

In [57]:
#new_var_z.backward(retain_graph = True)
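
The same effect can be reproduced in a self-contained way with `.detach()`, which returns a tensor that shares its data with the original but is cut off from the graph. This is a sketch under the assumption of a PyTorch version where tensors carry `requires_grad` directly; the names `u`, `v`, `w` and `backward_failed` are illustrative:

```python
import torch

u = torch.randn(2, 2, requires_grad=True)
v = torch.randn(2, 2, requires_grad=True)
w = u + v

# The detached tensor has the same values but no history
w_detached = w.detach()
print(w.grad_fn is not None)     # True
print(w_detached.grad_fn)        # None
print(w_detached.requires_grad)  # False

try:
    # Nothing in this expression requires gradients, so backward cannot run
    w_detached.sum().backward()
    backward_failed = False
except RuntimeError as err:
    backward_failed = True
    print("backward failed:", err)
```

To keep the graph intact, continue computing with `w` itself and only detach values that are needed for logging or inspection.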

# CUDA
Check whether GPU acceleration with **CUDA** is available.

In [58]:
# Let us run this cell only if CUDA is available
if torch.cuda.is_available():
    # Create a LongTensor and transfer it
    # to the GPU as a torch.cuda.LongTensor
    a = torch.LongTensor(10).fill_(3).cuda()
    print(type(a))
    # Transfer it back to the CPU, where it is
    # a torch.LongTensor again
    b = a.cpu()
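
A common alternative is to pick a device once and create tensors on it directly, so the same code runs with or without a GPU. The sketch below assumes nothing beyond the standard `torch.device` and `torch.full` APIs; the names `device`, `threes` and `on_cpu` are illustrative:

```python
import torch

# Pick the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor of ten 3s (64-bit integers) directly on the chosen device
threes = torch.full((10,), 3, dtype=torch.long, device=device)
print(threes.device)

# .cpu() returns a CPU copy, or the tensor itself if it is already on the CPU
on_cpu = threes.cpu()
print(on_cpu.sum().item())  # 30
```

Tensors that take part in the same operation must live on the same device, so it helps to pass the one `device` object around consistently.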