## Tensor & Variable

This is the second lesson in the PyTorch course. It shows how to use PyTorch much like NumPy and introduces PyTorch's basic building blocks, Tensor and Variable, and how they operate.

PyTorch is a library that provides GPU-accelerated tensor computation and dynamically built networks. Many of its operations mirror NumPy's, but because tensors can run on the GPU, they can be many times faster than the NumPy equivalents.

In [5]:
import torch
import numpy as np

In [6]:
#Numpy nd array
numpy_tensor = np.random.randn(10, 20)

We can convert a NumPy array into a tensor in the two ways below:

In [7]:
pytorch_tensor1 = torch.Tensor(numpy_tensor)
pytorch_tensor2 = torch.from_numpy(numpy_tensor)

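The two methods treat the data type differently. torch.from_numpy keeps the NumPy dtype (float64 here, giving a torch.DoubleTensor) and shares memory with the array, while torch.Tensor(...) converts the data to the default tensor type, torch.FloatTensor (float32).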
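A quick way to see what each constructor produces is to inspect the dtype and check whether the tensor shares memory with the source array. The sketch below assumes PyTorch's default dtype has not been changed; the names `copied` and `shared` are illustrative and not part of the original notebook.

```python
import numpy as np
import torch

numpy_tensor = np.random.randn(10, 20)     # NumPy creates float64 by default

copied = torch.Tensor(numpy_tensor)        # converts to the default dtype and copies
shared = torch.from_numpy(numpy_tensor)    # keeps float64 and shares memory

print(copied.dtype)   # torch.float32
print(shared.dtype)   # torch.float64

# A write through the NumPy array shows up in the shared tensor
numpy_tensor[0, 0] = 123.0
print(shared[0, 0].item())   # 123.0
```

If float64 precision or zero-copy sharing matters, torch.from_numpy is the safer choice of the two.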

Conversely, a PyTorch tensor can be converted to a NumPy ndarray as follows.

In [12]:
# Tensor on CPU
numpy_array = pytorch_tensor1.numpy()

# Tensor on GPU
numpy_array = pytorch_tensor1.cpu().numpy()

Note that a Tensor on the GPU cannot be converted to a NumPy ndarray directly. Use .cpu() to move it to the CPU first.

We can put a tensor on the GPU in two ways:

In [14]:
# The first way is to define the CUDA data type
dtype = torch.cuda.FloatTensor  # Put tensor on the default GPU
gpu_tensor = torch.randn(10, 20).type(dtype)

# The second way is to call .cuda() with a device index
gpu_tensor = torch.randn(10, 20).cuda(0)  # Put tensor on the first GPU
gpu_tensor = torch.randn(10, 20).cuda(1)  # Put tensor on the second GPU


To move the tensor back to the CPU:

In [None]:
cpu_tensor = gpu_tensor.cpu()
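The cells above assume at least one CUDA device is present. As a hedged alternative, the sketch below picks the device at run time with torch.device and .to(), so the same code also runs on a CPU-only machine. The names `device`, `x_dev`, and `x_back` are illustrative.

```python
import torch

# Use the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(10, 20)
x_dev = x.to(device)          # returns x itself if it is already on that device
print(x_dev.device)

# .cpu() is safe on either device, so the NumPy conversion always works
x_back = x_dev.cpu().numpy()
print(x_back.shape)           # (10, 20)
```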

### Properties of tensor

In [15]:
#Size of tensor
print(pytorch_tensor1.shape)
print(pytorch_tensor1.size())

torch.Size([10, 20])
torch.Size([10, 20])


In [16]:
#Type of tensor
print(pytorch_tensor1.type())

torch.FloatTensor


In [17]:
#Dimension of Tensor
print(pytorch_tensor1.dim())

2


In [18]:
#Total number of elements in tensor
print(pytorch_tensor1.numel())

200


## Practice

Consult the PyTorch documentation on tensor data types, then create a randomly initialized float64 tensor of size 3 x 2, convert it to a NumPy ndarray, and print its data type.

In [19]:
x = torch.randn(3, 2)
x = x.type(torch.DoubleTensor)
x_array = x.numpy()
print(x_array.dtype)

float64
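The same result can be reached without the legacy type classes. This is a minimal sketch using the dtype argument and .to(), offered as an alternative to the solution above rather than a replacement for it.

```python
import torch

x = torch.randn(3, 2, dtype=torch.float64)   # create the tensor as float64 directly
print(x.dtype)            # torch.float64
print(x.numpy().dtype)    # float64

y = torch.randn(3, 2).to(torch.float64)      # or cast an existing float32 tensor
print(y.dtype)            # torch.float64
```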


### Tensor Operation
The Tensor operation API is very similar to NumPy's. If you are familiar with NumPy operations, the tensor equivalents work in essentially the same way.

In [20]:
x = torch.ones(2, 2)
print(x)

tensor([[1., 1.],
        [1., 1.]])


In [21]:
print(x.type())

torch.FloatTensor


In [23]:
x = x.long()
print(x)
print(x.type())

tensor([[1, 1],
        [1, 1]])
torch.LongTensor


In [24]:
x = torch.randn(4, 3)
print(x)

tensor([[ 0.4476,  0.1933, -0.3887],
        [-0.8183, -0.9913, -1.0813],
        [-0.0583,  0.3382, -0.0624],
        [-0.2718,  3.8732, -1.3096]])


In [30]:
# Maximum value along each row
max_value, max_idx = torch.max(x, dim=1)

In [26]:
max_value

tensor([ 0.4476, -0.8183,  0.3382,  3.8732])

In [27]:
max_idx

tensor([0, 0, 1, 1])

In [29]:
# Sum along each row
sum_x = torch.sum(x, dim=1)
print(sum_x)

tensor([ 0.2522, -2.8908,  0.2175,  2.2919])
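To make the meaning of dim concrete, here is a small sketch on a fixed tensor (the values are chosen for illustration). It also shows the keepdim option and the matching NumPy call, where the argument is named axis.

```python
import numpy as np
import torch

x = torch.tensor([[1.0, 5.0, 2.0],
                  [7.0, 3.0, 4.0]])

# dim=1 reduces across the columns, leaving one result per row
values, indices = torch.max(x, dim=1)
print(values)    # tensor([5., 7.])
print(indices)   # tensor([1, 0])

# keepdim=True keeps the reduced dimension with size 1
row_sum = torch.sum(x, dim=1, keepdim=True)
print(row_sum.shape)   # torch.Size([2, 1])

# NumPy performs the same reduction with axis=1
print(np.allclose(x.numpy().sum(axis=1), torch.sum(x, dim=1).numpy()))   # True
```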


In [31]:
# Add or remove dimensions

print(x.shape)
x = x.unsqueeze(0) # Add a new first dimension
print(x.shape)

torch.Size([4, 3])
torch.Size([1, 4, 3])


In [32]:
x = x.unsqueeze(1) # Add a new second dimension
print(x.shape)

torch.Size([1, 1, 4, 3])


In [33]:
x = x.squeeze(0) # Remove the first dimension (it has size 1)
print(x.shape)

torch.Size([1, 4, 3])


In [34]:
x = x.squeeze() # Remove all dimensions of size 1 from the tensor
print(x.shape)

torch.Size([4, 3])


In [35]:
x = torch.randn(3, 4, 5)
print(x.shape)

# Reorder dimensions using permute and transpose
x = x.permute(1, 0, 2) # permute rearranges all the dimensions of a tensor
print(x.shape)

x = x.transpose(0, 2)  # transpose swaps two dimensions of a tensor
print(x.shape)

torch.Size([3, 4, 5])
torch.Size([4, 3, 5])
torch.Size([5, 3, 4])


In [36]:
# using view to reshape tensor
x = torch.randn(3, 4, 5)
print(x.shape)

x = x.view(-1, 5) # -1 lets PyTorch infer that dimension; the second dimension becomes 5
print(x.shape)

x = x.view(3, 20) # reshape into the size of (3, 20)
print(x.shape)

torch.Size([3, 4, 5])
torch.Size([12, 5])
torch.Size([3, 20])
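One caveat worth knowing: view only reinterprets the existing memory layout, so it can fail on a tensor that is not contiguous, for example after transpose. The sketch below illustrates this and shows two common workarounds, reshape and contiguous().view(). The variable names are illustrative.

```python
import torch

x = torch.randn(3, 4)
t = x.transpose(0, 1)          # shape (4, 3), same storage as x, not contiguous
print(t.is_contiguous())       # False

try:
    t.view(12)                 # the strides of t do not allow a flat view
except RuntimeError as err:
    print("view failed:", type(err).__name__)

flat = t.reshape(12)                   # reshape copies when a view is not possible
also_flat = t.contiguous().view(12)    # or make the memory contiguous first
print(torch.equal(flat, also_flat))    # True
```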


In [42]:
x = torch.randn(3, 4)
y = torch.randn(3, 4)
print(x)
print(y)
# Tensor sum
z = torch.add(x, y)
print(z)

tensor([[ 0.6760,  0.9561,  1.9080,  0.7366],
        [ 1.0248,  0.8321, -2.2153,  0.8417],
        [ 1.4383,  1.7708, -1.0897,  2.0399]])
tensor([[-0.6914,  0.6661,  0.1675, -0.7811],
        [ 0.7039, -0.4436, -2.2425,  2.3513],
        [-0.3048, -0.0475, -0.4726, -0.3579]])
tensor([[-0.0154,  1.6222,  2.0754, -0.0445],
        [ 1.7287,  0.3885, -4.4577,  3.1930],
        [ 1.1335,  1.7233, -1.5624,  1.6820]])


In addition, most PyTorch operations have an in-place version that modifies the tensor directly without allocating additional memory. The convention is simple: the in-place version has an underscore (_) appended to the operation's name.

In [43]:
x = torch.ones(3, 3)
print(x.shape)

# unsqueeze inplace
x.unsqueeze_(0)
print(x.shape)

# transpose inplace
x.transpose_(1, 0)
print(x.shape)

torch.Size([3, 3])
torch.Size([1, 3, 3])
torch.Size([3, 1, 3])


In [44]:
x = torch.ones(3, 3)
y = torch.ones(3, 3)
print(x)

# add inplace
x.add_(y)
print(x)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])
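One way to confirm that an in-place operation reuses the existing memory is to compare data_ptr(), which returns the address of the tensor's first element, before and after the call. This is a small illustrative check, not part of the original notebook.

```python
import torch

x = torch.ones(3, 3)
y = torch.ones(3, 3)

ptr_before = x.data_ptr()      # address of the first element of x

x.add_(y)                      # in-place: the result is written into x's memory
print(x.data_ptr() == ptr_before)   # True

z = x + y                      # out-of-place: the result is a newly allocated tensor
print(z.data_ptr() == ptr_before)   # False
print(x[0, 0].item(), z[0, 0].item())   # 2.0 3.0
```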


Create a 4 x 4 float32 matrix of ones, then set the 2 x 2 block in the middle of the matrix to 2:

$$
\left[
\begin{matrix}
1 & 1 & 1 & 1 \\
1 & 2 & 2 & 1 \\
1 & 2 & 2 & 1 \\
1 & 1 & 1 & 1
\end{matrix}
\right] \\
[torch.FloatTensor\ of\ size\ 4x4]
$$

In [45]:
#Solution
x = torch.ones(4, 4).float()
x[1:3, 1:3] = 2
print(x)

tensor([[1., 1., 1., 1.],
        [1., 2., 2., 1.],
        [1., 2., 2., 1.],
        [1., 1., 1., 1.]])


## Autograd

The ``autograd`` package provides automatic differentiation for all operations
on Tensors. It is a define-by-run framework, which means that your backprop is
defined by how your code is run, and that every single iteration can be
different.

### Tensor
``torch.Tensor`` is the central class of the package. If you set its attribute
``.requires_grad`` as ``True``, it starts to track all operations on it. When
you finish your computation you can call ``.backward()`` and have all the
gradients computed automatically. The gradient for this tensor will be
accumulated into ``.grad`` attribute.

To stop a tensor from tracking history, you can call ``.detach()`` to detach
it from the computation history, and to prevent future computation from being
tracked.

To prevent tracking history (and using memory), you can also wrap the code block
in ``with torch.no_grad():``. This can be particularly helpful when evaluating a
model because the model may have trainable parameters with `requires_grad=True`,
but for which we don't need the gradients.
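The behaviours described above (gradients accumulating into .grad, .detach(), and torch.no_grad()) can be seen in a few lines. The following is a minimal sketch with illustrative variable names.

```python
import torch

x = torch.ones(2, requires_grad=True)

(x * 3).sum().backward()
print(x.grad)                  # tensor([3., 3.])

(x * 3).sum().backward()       # a second backward pass adds to the stored gradient
print(x.grad)                  # tensor([6., 6.])

x.grad.zero_()                 # reset the accumulated gradient in place

d = (x * 2).detach()           # same values, cut off from the computation history
print(d.requires_grad)         # False

with torch.no_grad():
    w = x * 2                  # nothing is tracked inside the block
print(w.requires_grad)         # False
```

Because gradients accumulate, training loops usually reset them before each backward pass.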

There’s one more class which is very important for autograd
implementation - a ``Function``.

``Tensor`` and ``Function`` are interconnected and build up an acyclic
graph, that encodes a complete history of computation. Each tensor has
a ``.grad_fn`` attribute that references a ``Function`` that has created
the ``Tensor`` (except for Tensors created by the user - their
``grad_fn is None``).

If you want to compute the derivatives, you can call ``.backward()`` on
a ``Tensor``. If ``Tensor`` is a scalar (i.e. it holds a one element
data), you don’t need to specify any arguments to ``backward()``,
however if it has more elements, you need to specify a ``gradient``
argument that is a tensor of matching shape.

In [55]:
import torch

# Create a tensor and set requires_grad=True to track computation with it
x = torch.ones(2, 2, requires_grad=True)
print(x)

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)


In [56]:
# Do an operation on the tensor:
y = x + 2
print(y)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward>)


In [57]:
# ``y`` was created as a result of an operation, so it has a ``grad_fn``.
print(y.grad_fn)

<AddBackward object at 0x000002592F7C98D0>


In [59]:
# Do more operations on y
z = y * y * 3
out = z.mean()
print(z, out)

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward>) tensor(27., grad_fn=<MeanBackward1>)


In [60]:
# ``.requires_grad_( ... )`` changes an existing Tensor's ``requires_grad``
# flag in-place. The input flag defaults to ``False`` if not given.
a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

False
True
<SumBackward0 object at 0x000002592F854D68>


In [61]:
# Gradients
# ---------
# Let's backprop now
# Because ``out`` contains a single scalar, ``out.backward()`` is
# equivalent to ``out.backward(torch.tensor(1.))``.

out.backward()


In [62]:
# print gradients d(out)/dx
print(x.grad)

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
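The value 4.5 can be checked by hand. Writing $o$ for `out` and $z_i$ for the elements of `z` (these symbols are introduced here for illustration), the code above computes

$$
o = \frac{1}{4}\sum_i z_i, \qquad z_i = 3(x_i + 2)^2
$$

so that

$$
\frac{\partial o}{\partial x_i} = \frac{3}{2}(x_i + 2), \qquad \left.\frac{\partial o}{\partial x_i}\right|_{x_i = 1} = \frac{9}{2} = 4.5
$$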


#### You can do many crazy things with autograd!

In [63]:

x = torch.randn(3, requires_grad=True)

y = x * 2
while y.data.norm() < 1000:
    y = y * 2

print(y)

# y is no longer a scalar, so backward() needs a gradient
# argument that is a tensor of matching shape
gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(gradients)

print(x.grad)


tensor([ 354.3214,  965.8456, -546.1478], grad_fn=<MulBackward>)
tensor([ 204.8000, 2048.0000,    0.2048])
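When backward() receives a gradient argument v, autograd computes the vector-Jacobian product of v with dy/dx. In the run shown above, the printed gradient (204.8 = 0.1 x 2048) indicates that the loop ended with y equal to 2048 times x, so the Jacobian is 2048 times the identity. The sketch below reproduces that case with fixed inputs; the values of x are illustrative.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2048                   # elementwise, so dy/dx is 2048 times the identity

v = torch.tensor([0.1, 1.0, 0.0001])
y.backward(v)                  # computes the product of v with the Jacobian

print(x.grad)
print(torch.allclose(x.grad, 2048 * v))   # True
```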


You can also stop autograd from tracking history on Tensors
with ``requires_grad=True`` by wrapping the code block in
``with torch.no_grad():``

In [64]:
print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)


True
True
False


### Read more 
http://pytorch.org/docs/autograd