
# Tensors

Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other specialized hardware to accelerate computing. 

In [1]:
import torch
import numpy as np

## Tensor Initialization

Tensors can be initialized in various ways. Take a look at the following examples:

**Directly from data**

Tensors can be created directly from data. The data type is automatically inferred.



In [2]:
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
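
As a small illustration of the type inference described above, the sketch below builds tensors from integer and floating-point data and inspects the resulting ``dtype``. The variable names are illustrative only, and the ``float32`` result assumes the default floating-point dtype has not been changed.

```python
import torch

# Integer data is inferred as a 64-bit integer tensor.
x_int = torch.tensor([[1, 2], [3, 4]])
print(x_int.dtype)

# A single floating-point value makes the whole tensor floating point
# (torch.float32, assuming the default dtype is unchanged).
x_float = torch.tensor([[1.0, 2], [3, 4]])
print(x_float.dtype)

# An explicit dtype argument overrides the inference.
x_double = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)
print(x_double.dtype)
```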

**From a NumPy array**

Tensors can be created from NumPy arrays (and vice versa; see the Bridge with NumPy section).



In [3]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

**From another tensor:**

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.



In [4]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print("Ones Tensor: \n {} \n".format(x_ones))

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print("Random Tensor: \n {} \n".format(x_rand))

Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.8896, 0.7317],
        [0.8255, 0.2639]]) 



**With random or constant values:**

``shape`` is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.



In [5]:
shape = (2, 3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print("Random Tensor: \n {} \n".format(rand_tensor))
print("Ones Tensor: \n {} \n".format(ones_tensor))
print("Zeros Tensor: \n {}".format(zeros_tensor))

Random Tensor: 
 tensor([[0.0687, 0.5021, 0.6481],
        [0.1861, 0.7458, 0.4851]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])


## Tensor Attributes

Tensor attributes describe their shape, datatype, and the device on which they are stored.



In [6]:
tensor = torch.rand(3, 4)

print("Shape of tensor: {}".format(tensor.shape))
print("Datatype of tensor: {}".format(tensor.dtype))
print("Device tensor is stored on: {}".format(tensor.device))

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
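
A few related ways to query a tensor's shape, sketched here with a hypothetical tensor ``t``: ``torch.Size`` behaves like a tuple, so it can be unpacked and compared directly.

```python
import torch

t = torch.rand(3, 4)

# torch.Size is a tuple subclass, so it unpacks and compares like a tuple.
rows, cols = t.shape
print(rows, cols)
print(t.shape == (3, 4))

# Number of dimensions and total number of elements.
print(t.ndim)
print(t.numel())
```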


--------------




## Tensor Operations



In [7]:
# We move our tensor to the GPU if available
if torch.cuda.is_available():
  tensor = tensor.to('cuda')
  print("Device tensor is stored on: {}".format(tensor.device))
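
A common way to write the same check so that the rest of the code does not care which device was chosen is to store the device in a variable. This is a minimal sketch; the names ``device``, ``x``, ``y`` and ``z`` are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device...
x = torch.ones(2, 2, device=device)
# ...or move an existing tensor there. .to() returns the moved tensor,
# so the result has to be assigned.
y = torch.rand(2, 2).to(device)

# Both operands live on the same device, so the operation is valid.
z = x + y
print(z.device)
```

Operations between tensors on different devices raise an error, which is why both operands are placed on ``device`` before adding them.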

**Standard numpy-like indexing and slicing:**



In [8]:
tensor = torch.ones(4, 4)
print(tensor, '\n\n')
tensor[:,1] = 0
print(tensor)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]) 


tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
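
A few more indexing patterns, sketched on a hypothetical tensor ``t`` whose entries are 0 through 15 so the selected values are easy to follow:

```python
import torch

# A 4x4 tensor holding 0..15, filled row by row.
t = torch.arange(16.).reshape(4, 4)

first_row = t[0]        # values 0, 1, 2, 3
first_col = t[:, 0]     # values 0, 4, 8, 12
last_col = t[..., -1]   # values 3, 7, 11, 15
block = t[1:3, 1:3]     # the central 2x2 block: [[5, 6], [9, 10]]

print(first_row)
print(first_col)
print(last_col)
print(block)
```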


#### Joining tensors



In [9]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
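
The ``dim`` argument decides which axis grows. The sketch below compares ``torch.cat`` along both axes with ``torch.stack``, which joins tensors along a new dimension instead; the tensors ``a`` and ``b`` are illustrative.

```python
import torch

a = torch.ones(4, 4)
b = torch.zeros(4, 4)

# cat extends an existing dimension.
rows = torch.cat([a, b], dim=0)   # shape (8, 4)
cols = torch.cat([a, b], dim=1)   # shape (4, 8)

# stack adds a new leading dimension, so the inputs must have equal shapes.
stacked = torch.stack([a, b], dim=0)   # shape (2, 4, 4)

print(rows.shape, cols.shape, stacked.shape)
```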


**Multiplying tensors**



In [10]:
# This computes the element-wise product
print("tensor.mul(tensor) \n {} \n".format(tensor.mul(tensor)))
# Alternative syntax:
print("tensor * tensor \n {}".format(tensor * tensor))

tensor.mul(tensor) 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor * tensor 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])


#### Matrix multiplication between two tensors



In [11]:
print("tensor.matmul(tensor.T) \n {} \n".format(tensor.matmul(tensor.T)))
# Alternative syntax:
print("tensor @ tensor.T \n {}".format(tensor @ tensor.T))

tensor.matmul(tensor.T) 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

tensor @ tensor.T 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])
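
To make the difference between the element-wise product and the matrix product concrete, here is a small sketch with two hypothetical 2x2 matrices ``A`` and ``B``, plus a rectangular case showing how the shapes combine.

```python
import torch

A = torch.tensor([[1., 2.], [3., 4.]])
B = torch.tensor([[5., 6.], [7., 8.]])

# Element-wise: each entry is A[i, j] * B[i, j].
elementwise = A * B   # [[5, 12], [21, 32]]

# Matrix product: entry (i, j) is the dot product of row i of A and column j of B.
matrix = A @ B        # [[19, 22], [43, 50]]

# Shapes (2, 3) @ (3, 4) give (2, 4); the inner dimensions must match.
rect = torch.ones(2, 3) @ torch.ones(3, 4)

print(elementwise)
print(matrix)
print(rect.shape)
```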


#### In-place operations

Operations with a ``_`` suffix, such as ``add_``, modify the tensor in place.



In [12]:
print(tensor, "\n")
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
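
For contrast, the sketch below runs the out-of-place ``add`` next to the in-place ``add_`` on a hypothetical tensor ``t``. Only the second call changes ``t`` itself.

```python
import torch

t = torch.ones(3)

# Out-of-place: returns a new tensor and leaves t untouched.
u = t.add(5)
print(t)   # still all ones
print(u)   # all sixes

# In-place: overwrites the storage of t.
v = t.add_(5)
print(t)   # now all sixes

# The result of the in-place call refers to the same memory as t.
print(v.data_ptr() == t.data_ptr())
```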


--------------





## Bridge with NumPy
Tensors on the CPU and NumPy arrays can share their underlying memory
locations, and changing one will change the other.



### Tensor to NumPy array



In [13]:
t = torch.ones(5)
print("t: {}".format(t))
n = t.numpy()
print("n: {}".format(n))

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


A change in the tensor is reflected in the NumPy array.



In [14]:
t.add_(1)
print("t: {}".format(t))
print("n: {}".format(n))

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


### NumPy array to Tensor



In [15]:
n = np.ones(5)
t = torch.from_numpy(n)

Changes in the NumPy array are reflected in the tensor.



In [16]:
np.add(n, 1, out=n)
print("t: {}".format(t))
print("n: {}".format(n))

t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
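
When sharing memory is not wanted, the data can be copied instead. The sketch below contrasts ``torch.from_numpy``, which shares memory with the array, with ``torch.tensor``, which copies it; the names ``shared`` and ``copied`` are illustrative.

```python
import numpy as np
import torch

n = np.ones(3)

shared = torch.from_numpy(n)   # shares memory with n
copied = torch.tensor(n)       # independent copy of the data

# Modify the NumPy array after both tensors were created.
n[0] = 10.0

print(shared)   # first element follows the change
print(copied)   # first element is still 1
```

The same idea applies in the other direction: ``t.clone().numpy()`` yields an array that does not share memory with the original tensor ``t``.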


## Differentiation in Autograd
Let's take a look at how ``autograd`` collects gradients. We create two tensors ``a`` and ``b`` with
``requires_grad=True``. This signals to ``autograd`` that every operation on them should be tracked.




In [2]:
import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)

We create another tensor ``Q`` from ``a`` and ``b``.

\begin{align}Q = 3a^3 - b^2\end{align}



In [3]:
Q = 3*a**3 - b**2

Suppose ``a`` and ``b`` are the parameters of a neural network (NN) and ``Q``
is the error. In NN training, we want the gradients of the error
with respect to the parameters, i.e.

\begin{align}\frac{\partial Q}{\partial a} = 9a^2\end{align}

\begin{align}\frac{\partial Q}{\partial b} = -2b\end{align}


When we call ``.backward()`` on ``Q``, autograd calculates these gradients
and stores them in the respective tensors' ``.grad`` attribute.

We need to explicitly pass a ``gradient`` argument to ``Q.backward()`` because ``Q`` is a vector, not a scalar.
``gradient`` is a tensor of the same shape as ``Q``, and it represents the
gradient of ``Q`` with respect to itself, i.e.

\begin{align}\frac{dQ}{dQ} = 1\end{align}

Equivalently, we can also aggregate Q into a scalar and call backward implicitly, like ``Q.sum().backward()``.




In [4]:
external_grad = torch.tensor([1., 1.])
Q.backward(gradient=external_grad)

Gradients are now deposited in ``a.grad`` and ``b.grad``.



In [5]:
# check if collected gradients are correct
print(9*a**2 == a.grad)
print(-2*b == b.grad)

tensor([True, True])
tensor([True, True])
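
The scalar route mentioned earlier, ``Q.sum().backward()``, can be checked the same way. The sketch below recreates ``a``, ``b`` and ``Q`` from scratch and compares the collected gradients with the analytic values; ``torch.allclose`` is used here instead of ``==`` as a more tolerant floating-point comparison.

```python
import torch

a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)
Q = 3 * a**3 - b**2

# Summing gives a scalar, so no explicit gradient argument is needed.
Q.sum().backward()

# Analytic gradients: dQ/da = 9 * a**2 = [36, 81], dQ/db = -2 * b = [-12, -8].
expected_a = torch.tensor([36., 81.])
expected_b = torch.tensor([-12., -8.])

print(torch.allclose(a.grad, expected_a))
print(torch.allclose(b.grad, expected_b))
```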


### Computational Graph & Exclusion from the DAG

Conceptually, autograd keeps a record of data (tensors) and all executed operations (along with the resulting new tensors) in a directed acyclic graph (DAG).

In a forward pass, autograd does two things simultaneously:

- run the requested operation to compute a resulting tensor, and
- maintain the operation’s *gradient function* in the DAG.

The backward pass kicks off when ``.backward()`` is called on the DAG
root. ``autograd`` then:

- computes the gradients from each ``.grad_fn``,
- accumulates them in the respective tensor’s ``.grad`` attribute, and
- using the chain rule, propagates all the way to the leaf tensors.
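
The steps above can be observed on a tiny graph. This sketch, using a hypothetical leaf tensor ``a``, inspects ``.is_leaf`` and ``.grad_fn`` and shows that a second backward pass adds to ``.grad`` rather than replacing it.

```python
import torch

a = torch.tensor([2., 3.], requires_grad=True)

# Forward pass: the result records the function that produced it.
q = (a ** 2).sum()
print(a.is_leaf, a.grad_fn)      # leaf tensors have no grad_fn
print(q.grad_fn is not None)     # the root has one

# Backward pass: dq/da = 2 * a = [4, 6].
q.backward()
first = a.grad.clone()

# A second forward and backward pass accumulates into the same .grad.
q2 = (a ** 2).sum()
q2.backward()
second = a.grad.clone()          # [8, 12]

# Reset the accumulated gradient before reusing it.
a.grad.zero_()

print(first, second, a.grad)
```

Because of this accumulation, training loops usually clear the gradients before each backward pass.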


``torch.autograd`` tracks operations on all tensors which have their ``requires_grad`` flag set to ``True``. For tensors that don’t require gradients, setting this attribute to ``False`` excludes them from the gradient computation DAG.

The output tensor of an operation will require gradients even if only a single input tensor has ``requires_grad=True``.


In [8]:
x = torch.rand((5, 5), requires_grad=True)
y = torch.rand((5, 5), requires_grad=True)
z = torch.rand((5, 5), requires_grad=True)

with torch.no_grad():
    a = x + y
print("Does `a` require gradients? : {}".format(a.requires_grad))
b = x + z
print("Does `b` require gradients?: {}\n".format(b.requires_grad))

x.requires_grad = False
y.requires_grad = False

a = x + y
print("Does `a` require gradients? : {}".format(a.requires_grad))
b = x + z
print("Does `b` require gradients?: {}\n".format(b.requires_grad))

x = torch.rand(5, 5)
y = torch.rand(5, 5)
z = torch.rand((5, 5), requires_grad=True)

a = x + y
print("Does `a` require gradients? : {}".format(a.requires_grad))
b = x + z
print("Does `b` require gradients?: {}\n".format(b.requires_grad))

Does `a` require gradients? : False
Does `b` require gradients?: True

Does `a` require gradients? : False
Does `b` require gradients?: True

Does `a` require gradients? : False
Does `b` require gradients?: True
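
Two other ways to exclude a tensor from the graph, shown as a minimal sketch with illustrative names: ``detach()`` returns a view of the same data that is not tracked, and ``requires_grad_(False)`` switches tracking off in place.

```python
import torch

x = torch.rand(5, 5, requires_grad=True)

# detach() shares the data of x but is cut off from the graph.
frozen = x.detach()
print(frozen.requires_grad)

# requires_grad_() flips the flag on a leaf tensor in place.
w = torch.rand(5, 5, requires_grad=True)
w.requires_grad_(False)
print(w.requires_grad)

# No input requires gradients, so neither does the output.
out = frozen + w
print(out.requires_grad)

# x itself is unaffected and still tracked.
print((x + w).requires_grad)
```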



**[Credits](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)**