# **Advantages of PyTorch's tensors over NumPy's ndarrays**



*   In the previous chapter, we saw that to find the optimal weight values, we change each weight by a small amount and measure its effect on the overall loss value. The loss calculated after updating one weight does not affect the loss calculated for any other weight in the same iteration, so these calculations can run in parallel on separate cores instead of sequentially. A GPU is well suited to this because it has thousands of cores, whereas a CPU typically has 64 or fewer.

*   A Torch tensor object is optimized to work with a GPU, whereas a NumPy array runs only on the CPU. To see the difference, let's run a small experiment in which we multiply matrices of the same shape using Torch tensors on a GPU, Torch tensors on a CPU, and NumPy arrays, and compare the time taken in each case.
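The independence claim in the first point can be illustrated with a minimal sketch. It assumes a hypothetical toy linear model with a squared-error loss (the names `X`, `y`, `w`, `eps`, and `loss_fn` are illustrative and not taken from the previous chapter). It shows that nudging each weight one at a time gives the same loss changes as evaluating all the nudged weight vectors in a single batched operation, which is the kind of work a GPU can spread across its cores.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy problem: a linear model with a squared-error loss.
# float64 keeps the finite-difference arithmetic numerically stable.
X = torch.rand(8, 4, dtype=torch.float64)   # 8 samples, 4 features
y = torch.rand(8, dtype=torch.float64)      # 8 targets
w = torch.rand(4, dtype=torch.float64)      # 4 weights
eps = 1e-3                                  # small change applied to one weight

def loss_fn(weights):
    # weights has shape (..., 4); returns one loss value per weight vector.
    preds = weights @ X.T                   # shape (..., 8)
    return ((preds - y) ** 2).mean(dim=-1)

base_loss = loss_fn(w)

# Sequential: change one weight at a time and record the change in loss.
sequential = torch.empty(4, dtype=torch.float64)
for i in range(4):
    w_i = w.clone()
    w_i[i] += eps
    sequential[i] = (loss_fn(w_i) - base_loss) / eps

# Batched: row i of `perturbed` is w with only weight i changed, so all
# four loss evaluations happen in one matrix multiplication.
perturbed = w + eps * torch.eye(4, dtype=torch.float64)   # shape (4, 4)
batched = (loss_fn(perturbed) - base_loss) / eps

print(torch.allclose(sequential, batched))
```

Because no row of `perturbed` depends on any other row, the four evaluations could be assigned to different cores without changing the result.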



# Generate two Torch tensor objects:

In [1]:
import torch
x = torch.rand(1,6400)
y = torch.rand(6400,5000)

# Define the device on which we will store the tensor objects created in step 1:

In [2]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Register the tensor objects created in step 1 with the device. Registering a tensor object means storing its data on that device:

In [3]:
x,y = x.to(device),y.to(device)
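As a quick check of where a tensor lives, the following minimal sketch inspects the `.device` attribute before and after the move. The name `x_dev` is illustrative; the sketch falls back to the CPU when no CUDA device is available.

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

x = torch.rand(1, 6400)     # tensors are created on the CPU by default
x_dev = x.to(device)        # returns a tensor stored on `device`

print(x.device)             # the original tensor stays on the CPU
print(x_dev.device)         # 'cuda:0' on a GPU machine, otherwise 'cpu'
```

Note that `.to(device)` does not move the tensor in place, which is why the notebook reassigns `x` and `y` to the returned values.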

# Perform matrix multiplication of the Torch objects on the device and time it, so that we can compare the speed with the scenarios where the multiplication runs on the CPU and on NumPy arrays:

In [4]:
%timeit z=(x@y)

The slowest run took 36.71 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 5: 514 µs per loop
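One caveat when timing GPU code: CUDA operations are launched asynchronously, so a timer can stop before the GPU has finished its work. The sketch below is one possible way to account for this, using a hypothetical helper `time_matmul` (not part of PyTorch) that calls `torch.cuda.synchronize()` before reading the clock. It is an illustration under those assumptions, not a replacement for the measurement above.

```python
import time
import torch

def time_matmul(a, b, repeats=100):
    """Return the average seconds per `a @ b`, waiting for GPU work to finish."""
    on_gpu = a.is_cuda
    a @ b                                # warm-up run, excluded from the timing
    if on_gpu:
        torch.cuda.synchronize()         # wait for the warm-up to complete
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    if on_gpu:
        torch.cuda.synchronize()         # wait for all queued kernels to finish
    return (time.perf_counter() - start) / repeats

device = 'cuda' if torch.cuda.is_available() else 'cpu'
x = torch.rand(1, 6400, device=device)
y = torch.rand(6400, 5000, device=device)

seconds = time_matmul(x, y, repeats=20)
print(f"{device}: {seconds * 1e3:.3f} ms per matrix multiplication")
```

The warm-up call keeps one-time setup costs out of the measurement, which may be part of why the slowest `%timeit` run above took much longer than the fastest.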


# Perform matrix multiplication of the same tensors on the CPU:

In [10]:
x,y = x.cpu(),y.cpu()
%timeit z=(x@y)


100 loops, best of 5: 8.87 ms per loop


# **Perform the same matrix multiplication, this time on NumPy arrays:**

In [11]:
import numpy as np
x = np.random.random((1,6400))
y = np.random.random((6400,5000))
%timeit z=np.matmul(x,y)

100 loops, best of 5: 19.2 ms per loop


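In this run, the Torch tensors are faster than the NumPy arrays: 514 µs per loop on the GPU and 8.87 ms per loop on the CPU, compared with 19.2 ms per loop for NumPy.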
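One detail to keep in mind when reading these numbers: `torch.rand` creates `float32` tensors, while `np.random.random` creates `float64` arrays, so the two CPU measurements do not use the same precision. The sketch below shows one way to compare like with like, by casting the NumPy data to `float32` and sharing it with Torch through `torch.from_numpy`. The variable names and the repeat count `n` are illustrative, and timings will vary by machine.

```python
import timeit

import numpy as np
import torch

# Same data and same precision for both libraries.
x_np = np.random.random((1, 6400)).astype(np.float32)
y_np = np.random.random((6400, 5000)).astype(np.float32)

# torch.from_numpy wraps the existing array memory as a CPU tensor.
x_t = torch.from_numpy(x_np)
y_t = torch.from_numpy(y_np)

# Both libraries should agree on the result up to floating-point rounding.
z_np = np.matmul(x_np, y_np)
z_t = x_t @ y_t
assert np.allclose(z_np, z_t.numpy(), rtol=1e-3)

n = 20  # number of timed repetitions
np_time = timeit.timeit(lambda: np.matmul(x_np, y_np), number=n) / n
t_time = timeit.timeit(lambda: x_t @ y_t, number=n) / n

print(f"NumPy (float32, CPU): {np_time * 1e3:.3f} ms per loop")
print(f"Torch (float32, CPU): {t_time * 1e3:.3f} ms per loop")
```

The GPU measurement is unaffected by this detail, since NumPy arrays cannot be placed on a GPU in the first place.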