# What is PyTorch?

PyTorch is a Python-based scientific computing package aimed at two audiences:

- A replacement for NumPy to use the power of GPUs
- A deep learning research platform that provides maximum flexibility and speed

---

## A replacement for NumPy to use the power of GPUs

In [1]:
import torch as t

# Tensors
a = t.tensor([1,2,3])
# Can specify type during construction
a = t.tensor([1,2,3], dtype=t.half)


In [11]:
# Can cast to different types once constructed
a

tensor([ 1.,  2.,  3.], dtype=torch.float16)

In [12]:
a.double()

tensor([ 1.,  2.,  3.], dtype=torch.float64)

In [14]:
a.float()

tensor([ 1.,  2.,  3.])

In [15]:
a.short()

tensor([ 1,  2,  3], dtype=torch.int16)

In [16]:
a.long()

tensor([ 1,  2,  3])

| Data type                | dtype                         | CPU tensor         | GPU tensor              |
|:------------------------:|:-----------------------------:|:------------------:|:-----------------------:|
| 32-bit floating point    | torch.float32 or torch.float  | torch.FloatTensor  | torch.cuda.FloatTensor  |
| 64-bit floating point    | torch.float64 or torch.double | torch.DoubleTensor | torch.cuda.DoubleTensor |
| 16-bit floating point    | torch.float16 or torch.half   | torch.HalfTensor   | torch.cuda.HalfTensor   |
| 8-bit integer (unsigned) | torch.uint8                   | torch.ByteTensor   | torch.cuda.ByteTensor   |
| 8-bit integer (signed)   | torch.int8                    | torch.CharTensor   | torch.cuda.CharTensor   |
| 16-bit integer (signed)  | torch.int16 or torch.short    | torch.ShortTensor  | torch.cuda.ShortTensor  |
| 32-bit integer (signed)  | torch.int32 or torch.int      | torch.IntTensor    | torch.cuda.IntTensor    |
| 64-bit integer (signed)  | torch.int64 or torch.long     | torch.LongTensor   | torch.cuda.LongTensor   |
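
The casting methods and the table can be tied together with a small sketch. It assumes PyTorch 0.4 or newer and a CPU-only machine; the variable names are illustrative only. `Tensor.to(dtype)` is the general form of `.double()`, `.short()` and friends, and `Tensor.type()` called with no arguments reports the legacy type name from the "CPU tensor" column.

```python
import torch as t

a = t.tensor([1, 2, 3], dtype=t.half)

# Tensor.to(dtype) is the general form of .double(), .short(), etc.
a_short = a.to(t.int16)
a_double = a.to(t.float64)

# t.short and t.int16 are two names for the same dtype.
same_alias = (t.short == t.int16)

# With no arguments, type() returns the legacy tensor type name as a string.
legacy_name = a_short.type()

print(a_short.dtype, a_double.dtype, same_alias, legacy_name)
```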


### Converting between Tensors and NumPy Arrays
Tensor -> NumPy

In [17]:
import numpy as np

x = t.Tensor([1,2,3])
x

tensor([ 1.,  2.,  3.])

In [18]:
y = x.numpy()
type(y)

numpy.ndarray

Note that the tensor and the array share the same underlying memory, so an in-place change to one shows up in the other:

In [19]:
x

tensor([ 1.,  2.,  3.])

In [20]:
y

array([ 1.,  2.,  3.], dtype=float32)

In [21]:
x += 10
x

tensor([ 11.,  12.,  13.])

In [22]:
y

array([ 11.,  12.,  13.], dtype=float32)

NumPy -> Tensor

In [23]:
y = np.array([5,4,3])
type(y)

numpy.ndarray

In [24]:
x = t.from_numpy(y)
type(x)

torch.Tensor

In [25]:
y

array([5, 4, 3])

In [26]:
x

tensor([ 5,  4,  3])

In [27]:
y += 10
y

array([15, 14, 13])

In [28]:
x

tensor([ 15,  14,  13])
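
When the shared memory is not wanted, an explicit copy breaks the link. The following is a minimal sketch, assuming PyTorch 0.4 or newer; the names `shared`, `copied` and `independent` are made up for illustration. `t.from_numpy` shares the array's buffer, `t.tensor` copies the data, and `clone()` before `numpy()` gives an independent array in the other direction.

```python
import numpy as np
import torch as t

y = np.array([5, 4, 3])

shared = t.from_numpy(y)   # shares memory with y
copied = t.tensor(y)       # copies the data into new storage

y += 10                    # in-place update of the NumPy array

print(shared)              # follows y
print(copied)              # still holds the original values

# Going the other way, clone() before numpy() gives an independent array.
x = t.tensor([1., 2., 3.])
independent = x.clone().numpy()
x += 10
print(independent)         # unaffected by the update to x
```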

### Speed and Efficiency
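In this CPU benchmark, elementwise multiplication runs faster in PyTorch than in NumPy. Note that the comparison is not like-for-like: `t.rand` produces 32-bit floats by default, while `np.random.rand` produces 64-bit floats.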

In [29]:
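import time

def timer(f, trials=5):
    """Return the mean wall-clock time in seconds of calling f, over `trials` runs."""
    total = 0.0
    for _ in range(trials):
        start = time.perf_counter()  # monotonic, higher resolution than time.time()
        f()
        total += time.perf_counter() - start
    return total / trials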

In [30]:
np_arr = np.random.rand(10000,10000)
t_arr = t.rand((10000,10000))

In [31]:
print('type(t_arr) = ', type(t_arr))
print('t_arr.shape = ', t_arr.shape)

type(t_arr) =  <class 'torch.Tensor'>
t_arr.shape =  torch.Size([10000, 10000])


In [32]:
print('type(np_arr) = ', type(np_arr))
print('np_arr.shape = ', np_arr.shape)

type(np_arr) =  <class 'numpy.ndarray'>
np_arr.shape =  (10000, 10000)


In [33]:
timer(lambda: t_arr*t_arr)

0.049340581893920904

In [34]:
timer(lambda: np_arr*np_arr)

0.27298936843872074
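
For a fairer comparison, both libraries can be given the same data in the same dtype. The sketch below is one way to do that, not the benchmark used above: it assumes a smaller array (`n = 1000`) so it runs quickly, uses the standard-library `timeit` module, and the names `np_arr32` and `t_arr32` are illustrative. Timings depend on the machine, so no results are shown.

```python
import timeit

import numpy as np
import torch as t

n = 1000  # assumption: smaller than the 10000 used above, to keep the run short

# Same values, same shape, and the same 32-bit dtype on both sides.
np_arr32 = np.random.rand(n, n).astype(np.float32)
t_arr32 = t.from_numpy(np_arr32.copy())

# timeit.repeat returns one total per repeat; take the best and divide by `number`.
np_time = min(timeit.repeat(lambda: np_arr32 * np_arr32, number=10, repeat=3)) / 10
t_time = min(timeit.repeat(lambda: t_arr32 * t_arr32, number=10, repeat=3)) / 10

print('numpy float32: %.6f s' % np_time)
print('torch float32: %.6f s' % t_time)
```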

Can we save more time by not allocating a new result tensor, using the `out=` argument or an in-place method?

In [48]:
def exp1():
    y = t_arr * t_arr

In [49]:
y = t.empty(t_arr.shape)
def exp2():
    t.mul(t_arr, t_arr, out=y)

In [50]:
def exp3():
    # In-place: overwrites t_arr with its own square on every call
    t_arr.mul_(t_arr)

In [51]:
timer(exp1)

0.049684095382690426

In [53]:
timer(exp2)

0.044258165359497066

In [54]:
timer(exp3)

0.03188657760620117
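
The three variants differ in where the result lands, which matters beyond speed. Here is a small sketch with a two-element tensor, assuming PyTorch 0.4 or newer (the names are illustrative). It shows that `x * x` leaves its input alone, `out=` fills a preallocated buffer, and `mul_` overwrites the input, so calling it repeatedly keeps squaring the previous result.

```python
import torch as t

x = t.tensor([0.5, 2.0])

fresh = x * x            # allocates a new result; x is unchanged

buf = t.empty(x.shape)
t.mul(x, x, out=buf)     # writes the result into the preallocated buffer

x.mul_(x)                # overwrites x itself with x * x
x.mul_(x)                # a second call squares the already-squared values

print(fresh, buf, x)
```

The trailing underscore marks in-place methods in PyTorch, so a benchmark loop around `mul_` changes its own input between trials.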

### Tensors have 100+ built-in methods
These include your favorite NumPy convenience methods and a growing linear algebra library.
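
A few of these methods, as a minimal sketch on a 2x2 matrix (assuming PyTorch 0.4 or newer; the selection is illustrative, not a survey of the API):

```python
import torch as t

m = t.tensor([[1., 2.], [3., 4.]])

total = m.sum()              # like ndarray.sum()
col_means = m.mean(dim=0)    # `dim` plays the role of NumPy's `axis`
flat = m.view(-1)            # reshape without copying the data
transposed = m.t()           # 2-D transpose
product = m.mm(transposed)   # matrix product
determinant = t.det(m)       # from the linear algebra routines

print(total, col_means, flat, product, determinant)
```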

## What about the GPU?
Tensors have a method called `to()` that returns a copy of the tensor on a specific device (the original tensor is not moved). This method is new in PyTorch 0.4.0.

In [55]:
t_arr.shape

torch.Size([10000, 10000])

In [58]:
t_arr.device # Should currently be on the cpu

device(type='cpu')

In [62]:
device = t.device('cuda')
t_arr.to(device)

RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCGeneral.cpp:70

In [61]:
t.cuda.is_available()

False
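
To avoid the error above on machines without a GPU, the device can be chosen based on `t.cuda.is_available()`. This is a minimal sketch of that common pattern, assuming PyTorch 0.4 or newer; it falls back to the CPU when no CUDA device is found.

```python
import torch as t

# Pick the GPU when one is usable, otherwise stay on the CPU.
device = t.device('cuda' if t.cuda.is_available() else 'cpu')

x = t.rand(3, 3)
x_dev = x.to(device)   # returns a tensor on `device`; x itself is not moved

print(x.device, x_dev.device)
```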

## A deep learning research platform that provides maximum flexibility and speed

Central to all neural networks in PyTorch is the autograd package. Let’s first briefly visit this, and we will then go to training our first neural network.

The autograd package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
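
To make "define-by-run" concrete, here is a small sketch in which ordinary Python control flow decides the shape of the graph. It assumes PyTorch 0.4 or newer, and `repeated_product` is a hypothetical helper written for this illustration. Each pass through the outer loop builds a different graph, and `backward()` differentiates whatever was actually executed.

```python
import torch as t

def repeated_product(x, n):
    """Multiply x by itself n times; the loop length decides the graph."""
    y = x
    for _ in range(n):
        y = y * x
    return y

grads = []
for n in (1, 2, 3):
    x = t.tensor(2.0, requires_grad=True)
    y = repeated_product(x, n)    # y = x ** (n + 1)
    y.backward()
    grads.append(x.grad.item())   # d/dx x**(n+1) = (n + 1) * x**n

print(grads)
```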

In [175]:
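# Note: recent PyTorch releases only allow floating-point tensors to require
# gradients, so use t.tensor([2.]) there.
b = t.tensor([2])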

In [176]:
b.requires_grad

False

In [177]:
y=b*b

In [178]:
y.backward()

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

In [179]:
b.requires_grad = True # or b.requires_grad_()

In [180]:
y=b*b

In [181]:
y.backward()

In [182]:
b.grad

tensor([ 4])

In [140]:
c = b.detach()  # shares data with b, but is detached from the autograd graph

In [141]:
y=c*c

In [142]:
y.backward()

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

In [143]:
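# Unaffected by the failed backward() above. backward() accumulates into .grad,
# so 8 is 4 + 4 from two backward() calls on b*b in this session.
b.grad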

tensor([ 8])
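
Gradients accumulate across `backward()` calls, which is how `b.grad` can hold 8 rather than 4. The sketch below reproduces that in a fresh session, assuming a recent PyTorch release (hence the floating-point `2.0`); the names `first`, `second` and `third` are illustrative.

```python
import torch as t

b = t.tensor([2.0], requires_grad=True)

(b * b).backward()
first = b.grad.clone()    # d(b*b)/db = 2*b = 4

(b * b).backward()
second = b.grad.clone()   # accumulated: 4 + 4 = 8

b.grad.zero_()            # reset before the next backward pass
(b * b).backward()
third = b.grad.clone()    # back to 4

c = b.detach()            # same data as b, but not tracked by autograd

print(first, second, third, c.requires_grad)
```

Zeroing `.grad` between backward passes is the usual way to keep one iteration's gradients from leaking into the next.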