The PyTorch `tensor` API is very similar to `numpy`'s. If you are familiar with that library, you should feel at home with PyTorch's `tensor` functions. Here are a few areas worth reviewing.

## Creating, Shaping, and referencing Arrays

First, let's review the API for basic array manipulation in PyTorch.

Reference [1. PyTorch tensor types](https://pytorch.org/docs/stable/tensors.html)

In [1]:
# Importing the library & checking the version
import torch
print(torch.__version__)

1.7.1


In [2]:
# torch vectors have a default type:
torch.get_default_dtype()

torch.float32

In [25]:
# Let's change this to `float64`
torch.set_default_dtype(torch.float64)
torch.get_default_dtype() 

# This won't work: only floating-point types can be set as the default dtype
# torch.set_default_dtype(torch.int)

There's also this option: `torch.set_default_tensor_type(torch.DoubleTensor)`

See the documentation for details: https://pytorch.org/docs/stable/_modules/torch.html#set_default_tensor_type
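To make the effect of the default dtype concrete, here is a minimal sketch. The variable names (`previous`, `float_tensor`, `int_tensor`) are illustrative and not part of the original notebook. It suggests that the default only governs how Python floats are interpreted, while Python integers are still inferred as `int64`.

```python
import torch

previous = torch.get_default_dtype()       # remember the current setting
torch.set_default_dtype(torch.float64)

float_tensor = torch.tensor([1.0, 2.0])    # Python floats follow the default dtype
int_tensor = torch.tensor([1, 2])          # Python ints are inferred as int64 regardless

print(float_tensor.dtype)
print(int_tensor.dtype)

torch.set_default_dtype(previous)          # restore the earlier setting
```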

### Creating Torch Tensors using Python List format

In [4]:
tensor_array = torch.Tensor([[1,2,3],
                             [4,5,6]])
tensor_array # notice the trailing dots: the values are stored as floats

tensor([[1., 2., 3.],
        [4., 5., 6.]])

In [5]:
print(torch.numel(tensor_array)) # total number of elements in the tensor
print(torch.is_tensor(tensor_array))

6
True


In [6]:
# create a tensor from a shape only; its memory is uninitialized, so the values are arbitrary
tensor_uninitialized = torch.Tensor(3, 4)
tensor_uninitialized

tensor([[ 0.0000e+00,  0.0000e+00, 4.6444e-310,  0.0000e+00],
        [4.6444e-310, 4.6444e-310, 4.6444e-310, 2.9644e-323],
        [6.9481e-310, 6.9481e-310,  8.5267e+21, 6.9481e-310]])
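The arbitrary values above are whatever happened to be in memory. As a small sketch (names are illustrative), `torch.empty` is the explicit way to request uninitialized storage, and `torch.zeros` is the safer choice when the contents matter before they are overwritten.

```python
import torch

uninitialized = torch.empty(3, 4)   # allocates memory only; contents are arbitrary
zeros = torch.zeros(3, 4)           # allocates memory and fills it with 0

print(uninitialized.shape)
print(zeros.sum().item())
```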

### Some helper functions

In [7]:
# Random values drawn uniformly from [0, 1), e.g. to initialize the weights of model parameters
tensor_initialized = torch.rand(3, 4)
tensor_initialized

tensor([[0.9743, 0.1624, 0.4889, 0.3946],
        [0.8985, 0.7147, 0.6557, 0.1286],
        [0.4564, 0.8626, 0.6585, 0.5904]])

In [8]:
tensor_int = torch.tensor([5, 3]).type(torch.IntTensor) # on CPU --> see reference [1] for CPU & GPU tensor types
# tensor_GPUfloat = torch.tensor([5, 3]).type(torch.cuda.FloatTensor) # will result in an error if CUDA library is not initialized

In [29]:
tensor_short = torch.ShortTensor([1, 2]) # directly instantiate a tensor of a type -- ensure you're only providing an int, not assuming implicit conversion
tensor_float = torch.tensor([1.0, 2.0]).type(torch.half)
print(tensor_short)
print(tensor_float)

tensor([1, 2], dtype=torch.int16)
tensor([1., 2.], dtype=torch.float16)
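Besides `.type(...)`, a tensor can be cast with `.to(dtype)`. The following is a minimal sketch with made-up values; it illustrates that casting a float tensor to an integer type drops the fractional part rather than rounding.

```python
import torch

values = torch.tensor([1.7, -1.7, 300.0])

as_int = values.to(torch.int32)     # fractional part is dropped (truncation toward zero)
as_half = values.to(torch.float16)  # same values at lower precision

print(as_int)
print(as_half.dtype)
```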


### Filling with values

In [30]:
tensor_fill = torch.full((2,2), fill_value=10, dtype=torch.float64) # try fill_value = 10., or use dtype=torch.float64
print(tensor_fill)

#What will this show?
# tensor_fill.dtype

tensor([[10., 10.],
        [10., 10.]])


In [32]:
tensor_of_ones = torch.ones([3,4], dtype=torch.int32)
print(tensor_of_ones)

tensor_of_zeros = torch.zeros_like(tensor_of_ones) # if you already have a shape and datatype of a tensor you want to replicate
print(tensor_of_zeros)

tensor_eye = torch.eye(5)
print(tensor_eye)

non_zero = torch.nonzero(tensor_eye)
print(non_zero) 

tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]], dtype=torch.int32)
tensor([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]], dtype=torch.int32)
tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])
tensor([[0, 0],
        [1, 1],
        [2, 2],
        [3, 3],
        [4, 4]])


## Note:

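* `torch.tensor()` always creates a copy of the underlying data it is given
* `torch.Tensor()` is the tensor class constructor and always produces the default dtype; to share memory with existing data such as a NumPy array, use `torch.as_tensor()` or `torch.from_numpy()`, both shown later in this section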

In [65]:
i = torch.tensor([[0, 1, 1, 0],
                  [2, 2, 0, 2]])

In [66]:
v = torch.tensor([3, 4, 5, 10], dtype=torch.float32)

### Sparse matrix

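* Create a sparse tensor of shape (2, 5) from the index tensor `i` and the value tensor `v` defined above.
* Each column of `i` is the (row, column) coordinate of one value in `v`. Only columns 0 and 2 hold values; the remaining positions are implicitly zero.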


In [70]:
sparse_tensor = torch.sparse_coo_tensor(i, v, [2, 5]) # in coordinate format

In [71]:
sparse_tensor.data # every PyTorch tensor has an underlying .data field; notice that the structure is very different from a dense tensor

tensor(indices=tensor([[0, 1, 1, 0],
                       [2, 2, 0, 2]]),
       values=tensor([ 3.,  4.,  5., 10.]),
       size=(2, 5), nnz=4, dtype=torch.float32, layout=torch.sparse_coo)

In [72]:
print(sparse_tensor[1,0])
print(sparse_tensor[0,0])

# what will this be? 
#print(sparse_tensor[0,2])

tensor(5., dtype=torch.float32)
tensor(0., dtype=torch.float32)
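One detail worth checking in this example is that the coordinate (0, 2) appears twice in the index tensor. In the COO format, duplicate coordinates are interpreted as a sum. The sketch below (variable names are illustrative) converts the same data to a dense tensor and coalesces it to show this.

```python
import torch

indices = torch.tensor([[0, 1, 1, 0],    # row index of each entry
                        [2, 2, 0, 2]])   # column index of each entry
values = torch.tensor([3.0, 4.0, 5.0, 10.0])
sparse = torch.sparse_coo_tensor(indices, values, (2, 5))

dense = sparse.to_dense()       # the two entries at (0, 2) are summed: 3 + 10
coalesced = sparse.coalesce()   # merges duplicate coordinates and sorts the indices

print(dense)
print(coalesced.indices())
print(coalesced.values())
```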


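## Tensor operations
1. In-place operation: methods with a trailing underscore, such as `initial_tensor.fill_(n)` or `initial_tensor.add_(n)`, modify `initial_tensor` itself
2. Out-of-place operation: `initial_tensor.add(n)` returns a new tensor with `n` added to every element and does not change `initial_tensor`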

In [98]:
init_tensor = torch.rand(2,3)
print(init_tensor)

# in-place operation
init_tensor.fill_(10)
print(init_tensor)

init_tensor.add_(5)
print(init_tensor)

# out of place operation
new_tensor = init_tensor.add(5)
print(new_tensor)

new_tensor.sqrt_()
print(new_tensor)


tensor([[0.8466, 0.4982, 0.0671],
        [0.4712, 0.0029, 0.5059]])
tensor([[10., 10., 10.],
        [10., 10., 10.]])
tensor([[15., 15., 15.],
        [15., 15., 15.]])
tensor([[20., 20., 20.],
        [20., 20., 20.]])
tensor([[4.4721, 4.4721, 4.4721],
        [4.4721, 4.4721, 4.4721]])


In [99]:
x = torch.linspace(start=0, end=10, steps=11)
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

### Chunking

In [100]:
tensor_chunk = torch.chunk(x, 3, 0)
tensor_chunk

(tensor([0., 1., 2., 3.]), tensor([4., 5., 6., 7.]), tensor([ 8.,  9., 10.]))
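`torch.chunk` takes the number of chunks, so an 11-element tensor split into 3 chunks gets pieces of size 4, 4, and 3. As a hedged comparison, `torch.split` takes the size of each piece instead. The names below are illustrative.

```python
import torch

x = torch.linspace(0, 10, steps=11)

chunks = torch.chunk(x, 3, dim=0)   # 3 chunks; chunk size is ceil(11 / 3) = 4
pieces = torch.split(x, 5, dim=0)   # pieces of size 5; the last one is smaller

chunk_sizes = [c.numel() for c in chunks]
piece_sizes = [p.numel() for p in pieces]
print(chunk_sizes)
print(piece_sizes)
```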

In [101]:
tensor1 = tensor_chunk[0] # tensor_chunk is a tuple of tensors
tensor2 = tensor_chunk[1]
tensor3 = torch.tensor([3, 4, 5])
torch.cat((tensor1, tensor2, tensor3), 0) # concatenate along a given dimension

tensor([0., 1., 2., 3., 4., 5., 6., 7., 3., 4., 5.])

In [102]:
random_tensor = torch.Tensor([[10, 8, 10], [3, 2, 4], [2, 3, 5]])
random_tensor

tensor([[10.,  8., 10.],
        [ 3.,  2.,  4.],
        [ 2.,  3.,  5.]])

In [103]:
random_tensor.size()

torch.Size([3, 3])

In [107]:
resized_tensor = random_tensor.view(9) # 3x3 is viewed as a 1-D 9 element tensor - needs to be compatible with the original tensor shape
resized_tensor

tensor([10.,  8., 10.,  3.,  2.,  4.,  2.,  3.,  5.])

In [111]:
random_tensor[2, 2] = 100
print(resized_tensor) # original and resized tensors share the same underlying memory
print(random_tensor.shape)
print(resized_tensor.shape)

tensor([ 10.,   8.,  10.,   3.,   2.,   4.,   2.,   3., 100.])
torch.Size([3, 3])
torch.Size([9])
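Because `view` never copies, it only works when the requested shape is compatible with the tensor's memory layout. The sketch below (illustrative names) shows one case where it fails, a transposed tensor, and how `reshape` falls back to making a copy.

```python
import torch

base = torch.arange(9.0).view(3, 3)
transposed = base.t()          # same memory as base, but different strides

try:
    transposed.view(9)         # not expressible as a view of non-contiguous memory
    view_failed = False
except RuntimeError:
    view_failed = True

flat = transposed.reshape(9)   # reshape copies when a view is not possible
print(view_failed)
print(flat)
```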


### Unsqueezing and transposing dimensions

In [115]:
tensor_unsqueeze = torch.unsqueeze(random_tensor, 2)
tensor_unsqueeze # 3 x 3 x 1 - with the extra dimension we 'unsqueezed'

tensor([[[ 10.],
         [  8.],
         [ 10.]],

        [[  3.],
         [  2.],
         [  4.]],

        [[  2.],
         [  3.],
         [100.]]])

In [116]:
tensor_transpose = torch.transpose(init_tensor, 0, 1) 
tensor_transpose

tensor([[15., 15.],
        [15., 15.],
        [15., 15.]])
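The inverse of `unsqueeze` is `squeeze`, which removes a dimension of size 1. A minimal sketch, using a placeholder tensor `t` rather than the notebook's variables:

```python
import torch

t = torch.zeros(3, 3)

expanded = torch.unsqueeze(t, 2)        # shape (3, 3, 1)
restored = torch.squeeze(expanded, 2)   # back to shape (3, 3)
batched = t.unsqueeze(0)                # shape (1, 3, 3), e.g. a batch of one

print(expanded.shape, restored.shape, batched.shape)
```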

### Sorting tensors

In [122]:
random_tensor

tensor([[ 10.,   8.,  10.],
        [  3.,   2.,   4.],
        [  2.,   3., 100.]])

In [123]:
sorted_tensor, sorted_indices = torch.sort(random_tensor)

In [124]:
sorted_tensor

tensor([[  8.,  10.,  10.],
        [  2.,   3.,   4.],
        [  2.,   3., 100.]])

In [125]:
sorted_indices # for each sorted value, its index within the original row

tensor([[1, 0, 2],
        [1, 0, 2],
        [0, 1, 2]])
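One way to see what the returned indices mean is to use them to rebuild the sorted tensor from the original with `torch.gather`. This sketch re-creates the example values locally; the names `rebuilt` and `desc_vals` are illustrative.

```python
import torch

t = torch.tensor([[10.0, 8.0, 10.0],
                  [3.0, 2.0, 4.0],
                  [2.0, 3.0, 100.0]])

sorted_vals, sorted_idx = torch.sort(t, dim=1)          # ascending along each row
rebuilt = torch.gather(t, 1, sorted_idx)                # pick elements by the sort indices
desc_vals, _ = torch.sort(t, dim=1, descending=True)    # descending order instead

print(torch.equal(rebuilt, sorted_vals))
print(desc_vals)
```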

### Floating operations

In [126]:
tensor_float = torch.FloatTensor([-6.2, 3.2, 2.3, -1.0])
tensor_float

tensor([-6.2000,  3.2000,  2.3000, -1.0000], dtype=torch.float32)

In [127]:
tensor_abs = torch.abs(tensor_float)
tensor_abs

tensor([6.2000, 3.2000, 2.3000, 1.0000], dtype=torch.float32)

### Matrix operations

0. addition (torch.add)
1. division (torch.div)
2. multiplication (torch.mul)
3. Matrix-Vector multiplication (torch.mv)
4. Matrix-matrix multiplication (torch.mm)

In [128]:
rand1 = torch.abs(torch.rand(2, 3))
rand2 = torch.abs(torch.rand(2, 3))

In [131]:
# addition
add1 = rand1 + rand2
add1

tensor([[1.1978, 0.6007, 0.6989],
        [1.6855, 0.5637, 1.0774]])

In [132]:
add2 = torch.add(rand1, rand2)
add2

tensor([[1.1978, 0.6007, 0.6989],
        [1.6855, 0.5637, 1.0774]])

1. Tensor Division

In [136]:
#division
tensor = torch.Tensor([[-1, -2, -3],
                       [ 1,  2,  3]])

In [138]:
tensor_div = torch.div(tensor, tensor + 0.3)
tensor_div

tensor([[1.4286, 1.1765, 1.1111],
        [0.7692, 0.8696, 0.9091]])

2. Element-wise multiplication

In [139]:
# multiplication
tensor_mul = torch.mul(tensor, tensor) 
tensor_mul

tensor([[1., 4., 9.],
        [1., 4., 9.]])

In [141]:
tensor_clamp = torch.clamp(tensor, min=-0.2, max=2)
tensor_clamp

tensor([[-0.2000, -0.2000, -0.2000],
        [ 1.0000,  2.0000,  2.0000]])

2. Dot product

In [142]:
t1 = torch.Tensor([1, 2])
t2 = torch.Tensor([10, 20])

In [143]:
dot_product = torch.dot(t1, t2)
dot_product

tensor(50.)

3. Matrix-vector multiplication

In [154]:
matrix = torch.tensor([[1, 2, 3],
                       [4, 5, 6]], dtype=torch.float64)
vector = torch.Tensor([0, 1, 2])

In [155]:
matrix.dtype

torch.float64

In [156]:
vector.dtype

torch.float64

In [157]:
matrix_vector = torch.mv(matrix, vector)
matrix_vector

tensor([ 8., 17.])

4. Matrix-Matrix multiplication

In [158]:
another_matrix = torch.Tensor([[10, 30],
                               [20, 0],
                               [0, 50]])

In [159]:
matrix_mul = torch.mm(matrix, another_matrix)
matrix_mul

tensor([[ 50., 180.],
        [140., 420.]])
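As a side note, the `@` operator (equivalent to `torch.matmul`) covers both the matrix-vector and the matrix-matrix case, choosing the behavior from the operand shapes. The sketch below re-creates the example values with explicit floats so it does not depend on the default dtype.

```python
import torch

matrix = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])
vector = torch.tensor([0.0, 1.0, 2.0])
other = torch.tensor([[10.0, 30.0],
                      [20.0, 0.0],
                      [0.0, 50.0]])

mv_result = matrix @ vector   # same result as torch.mv(matrix, vector)
mm_result = matrix @ other    # same result as torch.mm(matrix, other)

print(mv_result)
print(mm_result)
```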

5. Other operations: argmax and argmin

In [160]:
torch.argmax(matrix_mul, dim=1)

tensor([1, 1])

In [161]:
torch.argmin(matrix_mul, dim=1)

tensor([0, 0])

### Converting to Numpy and back

In [162]:
import numpy as np
import torch

In [163]:
tensor = torch.rand(4, 3)
tensor

tensor([[0.9392, 0.3672, 0.2916],
        [0.5239, 0.0381, 0.6503],
        [0.2986, 0.8337, 0.2599],
        [0.0639, 0.4277, 0.6613]])

In [165]:
# confirm it's a torch tensor
type(tensor) 

torch.Tensor

In [166]:
numpy_from_tensor = tensor.numpy()
numpy_from_tensor

array([[0.93918337, 0.36717894, 0.29164179],
       [0.52393005, 0.03809374, 0.65026695],
       [0.29860565, 0.83371833, 0.25987739],
       [0.06392023, 0.42767944, 0.66132197]])

In [170]:
print(tensor.dtype)
print(numpy_from_tensor.dtype)
print(type(numpy_from_tensor))

torch.float64
float64
<class 'numpy.ndarray'>


The tensor and the NumPy array share the same underlying memory, so a change to one shows up in the other:

In [172]:
numpy_from_tensor[0, 0] = 100.0
numpy_from_tensor

array([[1.00000000e+02, 3.67178943e-01, 2.91641786e-01],
       [5.23930051e-01, 3.80937440e-02, 6.50266950e-01],
       [2.98605654e-01, 8.33718329e-01, 2.59877393e-01],
       [6.39202272e-02, 4.27679439e-01, 6.61321973e-01]])

In [173]:
tensor

tensor([[1.0000e+02, 3.6718e-01, 2.9164e-01],
        [5.2393e-01, 3.8094e-02, 6.5027e-01],
        [2.9861e-01, 8.3372e-01, 2.5988e-01],
        [6.3920e-02, 4.2768e-01, 6.6132e-01]])

In [174]:
numpy_arr = np.array([[1.0, 2.0, 3.0],
                      [10, 20., 30.],
                      [100, 200, 300.]])

numpy_arr

array([[  1.,   2.,   3.],
       [ 10.,  20.,  30.],
       [100., 200., 300.]])

In [177]:
tensor_from_numpy = torch.from_numpy(numpy_arr)
tensor_from_numpy

tensor([[  1.,   2.,   3.],
        [ 10.,  20.,  30.],
        [100., 200., 300.]])

In [178]:
type(tensor_from_numpy)

torch.Tensor

In [179]:
torch.is_tensor(tensor_from_numpy)

True

In [180]:
tensor_from_numpy[0] = 1
tensor_from_numpy

tensor([[  1.,   1.,   1.],
        [ 10.,  20.,  30.],
        [100., 200., 300.]])

In [181]:
numpy_arr

array([[  1.,   1.,   1.],
       [ 10.,  20.,  30.],
       [100., 200., 300.]])

Modifications to the NumPy array carry over to the torch tensor, and vice versa. `torch.as_tensor` shares memory in the same way:

In [182]:
np_array_one= np.array([4, 8])
np_array_one

array([4, 8])

In [183]:
tensor_from_array_one = torch.as_tensor(np_array_one)
tensor_from_array_one

tensor([4, 8])

In [184]:
np_array_one[1] = 5
np_array_one

array([4, 5])

In [185]:
tensor_from_array_one

tensor([4, 5])

To create a separate copy of the numpy array, use `torch.tensor` instead:

In [195]:
np_array_two = np.array([2, 2])
np_array_two

array([2, 2])

In [196]:
tensor_from_array_two = torch.tensor(np_array_two)
tensor_from_array_two

tensor([2, 2])

In [197]:
np_array_two[1] = 4
np_array_two

array([2, 4])

In [198]:
tensor_from_array_two

tensor([2, 2])
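Instead of modifying one object and inspecting the other, memory sharing can also be checked directly. The following sketch assumes NumPy's `np.shares_memory` as the check; the variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])

shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # always copies the data

shares_with_shared = np.shares_memory(arr, shared.numpy())
shares_with_copied = np.shares_memory(arr, copied.numpy())
print(shares_with_shared, shares_with_copied)
```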

## PyTorch, CUDA, and GPUs

* PyTorch tensors are designed to make good use of GPUs for massively parallel computation.
* GPUs were initially used for video and graphics processing and are now widely used for big data and ML workloads.
* Training on a massively parallel architecture can be sped up considerably, by roughly 10-15x.
* NVIDIA created the CUDA (Compute Unified Device Architecture) application programming interface for general-purpose use of GPUs beyond graphics.
* Any framework that supports NVIDIA GPUs has to integrate with the CUDA platform.

PyTorch support for CUDA
* Developers can write CUDA-compliant code
* That code is understood by CUDA-aware frameworks
* Tensors need to be instantiated on the GPU to get the automatic speed-up
* Use `torch.cuda` for CUDA operations
* There are special tensor types for CUDA, e.g. `torch.cuda.FloatTensor`
* The `torch.cuda.device` context manager sets the selected GPU, and CUDA tensors are created on it by default
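A minimal sketch of placing tensors on a device follows. It falls back to the CPU when no GPU is available, so it runs on any machine; the names `device`, `on_device`, and `moved` are illustrative.

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

on_device = torch.ones(2, 3, device=device)   # created directly on the device
moved = torch.zeros(2, 3).to(device)          # created on the CPU, then copied over

result = on_device + moved                    # both operands live on the same device
print(result.device)
```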

All operations are on the same device
* Cross-GPU operations are not allowed by default
* Exceptions are copy operations: `copy_()`, `to()`, `cuda()`
* GPU operations are asynchronous by default
* They are enqueued to a particular device and executed in FIFO order, invisibly to the user
* Copy operations move data from one device to another (CPU->GPU, or GPU->GPU)
* For debugging, set the environment variable `CUDA_LAUNCH_BLOCKING=1` to force synchronous execution; this gives accurate error handling and stack traces, which are usually not available in asynchronous operation
* Functions such as `to()` and `copy_()` accept a `non_blocking` argument
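The points about copies and asynchronous execution can be sketched as follows. This is an illustration under the assumption that pinned host memory is used for the non-blocking copy; on a CPU-only machine the GPU-specific steps are skipped and the code still runs.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

host = torch.rand(256, 256)
if device.type == "cuda":
    host = host.pin_memory()                     # page-locked memory allows asynchronous copies

on_device = host.to(device, non_blocking=True)   # the copy may return before it completes
product = on_device @ on_device                  # enqueued after the copy on the same device

if device.type == "cuda":
    torch.cuda.synchronize()                     # wait for all queued GPU work to finish

total = product.sum().item()                     # .item() copies the scalar back to the host
print(total)
```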

Device-agnostic code
* explicitly handles both the GPU and the CPU case
* a common pattern is to use `argparse` to read user arguments
* the code can then be invoked with a runtime flag to enable or disable CUDA
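The argparse pattern described above might look like the sketch below. The flag name `--disable-cuda` is a choice made for this illustration, and `parse_args([])` is given an empty list only so the snippet runs outside a command line.

```python
import argparse
import torch

parser = argparse.ArgumentParser(description="device-agnostic example")
parser.add_argument("--disable-cuda", action="store_true",
                    help="force the CPU even if a GPU is available")
args = parser.parse_args([])   # in a real script, call parse_args() with no arguments

if not args.disable_cuda and torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

x = torch.rand(2, 2, device=device)   # the rest of the code only refers to `device`
print(x.device)
```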

Tensors, together with the operations applied to them, make up our directed acyclic computation graph (DAG).
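As a small illustration of that graph, the sketch below builds y = x^2 + 3x from a single leaf tensor and lets autograd walk the graph backwards. The values are made up for the example.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)   # leaf node of the graph
y = x ** 2 + 3 * x                          # each operation adds a node to the graph

y.backward()                                # traverse the graph from y back to x

print(y.grad_fn)   # the operation that produced y
print(x.grad)      # dy/dx = 2 * x + 3, evaluated at x = 2
```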
