___

# Machine Learning in Geosciences
Department of Applied Geoinformatics and Cartography, Charles University

Lukas Brodsky lukas.brodsky@natur.cuni.cz


### PyTorch installation

`pip install torch`
`pip install torchvision`

### PyTorch tensor

Tensors are the building blocks for representing data in PyTorch; the tensor is the library's fundamental data structure. The term `tensor` comes bundled with the notion of spaces. In the context of deep learning, a tensor is the generalization of vectors and matrices to an arbitrary number of dimensions.

The `torch` package contains not only the data structures for **multi-dimensional arrays** but also the mathematical operations defined over these tensors. It additionally provides utilities for efficient serialization of tensors and arbitrary types, along with other helpers.
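As a small illustration of "an arbitrary number of dimensions", the sketch below builds tensors with 0 to 3 dimensions and applies one built-in operation. The variable names (`scalar`, `vector`, `matrix`, `cube`) are chosen here for illustration only.

```python
import torch

scalar = torch.tensor(3.14)                       # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])            # 1 dimension
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # 2 dimensions
cube = torch.zeros(2, 3, 4)                       # 3 dimensions

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # ndim is the number of dimensions, shape is the size along each one
    print(name, t.ndim, tuple(t.shape))

# Mathematical operations are defined directly on tensors,
# here a matrix-vector product with the first two vector elements
product = matrix @ vector[:2]
print(product)
```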

### PyTorch tensor vs. NumPy array

What is the difference between a NumPy array and a PyTorch tensor?

1. NumPy arrays are the core data structure of the NumPy package, designed for fast numerical operations on the CPU. PyTorch tensors are similar to NumPy arrays, but can also be processed on a CUDA-capable NVIDIA GPU.

2. NumPy arrays are mainly used in classical machine learning algorithms. PyTorch tensors are mainly used in deep learning, which requires heavy matrix computation.

3. Unlike `np.array`, `torch.tensor` accepts two additional arguments: `device` (whether the tensor lives on the CPU or the GPU, and therefore where the computation happens) and `requires_grad` (whether operations on the tensor are recorded so that derivatives can be computed).
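The two extra arguments from point 3 can be tried out with a short sketch. It assumes a CPU-only machine is acceptable and falls back to the CPU when no GPU is visible; the function `y = sum(x^2)` is an arbitrary example chosen here.

```python
import torch

# Pick the GPU if one is visible, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# requires_grad=True asks PyTorch to record operations on x
x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

y = (x ** 2).sum()   # y = x1^2 + x2^2 + x3^2
y.backward()         # computes dy/dx and stores it in x.grad

print(x.device, x.requires_grad)
print(x.grad)        # analytically, dy/dx = 2 * x
```

Note that `requires_grad=True` is only allowed for floating-point (and complex) tensors, which is why the values are written as `1.0, 2.0, 3.0`.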

### PyTorch API

The PyTorch API documentation (https://pytorch.org/docs/stable/index.html) gives a few directions on where to find things.

**Topics covered**: 

1. What is the difference between a NumPy array and a PyTorch tensor?

2. How to create NumPy arrays and PyTorch tensors, and how to perform the same operation on both data types.

In [None]:
import torch 
import numpy as np

### Constructors

In [None]:
# NumPy
array1 = np.array([1,2,3,4])
# from list
list1 = [[1,2,3,4],[5,6,7,8]]
array2 = np.array(list1)
# casting data type
array3 = np.array([1,2,3,4], dtype=np.float64)
type(array1), array1.dtype

In [None]:
# PyTorch Tensor 
tensor1 = torch.tensor([1,2,3,4])
# from a NumPy array (torch.tensor copies the data)
numpy_arr1 = np.array([11,21,322])
tensor2 = torch.tensor(numpy_arr1)
type(tensor1), tensor1.dtype

In [None]:
# from tensor to array 
numpy_arr = tensor1.numpy()
numpy_arr

In [None]:
# see what happens if you overwrite an element in the array:
# tensor1.numpy() shares memory with tensor1, so both change
numpy_arr[1] = 999
tensor1, numpy_arr

In [None]:
# from array to tensor 
array1_tensor = torch.from_numpy(array1)
array1_tensor
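The conversions above differ in whether they copy the data or share it. A minimal sketch of that difference on CPU tensors follows; the names `source`, `copied`, `shared` and `back` are illustrative.

```python
import numpy as np
import torch

source = np.array([1, 2, 3, 4])

copied = torch.tensor(source)      # copies the data
shared = torch.from_numpy(source)  # shares memory with `source`

source[0] = 100                    # modify the NumPy array in place

print(copied)   # unaffected by the change, still starts with 1
print(shared)   # reflects the change, now starts with 100

# Going back with .numpy() also shares memory (for CPU tensors)
back = shared.numpy()
back[1] = -5
print(shared)   # the second element is now -5 as well
```

When an independent copy is wanted, `torch.tensor(...)` on the way in or `.clone()` on an existing tensor avoids this coupling.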

### Random function

In [None]:
np.random.rand(2, 3) # numpy way

In [None]:
torch.rand(2, 3) # pytorch way

### Seed function

In [None]:
np.random.seed(42)
np.random.rand(2)

In [None]:
torch.manual_seed(42) # for both CPU and CUDA
torch.rand(2)
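The point of seeding is reproducibility: resetting the seed to the same value replays the same sequence of random numbers. The sketch below checks this for both libraries without relying on any particular values.

```python
import numpy as np
import torch

# NumPy: same seed, same numbers
np.random.seed(42)
a1 = np.random.rand(2)
np.random.seed(42)
a2 = np.random.rand(2)

# PyTorch: same seed, same numbers
torch.manual_seed(42)
t1 = torch.rand(2)
torch.manual_seed(42)
t2 = torch.rand(2)

print(np.array_equal(a1, a2), torch.equal(t1, t2))
```

The two libraries use separate random number generators, so seeding one has no effect on the other.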

### Reshaping the array

In [None]:
# numpy approach 
np_arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
np_arr.shape

In [None]:
new_arr = np_arr.reshape(2, 2, -1) 
new_arr.shape

In [None]:
new_arr

In [None]:
# reshape tensor 
tensor1 = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8])
tensor1.reshape(2, 2, -1)

Note that PyTorch has two methods that change the shape of a tensor. `permute` reorders the dimensions: it returns a view with the axes swapped, so the elements are read in a different order while the underlying storage stays untouched. `reshape` keeps the elements in their row-major order and only regroups them into the requested shape.

In [None]:
x = torch.randn(2, 5)
x

In [None]:
x.permute(0, 1)

In [None]:
x.permute(1, 0)

In [None]:
# reshape keeps the row-major order of the elements
# and only regroups them into 5 rows of 2
x.reshape(5, 2)

In [None]:
# the underlying 1-D storage of x (recent PyTorch versions prefer x.untyped_storage())
x.storage()
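With random values it is hard to see what `permute` and `reshape` each do, so the sketch below repeats the comparison on a tensor filled with `0..9`. Using `torch.arange` and inspecting `stride()` are choices made here for illustration.

```python
import torch

x = torch.arange(10).reshape(2, 5)   # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]

p = x.permute(1, 0)   # swaps the two axes (a transpose), shape (5, 2)
r = x.reshape(5, 2)   # same row-major order, regrouped into shape (5, 2)

print(p)   # rows are [0, 5], [1, 6], [2, 7], [3, 8], [4, 9]
print(r)   # rows are [0, 1], [2, 3], [4, 5], [6, 7], [8, 9]

# Both results have the same shape but walk the same memory differently
print(p.stride(), r.stride())                  # (1, 5) and (2, 1)
print(p.is_contiguous(), r.is_contiguous())    # False and True
```

Both `p` and `r` here are views on the memory of `x`; only the strides, that is the step sizes used to walk through that memory, differ.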

### Slicing the arrays

In [None]:
x = torch.rand(3, 4)
x[:, 2] = 23 # Set every element of the third column to 23
print(x)

In [None]:
y = np.random.rand(3, 4)
y[:, 2] = 223 # Set every element of the third column to 223
print(y)

In [None]:
x[0] = 999 # Set every element of the first row to 999
print(x)

In [None]:
y[0] = 9999 # Set every element of the first row to 9999
print(y)