# Intro to NumPy arrays and PyTorch tensors

We're going to meet the **NumPy** library for maths, numerics, ndarrays, linear algebra, and more. And we'll look at the **PyTorch** library to see what's so special about it.

**This is not an exhaustive look at the features of either library. For more on NumPy, check out [Intro_to_NumPy_for_ML.ipynb](Intro_to_NumPy_for_ML.ipynb).**

## NumPy: `ndarray` and `ufunc`

Everyone imports NumPy (pronounced "num-pie" or "num-pee", whatever you prefer) like this:

In [2]:
import numpy as np

Let's load a bit of data. NumPy provides a convenient way to load data from a URL. 

In [3]:
ds = np.DataSource('../data/')

f = ds.open('https://geocomp.s3.amazonaws.com/data/GR-NPHI-RHOB-DT.npy', mode='rb')

data = np.load(f)  # f could also just be a local filename.

dt = data[:, -1]

In [7]:
# Basics...
dt.ndim, dt.dtype, dt.shape, dt[30], dt[30:40]

(1,
 dtype('float64'),
 (71,),
 78.26930000272084,
 array([78.2693    , 78.8674    , 80.2621    , 78.1063    , 78.9166    ,
        78.1695    , 78.44      , 79.0334    , 76.98660001, 80.0743    ]))

In [6]:
# Very cool...
dt > 80, dt[dt > 80]

(array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True,  True,  True,  True, False, False,  True,  True, False,
        False,  True, False, False, False, False, False,  True, False,
        False, False, False, False, False,  True, False, False, False,
        False, False, False,  True,  True,  True, False,  True,  True,
         True, False, False, False, False, False, False, False, False,
        False, False, False, False, False, False, False,  True,  True,
        False, False,  True,  True,  True,  True,  True, False]),
 array([82.53010001, 81.7004    , 82.9246    , 81.796     , 81.1015    ,
        82.4635    , 84.49229999, 82.7202    , 82.8683    , 86.2192    ,
        86.0696    , 86.0411    , 82.1037    , 82.4433    , 82.9408    ,
        80.0252    , 80.1908    , 80.2621    , 80.0743    , 81.1474    ,
        83.72179999, 80.4112    , 80.2697    , 80.4792    , 80.3703    ,
        80.00720001, 82.693     , 80.1582    , 80.22649999, 81.2335    ,

In [9]:
# Awesome...
dt / 0.3048

# Compare to lists.

array([270.76804464, 268.04593176, 272.06233595, 268.35958005,
       266.08103675, 270.54954068, 277.20570864, 271.39173229,
       271.87762469, 282.87139107, 282.38057742, 282.2870735 ,
       269.36909449, 261.27690289, 260.24770344, 270.48326772,
       272.11548557, 261.44455381, 259.71423885, 262.54986876,
       258.45538058, 261.07611549, 259.6925853 , 260.23720472,
       257.34481627, 263.09317585, 261.68635171, 259.29330709,
       258.5101706 , 258.20406824, 256.789042  , 258.75131235,
       263.32709973, 256.2542651 , 258.91272965, 256.46161417,
       257.34908137, 259.29593176, 252.58070868, 262.710958  ,
       266.2316273 , 274.6778215 , 261.51312335, 263.81627297,
       263.35203412, 264.03937008, 258.72900262, 253.72408136,
       258.61942257, 258.10039372, 258.68471129, 260.80577427,
       250.01279529, 260.33628609, 257.0062336 , 253.75754592,
       257.90157481, 249.86942256, 260.41732283, 252.85564306,
       258.82677166, 263.68208663, 262.49081367, 260.28

These are NumPy arrays, which we'll meet properly in a minute. For now, just notice that they look a lot like lists... which might mean that even 'naive' functions that were written for scalar quantities (i.e. not for sequences) work on them.

In [12]:
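def vp_from_dt(dt):
    """Velocity in m/s from slowness dt, assumed to be in µs/m."""
    return 1e6 / dt

vp_from_dt(dt / 0.3048)  # Convert dt from µs/ft to µs/m first; this matches the output below.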

array([3693.19799659, 3730.70388886, 3675.62822147, 3726.34358651,
       3758.25354638, 3696.18073454, 3607.42931629, 3684.71062684,
       3678.12541085, 3535.17545983, 3541.32004803, 3542.49306423,
       3712.37837027, 3827.35706421, 3842.49308177, 3697.08636112,
       3674.91029743, 3824.90277739, 3850.38573334, 3808.80022796,
       3869.13980179, 3830.30059314, 3850.70678409, 3842.64809897,
       3885.83696567, 3800.93477063, 3821.36857149, 3856.63637537,
       3868.3197557 , 3872.90567032, 3894.24716957, 3864.7146982 ,
       3797.55824988, 3902.37407209, 3862.30526925, 3899.21900492,
       3885.77256501, 3856.59733733, 3959.13054973, 3806.46474594,
       3756.12773794, 3640.62884489, 3823.89987612, 3790.51674394,
       3797.19869392, 3787.31398921, 3865.04794546, 3941.28927231,
       3866.68561106, 3874.46135048, 3865.709709  , 3834.2709351 ,
       3999.79528586, 3841.1856258 , 3890.95620754, 3940.76951038,
       3877.44821156, 4002.09033078, 3839.99032451, 3954.82571

Ostensibly scalar functions do work on them!
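
To see why this matters, here's a small sketch comparing a plain list with an array. The three values are made up for illustration; they are not taken from the log above.

```python
import numpy as np

values = [82.5, 78.3, 80.1]   # Hypothetical slowness values.

# A list does not know how to divide itself by a number.
try:
    values / 0.3048
    list_failed = False
except TypeError as err:
    list_failed = True
    print("list:", err)

# With a list we need an explicit loop or comprehension.
scaled = [v / 0.3048 for v in values]

# An array applies the operation to every element at once.
arr = np.array(values)
print(np.allclose(arr / 0.3048, scaled))

# Boolean indexing also needs an array, not a list.
print(arr[arr > 80])
```

The array version is shorter, and for large arrays it is usually much faster too, because the loop happens in compiled code.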

## Some tools for deep learning

Some NumPy features are very useful for machine learning in general, but especially in deep learning. Let's look at a few.

### Random numbers

You need to know how to make random numbers in NumPy. **Heads up**: many people still don't use the recommended way, the `Generator` returned by `np.random.default_rng()`, because it is relatively new (it arrived in NumPy 1.17, released in 2019).

In [14]:
rng = np.random.default_rng(seed=42)

rng.random(size=5), rng.integers(0, 10, size=20)

(array([0.77395605, 0.43887844, 0.85859792, 0.69736803, 0.09417735]),
 array([5, 9, 7, 7, 7, 7, 5, 1, 8, 4, 5, 3, 1, 9, 7, 6, 4, 8, 5, 4]))
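
Seeding matters because it makes experiments repeatable. The sketch below assumes nothing beyond NumPy itself; the names `rng_a`, `rng_b`, `weights` and `order` are just for illustration, as are the weight scale of 0.1 and the shapes.

```python
import numpy as np

# Two generators with the same seed produce the same stream of numbers.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)
same = np.array_equal(rng_a.random(3), rng_b.random(3))
print(same)

# Normally distributed values, e.g. for initializing a small weight matrix.
weights = rng_a.normal(loc=0.0, scale=0.1, size=(3, 5))
print(weights.shape)

# A random ordering of 10 indices, e.g. for shuffling training examples.
order = rng_a.permutation(10)
print(order)
```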

### Reshaping

We'll often want to reshape arrays, e.g. turning rows into columns, flattening, unflattening, etc. Here are the main operations you need to know about:

In [135]:
arr = np.arange(25)

sq = arr.reshape(5, 5)
sq

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [136]:
sq.flatten()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [137]:
sq.T

array([[ 0,  5, 10, 15, 20],
       [ 1,  6, 11, 16, 21],
       [ 2,  7, 12, 17, 22],
       [ 3,  8, 13, 18, 23],
       [ 4,  9, 14, 19, 24]])
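
A few related tricks come up all the time, so here is a short sketch of them. It reuses `arr` and `sq` from above; the other names are only for illustration.

```python
import numpy as np

arr = np.arange(25)

# Use -1 to let NumPy work out one of the dimensions.
sq = arr.reshape(5, -1)
print(sq.shape)

# flatten() always returns a copy; ravel() returns a view when it can.
flat_copy = sq.flatten()
flat_view = sq.ravel()
print(np.shares_memory(sq, flat_copy), np.shares_memory(sq, flat_view))

# Turn a 1D array into a single column.
column = arr.reshape(-1, 1)
print(column.shape)

# Add a leading 'batch' dimension of length 1.
batch = sq[None, ...]
print(batch.shape)
```

The copy-versus-view distinction matters when you modify the result: changing a view changes the original array as well.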

### Broadcasting

The broadcasting effect takes care of simple things like multiplying an `ndarray` by a scalar:

In [138]:
np.ones(5) * 100

array([100., 100., 100., 100., 100.])

But it's much more powerful than this. For example, imagine we have an image, and we'd like to multiply it by an array of scalars. Instead of having to loop over these scalars, we can compute all of them at once by adding some dimensions to the array containing the scalars:

In [139]:
x = np.ones((3, 3))
m = np.arange(1, 11).reshape(-1, 1, 1)

x * m

array([[[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.]],

       [[ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.]],

       [[ 3.,  3.,  3.],
        [ 3.,  3.,  3.],
        [ 3.,  3.,  3.]],

       [[ 4.,  4.,  4.],
        [ 4.,  4.,  4.],
        [ 4.,  4.,  4.]],

       [[ 5.,  5.,  5.],
        [ 5.,  5.,  5.],
        [ 5.,  5.,  5.]],

       [[ 6.,  6.,  6.],
        [ 6.,  6.,  6.],
        [ 6.,  6.,  6.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 8.,  8.,  8.],
        [ 8.,  8.,  8.],
        [ 8.,  8.,  8.]],

       [[ 9.,  9.,  9.],
        [ 9.,  9.,  9.],
        [ 9.,  9.,  9.]],

       [[10., 10., 10.],
        [10., 10., 10.],
        [10., 10., 10.]]])

You'll see people achieve the same effect like this:

In [140]:
x = np.ones((3, 3))   # Imagine a tiny image.
m = np.arange(1, 11)[:, None, None]

x * m

array([[[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.]],

       [[ 2.,  2.,  2.],
        [ 2.,  2.,  2.],
        [ 2.,  2.,  2.]],

       [[ 3.,  3.,  3.],
        [ 3.,  3.,  3.],
        [ 3.,  3.,  3.]],

       [[ 4.,  4.,  4.],
        [ 4.,  4.,  4.],
        [ 4.,  4.,  4.]],

       [[ 5.,  5.,  5.],
        [ 5.,  5.,  5.],
        [ 5.,  5.,  5.]],

       [[ 6.,  6.,  6.],
        [ 6.,  6.,  6.],
        [ 6.,  6.,  6.]],

       [[ 7.,  7.,  7.],
        [ 7.,  7.,  7.],
        [ 7.,  7.,  7.]],

       [[ 8.,  8.,  8.],
        [ 8.,  8.,  8.],
        [ 8.,  8.,  8.]],

       [[ 9.,  9.,  9.],
        [ 9.,  9.,  9.],
        [ 9.,  9.,  9.]],

       [[10., 10., 10.],
        [10., 10., 10.],
        [10., 10., 10.]]])
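
The rule behind this: NumPy lines the shapes up from the right, and two dimensions are compatible if they are equal or if one of them is 1. So `(3, 3)` and `(10, 1, 1)` combine to give `(10, 3, 3)`. Here is a small sketch that checks the broadcast result against an explicit loop, then uses the same idea to centre the columns of a made-up feature matrix `X`.

```python
import numpy as np

x = np.ones((3, 3))
m = np.arange(1, 11)[:, None, None]

# Broadcasting: shapes (3, 3) and (10, 1, 1) give (10, 3, 3).
product = x * m
print(product.shape)

# The same thing with an explicit loop over the scalars.
looped = np.stack([x * k for k in range(1, 11)])
print(np.array_equal(product, looped))

# Another common use: subtract each column's mean from a feature matrix.
X = np.array([[1.0, 2.0, 3.0],
              [3.0, 4.0, 5.0]])   # Hypothetical data, 2 samples by 3 features.
centred = X - X.mean(axis=0)      # Shapes (2, 3) and (3,) broadcast together.
print(centred)
```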

### Matrix multiply

Deep learning often involves multiplying matrices and vectors. You can do this with the `np.matmul()` function or, equivalently, with the `@` operator (available since Python 3.5).

Suppose we have 5 inputs to a neural network, and 3 units in the first hidden layer. In order to compute the activations on the first layer (using random weights here), we'll need the matrix multiplication of the weight matrix and the inputs:

In [141]:
x = np.array([3, 6, 1, 4, 6])

W = rng.random((3, x.size))
b = np.zeros((3,))

W @ x + b

array([9.74779257, 7.7056171 , 9.24600484])
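
In practice we usually push a whole batch of inputs through the layer at once. The sketch below uses fixed, made-up weights instead of random ones so that it is repeatable, and a hypothetical batch `X` made of `x` and `2 * x`.

```python
import numpy as np

x = np.array([3, 6, 1, 4, 6])
W = np.arange(15).reshape(3, 5) / 10   # Hypothetical fixed weights, shape (3, 5).
b = np.zeros(3)

# Three equivalent ways to write the same product.
z = W @ x + b
same = np.allclose(z, np.matmul(W, x) + b) and np.allclose(z, W.dot(x) + b)
print(same)

# A batch of 2 input vectors, one per row: shape (2, 5).
X = np.stack([x, 2 * x])

# Each row of Z holds the 3 layer outputs for one input: shape (2, 3).
Z = X @ W.T + b
print(Z.shape)
```

Putting the batch on the first axis and multiplying by `W.T` is one common convention; others are possible, as long as the inner dimensions match.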

### More linear algebra

We can easily get the norm of a vector. By default it's the L2 norm; pass `ord` for other norms.

In [142]:
a = np.array([3.14, 5.21, 1.78])

np.linalg.norm(a)

6.33814641673731
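
Here's a quick sketch of the `ord` argument, using a made-up vector `v` whose norms are easy to check by hand.

```python
import numpy as np

v = np.array([3.0, -4.0])

l2 = np.linalg.norm(v)                 # Square root of the sum of squares.
l1 = np.linalg.norm(v, ord=1)          # Sum of absolute values.
linf = np.linalg.norm(v, ord=np.inf)   # Largest absolute value.

# The L2 norm computed 'by hand' for comparison.
by_hand = np.sqrt(np.sum(v**2))

print(l2, l1, linf, by_hand)
```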

We can get the inverse of a matrix:

In [143]:
m = rng.random((3, 3))

np.linalg.inv(m)

array([[-0.51815991,  0.337562  ,  0.89022118],
       [-6.46156246,  6.3257387 , -3.9862056 ],
       [ 6.95162645, -5.63179092,  3.17807217]])
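
It's worth checking that an inverse really is an inverse. The sketch below does that with a small made-up matrix, and also shows `np.linalg.solve()`, which is generally a better choice than forming the inverse when all you want is to solve a linear system.

```python
import numpy as np

m = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # Hypothetical, well-conditioned matrix.

m_inv = np.linalg.inv(m)

# A matrix times its inverse should be the identity, up to rounding.
print(np.allclose(m @ m_inv, np.eye(2)))

# Solve m @ x = rhs directly, without computing the inverse.
rhs = np.array([3.0, 5.0])
x = np.linalg.solve(m, rhs)
print(x, np.allclose(x, m_inv @ rhs))
```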

If you need them, NumPy also has a dedicated `np.matrix` class for 'real' dense matrices (with some extra operations, although NumPy's documentation no longer recommends using it), and SciPy provides sparse matrices, defined in a couple of different ways. You might need sparse matrices if you deal with very large datasets.

---

For more on NumPy, check out [Intro_to_NumPy_for_ML.ipynb](Intro_to_NumPy_for_ML.ipynb).

---

## Torch tensors

Torch was developed in the open by a group of Lua programmers at IDIAP/EPFL in Switzerland, and was later picked up by Facebook, Twitter, Google and others. It later turned into PyTorch and is used on huge machine learning projects everywhere.

PyTorch provides 4 important components:

- The tensor object.
- autograd: automatic differentiation.
- optim: efficient optimization. 
- nn: tools for computing large 'graphs' (how maths sees neural nets).

In essence, a `torch.Tensor` behaves much like a NumPy `ndarray`, but it has a couple of important superpowers: it knows calculus, and it can compute on a GPU.
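
Here is a minimal sketch of moving between the two worlds, assuming PyTorch is installed. It falls back to the CPU if no GPU is available; the variable names are only for illustration.

```python
import numpy as np
import torch

arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# from_numpy() makes a tensor that shares memory with the array.
t = torch.from_numpy(arr)
arr[0, 0] = 100          # The change shows up in the tensor too.
print(t[0, 0])

# Pick a GPU if there is one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

# Compute on that device, then bring the result back to NumPy.
doubled = (t_dev * 2).cpu().numpy()
print(doubled)
```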

Let's look at a function:

$$ y = x^3 - 2x^2 + 2x - 7 $$

We see that:

$$ \frac{\mathrm{d}y}{\mathrm{d}x} = 3x^2 - 4x + 2 $$

So for $x$ = 3.5, $y$ = 18.375 and $\frac{\mathrm{d}y}{\mathrm{d}x}$ = 24.75. Let's see how to do it in PyTorch:
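
Before handing this to PyTorch, we can sanity-check those numbers in plain Python with a central finite difference. This is only a rough numerical check, and the step size `h` is an arbitrary small number chosen for this sketch.

```python
def f(x):
    """The example function y = x^3 - 2x^2 + 2x - 7."""
    return x**3 - 2*x**2 + 2*x - 7

def dfdx(x):
    """Its derivative, worked out by hand."""
    return 3*x**2 - 4*x + 2

x0 = 3.5
h = 1e-5

# Central difference approximation of the slope at x0.
numerical = (f(x0 + h) - f(x0 - h)) / (2 * h)

print(f(x0), dfdx(x0), numerical)
```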

In [5]:
import torch

x = torch.tensor(3.5, requires_grad=True)
y = x**3 - 2*x**2 + 2*x - 7
y

tensor(18.3750, grad_fn=<SubBackward0>)

Apply the `backward()` method to compute gradients:

In [2]:
y.backward()

The gradient is stored on the tensor `x`, because that's the value to which it applies:

In [3]:
x.grad

tensor(24.7500)
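
The same thing works elementwise on a whole tensor of inputs. Because `backward()` needs a scalar to start from, this sketch sums the outputs first; that does not change the individual derivatives here, since each output depends on only one input. The input values are made up.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.5], requires_grad=True)
y = x**3 - 2*x**2 + 2*x - 7

# Reduce to a scalar, then backpropagate.
y.sum().backward()

# Compare with the derivative we worked out by hand.
xd = x.detach()
expected = 3*xd**2 - 4*xd + 2

print(x.grad, expected)
```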

The beauty of PyTorch is that it can compute arbitrarily complex functions (e.g. nonlinear activations applied on top of the simple model here), and keep track of these gradients on millions of units across thousands of passes. All of this is done for us, on GPUs if we have them, so it's very fast and efficient.
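
As a small taste of that, here is a sketch of the tiny layer from the matrix multiply section with a `tanh` activation on top. The small random weights, the seed, and the choice of `tanh` and a summed 'loss' are all assumptions made for illustration, not a recommended setup.

```python
import torch

torch.manual_seed(0)

x = torch.tensor([3.0, 6.0, 1.0, 4.0, 6.0])

# Small random weights and zero biases; both track gradients.
W = (0.1 * torch.rand(3, 5)).requires_grad_()
b = torch.zeros(3, requires_grad=True)

# Forward pass: linear step, then a nonlinear activation.
a = torch.tanh(W @ x + b)

# A stand-in 'loss' so that we have a scalar to differentiate.
loss = a.sum()
loss.backward()

# The derivative of tanh(z) is 1 - tanh(z)^2, so we can check b.grad by hand.
by_hand = 1 - a.detach()**2

print(b.grad, by_hand)
print(W.grad.shape)
```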

<hr />

<div>
<img src="https://avatars1.githubusercontent.com/u/1692321?s=50"><p style="text-align:center">© Agile Geoscience 2022</p>
</div>