# neuralthreads
[medium](https://neuralthreads.medium.com/i-was-not-satisfied-by-any-deep-learning-tutorials-online-37c5e9f4bea1)

## Chapter 1 — Tensors

### 1.3 Basic and useful NumPy operations for the course

> First post (own_tutorial_1.ipynb).
> Previous post (own_tutorial_2.ipynb).

In this post, we will see some useful NumPy operations that we will use in future posts of this course. Important operations like broadcasting and taking the sum along a particular axis are explained here.

We start by importing NumPy and setting the random seed to 42.

In [1]:
import numpy as np
np.random.seed(42)
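
Setting the seed makes the random numbers reproducible, which is why the outputs in this post can be matched run after run. A minimal sketch of that idea, where the names first and second are only illustrative:

```python
import numpy as np

np.random.seed(42)
first = np.random.random(size=3)

np.random.seed(42)                 # reset the generator to the same state
second = np.random.random(size=3)

print(first)
print(np.array_equal(first, second))  # True: same seed, same numbers
```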

**np.random.random**

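gives an array of 
- random numbers drawn uniformly from the interval [0, 1) 
- of a given shape.

The example also calls x.reshape(2, 3), which returns the same six values arranged in a new shape.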

In [2]:
x = np.random.random(size = (3, 2))
print(x)
x.reshape(2,3)


[[0.37454012 0.95071431]
 [0.73199394 0.59865848]
 [0.15601864 0.15599452]]


array([[0.37454012, 0.95071431, 0.73199394],
       [0.59865848, 0.15601864, 0.15599452]])

**np.random.normal** 

gives an array of 
- normally distributed numbers having mean = ‘loc’ and 
- standard deviation = ‘scale’ 
- of given shape.

In [3]:
np.random.normal(loc = 0, scale = 1, size = (2, 3))

array([[ 1.57921282,  0.76743473, -0.46947439],
       [ 0.54256004, -0.46341769, -0.46572975]])

In [4]:
np.random.normal(loc = 0, scale = 5, size = (3, 3))

array([[ 1.20981136, -9.56640122, -8.62458916],
       [-2.81143765, -5.0641556 ,  1.57123666],
       [-4.54012038, -7.06151851,  7.32824384]])
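
With only a few numbers it is hard to see the effect of ‘loc’ and ‘scale’. The sketch below draws a large sample and checks that its mean and standard deviation come out close to the requested values. The seed 0 and the sample size are arbitrary choices made for this illustration.

```python
import numpy as np

np.random.seed(0)
samples = np.random.normal(loc=0, scale=5, size=100_000)

sample_mean = samples.mean()   # should be close to loc = 0
sample_std = samples.std()     # should be close to scale = 5
print(sample_mean, sample_std)
```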

**np.random.randint** 

gives an array of 
- random integers from low (inclusive) to high (exclusive) 
- of a given shape

In [5]:
np.random.randint(low = 0, high = 10, size = (3, 4))

array([[2, 6, 3, 8],
       [2, 4, 2, 6],
       [4, 8, 6, 1]])

In [6]:
np.random.randint(low = -9, high = 10, size = (4, 4))

array([[-6,  4,  8, -1],
       [-8,  5, -3,  2],
       [-2,  5, -7,  4],
       [ 7, -6,  8, -2]])
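
To see that ‘high’ is excluded, we can draw many integers with high = 3 and look at the range of values that come back. This is only a quick check, and the seed and sample size are arbitrary.

```python
import numpy as np

np.random.seed(0)
draws = np.random.randint(low=0, high=3, size=1000)

# Every value is 0, 1 or 2; the upper bound 3 never appears.
print(draws.min(), draws.max())
print(np.unique(draws))
```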

**np.ones** 

gives an array of ones of a given shape. The default dtype is float64, as the type check below shows.

In [7]:
n = np.ones(shape = (2, 4))
print(n)
type(n[0][0])

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]


numpy.float64

**np.zeros** 

gives an array of zeros of a given shape

In [8]:
np.zeros(shape = (2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

**np.eye** 

gives an identity matrix, that is, a square N × N matrix with ones on the diagonal and zeros everywhere else

In [9]:
np.eye(N = 3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [10]:
np.eye(N = 5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])
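
The defining property of an identity matrix is that multiplying by it changes nothing. A small sketch, with an arbitrary matrix a chosen for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
identity = np.eye(N=3)

product = a.dot(identity)          # matrix product of a with the identity
print(product)
print(np.array_equal(product, a))  # True: the values are unchanged
```

Note that the product is a float array, because np.eye returns floats.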

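**x.T** 

The *.T* attribute of an array gives the transpose of a matrix, which flips it along the diagonal. It is an attribute of the array itself (x.T), not a function in the np namespace.

It reverses the shape tuple, so the element at position (i, j) moves to position (j, i).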

In [11]:
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(x)
print(x.shape)
print(x.T)
x.T.shape

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
(4, 3)
[[ 1  4  7 10]
 [ 2  5  8 11]
 [ 3  6  9 12]]


(3, 4)
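
The same rule holds for arrays with more than two dimensions: the whole shape tuple is reversed. A short sketch with a 3-D array, using np.arange only to fill it with distinct values:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
xt = x.T

print(x.shape, xt.shape)         # (2, 3, 4) (4, 3, 2)
print(x[1, 2, 3], xt[3, 2, 1])   # 23 23: the indices are reversed too
```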

**np.concatenate**

will merge two or more arrays along the specified axis.
The only requirement is that all the other axes have the same size.

If one array is of shape (3, 2, 4) and the other is of shape (3, 1, 4)
Then they can be merged along axis = 1 (2+1)

Or, if one array is of shape (1, 2, 3) and the other is of shape (2, 2, 3)
Then they can be merged along axis = 0 (1+2)

*np.concatenate* is different from *np.stack*: concatenate joins arrays along an existing axis, while stack joins them along a new axis.

In [12]:
print('-----------X-------------')
x = np.random.randint(low = 0, high = 5,  size = (2, 3, 1), dtype = int)
print(x, x.shape)
print('-----------Y-------------')
y = np.random.randint(low = 6, high = 10, size = (2, 3, 3))
print(y, y.shape)
print('-----------Z-------------')
z = np.concatenate((x, y), axis = 2)
print(z, z.shape)

-----------X-------------
[[[3]
  [1]
  [1]]

 [[3]
  [4]
  [1]]] (2, 3, 1)
-----------Y-------------
[[[7 9 7]
  [7 7 9]
  [7 8 9]]

 [[8 9 7]
  [8 9 6]
  [7 9 6]]] (2, 3, 3)
-----------Z-------------
[[[3 7 9 7]
  [1 7 7 9]
  [1 7 8 9]]

 [[3 8 9 7]
  [4 8 9 6]
  [1 7 9 6]]] (2, 3, 4)


In [13]:
print('-----------X-------------')
x = np.random.randint(low = 0, high = 5,  size = (1, 2, 3), dtype = int)
print(x, x.shape, len(x.shape))
print('-----------Y-------------')
y = np.random.randint(low = 6, high = 10, size = (3, 2, 3))
print(y, y.shape, len(y.shape))
print('-----------Z-------------')
z = np.concatenate((x, y), axis = 0)
print(z, z.shape, len(z.shape))

-----------X-------------
[[[4 1 4]
  [1 0 3]]] (1, 2, 3) 3
-----------Y-------------
[[[9 9 6]
  [6 6 8]]

 [[6 6 6]
  [8 6 9]]

 [[6 9 9]
  [9 8 8]]] (3, 2, 3) 3
-----------Z-------------
[[[4 1 4]
  [1 0 3]]

 [[9 9 6]
  [6 6 8]]

 [[6 6 6]
  [8 6 9]]

 [[6 9 9]
  [9 8 8]]] (4, 2, 3) 3


In [14]:
print('-----------X-------------')
x = np.random.randint(low = 0, high = 5,  size = (2, 2, 3), dtype = int)
print(x, x.shape, len(x.shape))
print('-----------Y-------------')
y = np.random.randint(low = 6, high = 10, size = (2, 2, 3))
print(y, y.shape, len(y.shape))
print('-----------Z-------------')
z = np.concatenate((x, y), axis = 1)
print(z, z.shape, len(z.shape))

-----------X-------------
[[[2 0 2]
  [2 0 2]]

 [[4 1 1]
  [0 3 0]]] (2, 2, 3) 3
-----------Y-------------
[[[9 9 7]
  [6 9 8]]

 [[8 7 9]
  [6 8 9]]] (2, 2, 3) 3
-----------Z-------------
[[[2 0 2]
  [2 0 2]
  [9 9 7]
  [6 9 8]]

 [[4 1 1]
  [0 3 0]
  [8 7 9]
  [6 8 9]]] (2, 4, 3) 3
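
The difference from *np.stack* mentioned earlier can be sketched with two small arrays of the same shape. The arrays a and b are placeholders chosen for this illustration.

```python
import numpy as np

a = np.zeros(shape=(2, 3))
b = np.ones(shape=(2, 3))

joined = np.concatenate((a, b), axis=0)   # grows the existing axis 0
stacked = np.stack((a, b), axis=0)        # creates a new axis 0

print(joined.shape)    # (4, 3)
print(stacked.shape)   # (2, 2, 3)
```

Concatenation keeps the number of dimensions, while stacking adds one, so stack needs all inputs to have exactly the same shape.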


**np.sum** 

We will use it in chapter 5 when we talk about Backpropagation.

> Note: Everything we see here for **np.sum** works the same way for **np.mean**, **np.max**, and **np.min**.

Without an axis argument, np.sum gives us the sum of all the scalars in the array.

In [15]:
x = [i for i in range(1, 25)]
print(x)
print(np.sum(x)) 
x = np.array(x)
print(x, x.shape)
print(np.sum(x))
x = x.reshape((4, 3, 2))
print(x, x.shape)
print(np.sum(x), len(x.shape))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
300
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24] (24,)
300
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]

 [[13 14]
  [15 16]
  [17 18]]

 [[19 20]
  [21 22]
  [23 24]]] (4, 3, 2)
300 3


In [16]:
%%latex
\begin{align}
    \text{Sum is } \, \frac{24 \times 25}{2} = 300
\end{align}

Why 25? The sum of the first n natural numbers is n(n + 1)/2, and here n = 24.

But sometimes we have to take the sum along a particular axis.

In this case, we have 3 axes (4, 3, 2). So, let us go through each axis one by one.

When we take the sum along **axis = 0** (4)

It adds the four (N-1)-D arrays, here the 3 × 2 matrices, element-wise and returns an array whose shape tuple no longer has the first entry.

In [17]:
print(x)
print('-------------------------')
sum_0 = np.sum(x, axis = 0)     
print(sum_0, sum_0.shape, len(sum_0.shape))

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]

 [[13 14]
  [15 16]
  [17 18]]

 [[19 20]
  [21 22]
  [23 24]]]
-------------------------
[[40 44]
 [48 52]
 [56 60]] (3, 2) 2


![image.png](attachment:image.png)

All the matrices are added element-wise and the resulting shape is (3, 2)

We can see that the first entry is discarded from the shape (4, 3, 2)
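
One way to convince ourselves of this is to add the four matrices by hand and compare with np.sum. A minimal sketch, rebuilding the same ‘x’ with np.arange:

```python
import numpy as np

x = np.arange(1, 25).reshape(4, 3, 2)

manual = x[0] + x[1] + x[2] + x[3]   # element-wise sum of the four matrices
sum_0 = np.sum(x, axis=0)

print(manual)
print(np.array_equal(manual, sum_0))  # True
```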

When we take the sum along **axis = 1** (3)

Inside each of the four (N-1)-D arrays, it adds the (N-2)-D arrays, here the three vectors of length 2, element-wise and returns an array whose shape tuple no longer has the second entry.

In [18]:
print(x)
print('-------------------------')
sum_1 = np.sum(x, axis = 1)     
print(sum_1, sum_1.shape, len(sum_1.shape))

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]

 [[13 14]
  [15 16]
  [17 18]]

 [[19 20]
  [21 22]
  [23 24]]]
-------------------------
[[ 9 12]
 [27 30]
 [45 48]
 [63 66]] (4, 2) 2


![image.png](attachment:image.png)

Same for all other 2-D arrays and the resulting shape is (4, 2)

We can see that the second entry is discarded from the shape (4, __3,__ 2)

When we take the sum along **axis = 2** (2)

Inside each (N-2)-D array, it adds the (N-3)-D arrays and returns an array whose shape tuple no longer has the third entry.

In this case, the (N-3)-D arrays are scalars, so it takes the sum of the scalars in each of the vectors.

In [19]:
print(x)
print('-------------------------')
sum_2 = np.sum(x, axis = 2)     
print(sum_2, sum_2.shape)

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]

 [[13 14]
  [15 16]
  [17 18]]

 [[19 20]
  [21 22]
  [23 24]]]
-------------------------
[[ 3  7 11]
 [15 19 23]
 [27 31 35]
 [39 43 47]] (4, 3)


In [20]:
%%latex
\begin{align}
    1 + 2 &= 3 \\
    3 + 4 &= 7 \\
    5 + 6 &= 11
\end{align}

Same for all other 1-D arrays and the resulting shape is (4, 3)

We can see that the third entry is discarded from the shape (4, 3, 2)

> Note — We can also use axis = -1 in this case.
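
As a quick check of the two notes in this section, the sketch below confirms that axis = -1 and axis = 2 agree for our 3-D array, and that np.mean, np.max, and np.min drop the chosen axis from the shape in the same way as np.sum.

```python
import numpy as np

x = np.arange(1, 25).reshape(4, 3, 2)

# axis = -1 means the last axis, which is axis = 2 here.
print(np.array_equal(np.sum(x, axis=-1), np.sum(x, axis=2)))  # True

mean_0 = np.mean(x, axis=0)   # shape (3, 2)
max_1 = np.max(x, axis=1)     # shape (4, 2)
min_2 = np.min(x, axis=2)     # shape (4, 3)
print(mean_0.shape, max_1.shape, min_2.shape)
print(max_1)
```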

**Suppose you have a matrix**

Then taking the *sum* along **axis = 0** adds the rows together and gives one sum per *column*, and taking the *sum* along **axis = 1** adds the columns together and gives one sum per *row*.

We will use this heavily in Backpropagation in chapter 5.

In [21]:
x = np.array([[1, 2, 3], [4, 5, 6]])
print(x, x.shape)
print('-----------sum along columns--------------')
sum_0 = np.sum(x, axis = 0)                 # sum along columns
print(sum_0, sum_0.shape)
print('------------sum along rows-------------')
sum_1 = np.sum(x, axis = 1)                 # sum along rows
print(sum_1, sum_1.shape)

[[1 2 3]
 [4 5 6]] (2, 3)
-----------sum along columns--------------
[5 7 9] (3,)
------------sum along rows-------------
[ 6 15] (2,)


## Broadcasting

> The most important topic in this post.

First, we have an array ‘x’

In [22]:
x = [i for i in range(1, 25)]
x = np.array(x)
print(x)
x = x.reshape(4, 3, 2)
print(x, x.shape)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]
[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]

 [[13 14]
  [15 16]
  [17 18]]

 [[19 20]
  [21 22]
  [23 24]]] (4, 3, 2)


When we perform any operation like 
- multiplication, 
- addition, 
- subtraction or 
- division 

between an **array** and a **scalar**, the *scalar* is broadcast to every element in the *array*.

In [23]:
x * 5

array([[[  5,  10],
        [ 15,  20],
        [ 25,  30]],

       [[ 35,  40],
        [ 45,  50],
        [ 55,  60]],

       [[ 65,  70],
        [ 75,  80],
        [ 85,  90]],

       [[ 95, 100],
        [105, 110],
        [115, 120]]])

Every element in the array ‘x’ is multiplied by 5.

![image.png](attachment:image.png)

The same is true for addition, subtraction, division, and exponentiation.


In [24]:
x + 5

array([[[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]],

       [[24, 25],
        [26, 27],
        [28, 29]]])

In [25]:
x - 5

array([[[-4, -3],
        [-2, -1],
        [ 0,  1]],

       [[ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13]],

       [[14, 15],
        [16, 17],
        [18, 19]]])

In [26]:
x / 5

array([[[0.2, 0.4],
        [0.6, 0.8],
        [1. , 1.2]],

       [[1.4, 1.6],
        [1.8, 2. ],
        [2.2, 2.4]],

       [[2.6, 2.8],
        [3. , 3.2],
        [3.4, 3.6]],

       [[3.8, 4. ],
        [4.2, 4.4],
        [4.6, 4.8]]])

In [27]:
x ** 2

array([[[  1,   4],
        [  9,  16],
        [ 25,  36]],

       [[ 49,  64],
        [ 81, 100],
        [121, 144]],

       [[169, 196],
        [225, 256],
        [289, 324]],

       [[361, 400],
        [441, 484],
        [529, 576]]])

### Broadcasting a function

In Python, if we define a function using only element-wise operations (arithmetic operators and NumPy functions such as np.exp) and pass an array as an argument, then the function is applied to every element in the array.

![image.png](attachment:image.png)

> Note — Things will be different if the function uses the shape of the array which we will talk about in the Softmax activation function.
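
A minimal sketch of this behaviour follows. The function square_plus_one is a made-up example, not something used later in the course. It works on a scalar and on an array because it is built only from element-wise operations.

```python
import numpy as np

def square_plus_one(a):
    # Only element-wise operations are used, so the same code
    # works for a scalar and for an array of any shape.
    return a ** 2 + 1

x = np.array([[1, 2], [3, 4]])
print(square_plus_one(3))   # 10
print(square_plus_one(x))   # [[ 2  5]
                            #  [10 17]]
```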

Now let us talk about broadcasting between an array and a smaller array, one with fewer dimensions or with some dimensions of size 1.

The shapes of the two arrays must be compatible before we can perform an operation between them. Let us understand with examples.
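
The requirement itself is short: NumPy compares the two shapes starting from the last dimension, and each pair of dimensions must either be equal or contain a 1. Missing leading dimensions are treated as 1. The sketch below checks a few shapes without doing any arithmetic. It assumes NumPy 1.20 or newer, where np.broadcast_shapes is available.

```python
import numpy as np

# Compatible: each trailing dimension matches or one of them is 1.
print(np.broadcast_shapes((4, 3, 2), (2,)))       # (4, 3, 2)
print(np.broadcast_shapes((4, 3, 2), (3, 1)))     # (4, 3, 2)
print(np.broadcast_shapes((4, 3, 2), (4, 3, 1)))  # (4, 3, 2)

# Incompatible: the last dimensions are 2 and 3, and neither is 1.
try:
    np.broadcast_shapes((4, 3, 2), (3,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```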

In [28]:
print(x, x.shape)

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]

 [[13 14]
  [15 16]
  [17 18]]

 [[19 20]
  [21 22]
  [23 24]]] (4, 3, 2)


**First example**

In [29]:
y = np.array([1, 2])
print(y, y.shape)
print(x + y)
print((x + y).shape)

[1 2] (2,)
[[[ 2  4]
  [ 4  6]
  [ 6  8]]

 [[ 8 10]
  [10 12]
  [12 14]]

 [[14 16]
  [16 18]
  [18 20]]

 [[20 22]
  [22 24]
  [24 26]]]
(4, 3, 2)


We can see that here ‘y’ [1, 2] is added to every 1-D tensor in ‘x’

**Second example**

In [30]:
y = np.array([[1, 2], [3, 4], [5, 6]])
print(y, y.shape)
x + y

[[1 2]
 [3 4]
 [5 6]] (3, 2)


array([[[ 2,  4],
        [ 6,  8],
        [10, 12]],

       [[ 8, 10],
        [12, 14],
        [16, 18]],

       [[14, 16],
        [18, 20],
        [22, 24]],

       [[20, 22],
        [24, 26],
        [28, 30]]])

We can see that ‘y’ is added to every 2-D tensor in ‘x’

**Third example, minus**

In [31]:
y = np.array([[1], [3], [5]])
print(y, y.shape)
x - y

[[1]
 [3]
 [5]] (3, 1)


array([[[ 0,  1],
        [ 0,  1],
        [ 0,  1]],

       [[ 6,  7],
        [ 6,  7],
        [ 6,  7]],

       [[12, 13],
        [12, 13],
        [12, 13]],

       [[18, 19],
        [18, 19],
        [18, 19]]])

![image.png](attachment:image.png)

We can see that ‘y’, a column of shape (3, 1), is subtracted from each of the two columns of every 2-D tensor in ‘x’
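
To make the stretching visible, we can ask NumPy for the broadcast version of ‘y’ explicitly with np.broadcast_to and check that it gives the same result. This is only an illustration of what happens conceptually, and the name y_stretched is ours.

```python
import numpy as np

x = np.arange(1, 25).reshape(4, 3, 2)
y = np.array([[1], [3], [5]])               # shape (3, 1)

y_stretched = np.broadcast_to(y, x.shape)   # shape (4, 3, 2)
print(y_stretched[0])                       # [[1 1]
                                            #  [3 3]
                                            #  [5 5]]
print(np.array_equal(x - y, x - y_stretched))  # True
```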

**Fourth example, divide**

In [32]:
y = np.array([[[1], [3], [5]], [[7], [9], [11]], [[13], [15], [17]], [[19], [21], [23]]])
print(y, y.shape)
x / y

[[[ 1]
  [ 3]
  [ 5]]

 [[ 7]
  [ 9]
  [11]]

 [[13]
  [15]
  [17]]

 [[19]
  [21]
  [23]]] (4, 3, 1)


array([[[1.        , 2.        ],
        [1.        , 1.33333333],
        [1.        , 1.2       ]],

       [[1.        , 1.14285714],
        [1.        , 1.11111111],
        [1.        , 1.09090909]],

       [[1.        , 1.07692308],
        [1.        , 1.06666667],
        [1.        , 1.05882353]],

       [[1.        , 1.05263158],
        [1.        , 1.04761905],
        [1.        , 1.04347826]]])

We can see that every 2-D tensor in ‘x’ is divided by the matching (3, 1) column in ‘y’, which is stretched across both columns

> Note: If two arrays have the same shape, then the operation will be element-wise.

**DOT product**

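It is very easy to take DOT products in NumPy. For 2-D arrays, x.dot(y) is matrix multiplication, so the number of columns in ‘x’ must be equal to the number of rows in ‘y’.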

In [33]:
print('-----------X--------------')
x = np.array([[1], [2], [3]])
print(x, x.shape)
print('-----------Y--------------')
y = np.array([[1, 2, 3]])
print(y, y.shape)
print('-----------Z--------------')
z = x.dot(y)
print(z, z.shape)

-----------X--------------
[[1]
 [2]
 [3]] (3, 1)
-----------Y--------------
[[1 2 3]] (1, 3)
-----------Z--------------
[[1 2 3]
 [2 4 6]
 [3 6 9]] (3, 3)


In [42]:
print('-----------X--------------')
x = np.array([[2, 1, 2], [2, 1, 3], [2, 1, 4]])
print(x, x.shape)
print('-----------Y--------------')
y = np.array([[3, 3], [3, 2], [2, 2]])
print(y, y.shape)
print('-----------Z--------------')
z = x.dot(y)
print(z, z.shape)

-----------X--------------
[[2 1 2]
 [2 1 3]
 [2 1 4]] (3, 3)
-----------Y--------------
[[3 3]
 [3 2]
 [2 2]] (3, 2)
-----------Z--------------
[[13 12]
 [15 14]
 [17 16]] (3, 2)


![image.png](attachment:image.png)
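
For 2-D arrays the @ operator gives the same matrix product as the dot method. The sketch below compares the two, shows the scalar result for 1-D arrays, and shows what happens when the inner dimensions do not agree. The names z_dot, z_at, v, and aligned are ours.

```python
import numpy as np

x = np.array([[2, 1, 2], [2, 1, 3], [2, 1, 4]])   # shape (3, 3)
y = np.array([[3, 3], [3, 2], [2, 2]])            # shape (3, 2)

z_dot = x.dot(y)
z_at = x @ y
print(np.array_equal(z_dot, z_at))  # True
print(z_at.shape)                   # (3, 2)

# For 1-D arrays, dot gives the scalar inner product.
v = np.array([1, 2, 3])
print(v.dot(v))                     # 14

# Inner dimensions must agree: (3, 2) with (3, 3) does not work.
try:
    y.dot(x)
    aligned = True
except ValueError:
    aligned = False
print(aligned)                      # False
```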

With this post, the first chapter is over. In the next post, we will start **Chapter 2** — Optimizers with Gradient Descent.