<a href="https://colab.research.google.com/github/sgcortes/2023_NAPLES/blob/main/02_Numpy_Refresher_Part_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1 style="font-size:30px;">Numpy Refresher (Part-2)</h1>


<img src='https://learnopencv.com/wp-content/uploads/2022/01/c4_01_NumPy_logo.png' width=200 align='left'><br/>

## Table of Contents

* [1 Understanding Array Dimensions and Reshaping NumPy Arrays](#1-Understanding-Array-Dimensions-and-Reshaping-NumPy-Arrays)
* [2 Combining Arrays / Matrices](#2-Combining-Arrays-/-Matrices)

In [None]:
import numpy as np

def array_info(array):
    print('Array:\n{}'.format(array))
    print('Data type:\t{}'.format(array.dtype))
    print('Array shape:\t{}'.format(array.shape))
    print('Array Dim:\t{}\n'.format(array.ndim))

## 1 Understanding Array Dimensions and Reshaping NumPy Arrays

Reshaping an array/matrix is often required in machine learning and computer vision and is a fundamental concept that should be fully understood. **Reshaping an array changes the shape of the data without altering the data itself. Reshaping arrays requires that the number of data elements remains constant**. In NumPy we will use the `reshape` method to reshape NumPy arrays.
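
As a minimal sketch of the element-count rule (the variable names here are illustrative and not part of the original notebook), the snippet below reshapes six elements into a `(2, 3)` array and shows that NumPy raises a `ValueError` when the requested shape would need a different number of elements.

```python
import numpy as np

# Six elements can be arranged as 2x3 because 2 * 3 == 6.
data = np.arange(6)
ok = data.reshape(2, 3)
print(ok.shape, ok.size)

# Asking for 4x2 (8 elements) is rejected: the element count must not change.
try:
    data.reshape(4, 2)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print('ValueError:', err)
```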

### 1.1 Reshape

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

Either syntax below can be used:

``` python
result = np.reshape(a, newshape)

result = a.reshape(newshape)
```

Gives a new shape to a NumPy array `a` without changing its data.

Documentation: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html" target=_blank>np.reshape</a>

<hr style="border:none; height: 4px; background-color:#D3D3D3" />
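
The following sketch suggests that the two call styles are interchangeable, and that "without changing its data" can be taken quite literally: for a contiguous array such as one produced by `np.arange`, the result is typically a view onto the same memory. The names `x`, `r1`, and `r2` are placeholders chosen for this example.

```python
import numpy as np

x = np.arange(6)

r1 = np.reshape(x, (2, 3))  # function form
r2 = x.reshape((2, 3))      # method form (x.reshape(2, 3) also works)

print(np.array_equal(r1, r2))   # same values either way
print(np.shares_memory(x, r2))  # the reshaped result is a view of x here

# Because r2 is a view, writing through it also changes x.
r2[0, 0] = 99
print(x[0])
```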

### <font style="color:rgb(50,120,230)">Converting 1D Arrays to 2D Arrays</font>

We will start by creating a 1D array using `arange`. The shape of an array is always represented by a tuple. Notice that the shape of the 1D array below is represented by the tuple `(12,)`, which indicates a single dimension with 12 elements.

In [None]:
# Create a 1D array.
print('Create a 1D array:')
a = np.arange(1, 13, dtype=int)
array_info(a)
print("a[0]   = ", a[0])

Create a 1D array:
Array:
[ 1  2  3  4  5  6  7  8  9 10 11 12]
Data type:	int64
Array shape:	(12,)
Array Dim:	1

a[0]   =  1


Next, we will reshape the 1D array into a 2D array using `reshape(1, 12)`. Notice that this does not change the number of elements in the array; it simply changes the shape by adding a new dimension, which also changes how the elements are accessed. In this case, we are creating a 2D array with a single row and 12 columns. The notion of rows and columns is a convenient mental construct for 2D arrays, but NumPy uses `axis = 0, 1, 2, etc.` to specify the dimensions of an array, where `axis=0` represents the 1st dimension, `axis=1` the 2nd dimension, and so on. For a 2D array, `axis=0` corresponds to the rows and `axis=1` corresponds to the columns. The 3rd and higher dimensions of a NumPy array are simply referred to by their axis numbers.

In [None]:
# Reshape the 1D array to a (1x12) 2D array.
a = a.reshape(1, 12)
array_info(a)

# Print some values from the array.
print("a[0]   = ", a[0])                     
print("a[0,5] = ", a[0, 5])

Array:
[[ 1  2  3  4  5  6  7  8  9 10 11 12]]
Data type:	int64
Array shape:	(1, 12)
Array Dim:	2

a[0]   =  [ 1  2  3  4  5  6  7  8  9 10 11 12]
a[0,5] =  6
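
To make the axis numbering concrete, here is a small sketch on a `(3, 4)` array (the variable name `m` is an assumption of this example). Indexing `shape` by axis gives the length of that dimension, and reducing along an axis collapses it.

```python
import numpy as np

m = np.arange(1, 13).reshape(3, 4)

print(m.shape[0])  # length of axis 0 (number of rows)
print(m.shape[1])  # length of axis 1 (number of columns)

# Summing along axis 0 collapses the rows, leaving one value per column.
col_sums = m.sum(axis=0)
# Summing along axis 1 collapses the columns, leaving one value per row.
row_sums = m.sum(axis=1)

print(col_sums)
print(row_sums)
```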


The same reshaping can also be accomplished with `reshape(1, -1)`, as shown below. The `-1` tells NumPy to infer that dimension from the total number of elements, which is convenient when the array is large or its size is not known in advance.

In [None]:
# Reshape the array to a (1x12) 2D array using reshape(1, -1).
a = a.reshape(1, -1)
array_info(a)

Array:
[[ 1  2  3  4  5  6  7  8  9 10 11 12]]
Data type:	int64
Array shape:	(1, 12)
Array Dim:	2



We can also reshape the array into a (12x1) 2D array, which has 12 rows and a single column.

In [None]:
# Reshape the array to a (12x1) 2D array.
a = a.reshape(12, 1)
array_info(a)
print("a[0]   = ", a[0])
print("a[5,0] = ", a[5, 0])

Array:
[[ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]
 [11]
 [12]]
Data type:	int64
Array shape:	(12, 1)
Array Dim:	2

a[0]   =  [1]
a[5,0] =  6


Notice that we can reshape the array into any shape whose dimensions multiply to 12. In the code below, we reshape the array to `(3x4)`. Other possible 2D shapes include `(4x3)`, `(2x6)`, and `(6x2)`. Depending on the application and the data processing required, reshaping data can be very helpful and is often required for use with deep learning frameworks.

In [None]:
# Reshape the array to a (3x4) 2D array.
a = a.reshape(3, 4)
array_info(a)
print("a[1] = ", a[1])

Array:
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
Data type:	int64
Array shape:	(3, 4)
Array Dim:	2

a[1] =  [5 6 7 8]
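
As a rough illustration of the "factors of 12" remark, the sketch below enumerates every 2D shape whose row and column counts multiply to 12 and confirms that each reshape succeeds. The list name `shapes` is introduced only for this example.

```python
import numpy as np

a = np.arange(1, 13)

# Every (rows, cols) pair with rows * cols == 12 is a valid 2D shape.
shapes = [(r, a.size // r) for r in range(1, a.size + 1) if a.size % r == 0]
print(shapes)

for shape in shapes:
    reshaped = a.reshape(shape)
    # The element count never changes, only the arrangement.
    assert reshaped.size == a.size
```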


### <font style="color:rgb(50,120,230)">Reshaping Multi-Dimensional Arrays</font>

Reshaping higher-dimensional data is no different from reshaping 2D arrays. The example below is provided to emphasize that the first axis coincides with the outer-most bracket and the last axis coincides with the inner-most bracket.

In [None]:
# Create a (3x5x2) 3D array.
b = np.array([
    [[10, 11], [10, 12], [10, 13], [10, 14], [10, 15]],
    [[20, 21], [20, 22], [20, 23], [20, 24], [20, 25]],
    [[30, 31], [30, 32], [30, 33], [30, 34], [30, 35]],
])
array_info(b)
print("b[0,0,:] = ", b[0, 0, :])
print("b[2,0,:] = ", b[2, 0, :])

Array:
[[[10 11]
  [10 12]
  [10 13]
  [10 14]
  [10 15]]

 [[20 21]
  [20 22]
  [20 23]
  [20 24]
  [20 25]]

 [[30 31]
  [30 32]
  [30 33]
  [30 34]
  [30 35]]]
Data type:	int64
Array shape:	(3, 5, 2)
Array Dim:	3

b[0,0,:] =  [10 11]
b[2,0,:] =  [30 31]


In [None]:
# Reshape the (3x5x2) 3D array to a (3x10) 2D array.
b = b.reshape(3, 10)
array_info(b)

Array:
[[10 11 10 12 10 13 10 14 10 15]
 [20 21 20 22 20 23 20 24 20 25]
 [30 31 30 32 30 33 30 34 30 35]]
Data type:	int64
Array shape:	(3, 10)
Array Dim:	2
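
One way to see what a reshape from `(3, 5, 2)` to `(3, 10)` does is to check where individual elements end up. The sketch below assumes NumPy's default row-major (C) ordering, in which the last axis varies fastest, so element `[i, j, k]` lands at `[i, j * 2 + k]`. The array contents are arbitrary placeholder values.

```python
import numpy as np

b3 = np.arange(30).reshape(3, 5, 2)  # 3D array with placeholder values
b2 = b3.reshape(3, 10)               # merge the last two axes

# With row-major ordering the last axis varies fastest, so the two
# innermost axes (5 and 2) are laid out back to back in each row.
mapping_holds = all(
    b3[i, j, k] == b2[i, j * 2 + k]
    for i in range(3)
    for j in range(5)
    for k in range(2)
)
print(mapping_holds)
```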



### <font style="color:rgb(50,120,230)">Adding an Axis / Dimension</font>

It is sometimes necessary to add an axis to a NumPy array to comply with the data shape required by the systems that process the data. This is especially true with deep learning frameworks. Adding a dimension of length one to an existing NumPy array is always possible since it does not change the number of elements in the array; it only changes how the elements are accessed. In the example below, we reshape the 2D `(3x3)` array into a 3D `(1x3x3)` array, effectively adding an axis at the very first position in the shape tuple. The new dimension carries no additional information, but it is often needed to match the input shape expected by another interface.

In [None]:
# Create a (3x3) 2D array.
b = np.arange(1, 10, dtype=int)
b = b.reshape(3, 3)
array_info(b)

Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int64
Array shape:	(3, 3)
Array Dim:	2



In [None]:
# Add a new axis to the 3x3 array to create a 3D array.
b = b.reshape(1, 3, 3)
array_info(b)

Array:
[[[1 2 3]
  [4 5 6]
  [7 8 9]]]
Data type:	int64
Array shape:	(1, 3, 3)
Array Dim:	3



### 1.2 Expand Dimensions

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

In the previous section, we used `reshape` to add a new axis. We can use `np.expand_dims` or `np.newaxis` to do the same thing.

``` python
np.expand_dims(a, axis)
```

Expand the shape of an array. Insert a new axis that will appear at the axis position in the expanded array shape.

Documentation: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.expand_dims.html" target=_blank>np.expand_dims</a> <a href="https://numpy.org/doc/stable/reference/constants.html?highlight=newaxis#numpy.newaxis" target=_blank>np.newaxis</a>

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

In [None]:
# Create a 1D array.
print('Create a 1D array:')
c = np.arange(1, 10, dtype=int)
array_info(c)

# Reshape to a 3x3 2D array.
c = c.reshape(3,3)
array_info(c)

# Expand dimensions using: np.expand_dims
print('Using np.expand_dims:')
c_expand = np.expand_dims(c, axis=0)
array_info(c_expand)

# Expand dimensions using: np.newaxis
print('Using np.newaxis:')
c_newaxis = c[np.newaxis, :, : ]
array_info(c_newaxis)

# You can also use `None`, but np.newaxis is preferred since it is more explicit.
# print('Using None:')
# c_none = c[None, :, : ]
# array_info(c_none)

Create a 1D array:
Array:
[1 2 3 4 5 6 7 8 9]
Data type:	int64
Array shape:	(9,)
Array Dim:	1

Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int64
Array shape:	(3, 3)
Array Dim:	2

Using np.expand_dims:
Array:
[[[1 2 3]
  [4 5 6]
  [7 8 9]]]
Data type:	int64
Array shape:	(1, 3, 3)
Array Dim:	3

Using np.newaxis:
Array:
[[[1 2 3]
  [4 5 6]
  [7 8 9]]]
Data type:	int64
Array shape:	(1, 3, 3)
Array Dim:	3
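
The position of the new axis is controlled by the `axis` argument. As a brief sketch (variable names are illustrative), the snippet below inserts a length-one axis at the front, in the middle, and at the end of a `(3, 3)` array, and shows the equivalent `np.newaxis` indexing for the last case.

```python
import numpy as np

c = np.arange(1, 10).reshape(3, 3)

front = np.expand_dims(c, axis=0)    # shape (1, 3, 3)
middle = np.expand_dims(c, axis=1)   # shape (3, 1, 3)
back = np.expand_dims(c, axis=-1)    # shape (3, 3, 1)

# np.newaxis placed last in the index gives the same result as axis=-1.
back_newaxis = c[:, :, np.newaxis]

print(front.shape, middle.shape, back.shape, back_newaxis.shape)
```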



### 1.3 Squeeze

Sometimes we need to remove redundant axes (axes of length one). We can use `np.squeeze` to do this.

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

``` python
np.squeeze(a, axis=None)
```

Remove axes of length one from a.

Documentation: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.squeeze.html" target=_blank>np.squeeze</a>

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

In [None]:
e = np.arange(1, 10, dtype=int)
e = e.reshape(1, 3, 3) 
array_info(e)

Array:
[[[1 2 3]
  [4 5 6]
  [7 8 9]]]
Data type:	int64
Array shape:	(1, 3, 3)
Array Dim:	3



In [None]:
print('Squeeze along axis=0:')
e_squeezed = np.squeeze(e, axis=0)
array_info(e_squeezed)

Squeeze along axis=0:
Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int64
Array shape:	(3, 3)
Array Dim:	2



In [None]:
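import traceback

# np.squeeze can only remove axes of length one; axis=1 has length 3.
print('Squeeze along axis=1, should get ValueError')
try:
    e_squeezed_error = np.squeeze(e, axis=1)
except ValueError:
    # Do not bind the exception to `e`: that would overwrite the array
    # and leave `e` undefined once the except block finishes.
    traceback.print_exc()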

Squeeze along axis=1, should get ValueError


Traceback (most recent call last):
  File "/var/folders/__/9cnms5ms0gz3wk9_cz4ctxqr0000gn/T/ipykernel_6590/685120187.py", line 6, in <module>
    e_squeezed_error = np.squeeze(e, axis=1)
  File "<__array_function__ internals>", line 5, in squeeze
  File "/Users/billk/opt/dl-tf-keras-env/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 1495, in squeeze
    return squeeze(axis=axis)
ValueError: cannot select an axis to squeeze out which has size not equal to one
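
When `axis` is left at its default of `None`, `np.squeeze` removes every axis of length one and leaves the others untouched, so no error is raised. A minimal sketch, using a hypothetical array `s`:

```python
import numpy as np

s = np.arange(3).reshape(1, 3, 1)

squeezed_all = np.squeeze(s)            # drops both length-one axes
squeezed_first = np.squeeze(s, axis=0)  # drops only the leading axis

print(squeezed_all.shape)
print(squeezed_first.shape)
```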


### 1.4 Reshape Revisit

As we showed previously, we can pass `-1` to `reshape` for one of the array axes instead of specifying its length. The examples below show how we can use this to create a specific number of rows or columns without regard to the other dimension, as long as the number specified is a factor of the original size of the array.

In [None]:
f = np.arange(10)
array_info(f)

print('Reshape the array as two columns.')
f = f.reshape(-1, 2)
array_info(f)

Array:
[0 1 2 3 4 5 6 7 8 9]
Data type:	int64
Array shape:	(10,)
Array Dim:	1

Reshape the array as two columns.
Array:
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
Data type:	int64
Array shape:	(5, 2)
Array Dim:	2



In [None]:
g = np.arange(10)
array_info(g)

print('Reshape the array as 2 rows.')
g = g.reshape(2, -1)
array_info(g)

Array:
[0 1 2 3 4 5 6 7 8 9]
Data type:	int64
Array shape:	(10,)
Array Dim:	1

Reshape the array as 2 rows.
Array:
[[0 1 2 3 4]
 [5 6 7 8 9]]
Data type:	int64
Array shape:	(2, 5)
Array Dim:	2
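
The inferred dimension is simply the total size divided by the product of the specified dimensions. The sketch below (names are illustrative) checks that relationship and shows that the reshape fails when the specified dimension does not divide the array size evenly.

```python
import numpy as np

f = np.arange(10)

# -1 is replaced by f.size // 5 == 2.
two_by_five = f.reshape(-1, 5)
print(two_by_five.shape)

# 3 is not a factor of 10, so no valid row count exists.
try:
    f.reshape(-1, 3)
    inference_failed = False
except ValueError as err:
    inference_failed = True
    print('ValueError:', err)
```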



### <font style="color:rgb(50,120,230)">Flattening Arrays</font>
When images are processed by a Convolutional Neural Network (CNN), the shape of the image data is often represented as a four-dimensional array (also referred to as a tensor) which includes the image height and width, the number of channels, and the number of images in the 'batch' of images to be processed. For the sake of simplicity, in this example, we will ignore the dimension associated with the image channel. In the code cell below, we construct a multi-dimensional array that notionally represents the data associated with three separate images being processed in a CNN. 

In [None]:
# Create three separate 4x4 arrays.
h1 = np.full((4, 4), 1, dtype='float32')
h2 = np.full((4, 4), 2, dtype='float32')
h3 = np.full((4, 4), 3, dtype='float32')

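# Create a 3x4x4 array. np.zeros defaults to float64, so the float32
# values are converted to float64 when they are assigned below.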
h = np.zeros((3, 4, 4))
h[0] = h1
h[1] = h2
h[2] = h3

array_info(h)

Array:
[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]

 [[3. 3. 3. 3.]
  [3. 3. 3. 3.]
  [3. 3. 3. 3.]
  [3. 3. 3. 3.]]]
Data type:	float64
Array shape:	(3, 4, 4)
Array Dim:	3



In [None]:
# Flatten each 4x4 array into a row of 16 values and add a leading axis.
h = h.reshape(1, 3, 16)
array_info(h)

Array:
[[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
  [2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
  [3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3.]]]
Data type:	float64
Array shape:	(1, 3, 16)
Array Dim:	3
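
In practice the batch size is often kept as the first axis and `-1` is used for the rest, so each image becomes one row. The sketch below is an assumption about a typical workflow rather than part of the original notebook; it also shows `ravel`, which flattens the whole array to 1D.

```python
import numpy as np

# A notional batch of three 4x4 single-channel images.
batch = np.array([np.full((4, 4), v, dtype='float32') for v in (1, 2, 3)])

# Keep the batch axis and flatten each image into a 16-element row.
per_image = batch.reshape(batch.shape[0], -1)

# Flatten everything into a single 1D array.
flat = batch.ravel()

print(per_image.shape)
print(flat.shape)
```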



## 2 Combining Arrays / Matrices

Combining two or more arrays is a frequent operation in machine learning. Let's take a look at a few methods for doing this. 


### 2.1 Concatenate

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

``` python
np.concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")
```

Join a sequence of arrays along an existing axis.

Documentation: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html" target=_blank>np.concatenate</a>

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

In [None]:
a1 = np.array([[1, 2, 3], [4, 5, 6]])
a2 = np.array([[7, 8, 9]])

array_info(a1)
array_info(a2)

print('Concatenate along axis zero:')
array = np.concatenate((a1, a2), axis=0)
array_info(array)

Array:
[[1 2 3]
 [4 5 6]]
Data type:	int64
Array shape:	(2, 3)
Array Dim:	2

Array:
[[7 8 9]]
Data type:	int64
Array shape:	(1, 3)
Array Dim:	2

Concatenate along axis zero:
Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int64
Array shape:	(3, 3)
Array Dim:	2
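
Concatenation along `axis=1` works the same way, provided the arrays agree on every other axis. The sketch below uses a hypothetical column array `col` and also shows the error raised when the shapes do not line up.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
col = np.array([[10], [20]])           # shape (2, 1)

# Both arrays have 2 rows, so they can be joined along axis 1.
wide = np.concatenate((a1, col), axis=1)
print(wide)

# Joining along axis 0 would require matching column counts (3 vs 1).
try:
    np.concatenate((a1, col), axis=0)
    shape_error = False
except ValueError as err:
    shape_error = True
    print('ValueError:', err)
```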



### 2.2 Horizontal Stacking: `hstack`

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

``` python
np.hstack(tup)
```

Stack arrays in sequence horizontally (column-wise).

Documentation: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html#numpy.hstack" target=_blank>np.hstack</a>

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

In [None]:
a1 = np.array([[1, 1, 1], [2, 2, 2]])
a2 = np.array([[3, 3, 3, 3], [4, 4, 4, 4]])

array_info(a1)
array_info(a2)

a_hstacked = np.hstack((a1, a2))

print('Horizontal stack:')
array_info(a_hstacked)

Array:
[[1 1 1]
 [2 2 2]]
Data type:	int64
Array shape:	(2, 3)
Array Dim:	2

Array:
[[3 3 3 3]
 [4 4 4 4]]
Data type:	int64
Array shape:	(2, 4)
Array Dim:	2

Horizontal stack:
Array:
[[1 1 1 3 3 3 3]
 [2 2 2 4 4 4 4]]
Data type:	int64
Array shape:	(2, 7)
Array Dim:	2



In [None]:
b1 = np.array([[1],[2],[3]])
b2 = np.array([[4],[5],[6]])

array_info(b1)
array_info(b2)

b_hstacked = np.hstack((b1, b2))

print('Horizontal stack:')
array_info(b_hstacked)

Array:
[[1]
 [2]
 [3]]
Data type:	int64
Array shape:	(3, 1)
Array Dim:	2

Array:
[[4]
 [5]
 [6]]
Data type:	int64
Array shape:	(3, 1)
Array Dim:	2

Horizontal stack:
Array:
[[1 4]
 [2 5]
 [3 6]]
Data type:	int64
Array shape:	(3, 2)
Array Dim:	2
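
For arrays with two or more dimensions, `np.hstack` behaves like `np.concatenate` along `axis=1`; for 1D arrays it joins along the only axis available. A short sketch with illustrative names:

```python
import numpy as np

b1 = np.array([[1], [2], [3]])
b2 = np.array([[4], [5], [6]])

# 2D inputs: hstack matches concatenate along axis 1.
same_2d = np.array_equal(np.hstack((b1, b2)),
                         np.concatenate((b1, b2), axis=1))

# 1D inputs: hstack simply appends one array to the other.
v = np.hstack((np.array([1, 2, 3]), np.array([4, 5, 6])))

print(same_2d)
print(v)
```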



### 2.3 Vertical Stacking: `vstack`

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

``` python
np.vstack(tup)
```

Stack arrays in sequence vertically (row-wise).


Documentation: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.vstack.html#numpy.vstack" target=_blank>np.vstack</a>

<hr style="border:none; height: 4px; background-color:#D3D3D3" />

In [None]:
a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])

array_info(a1)
array_info(a2)

a_vstacked = np.vstack((a1, a2))

print('Vertical stack:')
array_info(a_vstacked)

Array:
[1 2 3]
Data type:	int64
Array shape:	(3,)
Array Dim:	1

Array:
[4 5 6]
Data type:	int64
Array shape:	(3,)
Array Dim:	1

Vertical stack:
Array:
[[1 2 3]
 [4 5 6]]
Data type:	int64
Array shape:	(2, 3)
Array Dim:	2



In [None]:
# Create three separate 1x4x4 arrays.
h1 = np.full((1, 4, 4), 1, dtype='float32')
h2 = np.full((1, 4, 4), 2, dtype='float32')
h3 = np.full((1, 4, 4), 3, dtype='float32')

h = np.vstack((h1, h2, h3))
array_info(h)

Array:
[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]

 [[3. 3. 3. 3.]
  [3. 3. 3. 3.]
  [3. 3. 3. 3.]
  [3. 3. 3. 3.]]]
Data type:	float32
Array shape:	(3, 4, 4)
Array Dim:	3
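
`np.vstack` treats 1D inputs of shape `(N,)` as rows of shape `(1, N)` before joining them along the first axis. The sketch below reproduces that behaviour by hand with `np.newaxis` and `np.concatenate`, which may help connect this section to the earlier material on adding axes.

```python
import numpy as np

a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])

stacked = np.vstack((a1, a2))

# Promote each 1D array to a (1, 3) row, then join along axis 0.
manual = np.concatenate((a1[np.newaxis, :], a2[np.newaxis, :]), axis=0)

print(stacked.shape)
print(np.array_equal(stacked, manual))
```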

