
# <center>THE NUMPY LIBRARY</center>
<center><b>Copyright &copy; 2023 by DR DANNY POO</b><br> e:dannypoo@nus.edu.sg<br> w:drdannypoo.com</center><br>

The array is a data structure commonly used in Data Science and is essential for scientific computing, yet core Python has no built-in multi-dimensional array type. Python programmers who need an array often fall back on a list, which is clearly not an ideal solution.

In an attempt to provide for arrays, the standard Python library has a class for arrays – the “array.array” class in the “array” module. This class creates a 1-dimensional array object with limited functionalities for managing arrays. 

As Python gained popularity among developers, it became clear that more must be done to enhance the Python library to support scientific computing to include numerical calculations and arrays. This resulted in the birth of the Numpy library.

# 1. Getting the Numpy Library
To begin, the Numpy library must be installed. If you already have Python installed, you can get Numpy by entering
```python
conda install numpy
```
or
```python
pip install numpy
```
in the Command Prompt.

To use Numpy and all of its functionality, you need to import the package. The Numpy package is usually imported as ``np``:

```python
import numpy as np
```

In [1]:
# To check the version of Numpy
import numpy as np
np.__version__ # double underscore “version” double underscore

'1.24.3'

# 2. ndarray
The central data structure of Numpy is the ndarray (i.e. N-dimensional array where N can be 0). An ``ndarray`` is a contiguous storage of multi-dimensional, homogeneous data elements of the same size. It can be indexed by a tuple of non-negative integers, by booleans, by another array, or by integers. Each element is referenced via the index. 

## 2.1 ndarray Attributes
Attributes reflect information or properties about a data structure. The ndarray has a number of attributes that describe the array. Any reference to the use of the word “array” will refer to the Numpy ``ndarray``. 

The Numpy ndarray has the following attributes:
![image-2.png](attachment:image-2.png)

In [2]:
# Create an ndarray
a = np.array([2, 4, 6, 8, 10])
type(a)

numpy.ndarray

### ndim
The attribute ``ndim`` shows the rank or number of dimensions of an ndarray.

In Numpy, dimensions are known as axes. Therefore, ndim shows the number of axes (or dimensions) of the array. The array “a” has one axis and the axis has 5 elements in it.

In [3]:
# Number of dimensions of an ndarray
a.ndim

1

In [4]:
# Array “b” is 2-dimensional and it has two axes. 
# The first axis has a length of 2, the second axis has a length of 3. 
b = np.array([[2, 4, 6],
              [8, 10, 12]])
b.ndim

2

### shape
The attribute ``shape`` shows the dimensions of the array. It is a tuple of integers indicating the size of the array in each dimension. 

For a 2-dimensional array (also known as a matrix) with “n” rows and “m” columns, shape will be (n, m). For array “b”, there are 2 rows and 3 columns.

For array “a” which is 1-dimensional, there is only one axis of length 5 (you may think of it as having a row and no column) as shown by the shape attribute.

In [5]:
# Shows dimensions of the array
a.shape

(5,)

In [6]:
# Array “c” is 3-dimensional and it has three axes. 
# The first axis has a length of 2, the second axis has a length of 4, and the third axis has a length of 3.
c = np.array([[[2, 4, 6],
               [8, 10, 12],
               [14, 16, 18],
               [20, 22, 24]],
              [[26, 28, 30],
               [32, 34, 36],
               [38, 40, 42],
               [44, 46, 48]]])
print(c.ndim)
print(c.shape)

3
(2, 4, 3)


In [7]:
# Array "d" has 0, 1, 2, 3 dimension
d = np.array(2)
print("d has 0 dimension")
print(d)
print(d.ndim)
print(d.shape)

d = np.array([2])
print("d has 1 dimension")
print(d)
print(d.ndim)
print(d.shape)

d = np.array([[2, 4, 6],
              [8, 10, 12]])
print("d has 2 dimensions")
print(d)
print(d.ndim)
print(d.shape)

d = np.array([[[2, 4, 6],
               [8, 10, 12],
               [14, 16, 18],
               [20, 22, 24]],
              [[26, 28, 30],
               [32, 34, 36],
               [38, 40, 42],
               [44, 46, 48]]])
print("d has 3 dimensions")
print(d)
print(d.ndim)
print(d.shape)

d has 0 dimension
2
0
()
d has 1 dimension
[2]
1
(1,)
d has 2 dimensions
[[ 2  4  6]
 [ 8 10 12]]
2
(2, 3)
d has 3 dimensions
[[[ 2  4  6]
  [ 8 10 12]
  [14 16 18]
  [20 22 24]]

 [[26 28 30]
  [32 34 36]
  [38 40 42]
  [44 46 48]]]
3
(2, 4, 3)


### size
The attribute ``size`` returns the total number of elements of the array. This is equivalent to the product of the elements of shape attribute. 

In [8]:
# What is the size of array "d"?
d.size

24
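
As a small check of the relationship described above, the sketch below compares ``size`` with the product of the values in ``shape``. The array ``arr`` is a hypothetical example introduced only for this illustration.

```python
import numpy as np

# A hypothetical 3-dimensional array used only for this illustration
arr = np.zeros((2, 4, 3))

# size should equal the product of the axis lengths held in shape
product_of_shape = int(np.prod(arr.shape))

print(arr.shape)          # (2, 4, 3)
print(arr.size)           # 24
print(product_of_shape)   # 24
```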

### dtype
The elements of an ndarray can take on standard Python types as well as Numpy-defined types such as numpy.int32, numpy.int16 and numpy.float64. <br>
The element type is reported by the ``dtype`` attribute and can be specified with the ``dtype`` argument when the array is created.

In [9]:
# What is the element type of array "d"?
d = np.array([[[2, 4, 6],
               [8, 10, 12],
               [14, 16, 18],
               [20, 22, 24]],
              [[26, 28, 30],
               [32, 34, 36],
               [38, 40, 42],
               [44, 46, 48]]])
d.dtype

dtype('int32')

In [10]:
# What is the element type of array "e"?
e = np.array([1.5, 3.8, 5.9])
e.dtype

dtype('float64')

In [11]:
# What is the element type of array "f"?
f = np.array(['tom', 'eve', 'jim']) # Unicode, 3 characters
f.dtype

dtype('<U3')

In [12]:
# What is the element type of array "f"?
f = np.array(['tom', 'eve', 'jimmy']) # Unicode, 5 characters
f.dtype

dtype('<U5')
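
The cells above only read the element type. The sketch below, using hypothetical arrays ``small_ints`` and ``as_floats``, shows one way to request a type at creation and to convert afterwards with ``astype()``. Note that the default integer type is platform dependent, so the ``int32`` seen in this notebook may appear as ``int64`` on another system.

```python
import numpy as np

# Request a specific element type when the array is created
small_ints = np.array([2, 4, 6], dtype=np.int16)
print(small_ints.dtype)      # int16

# astype() returns a new array converted to the requested type
as_floats = small_ints.astype(np.float64)
print(as_floats.dtype)       # float64
print(as_floats)             # [2. 4. 6.]
```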

# 3. ndarray Creation
Arrays are created from a regular Python list or tuple using the ``array()`` function. <br>
The type of the resulting array takes on the type of the elements in the sequences.

In [13]:
# Creates array from a list
g = np.array([10, 15, 20]) 
print(g, g.dtype)

[10 15 20] int32


In [14]:
# Creates array from a tuple
h = np.array((10, 15, 20)) 
print(h, h.dtype)

[10 15 20] int32


In [15]:
# Creates array of floating point numbers from a tuple
i = np.array((10.4, 15.0, 20.1)) 
print(i, i.dtype)

[10.4 15.  20.1] float64


<b>A common error in array creation is the omission of the round (for tuple) or square (for list) brackets, leading to multiple arguments being passed to the ``array()`` function. For example,
```python
d = np.array(10, 15, 20)      # incorrect
Out:
TypeError: array() takes from 1 to 2 positional arguments but 3 were given  
```
</b>

In [16]:
# The array() function transforms sequences of sequences into 2-dimensional arrays.
j = np.array([(10.4, 15, 20), (2, 4, 6)]) # 2 rows and 3 columns
print(j, j.dtype, type(j))
print(j.ndim, j.shape)

[[10.4 15.  20. ]
 [ 2.   4.   6. ]] float64 <class 'numpy.ndarray'>
2 (2, 3)


In [17]:
# array() function can transform sequences of sequences of sequences into 3-dimensional arrays, and so on.
k = np.array([([10, 15, 18], [20, 25, 30]), ([2, 4, 5], [6, 7, 8])])
print(k, k.dtype, type(k))
print(k.ndim, k.shape)

[[[10 15 18]
  [20 25 30]]

 [[ 2  4  5]
  [ 6  7  8]]] int32 <class 'numpy.ndarray'>
3 (2, 2, 3)


## 3.1 Functions for Array Creation with Default Values
<b>Numpy has several functions that enable the creation of arrays with default values. The function 
* ``zeros()`` creates an array full of zeros, 
* ``ones()`` creates an array full of ones, and 
* ``empty()`` creates an array without initializing its elements, so its initial content is arbitrary and depends on the state of the memory. 
</b>

### zeros() Function

In [18]:
# zeros(): default float64 elements
l = np.zeros([5, 7]) 
print(l, l.dtype, type(l))
print(l.ndim, l.shape)

[[0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0.]] float64 <class 'numpy.ndarray'>
2 (5, 7)


### ones() Function

In [19]:
# ones(): dtype specified
m = np.ones([3, 4], dtype="int64") 
print(m, m.dtype, type(m))
print(m.ndim, m.shape)

[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]] int64 <class 'numpy.ndarray'>
2 (3, 4)


### empty() Function

In [20]:
# empty(): uninitialized
n = np.empty((3, 3))  
print(n, n.dtype, type(n))
print(n.ndim, n.shape)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]] float64 <class 'numpy.ndarray'>
2 (3, 3)


## 3.2 arange() Function
Numpy provides `arange()` function similar to the Python built-in `range()` function. This function creates a sequence of numbers in the resulting array.

In [21]:
# arange(start, stop, step, dtype=None)
o = np.arange(5, 50, 7) 
print(o, o.dtype, type(o))
print(o.ndim, o.shape)

[ 5 12 19 26 33 40 47] int32 <class 'numpy.ndarray'>
1 (7,)


In [22]:
# Can also include float argument
p = np.arange(5, 10, 0.7)  # arange(start, stop, step)
print(p, p.dtype, type(p))
print(p.ndim, p.shape)

[5.  5.7 6.4 7.1 7.8 8.5 9.2 9.9] float64 <class 'numpy.ndarray'>
1 (8,)


Using a floating point argument as the step can be problematic because, 
due to finite floating point precision, it is hard to predict exactly how many elements will be produced. 
To address this issue, Numpy provides a ``linspace()`` function that takes the number of elements to produce instead of the step.
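
A minimal sketch of ``linspace()`` follows. The start, stop and count values are arbitrary choices for illustration, and ``points`` is a hypothetical name.

```python
import numpy as np

# linspace(start, stop, num) returns exactly num evenly spaced values;
# unlike arange(), the stop value is included by default
points = np.linspace(5, 10, 6)

print(points)        # [ 5.  6.  7.  8.  9. 10.]
print(points.size)   # 6
```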

### Using the arange() Function with reshape() Function
The `arange()` function can be combined with the `reshape()` function to transform a 1-dimensional array into a 2-dimensional array. The `reshape()` function divides a linear array into different parts in the manner specified by the shape argument.

In [23]:
# Using reshape() function
q = np.arange(1, 7).reshape(2,3) # the size of the resulting array is 6 elements
print(q, q.dtype, type(q))
print(q.ndim, q.shape)

[[1 2 3]
 [4 5 6]] int32 <class 'numpy.ndarray'>
2 (2, 3)


## 3.3 random() Function
To create a multi-dimensional array of random values, use the `random()` function of the `numpy.random` module. The values are drawn uniformly from the interval [0, 1).

In [24]:
# Using random() function to create array of random values
r = np.random.random((3,4)) # pass in the size of the multi-dimensional array as an argument
print(r, r.dtype, type(r))
print(r.ndim, r.shape)

[[0.34337415 0.58976537 0.82404677 0.67339857]
 [0.72275153 0.44780963 0.86959103 0.5626754 ]
 [0.19194153 0.39908725 0.28109    0.61887257]] float64 <class 'numpy.ndarray'>
2 (3, 4)


# 4. Basic Operations on ndarray
Basic operations on arrays can be performed using:
<ol>
<li>Arithmetic operators</li>
<li>Matrix product</li>
<li>+=, -=, *=, //=, %= operators</li>
<li>Universal functions (ufunc)</li>
<li>Aggregate functions</li>
</ol>

## 4.1 Arithmetic Operators
An arithmetic operator can be applied directly onto an array. 

In [25]:
# Creates a 1-dimensional array
s = np.array([2, 4, 6, 8, 10]) 
print(s, s.dtype, type(s))
print(s.ndim, s.shape)

[ 2  4  6  8 10] int32 <class 'numpy.ndarray'>
1 (5,)


In [26]:
# Adds 10 (a scalar value) to the elements in the array
s = s + 10
print(s, s.dtype, type(s))
print(s.ndim, s.shape)

[12 14 16 18 20] int32 <class 'numpy.ndarray'>
1 (5,)


Other arithmetic operators such as subtraction, multiplication and division can likewise be performed on array elements.

In [27]:
# These arithmetic operators can also be applied between two arrays. 
# The operations are carried out element-wise 
# i.e. the operators are applied to the corresponding elements in the two arrays.
# The shapes of the two arrays must be the same (or compatible under Numpy's broadcasting rules). 
t = np.array([1, 3, 5, 7, 9])
s = s + t
print(s, s.dtype, type(s))
print(s.ndim, s.shape)

[13 17 21 25 29] int32 <class 'numpy.ndarray'>
1 (5,)


In [28]:
# These arithmetic operators can also be applied to multi-dimensional arrays. 
# As before, the arithmetic operators operate element-wise.
u = np.array([[2, 4, 6],                 # create a 2-dimensional array of 4 rows and 3 columns
              [8, 10, 12],
              [14, 16, 18],
              [20, 22, 24]])
v = np.linspace(1, 10, 12).reshape(4,3)  # create a 2-dimensional array of 4 rows and 3 columns
u = u * v
print(u, u.dtype, type(u))
print(u.ndim, u.shape)

[[  2.           7.27272727  15.81818182]
 [ 27.63636364  42.72727273  61.09090909]
 [ 82.72727273 107.63636364 135.81818182]
 [167.27272727 202.         240.        ]] float64 <class 'numpy.ndarray'>
2 (4, 3)


In [29]:
# Can also multiply the array “u” by the sine of array “v”.
u = np.array([[2, 4, 6],                 # create a 2-dimensional array of 4 rows and 3 columns
              [8, 10, 12],
              [14, 16, 18],
              [20, 22, 24]])
v = np.linspace(1, 10, 12).reshape(4,3)  # create a 2-dimensional array of 4 rows and 3 columns
u = u * np.sin(v)                        # multiply by the sine of elements of array “v”
print(u, u.dtype, type(u))
print(u.ndim, u.shape)

[[  1.68294197   3.8782238    2.90404715]
 [ -2.46295483  -9.04895704 -11.15055048]
 [ -5.11601593   6.87414395  17.1500735 ]
 [ 17.45826017   5.29268359 -13.05650666]] float64 <class 'numpy.ndarray'>
2 (4, 3)


In [30]:
# Can also multiply the array “u” by square root of the elements of array “v”.
u = np.array([[2, 4, 6],                 # create a 2-dimensional array of 4 rows and 3 columns
              [8, 10, 12],
              [14, 16, 18],
              [20, 22, 24]])
v = np.linspace(1, 10, 12).reshape(4,3)  # create a 2-dimensional array of 4 rows and 3 columns
u = u * np.sqrt(v)                       # multiply by the square root of elements of array “v”
print(u, u.dtype, type(u))
print(u.ndim, u.shape)

[[ 2.          5.3935989   9.74212969]
 [14.86912604 20.67057637 27.07565159]
 [34.03207044 41.49917852 49.44418341]
 [57.83990444 66.66333325 75.89466384]] float64 <class 'numpy.ndarray'>
2 (4, 3)


## 4.2 Matrix Product
The matrix product takes two matrices and produces a third matrix in which each element is the sum of the products of the elements of the corresponding row of the first matrix with the elements of the corresponding column of the second matrix. Let us consider an example with two 3 by 3 matrices, A and B. 

There are a few points about matrix product worth mentioning here:

* Conventionally,  ``*`` is understood to be the symbol for matrix product when it is applied to two matrices. Numpy, however, uses ``*`` for element-wise multiplication and the ``dot()`` function for matrix product.
* The number of columns of the first matrix must equal the number of rows of the second matrix. See the example that follows below.
* For 2-dimensional arrays, ``A.dot(B)`` and ``A @ B`` are alternative ways of writing ``np.dot(A, B)``.
* Since matrix product is not a commutative operation, the order of the operands is important. In general, ``A.dot(B)`` is not equal to ``B.dot(A)``.

    
![image.png](attachment:image.png)

In [31]:
# Matrix product of A and B arrays
A = np.array([[2, 4, 6],
              [8, 10, 12],
              [14, 16, 18]])
B = np.array([[1, 3, 5],
              [7, 9, 11],
              [13, 15, 17]])
A = np.dot(A, B)  # alternatively, A.dot(B)
print(A, A.dtype, type(A))
print(A.ndim, A.shape)

[[108 132 156]
 [234 294 354]
 [360 456 552]] int32 <class 'numpy.ndarray'>
2 (3, 3)
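
To make the difference between ``*`` and the matrix product concrete, here is a small sketch. The matrices ``X`` and ``Y`` are hypothetical and were chosen only because the results are easy to verify by hand.

```python
import numpy as np

# Two small hypothetical matrices chosen for this illustration
X = np.array([[1, 2],
              [3, 4]])
Y = np.array([[0, 1],
              [1, 0]])

print(X * Y)   # element-wise product: [[0 2] [3 0]]
print(X @ Y)   # matrix product, same as np.dot(X, Y): [[2 1] [4 3]]
print(Y @ X)   # operands swapped: [[3 4] [1 2]]

print(np.array_equal(X @ Y, Y @ X))   # False
```

Swapping the operands changes the result here, which is the non-commutativity mentioned in the list of points.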


**The number of columns of matrix C must equal the number of rows of matrix D**<br>
Consider two matrices, C and D, each with 2 rows and 3 columns. <br>
When we apply the matrix product to C and D, a ValueError is reported. <br>
This is because each row of C has three elements while each column of D has only two.
```python
C = np.array([[2, 4, 6],
              [8, 10, 12]])
D = np.array([[1, 3, 5],
              [7, 9, 11]])
C.dot(D)
Out:
ValueError: shapes (2,3) and (2,3) not aligned: 3 (dim 1) != 2 (dim 0)
```

In [32]:
# Redefine matrix D to be 3 rows and 2 columns, matrix product on C and D takes place.
C = np.array([[2, 4, 6],
              [8, 10, 12]])
D = np.array([[1, 3],
              [5, 7],
              [9, 11]])
print(C@D)             # same as C.dot(D)

[[ 76 100]
 [166 226]]


## 4.3 +=, -=, *=, //=, %= Operators
These operators (“+=”, “-=”, “*=”, “//=”, and “%=”) combine the operand on the left side with the operand on the right side and assign the resulting value back to the operand on the left side. 
All the above operations act in place to modify an existing array rather than create a new one.

In [33]:
E = np.array([2, 4, 6, 8])

E += 3                        # adds 3 to each element
print(E)

E -= 3                        # subtracts 3 from each element
print(E)

E *= 3                        # multiplies each element by 3
print(E)

E //= 3                       # floor-divides each element by 3, keeping the quotient
print(E)

E %= 3                        # keeps the remainder of each element's division by 3
print(E)
print(E)

[ 5  7  9 11]
[2 4 6 8]
[ 6 12 18 24]
[2 4 6 8]
[2 1 0 2]
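
The in-place behaviour can be observed through a second name bound to the same array. The sketch below uses hypothetical names ``values`` and ``alias``. It contrasts ``+=`` with the ordinary ``+`` operator, which builds a new array.

```python
import numpy as np

values = np.array([2, 4, 6, 8])
alias = values            # a second name for the same array object

values += 3               # in place: the existing array is modified
print(alias)              # [ 5  7  9 11]

values = values + 3       # not in place: a new array is created and bound to "values"
print(values)             # [ 8 10 12 14]
print(alias)              # [ 5  7  9 11]
print(values is alias)    # False
```

Because the result is written back into the existing array, it must fit the array's element type. This is why true division (``/=``) is not in the list above: on an integer array it raises a casting error, since the floating point results cannot be stored in place.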


## 4.4	Universal Functions (ufunc)
A universal function acts on the elements of an input array to produce a corresponding result in a new output array. The size of the output array is the same as the input array. 

Commonly known as ufunc, some mathematical and trigonometric operations are implemented in Numpy as universal functions e.g. sqrt(), log(), sin(), cos(), exp(), add(), etc. These operations act on the input array element-wise.

In [34]:
F = np.array([1, 3, 5, 7])

print(np.sqrt(F))   # square root of array F e.g. √3 = 1.73205081
print(np.log(F))    # natural logarithm (base e) of array F e.g. ln(3) = 1.09861229
print(np.sin(F))    # sine of array F (in radians) e.g. sin(3) = 0.14112001
print(np.exp(F))    # exponential of array F e.g. e**3 = 20.08553692

G = np.array([2.0, 4.4, 6.2, 8.8])

print(np.add(F, G)) # addition of arrays F and G e.g. 3 + 4.4 = 7.4
print(np.cos(F))    # cosine of array F (in radians) e.g. cos(3) = -0.9899925


[1.         1.73205081 2.23606798 2.64575131]
[0.         1.09861229 1.60943791 1.94591015]
[ 0.84147098  0.14112001 -0.95892427  0.6569866 ]
[   2.71828183   20.08553692  148.4131591  1096.63315843]
[ 3.   7.4 11.2 15.8]
[ 0.54030231 -0.9899925   0.28366219  0.75390225]


## 4.5	Aggregate Functions
An aggregate function performs an operation on all the elements in an array and produces a single result. A number of such functions are implemented in Numpy as methods of the ``ndarray`` class, including sum(), min(), max(), mean() and std(). The related cumsum() method returns the running totals as an array rather than a single value.

In [35]:
H = np.array([34.0, 6.6, 68.9, 20.8, 50.5, 1.5])

print(H.sum())     # sum of the values
print(H.min())     # the minimum value in the array
print(H.max())     # the maximum value in the array
print(H.mean())    # the mean of all values in the array
print(H.std())     # the standard deviation of all values in the array

182.3
1.5
68.9
30.383333333333336
23.780343189748592


In [36]:
# The operations apply to the array as though it were a list of numbers, regardless of its shape.
I = np.array([[30.0, 20.5, 48.2], 
              [1.45, 90.2, 45.5],
              [9.25, 60.3, 33.3]])

print(I.sum())
print(I.min())
print(I.max())
print(I.mean())
print(I.std())

338.7
1.45
90.2
37.63333333333333
25.681846160707025


### Axis Parameter
When the axis parameter is specified, the operation is performed on the specified axis (i.e. row or column) of the array. 

In a 2-dimensional Numpy array, the axes are the directions along the rows (axis 0) and columns (axis 1) as shown.
![image.png](attachment:image.png)

Axis 0 runs vertically down the rows, while axis 1 runs horizontally across the columns.


In [37]:
# Using axis 0 and 1
J = np.arange(18).reshape(6,3)

print(J)
print(J.sum(axis=0))    # summing down the columns i.e. sums of each column
print(J.sum(axis=1))    # summing across the columns i.e. sums of each row
print(J.min(axis=0))    # minimum value of each column
print(J.min(axis=1))    # minimum value of each row
print(J.cumsum(axis=0)) # cumulative sum along each column

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]
 [15 16 17]]
[45 51 57]
[ 3 12 21 30 39 48]
[0 1 2]
[ 0  3  6  9 12 15]
[[ 0  1  2]
 [ 3  5  7]
 [ 9 12 15]
 [18 22 26]
 [30 35 40]
 [45 51 57]]
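
One way to remember the axis parameter is that the axis you name is the one that disappears from the shape of the result. The sketch below uses a hypothetical 2 by 3 array ``table``. The ``keepdims`` argument is shown as an optional extra that keeps the collapsed axis with length 1.

```python
import numpy as np

table = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]], shape (2, 3)

col_sums = table.sum(axis=0)   # axis 0 is collapsed: one sum per column
row_sums = table.sum(axis=1)   # axis 1 is collapsed: one sum per row

print(col_sums, col_sums.shape)   # [3 5 7] (3,)
print(row_sums, row_sums.shape)   # [ 3 12] (2,)

# keepdims=True retains the collapsed axis with length 1
kept = table.sum(axis=1, keepdims=True)
print(kept.shape)                 # (2, 1)
```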


# 5. Indexing, Slicing and Iterating
Elements in an array are indexed. We can therefore select them, manipulate them, change their values or slice them via the index. We can also perform iterations within them.

## 5.1	Indexing an Array
Elements in an array are indexed using square brackets and can then be referred to individually. An integer index is automatically assigned to each element when an array is created. Indexing begins at position zero for the first element and ends at n-1, where n is the number of elements in the array.

In [38]:
K = np.array([2, 4, 6, 8, 10, 12])

print(K[0])
print(K[5])

2
12


In [39]:
# Negative indexes are used to point to the element from the reverse direction, starting from -1
print(K[-1])
print(K[-2])
print(K[-6])

12
10
2


In [40]:
# Array of indexes can be used to select multiple elements all at once
print(K[[1, 2, 5]])

[ 4  6 12]


2-dimensional arrays are made up of rows and columns of elements. These rows and columns are defined by axes, where axis 0 represents the rows and axis 1 represents the columns. The index of each element is a pair of values (i.e. [r, c]) where “r” is the index for row and “c” is the index for column.

In [41]:
L = np.arange(1, 13).reshape(3, 4)

print(L)
print(L[0, 0])
print(L[1, 2])
print(L[2, 3])

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
1
7
12


## 5.2	Slicing an Array
You can extract a portion of an array to form a new sub-array through slicing. Just as you are able to slice a Python list into sub-lists, Numpy allows you to do the same with ndarrays. However, there is a subtle difference between these two approaches. Slicing a Python list generates a copy, whereas slicing a Numpy array returns a view of the same underlying array that was sliced.
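
The difference between a copy and a view shows up as soon as the slice is modified. The following sketch uses hypothetical names (``py_list``, ``arr`` and so on) and also shows ``copy()`` as one way to obtain an independent array when a view is not wanted.

```python
import numpy as np

# Slicing a Python list produces a new list
py_list = [1, 2, 3, 4, 5]
list_slice = py_list[1:4]
list_slice[0] = 99
print(py_list)                 # [1, 2, 3, 4, 5]  (unchanged)

# Slicing an ndarray produces a view onto the same data
arr = np.array([1, 2, 3, 4, 5])
arr_slice = arr[1:4]
arr_slice[0] = 99
print(arr)                     # [ 1 99  3  4  5]  (changed through the view)

# copy() gives an independent array
independent = arr[1:4].copy()
independent[0] = -1
print(arr)                     # [ 1 99  3  4  5]  (unchanged this time)
```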

### 1-Dimensional Arrays
You can slice an ndarray by using a sequence of numbers separated by colons “:” within square brackets.

In [42]:
M = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])   # create a 1-dimensional array

print(M)
print(M[2:8])  # take a slice of the array M
print(M)       # array M remains the same even after a slice has been performed

[ 1  2  3  4  5  6  7  8  9 10]
[3 4 5 6 7 8]
[ 1  2  3  4  5  6  7  8  9 10]


In [43]:
# The slice extracted from an array needs not have elements in a consecutive order. 
# You may skip elements by specifying a third value indicating the interval in the sequence of the elements. 
print(M[2:8:2]) # extract from 3rd to 8th element with an interval of 2 elements in the sequence
print(M)

[3 5 7]
[ 1  2  3  4  5  6  7  8  9 10]


In [44]:
# Numpy allows you to omit the numbers in the index
print(M[::])

[ 1  2  3  4  5  6  7  8  9 10]


In [45]:
# Omitting the first number, Numpy interprets it as 0 (i.e. the first element of the array). 
# Omitting the second number, Numpy interprets it as the maximum index of the array (i.e. the last element of the array). 
# Finally, omitting the last number, Numpy assumes it as 1 (i.e. all the elements are considered without any interval).
print(M[::3]) # slice from the beginning to the end of array with an interval of 3 elements
print(M[:3:]) # slice from the beginning to the third element with an interval of 1 element
print(M[3::]) # slice from the fourth element to the last element with an interval of 1 element

[ 1  4  7 10]
[1 2 3]
[ 4  5  6  7  8  9 10]


### 2-Dimensional Arrays
For 2-dimensional arrays, slicing still applies but you need to define the rows and columns.

In [46]:
N = np.arange(1, 29).reshape(4, 7)

print(N)
print(N[1, 2])       # indexing a specific element in the 2-dimensional array
print(N[0, :])       # extract first row with all columns

print(N[:, 0])       # extract all rows with first column
print(N[0:4, 0:7])   # extract the entire array
print(N[1:3, 2:6])   # slicing a portion of the array

print(N[[0,1,3], 1:5]) # slicing indexes of rows not contiguous, specify array of indexes for row
print(N[[0, 1, 3], 1:5:2]) # array of indexes for row with interval 2
print(N[[0, 1, 3], [1, 3, 5]]) # row 1 column 2 element, row 2 col 4 element and row 4 col 6 element

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14]
 [15 16 17 18 19 20 21]
 [22 23 24 25 26 27 28]]
10
[1 2 3 4 5 6 7]
[ 1  8 15 22]
[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14]
 [15 16 17 18 19 20 21]
 [22 23 24 25 26 27 28]]
[[10 11 12 13]
 [17 18 19 20]]
[[ 2  3  4  5]
 [ 9 10 11 12]
 [23 24 25 26]]
[[ 2  4]
 [ 9 11]
 [23 25]]
[ 2 11 27]


### Using Conditions and Boolean Operators
Besides using numerical values for indexing and slicing arrays, we could use conditions and boolean operators to extract a portion of the array.

In [47]:
N = np.arange(1, 29).reshape(4, 7)

print(N)
print(N < 18)     # which are the elements that have values less than 18?

print(N[N < 18])  # list the elements that are less than 18
print(N[N >= 18]) # list the elements that are greater than or equal to 18

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14]
 [15 16 17 18 19 20 21]
 [22 23 24 25 26 27 28]]
[[ True  True  True  True  True  True  True]
 [ True  True  True  True  True  True  True]
 [ True  True  True False False False False]
 [False False False False False False False]]
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26 27 28]
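
Conditions can also be combined. With arrays, the element-wise operators ``&`` (and), ``|`` (or) and ``~`` (not) are used, and each condition needs its own parentheses. The sketch below uses a hypothetical 3 by 4 array ``grid``.

```python
import numpy as np

grid = np.arange(1, 13).reshape(3, 4)

# Each condition is wrapped in parentheses before combining
print(grid[(grid > 3) & (grid < 9)])     # [4 5 6 7 8]
print(grid[(grid < 3) | (grid > 10)])    # [ 1  2 11 12]
print(grid[~(grid % 2 == 0)])            # [ 1  3  5  7  9 11]
```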


## 5.3	Iterating an Array
The iteration of the elements in an ndarray is achieved by the ``for`` loop.

### 1-Dimensional Arrays

In [48]:
O = np.array([1, 3, 5, 7.5, 9.3])

print(O)
for i in O: print(i)

[1.  3.  5.  7.5 9.3]
1.0
3.0
5.0
7.5
9.3


### 2-Dimensional Arrays
For 2-dimensional arrays, we could use one ``for`` loop to print each row in the matrix. Let us consider the following example of a matrix of strings.

In [49]:
P = np.array([['tom', '1234567', 'engineer'],
              ['eve', '3012876', 'lawyer'],
              ['jim', '8325709', 'doctor']])

print(P)

[['tom' '1234567' 'engineer']
 ['eve' '3012876' 'lawyer']
 ['jim' '8325709' 'doctor']]


In [50]:
# We have here a matrix P of three rows of data.
# We will use a “for” loop to print each row in the matrix. 
for i in P: print(i)

['tom' '1234567' 'engineer']
['eve' '3012876' 'lawyer']
['jim' '8325709' 'doctor']


In [51]:
# To print the individual elements in the array, we could use two “for” loops to print each element in the matrix. 
for i in P:
    for j in i: print(j)

tom
1234567
engineer
eve
3012876
lawyer
jim
8325709
doctor


In [52]:
# Alternatively, we could use the “flat” attribute of the array to do the printing.  
for i in P.flat: print(i)

tom
1234567
engineer
eve
3012876
lawyer
jim
8325709
doctor


Numpy provides the ``apply_along_axis()`` function for applying a function, such as an aggregate function, to the elements of an array row by row or column by column. It takes three arguments: the function, the axis along which to apply it, and the array. If the axis equals 0, the function is evaluated column by column. If the axis equals 1, the evaluation proceeds row by row. One example of an aggregate function is the Numpy ``sum()`` function.

In [53]:
Q = np.arange(1, 29).reshape(7, 4)

print(Q)
print(np.apply_along_axis(np.sum, axis=0, arr=Q))  # summation of elements in columns
print(np.apply_along_axis(np.sum, axis=1, arr=Q))  # summation of elements in rows

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]
 [25 26 27 28]]
[ 91  98 105 112]
[ 10  26  42  58  74  90 106]


In [54]:
# Likewise, we can apply the mean() function on the elements of the axis as follows
print(np.apply_along_axis(np.mean, axis=0, arr=Q)) # the mean of elements column by column
print(np.apply_along_axis(np.mean, axis=1, arr=Q)) # the mean of elements row by row

[13. 14. 15. 16.]
[ 2.5  6.5 10.5 14.5 18.5 22.5 26.5]


In [55]:
# You can also write your own function and apply it along an axis with apply_along_axis(). 
# The function receives one 1-dimensional slice (a column or a row) at a time. Because double()
# works element-wise, the result will be the same whether the iteration is done by column or by row. 
# Strictly speaking, double() is an ordinary Python function rather than a true Numpy ufunc.
def double(element):
    return element * 2

print(Q)
print(np.apply_along_axis(double, axis=0, arr=Q)) # double applied column by column
print(np.apply_along_axis(double, axis=1, arr=Q)) # double applied row by row

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]
 [25 26 27 28]]
[[ 2  4  6  8]
 [10 12 14 16]
 [18 20 22 24]
 [26 28 30 32]
 [34 36 38 40]
 [42 44 46 48]
 [50 52 54 56]]
[[ 2  4  6  8]
 [10 12 14 16]
 [18 20 22 24]
 [26 28 30 32]
 [34 36 38 40]
 [42 44 46 48]
 [50 52 54 56]]
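
The ``double()`` function in the cell above is a plain Python function. As a sketch of one way to obtain a genuine ufunc from it, ``np.frompyfunc()`` can wrap a Python function given its number of inputs and outputs. The names ``double_ufunc`` and ``grid`` are hypothetical, and the result has ``object`` dtype, so it is converted back to integers here.

```python
import numpy as np

def double(element):
    return element * 2

# Wrap the Python function as a ufunc taking 1 input and returning 1 output
double_ufunc = np.frompyfunc(double, 1, 1)

grid = np.arange(1, 7).reshape(2, 3)
result = double_ufunc(grid)              # applied element-wise, any shape

print(isinstance(double_ufunc, np.ufunc))   # True
print(result.dtype)                         # object
print(result.astype(np.int64))              # [[ 2  4  6] [ 8 10 12]]
```

For simple arithmetic like this, writing ``grid * 2`` directly is shorter and faster. The wrapper is mainly of interest when the logic cannot be expressed with existing array operations.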


# 6. Array Shape Manipulation
Numpy provides several functions for changing the shape or orientation of an array, including ``reshape()``, ``ravel()``, ``transpose()`` and ``flip()``.

## 6.1 reshape() Function
We could reshape a 1-dimensional array into a 2-dimensional array using the reshape() function.

In [56]:
R = np.arange(1, 16)
print(R)

# Applying reshape on this array, we get a 2-dimensional array of shape (3, 5):
R1 = R.reshape(3, 5)
print(R1)

# The reshape() function returns a new array object. The original array “R” still has the same shape as before. 
# (The new array usually shares its data with “R” rather than copying it.)
# If you want to modify the shape of the original array (i.e. array “R”), 
# you have to assign a tuple containing the new dimensions to the array’s shape attribute.
print(R)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]]
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
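
Two details of ``reshape()`` are worth a short sketch: one axis length may be given as ``-1`` so that Numpy infers it, and for a contiguous array such as the one below the result is a view that shares data with the original. The names ``base`` and ``grid`` are hypothetical.

```python
import numpy as np

base = np.arange(1, 16)

# -1 asks Numpy to infer the length of that axis (5 here, since 15 / 3 = 5)
grid = base.reshape(3, -1)
print(grid.shape)                      # (3, 5)

# The reshaped array shares its data with the original in this case
print(np.shares_memory(base, grid))    # True
grid[0, 0] = 100
print(base[0])                         # 100
```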


## 6.2 ravel() Function
What if you want to convert a 2-dimensional array into a 1-dimensional array? You can use the ravel() function.

In [57]:
S = np.arange(1, 16) # create a 1-dimensional array
print(S)

# Applying reshape on this array, we get a 2-dimensional array of shape (3, 5)
S1 = S.reshape(3, 5)
print(S1)

S2 = S1.ravel() # back to 1-dimensional
print(S2)

# Alternatively, you could simply change the shape attribute of the array.
S1.shape = (5, 3)
print(S1)

S1.shape = (15,)     # a one-element tuple; (15) without the comma is just the integer 15
print(S1)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]]
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]
 [13 14 15]]
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]


## 6.3 transpose() Function
You can transpose a 2-dimensional array (or a matrix). <br>
That is, interchange its rows and columns. Numpy provides the transpose() function to do just that.

In [58]:
T = np.arange(1, 16) # create a 1-dimensional array
T1 = T.reshape(3, 5) # reshape on this array, we get a 2-dimensional array of shape (3, 5)
print(T1)

T2 = T1.transpose()  # transpose T1 to get T2

print(T2)
print(T1.shape)
print(T2.shape)

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]]
[[ 1  6 11]
 [ 2  7 12]
 [ 3  8 13]
 [ 4  9 14]
 [ 5 10 15]]
(3, 5)
(5, 3)
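
The ``T`` attribute is a common shorthand for ``transpose()`` on a 2-dimensional array. A brief sketch with a hypothetical array ``grid``:

```python
import numpy as np

grid = np.arange(1, 7).reshape(2, 3)

print(grid.T)                                      # [[1 4] [2 5] [3 6]]
print(grid.T.shape)                                # (3, 2)
print(np.array_equal(grid.T, grid.transpose()))    # True
```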


## 6.4 flip() Function
You can reverse, or flip, the contents of an array along an axis. <br>
The Numpy ``flip()`` function takes the array to reverse and, optionally, the axis. When no axis is given, the array is flipped along all of its axes.

In [59]:
numbers = np.arange(1, 16) # create a 1-dimensional array
flipped_numbers = np.flip(numbers)
print(numbers)
print(flipped_numbers)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
[15 14 13 12 11 10  9  8  7  6  5  4  3  2  1]


In [60]:
numbers = np.array([2, 4, 6, 8, 10, 12]) # create a 1-dimensional array
flipped_numbers = np.flip(numbers)
print(numbers)
print(flipped_numbers)

[ 2  4  6  8 10 12]
[12 10  8  6  4  2]


In [61]:
# You can also flip a 2-dimensional array in all of the rows and all of the columns.
U = np.arange(1, 29).reshape(7, 4)
print(U)

flipped_U = np.flip(U)
print(flipped_U)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]
 [25 26 27 28]]
[[28 27 26 25]
 [24 23 22 21]
 [20 19 18 17]
 [16 15 14 13]
 [12 11 10  9]
 [ 8  7  6  5]
 [ 4  3  2  1]]


In [62]:
# By specifying axis = 0 as the second argument, only the order of the rows is reversed.
U = np.arange(1, 29).reshape(7, 4)
print(U)

flipped_rows = np.flip(U, axis=0)
print(flipped_rows)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]
 [25 26 27 28]]
[[25 26 27 28]
 [21 22 23 24]
 [17 18 19 20]
 [13 14 15 16]
 [ 9 10 11 12]
 [ 5  6  7  8]
 [ 1  2  3  4]]


In [63]:
# To reverse only the order of the columns, specify axis = 1.
U = np.arange(1, 29).reshape(7, 4)
print(U)

flipped_columns = np.flip(U, axis=1)
print(flipped_columns)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]
 [25 26 27 28]]
[[ 4  3  2  1]
 [ 8  7  6  5]
 [12 11 10  9]
 [16 15 14 13]
 [20 19 18 17]
 [24 23 22 21]
 [28 27 26 25]]
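
One way to read the ``axis`` argument is to compare ``flip()`` with reversed slicing. The sketch below, which reuses the array ``U`` from the cells above, checks that each call matches a slice with a step of -1. It also checks the shorthand functions ``np.flipud()`` and ``np.fliplr()``.

```python
import numpy as np

U = np.arange(1, 29).reshape(7, 4)

# Reversing the row order is the same as slicing the first axis backwards.
print(np.array_equal(np.flip(U, axis=0), U[::-1]))      # True
# Reversing each row is the same as slicing the second axis backwards.
print(np.array_equal(np.flip(U, axis=1), U[:, ::-1]))   # True
# With no axis given, both axes are reversed.
print(np.array_equal(np.flip(U), U[::-1, ::-1]))        # True

# flipud() and fliplr() are shorthands for axis=0 and axis=1 respectively.
print(np.array_equal(np.flipud(U), np.flip(U, axis=0))) # True
print(np.array_equal(np.fliplr(U), np.flip(U, axis=1))) # True
```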


# 7. Manipulating Arrays
Arrays can be joined with one another to make an enlarged array, or an array can be split into multiple smaller arrays. Functions for joining or splitting arrays are provided in Numpy. 

## 7.1 Joining Arrays
Arrays can be joined or, in Numpy terminology, stacked together along the axes. <br>
You can vertically stack two arrays (using the ``vstack()`` function), which appends the rows of the second array as new rows of the first array; or you can horizontally stack them (using the ``hstack()`` function), which appends the columns of the second array as new columns of the first array.

In [64]:
# Given two arrays V1 and V2
V1 = np.array([[1, 3, 5],
              [7, 9, 11]])
V2 = np.array([[2, 4, 6],
              [8, 10, 12]])
print(V1)
print(V2)

[[ 1  3  5]
 [ 7  9 11]]
[[ 2  4  6]
 [ 8 10 12]]


### vstack() Function

In [65]:
# When the two matrices are vertically stacked, 
# the resulting array grows in the vertical direction.

V3 = np.vstack((V1, V2))
print(V3)

[[ 1  3  5]
 [ 7  9 11]
 [ 2  4  6]
 [ 8 10 12]]


### hstack() Function

In [66]:
V4 = np.hstack((V1, V2))
print(V4)

[[ 1  3  5  2  4  6]
 [ 7  9 11  8 10 12]]
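
For 2-dimensional arrays, both kinds of stacking can also be written with the more general ``np.concatenate()`` function by naming the axis explicitly. The following is a minimal sketch using the same ``V1`` and ``V2``; the names ``rows_joined`` and ``cols_joined`` are made up for this example.

```python
import numpy as np

V1 = np.array([[1, 3, 5], [7, 9, 11]])
V2 = np.array([[2, 4, 6], [8, 10, 12]])

rows_joined = np.concatenate((V1, V2), axis=0)  # same result as np.vstack((V1, V2))
cols_joined = np.concatenate((V1, V2), axis=1)  # same result as np.hstack((V1, V2))

print(rows_joined.shape)  # (4, 3)
print(cols_joined.shape)  # (2, 6)
```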


### column_stack() Function
Besides these functions, there are ``column_stack()`` and ``row_stack()`` functions. <br>
These functions are used with 1-dimensional arrays to form 2-dimensional arrays. <br>
We define three 1-dimensional arrays “W1”, “W2”, and “W3”. <br>

In [67]:
# Using the column_stack() function, the resulting array is a stack of the three 
# 1-dimensional arrays as columns of a 2-dimensional array. 
# Note that the dtype of the resulting array is “float64” even though two of the 
# arrays have an integer dtype (“int32” or “int64”, depending on the platform).
W1 = np.array([1, 3, 5])
W2 = np.array([2, 4, 6])
W3 = np.array([10.5, 13.8, 17.9])

W4 = np.column_stack((W1, W2, W3))
print(W4)

[[ 1.   2.  10.5]
 [ 3.   4.  13.8]
 [ 5.   6.  17.9]]
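
The dtype change can be checked directly. The sketch below prints the dtype of each input and uses ``np.result_type()`` to show the common type Numpy settles on when integers and floats are combined. The exact integer width printed for ``W1`` depends on the platform, so no specific value is assumed for it.

```python
import numpy as np

W1 = np.array([1, 3, 5])
W3 = np.array([10.5, 13.8, 17.9])

print(W1.dtype)                 # an integer type; the exact width depends on the platform
print(W3.dtype)                 # float64
print(np.result_type(W1, W3))   # float64: the common type for both arrays

W4 = np.column_stack((W1, W3))
print(W4.dtype)                 # float64
```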


### row_stack() Function
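We could use the ``row_stack()`` function to stack the three arrays row-wise to form the 2-dimensional array. ``row_stack()`` is an alias of ``vstack()``, so both give the same result; Numpy 2.0 deprecates the alias in favour of ``vstack()``.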

In [68]:
# use row_stack() to row stack W1, W2, W3
W1 = np.array([1, 3, 5])
W2 = np.array([2, 4, 6])
W3 = np.array([10.5, 13.8, 17.9])

W4 = np.row_stack((W1, W2, W3))
print(W4)

# use vstack() to row stack W1, W2, W3
W4 = np.vstack((W1, W2, W3))
print(W4)

[[ 1.   3.   5. ]
 [ 2.   4.   6. ]
 [10.5 13.8 17.9]]
[[ 1.   3.   5. ]
 [ 2.   4.   6. ]
 [10.5 13.8 17.9]]
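
To compare the stacking functions on 1-dimensional inputs, the sketch below stacks the same three arrays as rows, as columns, and end to end. The names ``as_rows``, ``as_columns`` and ``joined`` are made up for this illustration.

```python
import numpy as np

W1 = np.array([1, 3, 5])
W2 = np.array([2, 4, 6])
W3 = np.array([10.5, 13.8, 17.9])

as_rows = np.vstack((W1, W2, W3))           # each array becomes a row
as_columns = np.column_stack((W1, W2, W3))  # each array becomes a column
joined = np.hstack((W1, W2, W3))            # the result stays 1-dimensional

print(as_rows.shape)                          # (3, 3)
print(np.array_equal(as_rows, as_columns.T))  # True: one is the transpose of the other
print(joined.shape)                           # (9,)
```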


## 7.2 Splitting Arrays
Arrays can be split into multiple smaller arrays. <br>
To split the original array horizontally, use the ``hsplit()`` function; to split it vertically, use the ``vsplit()`` function.<br>
<b>Note that hsplit() and vsplit() functions do not change the original array that they operate on.</b>   

### hsplit() Function

In [69]:
X = np.array([[1, 3, 5], [7, 9, 11], [2, 4, 6], [8, 10, 12]])
print(X)

# Split this matrix horizontally into three single-column arrays, each of shape (4, 1), using the hsplit() function
(X1, X2, X3) = np.hsplit(X, 3)

print(X1)
print(X2)
print(X3)
print()
print(X)    # X is not changed

[[ 1  3  5]
 [ 7  9 11]
 [ 2  4  6]
 [ 8 10 12]]
[[1]
 [7]
 [2]
 [8]]
[[ 3]
 [ 9]
 [ 4]
 [10]]
[[ 5]
 [11]
 [ 6]
 [12]]

[[ 1  3  5]
 [ 7  9 11]
 [ 2  4  6]
 [ 8 10 12]]
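
The pieces returned by ``hsplit()`` are still 2-dimensional. The sketch below prints the shape of one piece and flattens it with ``ravel()`` when a true 1-dimensional array is wanted. It also passes a list of column indexes instead of a count, which ``hsplit()`` accepts as well. The names ``first_column``, ``left`` and ``right`` are made up for this example.

```python
import numpy as np

X = np.array([[1, 3, 5], [7, 9, 11], [2, 4, 6], [8, 10, 12]])
X1, X2, X3 = np.hsplit(X, 3)

print(X1.shape)              # (4, 1): a single column, still 2-dimensional
first_column = X1.ravel()
print(first_column)          # [1 7 2 8]

# Splitting at column index 1 gives X[:, :1] and X[:, 1:].
left, right = np.hsplit(X, [1])
print(left.shape, right.shape)   # (4, 1) (4, 2)
```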


### vsplit() Function

In [70]:
X = np.array([[1, 3, 5], [7, 9, 11], [2, 4, 6], [8, 10, 12]])
print(X)
print()

# Split this matrix vertically into four single-row arrays, each of shape (1, 3), using the vsplit() function
(X1, X2, X3, X4) = np.vsplit(X, 4) # the rows must divide equally
print(X1)
print(X2)
print(X3)
print(X4)
print()

# Split this matrix into two 2-dimensional arrays vertically using the vsplit() function
(X1, X2) = np.vsplit(X, 2) # the rows must divide equally
print(X1)
print(X2)

[[ 1  3  5]
 [ 7  9 11]
 [ 2  4  6]
 [ 8 10 12]]

[[1 3 5]]
[[ 7  9 11]]
[[2 4 6]]
[[ 8 10 12]]

[[ 1  3  5]
 [ 7  9 11]]
[[ 2  4  6]
 [ 8 10 12]]
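
The requirement for an equal split can be seen by asking for three parts from four rows, which raises a ``ValueError``. As a sketch of one alternative, ``np.array_split()`` accepts a count that does not divide evenly and returns parts of unequal size. The name ``parts`` is made up for this example.

```python
import numpy as np

X = np.array([[1, 3, 5], [7, 9, 11], [2, 4, 6], [8, 10, 12]])

try:
    np.vsplit(X, 3)                    # 4 rows cannot be divided into 3 equal parts
except ValueError as err:
    print("vsplit failed:", err)

parts = np.array_split(X, 3, axis=0)   # allows parts of unequal size
print([p.shape for p in parts])        # [(2, 3), (1, 3), (1, 3)]
```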


### split() Function
There is a more general function for splitting arrays. This function, ``split()``, allows an array to be split into sub-arrays of unequal size.<br>
The ``split()`` function takes three arguments: the array to be split, the indexes at which to split it, and the axis along which the split takes place. If the axis is 0, the indexes refer to rows; if it is 1, they refer to columns.

In [71]:
# We apply the split() function to matrix Y with indexes [1, 3] along the rows (axis=0).
Y = np.array([[1, 3, 5], [7, 9, 11], [2, 4, 6], [8, 10, 12]])
print(Y)
print()

[Y1, Y2, Y3] = np.split(Y,[1, 3],axis=0)

# The split results in three sub-arrays “Y1”, “Y2”, and “Y3”, formed from the indexes [1, 3]. 
# They are apportioned in this manner: Y[:1], Y[1:3], and Y[3:].
print(Y1) # first row
print(Y2) # second and third rows
print(Y3) # fourth row

[[ 1  3  5]
 [ 7  9 11]
 [ 2  4  6]
 [ 8 10 12]]

[[1 3 5]]
[[ 7  9 11]
 [ 2  4  6]]
[[ 8 10 12]]


In [72]:
# If we want four sub-arrays, then we need three indexes as the argument. 
# The sub-arrays are then apportioned into: Y[:1], Y[1:2], Y[2:3] and Y[3:].
Y = np.array([[1, 3, 5], [7, 9, 11], [2, 4, 6], [8, 10, 12]])
print(Y)
print()

[Y1, Y2, Y3, Y4] = np.split(Y,[1, 2, 3],axis=0)

# The split results in four sub-arrays “Y1”, “Y2”, “Y3”, and “Y4”, formed from the indexes [1, 2, 3]. 
# They are apportioned in this manner: Y[:1], Y[1:2], Y[2:3] and Y[3:].
print(Y1) # first array
print(Y2) # second array
print(Y3) # third array
print(Y4) # fourth array

[[ 1  3  5]
 [ 7  9 11]
 [ 2  4  6]
 [ 8 10 12]]

[[1 3 5]]
[[ 7  9 11]]
[[2 4 6]]
[[ 8 10 12]]
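
The examples above split along the rows. As a sketch of the column case, the same ``split()`` call with ``axis=1`` and the single index ``[1]`` divides ``Y`` into its first column and the remaining two columns. The names ``left`` and ``right`` are made up for this example.

```python
import numpy as np

Y = np.array([[1, 3, 5], [7, 9, 11], [2, 4, 6], [8, 10, 12]])

left, right = np.split(Y, [1], axis=1)   # equivalent to Y[:, :1] and Y[:, 1:]
print(left.shape)    # (4, 1)
print(right.shape)   # (4, 2)
print(right)
```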


## 7.3 Copying Arrays
Assigning an array to another variable in Numpy merely creates a second reference to the same array. Even with two references, there is only one array holding the elements, so a change made through either name is visible through the other.

In [73]:
Z = np.array([2, 4, 6, 8, 10]) # creates an array
Z1 = Z                         # assign array “Z” to “Z1”
print(Z1)                      # “Z1” references the same array as “Z”

# Make a change in the fourth element of array “Z”.
Z[3] = 77

print(Z)
print(Z1) # array “Z1” has changed even though there is no explicit operation on array “Z1”.

[ 2  4  6  8 10]
[ 2  4  6 77 10]
[ 2  4  6 77 10]


In [74]:
# Taking a slice of array “Z” returns a view that shares its data with the original array.
Z2 = Z[1:4]
print(Z)
print(Z2)

Z[2] = 11
print(Z2)
print(Z)

[ 2  4  6 77 10]
[ 4  6 77]
[ 4 11 77]
[ 2  4 11 77 10]
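
The two situations above differ slightly. A plain assignment gives a second name for the same array object, whereas a slice is a new array object that shares the data of the original. The sketch below tells them apart using the ``is`` operator, the ``base`` attribute and ``np.shares_memory()``. The names ``alias`` and ``window`` are made up for this illustration.

```python
import numpy as np

Z = np.array([2, 4, 6, 8, 10])
alias = Z          # a second name for the same array object
window = Z[1:4]    # a slice: a new array object that views the data of Z

print(alias is Z)                    # True
print(window is Z)                   # False
print(window.base is Z)              # True: the slice is built on top of Z
print(np.shares_memory(Z, window))   # True
```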


### copy() Function
If you want an array distinct from the original array, use the ``copy()`` function.

In [75]:
Z = np.arange(1, 20, 2)
print(Z)
print()

Z3 = Z.copy() # make a copy of array “Z”
Z3[5] = 2233  # change 6th element to value 2233

print(Z)
print(Z3)

[ 1  3  5  7  9 11 13 15 17 19]

[ 1  3  5  7  9 11 13 15 17 19]
[   1    3    5    7    9 2233   13   15   17   19]
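
The same approach works for slices. As a minimal sketch, calling ``copy()`` on a slice gives an array that no longer shares data with the original, which ``np.shares_memory()`` confirms. The name ``piece`` is made up for this example.

```python
import numpy as np

Z = np.arange(1, 20, 2)
piece = Z[1:4].copy()     # a copy of the slice, independent of Z

piece[0] = -1             # change the copy only
print(Z[1])                          # 3: the original is untouched
print(np.shares_memory(Z, piece))    # False
```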

