# Chapter 1: NumPy

[**1.1 Numpy**](#1.1-Numpy)   
[**1.2 Numpy ndarray**](#1.2-Numpy-ndarray)   
[**1.3 Creating an array**](#1.3-Creating-an-array)  
[**1.3.1 ndarray constructor**](#1.3.1-ndarray-constructor)  
[**1.3.2 Using routines**](#1.3.2-Using-routines)  
[**1.4 Attributes of ndarray**](#1.4-Attributes-of-ndarray)  
[**1.5 Creating numpy array**](#1.5-Creating-numpy-array)  
[**1.6 Creating arrays using routines**](#1.6-Creating-arrays-using-routines)   
[**1.6.1 Using arange**](#1.6.1-Using-arange)    
[**1.6.2 Creating zeros arrays**](#1.6.2-Creating-zeros-arrays)  
[**1.6.3 Creating an empty array**](#1.6.3-Creating-an-empty-array)  
[**1.6.4 Creating ones array**](#1.6.4-Creating-ones-array)  
[**1.6.5 Creating random numbers**](#1.6.5-Creating-random-numbers)  
[**1.6.6 Creating identity matrix**](#1.6.6-Creating-identity-matrix)  
[**1.6.7 Creating full array**](#1.6.7-Creating-full-array)  
[**1.7 Data Types for ndarrays**](#1.7-Data-Types-for-ndarrays)  
[**1.8 Indexing and Slicing**](#1.8-Indexing-and-Slicing)  
[**1.9 Advance Indexing and Slicing**](#1.9-Advance-Indexing-and-Slicing)  
[**1.10 Boolean Indexing**](#1.10-Boolean-Indexing)  
[**1.11 Fancy Indexing**](#1.11-Fancy-Indexing)  
[**1.12 Transposing Arrays and Swapping Axes**](#1.12-Transposing-Arrays-and-Swapping-Axes)  
[**1.13 Universal Functions**](#1.13-Universal-Functions)  
[**1.14 Methods for Boolean Arrays**](#1.14-Methods-for-Boolean-Arrays)  
[**1.15 Sorting**](#1.15-Sorting)  
[**1.16 Set Logic**](#1.16-Set-Logic)  
[**1.17 Masked Arrays**](#1.17-Masked-Arrays)   
[**1.18 Array manipulation routines**](#1.18-Array-manipulation-routines)

#### 1.1 Numpy

NumPy is short for **Numeric Python** or **Numerical Python**. A NumPy array is similar to a Python list, but it is more compact, faster to read and write, and more convenient and efficient for numerical work. NumPy is one of the most important foundational packages for numerical and scientific computing in Python. It provides a high-performance multidimensional array object for efficient computation on arrays and matrices. The benefits of using NumPy include:
* ndarray: An efficient multidimensional array providing fast array-oriented arithmetic operations and flexible broadcasting capabilities.
* Mathematical functions operate quickly on entire arrays of data, with no need to write loops.
* Tools for reading/writing array data to disk and working with memory-mapped files.
* Linear algebra, random number generation, and Fourier transform capabilities.
* A C API for connecting NumPy with libraries written in C, C++ etc.

What NumPy does not do:
* By itself, it does not provide modeling or scientific functionality.

#### 1.2 NumPy ndarray

The key feature of NumPy is its N-dimensional array object, or ndarray. An ndarray is a fast, flexible container for large datasets in Python. Arrays let us perform mathematical operations on whole blocks of data using syntax similar to the equivalent operations between scalar elements.

For example, let's generate some random data and do some mathematical operations.

In [11]:
import numpy as np
sensor_data = np.random.randn(2,3)   # Generate a 2x3 array of random numbers
print("Sensor Data\n", sensor_data)
sensor_data_5 = sensor_data * 5
print("Sensor Data multiplied by 5\n", sensor_data_5)
print("Sensor Data added\n", sensor_data + sensor_data)

Sensor Data
 [[-0.53280803 -0.70222456  0.96294466]
 [ 0.59970264 -0.49662431  2.02395096]]
Sensor Data multiplied by 5
 [[-2.66404013 -3.5111228   4.8147233 ]
 [ 2.99851319 -2.48312153 10.11975478]]
Sensor Data added
 [[-1.06561605 -1.40444912  1.92588932]
 [ 1.19940528 -0.99324861  4.04790191]]


An ndarray is a generic multidimensional container for homogeneous data, i.e. all items have the same type and size. Every item occupies the same-sized block of memory, and all blocks are interpreted in exactly the same way. Several `ndarrays` can share the same data, so a change made through one `ndarray` may be visible in another. In that case, one ndarray is a _view_ of the other. Every array has:
* **shape**: a tuple indicating the size of each dimension.
* **dtype**: an object describing the _data type_ of the array.

The dimensions of an ndarray are also known as `axes`. For example, `[0, 1, 2]` has 1 axis with 3 elements.


Further Reading on N-dimensional array: [N-dimensional array (ndarray)](https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html)

In [12]:
sensor_data.shape

(2, 3)

In [13]:
sensor_data.dtype

dtype('float64')

#### 1.3 Creating an array
There are several ways to create an array:  
1. using low-level `ndarray` constructor. [More Info](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html#numpy.ndarray).
2. using routines defined in [Array creation routines](https://docs.scipy.org/doc/numpy/reference/routines.array-creation.html#routines-array-creation).

#### 1.3.1 ndarray constructor
* `numpy.ndarray(shape, dtype=float, buffer=None, offset=0, strides=None, order=None)`
* shape: The shape of the array to create, given as a tuple of ints.
* dtype: The data type of the elements. Optional.
* buffer: An object used to fill the array with data. Optional.
* offset: Offset of the array data in the buffer. Optional.
* strides: Strides of the data in memory. Optional.
* order: Row-major (C-style) or column-major (Fortran-style) order, i.e. one of {'C', 'F'}. Optional.
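
The parameters above can be illustrated with a minimal sketch. The variable names (`raw`, `from_buffer`, `skipped`) and the sample data are assumptions made for this example; in everyday code the routines in the next section are usually more convenient than calling the constructor directly.

```python
import numpy as np

# Six 4-byte integers packed into a plain Python bytes object (illustrative data).
raw = np.arange(6, dtype=np.int32).tobytes()

# Interpret the existing buffer as a 2x3 array of int32 values.
from_buffer = np.ndarray(shape=(2, 3), dtype=np.int32, buffer=raw)
print(from_buffer)

# offset is given in bytes: skip one 4-byte element, then read three items.
skipped = np.ndarray(shape=(3,), dtype=np.int32, buffer=raw, offset=4)
print(skipped)
```

Without a `buffer`, the constructor allocates memory but does not initialize it, much like `np.empty`.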

#### 1.3.2 Using routines

Table 1.1: Routines for ones and zeros


| Methods | Description |
| ------- | ----------- |
| empty(shape[, dtype, order]) | Return a new array of given shape and type, without initializing entries. |
| empty_like(prototype[, dtype, order, subok, …]) | Return a new array with the same shape and type as a given array. | 
| eye(N[, M, k, dtype, order]) | Return a 2-D array with ones on the diagonal and zeros elsewhere. | 
| identity(n[, dtype]) | Return the identity array. | 
| ones(shape[, dtype, order]) |  Return a new array of given shape and type, filled with ones. | 
| ones_like(a[, dtype, order, subok, shape]) | Return an array of ones with the same shape and type as a given array. | 
| zeros(shape[, dtype, order]) | Return a new array of given shape and type, filled with zeros. | 
| zeros_like(a[, dtype, order, subok, shape]) | Return an array of zeros with the same shape and type as a given array. | 
| full(shape, fill_value[, dtype, order]) | Return a new array of given shape and type, filled with fill_value. | 
| full_like(a, fill_value[, dtype, order, …]) | Return a full array with the same shape and type as a given array. |

Table 1.2: Routines from existing data

| Methods | Description |
| ------- | ----------- |
| array(object[, dtype, copy, order, subok, ndmin]) | Create an array. |
| asarray(a[, dtype, order]) | Convert the input to an array. |
| asanyarray(a[, dtype, order]) | Convert the input to an ndarray, but pass ndarray subclasses through. |
| ascontiguousarray(a[, dtype]) | Return a contiguous array (ndim >= 1) in memory (C order). |
| asmatrix(data[, dtype]) | Interpret the input as a matrix. |
| copy(a[, order]) | Return an array copy of the given object. |
| frombuffer(buffer[, dtype, count, offset]) | Interpret a buffer as a 1-dimensional array. |
| fromfile(file[, dtype, count, sep, offset]) | Construct an array from data in a text or binary file. |
| fromfunction(function, shape, \*\*kwargs) | Construct an array by executing a function over each coordinate. |
| fromiter(iterable, dtype[, count]) | Create a new 1-dimensional array from an iterable object. |
| fromstring(string[, dtype, count, sep]) | A new 1-D array initialized from text data in a string. |
| loadtxt(fname[, dtype, comments, delimiter, …]) | Load data from a text file. |
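
A few of the routines from Tables 1.1 and 1.2 can be sketched as follows. The input list and the variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

values = [1, 2, 3]                       # hypothetical input data

copied = np.array(values)                # builds a new array from the list
existing = np.asarray(copied)            # input is already an ndarray, so it is returned as-is
squares = np.fromiter((v * v for v in values), dtype=np.int64)
grid = np.fromfunction(lambda i, j: i + j, (2, 3), dtype=int)
like = np.zeros_like(grid)               # same shape and dtype as grid, filled with zeros

print(existing is copied)
print(squares)
print(grid)
print(like.shape, like.dtype == grid.dtype)
```

The difference between `array` and `asarray` matters mostly for performance: `asarray` avoids a copy when the input is already a suitable ndarray.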

#### 1.4 Attributes of ndarray
* **ndarray.ndim**: defines the number of axes (dimensions) of the array.
* **ndarray.shape**: defines the number of dimensions and items in an array. It is a tuple of N non-negative integers that specify the sizes of each dimension. A matrix with _n_ rows and _m_ columns has `shape` of `(n,m)`. The length of the shape tuple is the number of axes, i.e. `ndim`
* **ndarray.size**: defines the total number of elements of the array. It is equal to the product of elements of shape.
* **ndarray.dtype**: defines the type of elements in the array.
* **ndarray.itemsize**: defines the size (in bytes) of each element of the array, e.g. `float64` has itemsize 8 (i.e. 64/8) and `int32` has itemsize 4 (i.e. 32/8). It is equivalent to `ndarray.dtype.itemsize`.
* **ndarray.data**: defines the buffer containing the actual elements of the array. It is rarely used directly, because the contents of an ndarray are normally accessed and modified through indexing and slicing, or through the methods and attributes of the ndarray.
* **ndarray.flags**: defines memory layout of the array.
* **ndarray.strides**: defines the tuple of bytes to step in each dimension when traversing an array. It describes the number of bytes that must be skipped in memory to reach the next element along each axis. If the strides are (5,2), then we step 2 bytes to get to the next column and 5 bytes to get to the next row.
* **ndarray.nbytes**: defines the total bytes consumed by the elements of the array.
* **ndarray.base**: defines the base object if memory is from some other object.
* **ndarray.T**: transpose the array.
* **ndarray.real**: the real part of the array.
* **ndarray.imag**: the imaginary part of the array.
* **ndarray.flat**: One dimensional iterator over the array.
* **ndarray.ctypes**: defines an object to simplify the interaction of the array with the ctypes module.
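
As a rough illustration of how several of these attributes relate to each other, the sketch below inspects a small array. The array `grid` and the view `column` are hypothetical names used only for this example.

```python
import numpy as np

grid = np.zeros((3, 4), dtype=np.int32)   # illustrative 3x4 array of 4-byte integers

print(grid.ndim, grid.shape, grid.size)   # number of axes, size per axis, total elements
print(grid.itemsize, grid.nbytes)         # bytes per element, total bytes (size * itemsize)
print(grid.strides)                       # bytes to step to the next row and the next column

column = grid[:, 1]                       # a view onto the second column
print(column.base is grid)                # the view's memory belongs to grid
print(column.strides)                     # one step along the column skips a whole row

print(grid.T.strides)                     # transposing only swaps the strides
```

For this C-ordered array each row holds four 4-byte items, so the strides are `(16, 4)`, and the transpose reports `(4, 16)` without moving any data.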

#### 1.5 Creating numpy array
**`numpy`** module is used to create n-dimensional array.

In [2]:
import numpy as np

To create a NumPy array we use the `np.array()` function. It takes a list (or another array-like object) as its first parameter; optional parameters specify the data type (`dtype`), whether the input is copied (`copy`), the memory order (`order`), and the minimum number of dimensions (`ndmin`). Specifying the data type is useful when we need more control over how data is stored in memory and on disk.

In [9]:
# Create one dimensional array of size 3, with 4-byte integer elements
one_dim_array = np.array([0,1,2], np.int32)
print("one_dim_array:", one_dim_array)
print("Type:", type(one_dim_array))
print("Dtype:", one_dim_array.dtype)
print("Dimension:",one_dim_array.ndim)
print("Size:",one_dim_array.size)
print("Shape:",one_dim_array.shape)
print("ItemSize:",one_dim_array.itemsize)
one_dim_array

one_dim_array: [0 1 2]
Type: <class 'numpy.ndarray'>
Dtype: int32
Dimension: 1
Size: 3
Shape: (3,)
ItemSize: 4


array([0, 1, 2], dtype=int32)

**Note**: The above one-dimensional array has one axis with 3 elements.

In [10]:
# Create two dimensional array of size 2x3, with 4-byte integer elements
two_dim_array = np.array([[0,1,2], [3,4,5]], np.int32)
print("two_dim_array:", two_dim_array)
print("Type:", type(two_dim_array))
print("Dtype:", two_dim_array.dtype)
print("Dimension:",two_dim_array.ndim)
print("Size:",two_dim_array.size)
print("Shape:",two_dim_array.shape)
print("ItemSize:",two_dim_array.itemsize)
two_dim_array

two_dim_array: [[0 1 2]
 [3 4 5]]
Type: <class 'numpy.ndarray'>
Dtype: int32
Dimension: 2
Size: 6
Shape: (2, 3)
ItemSize: 4


array([[0, 1, 2],
       [3, 4, 5]], dtype=int32)

**Note**: The above two-dimensional array has two axes with six elements in total.

In [37]:
# Create three dimensional array of shape 1x3x3, with 4-byte integer elements
three_dim_array = np.array([[[0,1,2], [3,4,5], [6,7,8]]], np.int32)
print("three_dim_array:", three_dim_array)
print("Type:", type(three_dim_array))
print("Dtype:", three_dim_array.dtype)
print("Dimension:",three_dim_array.ndim)
print("Size:",three_dim_array.size)
print("Shape:",three_dim_array.shape)
print("ItemSize:",three_dim_array.itemsize)
three_dim_array

three_dim_array: [[[0 1 2]
  [3 4 5]
  [6 7 8]]]
Type: <class 'numpy.ndarray'>
Dtype: int32
Dimension: 3
Size: 9
Shape: (1, 3, 3)
ItemSize: 4


array([[[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]], dtype=int32)

**Note**: The above three-dimensional array has three axes with nine elements in total.

In [40]:
# Create three dimensional array of shape 2x3x3, with 4-byte integer elements
three_dim_two_depth_array = np.array([[[0,1,2], [3,4,5], [6,7,8]], [[10,11,12], [13,14,15], [16,17,18]]], np.int32)
print("three_dim_two_depth_array:", three_dim_two_depth_array)
print("Type:", type(three_dim_two_depth_array))
print("Dtype:", three_dim_two_depth_array.dtype)
print("Dimension:",three_dim_two_depth_array.ndim)
print("Size:",three_dim_two_depth_array.size)
print("Shape:",three_dim_two_depth_array.shape)
print("ItemSize:",three_dim_two_depth_array.itemsize)
three_dim_two_depth_array

three_dim_two_depth_array: [[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[10 11 12]
  [13 14 15]
  [16 17 18]]]
Type: <class 'numpy.ndarray'>
Dtype: int32
Dimension: 3
Size: 18
Shape: (2, 3, 3)
ItemSize: 4


array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]]], dtype=int32)

**Note:** The above array has a depth of 2, 3 rows and 3 columns, as defined by its shape (2,3,3).

#### 1.6 Creating arrays using routines

#### 1.6.1 Using arange
`arange` is an array-valued version of the built-in Python range function.

In [85]:
new_data = np.arange(25)
new_data

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

#### 1.6.2 Creating zeros arrays

In [69]:
a = np.zeros((2, 5, 5))
a

array([[[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]])

In [68]:
a.shape

(2, 5, 5)

#### 1.6.3 Creating an empty array
`np.empty` creates a placeholder array: the structure is allocated but the entries are not initialized, so the array holds arbitrary values until the data is filled in later. We can also initialize an array with zeros, ones, a constant, or random values.

In [170]:
# Create an empty array
np.empty((2,4))

array([[4.4e-323, 0.0e+000, 0.0e+000, 0.0e+000],
       [0.0e+000, 0.0e+000, 0.0e+000, 0.0e+000]])

In [171]:
# Create an array of zeros
np.zeros((2,3,4), dtype = np.int16)

array([[[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]], dtype=int16)

#### 1.6.4 Creating ones array

In [172]:
# Create an array of ones
np.ones((3,5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

#### 1.6.5 Creating random numbers

In [174]:
# Create an array with random values
np.random.random((3,3))

array([[0.57722195, 0.10955283, 0.91949129],
       [0.20238217, 0.50675145, 0.47540398],
       [0.64427806, 0.41343669, 0.17205333]])

#### 1.6.6 Creating identity matrix

In [83]:
# Create an identity matrix
np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [84]:
# Create an identity matrix
# An identity matrix is a square matrix in which all the elements on the principal diagonal are ones
# and all other elements are zeros. Multiplying a matrix by an identity matrix does not change its values.
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

#### 1.6.7 Creating full array

In [175]:
# Create a full array
np.full((2,4),5)

array([[5, 5, 5, 5],
       [5, 5, 5, 5]])

In [179]:
# Create an array of evenly-spaced values from 2 up to (but not including) 10, in steps of 2
np.arange(2,10,2)

array([2, 4, 6, 8])

In [184]:
# Create an array of 9 evenly-spaced values from 0 to 2, with both endpoints included
np.linspace(0,2,9)

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])

#### 1.7 Data Types for ndarrays
The _data type_ or **dtype** is a special object that contains the information, or metadata, that the ndarray needs to interpret a chunk of memory as a particular type of data. The numerical dtypes are named with a type name, such as int or float, followed by a number indicating the number of bits per element. A standard double-precision floating-point value takes 8 bytes or 64 bits. This type is known as **float64**.

Table 1.3: NumPy Data Types

| Type | Type Code | Description |
| ---- | ---- | ---- |
| int8, uint8 | i1, u1 | Signed and unsigned 8-bit (1 byte) integer types |
| int16, uint16 | i2, u2 | Signed and unsigned 16-bit integer types |
| int32, uint32 | i4, u4 | Signed and unsigned 32-bit integer types |
| int64, uint64 | i8, u8 | Signed and unsigned 64-bit integer types |
| float16 | f2 | Half-precision floating point |
| float32 | f4 or f | Standard single-precision floating point; Compatible with C float |
| float64 | f8 or d | Standard double-precision floating point; Compatible with C double and Python float object |
| float128 | f16 or g | Extended-precision floating point |
| complex64, complex128, complex256 | c8, c16, c32 | Complex numbers represented by two 32-, 64-, or 128-bit floats resp. |
| bool | ? | Boolean type storing True and False values |
| object | O | Python object type; a value can be any Python object |
| string_ | S | Fixed-length ASCII string type (1 byte per character); e.g., to create a string dtype with length 10, use 'S10' |
| unicode_ | U | Fixed-length Unicode type (number of bytes platform specific); same specification semantics as string_ (e.g., 'U10') |
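
The type codes in Table 1.3 can be passed as strings wherever a dtype is expected. The following is a small sketch with made-up data and illustrative variable names.

```python
import numpy as np

small_ints = np.array([1, 2, 3], dtype='i2')    # 'i2' is the type code for int16
doubles = np.array([1, 2, 3], dtype='f8')       # 'f8' is the type code for float64
labels = np.array(['a', 'bc'], dtype='U10')     # Unicode strings of up to 10 characters

print(small_ints.dtype, small_ints.itemsize)
print(doubles.dtype, doubles.itemsize)
print(labels.dtype)
print(np.dtype('i4') == np.int32)               # a type code and a type name compare equal
```

The number in an integer or float type code is the size in bytes, so `'i2'` is a 16-bit integer and `'f8'` is a 64-bit float.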

In [87]:
array_one = np.array([5,6,7], dtype = np.float64)
array_one.dtype

dtype('float64')

In [88]:
array_two = np.array([5,6,7], dtype = np.int32)
array_two.dtype

dtype('int32')

We can explicitly convert, or `cast`, an array from one dtype to another using ndarray's **`astype`** method. By default, calling `astype` always creates a new array (a copy of the data), even if the new dtype is the same as the old dtype.

In [89]:
arr_int = np.array([1,2,3,4])
print(arr_int.dtype)
arr_float = arr_int.astype(np.float64)
print(arr_float.dtype)

int64
float64


In [96]:
arr_string = np.array(['1', '2', '3.5', '100.56'])
arr_string.astype(np.float64)

array([  1.  ,   2.  ,   3.5 , 100.56])

A **ValueError** is raised if the cast fails, e.g. when a string cannot be converted to a number.

In [99]:
arr_err = np.array(['1', 'B', '3'])
arr_err.astype(float) 

ValueError: could not convert string to float: 'B'

We can also use array's dtype attribute to cast the type.

In [101]:
arr_int.astype(arr_float.dtype) # the converted array will be float, it is using arr_float's data type.

array([1., 2., 3., 4.])

#### 1.8 Indexing and Slicing
We can subset the data or individual elements using indexing and slicing.

**One dimensional array**: Indexing a one-dimensional array is simple and similar to indexing a Python list.

In [103]:
sequence_9 = np.arange(10)
sequence_9

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [104]:
sequence_9[7]  # retrieve the element at index 7 (indexing starts at 0).

7

In [105]:
sequence_9[2:8]  # retrieve the elements at indices 2 through 7 (the stop index 8 is excluded).

array([2, 3, 4, 5, 6, 7])

In [106]:
sequence_9[2:8] = 100  # Assign a scalar value to a slice
sequence_9

array([  0,   1, 100, 100, 100, 100, 100, 100,   8,   9])

**Important Note**: In the above example, a scalar value is assigned to an array slice and is propagated to the whole selection. An important distinction from Python's lists is that array slices are _views_ on the original array. This means that the data is not copied, and any modification to the view is reflected in the source array. Let's see the example below:

In [109]:
sequence_35 = sequence_9[3:9]
sequence_35

array([100, 100, 100, 100, 100,   8])

When a value in **sequence_35** is changed, the change is reflected in the original array.

In [110]:
sequence_35[2] = 789

In [111]:
sequence_35

array([100, 100, 789, 100, 100,   8])

In [112]:
sequence_9

array([  0,   1, 100, 100, 100, 789, 100, 100,   8,   9])

The "bare" slice **[:]** assigns to all values in an array. Because **sequence_35** is a view, the assignment below also changes **sequence_9**.

In [113]:
sequence_35[:] = 0
sequence_9

array([  0,   1, 100,   0,   0,   0,   0,   0,   0,   9])

**Note**: To get a copy of a slice of an ndarray instead of a view, we need to explicitly copy the array by using **array_name.copy()**.
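
The difference between a view and an explicit copy can be sketched as follows. The array `source` is a hypothetical stand-in, kept separate from `sequence_9` so the example is self-contained.

```python
import numpy as np

source = np.arange(10)               # hypothetical array for this example
view = source[2:5]                   # shares memory with source
independent = source[2:5].copy()     # owns its own data

independent[:] = -1                  # does not touch source
view[0] = 99                         # writes through to source[2]

print(source)
print(np.shares_memory(source, view), np.shares_memory(source, independent))
```

Only the write through `view` shows up in `source`; the copy can be modified freely.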

In [11]:
one_dim_array

array([0, 1, 2], dtype=int32)

In [12]:
one_dim_array[2]

2

**Two dimensional array**: In a two-dimensional array, the element at each index is not a scalar but a one-dimensional array. Individual elements can be accessed recursively, or we can pass a comma-separated list of indices to select an individual element. We can think of axis 0 as the _rows_ of the array and axis 1 as the _columns_.

                                Axis 1
                      

                             |0|    |1|    |2|
                    
                     |0|     |0,0|  |0,1|  |0,2|                
    Axis 0
                     |1|     |1,0|  |1,1|  |1,2|
                    
                     |2|     |2,0|  |2,1|  |2,2|
                     
Figure 1.1: Indexing elements in a NumPy array

In [116]:
marks = np.array([[56,77,99],[59,84,92],[50,68,97]])
marks

array([[56, 77, 99],
       [59, 84, 92],
       [50, 68, 97]])

In [117]:
marks[1]  # retrieve all elements from row 1.

array([59, 84, 92])

In [118]:
marks[0][2] # get the element at row 0, column 2 by indexing recursively.

99

In [119]:
marks[0,2] # get the same element (row 0, column 2) with a comma-separated index.

99

In [13]:
two_dim_array

array([[0, 1, 2],
       [3, 4, 5]], dtype=int32)

In [14]:
two_dim_array[0,1]

1

In [15]:
two_dim_array[1,2]

5

In [16]:
two_dim_array[0:2]

array([[0, 1, 2],
       [3, 4, 5]], dtype=int32)

In [17]:
two_dim_array[1:2]

array([[3, 4, 5]], dtype=int32)

In [18]:
two_dim_array[0,:]

array([0, 1, 2], dtype=int32)

In [19]:
two_dim_array[1,:]

array([3, 4, 5], dtype=int32)

In [20]:
two_dim_array[0,0:2]

array([0, 1], dtype=int32)

In [21]:
two_dim_array[1,1:3]

array([4, 5], dtype=int32)

In [22]:
two_dim_array[:,1:2]

array([[1],
       [4]], dtype=int32)

In [23]:
two_dim_array[:,1:3]

array([[1, 2],
       [4, 5]], dtype=int32)

**Three dimensional array**:

In [134]:
chair = np.array([[[10,25,45],[15,28,48],[19,21,35]], [[11,23,65],[12,25,45],[16,20,42]]])
chair

array([[[10, 25, 45],
        [15, 28, 48],
        [19, 21, 35]],

       [[11, 23, 65],
        [12, 25, 45],
        [16, 20, 42]]])

In [135]:
chair[0]

array([[10, 25, 45],
       [15, 28, 48],
       [19, 21, 35]])

In [136]:
chair.shape  # The dimension of chair array is 2x3x3. i.e 2 depth 3 rows 3 columns.

(2, 3, 3)

We can assign both scalar values and arrays to slices of a higher-dimensional array.

In [137]:
chair_update = chair[0].copy()
chair[0] = 22
chair

array([[[22, 22, 22],
        [22, 22, 22],
        [22, 22, 22]],

       [[11, 23, 65],
        [12, 25, 45],
        [16, 20, 42]]])

In [138]:
chair_update

array([[10, 25, 45],
       [15, 28, 48],
       [19, 21, 35]])

In [139]:
chair[0] = chair_update
chair

array([[[10, 25, 45],
        [15, 28, 48],
        [19, 21, 35]],

       [[11, 23, 65],
        [12, 25, 45],
        [16, 20, 42]]])

In [140]:
chair[0,1] # retrieve row 1 of block 0.

array([15, 28, 48])

In [143]:
chair[0][1]  # same result as chair[0,1]

array([15, 28, 48])

In [144]:
chair[0][1][2] # retrieve the value at block 0, row 1, column 2.

48

#### 1.9 Advance Indexing and Slicing
ndarrays can be sliced in a way similar to Python lists.

In [145]:
temp_num = np.arange(30)
temp_num

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])

In [146]:
temp_num[5:10]   # get elements starting index 5 to 9. (10-1)

array([5, 6, 7, 8, 9])

In [147]:
temp_num_2d = np.array([[1,2,3],[7,8,9],[10,5,2]])
temp_num_2d

array([[ 1,  2,  3],
       [ 7,  8,  9],
       [10,  5,  2]])

In the example below, the 2D array is sliced along axis 0, the first axis. A slice selects a range of elements along an axis. We can read the expression `temp_num_2d[:2]` as "select the first two rows from the array".

In [148]:
temp_num_2d[:2] 

array([[1, 2, 3],
       [7, 8, 9]])

In [149]:
temp_num_2d[:2,1:] # select rows 0 and 1 (the stop index 2 is excluded) and all columns from index 1 onward.

array([[2, 3],
       [8, 9]])

**Note**: When we use only slices, as shown above, we always get array views with the same number of dimensions. When we mix integer indexes and slices, we get lower-dimensional slices.

In [150]:
temp_num_2d[1,:2]

array([7, 8])

In [153]:
temp_num_2d[:2,2]

array([3, 9])

A colon by itself selects an entire axis.

In [155]:
temp_num_2d[:,:1]

array([[ 1],
       [ 7],
       [10]])

When we assign to a slice expression it will assign to the whole selection.

In [157]:
temp_num_2d[0,:2]

array([1, 2])

In [158]:
temp_num_2d[0,:2] = 55
temp_num_2d

array([[55, 55,  3],
       [ 7,  8,  9],
       [10,  5,  2]])

#### 1.10 Boolean Indexing
Boolean indexing will always create a copy of the data when selecting from an array.

In this example, we'll use two arrays, one of string type and the other of numeric type. Suppose each name in the first array corresponds to a row in the second array. Our requirement is to select all the rows that correspond to the name 'Jacky'. We'll use boolean indexing to keep only the rows for the name 'Jacky'.

In [172]:
users = np.array(['Harry', 'Bobby', 'Jacky','Pat', 'Jacky','John'])
users

array(['Harry', 'Bobby', 'Jacky', 'Pat', 'Jacky', 'John'], dtype='<U5')

In [173]:
amount = np.random.randint(55, high=100, size=(6,5))
amount

array([[68, 86, 57, 59, 81],
       [69, 65, 85, 93, 58],
       [59, 80, 79, 85, 81],
       [75, 57, 78, 84, 92],
       [94, 67, 55, 60, 85],
       [80, 94, 56, 87, 99]])

In [174]:
users == 'Jacky'   # comparison operation

array([False, False,  True, False,  True, False])

We can pass the boolean array when indexing the array.

In [175]:
amount[users == 'Jacky']

array([[59, 80, 79, 85, 81],
       [94, 67, 55, 60, 85]])

We can also combine a boolean array with a column index.

In [177]:
amount[users == 'Jacky',3]

array([85, 60])

Now, we'll select all except 'Jacky'

In [178]:
users != 'Jacky'

array([ True,  True, False,  True, False,  True])

In [180]:
amount[~(users == 'Jacky')]

array([[68, 86, 57, 59, 81],
       [69, 65, 85, 93, 58],
       [75, 57, 78, 84, 92],
       [80, 94, 56, 87, 99]])

We can use the `~` (not) operator if we want to invert a condition.

In [181]:
only_jacky = users == 'Jacky'

In [182]:
amount[~only_jacky]

array([[68, 86, 57, 59, 81],
       [69, 65, 85, 93, 58],
       [75, 57, 78, 84, 92],
       [80, 94, 56, 87, 99]])

We can combine multiple boolean conditions with boolean arithmetic operators like `&` (and) and `|` (or) to filter the data elements.

In [183]:
harry_pat = (users == 'Harry') | (users == 'Pat')
harry_pat

array([ True, False, False,  True, False, False])

In [184]:
amount[harry_pat]

array([[68, 86, 57, 59, 81],
       [75, 57, 78, 84, 92]])

**Note**: The Python keywords `and` and `or` do not work with boolean arrays; always use `&` and `|`.

Boolean arrays can also be used to set values.
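
The note above can be checked with a short sketch. The `names` array is hypothetical data used only for this example.

```python
import numpy as np

names = np.array(['Harry', 'Pat', 'John'])   # hypothetical data

error_message = None
try:
    # `or` asks for a single truth value of a whole array, which is ambiguous.
    mask = (names == 'Harry') or (names == 'Pat')
except ValueError as err:
    error_message = str(err)
print("ValueError:", error_message)

# `|` combines the two boolean arrays element by element.
mask = (names == 'Harry') | (names == 'Pat')
print(mask)
```

The parentheses around each comparison are needed because `&` and `|` bind more tightly than `==` in Python.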

In [185]:
amount[amount < 85] = 0
amount

array([[ 0, 86,  0,  0,  0],
       [ 0,  0, 85, 93,  0],
       [ 0,  0,  0, 85,  0],
       [ 0,  0,  0,  0, 92],
       [94,  0,  0,  0, 85],
       [ 0, 94,  0, 87, 99]])

In [187]:
amount[users != 'Jacky'] = 100
amount

array([[100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [  0,   0,   0,  85,   0],
       [100, 100, 100, 100, 100],
       [ 94,   0,   0,   0,  85],
       [100, 100, 100, 100, 100]])
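
The statement at the start of this section, that selecting with a boolean array creates a copy, can be verified with a small sketch. The array `values` is made up for illustration.

```python
import numpy as np

values = np.array([10, 20, 30, 40])   # hypothetical data
mask = values > 15

selected = values[mask]               # boolean selection returns a new array
selected[0] = -1                      # changing the copy ...
print(values)                         # ... leaves the original untouched
print(np.shares_memory(values, selected))

values[mask] = 0                      # assigning through a mask does modify the original
print(values)
```

Selection and assignment behave differently: reading with a mask produces independent data, while writing with a mask updates the array in place.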

#### 1.11 Fancy Indexing
Fancy indexing is NumPy's term for indexing with integer arrays (or lists of integers).

In [191]:
temp_array = np.random.randint(5,size = (10,5))
temp_array

array([[4, 1, 1, 2, 2],
       [4, 2, 4, 3, 1],
       [1, 1, 0, 3, 2],
       [2, 3, 1, 0, 1],
       [4, 0, 2, 1, 0],
       [0, 4, 1, 3, 0],
       [1, 4, 3, 2, 4],
       [4, 4, 2, 1, 0],
       [3, 2, 4, 3, 1],
       [2, 1, 1, 3, 4]])

In [192]:
temp_array[[0,9,1,8,0]]   # select rows 0, 9, 1, 8 and 0, in that order.

array([[4, 1, 1, 2, 2],
       [2, 1, 1, 3, 4],
       [4, 2, 4, 3, 1],
       [3, 2, 4, 3, 1],
       [4, 1, 1, 2, 2]])
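
Fancy indexing also accepts one integer list per axis. The sketch below uses a hypothetical array `table` to show row selection, element-wise pairing of indices, and the fact that the result is a copy rather than a view.

```python
import numpy as np

table = np.arange(12).reshape(4, 3)     # hypothetical 4x3 array

rows = table[[3, 0]]                    # rows 3 and 0, in that order
picked = table[[0, 2, 3], [1, 0, 2]]    # elements at (0,1), (2,0) and (3,2)
block = table[[3, 0]][:, [2, 0]]        # rows 3 and 0, then columns 2 and 0

rows[0, 0] = -1                         # fancy indexing copies, so table is unchanged

print(picked)
print(block)
print(table[3, 0])
```

Passing two index lists selects individual elements pairwise, which is why `picked` is one-dimensional; selecting a rectangular block takes two steps, as in `block`.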

#### 1.12 Transposing Arrays and Swapping Axes
Transposing is a special form of reshaping that returns a view on the underlying data without copying anything. Arrays have a `transpose` method and a `T` attribute for transposing.

In [196]:
temp_array = np.arange(20).reshape(10,2)
temp_array

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15],
       [16, 17],
       [18, 19]])

In [197]:
temp_array.T

array([[ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18],
       [ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19]])

In [200]:
np.transpose(temp_array)

array([[ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18],
       [ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19]])

`swapaxes` is an ndarray method that takes a pair of axis numbers and switches the indicated axes to rearrange the data elements. It returns a view on the data without making a copy.

In [210]:
temp_array.swapaxes(0,1)  # swap axis 0 and axis 1.

array([[ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18],
       [ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19]])
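
Because transposing and swapping axes return views, writing to the result changes the original array. The following sketch uses hypothetical arrays `matrix` and `cube` to illustrate this and to show how the shape changes for a three-dimensional array.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)     # hypothetical 2x3 array
flipped = matrix.T                      # shape (3, 2), no data copied

flipped[0, 1] = 100                     # writes to matrix[1, 0]
print(matrix)
print(np.shares_memory(matrix, flipped))

cube = np.arange(24).reshape(2, 3, 4)   # hypothetical 2x3x4 array
print(cube.transpose(1, 0, 2).shape)    # axes reordered to (3, 2, 4)
print(cube.swapaxes(1, 2).shape)        # last two axes swapped: (2, 4, 3)
```

For arrays with more than two dimensions, `transpose` accepts the new order of the axes, while `swapaxes` exchanges exactly two of them.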

#### 1.13 Universal Functions
A universal function, or **`ufunc`**, is a function that performs element-wise operations on data in ndarrays. Ufuncs can be thought of as fast vectorized wrappers for simple functions that take one or more scalar values and produce one or more scalar results. Many **ufuncs** are simple element-wise transformations such as sqrt or exp; these are called `unary` ufuncs. Others, such as add or maximum, take two arrays and return a single array as the result; these are known as `binary` ufuncs.

In [211]:
seq_num = np.arange(5)
np.sqrt(seq_num)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ])

In [212]:
np.exp(seq_num)

array([ 1.        ,  2.71828183,  7.3890561 , 20.08553692, 54.59815003])

In [214]:
score_1 = np.random.randn(10)
score_2 = np.random.randn(10)

In [215]:
score_1

array([ 0.62824776,  1.93676184,  0.62611487,  0.15513952, -0.75042867,
        0.75877395, -0.7013509 , -0.10048266, -1.60660864, -0.02818735])

In [216]:
score_2

array([-0.39579281,  0.99820668, -1.99796783, -0.15547071,  1.46451355,
       -0.45705179,  1.57908758,  0.47782595, -0.25416641, -0.93195779])

In [217]:
np.maximum(score_1, score_2)   # calculate the element-wise maximums between elements of score_1 and score_2.

array([ 0.62824776,  1.93676184,  0.62611487,  0.15513952,  1.46451355,
        0.75877395,  1.57908758,  0.47782595, -0.25416641, -0.02818735])

Table 1.4: Unary universal functions

| Function | Description |
| -------- | -------- |
| abs, fabs | Compute the absolute value element-wise for integer, floating point, or complex values |
| sqrt | Compute the square root of each element, similar to array ** 0.5 |
| square | Compute the square of each element, similar to array ** 2 |
| exp | Compute the exponential $e^{x}$ of each element |
| log, log10, log2, log1p | Natural logarithm (base e), log base 10, log base 2 and log(1+x) resp. |
| sign | Compute the sign of each element: 1 (positive), 0 (zero), or -1 (negative) |
| ceil | Compute the ceiling of each element (i.e. the smallest integer greater than or equal to that number) |
| floor | Compute the floor of each element (i.e. the largest integer less than or equal to each element) |
| rint | Round elements to the nearest integer, preserving the dtype |
| modf | Return the fractional and integral parts of the array as separate arrays |
| isnan | Return boolean array indicating whether each value is NaN (Not a Number) |
| isfinite, isinf | Return boolean array indicating whether each element is finite (non-inf, non-NaN) or infinite resp. |
| cos, cosh, sin, sinh, tan, tanh | Regular and hyperbolic trigonometric functions |
| arccos, arccosh, arcsin, arcsinh, arctan, arctanh | Inverse trigonometric functions |
| logical_not | Compute truth value of not x element-wise (similar to ~array) |

Table 1.5: Binary universal functions  

| Function | Description |
| -------- | ------------ |
| add | Add corresponding elements in arrays |
| subtract | Subtract elements in second array from first array |
| multiply | Multiply array elements |
| divide, floor_divide | Divide or floor divide (rounding the quotient down to the nearest integer) |
| power | Raise elements in first array to powers indicated in second array |
| maximum, fmax | Element-wise maximum; fmax ignores NaN |
| minimum, fmin | Element-wise minimum; fmin ignores NaN |
| mod | Element-wise modulus (remainder of division) |
| copysign | Copy sign of values in second argument to values in first argument |
| greater, greater_equal, less, less_equal, equal, not_equal | Perform element-wise comparison that returns a boolean array (similar to the infix operators >, >=, <, <=, ==, !=) |
| logical_and, logical_or, logical_xor | Compute element-wise truth value of the logical operation (similar to the infix operators &, \|, ^) |
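
The difference between `maximum` and `fmax`, and the behaviour of `floor_divide` and `mod` on negative numbers, are easy to miss in the table. The sketch below uses small made-up arrays (`a`, `b`, `x`, `y` are illustrative names) to show both.

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
b = np.array([2.0, 5.0, np.nan])

propagated = np.maximum(a, b)   # NaN wherever either input is NaN
ignored = np.fmax(a, b)         # NaN only where both inputs are NaN

x = np.array([7, -7, 9])
y = np.array([2, 2, 4])

quotient = np.floor_divide(x, y)   # rounds toward negative infinity
remainder = np.mod(x, y)           # result takes the sign of the divisor

print(propagated)   # [ 2. nan nan]
print(ignored)      # [2. 5. 3.]
print(quotient)     # [ 3 -4  2]
print(remainder)    # [1 1 1]
```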

We can call aggregation functions such as `sum`, `mean`, and `std` either as array instance methods or as top-level NumPy functions. Functions such as `mean` and `sum` take an optional `axis` argument that computes the statistic over the given axis, resulting in an array with one fewer dimension.

Table 1.6: Array Statistical Methods 

| Method | Description |
| ------ | ----------- |
| sum | Sum of all the elements in the array or along an axis; zero-length arrays have sum 0 |
| mean | Arithmetic mean; zero-length arrays have NaN mean |
| std, var | Standard deviation and variance resp. with optional degrees of freedom adjustment (default denominator n) |
| min, max | Minimum and maximum |
| argmin, argmax | Indices of minimum and maximum elements resp. |
| cumsum | Cumulative sum of elements starting from 0 |
| cumprod | Cumulative product of elements starting from 1 |

In [218]:
temp_array

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15],
       [16, 17],
       [18, 19]])

In [221]:
temp_array.mean()

9.5

In [222]:
np.mean(temp_array)

9.5

In [219]:
temp_array.mean(axis=1)

array([ 0.5,  2.5,  4.5,  6.5,  8.5, 10.5, 12.5, 14.5, 16.5, 18.5])

In [223]:
temp_array.sum()

190

In [224]:
temp_array.sum(axis=0)

array([ 90, 100])

In [225]:
temp_array.mean(1) # compute mean across the columns

array([ 0.5,  2.5,  4.5,  6.5,  8.5, 10.5, 12.5, 14.5, 16.5, 18.5])

In [226]:
temp_array.sum(0) # compute sum down the rows

array([ 90, 100])

In [227]:
temp_array.cumsum()  # returns the running totals over the flattened array instead of a single aggregate

array([  0,   1,   3,   6,  10,  15,  21,  28,  36,  45,  55,  66,  78,
        91, 105, 120, 136, 153, 171, 190])

In [230]:
temp_array[2:5].cumprod()

array([    4,    20,   120,   840,  6720, 60480])
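
Table 1.6 mentions the optional degrees of freedom adjustment for `std` and `var`, which the examples above do not exercise. The following sketch assumes `temp_array` was built with `np.arange(20).reshape(10, 2)` (this matches the output shown above, but the construction itself is an assumption) and uses the `ddof` parameter to switch the denominator from n to n - 1.

```python
import numpy as np

# Assumed construction of temp_array; it reproduces the values displayed above
temp_array = np.arange(20).reshape(10, 2)

pop_var = temp_array.var()            # default ddof=0, denominator n
sample_var = temp_array.var(ddof=1)   # denominator n - 1
sample_std = temp_array.std(ddof=1)   # square root of sample_var

col_means = temp_array.mean(axis=0)   # one mean per column

print(pop_var)      # 33.25
print(sample_var)   # 35.0
print(sample_std)
print(col_means)    # [ 9. 10.]
```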

#### 1.14 Methods for Boolean Arrays
In arithmetic operations, boolean values are coerced to 1 (`True`) and 0 (`False`), so `sum` is often used to count the `True` values in a boolean array. `any` and `all` are also useful for boolean arrays: `any` checks whether one or more values in an array are `True`, and `all` checks whether every value is `True`. Both can also be used with non-boolean arrays, where non-zero elements evaluate to `True`.

In [233]:
rand_num = np.random.randn(10)
rand_num

array([ 0.18050338, -0.26763806, -0.61108349,  1.22295678,  0.75629351,
        0.9760859 , -1.21064339, -1.90744511,  1.23391776, -0.73749851])

In [234]:
rand_num >0

array([ True, False, False,  True,  True,  True, False, False,  True,
       False])

In [235]:
(rand_num > 0).sum()  # count the positive values

5

In [236]:
valid = np.array([True, False, False, False, True, True])
valid

array([ True, False, False, False,  True,  True])

In [237]:
valid.any()

True

In [238]:
valid.all()

False
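
The examples above only cover boolean arrays. As a minimal sketch of `any` and `all` on a non-boolean array and along an axis, the following uses two made-up arrays, `counts` and `flags`:

```python
import numpy as np

# Non-boolean array: non-zero elements are treated as True
counts = np.array([0, 3, 0, 7])
has_nonzero = counts.any()   # at least one element is non-zero
all_nonzero = counts.all()   # every element is non-zero

# any and all also accept an axis argument
flags = np.array([[True, False],
                  [True, True]])
all_per_column = flags.all(axis=0)
any_per_row = flags.any(axis=1)

print(has_nonzero, all_nonzero)   # True False
print(all_per_column)             # [ True False]
print(any_per_row)                # [ True  True]
```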

#### 1.15 Sorting
NumPy arrays can be sorted in place using the `sort` method. The top-level function `np.sort`, in contrast, returns a sorted copy of an array instead of modifying the array in place.

In [240]:
temp_rand = np.random.randn(10)
temp_rand

array([ 0.00140888, -0.36303666,  0.70186779, -1.28337956,  0.42617945,
       -0.00511888,  1.21809059, -0.8546204 , -0.0542071 , -0.20528876])

In [241]:
temp_rand.sort()
temp_rand

array([-1.28337956, -0.8546204 , -0.36303666, -0.20528876, -0.0542071 ,
       -0.00511888,  0.00140888,  0.42617945,  0.70186779,  1.21809059])

In [243]:
temp_rand_2d = np.random.randn(6,5)
temp_rand_2d

array([[ 0.59895027,  1.99883311, -0.12275313, -0.55602504, -0.70353952],
       [ 0.8502775 , -0.25563223,  0.14980086,  0.72144428, -0.45453843],
       [ 0.44653561, -0.23201519,  1.13257405,  0.3705467 ,  1.49068238],
       [-0.67187477, -1.06482131,  0.44855654, -0.33229752,  1.18286727],
       [ 0.10668042, -0.46667648,  0.67360826,  0.92929984, -1.22288563],
       [ 0.34314334, -1.13590688, -1.43647026,  1.52478647,  1.04329251]])

A multidimensional array can be sorted along a given axis by passing the axis number to the `sort` method.

In [244]:
temp_rand_2d.sort(1)
temp_rand_2d

array([[-0.70353952, -0.55602504, -0.12275313,  0.59895027,  1.99883311],
       [-0.45453843, -0.25563223,  0.14980086,  0.72144428,  0.8502775 ],
       [-0.23201519,  0.3705467 ,  0.44653561,  1.13257405,  1.49068238],
       [-1.06482131, -0.67187477, -0.33229752,  0.44855654,  1.18286727],
       [-1.22288563, -0.46667648,  0.10668042,  0.67360826,  0.92929984],
       [-1.43647026, -1.13590688,  0.34314334,  1.04329251,  1.52478647]])
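
To make the difference between the in-place `sort` method and `np.sort` concrete, here is a minimal sketch on small made-up arrays (`values` and `grid` are illustrative names, not from the examples above):

```python
import numpy as np

values = np.array([3, 1, 2])

sorted_copy = np.sort(values)   # returns a new sorted array
print(values)                   # [3 1 2]  (original is unchanged)
print(sorted_copy)              # [1 2 3]

values.sort()                   # sorts in place and returns None
print(values)                   # [1 2 3]

# For a 2-D array, axis=0 sorts each column independently
grid = np.array([[3, 1],
                 [2, 4]])
sorted_columns = np.sort(grid, axis=0)
print(sorted_columns)
```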

#### 1.16 Set Logic
NumPy provides basic set operations for one-dimensional ndarrays. The most commonly used is `np.unique`, which returns the sorted unique values in an array.

In [246]:
users

array(['Harry', 'Bobby', 'Jacky', 'Pat', 'Jacky', 'John'], dtype='<U5')

In [247]:
np.unique(users)

array(['Bobby', 'Harry', 'Jacky', 'John', 'Pat'], dtype='<U5')

In [248]:
nums = np.array([50,1,45,2,3,50,1,45,42])
nums

array([50,  1, 45,  2,  3, 50,  1, 45, 42])

In [249]:
np.unique(nums)

array([ 1,  2,  3, 42, 45, 50])

Table 1.7: Array set operations

| Method | Description |
| ------ | ----------- |
| unique(x) | Compute the sorted, unique elements in x |
| intersect1d(x,y) | Compute the sorted, common elements in x and y |
| union1d(x,y) | Compute the sorted union of elements |
| in1d(x,y) | Compute a boolean array indicating whether each element of x is contained in y (recent NumPy versions recommend isin instead) |
| setdiff1d(x,y) | Set difference, elements in x that are not in y |
| setxor1d(x,y) | Set symmetric difference; elements that are in either of the arrays, but not both |
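
The operations in Table 1.7 can be sketched on two small made-up arrays, `x` and `y`. The membership test here uses `np.isin`, which for one-dimensional input gives the same result as `in1d`.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 2])
y = np.array([3, 4, 5])

common = np.intersect1d(x, y)    # sorted elements present in both
combined = np.union1d(x, y)      # sorted union
only_in_x = np.setdiff1d(x, y)   # elements of x that are not in y
exclusive = np.setxor1d(x, y)    # elements in exactly one of the arrays
membership = np.isin(x, y)       # boolean array with the shape of x

print(common)       # [3 4]
print(combined)     # [1 2 3 4 5]
print(only_in_x)    # [1 2]
print(exclusive)    # [1 2 5]
print(membership)   # [False False  True  True False]
```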

#### 1.17 Masked Arrays
A masked array is the combination of a standard `ndarray` and a mask. A mask is either _`nomask`_, indicating that no value of the associated array is invalid, or an array of booleans that determines, for each element of the associated array, whether the value is valid or not. If an element of the mask is `True`, the corresponding element of the associated array is said to be masked (invalid). If an element of the mask is `False`, the corresponding element of the associated array is said to be unmasked (valid). A masked array consists of:
* ndarray of any shape or datatype
* boolean mask with the same shape
* a **fill_value** i.e. a value used to replace the invalid entries for returning a standard ndarray.

Masked arrays are useful:
* When we want to preserve the values we masked for later processing, without copying the array.
* When we have to handle many arrays, each with its own mask.
* When we have different flags for missing or invalid values and want to preserve these flags in the original dataset while excluding the flagged values from computations.
* When we can't avoid or eliminate missing values but don't want to deal with NaN values.

The `numpy.ma` module comes with its own implementation of `ufuncs`, so we can apply fast vectorized functions and operations to masked data. The output is a masked array.

In [6]:
house_price = np.array([100, 230, 450, 10]) # prices are in thousands; the last value looks like an outlier

import numpy.ma as ma
msk_house_price = ma.masked_array(house_price, mask=[0, 0, 0, 1])  # 1 (True) marks an invalid entry
msk_house_price

masked_array(data=[100, 230, 450, --],
             mask=[False, False, False,  True],
       fill_value=999999)
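
Building on the house price example, the following sketch shows how masked entries are excluded from computations and how to get a standard ndarray back. The threshold of 50 passed to `ma.masked_less` and the fill value 0 are arbitrary choices made for illustration.

```python
import numpy as np
import numpy.ma as ma

house_price = np.array([100, 230, 450, 10])
msk_house_price = ma.masked_array(house_price, mask=[False, False, False, True])

valid_count = msk_house_price.count()   # number of unmasked entries
valid_mean = msk_house_price.mean()     # mean over unmasked entries only
doubled = msk_house_price * 2           # arithmetic keeps the mask

filled = msk_house_price.filled(0)          # ndarray with masked entries replaced by 0
compressed = msk_house_price.compressed()   # ndarray of the unmasked entries only

# Build the mask from a condition instead of writing it by hand
auto_masked = ma.masked_less(house_price, 50)

print(valid_count)   # 3
print(valid_mean)    # 260.0
print(doubled)
print(filled)        # [100 230 450   0]
print(compressed)    # [100 230 450]
print(auto_masked)
```

The original `house_price` array is left untouched throughout, which is what allows the masked value to be recovered later.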

#### 1.18 Array manipulation routines
* np.meshgrid
* np.sqrt
* np.where
* np.reshape
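
A brief sketch of the four routines listed above, using small made-up inputs (all variable names and the threshold 15 are illustrative only):

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])

# meshgrid builds coordinate matrices from two coordinate vectors
xx, yy = np.meshgrid(x, y)   # both have shape (2, 3)
total = xx + yy

# where picks from the second argument where the condition holds, else the third
thresholded = np.where(total > 15, total, 0)

# reshape returns the same data with a new shape
reshaped = np.reshape(total, (3, 2))

# sqrt is applied element-wise
roots = np.sqrt(np.array([4.0, 9.0]))

print(total)         # [[10 11 12]
                     #  [20 21 22]]
print(thresholded)   # [[ 0  0  0]
                     #  [20 21 22]]
print(reshaped)
print(roots)         # [2. 3.]
```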


[More on numpy routines](https://numpy.org/doc/stable/reference/routines.html)  
[Some tutorials](https://medium.com/@mo000007/numpy-for-beginners-19d64e164df6)