#### author: `marimuthu (@kmario23)`

## **A short introduction to NumPy**

NumPy (*Numerical Python*) is a Python library for scientific computing. It provides the `ndarray` data structure for storing homogeneous data and `ufuncs` (universal functions) for processing that data efficiently. Important features include basic slicing, advanced (fancy) indexing, and broadcasting.
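
As a quick taste of these features, here is a minimal sketch; the array names and values are made up purely for illustration.

```python
import numpy as np

# a ufunc applies an operation element-wise, without an explicit Python loop
values = np.array([1.0, 4.0, 9.0])
roots = np.sqrt(values)

# broadcasting: the (3,) array is stretched across both rows of the (2, 3) array
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
offsets = np.array([10, 20, 30])
shifted = matrix + offsets

# fancy indexing: pick elements by an explicit list of positions
picked = values[[0, 2]]

print(roots)
print(shifted)
print(picked)
```

Each of these ideas is covered in more detail in the cells of this notebook.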

In [10]:
# imports

import numpy as np
import matplotlib.pyplot as plt

In [None]:
# check version
np.__version__

In [11]:
# get a list of all supported data types
# NOTE: `np.sctypes` was removed in NumPy 2.0, so this cell needs NumPy < 2.0
np.sctypes

{'int': [numpy.int8, numpy.int16, numpy.int32, numpy.int64],
 'uint': [numpy.uint8, numpy.uint16, numpy.uint32, numpy.uint64],
 'float': [numpy.float16, numpy.float32, numpy.float64, numpy.float128],
 'complex': [numpy.complex64, numpy.complex128, numpy.complex256],
 'others': [bool, object, bytes, str, numpy.void]}

In [None]:
# read about signature and docstring 
np.ndarray?

### **N-dimensional Arrays**

In [12]:
# 1D array
arr = np.array([1, 2, 3, 23])

# get its datatype
arr.dtype

dtype('int64')

In [13]:
# 2D array

# a row vector
row_vec = arr[np.newaxis, :]  # arr[None, :]
row_vec.shape

(1, 4)

In [14]:
# a column vector
col_vec = arr[:, np.newaxis]  # arr[:, None]
col_vec.shape

# read more about newaxis here: https://stackoverflow.com/questions/29241056/how-does-numpy-newaxis-work-and-when-to-use-it

(4, 1)
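
One common reason to create row and column vectors with `np.newaxis` is to let broadcasting combine them. The sketch below rebuilds the same `arr` so that it runs on its own; `pairwise_sum` is a name chosen here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 23])
row_vec = arr[np.newaxis, :]   # shape (1, 4)
col_vec = arr[:, np.newaxis]   # shape (4, 1)

# (4, 1) + (1, 4) broadcasts to (4, 4): entry [i, j] holds arr[i] + arr[j]
pairwise_sum = col_vec + row_vec

print(pairwise_sum.shape)
print(pairwise_sum)
```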

In [23]:
# a random array where the values come from a standard Normal distribution

gaussian = np.random.randn(2 * 3 * 4)
gaussian.shape

(24,)

In [30]:
# reshape the array to the desired shape.
# the shape (and hence the number of dimensions) can be altered,
# but the total number of elements CANNOT be changed during a reshape operation

gaussian = gaussian.reshape(2, 3, 4)
gaussian.shape

(2, 3, 4)
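
The element-count rule can be checked directly. In this sketch (using `np.arange` instead of random data so the result is deterministic), `-1` asks NumPy to infer one dimension, and a mismatched target shape raises a `ValueError`.

```python
import numpy as np

flat = np.arange(24)

# -1 lets NumPy infer that dimension from the total number of elements
inferred = flat.reshape(2, -1, 4)
print(inferred.shape)

# a target shape with a different element count (25 != 24) is rejected
try:
    flat.reshape(5, 5)
    reshape_ok = True
except ValueError as err:
    reshape_ok = False
    print("reshape failed:", err)
```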

In [None]:
# an array full of zero values
# one can also specify datatype

zero_arr = np.zeros((3, 4), dtype=np.uint8)
zero_arr

In [None]:
# an array full of ones
# one can also specify datatype

ones_arr = np.ones((3, 4), dtype=np.float32)
ones_arr

If no datatype is specified during array construction using `np.array()`, NumPy infers a default `dtype` from the elements of the array. For integers, the result also depends on the platform.

- If all the values of the array are integers, the default integer type is assigned. This is `np.int64` on most 64-bit systems and `np.int32` on 32-bit systems (and on 64-bit Windows before NumPy 2.0).
- If at least one value is a float, the integers are up-cast to floating point and `np.float64` is assigned, regardless of whether the system is 32-bit or 64-bit.
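
A small sketch to check the inferred defaults on your own machine. The integer result is platform dependent, so no specific output is claimed for it here; the variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3])       # all integers: default integer dtype (platform dependent)
mixed = np.array([1, 2, 3.0])    # one float: integers are up-cast to float64

print(ints.dtype)
print(mixed.dtype)

# pass dtype explicitly when a specific precision is required
small = np.array([1, 2, 3], dtype=np.float32)
print(small.dtype)
```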

In [None]:
# a diagonal array

diag = np.diag([1, 2, 3, 4.0])
diag.dtype

In [None]:
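# a 4x4 identity (matrix) array
# NOTE: np.float128 is not available on every platform (e.g. Windows); np.longdouble is the portable name

iden = np.identity(4, dtype=np.float128)  # np.eye(4, dtype=np.float128)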
iden

## **NumPy Array Attributes**

- *Attributes of arrays*: Determining the size, shape, memory consumption, and data types of arrays
- *Indexing of arrays*: Getting and setting the value of individual array elements
- *Slicing of arrays*: Getting and setting smaller subarrays within a larger array
- *Reshaping of arrays*: Changing the shape of a given array
- *Joining and splitting of arrays*: Combining multiple arrays into one, and splitting one array into many

Each array has attributes such as:
 - `ndim` (the number of dimensions)
 - `shape` (the size of each dimension)
 - `size` (the total number of elements in the array)
 - `nbytes` (the total memory consumed by the array, in bytes)

In [28]:
# get number of dimensions of the array

gaussian.ndim

3

In [31]:
# get the shape of the array
gaussian.shape

(2, 3, 4)

In [38]:
# get the total number of elements in the array
gaussian.size

24

In [37]:
# total elements
print("total number of items: ", gaussian.size)

# get memory consumed by each item in the array (in bytes)
print("memory consumed by each item: ", gaussian.itemsize)

# get memory consumed by the array (in bytes); equals size * itemsize
print("total memory consumed: ", gaussian.nbytes)

total number of items:  24
memory consumed by each item:  8
total memory consumed:  192


## **Array Indexing**

 - For 1D arrays, indexing works the same way as for a Python list

In [49]:
# 1D array of random integers
# get 10 integers from 0 to 22 (the upper bound 23 is exclusive)

num_samples = 10
integers = np.random.randint(23, size=num_samples)
integers

array([ 1, 10, 17, 19,  3,  2, 15, 11, 11,  6])

In [52]:
# get the 3rd element (remember: unlike MATLAB, NumPy uses 0-based indexing)
integers[2]

17

In [56]:
# updating the array
# the array keeps its integer dtype, so the float 99.21 is truncated to 99
integers[2] = 99.21
integers

array([ 1, 10, 99, 19,  3,  2, 15, 11, 11,  6])

In [58]:
# slice a portion of the array
# similar to Python list slicing
# x[start:stop:step]

# get last 5 elements
integers[-5:]

# if `stop` is omitted then it'll be sliced till the end of the array
# by default, step is 1

array([ 2, 15, 11, 11,  6])

In [63]:
# get alternate elements (every other element) from the array
# i.e., step = 2

integers[::2]

array([ 1, 99,  3, 15, 11])

In [64]:
# reversing the array
integers[::-1]

array([ 6, 11, 11, 15,  2,  3, 19, 99, 10,  1])

In [67]:
# forward traversal of the array (from the 4th element to the end)
integers[3::]

array([19,  3,  2, 15, 11, 11,  6])

In [69]:
# reverse traversal of the array (starting from the 4th element)
integers[3::-1]

array([19, 99, 10,  1])

## **nD arrays (a.k.a. tensors)**

In [73]:
# a 2D array
twenty = (np.arange(4 * 5)).reshape(4, 5)
twenty

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [74]:
# slice first 2 rows and 3 columns
twenty[:2, :3]

array([[0, 1, 2],
       [5, 6, 7]])

In [78]:
# slice and get only the corner elements
# a step of 3 along dimension 0 picks rows 0 and 3
# a step of 4 along dimension 1 picks columns 0 and 4
twenty[::3, ::4]

array([[ 0,  4],
       [15, 19]])

In [85]:
# reversing the order of the rows (i.e. flipping along dimension 0)
twenty[::-1, ...]

array([[15, 16, 17, 18, 19],
       [10, 11, 12, 13, 14],
       [ 5,  6,  7,  8,  9],
       [ 0,  1,  2,  3,  4]])

In [86]:
# reversing the order of the columns (i.e. flipping along dimension 1)
twenty[..., ::-1]

array([[ 4,  3,  2,  1,  0],
       [ 9,  8,  7,  6,  5],
       [14, 13, 12, 11, 10],
       [19, 18, 17, 16, 15]])

In [87]:
# reversing the rows and columns (i.e. along both dimensions)
twenty[::-1, ::-1]

array([[19, 18, 17, 16, 15],
       [14, 13, 12, 11, 10],
       [ 9,  8,  7,  6,  5],
       [ 4,  3,  2,  1,  0]])

In [91]:
# or more intuitively
np.flip(twenty, axis=(0, 1))

# or equivalently
np.flipud(np.fliplr(twenty))
np.fliplr(np.flipud(twenty))

array([[19, 18, 17, 16, 15],
       [14, 13, 12, 11, 10],
       [ 9,  8,  7,  6,  5],
       [ 4,  3,  2,  1,  0]])

## **view** vs **copy**
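
Basic slicing returns a *view* that shares memory with the original array, whereas fancy indexing and `.copy()` return independent copies. The following is a minimal sketch of that difference; the array `base` and the other names are illustrative, not taken from the cells of this notebook.

```python
import numpy as np

base = np.arange(6)

# basic slicing returns a view: it shares memory with `base`
view = base[1:4]
view[0] = 100            # this also changes base[1]

# fancy (integer-array) indexing returns a copy
fancy = base[[1, 2, 3]]
fancy[0] = -1            # `base` is unaffected

# .copy() makes an explicit, independent copy of a slice
independent = base[1:4].copy()
independent[1] = -5      # `base` is unaffected

print(base)
print(np.shares_memory(base, view))
print(np.shares_memory(base, fancy))
```

`np.shares_memory` is a convenient way to check whether two arrays overlap in memory when it is unclear whether an operation produced a view or a copy.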

## **Super useful functions**

In [101]:
# toy data
arr = np.arange(5 * 7).reshape(5, 7)
arr

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])

In [102]:
# randomly shuffle the array along axis 0
# NOTE: this is an in-place operation
np.random.shuffle(arr)
arr

array([[28, 29, 30, 31, 32, 33, 34],
       [14, 15, 16, 17, 18, 19, 20],
       [ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [21, 22, 23, 24, 25, 26, 27]])
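
If the original order must be preserved, one alternative is the `permutation` method of a NumPy random `Generator`, which returns a shuffled copy instead of modifying the array in place. A minimal sketch, with the seed chosen arbitrarily for reproducibility:

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # the seed value is an arbitrary choice
arr = np.arange(5 * 7).reshape(5, 7)

# returns a new array with the rows (axis 0) shuffled; `arr` itself is untouched
shuffled = rng.permutation(arr)

print(arr[:, 0])        # original row order is preserved
print(shuffled[:, 0])   # rows appear in a shuffled order
```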

In [None]:
# argmax of an array
arr = np.arange(4, 2 * 11).reshape(2, 9)
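
The cell above stops after building the array. As a sketch of how `np.argmax` could be applied to it: without an `axis` argument it returns an index into the flattened array, which `np.unravel_index` converts back to a (row, column) pair. The variable names below are illustrative.

```python
import numpy as np

arr = np.arange(4, 2 * 11).reshape(2, 9)   # values 4..21 in a (2, 9) array

# index of the maximum in the flattened array
flat_idx = np.argmax(arr)

# convert the flat index back to (row, column) coordinates
row, col = np.unravel_index(flat_idx, arr.shape)

# per-axis variants
per_column = np.argmax(arr, axis=0)   # for each column, the row index of its maximum
per_row = np.argmax(arr, axis=1)      # for each row, the column index of its maximum

print(flat_idx, (row, col))
print(per_column)
print(per_row)
```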
