<center><img src="img/numpy.png" alt="drawing" width="150"/></center>

# NumPy


NumPy, short for Numerical Python, is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions, covering areas such as linear algebra and Fourier transforms, to operate on these arrays.

The ancestor of NumPy, Numeric, was originally created by Jim Hugunin with contributions from several other developers. In 2005, Travis Oliphant created NumPy by incorporating features of the competing Numarray into Numeric, with extensive modifications. NumPy is open-source software and has many contributors.

So, why use NumPy? Python lists can serve as arrays, but they are slow to process. NumPy aims to provide an array object that can be up to 50x faster than traditional Python lists, depending on the operation and the array size. The array object in NumPy is called `ndarray`, and it comes with many supporting functions that make it easy to work with. Unlike lists, NumPy arrays are stored in one contiguous block of memory, so processes can access and manipulate them very efficiently. This property is known in computer science as locality of reference. NumPy is also optimized to work with modern CPU architectures.
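A rough way to see the speed difference is to time the same element-wise operation on a list and on an array. The sketch below is illustrative only: the array size and the `timeit` repetition count are arbitrary choices, and the measured ratio will vary with the machine, the operation, and the array size.

```python
import timeit

import numpy as np

n = 10_000  # arbitrary size chosen for illustration
py_list = list(range(n))
np_array = np.arange(n)

# Square every element: a Python-level loop vs. one vectorized NumPy call
list_time = timeit.timeit(lambda: [x * x for x in py_list], number=100)
array_time = timeit.timeit(lambda: np_array * np_array, number=100)

print(f"list:  {list_time:.4f} s")
print(f"array: {array_time:.4f} s")

# Both approaches compute the same values
same_result = [x * x for x in py_list] == (np_array * np_array).tolist()
print("same result:", same_result)
```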

NumPy is written partially in Python, but most of the parts that require fast computation are written in C or C++. The source code for NumPy is located at [this GitHub repository](https://github.com/numpy/numpy).

In [4]:
import numpy as np

## NumPy Arrays

NumPy is used to work with arrays. The array object in NumPy is called `ndarray`. We can create an `ndarray` object by using the `array()` function: pass it a list, a tuple, or any array-like object, and it will be converted into an `ndarray`.
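As a small illustration of the point about array-like inputs, the sketch below builds the same array from a list, a tuple, and a `range` object. The variable names are made up for this example.

```python
import numpy as np

from_list = np.array([1, 2, 3])       # from a list
from_tuple = np.array((1, 2, 3))      # from a tuple
from_range = np.array(range(1, 4))    # from another sequence-like object

print(from_list, from_tuple, from_range)

# All three inputs produce an ndarray with the same contents
print(np.array_equal(from_list, from_tuple))
print(np.array_equal(from_list, from_range))
```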

A dimension in arrays is one level of array depth in nested arrays (arrays that have arrays as their elements).

* **0-D Arrays**: 0-D arrays, or scalars, are the elements in an array. Each value in an array is a 0-D array.
* **1-D Arrays**: An array that has 0-D arrays as its elements is called a vector or 1-D array. These are the most common and basic arrays.
* **2-D Arrays**: An array that has 1-D arrays as its elements is called a matrix, a 2nd-order tensor, or a 2-D array.
* **3-D Arrays**: An array that has 2-D arrays as its elements is called a 3rd-order tensor or a 3-D array.

<center><img src="img/numpy01.png" alt="drawing" width="650"/></center>



In [14]:
arr_0d = np.array(42)

arr_1d = np.array([1, 2, 3,4,5])

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

arr_3d = np.array([[[1, 2, 3],
                    [4, 5, 6]],
 
                   [[1, 2, 3],
                    [4, 5, 6]]])

arr_0d, arr_1d, arr_2d, arr_3d

(array(42),
 array([1, 2, 3, 4, 5]),
 array([[1, 2, 3],
        [4, 5, 6]]),
 array([[[1, 2, 3],
         [4, 5, 6]],
 
        [[1, 2, 3],
         [4, 5, 6]]]))

In [15]:
type(arr_0d), type(arr_1d), type(arr_2d), type(arr_3d)

(numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray)

NumPy arrays provide the `ndim` attribute, an integer that tells us how many dimensions the array has.

In [16]:
arr_0d.ndim, arr_1d.ndim, arr_2d.ndim, arr_3d.ndim

(0, 1, 2, 3)

Of course, an array can have any number of dimensions. When the array is created, you can set the minimum number of dimensions with the `ndmin` argument.

In [17]:
arr_5d = np.array([1, 2, 3, 4], ndmin=5)
arr_5d

array([[[[[1, 2, 3, 4]]]]])

In this array, the innermost dimension (the 5th) has 4 elements. The 4th dimension has 1 element, which is the vector; the 3rd dimension has 1 element, which is the matrix containing that vector; the 2nd dimension has 1 element, which is a 3-D array; and the 1st dimension has 1 element, which is a 4-D array.
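The nesting described above can be checked through the `shape` attribute. The sketch below also shows that `ndmin` only sets a lower bound; the second array is a made-up example.

```python
import numpy as np

arr_5d = np.array([1, 2, 3, 4], ndmin=5)
print(arr_5d.shape)  # (1, 1, 1, 1, 4)

# ndmin is a lower bound: an input that already has more dimensions keeps them
already_2d = np.array([[1, 2], [3, 4]], ndmin=1)
print(already_2d.ndim)  # 2
```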

Arrays carry their own attributes. Among the most important ones are:

* **Shape**: The length (number of elements) of each of the dimensions of an array.
* **Type**: The data type of the elements of the array.
* **Rank**: The number of array dimensions.
* **Size**: The total number of items in the array.

In [30]:
array_1d = np.array([1, 2, 3, 4])  # 1-D example array; matches the outputs shown below
print("Shape:", array_1d.shape)
print("Type:", array_1d.dtype)
print("Rank:", array_1d.ndim)
print("Size:", array_1d.size)

Shape: (4,)
Type: int64
Rank: 1
Size: 4


Arrays are indexed just like Python lists.

In [31]:
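array_2d = np.array([[1., 2., 3., 4.], [5., 6., 7., 8.]])  # assumed definition, reconstructed from the outputs shown in this section
array_2d[1][1:3]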

array([6., 7.])
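Besides list-style chained indexing, NumPy arrays accept one index per dimension inside a single pair of brackets. The sketch below uses a hypothetical array named `grid`, which is not part of the notebook.

```python
import numpy as np

# Hypothetical example array (not taken from the notebook)
grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])

chained = grid[1][1:3]  # list-style: select row 1, then slice it
direct = grid[1, 1:3]   # NumPy-style: row and column selection in one step
column = grid[:, 2]     # every row, column 2

print(chained)  # [6 7]
print(direct)   # [6 7]
print(column)   # [3 7]
```

The last form has no direct equivalent with nested Python lists, where selecting a column requires a loop or a comprehension.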

NumPy also allows changing the elements of an array directly through indexing.

In [32]:
array_2d[1][2] = 5  
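Because the cell above produces no output, here is a minimal sketch of in-place assignment on a separate, made-up array called `values`. It also shows that an assigned value is converted to the array's existing data type.

```python
import numpy as np

values = np.array([[1., 2., 3.], [4., 5., 6.]])  # hypothetical example data

values[0, 1] = 20   # change a single element
values[1, :] = 0    # change a whole row at once

print(values)
print(values.dtype)  # float64: the assigned integers are stored as floats
```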

Additionally, one can use conditional indexing.

In [37]:
array_2d[array_2d>2]                                                                                 

array([3., 4., 5., 6., 5., 8.])
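Conditional indexing works through a boolean mask with the same shape as the array. The sketch below, on a made-up array named `data`, shows the mask itself, a combined condition, and assignment through a mask.

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])  # hypothetical example data

mask = data > 2   # boolean array with the same shape as data
print(mask)

# Conditions are combined with & and |, each wrapped in parentheses
middle = data[(data > 2) & (data < 7)]
print(middle)     # [3 4 5 6]

# A mask can also select the elements to overwrite
clipped = data.copy()
clipped[clipped > 6] = 6
print(clipped)
```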

Or, to obtain the indices of the conditional selection:

In [39]:
np.where(array_2d>2)   

(array([0, 0, 1, 1, 1, 1]), array([2, 3, 0, 1, 2, 3]))
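`np.where` with a single argument returns one index array per dimension. As a sketch on a made-up array named `data`, the two arrays can be paired into (row, column) positions or passed back to the array as indices.

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])  # hypothetical example data

rows, cols = np.where(data > 2)  # one index array per dimension
print(list(zip(rows.tolist(), cols.tolist())))  # (row, column) pairs

# Indexing with the two index arrays returns the same values as the mask
selected = data[rows, cols]
print(np.array_equal(selected, data[data > 2]))
```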

There are some other ways to create arrays.

In [17]:
# arrays of a given shape filled with 0s or 1s
zeros = np.zeros((2,3))   
ones = np.ones((3,2))
zeros, ones

(array([[0., 0., 0.],
        [0., 0., 0.]]),
 array([[1., 1.],
        [1., 1.],
        [1., 1.]]))

In [21]:
# general form
sevens = np.full(shape = (2, 3), fill_value = 7)
tens = np.full(shape = (3, 2), fill_value = 10)
sevens, tens

(array([[7, 7, 7],
        [7, 7, 7]]),
 array([[10, 10],
        [10, 10],
        [10, 10]]))
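A few other creation functions are worth knowing. The sketch below shows `np.arange`, `np.linspace`, and `np.eye`; the argument values are arbitrary.

```python
import numpy as np

steps = np.arange(0, 10, 2)    # start, stop (exclusive), step
points = np.linspace(0, 1, 5)  # 5 evenly spaced values, both endpoints included
identity = np.eye(3)           # 3x3 identity matrix

print(steps)     # [0 2 4 6 8]
print(points)
print(identity)
```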

Random arrays are arrays of some arbitrary size that contain random numbers.

In [46]:
# set specific seed for reproducibility
np.random.seed(42)

In [47]:
# generate 20 random integers in the range [1, 100)
X = np.random.randint(1, 100, size=20)
Y = np.random.randint(1, 100, size=20)  

# basic statistics
np.mean(X), np.var(X), np.std(X), np.median(X), np.percentile(X, q=50), np.ptp(X)

(49.2, 996.8599999999999, 31.573089807619397, 52.5, 52.5, 91)
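To illustrate what the seed does, the sketch below resets it and draws the same numbers twice. It also checks the range of `randint` and the relationship between variance and standard deviation. The sample size of 5 is an arbitrary choice.

```python
import numpy as np

np.random.seed(42)
first = np.random.randint(1, 100, size=5)

np.random.seed(42)  # resetting the seed restarts the sequence
second = np.random.randint(1, 100, size=5)

print(np.array_equal(first, second))

# randint excludes the upper bound, so every value lies in [1, 100)
print(first.min() >= 1, first.max() < 100)

# The variance is the square of the standard deviation
print(np.isclose(np.var(first), np.std(first) ** 2))
```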

One can compute the covariance and correlation between two arrays.

In [48]:
# covariance matrix
np.cov(X,Y)

array([[1049.32631579,  133.11578947],
       [ 133.11578947,  510.06315789]])

In [49]:
# correlation matrix
np.corrcoef(X, Y)

array([[1.       , 0.1819543],
       [0.1819543, 1.       ]])
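The top-left entry of the covariance matrix (about 1049.33) differs from the `np.var(X)` value printed earlier (about 996.86). The reason is the normalization: by default `np.cov` divides by n - 1, while `np.var` divides by n. The sketch below checks this and rebuilds the correlation coefficient by hand.

```python
import numpy as np

np.random.seed(42)
X = np.random.randint(1, 100, size=20)
Y = np.random.randint(1, 100, size=20)

cov = np.cov(X, Y)

# np.cov divides by n - 1 by default; np.var divides by n unless ddof=1 is passed
print(np.isclose(cov[0, 0], np.var(X, ddof=1)))

# Correlation = covariance / (std_X * std_Y), using the same ddof throughout
manual_corr = cov[0, 1] / (np.std(X, ddof=1) * np.std(Y, ddof=1))
print(np.isclose(manual_corr, np.corrcoef(X, Y)[0, 1]))
```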

Basic mathematical operations can be performed with the usual mathematical symbols, or in a more NumPy-like way, as follows.

In [51]:
a = np.array([[3, 6]])
b = np.array([[2, 2]])

# element wise addition
addition = np.add(a,b)

# element wise multiplication
multiplication = np.multiply(a,b)

# matrix multiplication (reshaping needed)
matrix_multiplication = np.matmul(np.reshape(a, [1, 2]), np.reshape(b, [2, 1]))

# element wise division
division = np.divide(a, b)

# element-wise exponentiation
power = np.power(a,b)

addition, multiplication, matrix_multiplication, division, power

(array([[5, 8]]),
 array([[ 6, 12]]),
 array([[18]]),
 array([[1.5, 3. ]]),
 array([[ 9, 36]]))
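For comparison, here is a sketch of the same operations written with the usual symbols. Each line checks that the operator form gives the same result as the function form used above.

```python
import numpy as np

a = np.array([[3, 6]])
b = np.array([[2, 2]])

print(np.array_equal(a + b, np.add(a, b)))        # element-wise addition
print(np.array_equal(a * b, np.multiply(a, b)))   # element-wise multiplication
print(np.array_equal(a @ b.T, np.matmul(a, b.reshape(2, 1))))  # b.T has shape (2, 1)
print(np.array_equal(a / b, np.divide(a, b)))     # element-wise division
print(np.array_equal(a ** b, np.power(a, b)))     # element-wise exponentiation
```

Note that `*` is element-wise for arrays; matrix multiplication uses the `@` operator.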

Some more basic functions:

In [53]:
# square
square = np.square(array_1d)

# square root
sqrt = np.sqrt(array_1d)

# absolute values
absolute_value = np.abs(array_1d)

# find the minimum
reduce_min = np.min(array_1d)

# find the minimum element position
argmin = np.argmin(array_1d)

# find the maximum
reduce_max = np.max(array_1d)

# find the maximum element position
argmax = np.argmax(array_1d)

square, sqrt, absolute_value, reduce_min, argmin, reduce_max, argmax

(array([ 1,  4,  9, 16]),
 array([1.        , 1.41421356, 1.73205081, 2.        ]),
 array([1, 2, 3, 4]),
 1,
 0,
 4,
 3)
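On arrays with more than one dimension, the reduction functions above accept an `axis` argument. The sketch below uses a made-up 2-D array named `table` to show the difference between reducing over the whole array and reducing along one axis.

```python
import numpy as np

table = np.array([[4, 9, 1],
                  [7, 2, 8]])  # hypothetical example data

print(np.min(table))          # smallest value overall: 1
print(np.min(table, axis=0))  # minimum of each column: [4 2 1]
print(np.max(table, axis=1))  # maximum of each row: [9 8]

# Without axis, argmax returns a position in the flattened array
flat_pos = np.argmax(table)
print(flat_pos)                                 # 1
print(np.unravel_index(flat_pos, table.shape))  # row 0, column 1
```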