# NumPy
NumPy is short for Numerical Python.

Almost all of the libraries and packages in the PyData ecosystem rely on NumPy.  
It is part of a bigger ecosystem that includes:
- SciPy
- Matplotlib
- pandas
- SymPy



## NumPy Array
* An alternative to the regular Python `list`
* Similar to a Python list, but __can hold only <u>*one type of data*</u>__
    * Mixed data types are <u>coerced into a single common type</u>


* Can do calculations over entire arrays
* Easier and faster than a Python `list` for numerical work
* Comes with its own methods
    * Behaves differently from a Python `list`


* Supports indexing and slicing similar to a Python `list`
* Can also be indexed with an array of Booleans to select elements
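
The points above about type coercion and whole-array calculations can be sketched as follows. This is a minimal illustration; the variable names (`mixed`, `with_text`, `prices`) are made up for the example.

```python
import numpy as np

# Mixed int and float values are coerced to a single float type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)             # float64

# Adding a string coerces every element to a string type.
with_text = np.array([1, "two", 3.0])
print(with_text.dtype.kind)    # 'U' (Unicode string)

# Arithmetic applies to every element of an array at once...
prices = np.array([10, 20, 30])
doubled = prices * 2           # array([20, 40, 60])

# ...whereas the same operator repeats a Python list.
repeated = [10, 20, 30] * 2    # [10, 20, 30, 10, 20, 30]
```

This difference in how `*` behaves is one example of an array behaving differently from a list.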

### Official documentation
Documentation for NumPy https://docs.scipy.org/doc/numpy/reference/index.html

#### n-d Array 
NumPy can produce n-dimensional arrays.


Before working with the NumPy package, we need to import it.

In [1]:
import numpy as np

The `arange` function generates a range of evenly spaced values.
* It accepts start, stop, step, and data type (`dtype`) as arguments
* The stop value itself is not included in the result


Tip: In Jupyter, use `Shift + Tab` to see the arguments / parameters of a function.  
Use `Tab` to list or auto-complete the available methods.

In [3]:
np.arange(0, 30, 5)

array([ 0,  5, 10, 15, 20, 25])

### Examining an array
The attributes below describe the structure of an array. They are also handy during exploratory data analysis (EDA).

In [5]:
my_array = np.arange(start = 0, stop = 25, step = 2)
my_array

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24])

In [7]:
my_array.ndim             # number of dimensions of the array

1

In [8]:
my_array.shape

(13,)

In [9]:
my_array.size

13

In [10]:
my_array.dtype

dtype('int32')

In [11]:
my_array.itemsize

4
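
The `size` and `itemsize` attributes shown above combine to give the memory used by the array's data. The sketch below uses the `nbytes` attribute, which is not covered in the cells above, and fixes the `dtype` to `int64` as an assumption so that the numbers are the same on every platform (the default integer type, such as the `int32` above, can differ between systems).

```python
import numpy as np

# dtype is set explicitly so the result does not depend on the platform default.
values = np.arange(0, 25, 2, dtype=np.int64)

print(values.size)       # 13 elements
print(values.itemsize)   # 8 bytes per int64 element
print(values.nbytes)     # 104 bytes in total (13 * 8)
```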

### Creating an array
The `array` function is used to create an array.  
It can create n-dimensional arrays.  


We can create an array from a Python `list`.

In [14]:
my_list = [1, 3, 5, 7]
my_array1 = np.array(object=my_list)
my_array1

array([1, 3, 5, 7])

Creating a 2-dimensional array from a list of lists  
The number of dimensions of an array can be read from the number of opening square brackets at the start of its printed output


In [15]:
my_mat = [ [1,2,3], [4,5,6], [7,8,9]]
my_array2 = np.array(my_mat)
my_array2

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
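
The bracket-counting rule can be checked against the `ndim` and `shape` attributes. A small sketch, where `matrix` and `cube` are illustrative names:

```python
import numpy as np

# Two levels of nesting give a 2-dimensional array.
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Three levels of nesting give a 3-dimensional array.
cube = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(matrix.ndim, matrix.shape)   # 2 (3, 3)
print(cube.ndim, cube.shape)       # 3 (2, 2, 2)
```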

## Functions in NumPy

### <u>Zero Array</u>
The `zeros` function returns an array of zeros.  
Pass a tuple of (rows, columns) to get a 2-dimensional array of zeros. Longer tuples give higher-dimensional arrays.

In [16]:
np.zeros(4)

array([ 0.,  0.,  0.,  0.])

In [17]:
np.zeros((5,4))

array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

### <u>One Array</u>
The `ones` function returns an array of ones.  
Pass a tuple of (rows, columns) to get a 2-dimensional array of ones.  
We can apply arithmetic operations to the result to fill the array with any other constant value.

In [18]:
np.ones(4)

array([ 1.,  1.,  1.,  1.])

In [19]:
np.ones((3,2))

array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])

In [21]:
np.ones((4,2)) * 2

array([[ 2.,  2.],
       [ 2.,  2.],
       [ 2.,  2.],
       [ 2.,  2.]])

In [22]:
np.ones((4,2)) + 3

array([[ 4.,  4.],
       [ 4.,  4.],
       [ 4.,  4.],
       [ 4.,  4.]])
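
Both `zeros` and `ones` return floats by default, as the trailing dots in the outputs above show. The sketch below shows two variations that are not used in the cells above: the `dtype` argument, and `np.full`, which builds a constant-valued array directly instead of multiplying an array of ones.

```python
import numpy as np

# Integer zeros instead of the default floats.
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros.dtype.kind)      # 'i' (signed integer)

# np.full fills a new array with the given value.
twos = np.full((4, 2), 2.0)
print(np.array_equal(twos, np.ones((4, 2)) * 2))   # True
```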

### <u>Linspace</u>

Used to produce evenly spaced numbers between two values.  
Specify the number of values required using the `num=` argument.  
The end value is included by default (`endpoint=True`). Pass `endpoint=False` to exclude it.  


Note: `linspace` takes the number of values to generate, whereas `arange` takes the step size between values.

In [25]:
np.linspace(20, 40, num=8, endpoint=True)

array([ 20.        ,  22.85714286,  25.71428571,  28.57142857,
        31.42857143,  34.28571429,  37.14285714,  40.        ])
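
To make the contrast with `arange` concrete, here is a short side-by-side sketch over the same interval. The variable names are illustrative.

```python
import numpy as np

by_step = np.arange(20, 40, 5)                           # step of 5, stop excluded
by_count = np.linspace(20, 40, num=5)                    # 5 values, stop included
open_end = np.linspace(20, 40, num=4, endpoint=False)    # 4 values, stop excluded

print(by_step)    # [20 25 30 35]
print(by_count)   # [20. 25. 30. 35. 40.]
print(open_end)   # [20. 25. 30. 35.]
```

Note that `linspace` returns floats even when both boundaries are integers.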

### <u>Creating Identity Matrix</u>

The `eye` function is used to produce an identity matrix.  
An identity matrix is a square matrix in which all the elements of the principal diagonal are ones and all other elements are zeros.

In [26]:
np.eye(4)

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])
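
The defining property of an identity matrix is that multiplying any compatible matrix by it leaves that matrix unchanged. A minimal check, using the `@` matrix-multiplication operator (not introduced elsewhere in these notes) and an illustrative 3x3 `matrix`:

```python
import numpy as np

matrix = np.arange(1, 10).reshape(3, 3)
identity = np.eye(3)

# @ performs matrix multiplication, not element-wise multiplication.
product = matrix @ identity
print(np.array_equal(product, matrix))   # True
```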

## Generating Random Numbers

There are a lot of functions for generating random numbers in `np.random`.  
Some of the available distributions are
- binomial
- exponential
- logistic
- multinomial
- normal

In [7]:
np.random.rand(5)

array([ 0.21779491,  0.62146378,  0.24107481,  0.35543734,  0.26249475])

In [9]:
np.random.rand(4,3)

array([[ 0.51738473,  0.96767235,  0.09897713],
       [ 0.19548317,  0.41572895,  0.76111216],
       [ 0.97628879,  0.3942287 ,  0.82326943],
       [ 0.15588998,  0.15160123,  0.6143077 ]])

To get random **integers** between two boundaries, use `randint`.  
Note: The lower end is inclusive; the higher end is exclusive.

In [10]:
np.random.randint(5, 20, 5)

array([18, 19,  8,  6,  8])

To get random numbers from the standard normal distribution (mean 0, standard deviation 1), use `randn`

In [11]:
np.random.randn(2)

array([ 1.37005701, -0.85470637])

In [12]:
np.random.randn(3,2)

array([[ 0.20176307, -1.10323248],
       [ 2.50744897, -0.78911998],
       [-1.58796816, -0.40941872]])
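
The outputs above change on every run. When reproducible numbers are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng`. The sketch below is an alternative to the `np.random.rand` / `randint` / `randn` calls used above, not a replacement the notes prescribe; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # same seed gives the same sequence

uniform = rng.random(5)                                 # like np.random.rand(5)
integers = rng.integers(5, 20, size=5)                  # like np.random.randint(5, 20, 5)
normal = rng.normal(loc=0.0, scale=1.0, size=(3, 2))    # like np.random.randn(3, 2)
coin_flips = rng.binomial(n=10, p=0.5, size=4)          # successes in 10 fair trials

print(uniform)
print(integers)
print(normal)
print(coin_flips)
```

`normal` and `binomial` here correspond to two of the distributions named in the list at the start of this section.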

### Reshaping array
To reshape an array, use the `reshape` method. The new shape must hold the same number of elements as the original array.

In [37]:
my_array = np.arange(20,40)
my_array

array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
       37, 38, 39])

In [38]:
my_array.reshape(4,5)

array([[20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39]])
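
Two details of `reshape` are worth a quick sketch: passing `-1` for one dimension lets NumPy infer it, and a shape with the wrong number of elements raises a `ValueError`. The array `values` below is an illustrative stand-in for the 20-element array above.

```python
import numpy as np

values = np.arange(20, 40)           # 20 elements

# -1 asks NumPy to work out the missing dimension (20 / 4 = 5 columns).
auto_cols = values.reshape(4, -1)
print(auto_cols.shape)               # (4, 5)

try:
    values.reshape(3, 7)             # 3 * 7 = 21, but there are only 20 elements
except ValueError as err:
    print("reshape failed:", err)

# reshape returns a new array object; the original keeps its shape.
print(values.shape)                  # (20,)
```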

### Other useful methods

`max` & `min` methods

In [39]:
my_array.max()

39

In [40]:
my_array.min()

20

To get the indices of the maximum & minimum values, use `argmax` & `argmin`

In [41]:
my_array.argmax()

19

In [42]:
my_array.argmin()

0
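
On a 2-dimensional array these methods also accept an `axis` argument, and `argmax` without an axis returns a position in the flattened array. The sketch below assumes a 4x5 `grid` built from the same values and uses `np.unravel_index` (not covered above) to turn the flat position into a row and column.

```python
import numpy as np

grid = np.arange(20, 40).reshape(4, 5)

print(grid.max(axis=0))   # maximum of each column: [35 36 37 38 39]
print(grid.max(axis=1))   # maximum of each row:    [24 29 34 39]

flat_index = grid.argmax()                             # 19, position in the flattened array
row, col = np.unravel_index(flat_index, grid.shape)    # row 3, column 4
print(row, col)
```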

### Getting shape of an array

In [50]:
my_array.shape

(20,)

In [51]:
my_arr = my_array.reshape(4,5)
my_arr.shape

(4, 5)

### To get the data type of an array

In [46]:
my_array.dtype

dtype('int32')

## <u>Indexing and Selection</u>

In [48]:
my_array[8]

28

In [52]:
my_array[4:8]

array([24, 25, 26, 27])

In [54]:
my_array[:4]

array([20, 21, 22, 23])

In [55]:
my_array[4:]

array([24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39])

Indexing a 2-d array can be done using two methods  
* Single bracket method `[row, column]`  
* Double bracket method `[row][column]`  

Note: Remember that indexing in Python starts at 0


In [59]:
my_arr

array([[20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39]])

In [60]:
my_arr[0][3]

23

In [62]:
my_arr[3][4]

39

In [65]:
my_arr[1,3]

28

In [66]:
my_arr[0:2 , 2:4]

array([[22, 23],
       [27, 28]])

Note: The upper boundary (the stop index) is _**excluded**_ in both row & column selection

Tip: Use single bracket notation for selection for convenience
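
Beyond convenience, the two notations stop being interchangeable once slices are involved. A short sketch, with `grid` as an illustrative 4x5 array holding the same values as above:

```python
import numpy as np

grid = np.arange(20, 40).reshape(4, 5)

# For single elements both notations agree.
assert grid[1][3] == grid[1, 3] == 28

# The comma form slices rows and columns together.
block = grid[0:2, 2:4]       # rows 0-1, columns 2-3
print(block.shape)           # (2, 2)

# Chained brackets slice the rows twice: rows 2-3 of a 2-row result.
chained = grid[0:2][2:4]
print(chained.shape)         # (0, 5), an empty array
```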

### Comparison operators with Array
When we use comparison operators with arrays, the result is an array of Boolean values.  
We can use this Boolean array to do conditional selection.


In [69]:
bool_array = my_array > 30
bool_array

array([False, False, False, False, False, False, False, False, False,
       False, False,  True,  True,  True,  True,  True,  True,  True,
        True,  True], dtype=bool)

In [72]:
my_array2 = my_array[bool_array]
my_array2

array([31, 32, 33, 34, 35, 36, 37, 38, 39])
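
Several conditions can be combined into one mask with `&` (and), `|` (or) and `~` (not). Each comparison needs its own parentheses because these operators bind more tightly than `>` or `<`. A minimal sketch with illustrative names:

```python
import numpy as np

values = np.arange(20, 40)

in_range = values[(values > 25) & (values < 30)]     # [26 27 28 29]
extremes = values[(values < 22) | (values >= 38)]    # [20 21 38 39]
odd_only = values[~(values % 2 == 0)]                # every element that is not even

print(in_range)
print(extremes)
print(odd_only)
```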

## NumPy Operations

Arithmetic operations in NumPy are performed on an **_element by element_** basis

### Array with Array

In [74]:
my_array + my_array

array([40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72,
       74, 76, 78])

In [77]:
my_array / my_array

array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.])
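
Two consequences of element-wise arithmetic are sketched below with small illustrative arrays `a` and `b`: `*` is not matrix multiplication (the `@` operator is), and arrays whose shapes cannot be matched up raise a `ValueError`.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

elementwise = a * b        # [[ 10  40] [ 90 160]]
matrix_product = a @ b     # [[ 70 100] [150 220]]

try:
    np.arange(3) + np.arange(4)    # 3 elements cannot be paired with 4
except ValueError as err:
    print("shapes do not match:", err)
```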

### Array with Scalars 

In [79]:
my_array * 3

array([ 60,  63,  66,  69,  72,  75,  78,  81,  84,  87,  90,  93,  96,
        99, 102, 105, 108, 111, 114, 117])

In [80]:
my_array + 5

array([25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
       42, 43, 44])

In [81]:
my_array - 10

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
       27, 28, 29])

In [82]:
my_array ** 2

array([ 400,  441,  484,  529,  576,  625,  676,  729,  784,  841,  900,
        961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521])

### Universal Array Functions

In [83]:
np.sqrt(my_array)

array([ 4.47213595,  4.58257569,  4.69041576,  4.79583152,  4.89897949,
        5.        ,  5.09901951,  5.19615242,  5.29150262,  5.38516481,
        5.47722558,  5.56776436,  5.65685425,  5.74456265,  5.83095189,
        5.91607978,  6.        ,  6.08276253,  6.164414  ,  6.244998  ])

In [84]:
np.exp(my_array)

array([  4.85165195e+08,   1.31881573e+09,   3.58491285e+09,
         9.74480345e+09,   2.64891221e+10,   7.20048993e+10,
         1.95729609e+11,   5.32048241e+11,   1.44625706e+12,
         3.93133430e+12,   1.06864746e+13,   2.90488497e+13,
         7.89629602e+13,   2.14643580e+14,   5.83461743e+14,
         1.58601345e+15,   4.31123155e+15,   1.17191424e+16,
         3.18559318e+16,   8.65934004e+16])

In [85]:
np.max(my_array)

39
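
The functions in this section fall into two groups: `np.sqrt` and `np.exp` are universal functions (ufuncs) that return one result per element, while `np.max` is a reduction that collapses the array to a single value. A small sketch of the difference, with `values` as an illustrative array:

```python
import numpy as np

values = np.array([1, 4, 9, 16])

roots = np.sqrt(values)      # element-wise: [1. 2. 3. 4.]
largest = np.max(values)     # reduction: 16
total = np.sum(values)       # reduction: 30

print(isinstance(np.sqrt, np.ufunc))   # True
print(isinstance(np.max, np.ufunc))    # False
```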