   
# Video Series: Introduction to Machine Learning with Scikit-Learn


## [4] - Prerequisite: Introduction to NumPy




<br/><br/>

___NumPy (Numerical Python) is the first step on your journey into Machine Learning with Python. NumPy is used for creating N-dimensional (N-D) arrays. Here is how we can use it:___

<br/><br/>

In [52]:
import numpy as np

In [53]:
# NumPy array

n_arr = np.array([1,2,3,4,5])

<br/><br/><br/>

In [54]:
# Python list (Python's built-in sequence type)

p_arr = [1,2,3,4,5]

<br/><br/>

In [55]:
print("Python Array => ",p_arr)
print("NumPy Array => ",n_arr)

Python Array =>  [1, 2, 3, 4, 5]
NumPy Array =>  [1 2 3 4 5]


<br/><br/><br/>

### How is it different from a Python list?


This is the first question that comes to everyone's mind, and here is why _NumPy_ is usually the better choice for numeric data:

- (1) NumPy N-D arrays take less memory to store data
- (2) NumPy makes it easy to perform mathematical operations

<br/>

#### Memory Usage

In [56]:
len(p_arr)

5

In [57]:
import sys
# Rough estimate: counts only the int objects, not the list object that holds them
len(p_arr) * sys.getsizeof(1)

140

<br/><br/>

_Let's check the size of the NumPy array_

In [58]:
len(n_arr)

5

In [59]:
n_arr.size

5

In [60]:
n_arr.itemsize

8

In [61]:
n_arr.size * n_arr.itemsize

40
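
_The list estimate above counts only the integer objects. As a rough sketch (exact byte counts depend on the Python build, so none are stated here), the list's own storage can be included with `sys.getsizeof(p_arr)`, while NumPy reports the size of its data buffer through `nbytes`. The `dtype=np.int64` argument is an assumption added so that the array result is the same on every platform._

```python
import sys
import numpy as np

p_arr = [1, 2, 3, 4, 5]
# dtype fixed explicitly so the item size is 8 bytes on every platform
n_arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)

# List container (header + pointers) plus every integer object it points to
list_bytes = sys.getsizeof(p_arr) + sum(sys.getsizeof(x) for x in p_arr)

# Raw data buffer of the array; equal to n_arr.size * n_arr.itemsize
array_bytes = n_arr.nbytes

print(list_bytes, array_bytes)
```

_`nbytes` covers only the element data, not the small fixed overhead of the array object itself._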

<br/><br/>

_Further optimization is also possible with NumPy by choosing a smaller data type_

In [62]:
n_arr = np.array([1,2,3,4,5], dtype=np.int8)

In [63]:
n_arr.size * n_arr.itemsize

5
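
_A smaller data type saves memory but also narrows the range of values it can hold. A minimal sketch using `np.iinfo` to inspect the limits of `int8` before choosing it:_

```python
import numpy as np

# Smallest and largest values an 8-bit signed integer can store
info = np.iinfo(np.int8)
print(info.min, info.max)   # -128 127

small = np.array([1, 2, 3, 4, 5], dtype=np.int8)
print(small.itemsize)       # 1 (byte per element)
print(small.nbytes)         # 5 (bytes in total)
```

_Values outside this range do not fit in `int8`, so a wider type such as `int16` or `int32` is the safer pick when the data may grow._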

<br/><br/><br/><br/>

### There is NO magic behind the reduced size

Unlike Python lists, NumPy arrays are __homogeneous__, i.e. all elements of the array are of the same type. When the input values have mixed types, NumPy converts them to one common type:

In [64]:
p_arr = [1,  2.0, "Hello", "World", 56000]

In [65]:
p_arr

[1, 2.0, 'Hello', 'World', 56000]

<br/>

In [66]:
n_arr = np.array([1,2.0, "Hello", "World", 56000])

<br/><br/><br/><br/><br/>

In [67]:
n_arr

array(['1', '2.0', 'Hello', 'World', '56000'], dtype='<U32')

In [68]:
n_arr.dtype

dtype('<U32')

In [69]:
p_arr[0] * 10

10

In [70]:
n_arr[0] * 10

'1111111111'

<br/><br/><br/>

### Exercise

Create a NumPy array with a combination of integer and floating-point values, and check the resulting data type of the array.

In [72]:
# %load chap4-1-0.py
n_arr = np.array([ 1, 2.0, 10.56, 45, 60])
n_arr, n_arr.dtype

(array([ 1.  ,  2.  , 10.56, 45.  , 60.  ]), dtype('float64'))
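
_The integers were converted to floats because NumPy picks one type that can represent every element. A short sketch of the same rule for a few other combinations (the integer width depends on the platform, so only its kind is checked):_

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed_float = np.array([1, 2.5])
mixed_complex = np.array([1.5, 2 + 3j])
mixed_text = np.array([1, "a"])

print(ints.dtype.kind)        # i  (integer; exact width is platform dependent)
print(mixed_float.dtype)      # float64
print(mixed_complex.dtype)    # complex128
print(mixed_text.dtype.kind)  # U  (unicode string)
```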

<br/><br/><br/>

#### Mathematical Operations

In [73]:
p_arr = [1,2,3,4,5]
n_arr = np.array([1,2,3,4,5])

In [74]:
p_arr * 2

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

In [75]:
n_arr * 2

array([ 2,  4,  6,  8, 10])

<br/><br/><br/><br/>

In [76]:
p_arr = [[1,2],
       [3, 4]]

n_arr = np.array([[1,2],
                 [3,4]])

In [77]:
p_arr * 2

[[1, 2], [3, 4], [1, 2], [3, 4]]

In [78]:
n_arr * 2

array([[2, 4],
       [6, 8]])
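
_With a list, `*` repeats the sequence; with a NumPy array, it is applied to every element. To get the same element-wise result from a list, an explicit loop or comprehension is needed, as this small sketch shows:_

```python
import numpy as np

p_arr = [1, 2, 3, 4, 5]
n_arr = np.array(p_arr)

doubled_list = [x * 2 for x in p_arr]   # explicit loop over the list
doubled_arr = n_arr * 2                 # element-wise, no loop needed
summed_arr = n_arr + n_arr              # element-wise; for lists, + concatenates

print(doubled_list)          # [2, 4, 6, 8, 10]
print(doubled_arr.tolist())  # [2, 4, 6, 8, 10]
print(summed_arr.tolist())   # [2, 4, 6, 8, 10]
```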

<br/><br/><br/>

### How data is represented and used in Machine Learning

<br/>

In programming, we're used to seeing arrays as a row of numbers

<br/><br/>

In [83]:
n_arr = np.array([1,2,3,4,5,6,7])

In [84]:
n_arr

array([1, 2, 3, 4, 5, 6, 7])

<br/><br/><br/>

But in machine learning, we generally represent the numbers as a column

In [85]:
n_arr = n_arr.reshape(-1,1)
n_arr

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7]])

In [86]:
n_arr.shape

(7, 1)

In [87]:
n_arr.ndim

2

___Vector  (Row Vectors and Column Vectors)___
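
_A sketch comparing the three shapes involved. In `reshape`, the value `-1` asks NumPy to work out that dimension from the number of elements:_

```python
import numpy as np

flat = np.array([1, 2, 3, 4, 5, 6, 7])
row = flat.reshape(1, -1)   # -1 is inferred as 7: one row, seven columns
col = flat.reshape(-1, 1)   # -1 is inferred as 7: seven rows, one column

print(flat.shape, flat.ndim)  # (7,) 1
print(row.shape, row.ndim)    # (1, 7) 2
print(col.shape, col.ndim)    # (7, 1) 2
```

_Note that the original array is 1-D with shape `(7,)`; it becomes a row vector or a column vector only after it is given a second dimension._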

<br/><br/><br/>

In [88]:
n_arr = np.array([[1,2],
                 [3,4]])

In [89]:
n_arr

array([[1, 2],
       [3, 4]])

<br/>

In [90]:
n_arr.shape, n_arr.ndim

((2, 2), 2)

___Matrix___

<br/><br/><br/>

In [95]:
n_arr = np.arange(0,16,1)
n_arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [96]:
n_arr = n_arr.reshape(4,2,2)

In [97]:
n_arr

array([[[ 0,  1],
        [ 2,  3]],

       [[ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15]]])

In [98]:
n_arr.shape

(4, 2, 2)

In [99]:
n_arr.ndim

3

<br/>

___Tensor___
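
_Each extra dimension adds one more index. A minimal sketch of indexing into the same `(4, 2, 2)` array, where the first index selects a 2x2 block, the second a row, and the third a column (the name `t` is used only for this illustration):_

```python
import numpy as np

t = np.arange(16).reshape(4, 2, 2)

block = t[1]        # second 2x2 block
row = t[1, 0]       # first row of that block
value = t[1, 0, 1]  # second element of that row

print(block.tolist())  # [[4, 5], [6, 7]]
print(row.tolist())    # [4, 5]
print(value)           # 5
```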

<br/><br/><br/>

### Generating NumPy arrays without predefined static data

So far we've created arrays from predefined data. Let's see how we can use different NumPy functions to create N-D arrays.

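__Using numpy.empty(...)__

`numpy.empty` allocates an array without initializing it, so the values shown in the outputs are whatever happened to be in memory and may differ from run to run.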

In [100]:
n_arr = np.empty(5)
n_arr

array([4.9e-324, 9.9e-324, 1.5e-323, 2.0e-323, 2.5e-323])

In [101]:
n_arr = np.empty((5,2))
n_arr

array([[ 1.5,  3. ],
       [ 4.5,  6. ],
       [ 7.5,  9. ],
       [10.5, 12. ],
       [13.5, 15. ]])
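
_Because the contents are arbitrary, `np.empty` is typically used only when every element will be overwritten afterwards. A minimal sketch of that pattern (the name `squares` is used only for this illustration):_

```python
import numpy as np

squares = np.empty(5)      # contents are arbitrary at this point
for i in range(5):
    squares[i] = i ** 2    # overwrite every slot before reading it

print(squares.tolist())    # [0.0, 1.0, 4.0, 9.0, 16.0]
```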

<br/><br/><br/>

#### Exercise

- 1) Convert this 2-D array into a 1-D array
- 2) Change its shape to (2, 5)

In [103]:
# %load chap4-1-1.py
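n_arr.reshape(10)      # 1) 1-D array with 10 elements
n_arr.reshape(2,5)     # 2) 2 rows, 5 columns
n_arr.reshape(10,1)    # extra: a single column (only this last result is displayed)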

array([[ 1.5],
       [ 3. ],
       [ 4.5],
       [ 6. ],
       [ 7.5],
       [ 9. ],
       [10.5],
       [12. ],
       [13.5],
       [15. ]])
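
_`reshape` returns a new array object and leaves the shape of the original untouched, which is why the result has to be assigned to a name to keep it. A short sketch, using `np.arange(10)` as stand-in data:_

```python
import numpy as np

a = np.arange(10).reshape(5, 2)
flat = a.reshape(10)      # 1-D result
wide = a.reshape(2, 5)    # 2 rows, 5 columns

print(a.shape)     # (5, 2)  the original is unchanged
print(flat.shape)  # (10,)
print(wide.shape)  # (2, 5)
```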

<br/><br/><br/><br/>

### Other utility functions

__Using numpy.zeros(...)__

In [104]:
n_arr = np.zeros(5)
n_arr

array([0., 0., 0., 0., 0.])

In [105]:
arr = np.zeros((5,2))
arr

array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])

In [106]:
arr = np.zeros((5,2)).reshape(10,1)
arr

array([[0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]])

<br/><br/>

__Using numpy.ones(...)__

In [107]:
arr = np.ones(5)
arr

array([1., 1., 1., 1., 1.])

In [108]:
arr = np.ones((5,2))
arr

array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.]])

In [109]:
arr = np.ones((5,2)).reshape(1,10)
arr

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])


<br/>
<br/>

In [110]:
# fill(...) overwrites every element of the existing array in place
arr.fill(23)

In [111]:
arr

array([[23., 23., 23., 23., 23., 23., 23., 23., 23., 23.]])
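
_Creating an array and then calling `fill` can also be done in one step with `np.full`, which takes the shape and the fill value. A minimal sketch that produces the same array as above:_

```python
import numpy as np

arr = np.full((1, 10), 23.0)   # shape (1, 10), every element set to 23.0

print(arr.shape)   # (1, 10)
print(arr.dtype)   # float64
print(arr[0, 0])   # 23.0
```

_The fill value is written as `23.0` here so that the result is a float array, matching the output of `np.ones(...)` followed by `fill(23)`._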