# NumPy

In [1]:
# regular Python lists
height = [1.73, 1.68, 1.30, 1.89]
weight = [65.5, 60, 54, 80.2]

In [2]:
height, type(height)

([1.73, 1.68, 1.3, 1.89], list)

We want to calculate the BMI for each person in the lists, but the expression below does not work: Python lists do not support element-wise arithmetic, so it raises a `TypeError`

`weight / height ** 2`

We could do this with a simple loop, but a `NumPy array` provides an easier way.
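
For comparison, the loop-based approach could look like the sketch below. It also checks that the list expression fails. The names `bmi_loop` and `list_math_failed` are introduced here only for illustration.

```python
height = [1.73, 1.68, 1.30, 1.89]
weight = [65.5, 60, 54, 80.2]

# Plain lists do not support element-wise arithmetic
try:
    weight / height ** 2
    list_math_failed = False
except TypeError:
    list_math_failed = True

# Loop over the paired values instead
bmi_loop = [w / h ** 2 for w, h in zip(weight, height)]
```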

In [3]:
import numpy as np

In [4]:
# first, let's convert the regular lists to NumPy arrays
np_height = np.array(height)
np_weight = np.array(weight)
np_height, type(np_height)

(array([1.73, 1.68, 1.3 , 1.89]), numpy.ndarray)

Now we calculate the BMI for each person using the `NumPy arrays`

In [5]:
bmi = np_weight / np_height ** 2
bmi

array([21.88512814, 21.2585034 , 31.95266272, 22.45177907])

A regular list and a NumPy array differ in several ways.<br>
First, a NumPy array contains `only one type`,<br>
so if we mix types, NumPy converts every value to a common type. In the following case all of the values become strings

In [6]:
np.array([1.0, "is", True])

array(['1.0', 'is', 'True'], dtype='<U32')
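
The common type depends on what is mixed. As a small sketch (the variable names are made up for illustration), mixing only numbers and booleans gives a float array, while adding a string turns everything into text:

```python
import numpy as np

mixed_numbers = np.array([1, 2.5, True])        # int, float and bool
mixed_with_text = np.array([1.0, "is", True])   # float, str and bool

print(mixed_numbers.dtype)         # float64
print(mixed_with_text.dtype.kind)  # U (unicode string)
```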

Second, operators such as `+` behave differently on a NumPy array

In [7]:
python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

For example, if we do `python_list + python_list`, the list `elements are pasted together`<br>
If we do the same with NumPy arrays, on the other hand, we get the `element-wise sum` of the arrays

In [8]:
python_list + python_list, numpy_array + numpy_array

([1, 2, 3, 1, 2, 3], array([2, 4, 6]))

**NumPy Subsetting using booleans**<br><br>
Say I want to get all BMI values that are over 22.<br>
The first step is to use the greater-than sign, which compares every element with 22

In [9]:
bmi > 22

array([False, False,  True,  True])

Next I can use this `boolean array` inside square brackets to do the subsetting.

In [10]:
bmi[bmi > 22]

array([31.95266272, 22.45177907])
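
The boolean array can also be stored and reused. The sketch below, with illustrative names `over_22`, `count_over_22` and `heights_over_22`, counts the matches and applies the same mask to the height array:

```python
import numpy as np

np_height = np.array([1.73, 1.68, 1.30, 1.89])
np_weight = np.array([65.5, 60, 54, 80.2])
bmi = np_weight / np_height ** 2

over_22 = bmi > 22                    # one True/False flag per element
count_over_22 = over_22.sum()         # True counts as 1, False as 0
heights_over_22 = np_height[over_22]  # same mask applied to another array
```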

**2D NumPy array**

In [11]:
np_2d = np.array([[1.73, 1.68, 1.30, 1.89],
                  [65.5, 60, 54, 80.2]])
np_2d

array([[ 1.73,  1.68,  1.3 ,  1.89],
       [65.5 , 60.  , 54.  , 80.2 ]])

The `shape attribute` of a 2D NumPy array gives the number of rows and columns

In [12]:
np_2d.shape

(2, 4)

In [13]:
# subsetting a 2D NumPy array: both forms select row 0, column 2
np_2d[0][2], np_2d[0,2]

(1.3, 1.3)

Suppose I want the height and weight of the second and third family members, and separately just the weights of everyone

In [14]:
np_2d[:, 1:3], np_2d[1, :]

(array([[ 1.68,  1.3 ],
        [60.  , 54.  ]]),
 array([65.5, 60. , 54. , 80.2]))

We can also do element-wise operations on 2D NumPy arrays,<br>
just as we did when we calculated the BMI with 1D arrays

In [15]:
test_1 = np.array([[1, 2],
                   [3, 4],
                   [5, 6],
                   [7, 8]])
test_2 = np.array([3, 4])
test_1 + test_1, test_1 * test_2

(array([[ 2,  4],
        [ 6,  8],
        [10, 12],
        [14, 16]]),
 array([[ 3,  8],
        [ 9, 16],
        [15, 24],
        [21, 32]]))
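
In `test_1 * test_2` the shapes differ, (4, 2) against (2,), so NumPy applies `test_2` to every row. This is called broadcasting. A minimal sketch of the equivalent computation done by hand, where `tiled` and `manual_product` are names chosen for illustration:

```python
import numpy as np

test_1 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])  # shape (4, 2)
test_2 = np.array([3, 4])                             # shape (2,)

broadcast_product = test_1 * test_2

# By hand: repeat test_2 once per row, then multiply element-wise
tiled = np.tile(test_2, (4, 1))                       # shape (4, 2)
manual_product = test_1 * tiled
```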

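**Useful NumPy functions**<br><br>

But before we look at some useful functions, let's generate two arrays of 5000 `random numbers` drawn from normal distributions using `np.random.normal` (the arguments are the mean, the standard deviation and the number of samples)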

In [16]:
test_height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
test_weight = np.round(np.random.normal(60.32, 15, 5000), 2)
test_height, test_weight

(array([1.77, 1.88, 1.57, ..., 1.8 , 1.85, 2.09]),
 array([77.1 , 47.41, 57.23, ..., 80.21, 80.15, 34.52]))
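
Because the numbers are random, the values change on every run and will not match the outputs shown here. If repeatable results are needed, one option is to fix the seed first. The sketch below assumes an arbitrary seed of 42:

```python
import numpy as np

np.random.seed(42)  # arbitrary seed value, an assumption for this sketch
first_run = np.round(np.random.normal(1.75, 0.20, 5000), 2)

np.random.seed(42)  # resetting the seed replays the same draws
second_run = np.round(np.random.normal(1.75, 0.20, 5000), 2)
```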

In [17]:
# use the column_stack function to paste them together as two columns
np_city = np.column_stack((test_height, test_weight))
np_city

array([[ 1.77, 77.1 ],
       [ 1.88, 47.41],
       [ 1.57, 57.23],
       ...,
       [ 1.8 , 80.21],
       [ 1.85, 80.15],
       [ 2.09, 34.52]])

In [18]:
#average height of the np_city people
np.mean(np_city[:, 0])

1.7538479999999999

In [19]:
#median weight of the np_city people
np.median(np_city[:, 1])

60.46

In [20]:
# correlation between the height and weight of the np_city people
np.corrcoef(np_city[:, 0], np_city[:, 1])

array([[ 1.       , -0.0228224],
       [-0.0228224,  1.       ]])
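
`np.corrcoef` returns a 2x2 matrix: the diagonal holds each array's correlation with itself (always 1), and the off-diagonal entries hold the correlation between the two arrays. A value near 0, as above, is expected here because the heights and weights were generated independently. A small sketch with made-up data that is perfectly linearly related:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2 * x + 1                  # exact linear function of x

corr_matrix = np.corrcoef(x, y)
r = corr_matrix[0, 1]          # correlation between x and y
```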

In [21]:
#standard deviation of the heights
np.std(np_city[:, 0])

0.19865405330876085
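
By default `np.std` computes the population standard deviation, that is, the square root of the mean squared deviation from the mean. A minimal sketch that checks this by hand on the four heights from the start of this section:

```python
import numpy as np

values = np.array([1.73, 1.68, 1.30, 1.89])

# Square root of the average squared distance from the mean
manual_std = np.sqrt(np.mean((values - np.mean(values)) ** 2))
numpy_std = np.std(values)
```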