# PYTHON NUMPY

In our last class meeting, we learned that NumPy is a Python library generally used for working with arrays. It includes a wide range of mathematical functions, covering areas such as linear algebra, Fourier transforms, and random number generation.


**Why do we need to learn NumPy?**
- NumPy is used in machine learning, data science, image and signal processing, scientific computing, and quantum computing.


**Advantages of using NumPy Arrays:**
- NumPy arrays store their elements in a compact, fixed-type block of memory, so they consume less memory and are faster to process than Python lists.

- NumPy arrays are optimized for complex mathematical and statistical operations. Vectorized operations can be up to about 50x faster than iterating over Python lists with loops, although the actual speedup depends on the operation and the array size.

- NumPy arrays can also be used with various libraries such as Pandas, SciPy, and scikit-learn.
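
As a rough illustration of the memory claim above, the sketch below compares the storage used by a Python list of integers with a NumPy array holding the same values. The variable names are illustrative, the byte counts assume CPython, and `sys.getsizeof` only approximates the memory a list uses.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype='int64')

# a list stores pointers to separate int objects, so count both parts
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# an array stores the raw 8-byte integers in one block
array_bytes = arr.nbytes  # 1000 elements * 8 bytes each

print(list_bytes, array_bytes)
```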


To learn more about NumPy, let’s start with its basic usage.

### IMPORTING NUMPY

Reminder: Make sure that you have installed NumPy before importing the library.

In [None]:
import numpy as np
# the numpy library is aliased as np so that the code is more readable
# and the functions inside the library are easier to call

### NUMPY ARRAY CREATION

There are several ways to create an array:

#### Using a Python List:

In [None]:
# complete the code and output the array
np.array([2, 4, 6, 8])

#### Using np.zeros()

The number passed as an argument to np.zeros() determines the size of the resulting array filled with zeros.

In [None]:
# complete the code and output the array
np.zeros(4)

#### Using np.arange()

Returns an array of evenly spaced values within a specified interval. The stop value itself is not included.

In [None]:
# complete the code and output the array

#create an array with values from 0 to 4
np.arange(5) 

#create an array with values from 1 to 8 with a step of 2
np.arange(1, 9, 2) 
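
The start, stop, and step arguments of `np.arange()` can also be negative or fractional. The short sketch below uses made-up values and variable names to show both cases.

```python
import numpy as np

# a negative step counts downward; the stop value (0) is excluded
countdown = np.arange(10, 0, -3)
print(countdown)  # holds 10, 7, 4, 1

# the step may also be a float
quarters = np.arange(0, 1, 0.25)
print(quarters)  # holds 0.0, 0.25, 0.5, 0.75
```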

#### Using np.random.rand()

Used to create an array of random numbers drawn uniformly from the interval [0, 1).

In [None]:
# complete the code and output the array
np.random.rand(5)

#### Using np.empty()

Used to create an array without initializing its elements.

In [None]:
# complete the code and output the array
np.empty(4)

# The array is not actually empty: its elements are uninitialized, so it holds whatever arbitrary values were already in memory
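
Because the contents of an array from `np.empty()` are arbitrary, a common pattern is to fill it before reading from it. This is a minimal sketch with an illustrative variable name and made-up values.

```python
import numpy as np

# np.empty() only reserves memory; the values it holds are arbitrary
buffer = np.empty(4)
print(buffer.shape, buffer.dtype)

# assign every element before reading the array
buffer[:] = [1.5, 2.5, 3.5, 4.5]
print(buffer)
```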

### NUMPY DATA TYPES

To check the data type of an array, use the **```dtype```** attribute, which returns the data type of the elements inside the array.

In [None]:
# create an array of integers
np.array([-3, -1, 0, 1]) 

# print the data type of the array

For integers, the default data type is **int64** on most platforms. You can change the data type by specifying the one to use when creating the array.

In [None]:
np.array([1, 3, 7], dtype = 'int32')

# print the array and its data type

Type conversion can also be performed in NumPy.

In [None]:
np.array([1, 3, 5, 7, 9])
# use this syntax to perform type conversion: arr.astype('float')

# print the converted array and its new data type
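
Type conversion can also lose information. As a small sketch with made-up values, converting floats to integers drops the fractional part instead of rounding.

```python
import numpy as np

prices = np.array([1.7, -2.3, 3.9])
print(prices.dtype)  # float64

# converting floats to integers truncates toward zero
whole = prices.astype('int32')
print(whole)         # holds 1, -2, 3
print(whole.dtype)   # int32

# astype() returns a new array; the original is unchanged
print(prices.dtype)  # float64
```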

#### Other NumPy Attributes

- **```ndim```** – returns the number of dimensions of the array
- **```size```** – returns the number of elements in the array
- **```shape```** – returns the size of the array in each dimension
- **```itemsize```** – returns the size (in bytes) of each element in the array
- **```data```** – returns the buffer (reference or pointer) containing the actual elements of the array in memory


In [None]:
np.arange(9)

# print the number of dimensions of the array
# print the array shape
# print the array size
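
To see how these attributes relate to each other, here is a short sketch on a hypothetical 2D array of zeros.

```python
import numpy as np

grid = np.zeros((3, 4), dtype='float64')

print(grid.ndim)      # 2 dimensions
print(grid.shape)     # (3, 4): 3 rows, 4 columns
print(grid.size)      # 12 elements in total
print(grid.itemsize)  # 8 bytes per float64 element
```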

### NUMPY ARRAY INDEXING

Similar to lists, array elements can be accessed using their index.

In [None]:
np.array([1, 3, 5, 7, 9, 11])
# access the first element
# access the third element
# access the fifth element

You can also access array elements using negative indexing.

In [None]:
np.array([1, 3, 5, 7, 9, 11])
# access the last element
# access the second-to-the-last element
# access the fourth-to-the-last element

### NUMPY ARRAY SLICING

Similar to lists, you can also get a subset of array elements using the slicing operator. 

**```arr[start:stop:step]```**
- start is the index where the subset begins. By default, it is 0.
- stop is the index where the subset ends. The element at stop is not included, so the last element taken is at index stop – 1. By default, stop is the length of the array, so the subset runs through the last element.
- step is optional and sets how many positions to move between collected elements. By default, step is 1.
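
The rules above can be sketched on a small array. The values here are made up for illustration and differ from the exercise array.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])

print(arr[1:6:2])  # indexes 1, 3, 5 -> 20, 40, 60
print(arr[:3])     # start defaults to 0 -> 10, 20, 30
print(arr[5:])     # stop defaults to the end -> 60, 70, 80
print(arr[-3:])    # negative start counts from the end -> 60, 70, 80
print(arr[::-1])   # negative step reverses the array
```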

In [None]:
np.array([1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27])
# get the subset of [5, 7, 9, 11]
# get the subset of [9, 15, 21, 27]
# get the subset of [13, 17]

You can also get a subset of the array using negative indexing.

In [None]:
np.array([1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27])
# get the subset of [23, 25, 27]
# get the subset of [9, 13, 17, 21]
# get the subset of [27, 17, 7]

### NUMPY ARRAY RESHAPING

A 1D array can be reshaped into an N-D array by calling **```np.reshape(array, newshape, order = 'C')```**. The new shape must hold the same number of elements as the original array.

In [None]:
np.array([2, 4, 6, 8, 9, 10, 11, 12])

# reshape the array into a 2D array with 2 rows and 4 columns
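
As a small sketch of how the new shape is chosen, the example below reshapes a six-element array (made-up data) and lets NumPy infer one dimension with `-1`.

```python
import numpy as np

arr = np.arange(6)  # 0, 1, 2, 3, 4, 5

# 3 rows and 2 columns; 3 * 2 must equal the 6 original elements
reshaped = np.reshape(arr, (3, 2))
print(reshaped)

# -1 asks NumPy to infer that dimension from the array size
inferred = np.reshape(arr, (2, -1))
print(inferred.shape)  # (2, 3)
```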

### NUMPY COMPARISON/LOGICAL OPERATORS

When comparing arrays or applying logical operators, NumPy performs the operation element-wise.

In [None]:
np.array([1, 2, 3])
np.array([3, 2, 1])

# perform the less-than comparison using the less() function
# perform the greater-than comparison using the greater() function
# perform the not-equal comparison using the not_equal() function

Logical operators work on arrays whose elements are of Boolean type.

In [None]:
np.array([True, False, True])
np.array([False, False, True])

# perform the AND operator using logical_and() function
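
Each comparison and logical function also has an operator form. The sketch below, on made-up arrays, shows a few of them side by side.

```python
import numpy as np

a = np.array([5, 10, 15])
b = np.array([10, 10, 10])

# each function has an equivalent operator
print(np.less(a, b))       # same as a < b  -> True, False, False
print(np.not_equal(a, b))  # same as a != b -> True, False, True

# combine two Boolean arrays element by element
in_range = np.logical_and(a > 5, a < 15)  # same as (a > 5) & (a < 15)
print(in_range)            # False, True, False
```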

### NUMPY MATH FUNCTIONS

NumPy Math Functions are divided into three categories:
- Arithmetic Functions
- Trigonometric Functions
- Rounding Functions

Let’s perform Arithmetic Functions on a 1D array:

In [None]:
np.array([1, 3, 5, 7, 9, 11])
np.array([2, 4, 6, 8, 10, 12])

# get the sum of the two arrays using the + operator and the add() function
# get the product of the two arrays using the * operator and the multiply() function

Trigonometric Functions on a 1D array:

In [None]:
# array of angles in radians
np.array([0, 1, 2])
# compute the cosine of the angles
# compute the inverse tangent of the angles

In [None]:
# angle in radians
angle = 1.57079633 
# convert the angle to degrees
# convert the angle back to radians
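
NumPy's trigonometric functions work in radians, so converting between radians and degrees comes up often. This is a minimal sketch using `np.degrees()` and `np.radians()` on illustrative angles.

```python
import numpy as np

angles = np.array([0, np.pi / 2, np.pi])

sines = np.sin(angles)          # approximately 0, 1, 0
in_degrees = np.degrees(angles)  # 0, 90, 180
back = np.radians(in_degrees)    # 0, pi/2, pi again

print(sines)
print(in_degrees)
print(back)
```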

Rounding Functions

In [None]:
np.array([1.23456, 2.34567, 3.45678, 4.56789])
# call the floor() function and observe its output
# call the ceil() function and observe its output
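
The difference between the rounding functions is easiest to see with negative numbers. The values below are made up for illustration.

```python
import numpy as np

values = np.array([-1.5, 1.2, 2.7])

floored = np.floor(values)  # rounds down: -2, 1, 2
ceiled = np.ceil(values)    # rounds up:   -1, 2, 3

# np.round() keeps the given number of decimal places
rounded = np.round(np.array([1.23456, 2.34567]), 2)  # 1.23, 2.35

print(floored, ceiled, rounded)
```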

### NUMPY STATISTICAL FUNCTIONS

In [None]:
np.array([76, 78, 81, 66, 85])

# compute the mean of the marks
# compute the median of marks
# find the maximum and minimum marks

Now that you are more familiar with using the different NumPy operations on a 1D array, let’s make it more challenging by applying them to N-D arrays.

### N-D ARRAY FROM LIST OF LISTS

In [None]:
np.array([[1, 2, 3, 4, 5], 
          [6, 7, 8, 9, 10]])
# print the array, including its number of dimensions and shape

In [None]:
np.array([[[1, 2, 3, 4], 
           [5, 6, 7, 8], 
           [9, 10, 11, 12]], 
          
          [[13, 14, 15, 16], 
           [17, 18, 19, 20], 
           [21, 22, 23, 24]]])
# print the array, including its number of dimensions and shape

### Using np.zeros()

In [None]:
# create a 2D array with 2 rows and 3 columns filled with zeros

In [None]:
# create a 3D array with dimensions 2x3x4 filled with zeros

### Using np.full()

This function is used when you want to create an N-D array filled with a specific value.

In [None]:
# create a 2D array with elements initialized to 5
np.full((2, 2), 5) 

# print the array

### Using np.random.rand()

In [None]:
# create a 2D array of 3 rows and 3 columns of random numbers

In [None]:
# create a 3D array of shape (3, 3, 3) of random numbers

### Using np.empty()

In [None]:
# create an empty 2D array with 2 rows and 2 columns

In [None]:
# create an empty 3D array of shape (2, 2, 2)

### NUMPY ARRAY INDEXING ON N-D ARRAYS

To access a specific element in a 2D NumPy array, use square bracket notation with the syntax **```arr[row, column]```**. Indexes start at 0, so the first row is row 0.

In [None]:
np.array([[1, 3, 5, 7], 
          [2, 4, 6, 8], 
          [9, 11, 13, 15]])

# access the element at the third row and second column
# access the element at the second row and third column
# print the last row
# print the last column

For 3D arrays, access an element with **```arr[slice, row, column]```**.

In [None]:
np.array([[[1, 2, 3, 4], 
           [5, 6, 7, 8], 
           [9, 10, 11, 12]], 
          
          [[13, 14, 15, 16], 
           [17, 18, 19, 20], 
           [21, 22, 23, 24]]])

# print this element arr[1, 2, 1]
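
To make the `arr[slice, row, column]` order concrete, here is a sketch on a hypothetical 3D array built with `np.arange()` and `reshape()`.

```python
import numpy as np

# 2 slices, each with 3 rows and 4 columns, holding 0..23
cube = np.arange(24).reshape(2, 3, 4)

print(cube[0, 1, 3])  # slice 0, row 1, column 3 -> 7
print(cube[1, 0])     # slice 1, row 0 -> 12, 13, 14, 15
print(cube[1].shape)  # one whole slice is a 2D array: (3, 4)
```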

### NUMPY 2-D ARRAY SLICING

**```arr[row_start:row_stop:row_step, col_start:col_stop:col_step]```**

To get a subset of rows, let’s break down the row indexes. **```row_start```** is the index of the row where the subset starts; by default it is 0. **```row_stop```** is the index where the subset ends; that row is not included, so the last row taken is **```row_stop```**-1, and by default the subset runs through the last row of the array. Lastly, **```row_step```** is an optional index that sets how many rows to move at each step; its default value is 1.

Slicing columns works the same way as slicing rows.

In [None]:
np.array([[1, 3, 5, 7], 
          [2, 4, 6, 8], 
          [9, 11, 13, 15]])

# slice the array to get the first two rows and columns
# slice the array to get the last two rows and columns
# output the second row of the array [2, 4, 6, 8]
# output the fourth column of the array [7, 8, 15]
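
Row and column slices can be combined freely. The sketch below uses a made-up 4x5 array to show a few combinations.

```python
import numpy as np

table = np.arange(20).reshape(4, 5)  # rows hold 0-4, 5-9, 10-14, 15-19

print(table[1:3, ::2])  # rows 1-2, every second column -> [[5, 7, 9], [10, 12, 14]]
print(table[:, -1])     # last column of every row -> 4, 9, 14, 19
print(table[-1, :])     # last row -> 15, 16, 17, 18, 19
```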

### NUMPY ARRAY RESHAPING

Just as NumPy allows 1-D arrays to transform into an N-D array, you can also flatten an N-D array to a 1-D array.

In [None]:
# flatten a 2D array to 1D
np.array([[1, 3], 
          [6, 8], 
          [11, 13]]) 

# np.reshape(arr, -1)

In [None]:
# flatten a 3D array to 1D
np.array([[[1, 2], 
           [6, 8], 
           [9,12]], 
          
          [[13, 14], 
           [17, 20], 
           [21, 24]]])

# np.reshape(arr, -1)

### NUMPY ARRAY TRANSPOSE

Transposition is a common matrix operation, and NumPy provides it through **```np.transpose()```**.

In [None]:
np.array([[1, 3, 5], 
          [2, 4, 8]])

# transpose the 2D array
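
Transposing swaps rows and columns, so the shape is reversed. A minimal sketch with made-up data:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

transposed = np.transpose(arr)  # arr.T gives the same result
print(arr.shape)         # (2, 3)
print(transposed.shape)  # (3, 2)
print(transposed[2, 0])  # the element that was at arr[0, 2] -> 3
```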

### NUMPY MATH FUNCTIONS

When working with math operations on 2D arrays, you pass the array and, optionally, the axis along which to compute. The axis is given with the keyword argument **```axis```**. For example:

In [None]:
arr = np.array([[2, 4, 6], 
                [8, 10, 12], 
                [14, 16, 18]])

# computes the median of each column (along axis 0, down the rows)
np.median(arr, axis = 0) 

**```np.median(arr, axis = 0)```**

In a 2D array, **```axis = 0```** runs down the rows, so the median is computed for each column separately. This results in an array where each element is the median of the values across the rows of the corresponding column.

In [None]:
arr = np.array([[2, 4, 6], 
                [8, 10, 12], 
                [14, 16, 18]])

# computes the median of each row (along axis 1, across the columns)
np.median(arr, axis = 1)
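
The same `axis` convention applies to other functions. As a sketch on a small made-up array, `np.sum()` gives one result per column with `axis=0` and one result per row with `axis=1`.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

per_column = np.sum(scores, axis=0)  # one result per column -> 5, 7, 9
per_row = np.sum(scores, axis=1)     # one result per row    -> 6, 15
total = np.sum(scores)               # no axis: all elements -> 21

print(per_column, per_row, total)
```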

# NUMPY EXERCISE WITH CSV FILE

In this section, you will perform some basic NumPy operations on data from a CSV file.

### IMPORT THE NUMPY LIBRARY

In [None]:
# write your code

### READ THE CSV FILE

Try importing the CSV file **```air-quality-data.csv```**. This file contains 13 columns:
- Datetime
- PM2.5
- PM10
- NO2
- NH3
- SO2
- CO
- O3
- NOx
- NO
- Benzene
- Toluene
- Xylene

All the columns except Datetime are of data type float. This dataset is used to check the quality of the air in a certain place using different factors. Let’s try working with the dataset and perform various NumPy operations.

To read the file we will be using the **```np.genfromtxt()```** function.

In [None]:
# complete the code
path = 'write down the file path'
np.genfromtxt(path, delimiter=',', skip_header=True, usecols=range(1, 13))

Explanation: The **```path```** variable contains the location of the CSV file. We also set the **```skip_header```** parameter to **```True```** (treated as 1, so the first line is skipped) since the header row is not needed for the NumPy operations. Additionally, the **```usecols```** parameter excludes the Datetime column from the array.
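
To try these parameters without the real file, the sketch below reads a tiny in-memory CSV through `io.StringIO`. The text and its values are hypothetical and only mimic the layout described above; the empty field shows how a missing value would be read, in case the real file has any.

```python
import io
import numpy as np

# hypothetical three-line file standing in for a real CSV
csv_text = (
    "Datetime,PM2.5,PM10\n"
    "2020-01-01 00:00,12.5,30.0\n"
    "2020-01-01 01:00,,34.0\n"
)

sample = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                       skip_header=1, usecols=(1, 2))

print(sample.shape)                # (2, 2): header and Datetime column dropped
print(np.isnan(sample[1, 0]))      # True: the empty field was read as nan
print(np.nanmean(sample, axis=0))  # column means ignoring nan -> 12.5, 32.0
```

If a column contains `nan`, functions such as `np.mean()` return `nan` for it, while `np.nanmean()` skips the missing entries.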

### OUTPUT THE LOADED DATA

In [None]:
# print the loaded data

In [None]:
# print data dimension

In [None]:
# print data shape

In [None]:
# print array data type

### PERFORMING STATISTICAL OPERATIONS IN DATA

In [None]:
# compute the mean and median for each data column
# print the results

In [None]:
# find the maximum and minimum values for each data column
# print results

In [None]:
# compute the standard deviation in the seventh column of the array
# print results

In [None]:
# only compute the standard deviation of the first 50 elements in the seventh column of the array
# print results

### DATA SLICING AND FILTERING

In [None]:
# output data values greater than 28 in the third column of the array
# the printed output should be a numerical value
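
Filtering is usually done with a Boolean mask: a comparison produces an array of True/False values, which can then select or count elements. This is a minimal sketch on made-up readings, not on the dataset itself.

```python
import numpy as np

readings = np.array([12.0, 30.5, 28.0, 45.2, 7.1])

mask = readings > 28       # Boolean array: False, True, False, True, False
selected = readings[mask]  # keeps only the matching values -> 30.5, 45.2
count = np.sum(mask)       # counts the matches -> 2

print(selected, count)
```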

In [None]:
# output the elements in the first five rows and first five columns

In [None]:
# output the first 20 row elements and all data columns with step 4

In [None]:
# get the last row
# reshape the array to a 2D array with a shape of (4, 3)

In [None]:
# get the tenth-to-the-last row
# reshape the array to a 2D array with a shape of (2, 6)
# then transpose the array

### DATA PLOTTING (OPTIONAL)

Try plotting the data using the **matplotlib** library.

#### Import the library

Make sure that you have installed the library before importing it. As with NumPy, the library is aliased as **```plt```** so that the code is more readable and the functions are easier to call.

In [None]:
import matplotlib.pyplot as plt 

#### Plot the Data

Here’s a simple example of how to plot NumPy data using matplotlib.

In [None]:
# plot the first column of the array (PM2.5)
pm25_column = data[:, 0]

plt.plot(pm25_column)
plt.xlabel('Index')
plt.ylabel('Values')
plt.title('PM2.5 Variation')
plt.show()

Try plotting different columns or subsets of data using the **```matplotlib.pyplot```** library.

In [None]:
# plot the eighth column of the array

In [None]:
# only plot the first 50 elements in the eighth column of the array

In [None]:
# plot the second and third column elements of the array
# make sure that the two columns are plotted in different colors

In [None]:
# only plot the first 10 elements in the second and third column of the array
# make sure that the two columns are plotted in different colors