<a href="https://colab.research.google.com/github/marianqian/Intro-to-ML-and-DL-Using-fast.ai/blob/master/notebooks/Lesson_2_Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Welcome to the AIM Academy! This is the second lesson, which introduces the Python library NumPy. NumPy provides arrays with multiple dimensions and a collection of math functions that operate on them. 

NOTE: Educational use and distribution is permitted, but credit and attribution to AIM Academy is required. 

#Learning Objectives

*   Call methods with an object
*   Understand NumPy array characteristics
*   Understand how to look through documentation



This Jupyter Notebook, the interface you are using now, follows this [quickstart tutorial](https://docs.scipy.org/doc/numpy/user/quickstart.html) for NumPy. 

#Introduction to ndarray
First, let's import the NumPy library itself. Using the `import` statement together with `as`, we can import NumPy under the shorter name `np` so we do not have to type the full name every time we want to use the library. 


In [0]:
import numpy as np

NumPy's array class is known as `ndarray`. The number of axes an ndarray has is also known as its number of dimensions. Many of the arrays in this lesson have two axes, or two dimensions, known as rows and columns, although an ndarray can have any number of axes. 
Let's create an ndarray.





The array below, assigned the name `example_array`, contains four floating-point numbers, or numbers with decimal points. Each time we add a dot (`.`) followed by an attribute name, we are reading a specific feature or characteristic of the array. The line `example_array.ndim` means we are looking at what is stored in the attribute `ndim` of the variable `example_array`. You should be familiar with how the print statements below work; if you have any questions, refer to the Lesson 1 - Python notebook. Each of the array's attributes gives us different information about the array. The output below explains what each one means.



In [0]:
example_array = np.array([[2.2, 3.0], [5.5, 6.8]])
print("Number of axes or dimensions: ", example_array.ndim)
print("Dimensions: ", example_array.shape)
print("Total number of elements in array: ", example_array.size)
print("Element type inside the array: ", example_array.dtype)
print("Type of object: ", type(example_array))

Number of axes or dimensions:  2
Dimensions:  (2, 2)
Total number of elements in array:  4
Element type inside the array:  float64
Type of object:  <class 'numpy.ndarray'>


#Creating ndarrays
In the above example, `example_array` was created by explicitly stating which elements were inside the array. 
```
example_array = np.array([[2.2, 3.0], [5.5, 6.8]])
another_array = np.array([4.0, 5.3])
```
Here, `another_array` is created in the same way as `example_array`. The only difference between the two is that `another_array` has one axis, or only one dimension. 

Notice that the argument passed into `np.array()` is a list, `[]`, and not just the elements themselves. 
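
As a small illustration of the point about lists, the sketch below builds a one-axis and a two-axis array and then shows what happens when the brackets are left out. The variable names are made up for this example.

```python
import numpy as np

# One list of numbers gives a one-dimensional array.
one_axis = np.array([4.0, 5.3])

# A list of lists gives a two-dimensional array (rows, then columns).
two_axes = np.array([[2.2, 3.0], [5.5, 6.8]])

print(one_axis.ndim, one_axis.shape)   # 1 (2,)
print(two_axes.ndim, two_axes.shape)   # 2 (2, 2)

# Without the list brackets, NumPy reads the second number as the dtype
# argument and raises a TypeError.
try:
    np.array(4.0, 5.3)
except TypeError as error:
    print("TypeError:", error)
```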

Rather than specifying what is in the array, NumPy also allows us to create arrays of a set size that we give as an argument. The code block below shows different ways to create ndarrays without needing to know what the array contains beforehand. 

For these functions, the shape of the array is passed as a tuple of dimensions (a single integer also works for a one-dimensional array). The data type of the elements can also be specified with the `dtype` parameter, but it is not required. For `np.zeros()`, `np.ones()`, and `np.empty()`, NumPy uses `float64` by default. 

Specifically for `np.arange()`, the function returns an array containing a sequence of numbers. The three arguments are the beginning number, the ending number (which is not included in the result), and the size of the step between numbers. The function works similarly to Python's `range()`, which is often used in for-statements, but returns an array instead. 
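
To make the `np.arange()` arguments and the default `dtype` concrete, here is a minimal sketch. The variable name `evens` is only for illustration.

```python
import numpy as np

# Start at 2, stop before 10, step by 2. The stop value itself is left out.
evens = np.arange(2, 10, 2)
print(evens)                               # [2 4 6 8]

# With one argument, arange counts up from 0, just like range().
print(np.arange(5))                        # [0 1 2 3 4]
print(list(range(5)))                      # [0, 1, 2, 3, 4]

# np.zeros uses float64 unless dtype says otherwise.
print(np.zeros(3).dtype)                   # float64
print(np.zeros(3, dtype=np.int32).dtype)   # int32
```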

In [0]:
zero = np.zeros((4, 3))
print("\nUsing np.zeros: Returns an array filled with zeros. \n", zero)
ones = np.ones((4, 5, 2), dtype=np.int8)
print("\nUsing np.ones: Returns an array filled with ones.  \n", ones)
empty = np.empty((6,))
print("\nUsing np.empty: Returns an array without setting its values, so the contents are arbitrary. \n", empty)
sequence = np.arange(2, 10, 2)
print("\nUsing np.arange: Returns an array of sequence of numbers. (One dimensional array) \n", sequence)


Using np.zeros: Returns an array filled with zeros. 
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

Using np.ones: Returns an array filled with ones.  
 [[[1 1]
  [1 1]
  [1 1]
  [1 1]
  [1 1]]

 [[1 1]
  [1 1]
  [1 1]
  [1 1]
  [1 1]]

 [[1 1]
  [1 1]
  [1 1]
  [1 1]
  [1 1]]

 [[1 1]
  [1 1]
  [1 1]
  [1 1]
  [1 1]]]

Using np.empty: Returns an array without setting its values, so the contents are arbitrary. 
 [1.82137340e-316 2.75895426e-316 2.71938316e-316 2.71934996e-316
 2.71943059e-316 6.33422981e+173]

Using np.arange: Returns an array of sequence of numbers. (One dimensional array) 
 [2 4 6 8]


Note that when arrays are printed, they look similar to printed lists. 
If an array is too big to print in full, NumPy automatically leaves out the middle part and shows only the beginning and the end. 
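
A quick way to see this shortened printing is to print an array with a few thousand elements. This is a small sketch, assuming NumPy's default print settings have not been changed.

```python
import numpy as np

big = np.arange(2000)

# With the default print settings, the middle of a large array is replaced by "...".
text = str(big)
print(text)
print("..." in text)   # True
```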

#Operating with ndarrays
Basic arithmetic operations (addition, subtraction, multiplication, and division) are applied element-wise, and the result is an array in which the operation has been applied to every element. For example, if `a` is an ndarray, `a + 2` results in a **new ndarray** with the same shape as `a`, except that each element is bigger by 2.



In [0]:
a = np.array([1, 2, 3, 4])
print("ndarray a: \n", a)
new_a = a + 2
print("ndarray new_a: \n", new_a)
print("ndarray a: remains the same as before \n", a)

ndarray a: 
 [1 2 3 4]
ndarray new_a: 
 [3 4 5 6]
ndarray a: remains the same as before 
 [1 2 3 4]


In the example below, we will keep using our `example_array` from earlier. If two ndarrays have the same shape, we can add, subtract, multiply, or divide their values element by element. 

In [0]:
print("ndarray example_array: \n", example_array)
array = np.array([[4.3, 7.9], [2.1, 9.0]])
print("ndarray array: \n", array)
print("Subtracting example_array - array: \n", example_array - array)
print("Adding example_array + array: \n", example_array + array)
print("Multiplying example_array * array: \n", example_array * array)
print("Dividing example_array / array: \n", example_array / array)

ndarray example_array: 
 [[2.2 3. ]
 [5.5 6.8]]
ndarray array: 
 [[4.3 7.9]
 [2.1 9. ]]
Subtracting example_array - array: 
 [[-2.1 -4.9]
 [ 3.4 -2.2]]
Adding example_array + array: 
 [[ 6.5 10.9]
 [ 7.6 15.8]]
Multiplying example_array * array: 
 [[ 9.46 23.7 ]
 [11.55 61.2 ]]
Dividing example_array / array: 
 [[0.51162791 0.37974684]
 [2.61904762 0.75555556]]


Note that every time we use an operator on an array, we are creating a whole new array. The initial array we applied the operator on remains the same, unless we add an equal sign (`=`) after the operator, as in `a += 2`. With the equal sign included, the array itself is modified.
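
The difference between an operator and its in-place form can be sketched as follows. The names `values` and `bigger` are chosen only for this example.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

bigger = values + 2    # builds a new array; values is untouched
print(values)          # [1 2 3 4]
print(bigger)          # [3 4 5 6]

values += 2            # modifies values itself
print(values)          # [3 4 5 6]
```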

The `@` sign denotes the dot product between two arrays; for two-dimensional arrays like the ones here, this is the matrix product. Watch [this Khan Academy video](https://www.khanacademy.org/math/linear-algebra/vectors-and-spaces/dot-cross-products/v/vector-dot-product-and-vector-length) for more information about dot products. 



In [0]:
print("Dot product example_array @ array: \n", example_array @ array)
print("Dot product example_array @ array: \n", example_array.dot(array))

Dot product example_array @ array: 
 [[ 15.76  44.38]
 [ 37.93 104.65]]
Dot product example_array @ array: 
 [[ 15.76  44.38]
 [ 37.93 104.65]]
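
Because `*` and `@` are easy to mix up, here is a small sketch comparing them on whole numbers that can be checked by hand. The array names are made up for illustration.

```python
import numpy as np

left = np.array([[1, 2], [3, 4]])
right = np.array([[5, 6], [7, 8]])

# * multiplies matching positions: 1*5, 2*6, 3*7, 4*8.
print(left * right)    # [[ 5 12]
                       #  [21 32]]

# @ multiplies rows of left with columns of right: 1*5 + 2*7 = 19, and so on.
print(left @ right)    # [[19 22]
                       #  [43 50]]

# For two one-dimensional arrays, @ gives a single number.
v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
print(v @ w)           # 32
```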


The NumPy ndarray class also lets us compute summary information about an array. Methods such as `sum()`, `min()`, and `max()` give the total sum, minimum, and maximum of the whole array. Passing a specific axis as a parameter (for example `axis=0`, which works down each column) gives one result per column or row instead.

In [0]:
print("Sum of elements in example_array: ", example_array.sum())
print("Minimum in example_array: ", example_array.min())
print("Maximum in example_array: ", example_array.max())

print("Sum of elements in example_array for each column ", example_array.sum(axis=0))
print("Minimum in example_array for each column: ", example_array.min(axis=0))
print("Maximum in example_array for each column: ", example_array.max(axis=0))

Sum of elements in example_array:  17.5
Minimum in example_array:  2.2
Maximum in example_array:  6.8
Sum of elements in example_array for each column  [7.7 9.8]
Minimum in example_array for each column:  [2.2 3. ]
Maximum in example_array for each column:  [5.5 6.8]
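
The cell above only uses `axis=0`. As a sketch of how the other axis behaves, the example below uses a small made-up array with two rows and three columns.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(table.sum())          # 21       every element
print(table.sum(axis=0))    # [5 7 9]  one result per column
print(table.sum(axis=1))    # [ 6 15]  one result per row
print(table.max(axis=1))    # [3 6]    largest value in each row
```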


#Iterating through an ndarray
We can look at sections of arrays by slicing, using colons (`:`) inside brackets (`[]`). `example_array[1]` gives the whole second row of the array because arrays are indexed beginning at zero, and `example_array[0:2]` gives the first and second rows; the end index, 2, is not included. `example_array` itself has only two rows, but in larger arrays you can access any specific set of rows by slicing. 

In [0]:
print("example_array: \n", example_array)
print("example_array[1]: \n", example_array[1])
print("example_array[0:2]: \n", example_array[0:2])

example_array: 
 [[2.2 3. ]
 [5.5 6.8]]
example_array[1]: 
 [5.5 6.8]
example_array[0:2]: 
 [[2.2 3. ]
 [5.5 6.8]]


For multidimensional arrays, giving one index for each axis accesses a specific element inside the array. In `example_array[1,0]`, we are accessing the first element (0) in the second row (1). Either index can be replaced with a slice: `example_array[0:1, 1]` uses `0:1` instead of just `0`, so it takes the second element (1) from the first row only and returns it inside a one-element array. 

In [0]:
print("example_array[1,0]: \n", example_array[1,0])
print("example_array[0:1, 1]: \n", example_array[0:1, 1])

example_array[1,0]: 
 5.5
example_array[0:1, 1]: 
 [3.]
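
Slicing is easier to see on an array with more rows and columns. The sketch below uses a made-up 3-by-4 array named `grid`.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
print(grid)            # [[ 0  1  2  3]
                       #  [ 4  5  6  7]
                       #  [ 8  9 10 11]]

print(grid[1, 2])      # 6          second row, third column
print(grid[0:2])       # first two rows
print(grid[:, 1])      # [1 5 9]    second column of every row
print(grid[1:, 2:])    # [[ 6  7]
                       #  [10 11]]  last two rows, last two columns
```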


We can use a for-statement to loop through an array. Looping over the array directly gives one row at a time; by using `example_array.flat`, we can look at each element in the array instead of each row. 

In [0]:
print("Printing each row in example_array:")
for row in example_array:
  print(row)
print("\n")
print("Printing each element in example_array:")
for element in example_array.flat:
  print(element)

Printing each row in example_array:
[2.2 3. ]
[5.5 6.8]


Printing each element in example_array:
2.2
3.0
5.5
6.8


#Changing array shape
Here, we will continue to use the `example_array` from before. The `ravel()` method returns a one-dimensional array whose length is the number of elements in the array. `reshape()` returns the array with the shape given by its parameters, and passing -1 as a dimension lets NumPy calculate that dimension automatically so that the number of elements stays the same. Neither method changes the shape of `example_array` itself; each returns a new array object, which shares its data with the original where possible. If we use `resize()`, however, the shape of the array itself is modified. 


In [0]:
print("example_array: \n", example_array)
print("example_array.ravel() \n", example_array.ravel())
print("example_array.reshape(4, 1) \n", example_array.reshape(4, 1))
print("example_array.reshape(4, -1) \n", example_array.reshape(4, -1))
example_array.resize(1, 4)
print("example_array \n", example_array)
example_array.resize(2, 2)
print("example_array \n", example_array)


example_array: 
 [[2.2 3. ]
 [5.5 6.8]]
example_array.ravel() 
 [2.2 3.  5.5 6.8]
example_array.reshape(4, 1) 
 [[2.2]
 [3. ]
 [5.5]
 [6.8]]
example_array.reshape(4, -1) 
 [[2.2]
 [3. ]
 [5.5]
 [6.8]]
example_array 
 [[2.2 3.  5.5 6.8]]
example_array 
 [[2.2 3. ]
 [5.5 6.8]]
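
As a further sketch of how -1 works and of the difference between `reshape()` and `resize()`, the example below uses small made-up arrays.

```python
import numpy as np

numbers = np.arange(6)                # [0 1 2 3 4 5]

# -1 asks NumPy to work out the missing dimension: 6 elements / 2 rows = 3 columns.
two_rows = numbers.reshape(2, -1)
print(two_rows.shape)                 # (2, 3)
print(numbers.shape)                  # (6,)  reshape left the original alone

# ravel() flattens back to one dimension.
print(two_rows.ravel())               # [0 1 2 3 4 5]

# resize() returns nothing and changes the array it is called on.
in_place = np.arange(6)
in_place.resize(3, 2)
print(in_place.shape)                 # (3, 2)
```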


`another_name = example_array` gives another name to `example_array`. Both names refer to the same object, so changing `another_name` will change the elements in `example_array`. 

`view = example_array.view()` makes `view` a shallow copy: a new array object that looks at the same data. The elements are shared, so changing one element in one array will change the same element in the other; however, we can change the shape of one array, and the shape of the other will not change. `view` is a view of the data owned by `example_array`. 

`deep_copy = example_array.copy()` creates a whole new copy, a deep copy, of `example_array`. Changing anything in `example_array` will not change `deep_copy`, as they are completely separate objects. 



In [0]:
another_name = example_array
view = example_array.view()
deep_copy = example_array.copy()
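
The cell above only creates the three objects. A minimal sketch of how they behave differently after a change is shown here, using a separate array named `original` so that `example_array` is left alone.

```python
import numpy as np

original = np.array([[2.2, 3.0], [5.5, 6.8]])
alias = original            # same object, second name
window = original.view()    # new array object, same data
backup = original.copy()    # new array object, its own data

original[0, 0] = 99.0       # change one element through the original name

print(alias is original)    # True
print(alias[0, 0])          # 99.0  the alias sees the change
print(window[0, 0])         # 99.0  the view sees the change
print(backup[0, 0])         # 2.2   the deep copy does not

print(np.shares_memory(original, window))   # True
print(np.shares_memory(original, backup))   # False
```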

#Broadcasting
Broadcasting is a technique that lets arrays with different shapes be combined in arithmetic operations such as multiplication. It saves us from having to build arrays with matching shapes by hand. Go to the following NumPy links to learn more about broadcasting: 
* [Basic Broadcasting](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
* [Array broadcasting in NumPy](https://docs.scipy.org/doc/numpy/user/theory.broadcasting.html#array-broadcasting-in-numpy)

The simplest case of broadcasting is multiplying a vector, or an array, by a scalar, or one number. More generally, NumPy compares the shapes of the two arrays starting from the trailing (last) dimension. Two dimensions are compatible if they are equal or **if one of them is 1**, and the two arrays do not need the same number of dimensions. The smaller array is stretched over, as if copied, along the dimensions of size 1 to match the larger shape, and the operation is then applied element by element. 

![Broadcasting picture](https://docs.scipy.org/doc/numpy/_images/theory.broadcast_2.gif)

(Array Broadcasting in Numpy, https://docs.scipy.org/doc/numpy/user/theory.broadcasting.html#array-broadcasting-in-numpy)

In [0]:
print("example_array: \n", example_array)
print("example_array.shape: ", example_array.shape)
example_array.resize(4, 1)
print("example_array: \n", example_array)
print("example_array.shape: ", example_array.shape)
a = np.ones(5)
print("a: \n", a)
print("a.shape: ", a.shape)
print(example_array * a)

example_array: 
 [[2.2 3. ]
 [5.5 6.8]]
example_array.shape:  (2, 2)
example_array: 
 [[2.2]
 [3. ]
 [5.5]
 [6.8]]
example_array.shape:  (4, 1)
a: 
 [1. 1. 1. 1. 1.]
a.shape:  (5,)
[[2.2 2.2 2.2 2.2 2.2]
 [3.  3.  3.  3.  3. ]
 [5.5 5.5 5.5 5.5 5.5]
 [6.8 6.8 6.8 6.8 6.8]]
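
The same rule can be sketched with whole numbers, which are easier to check by hand, along with a case where the shapes do not fit. The array names are made up for this example.

```python
import numpy as np

column = np.array([[1], [2], [3]])    # shape (3, 1)
row = np.array([10, 20, 30, 40])      # shape (4,)

# Trailing dimensions are 1 and 4: compatible, because one of them is 1.
product = column * row
print(product.shape)                  # (3, 4)
print(product)                        # [[ 10  20  30  40]
                                      #  [ 20  40  60  80]
                                      #  [ 30  60  90 120]]

# A scalar is broadcast to every element.
print(row * 2)                        # [20 40 60 80]

# Trailing dimensions 2 and 4 are different and neither is 1, so this fails.
try:
    np.ones((3, 2)) * np.ones(4)
except ValueError as error:
    print("ValueError:", error)
```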
