### The University of Melbourne, School of Computing and Information Systems
# COMP90049 Introduction to Machine Learning, 2020 Semester 2

## Week 2 - Introduction

Welcome to Jupyter Notebook—an interactive environment that mixes code, visualisations and text.

Jupyter Notebook supports many programming languages (called "kernels" in the Jupyter lingo). In this course, we'll mainly be using Python 3 due to its popularity in the machine learning/data science communities. Information about the kernel is displayed in the top right of the UI.

## Cells

Notebooks are made up of cells: *markdown cells* and *code cells*. 
This cell is an example of a markdown cell. 
Markdown cells can contain text, tables, images, equations, etc. 
(see the Markdown guide under the _Help_ menu for more info). 

You can edit a markdown cell by double-clicking on it. To convert cells to markdown, highlight the cell and hit `<M>`. To convert back to a code cell, hit `<Y>`.

To evaluate the cell press the <button class='btn btn-default btn-xs'><i class="icon-step-forward fa fa-step-forward"></i></button> button in the toolbar, or hit `<CTRL>+<ENTER>`. 
Try it below! 

--- **Edit me** ---

Next are some code cells. 
You can evaluate them individually, using the <button class='btn btn-default btn-xs'><i class="icon-step-forward fa fa-step-forward"></i></button> button or by hitting `<CTRL>+<ENTER>`. 
Often, you'll want to run all cells in the notebook, or below a certain point. The functions for doing this are in the _Cell_ menu.

In [1]:
message = "Hello world!"

In [2]:
print(message)

Hello world!


Please ensure that the scipy, numpy, matplotlib, and sklearn packages are installed (although we won’t be
using the latter two today).

In [4]:
import scipy
import numpy as np 
import matplotlib as mpl
import sklearn

(You might wish to examine the installation instructions at http://scipy.org/install.html
if you are considering using your local machine.)

## NumPy Basics

The main numpy object is a so-called “homogeneous multidimensional array” — note that this is a little less flexible than using a list or tuple, but it allows mathematical operations to be performed much faster. (And we’ll be doing a fair bit of number-crunching this semester, so this is an important property.) The following is an introduction to NumPy functions and properties.
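As a small illustration of what "homogeneous" means here (a sketch only; the variable names `mixed`, `values` and `doubled` are made up for this example): every element of an array shares one data type, so mixed inputs are converted to a common type, and arithmetic is applied to the whole array at once instead of element by element in a Python loop.

```python
import numpy as np

# A Python list can hold mixed types; a NumPy array converts them to one dtype.
mixed = np.array([1, 2.5, 3])        # the ints are upcast to float
print(mixed.dtype)                   # float64

# Arithmetic applies to every element at once, with no explicit loop.
values = np.arange(5)                # array([0, 1, 2, 3, 4])
doubled = values * 2
print(doubled)                       # [0 2 4 6 8]

# The same operator on a list means something different: repetition.
print([0, 1, 2] * 2)                 # [0, 1, 2, 0, 1, 2]
```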

### Creating Arrays

In [5]:
a = np.array([0, 1, 2, 3, 4])
b = np.array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]], dtype = float)
c = np.array([[1, 6], [2, 7], [3, 8], [4, 9], [5, 10]], dtype = int)

In [6]:
np.arange(0, 10)     # array of evenly spaced values

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [7]:
np.zeros((2, 3))     # array of zeros with the given shape

array([[0., 0., 0.],
       [0., 0., 0.]])

In [8]:
np.ones(2)           # array of ones with the given shape

array([1., 1.])

In [9]:
np.empty((2, 3))     # empty array (arbitrary values)

array([[0., 0., 0.],
       [0., 0., 0.]])

In [10]:
np.full((2, 3), 3)   # new array of the given shape, filled with the given value

array([[3, 3, 3],
       [3, 3, 3]])

### Inspecting an array

In [11]:
b.size               # number of elements in the array

10

In [12]:
b.ndim               # number of dimensions

2

In [13]:
b.shape              # lengths of each dimension

(2, 5)

In [14]:
b.dtype              # data type of array elements

dtype('float64')

### Numpy Basic operations


Numpy supports vector (and matrix) operations, like addition, subtraction, and scalar multiplication.

You need to be very careful when manipulating arrays of different sizes. numpy often won’t throw an exception. Instead, it will do "something": that something might be very intelligent, like automatically expanding the smaller array to match the shape of the larger one (this is called broadcasting), but if you aren’t expecting it, the resulting errors can be very difficult to find. An exception is only raised when the shapes cannot be matched up at all.
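A minimal sketch of both behaviours, using two small arrays invented for this example (`row` and `col`): adding a shape `(3,)` array to a shape `(3, 1)` array silently produces a `(3, 3)` result, while shapes that cannot be matched raise a `ValueError`.

```python
import numpy as np

row = np.array([0, 1, 2])            # shape (3,)
col = np.array([[10], [20], [30]])   # shape (3, 1)

# No error: both arrays are expanded to a common shape of (3, 3).
grid = row + col
print(grid.shape)                    # (3, 3)
print(grid)

# Shapes that cannot be matched up do raise an exception.
try:
    np.array([0, 1, 2]) + np.array([0, 1])
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

If the `(3, 3)` result was intended, this is convenient; if an element-wise sum of two length-3 vectors was intended, it is a silent bug, so checking `.shape` before and after an operation is a cheap safeguard.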

In [15]:
a1 = np.array([0,1,2,3,4])
a2 = np.array([1,3,-2,0,4])

In [16]:
print(a1 + a2)                # element-wise addition (or np.add)

[1 4 0 3 8]


In [17]:
print(a1 - a2)                # element-wise subtraction (or np.subtract)

[-1 -2  4  3  0]


In [18]:
print(a1 * a2)                # element-wise multiplication (or np.multiply)

[ 0  3 -4  0 16]


In [19]:
print(a1 / a2)                # element-wise division (or np.divide)

[ 0.          0.33333333 -1.                 inf  1.        ]
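The `inf` in this output comes from dividing 3 by 0: rather than raising an exception, NumPy returns `inf` and emits a `RuntimeWarning`. One way to make this behaviour explicit is the `np.errstate` context manager. The following is only a sketch (the names `ratio` and `caught` are illustrative), not the notebook's own approach.

```python
import numpy as np

a1 = np.array([0, 1, 2, 3, 4])
a2 = np.array([1, 3, -2, 0, 4])

# Silence the divide-by-zero warning inside this block only.
with np.errstate(divide="ignore"):
    ratio = a1 / a2

print(np.isinf(ratio))               # True only where a2 is 0

# Or turn the warning into an exception so the problem is hard to miss.
try:
    with np.errstate(divide="raise"):
        a1 / a2
    caught = False
except FloatingPointError:
    caught = True
print(caught)                        # True
```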



#### Question 1
How can we add (element-wise) arrays a and b? 

In [20]:
b.sum()              # sum elements

45.0
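Note that `b.sum()` adds up the elements of a single array. For the element-wise sum of `a` and `b` that the question asks about, one possible answer (a sketch that re-creates `a` and `b` from the cells above; `total` is an illustrative name) is simply `a + b`. Because the shapes `(5,)` and `(2, 5)` differ, `a` is added to each row of `b`.

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4])
b = np.array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]], dtype=float)

total = a + b                        # same as np.add(a, b)
print(total.shape)                   # (2, 5)
print(total)
```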

#### Question 2
What do you think would be the result of the comparison `b < 2`? How about `a1 == a2`?

In [21]:
b < 2                   # element-wise comparison

array([[ True,  True, False, False, False],
       [False, False, False, False, False]])

In [22]:
a1 == a2                # element-wise comparison

array([False, False, False, False,  True])

#### Question 3
How can we check whether arrays have the same shape and elements?

In [23]:
np.array_equal(a, b)  # check whether arrays have the same shape and elements

False

There are many more operations:

In [24]:
b.min()              # minimum element

0.0

In [25]:
b.max()              # maximum element

9.0

In [26]:
b.mean()             # mean of elements

4.5

### Using Numpy arrays

Numpy arrays can be indexed, sliced, and iterated over, similarly to lists. 
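A short sketch of what this looks like in practice, re-creating the arrays `a` and `b` defined earlier:

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4])
b = np.array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]], dtype=float)

print(a[0], a[-1])       # first and last element: 0 4
print(a[1:4])            # slice: [1 2 3]
print(b[1, 2])           # row 1, column 2: 7.0
print(b[:, 0])           # first column: [0. 5.]

# Iterating over a 2-D array yields its rows.
for row in b:
    print(row.sum())     # 10.0, then 35.0
```

Unlike lists, multi-dimensional arrays accept one index per dimension inside a single pair of brackets, as in `b[1, 2]`.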

#### Exercise 1
Write a function to calculate the **Euclidean distance** between $\vec{a}$ and $\vec{b}$, starting with the following code.
\begin{align}
    E_d(\vec{a},\vec{b})= \sqrt{\sum_{i=1}^n (a_i-b_i)^2}
\end{align}

In [32]:
def my_euclidean_dist(a,b):
    assert len(a)==len(b), "Arrays are of different sizes!"
    s = 0
    for i in range(len(a)): 
        s += np.square(a[i]-b[i])
    return np.sqrt(s)

#### Exercise 2
Use this function to calculate the Euclidean distance between `a1` and `a2`.

In [33]:
print(my_euclidean_dist(a1,a2))

5.477225575051661
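The loop-based function follows the formula term by term. As an alternative sketch, the same distance can be computed without an explicit loop by using element-wise operations. The function name `euclidean_dist_vectorised` is made up for this example; `np.linalg.norm` is NumPy's built-in vector norm.

```python
import numpy as np

def euclidean_dist_vectorised(a, b):
    """Euclidean distance without an explicit Python loop."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    assert a.shape == b.shape, "Arrays are of different shapes!"
    # Subtract element-wise, square, sum, then take the square root.
    return np.sqrt(np.sum((a - b) ** 2))

a1 = np.array([0, 1, 2, 3, 4])
a2 = np.array([1, 3, -2, 0, 4])

print(euclidean_dist_vectorised(a1, a2))   # sqrt(30), about 5.477
print(np.linalg.norm(a1 - a2))             # same value via the built-in norm
```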


### Numpy and Matrices

Matrices can be made in numpy by wrapping a list of lists, one inner list per row. For example, the matrices M and N can be created in Numpy by using the following code.

\begin{align}
    \mathbf{M} = \begin{pmatrix} 
        1 & 2 & 3 \\ 4 & 2 & 1 \\ 6 & 2 & 0 
    \end{pmatrix} 
    \quad \text{and} \quad 
    \mathbf{N} = \begin{pmatrix} 
        0 & 3 & 1 \\ 1 & 1 & 4 \\ 2 & 0 & 3 
    \end{pmatrix}
\end{align}

In [34]:
M = np.array([[1,2,3],[4,2,1],[6,2,0]])
N = np.array([[0,3,1],[1,1,4],[2,0,3]])

You can use Numpy to perform all kinds of **linear algebra** operations on these matrices, such as:

In [35]:
np.transpose(M)                    # transpose of M (rows become columns)

array([[1, 4, 6],
       [2, 2, 2],
       [3, 1, 0]])

In [36]:
np.dot(M,N)                        # matrix product of M and N

array([[ 8,  5, 18],
       [ 4, 14, 15],
       [ 2, 20, 14]])

In [37]:
np.linalg.inv(M)                   # matrix inverse

array([[ 1. , -3. ,  2. ],
       [-3. ,  9. , -5.5],
       [ 2. , -5. ,  3. ]])

In [38]:
np.eye(3)                          # identity matrix

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

#### Exercise 3
Write a short script to compare:
1. M * N and np.dot(M, N)
2. N * M and np.dot(N, M)
3. M * M and M**2 and np.dot(M, M)

In [39]:
print("M*N",M*N)
print("M.N",np.dot(M, N))
print("N*M",N*M)
print("N.M",np.dot(N, M))
print("M*M",M*M)
print("M**2",M**2)
print("M.M",np.dot(M, M))

M*N [[ 0  6  3]
 [ 4  2  4]
 [12  0  0]]
M.N [[ 8  5 18]
 [ 4 14 15]
 [ 2 20 14]]
N*M [[ 0  6  3]
 [ 4  2  4]
 [12  0  0]]
N.M [[18  8  3]
 [29 12  4]
 [20 10  6]]
M*M [[ 1  4  9]
 [16  4  1]
 [36  4  0]]
M**2 [[ 1  4  9]
 [16  4  1]
 [36  4  0]]
M.M [[27 12  5]
 [18 14 14]
 [14 16 20]]
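The printed results can also be compared programmatically. The sketch below summarises the pattern: `*` and `**` act element by element, whereas `np.dot` performs matrix multiplication. It also uses the `@` operator and `np.linalg.matrix_power`, which do not appear in the exercise but are standard NumPy equivalents.

```python
import numpy as np

M = np.array([[1, 2, 3], [4, 2, 1], [6, 2, 0]])
N = np.array([[0, 3, 1], [1, 1, 4], [2, 0, 3]])

# `*` is element-wise, so the order of the operands does not matter.
print(np.array_equal(M * N, N * M))                   # True

# Matrix multiplication is not commutative in general.
print(np.array_equal(np.dot(M, N), np.dot(N, M)))     # False

# For 2-D arrays, the `@` operator gives the same result as np.dot.
print(np.array_equal(M @ N, np.dot(M, N)))            # True

# `M**2` squares each element; the matrix power is a separate function.
print(np.array_equal(M ** 2, M * M))                  # True
print(np.array_equal(np.linalg.matrix_power(M, 2), np.dot(M, M)))  # True
```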


## Getting Help
Confused about a particular function / method? Putting a question mark `<?>` after the object in question will return the docstring.

In [40]:
np.random.normal?
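The `?` suffix is IPython/Jupyter syntax and does not work in a plain Python script. A rough equivalent that works anywhere is the built-in `help()` function or the object's `__doc__` attribute. The variable name `doc` below is illustrative.

```python
import numpy as np

doc = np.random.normal.__doc__   # the raw docstring, as a string
print(doc[:300])                 # show just the beginning

# help(np.random.normal) would print the full docstring.
```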

## Interrupting/restarting the kernel

Code is run in the kernel process. You can interrupt the kernel by pressing the stop button <button class='btn btn-default btn-xs'><i class='icon-stop fa fa-stop'></i></button> in the toolbar. Try it out below.

In [42]:
import time
time.sleep(10)

Occasionally you may want to restart the kernel (e.g. to clear the namespace). You can do this by pressing the <button class='btn btn-default btn-xs'><i class='icon-epeat fa fa-repeat'></i></button> button in the toolbar. You can find more options under the _Kernel_ menu.