# Lecture 1: Introduction and Tools

This lecture introduces the course and its syllabus, gives a brief overview of the topics we'll cover, and explains why machine learning is useful.

We will also cover the tools used in this course: the niche each one fills, how we will use it, and how to set it up on your local machine.
***
Basic course information, schedules, assignments, and resources are located [here](https://canvas.uw.edu/courses/1192473).

## 1. Machine Learning: Human Problems Versus Computer Problems

### 1.1. Machine Problems

Most problems can be split into two categories: things a computer would be good at and things a person would be good at. Traditionally, these two categories tend to have little overlap.

Problems well suited to computers tend to be those which are:
* deterministic
* repetitive
* computationally intensive

Typically, these problems would take a very long time for a person to do, but can be coded up quickly and simply. For example, consider computing the factorial for a moderately sized number. For a person, mentally calculating $18!$ would take quite a while, but a computer can do it near instantly.

In [12]:
import time
# Python program to find the factorial of a number provided by the user.

# change the value for a different result
num = 18

# uncomment to take input from the user
#num = int(input("Enter a number: "))

# start a high-resolution timer
now = time.perf_counter()
factorial = 1

# check if the number is negative, positive or zero
if num < 0:
    print("Sorry, factorial does not exist for negative numbers")
elif num == 0:
    print("The factorial of 0 is 1")
else:
    # multiply together every integer from 1 up to and including num
    for i in range(1, num + 1):
        factorial = factorial * i
    print("The factorial of", num, "is", factorial)
print("Compute Time: %f" % (time.perf_counter() - now))

The factorial of 18 is 6402373705728000
Compute Time: 0.000569


Half a millisecond is pretty good for such a big computation!
***
This type of problem appears very frequently and can typically be implemented in a fairly obvious way. It is important to recognize problems of this kind because they are NOT suitable for machine learning.

### 1.2. Human Problems

In contrast to machine problems, human problems are tasks that we are actually quite good at! A few basic examples include understanding a spoken word, identifying a dog in a herd of sheep, or guessing which new movie your friend might like.

![ImageDet](https://kaggle2.blob.core.windows.net/competitions/kaggle/3333/media/INODex2.png)


![SpeechDet](http://gluon.mxnet.io/_images/wake-word.png)

Although these types of tasks seem simple, they become very difficult when we sit down and think about how to code up something that works well. Consider a picture of a cat. To a program, the picture is just a matrix of pixels, and figuring out what those pixels represent would be quite difficult using conventional coding methods.

These types of problems are extremely well suited to machine learning, and will be the focus of much of this course.
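
To make the difficulty concrete, here is a minimal sketch of what a program actually sees. The tiny 4x4 grayscale "image" and the hand-written `looks_like_bright_blob` rule are hypothetical, invented purely for illustration. The point is only that a rule written against fixed pixel positions breaks as soon as the same object moves by one pixel.

```python
# A hypothetical 4x4 grayscale image: each entry is a pixel brightness
# from 0 (black) to 255 (white).
image = [
    [0,   0,   0, 0],
    [0, 200, 210, 0],
    [0, 190, 220, 0],
    [0,   0,   0, 0],
]

def looks_like_bright_blob(pixels, threshold=128):
    """Hand-written rule: True if every pixel in the 2x2 center block is bright."""
    center = [pixels[r][c] for r in (1, 2) for c in (1, 2)]
    return all(value > threshold for value in center)

def shift_right(pixels):
    """Return a copy of the image moved one pixel to the right, padded with black."""
    return [[0] + row[:-1] for row in pixels]

print(looks_like_bright_blob(image))               # True: the rule fits this exact image
print(looks_like_bright_blob(shift_right(image)))  # False: same blob, shifted one pixel
```

A person would say both images show the same thing. The hand-written rule does not, and patching it for every shift, scale, and lighting change quickly becomes unmanageable.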

### 1.3. Machine Learning Examples
Some examples of machine learning tasks that we will cover in this course include:

<center>Regression</center>
![regression](https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAAfpAAAAJDI1YzM1MGQ4LTc4NDQtNDJiMS1iYmYwLTIyNzUwNmI4Njc0MA.gif)

<center>Classification</center>
![deathcap](http://gluon.mxnet.io/_images/death_cap.jpg)
<center>Death cap - do not eat!</center>

<center>Detection</center>
![detection](http://en.people.cn/NMediaFile/2017/0804/FOREIGN201708041620000597246818595.jpg)

<center>Speech Recognition</center>
<center>----D----e----e-----p------- L----ea------r------ni-----ng---</center>
![recog](http://gluon.mxnet.io/_images/speech.jpg)

<center>Captioning</center>
![detection](https://idealog.co.nz/media/VERSIONS/images/google_950x700--upscale.png)

There are many other tasks as well. You might notice that some of the examples above have very impactful applications. Try to think up some machine learning applications of your own, since they might be applicable to your term project!

## 2. Tools and Setup

### 2.1. MXNet

Although it would be relatively simple to code up a small neural network, the very large networks used in modern machine learning quickly become untenable to code by hand. This has led to the development of many machine learning frameworks. Some of the most popular frameworks you've likely heard about include TensorFlow, Caffe, and Torch. Each framework excels in certain areas, has weaknesses in others, and is supported by a different group. In this course, we will be using MXNet.

MXNet is a relatively new framework (developed in large part at UW!) that has been adopted by Amazon as its framework of choice. With its 1.0 release, it has two key benefits that make it ideal for us: extremely clean syntax and the ability to deploy to any platform. For comparison, TensorFlow is great for deploying but has some of the worst syntax of any framework, which makes it hard to learn.

#### 2.1.1. Examples

Let's take a look at a few snippets of MXNet to get a feel for how it works and how it can be used. If you're ambitious, there's an excellent crash course (parts of which will mirror the content of this course) available [here](http://gluon.mxnet.io/index.html).

In [12]:
# import MXNet and its array handling module
import mxnet as mx
from mxnet import nd
# use a fixed seed so results are always the same
mx.random.seed(1)
# create an empty matrix with three rows and four columns
x = nd.empty((3, 4))
print(x)


[[  6.97250574e+15   4.55604170e-41  -1.18404297e+03   3.09196506e-41]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00]]
<NDArray 3x4 @cpu(0)>


Note that x contains arbitrary values. The empty function allocates memory without initializing it, so we see whatever happened to be there. If we want zero initialization, we can instead say...

In [13]:
x = nd.zeros((3, 4))
print(x)


[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
<NDArray 3x4 @cpu(0)>


NDArray objects have many useful properties and methods that can be inspected:

In [14]:
dir(x)

['T',
 '__add__',
 '__array_priority__',
 '__bool__',
 '__class__',
 '__del__',
 '__delattr__',
 '__dir__',
 '__div__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__idiv__',
 '__imod__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__isub__',
 '__itruediv__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__module__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__nonzero__',
 '__pow__',
 '__radd__',
 '__rdiv__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__rpow__',
 '__rsub__',
 '__rtruediv__',
 '__setattr__',
 '__setitem__',
 '__setstate__',
 '__sizeof__',
 '__slots__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '_at',
 '_fresh_grad',
 '_get_index_nd',
 '_get_nd_advanced_indexing',
 '_get_nd_basic_indexing',
 '_prepare_value_nd',
 '_set_nd_advanced_indexing',
 '_set_nd_basic_indexing',
 '_slice',
 '_sync_copyfrom',
 '_to_shared

Two that will come in handy quite often are the shape of the matrix and the number of elements it contains.

In [15]:
print(x.shape)
print(x.size)

(3, 4)
12


If you're familiar with NumPy (a very common Python library that makes numerical operations faster), you're in luck! Nearly all NumPy operations have an equivalent NDArray function.

In [16]:
# set x to be a matrix of normally distributed random numbers
x = nd.random_normal(0, 1, (3,4))
# add x to x
x + x


[[ 0.22575472 -2.61288834 -0.2142715  -5.2619853 ]
 [-0.11471695  0.62696832 -1.15302181 -2.22119904]
 [ 1.15921438 -0.45799193  2.08968568  1.62487364]]
<NDArray 3x4 @cpu(0)>

In [18]:
# compute x squared
x**2


[[  1.27412984e-02   1.70679641e+00   1.14780692e-02   6.92212248e+00]
 [  3.28999502e-03   9.82723162e-02   3.32364827e-01   1.23343134e+00]
 [  3.35944504e-01   5.24391532e-02   1.09169650e+00   6.60053611e-01]]
<NDArray 3x4 @cpu(0)>

MXNet of course supports linear algebra operations, since nearly all machine learning is linear algebra at its core.

In [19]:
# compute the matrix multiplication of x with its transpose
nd.dot(x, x.T)


[[ 8.65313816  2.56772017 -1.8848604 ]
 [ 2.56772017  1.66735852 -1.60968721]
 [-1.8848604  -1.60968721  2.14013386]]
<NDArray 3x3 @cpu(0)>
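
To see what `nd.dot(x, x.T)` is computing, here is a minimal sketch of the same product in plain Python, with no MXNet involved. The helper names `transpose` and `matmul` are made up for this illustration, and the entries of `x` are copied from the values printed elsewhere in this lecture, so the results should agree with the output above up to rounding.

```python
# Values of x as displayed in this lecture (rounded as printed).
x = [
    [0.11287736, -1.30644417, -0.10713575, -2.63099265],
    [-0.05735848, 0.31348416, -0.57651091, -1.11059952],
    [0.57960719, -0.22899596, 1.04484284, 0.81243682],
]

def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix using the textbook definition."""
    b_columns = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in b_columns] for row in a]

# Entry (i, j) of the result is the dot product of row i and row j of x,
# which is why the 3x3 result is symmetric.
gram = matmul(x, transpose(x))
for row in gram:
    print(["%.4f" % value for value in row])
```

The framework version does the same arithmetic, but in optimized native code that can also run on a GPU.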

Conveniently, all MXNet matrices and tensors can be converted to NumPy arrays for use with other Python modules (matplotlib, for example).

In [21]:
y = x.asnumpy()
y

array([[ 0.11287736, -1.30644417, -0.10713575, -2.63099265],
       [-0.05735848,  0.31348416, -0.57651091, -1.11059952],
       [ 0.57960719, -0.22899596,  1.04484284,  0.81243682]], dtype=float32)

If your computer has a GPU, one of the main features of MXNet is that it makes it very easy to use that GPU to speed up computation.

In [22]:
# transfer x on to the GPU
x_gpu = x.copyto(mx.gpu(0))
x_gpu


[[ 0.11287736 -1.30644417 -0.10713575 -2.63099265]
 [-0.05735848  0.31348416 -0.57651091 -1.11059952]
 [ 0.57960719 -0.22899596  1.04484284  0.81243682]]
<NDArray 3x4 @gpu(0)>

In [24]:
# we can now perform a gpu addition in the same way as we did on the cpu
x_gpu + x_gpu


[[ 0.22575472 -2.61288834 -0.2142715  -5.2619853 ]
 [-0.11471695  0.62696832 -1.15302181 -2.22119904]
 [ 1.15921438 -0.45799193  2.08968568  1.62487364]]
<NDArray 3x4 @gpu(0)>

You might notice that this actually took longer than the same addition on the CPU. This is because the amount of work was quite small, so the added overhead of copying the data to the GPU took longer than the computation itself. When we start working with neural networks, having a working GPU will save you a lot of time.
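
The trade-off can be sketched with a toy cost model. Every number below (the processing rates and the fixed overhead) is a made-up assumption chosen only to show the shape of the argument; none of them are measurements of any real CPU or GPU.

```python
def cpu_seconds(n_ops, ops_per_sec=1e9):
    """Estimated CPU time: no transfer cost, but a slower (hypothetical) rate."""
    return n_ops / ops_per_sec

def gpu_seconds(n_ops, ops_per_sec=1e11, overhead_sec=1e-4):
    """Estimated GPU time: a fixed (hypothetical) copy overhead plus faster processing."""
    return overhead_sec + n_ops / ops_per_sec

def faster_device(n_ops):
    """Name the device with the smaller estimated time under this toy model."""
    return "gpu" if gpu_seconds(n_ops) < cpu_seconds(n_ops) else "cpu"

# 12 operations matches the 3x4 addition above; the larger counts stand in
# for the heavier workloads of a neural network.
for n_ops in (12, 10**6, 10**9):
    print(n_ops, "operations ->", faster_device(n_ops))
```

Under these assumed numbers the fixed overhead dominates the 12-element addition, while the larger workloads come out ahead on the GPU.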

Although MXNet's math functionality is quite nice, the real attraction is how simply a neural network can be defined. Below we define a three-layer neural network. Although we're not going to train or use it in this lecture, it's nice to see how simple it is to make a neural net!

In [37]:
from mxnet import gluon
model_ctx = mx.cpu()
class MLP(gluon.Block):
    def __init__(self, **kwargs):
        super(MLP, self).__init__(**kwargs)
        with self.name_scope():
            self.dense0 = gluon.nn.Dense(64, activation="relu")
            self.dense1 = gluon.nn.Dense(64, activation="relu")
            self.dense2 = gluon.nn.Dense(10)

    def forward(self, x):
        x = self.dense0(x)
        print("Hidden Representation 1: %s" % x)
        x = self.dense1(x)
        print("Hidden Representation 2: %s" % x)
        x = self.dense2(x)
        print("Network output: %s" % x)
        return x

net = MLP()
net.collect_params().initialize(mx.init.Normal(sigma=.01), ctx=model_ctx)

In [38]:
net

MLP(
  (dense0): Dense(None -> 64, Activation(relu))
  (dense1): Dense(None -> 64, Activation(relu))
  (dense2): Dense(None -> 10, linear)
)
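
For intuition about what the `forward` method above does, here is a minimal plain-Python sketch of the same three-step structure: two dense layers with a ReLU activation followed by a linear output layer. It assumes each dense layer computes a weighted sum of its inputs plus a bias, optionally followed by an activation. The tiny layer sizes and all weight values are hypothetical, chosen so the arithmetic is easy to check by hand, and the `dense` and `relu` helpers are illustrative stand-ins rather than MXNet functions.

```python
def relu(vector):
    """Replace every negative entry with zero."""
    return [max(0.0, value) for value in vector]

def dense(inputs, weights, biases, activation=None):
    """One output per weight row: sum(weights[j][i] * inputs[i]) + biases[j]."""
    outputs = [
        sum(w * v for w, v in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]
    return activation(outputs) if activation else outputs

# Hypothetical parameters for a 2 -> 2 -> 2 -> 1 network.
W0, b0 = [[1.0, 0.5], [-1.0, 1.0]], [0.0, 1.0]
W1, b1 = [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0]
W2, b2 = [[2.0, -1.0]], [0.5]

def forward(x):
    x = dense(x, W0, b0, activation=relu)  # hidden representation 1
    x = dense(x, W1, b1, activation=relu)  # hidden representation 2
    return dense(x, W2, b2)                # linear output layer

print(forward([1.0, 2.0]))  # [-2.5]
```

Tracing the input `[1.0, 2.0]` by hand gives `[2.0, 2.0]` after the first layer, `[0.0, 3.0]` after the second (the ReLU zeroes out the `-1.0`), and `[-2.5]` at the output, which can be negative because the last layer has no activation.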

### 2.2. Docker

![Docker](https://www.docker.com/sites/default/files/Package%20software.png)

Docker is a tool that allows containers with a specific environment to be independently built and launched. This is really useful for complicated environments that need to run on a lot of systems! For us, it guarantees that every student has the same environment (i.e. the same Python version, Jupyter environment, and MXNet version) regardless of which OS is being used.

As you might imagine, in a large class having identical environments will cut back on a lot of bugs that otherwise would pop up!

Docker containers can be thought of as miniature virtual machines that are highly portable.

#### 2.2.1. Independent Environments

The first assignment will show how to set up and run a docker container. However, there are a few things to keep in mind.

When a container is spawned, it is completely separate from the rest of the computer. Everything you do and touch will disappear when the container exits. This can be both good and bad. It's good because if you accidentally mess something up, you can very easily just close the container and spawn a new one. It's bad because if you're not careful, some of your work might disappear!

When you spawn a container, you can mount a directory using the -v option. Changes to files in that directory are then shared between the container and the host OS. Make sure that any work you want to save is in a mounted directory!
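
As a rough illustration of how the -v option is laid out, the sketch below builds (but does not run) a `docker run` command in Python. The image name `course-image` and both directory paths are hypothetical placeholders, and `docker_run_command` is a helper invented for this example. The first assignment gives the actual command to use.

```python
import shlex

def docker_run_command(image, host_dir, container_dir="/workspace"):
    """Build the argument list for a `docker run` that mounts host_dir into the container."""
    return [
        "docker", "run",
        "--rm",  # remove the container when it exits
        "-it",   # keep an interactive terminal attached
        "-v", "%s:%s" % (host_dir, container_dir),  # host path, colon, container path
        image,
    ]

# Hypothetical image name and host directory, for illustration only.
command = docker_run_command("course-image", "/home/student/homework")
print(" ".join(shlex.quote(part) for part in command))
# docker run --rm -it -v /home/student/homework:/workspace course-image
```

With a mount like this, anything saved under `/workspace` inside the container lands in the host directory and survives after the container exits.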

### 2.3. Jupyter

Jupyter Notebooks create an interactive coding environment. Snippets of code can be run in a cell, allowing easy debugging. Not only does this make writing code a much more pleasant experience, it also allows others to follow along with what you've done. Moreover, you can add good looking documentation and images easily.

This lecture is a decent example of a Notebook! It contains some code segments and quite a bit of documentation. Feel free to download it from the course website and poke around to see how things work.

In this course, the vast majority of content will be presented in Jupyter Notebooks. All lectures will be notebook style, as will assignments. The first homework shows how to set up and start using your very own notebooks. I highly recommend you play around and get comfortable, as you'll be using these quite a bit.