## What is Machine Learning (and why should you care?)

---
![comic](https://eugene-kaspersky-wpengine.netdna-ssl.com/files/2016/09/machine-learning-robots-dilbert.gif)

---



**What EXACTLY is Machine Learning?**

> *ML*: Gives computers the ability to learn without being explicitly told what to do. They often improve with experience.

**What about deep learning and AI? Are they the same thing?**

> *Deep Learning*: A software technique that imitates the workings of the human brain to process data and detect patterns for use in making decisions. A sub-category of machine learning.

It turns out there are many other sub-categories of ML besides deep learning, and they all have their own strengths, weaknesses, and areas of application. We'll try to cover as many as we can!

Ultimately, ML is in the business of creating **MODELS**. These Models are custom built for each specific task, often by feeding them with lots of **TRAINING** data. Once they have been built, they then offer **PREDICTIONS**, which can take a number of forms. Input is turned into Output, just like any other computer program.
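To make the model / training / prediction vocabulary concrete, here is a minimal sketch in plain Python. It is not how real ML libraries are written; the toy data and the `train` and `predict` function names are made up for illustration, and the "model" is just a straight line fitted with least squares.

```python
# A toy "model": a straight line y = slope * x + intercept, learned from examples.
# The data and function names are hypothetical, chosen only for this illustration.

def train(xs, ys):
    """Fit a line to the training data using ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Turn an input into an output using the trained model."""
    slope, intercept = model
    return slope * x + intercept


# TRAINING data: inputs paired with the answers we already know
inputs = [1, 2, 3, 4]
answers = [3, 5, 7, 9]

model = train(inputs, answers)     # build the MODEL
prediction = predict(model, 10)    # ask it for a PREDICTION on an unseen input
print(model, prediction)           # prints (2.0, 1.0) 21.0
```

Nobody told the program that the rule was "double it and add one"; it recovered that from the examples, which is the whole idea in miniature.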

---

### What's with all the hype (and why now?)

ML has been around for decades, so why is it gaining so much traction all of a sudden? It seems every company is doing something with AI. It comes down to a confluence of a few things:

**Developer Friendliness**

ML, for the longest time, was the purview of PhDs, universities, and perhaps a few larger companies who could afford to make big R&D investments. The algorithms are complex and the related math is more sophisticated than what most developers are accustomed to.

Within the past few years, open source libraries have been developed that abstract away a lot of the complexity of the underlying algorithms, and instead let developers focus on tuning the best possible model for their particular problem. These libraries offer tools to help developers verify and diagnose the performance of their models, and give us comprehensible data about how well or poorly the models will perform against unseen data.

For example, look at the Wikipedia page for a very popular ML algorithm like [K-Nearest-Neighbors](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm). You no longer *need* to understand the intricacies of this and other similar algorithms, and can instead focus on finding the right one for the job.
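If you're curious anyway, the core idea of K-Nearest-Neighbors fits in a few lines. The sketch below is a deliberately simplified version (plain Euclidean distance, simple majority vote) written for this explanation; the `knn_predict` name and the toy points are hypothetical, and library implementations layer faster neighbor search and options such as distance weighting on top of this. It needs Python 3.8+ for `math.dist`.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [math.dist(point, query) for point in train_points]
    # Indices of the k closest training points
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # The most common label among those neighbors wins
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two clusters of 2D points
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.5, 0.5)))  # prints a
print(knn_predict(points, labels, (5.5, 5.5)))  # prints b
```

Note that there is no real "training" step here: the algorithm just remembers the labeled examples and compares new inputs against them.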

**Performance**

With the explosion of cloud services like AWS, any developer has access to nearly unlimited computing power on demand. Newer parallelization libraries like Google's [TensorFlow](https://www.tensorflow.org/), which is used for training neural network models, can train a model in a massively parallel fashion on specialized hardware such as a GPU (Graphics Processing Unit, yay video games!). For instance, look at the 'Accelerated Computing: P2' option available at [AWS](https://aws.amazon.com/ec2/instance-types/). Anyone can provision one of these, with 192 GB of video memory and 16 GPUs, each with 2,496 cores (~40,000 cores in total). They can access this machine for roughly $7/hr, and prices continue to drop as performance improves. Compare that to running on your local machine, which has maybe 8 or 16 cores.

This type of hardware allows training of models of a size and at a speed that simply wasn't possible until very recently.


**Very Big Data**

Companies like Google, Amazon, and Facebook have so much data that they can train models that would be impossible for those on the outside to imitate. **Training Data** is a key ingredient for building interesting models, along with the Performance and Developer Friendliness described above, and these companies have more of it than just about anyone. It has allowed them to make remarkable advances in problems like image recognition, where their models [can now outperform humans](https://www.extremetech.com/extreme/233746-ai-beats-doctors-at-visual-diagnosis-observes-many-times-more-lung-cancer-signals) at classifying the objects in images.

Thankfully for everyone else, there is a trickle-down effect: techniques are available to import the results of their training (a piece of a neural network, for example) and then customize it to your specific, smaller dataset. There is also an explosion of publicly available data, and private companies (perhaps like yours!) are all collecting more data than ever that they can use to train models and automate more and more decision making.

---

### Why use Machine Learning?

While the applications of ML seem limitless, there are a few specific types of problems for which it has been especially useful:

- Complex rules that are unrealistic to program manually, e.g. spam/fraud detection.
- Problems without known algorithmic solutions, e.g. speech/image recognition and language translation.
- Coping with changing data: rules can adjust automatically, on the fly, without reprogramming.
- Helping humans learn: ML can find patterns in data that humans never would.
- Predictions, e.g. shopping predictions, chatbots, weather/stock predictions.

Some specific examples:

- [Kaggle](https://www.kaggle.com/competitions?sortBy=deadline&group=all&page=1)



---

### It's that easy!

Let's show how we can train a well-performing model and make predictions using MNIST, a well-known set of 70,000 labeled hand-drawn digits. Our goal is to build a model that can look at a hand-drawn digit and recognize it.

*(In order to run the code below, hit ctrl-enter while each cell is highlighted. We'll go over running code in Jupyter in more detail in the next chapter.)*

In [None]:
# import libraries
from sklearn.datasets import fetch_openml
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)

In [None]:
# fetch the data (downloaded from OpenML on the first run, then cached in ./tmp)
# X is the data, y are the labels
mnist = fetch_openml('mnist_784', version=1, as_frame=False, data_home='./tmp')
X, y = mnist["data"], mnist["target"]
# OpenML stores the labels as strings, so convert them to integers
y = y.astype(np.uint8)

# 'train' data is used to build the model. 'test' data is used to validate that it's working.
# we won't show the model the test data, so it can't cheat!
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

# we shuffle the training data so that it is not organized in any particular way which might affect training.
shuffle_index = np.random.permutation(60000)
X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]

In [None]:
# at this point, we have 60,000 training instances, and 10,000 we will test against.

# let's pick a test digit at random
index = np.random.randint(10000)
index

In [None]:
# let's see what that digit looks like. It's a 28x28 array of pixel intensities, 0 to 255.
image = np.reshape(X_test[index], (28,28))
pd.DataFrame(image)

In [None]:
# now let's draw those pixels as an image
plt.imshow(image, cmap=matplotlib.cm.binary, interpolation="nearest")
plt.grid(True)
plt.show()

In [None]:
# and let's see what it was supposed to be
y_test[index]

Great! Now we know what this digit looks like and what it's supposed to be. 
Can we use ML to build a model that can accurately recognize these digits?

In [None]:
# we'll use K-Nearest Neighbors as it works quite well.
# 'weights=distance' and 'n_neighbors=4' are called 'hyperparameters'. Hyperparameters let us 
# tune an algorithm to work better with our specific data. How we choose these values is explained later.
best_knn_clf = KNeighborsClassifier(weights='distance', n_neighbors=4, n_jobs=-1)

best_knn_clf.fit(X_train, y_train)
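
As a small preview of how hyperparameter values like these can be chosen, one common approach is a grid search: try each combination and keep the one that scores best under cross-validation. The sketch below is only an illustration under a few assumptions: it uses scikit-learn's small bundled 8x8 digits dataset as a stand-in for MNIST so that it runs in seconds, the candidate values are arbitrary, and the variable names are chosen so they don't overwrite anything in this notebook.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the small 8x8 digits dataset bundled with scikit-learn stands in for MNIST
X_small, y_small = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_small, y_small, test_size=0.2, random_state=42
)

# Candidate hyperparameter values to try (picked arbitrarily for this sketch)
param_grid = {"n_neighbors": [3, 4, 5], "weights": ["uniform", "distance"]}

# Try every combination, scoring each with 3-fold cross-validation on the training split
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=3)
search.fit(X_tr, y_tr)

print(search.best_params_)  # the combination that scored best
print(search.best_score_)   # its mean cross-validated accuracy
```

The test split (`X_te`, `y_te`) is deliberately left untouched during the search, for the same "no cheating" reason described earlier.
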

In [None]:
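# predict_proba expects one row per sample, so reshape the image; the result is a probability for each digit 0-9
best_knn_clf.predict_proba(X_test[index].reshape(1, -1))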

Did your model predict your image's value correctly? It probably did, even though it never saw that particular drawing before. It can be shown that this particular model successfully recognizes about 97% of new images. With a little more time and tuning, we can get closer to 100%.
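
One way to check an accuracy figure like that yourself is to have the model predict every held-out test image and count how often it is right. The sketch below shows the idea under stated assumptions: it uses scikit-learn's small bundled 8x8 digits dataset as a stand-in so that it runs quickly, so the number it prints will not match the MNIST figure, and all the `demo_` names are made up for this example.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's bundled 8x8 digits dataset stands in for MNIST
digits_X, digits_y = load_digits(return_X_y=True)
demo_X_train, demo_X_test, demo_y_train, demo_y_test = train_test_split(
    digits_X, digits_y, test_size=0.2, random_state=42
)

# Same algorithm and hyperparameters as the notebook's model
demo_clf = KNeighborsClassifier(weights="distance", n_neighbors=4)
demo_clf.fit(demo_X_train, demo_y_train)

# Predict every held-out image, then compare against the true labels
demo_predictions = demo_clf.predict(demo_X_test)
accuracy = accuracy_score(demo_y_test, demo_predictions)
print(f"accuracy on unseen images: {accuracy:.3f}")
```

On the MNIST model trained above, the equivalent check would be `best_knn_clf.score(X_test, y_test)`, though with 10,000 test images it can take a while to run.
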

Let's head back to the [Table Of Contents](../table_of_contents.ipynb) where we can take a closer look at some of the tools we'll be making use of.

Oh, and maybe run the next cell:

---
![comic](https://raw.githubusercontent.com/qingkaikong/blog/master/2017_12_machine_learning_funny_pictures/figures/figure_8.png)

---