### What is Machine Learning?

While at IBM, Arthur Samuel developed a program that learned to play checkers (1959). He described machine learning as:

    “The field of study that gives computers the ability to learn without being explicitly programmed.”

What does this mean?

As programmers, we often approach problems in a methodical, logic-based way. We try to determine what our desired outputs should be, and then create the proper rules that will transform our inputs into those outputs.

Machine learning flips the script. We want the program itself to learn the rules that describe our data the best, by finding patterns in what we know and applying those patterns to what we don’t know.

These algorithms are able to learn. Their performance generally improves with each iteration as they uncover more hidden trends in the data.
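As a rough illustration of the difference, the sketch below contrasts a rule written by hand with a rule learned from labeled examples. The function names, the exclamation-mark feature, and the data are all made up for this example; it is a toy threshold search, not a real spam filter.

```python
# Hand-written rule: a person picks the threshold.
def rule_based_is_spam(exclamation_count):
    return exclamation_count > 3


# Learned rule: pick the threshold that makes the fewest mistakes
# on examples whose answers we already know.
def learn_threshold(values, labels):
    """Return the threshold t for which `value > t` best matches the labels."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(values)):
        correct = sum((v > candidate) == label for v, label in zip(values, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Made-up training data: exclamation marks per message, and whether it was spam.
counts = [0, 1, 1, 2, 5, 6, 7, 9]
is_spam = [False, False, False, False, True, True, True, True]

threshold = learn_threshold(counts, is_spam)


def learned_is_spam(exclamation_count):
    # Apply the learned pattern to a message we have not seen before.
    return exclamation_count > threshold


print(threshold, learned_is_spam(8), learned_is_spam(1))
```

The programmer never states the threshold here. It comes out of the data, which is the sense in which the program "learns the rules".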


https://www.evernote.com/shard/s468/sh/7e3584e7-b69d-4b0f-97fd-954ca46f7b7c/ZAH3NwZ9uDGjmgl4NBUQD12CJaJ0dFnAXPgqXGoAnaBbYPAvbv8ZUJN7xA

### Supervised Learning
Machine learning can be divided into the following categories:

    Supervised Learning
    Unsupervised Learning

Supervised Learning is where the data is labeled and the program learns to predict the output from the input data. For instance, a supervised learning algorithm for credit card fraud detection would take as input a set of recorded transactions, each labeled as fraudulent or legitimate. Having learned from those examples, the program would predict whether each new transaction is fraudulent.

Supervised learning problems can be further grouped into regression and classification problems.

Regression:

In regression problems, we are trying to predict a continuous-valued output. Examples are:

    What is the housing price in Neo York?
    What is the value of cryptocurrencies?

Classification:

In classification problems, we are trying to predict which one of a discrete set of values (classes) an input belongs to. Examples are:

    Is this a picture of a human or a picture of an AI?
    Is this email spam?
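To make the distinction concrete, here is a minimal sketch using scikit-learn's `LinearRegression` and `LogisticRegression`. The apartment sizes, prices, and small/large labels are invented for illustration. The same inputs are used for both tasks, and only the type of target changes.

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Made-up inputs: apartment size in square meters (one feature per sample).
sizes = [[30], [45], [60], [80], [100], [120]]

# Regression target: a continuous value (price in thousands).
prices = [150.0, 210.0, 270.0, 350.0, 430.0, 510.0]
regressor = LinearRegression().fit(sizes, prices)
predicted_price = regressor.predict([[70]])[0]

# Classification target: one label from a fixed set (0 = small, 1 = large).
is_large = [0, 0, 0, 1, 1, 1]
classifier = LogisticRegression().fit(sizes, is_large)
predicted_class = classifier.predict([[110]])[0]

print(predicted_price)  # a real number
print(predicted_class)  # one of the labels 0 or 1
```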

For a quick preview, we will show you an example of supervised learning.

1. NYBD (Neo York Bot Department) wants to analyze how Neo Yorkers are talking to one another so that they can determine who is being negative. They have built a Naive Bayes classifier that predicts whether an intercepted text is positive or negative, based on how often each word appears in the positive and negative training examples. Run the code to see if the model classifies the sentence "This hot dog was awful!" as a negative sentiment.

In [None]:
from texts import text_counter, text_training
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

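intercepted_text = "This hot dog was awful!"

# Convert the text into word counts. text_counter comes from the local
# texts module and is assumed to be an already-fitted vectorizer.
text_counts = text_counter.transform([intercepted_text])

# Training labels: the first 1000 examples are negative (0),
# the next 1000 are positive (1).
text_labels = [0] * 1000 + [1] * 1000

text_classifier = MultinomialNB()
text_classifier.fit(text_training, text_labels)

# predict_proba returns one row per input: [P(negative), P(positive)].
final_neg, final_pos = text_classifier.predict_proba(text_counts)[0]

if final_pos > final_neg:
  print("The text is positive.")
else:
  print("The text is negative.")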
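The `texts` module imported above is not part of scikit-learn, so that cell only runs where the module is available. As a self-contained sketch of the same idea, the version below fits its own `CountVectorizer` on a tiny, made-up training set. The six sentences and their labels are assumptions for illustration and are far too few for a real classifier.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up training set; label 0 = negative, 1 = positive.
training_texts = [
    "this meal was awful",
    "awful service and a terrible wait",
    "the worst hot dog ever",
    "this was great",
    "a wonderful hot dog",
    "great service and a lovely meal",
]
training_labels = [0, 0, 0, 1, 1, 1]

# Learn the vocabulary and turn each text into a vector of word counts.
vectorizer = CountVectorizer()
training_counts = vectorizer.fit_transform(training_texts)

classifier = MultinomialNB()
classifier.fit(training_counts, training_labels)

# Reuse the same fitted vectorizer so the new text maps onto the same vocabulary.
new_counts = vectorizer.transform(["This hot dog was awful!"])
negative_probability, positive_probability = classifier.predict_proba(new_counts)[0]
prediction = classifier.predict(new_counts)[0]

if prediction == 1:
    print("The text is positive.")
else:
    print("The text is negative.")
```

The word "awful" appears only in the negative examples, which is what tips the prediction toward the negative class.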


### Unsupervised Learning
Unsupervised Learning is a type of machine learning where the program learns the inherent structure of the data based on unlabeled examples.

Clustering is a common unsupervised machine learning approach that finds patterns and structures in unlabeled data by grouping similar data points into clusters.

Some examples:

    Social networks clustering topics in their news feed
    Consumer sites clustering users for recommendations
    Search engines grouping similar objects into one cluster

For a quick preview, we will show you an example of unsupervised learning.


In [8]:
import matplotlib.pyplot as plt
import numpy as np

# Registers the 3D projection on older matplotlib versions.
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401

from sklearn import datasets

# The iris dataset stands in for the fictional measurements named on the axes.
iris = datasets.load_iris()

x = iris.data
y = iris.target

fignum = 1

# Plot the ground truth (the known labels; no clustering is run in this cell)

fig = plt.figure(fignum, figsize=(4, 3))

# Create the 3D axes through the figure so they are attached to it.
# Constructing Axes3D(fig, ...) directly no longer adds the axes to the
# figure in recent matplotlib releases, which leaves the figure empty.
ax = fig.add_subplot(111, projection='3d', elev=48, azim=134)
ax.set_position([0, 0, .95, 1])

# Label each group at the mean position of its points.
for name, label in [('Robots', 0),
                    ('Cyborgs', 1),
                    ('Humans', 2)]:
    ax.text3D(x[y == label, 3].mean(),
              x[y == label, 0].mean(),
              x[y == label, 2].mean() + 2, name,
              horizontalalignment='center',
              bbox=dict(alpha=.2, edgecolor='w', facecolor='w'))

# Reorder the labels to have colors matching the cluster results

y = np.choose(y, [1, 2, 0]).astype(float)
ax.scatter(x[:, 3], x[:, 0], x[:, 2], c=y, edgecolor='k')

# Hide the tick labels; the axis names are fictional.
ax.xaxis.set_ticklabels([])
ax.yaxis.set_ticklabels([])
ax.zaxis.set_ticklabels([])

ax.set_xlabel('Time to Heal')
ax.set_ylabel('Reading Speed')
ax.set_zlabel('EQ')

ax.set_title('')

plt.show()
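The cell above colors the points by their known labels. To see the unsupervised part, a minimal sketch with scikit-learn's `KMeans` is shown below. It groups the same iris measurements without ever looking at `iris.target`. The choice of three clusters and the fixed `random_state` are assumptions made for this example.

```python
from sklearn import datasets
from sklearn.cluster import KMeans

iris = datasets.load_iris()
measurements = iris.data  # 150 samples, 4 features; iris.target is not used

# Ask for three groups. n_init and random_state are fixed so reruns agree.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = model.fit_predict(measurements)

# Count how many samples landed in each cluster.
cluster_sizes = sorted(int((cluster_ids == k).sum()) for k in range(3))
print(cluster_sizes)
```

The cluster numbers are arbitrary: cluster 0 is not guaranteed to match label 0 in the ground truth, which is why the plotting cell reorders its labels before coloring.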

### The Machine Learning Process

https://www.evernote.com/shard/s468/sh/e73a6a8c-7970-40d2-a21c-97184d3ce379/vSziZFNcplVBWPH9onWdSvJAhPQ6qR3MJc3pAga0nFzstw9LScI98kaTXw

### Scikit-Learn Cheatsheet

https://www.evernote.com/shard/s468/sh/f0bc2785-b94b-4147-998f-3731645a81f6/sApMNBNdjwwzqNiRrfqlueaocNPeXeR8ZhQOPDNJZN6f0fR6Rzn2tNcBMg