### The Machine Learning Landscape

<b>Definition</b><br>
Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.
<br>
<br>
<i> training set </i> - examples that the system uses to learn
<br>
<i> training instance </i> - each training example
<br>
<i> accuracy </i> - a common performance measure for classification: the ratio of correctly classified instances
<br><br>
Example:<br>
The task (T) is to flag spam among new emails, the experience (E) is the training data, and the performance measure (P) could be the ratio of correctly classified emails.
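
As a minimal sketch of the performance measure P from the spam example, the ratio of correctly classified emails can be computed as below. The function name `accuracy` and the five labels are made up for illustration; they are not taken from the book.

```python
def accuracy(predicted_labels, true_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Hypothetical labels for five emails: 1 = spam, 0 = not spam.
true_labels = [1, 0, 0, 1, 0]
predicted_labels = [1, 0, 1, 1, 0]

print(accuracy(predicted_labels, true_labels))  # 4 of 5 correct -> 0.8
```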

### Why Use Machine Learning?

<ul>
<li> Problems whose existing solutions require a lot of hand-tuning or long lists of rules. One ML algorithm can often simplify the code and perform better. </li>
<li> Complex problems for which there is no good solution using a traditional approach. </li>
<li> Fluctuating environments: an ML system can adapt to new data. </li>
<li> Getting insights about complex problems and large amounts of data. </li>
</ul>

### Types of Machine Learning Systems:

### Supervised/Unsupervised Learning

ML systems can be classified by the amount and type of supervision they get during training. There are four major categories: supervised, unsupervised, semisupervised, and reinforcement learning.

#### Supervised Learning

<i> supervised learning </i> - the training data fed to the algorithm includes the desired solutions, called <i>labels</i>.

<i>Classification</i> is a typical supervised learning task.<br>
Another is to predict a <i>target</i> numeric value, given a set of <i>features</i> called <i>predictors</i>. This type of task is called <i>regression</i>.
<br>
Some regression algorithms can be used for classification as well.

Note:<br>
In ML, an <i>attribute</i> is a data type (ex: "Mileage"), while a <i>feature</i> has several meanings depending on context, but usually means an attribute plus its value (ex: "Mileage = 15,000"). Many people use the two words interchangeably.

Some of the most important supervised learning algorithms (covered in the book):
<br>
<ul>
<li> k-Nearest Neighbors </li>
<li> Linear Regression </li>
<li> Logistic Regression </li>
<li> Support Vector Machines (SVMs)</li>
<li> Decision Trees and Random Forests </li>
<li> Neural Networks </li>
</ul>

#### Unsupervised Learning

<i>Unsupervised learning </i> - the training data is unlabeled and the system learns on its own.

<ul>
<li>Clustering
<ul>
<li>k-Means</li>
<li>Hierarchical Cluster Analysis (HCA)</li>
<li>Expectation Maximization</li>
</ul>
</li>
<li>Visualization and dimensionality reduction
<ul>
<li>Principal Component Analysis (PCA)</li>
<li>Kernel PCA</li>
<li>Locally-Linear Embedding (LLE)</li>
<li>t-distributed Stochastic Neighbor Embedding (t-SNE)</li>
</ul>
</li>
<li>Association rule learning
<ul>
<li>Apriori</li>
<li>Eclat</li>
</ul>
</li>
</ul>
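
To make the clustering entry more concrete, here is a rough sketch of the k-Means idea on one-dimensional data: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat. The function `k_means_1d`, the sample points, and the starting centroids are all illustrative assumptions; a real project would more likely use a library implementation.

```python
def k_means_1d(points, centroids, n_iterations=10):
    """Cluster 1-D points by alternating assignment and centroid updates."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if its cluster is empty
                centroids[i] = sum(cluster) / len(cluster)
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 10.0]
print(clusters)   # [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```

No labels are supplied at any point; the grouping comes only from the distances between the points.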

<i> feature extraction </i> -  starts from an initial set of measured data and builds derived values

<i> anomaly detection </i> - finding outliers in a dataset
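
One simple way to find outliers, shown here only as a sketch, is to flag values that lie more than a chosen number of standard deviations from the mean. The z-score rule, the `threshold` of 2.0, and the sample readings are assumptions made for this example and are not the method described in the book.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical sensor readings with one abnormal value.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(find_outliers(readings))  # [50]
```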

<i> association rule learning </i> - dig into large amounts of data and discover interesting relations between attributes

#### Semisupervised Learning

<i> Semisupervised learning </i> - learning from partially labeled training data, usually a lot of unlabeled data and a small amount of labeled data

<i> Deep Belief Networks </i> (DBNs) are based on unsupervised components called <i>restricted Boltzmann machines</i> (RBMs) stacked on top of one another. RBMs are trained sequentially in an unsupervised manner, and then the whole system is fine-tuned using supervised learning techniques.

#### Reinforcement Learning

<i>Agent</i> - the learning system that observes the environment, selects and performs actions, and gets <i>rewards</i> in return (or <i>penalties</i>, in the form of negative rewards).

<i> Policy </i> - the strategy the agent learns in order to get the most reward over time. It defines which action the agent should choose when it is in a given situation.
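
The toy sketch below illustrates the vocabulary only: an agent tries each action in each state, observes the reward or penalty, and records the best action per state as its policy. The states, actions, and reward table are hypothetical, and the exhaustive one-step trial is a simplification; real reinforcement learning algorithms also have to deal with exploration and delayed rewards.

```python
# Hypothetical toy environment: the reward for taking an action in a state.
REWARDS = {
    ("low_battery", "recharge"): 1.0,
    ("low_battery", "explore"): -1.0,
    ("full_battery", "recharge"): -0.5,
    ("full_battery", "explore"): 1.0,
}
STATES = ["low_battery", "full_battery"]
ACTIONS = ["recharge", "explore"]


def get_reward(state, action):
    """Environment feedback: positive values are rewards, negative are penalties."""
    return REWARDS[(state, action)]


def learn_policy(states, actions):
    """Try every action in every state and keep the one with the highest reward."""
    policy = {}
    for state in states:
        observed = {action: get_reward(state, action) for action in actions}
        policy[state] = max(observed, key=observed.get)
    return policy


policy = learn_policy(STATES, ACTIONS)
print(policy["low_battery"])   # recharge
print(policy["full_battery"])  # explore
```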

### Batch and Online Learning

Another criterion used to classify ML systems is whether or not the system can learn incrementally from a stream of incoming data.

#### Batch learning

<i> batch learning </i> - the system is incapable of learning incrementally; it must be trained using all the available data. This generally takes a lot of time and computing resources, so it is typically done offline.
<br>
<i> offline learning </i> - first the system is trained, and then it is launched into production without learning more.

#### Online learning

<i> online learning</i> - the system is trained incrementally by feeding it data instances sequentially, either individually or in small groups called <i>mini-batches</i>.

<i>out-of-core learning</i> - using online learning to train systems on huge datasets that cannot fit in one machine's main memory.
<br><br>
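Note: Out-of-core learning is usually done offline (that is, not on the live system), so <i>online learning</i> can be a confusing name. It may be clearer to think of it as <i>incremental learning</i>.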

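<i> Learning rate </i> - how fast an online learning system should adapt to changing data.

With a high learning rate the system adapts quickly to new data, but it also tends to forget old data quickly. A related risk of online learning is that bad data fed to the system (for example, from a faulty data source or from someone spamming the system) will gradually degrade its performance. The system must therefore be monitored so that you can react to abnormal data and unusual behavior, for example by using an anomaly detection algorithm.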
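
The following sketch shows incremental training on mini-batches with an explicit learning rate. It assumes a one-parameter model `y = weight * x` updated by gradient descent on the mean squared error; the helper names `mini_batches` and `online_update`, the synthetic stream, and the learning rate of 0.01 are all assumptions chosen for illustration.

```python
def mini_batches(stream, batch_size):
    """Yield successive small groups of instances from a data stream."""
    batch = []
    for instance in stream:
        batch.append(instance)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def online_update(weight, batch, learning_rate):
    """Take one gradient step on mean squared error for y = weight * x."""
    gradient = sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)
    return weight - learning_rate * gradient


# Hypothetical stream of (x, y) pairs generated from y = 3 * x.
stream = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0] * 50]

weight = 0.0
for batch in mini_batches(stream, batch_size=2):
    weight = online_update(weight, batch, learning_rate=0.01)

print(round(weight, 3))  # 3.0
```

Each batch can be discarded after its update, which is what makes the same loop usable when the full dataset does not fit in memory.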

### Instance-Based Versus Model-Based Learning

ML systems can be categorized based on how they <i>generalize</i>. There are two main approaches:

#### Instance-based learning

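<i> measure of similarity </i> - a function that scores how alike two instances are (for example, the number of words two emails have in common)
<br>
<i> instance-based learning </i> - the system learns the examples by heart, then generalizes to new cases using a similarity measure.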
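
A k-Nearest Neighbors classifier is a common example of this approach. The sketch below stores the training examples as they are and labels a new instance by majority vote among its closest neighbors, using Euclidean distance as the similarity measure. The function names, the two-feature data, and the choice of `k=3` are illustrative assumptions.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Distance between two feature vectors; smaller means more similar."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(training_set, new_instance, k=3):
    """Predict a label by majority vote among the k nearest stored examples."""
    neighbors = sorted(
        training_set,
        key=lambda example: euclidean_distance(example[0], new_instance),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled examples: (features, label).
training_set = [
    ((1.0, 1.0), "ham"), ((1.2, 0.8), "ham"), ((0.9, 1.1), "ham"),
    ((5.0, 5.0), "spam"), ((5.2, 4.9), "spam"), ((4.8, 5.1), "spam"),
]

print(knn_predict(training_set, (1.1, 1.0)))  # ham
print(knn_predict(training_set, (5.1, 5.0)))  # spam
```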

#### Model-based learning

<i> model-based learning </i> - build a model from a set of examples, then use that model to make <i>predictions</i>.

<i> model selection </i> - choosing the type of model to use (for example, a linear model) and the attributes it takes as input
<br>
<i> linear model </i> - a model whose prediction is a weighted sum of the input features plus a constant term
<br>
<i> utility function</i> (or <i> fitness function</i>) - measures how good your model is
<br>
<i> cost function </i> - measures how bad your model is
<br>
<i> training </i> - running an algorithm on the training data to find the model parameters that make the model fit the data best.
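
As a small sketch of these terms, the code below trains a one-feature linear model with the closed-form least-squares solution and evaluates it with a mean squared error cost function. The function names and the four data points are made up for illustration, and least squares is only one possible training procedure.

```python
def fit_linear_model(xs, ys):
    """Return (intercept, slope) minimizing the mean squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mse_cost(xs, ys, intercept, slope):
    """Cost function: mean squared difference between predictions and targets."""
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def predict(x, intercept, slope):
    """Use the trained model to make a prediction for a new input."""
    return intercept + slope * x


# Hypothetical training data that follows y = 1 + 2x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_linear_model(xs, ys)
print(intercept, slope)                   # 1.0 2.0
print(mse_cost(xs, ys, intercept, slope)) # 0.0
print(predict(5.0, intercept, slope))     # 11.0
```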

In [None]:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn