# Lecture 12: Machine Learning

**Predictive:** Apply machine learning techniques to data you have currently to generate a model that will be able to make a prediction on future data.

**Problem:** Detecting whether credit card charges are fraudulent.

**Data science question:** Can we use the time of the charge, the location of the charge, and the price of the charge to predict whether that charge is fraudulent or not?

**Type of analysis:** Predictive analysis

Predictive analysis uses data you have now to make predictions in the future.

Machine learning approaches are used for predictive analysis!

## The Main Types of Machine Learning

* **Classical ML:** Simple data, clear features
* **Reinforcement Learning:** No data, but we have an environment to interact with
* **Ensembles:** When quality is a real problem
* **Neural Networks and Deep Learning:** Complicated data, unclear features, belief in a miracle

## Machine Learning Generalizations

### Basic Steps to Prediction

* Data partitioning
* Feature selection
* Model selection
* Model assessment

#### Data Partitioning

* The data used to build your predictive model -> Training Data
* Data from the original dataset that was held out and not used in training the model; helpful in fine-tuning prediction accuracy -> Test Data
* New and independent data set used to assess if prediction model is generalizable -> Validation Data
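
As a minimal sketch of the partitioning step, the snippet below shuffles a dataset and holds out a fraction as test data. The function name `partition`, the 80/20 split, and the stand-in `charges` list are assumptions for illustration; validation data, as defined above, would be a separate, newly collected dataset rather than a slice of this one.

```python
import random

def partition(records, test_fraction=0.2, seed=0):
    """Shuffle records and split them into (training, test) lists."""
    shuffled = list(records)                 # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

charges = list(range(100))                   # hypothetical stand-in for 100 charges
train, test = partition(charges)
print(len(train), len(test))                 # 80 20
```

Shuffling before splitting matters when the rows are ordered (for example by date), since an unshuffled split could put all recent charges in the test set.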

**CLICKER QUESTION:**

What portion of the data are typically used for generating the model?

A) The entire dataset

B) The training data

C) The testing data

D) The validation data

#### Feature Selection

* **Feature selection** determines which variables are most predictive and includes them in the model
* Variables that predict accurately exploit a relationship between the variables, but that relationship does not mean one causes the other
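
One simple way to judge which variables are most predictive is to rank them by the strength of their correlation with the outcome. The sketch below does this with Pearson correlation; the toy fraud data and the feature names are made up for illustration, and correlation ranking is only one of many feature selection approaches.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical toy data: six charges, two of them fraudulent.
features = {
    "amount":       [5, 12, 300, 8, 450, 20],
    "hour":         [14, 9, 3, 16, 2, 11],
    "day_of_month": [1, 15, 7, 22, 9, 30],
}
is_fraud = [0, 0, 1, 0, 1, 0]

# Strongest relationship first; the sign is ignored because a strong
# negative correlation is just as useful for prediction as a positive one.
ranked = sorted(features,
                key=lambda name: abs(pearson(features[name], is_fraud)),
                reverse=True)
print(ranked)  # ['amount', 'hour', 'day_of_month']
```

A high rank here says only that the variable helps prediction in this sample, not that it causes fraud.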

## Two Modes of Machine Learning

* **Supervised Learning:** You tell the computer what features to use to classify the observations
* **Unsupervised Learning:** The computer determines how to classify based on properties within the data

## Approaches to Machine Learning

**Supervised Learning**

* Classification with categorical variables
* Regression with continuous variables

**Unsupervised Learning**

* Clustering (categorical) and dimensionality reduction (continuous)
* Can automatically find structure in data

### Model Selection

*Big datasets with simple models?*

#### Regression vs. Classification

**Regression**

*"Draw a line through these dots. Yup, that's the machine learning."*

Uses:

* Stock price forecasts
* Demand and sales volume analysis
* Medical diagnosis
* Any number-time correlations

Popular algorithms: Linear regression, Polynomial regression

**Classification**

*"Splits objects based on one of the attributes known beforehand. Separate socks by color, documents based on language, music by genre."*

Uses:

* Spam filtering
* Language Detection
* Search for similar documents
* Sentiment analysis
* Recognition of handwritten characters and numbers
* Fraud detection

Popular algorithms: Naive Bayes, Decision Tree, Logistic Regression, K-nearest Neighbors, Support Vector Machine

#### Regression: Predicting Continuous Variables

* We will use the linear relationship between variables to generate a **predictive model**
* The training data will be used to build the predictive model
* Use linear regression to model the relationship
* For prediction, the individual values in the training data are not important; we only need the model
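
The points above can be sketched with ordinary least squares for a single predictor. The training values below are invented so that they lie exactly on the line y = 2x + 1; `fit_line` and `predict` are illustrative names, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data lying exactly on y = 2x + 1.
train_x = [1, 2, 3, 4, 5]
train_y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(train_x, train_y)

def predict(x):
    # Only the two fitted numbers are needed here, not the training rows.
    return slope * x + intercept

print(predict(10))  # 21.0
```

Once `slope` and `intercept` are known, the training data can be discarded: the model is just those two numbers.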

#### Classification: Predicting Categorical Variables

* Calculate outcomes with **decision trees**!
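
To make the idea concrete, here is a sketch of the smallest possible decision tree: a single yes/no split on one feature (often called a decision stump). The charge amounts and labels are invented, and a real decision tree would keep splitting each branch on further features.

```python
def fit_stump(values, labels):
    """Find the threshold on one numeric feature that misclassifies the
    fewest training rows, predicting 1 when value > threshold."""
    best_threshold, best_errors = None, len(labels) + 1
    candidates = sorted(set(values))
    # Try a split halfway between each pair of neighboring values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

# Hypothetical training data: charge amount and whether it was fraudulent.
amounts = [5, 12, 20, 300, 450, 8]
is_fraud = [0, 0, 0, 1, 1, 0]

threshold = fit_stump(amounts, is_fraud)

def classify(amount):
    return int(amount > threshold)

print(threshold, classify(200), classify(15))  # 160.0 1 0
```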

## Unsupervised Learning

**Clustering**

*"Divides objects based on unknown features. Machine chooses the best way."*

Uses:

* For market segmentation (types of customers, loyalty)
* To merge close points on a map
* For image compression
* To analyze and label new data
* To detect abnormal behavior

Popular algorithms: K-means clustering, Mean-Shift, DBSCAN

**Dimensionality Reduction (Generalization)**

*"Assembles specific features into more high-level ones."*

Uses:

* Recommender systems
* Beautiful visualizations
* Topic modeling and similar document search
* Fake image analysis
* Risk management

Popular algorithms: Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA, pLSA, GLSA), t-Distributed Stochastic Neighbor Embedding (t-SNE)

### Illustrating the K-Means Method: Putting Kebab Kiosks in the Optimal Way

1. Put kebab kiosks in random places in the city
2. Watch how buyers choose the nearest one
3. Move kiosks closer to the centers of their popularity
4. Watch and move again
5. Repeat a million times
6. Voilà! You are now the kebab master!
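
The kiosk steps above can be sketched in plain Python. The buyer coordinates are invented, and the function takes the starting kiosk positions as an argument so the random placement in step 1 stays explicit. Depending on where the kiosks start, k-means can settle on a poor arrangement, which is why it is commonly run from several random starts.

```python
import random

def k_means(points, centers, iterations=100):
    """Move each center to the mean of the points nearest to it until stable."""
    centers = list(centers)
    for _ in range(iterations):                      # steps 4-5: watch and move again
        clusters = [[] for _ in centers]
        for p in points:                             # step 2: buyers pick the nearest kiosk
            nearest = min(range(len(centers)),
                          key=lambda i: (p[0] - centers[i][0]) ** 2
                                        + (p[1] - centers[i][1]) ** 2)
            clusters[nearest].append(p)
        new_centers = []
        for center, cluster in zip(centers, clusters):
            if cluster:                              # step 3: move to the center of popularity
                new_centers.append((sum(x for x, _ in cluster) / len(cluster),
                                    sum(y for _, y in cluster) / len(cluster)))
            else:
                new_centers.append(center)           # a kiosk with no buyers stays put
        if new_centers == centers:                   # nothing moved, so we are done
            break
        centers = new_centers
    return centers

# Hypothetical buyer locations forming two neighborhoods.
buyers = [(1, 1), (1, 2), (2, 1), (2, 2), (10, 10), (10, 11), (11, 10), (11, 11)]

# Step 1: put two kiosks at random buyer locations.
start = random.Random(0).sample(buyers, 2)
kiosks = k_means(buyers, start)
print(kiosks)
```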

**CLICKER QUESTION:**

You want to predict someone's emotion based on an image. How would you approach this with machine learning?

A) Supervised, Regression

B) Supervised, Classification

C) Unsupervised, Dimensionality Reduction

D) Unsupervised, Clustering

E) Unsupervised, Neural Network

**Neural Networks**

*"We have a thousand-layer network, dozens of video cards, but still no idea where to use it. Let's generate cat pics!"*

Uses:

* Replacement of all algorithms discussed above
* Object classification on photos and videos
* Speech recognition and synthesis
* Image processing, style transfer
* Machine translation

Popular algorithms: Perceptron, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Autoencoders

## Neurons vs. Mathematical Neurons

**Biological Neuron**

* Receives a signal at the synapse
* When triggered, sends a signal (an action potential) down its axon

**Mathematical Neuron**

* Mathematical abstraction, inspired by biological neuron
* Either on or off based on the weighted sum of its inputs

### Weights

* Tell the neuron to respond more to one input and less to another
* Adjusted when training, unless an adaptive network is involved
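
A mathematical neuron of this kind fits in a few lines. In the sketch below the weights and threshold are set by hand purely for illustration; in practice they would be learned during training.

```python
def neuron(inputs, weights, threshold):
    """Fire (1) if the weighted sum of the inputs reaches the threshold, else stay off (0)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Hand-picked weights: respond strongly to the first input, weakly to the second.
weights = [0.9, 0.1]

print(neuron([1, 0], weights, threshold=0.5))  # 1: the heavily weighted input is on
print(neuron([0, 1], weights, threshold=0.5))  # 0: the lightly weighted input alone is not enough
```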

### Multilayer Perceptron (MLP)

* Signals arrive at **input** nodes
* These signals are processed through **hidden layers**
* **Outputs** result from the algorithms applied
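
As a sketch of how signals flow from input nodes through a hidden layer to an output, the tiny network below computes XOR, which a single on/off neuron cannot do. The weights and biases are chosen by hand for illustration rather than learned, and the helper names are hypothetical.

```python
def step(total):
    """On/off activation: 1 if the summed input is positive, else 0."""
    return 1 if total > 0 else 0

def layer(inputs, weight_rows, biases):
    """One layer: each neuron takes a weighted sum of all inputs plus its bias."""
    return [step(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weight_rows, biases)]

def xor_network(a, b):
    # Hidden layer: the first neuron acts as OR, the second as AND.
    hidden = layer([a, b], [[1, 1], [1, 1]], [-0.5, -1.5])
    # Output neuron: fires when OR is on and AND is off.
    (output,) = layer(hidden, [[1, -1]], [-0.5])
    return output

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```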

### Convolutional Neural Network (CNN)

* CNNs avoid hand-crafting image features
* The neural network itself learns how to build important features from simple lines
* Perceptron finds important features specific to a cat
* *"CNNs are all the rage right now. They are used to search for objects on photos and in videos, face recognition, style transfer, generating and enhancing images, creating effects like slow-mo and improving image quality. Nowadays CNNs are used in all the cases that involve pictures and videos."*

### Model Assessment

* Root Mean Squared Error (RMSE)
    * $RMSE = \sqrt{\frac{\sum (Predicted - Actual)^2}{N}}$
    * A few outliers can lead to a big increase in RMSE, even if all the other predictions are pretty good
* Accuracy
    * $accuracy = \frac{\# \ of \ samples \ predicted \ correctly}{\# \ of \ samples \ predicted} \times 100$
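
Both measures are short to compute. The sketch below uses invented predictions to show how a single outlier inflates RMSE, and how accuracy is computed for categorical predictions; the data and function names are assumptions for illustration.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error for continuous predictions."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def accuracy(predicted, actual):
    """Percentage of categorical predictions that match the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual) * 100

# Hypothetical ages: every prediction is off by exactly one year.
actual_ages = [30, 40, 50, 60]
print(rmse([31, 39, 51, 59], actual_ages))   # 1.0

# Same predictions, except the last one is off by 20 years.
print(rmse([31, 39, 51, 80], actual_ages))   # roughly 10, driven by the one outlier

# Hypothetical fraud labels: 3 of 4 predicted correctly.
print(accuracy(["fraud", "ok", "ok", "ok"], ["fraud", "ok", "fraud", "ok"]))  # 75.0
```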

### Sensitivity and Specificity

* A 2x2 table is a type of confusion matrix
    * Accuracy: What % were predicted correctly?
* $Sensitivity = \frac{TP}{TP + FN}$
    * Of the actual positives, what % were predicted to be positive?
* $Specificity = \frac{TN}{TN + FP}$
    * Of the actual negatives, what % were predicted to be negative?
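
A short sketch of these quantities, computed from the four cells of a 2x2 confusion matrix. The labels are invented (1 = fraudulent, 0 = legitimate), and the function name is hypothetical.

```python
def confusion_counts(predicted, actual):
    """Return (TP, FN, TN, FP) for binary labels where 1 is the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == 1 and a == 1 for p, a in pairs)
    fn = sum(p == 0 and a == 1 for p, a in pairs)
    tn = sum(p == 0 and a == 0 for p, a in pairs)
    fp = sum(p == 1 and a == 0 for p, a in pairs)
    return tp, fn, tn, fp

# Hypothetical results for ten charges: four truly fraudulent, six legitimate.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fn, tn, fp = confusion_counts(predicted, actual)
sensitivity = tp / (tp + fn)     # share of actual positives that were caught
specificity = tn / (tn + fp)     # share of actual negatives correctly cleared
accuracy = (tp + tn) / len(actual) * 100

print(tp, fn, tn, fp)            # 3 1 5 1
print(sensitivity)               # 0.75
```

Here specificity is 5/6 and accuracy is 80%, so a model can look accurate overall while still missing a quarter of the fraudulent charges.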

**CLICKER QUESTION:**

You're given a dataset with a number of features and have been asked to predict each individual's age. What prediction approach would you use?

A) Regression (supervised)

B) Classification (supervised)

C) Clustering (unsupervised)

D) Dimensionality reduction (unsupervised)

**CLICKER QUESTION:**

After predicting each person's age, how would you assess your model?

A) RMSE

B) Accuracy

C) Sensitivity

D) Specificity

E) AUC

**CLICKER QUESTION:**

What would be the error value you'd want from your model?

A) 0.2

B) 1.3

C) 2.5

D) 10.0

E) 20.0