# Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow

## Chapter 1

---

### How would you define Machine Learning?

#### PG 2

Machine Learning is the science (and art) of programming computers so they can
learn from data.

---

### Can you name four types of problems where it shines?

#### PG 5

- Problems for which existing solutions require a lot of fine-tuning or long
lists of rules.
- Complex problems for which using a traditional approach yields no good
solution.
- Fluctuating environments, where a system must adapt to new data.
- Getting insights about complex problems and large amounts of data.

---

### What is a labeled training set?

#### PG 8

- Supervised learning uses labeled training sets.
- Each example consists of an input (features) and an output (label).

---

### What are the two most common supervised tasks?

#### PG 8

- Classification: Output is a distinct category.
- Regression: Output is a continuous value.

---

### Can you name four common unsupervised tasks?

#### PG 10

- Clustering
- Anomaly detection and novelty detection
- Visualization and dimensionality reduction
- Association rule learning

---

### What type of Machine Learning algorithm would you use to allow a robot to walk in various unknown terrains?

#### PG 14

Reinforcement Learning
- Agent
    - observes the environment
    - selects and performs actions (using the policy)
    - gets rewards or penalties
- Policy
    - a learned strategy
    
1. Observe
1. Select action using policy
1. Action
1. Get reward or penalty
1. Update policy (learning step)
1. Iterate until an optimal policy is found
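
The loop above can be sketched with tabular Q-learning on a toy problem. This is a minimal sketch, not a walking-robot controller: the five-position corridor, the reward values, and the names `step`, `greedy`, `train`, and `q_table` are all assumptions made for illustration, and Q-learning is only one of many reinforcement learning algorithms.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Toy environment: move, then return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True      # reward for reaching the goal
    return next_state, -0.01, False       # small penalty for every other step


def greedy(values):
    """Index of the highest-valued action (ties go to the first one)."""
    return max(range(len(values)), key=lambda i: values[i])


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # q[state][i] estimates the long-term value of taking ACTIONS[i] in state.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                       # cap the episode length
            # Observe the state and select an action using the current policy.
            if rng.random() < epsilon:
                i = rng.randrange(len(ACTIONS))    # explore
            else:
                i = greedy(q[state])               # exploit
            # Act, then receive a reward or a penalty.
            next_state, reward, done = step(state, ACTIONS[i])
            # Update the policy (the learning step).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][i] += alpha * (target - q[state][i])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Learned policy: the preferred move in each non-goal position.
policy = [ACTIONS[greedy(q_table[s])] for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy should prefer stepping right (toward the goal) from every position.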

---

### What type of algorithm would you use to segment your customers into multiple groups?

#### PG 10

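- Unsupervised: use a clustering algorithm if you do not know how to define the
groups in advance.
- Supervised: use a classification algorithm if you already know the groups and
have labeled examples of customers in each one.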
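
When the groups are not known in advance, clustering can be sketched with k-means. The customer features (monthly visits, average basket value), the data, and the `kmeans` helper below are hypothetical, and the farthest-point initialization is just one simple deterministic choice. In practice a library implementation such as scikit-learn's `KMeans` would normally replace this pure-Python version.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    # Farthest-point initialization: start from the first point, then keep
    # adding the point that is farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical customers: (monthly visits, average basket value)
customers = [
    (2, 15.0), (3, 18.0), (2, 20.0),      # occasional, small baskets
    (20, 22.0), (22, 19.0), (21, 25.0),   # frequent, small baskets
    (5, 120.0), (4, 110.0), (6, 130.0),   # occasional, large baskets
]
labels, centroids = kmeans(customers, k=3)
print(labels)
```

No labels were provided: the three segments emerge from the feature values alone.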

---

### Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem?

#### PG 2

Examples of spam and ham are normally used to train a supervised learning
algorithm for spam detection.

---

### What is an online learning system?

#### PG 15

The system is trained incrementally by feeding it data instances sequentially, 
either individually or in small groups called mini-batches.

Think of this as **incremental learning**. 
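
A minimal sketch of incremental training, assuming a one-feature linear model updated by one gradient step per mini-batch. The class `OnlineLinearRegressor`, its `partial_fit` method, and the `stream` generator are hypothetical names for this illustration (the method name mirrors the convention used by scikit-learn's incremental estimators, but nothing here uses that library).

```python
class OnlineLinearRegressor:
    """Linear model y = w * x + b, updated one mini-batch at a time."""

    def __init__(self, learning_rate=0.3):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def partial_fit(self, batch):
        """Take one gradient step on this mini-batch's mean squared error."""
        n = len(batch)
        errors = [(self.w * x + self.b - y, x) for x, y in batch]
        grad_w = sum(2 * err * x for err, x in errors) / n
        grad_b = sum(2 * err for err, _ in errors) / n
        self.w -= self.learning_rate * grad_w
        self.b -= self.learning_rate * grad_b

    def predict(self, x):
        return self.w * x + self.b


def stream(n_batches, batch_size=4):
    """Hypothetical data source yielding mini-batches of (x, y) pairs."""
    i = 0
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            x = (i % 11 - 5) / 5            # cycles through -1.0, -0.8, ..., 1.0
            batch.append((x, 3 * x + 1))    # underlying relation: y = 3x + 1
            i += 1
        yield batch


model = OnlineLinearRegressor()
for batch in stream(1000):
    model.partial_fit(batch)   # each batch can be discarded once it is used

print(round(model.w, 3), round(model.b, 3))
```

The model never sees the whole dataset at once, and training can continue whenever new batches arrive.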

---

### What is out-of-core learning?

#### PG 16

Out-of-core learning is used when the data cannot fit in a machine's main memory.
The algorithm loads part of the data, runs a training step on that data,
and repeats the process until it has run on all the data.
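
A minimal sketch of the chunked pattern, using only the standard library. The temporary CSV file stands in for a dataset too large for memory, `read_chunks` is a hypothetical helper, and the "training step" is simplified to accumulating the running sums needed for a least-squares line (an assumption made to keep the example short).

```python
import csv
import itertools
import os
import tempfile


def read_chunks(path, chunk_size):
    """Yield lists of at most chunk_size (x, y) rows, never the whole file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        while True:
            rows = itertools.islice(reader, chunk_size)
            chunk = [(float(x), float(y)) for x, y in rows]
            if not chunk:
                return
            yield chunk


# Write a small stand-in dataset following y = 2x + 5.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", newline="", delete=False
) as f:
    csv.writer(f).writerows((x, 2 * x + 5) for x in range(1000))
    path = f.name

# The running sums are the only state kept between chunks.
n = sum_x = sum_y = sum_xx = sum_xy = 0.0
try:
    for chunk in read_chunks(path, chunk_size=100):
        for x, y in chunk:          # one "training step" on this chunk
            n += 1
            sum_x += x
            sum_y += y
            sum_xx += x * x
            sum_xy += x * y
finally:
    os.remove(path)

# Closed-form least-squares line from the accumulated sums.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
intercept = (sum_y - slope * sum_x) / n
print(slope, intercept)
```

Only one chunk of 100 rows is held in memory at any time, regardless of the file size.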

---

### What type of learning algorithm relies on a similarity measure to make predictions?

#### PG 17

Instance-Based Learning

All the training examples are memorized. When a new instance is presented,
a similarity measure is used to find the nearest n training examples, and
those examples determine the prediction.
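
A minimal sketch of this idea as a k-nearest-neighbors classifier. It assumes Euclidean distance as the similarity measure and a majority vote among the neighbors; the function `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(training_set, new_instance, k=3):
    """Classify new_instance by majority vote among its k nearest examples."""
    # "Training" is just storing the examples; the work happens at
    # prediction time, when distances to every stored example are computed.
    neighbors = sorted(
        training_set, key=lambda example: dist(example[0], new_instance)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Each training example is (features, label).
training_set = [
    ((1.0, 1.0), "small"), ((1.5, 2.0), "small"), ((2.0, 1.5), "small"),
    ((6.0, 6.5), "large"), ((7.0, 6.0), "large"), ((6.5, 7.5), "large"),
]

print(knn_predict(training_set, (1.8, 1.2)))
print(knn_predict(training_set, (6.2, 7.0)))
```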

---

### What is the difference between a model parameter and a learning algorithm's hyperparameter?

#### PG 29

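A *model parameter* is learned from the training data and determines what the
model predicts for a new instance (for example, the slope of a linear model).

A *hyperparameter* is a parameter of the learning algorithm (not of the model).
It is set before training and stays constant during training (for example, the
amount of regularization to apply).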

---

### What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?

#### PG 18

Model-based learning algorithms search for the model parameter values that
minimize a cost function, so that the model generalizes as well as possible
to new instances.

- Loss function: measures the error on an individual training instance.
- Cost function: measures the error over all the training instances.

After a model is trained, a prediction on a new sample is made by feeding the
new sample into the model as input and running inference to calculate an
output.
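
A minimal sketch of these three ideas for a one-feature linear model. It assumes squared error as the loss, the mean squared error as the cost, and the closed-form least-squares solution as the way to minimize it; the function names and the data are hypothetical.

```python
def predict(w, b, x):
    """Inference: feed a new sample through the trained model."""
    return w * x + b


def loss(w, b, x, y):
    """Squared error on a single training instance."""
    return (predict(w, b, x) - y) ** 2


def cost(w, b, xs, ys):
    """Mean of the per-instance losses over the whole training set."""
    return sum(loss(w, b, x, y) for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys):
    """Return the parameters (w, b) that minimize the cost (least squares)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

w, b = fit(xs, ys)                 # training: search for the best parameters
prediction = predict(w, b, 6.0)    # prediction on a new instance
print(w, b, cost(w, b, xs, ys), prediction)
```

Moving `w` or `b` away from the fitted values increases the cost, which is what "minimizing the cost function" means here.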

---

### Can you name four of the main challenges in Machine Learning?

- Insufficient Quantity of Training Data (PG 23)
- Nonrepresentative Training Data (PG 25)
- Poor-Quality Data (PG 26)
- Overfitting the Training Data (PG 27)
- Underfitting the Training Data (PG 29)

---

### If your model performs great on the training data but generalizes poorly to new instances, what is happening? Can you name three possible solutions?

#### PG 28

The model is overfitting the data.

- Simplify the model
    - select a model with fewer parameters
    - reduce the number of attributes in the training data
    - constrain the model
- Gather more training data
- Reduce the noise in the training data

---

### What is a test set, and why would you want to use it?

#### PG 30

A test set is used to evaluate how well a model generalizes to new data.
It is key that the data in the test set was never used to train the model.
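
A minimal sketch of holding out a test set, assuming a random 80/20 split. The helper `split_train_test`, the ratio, and the seed are illustrative choices; libraries such as scikit-learn provide their own splitting utilities.

```python
import random


def split_train_test(instances, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data, then hold out a fraction as the test set."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split stable
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(100))
train_set, test_set = split_train_test(data)
print(len(train_set), len(test_set))
```

The two sets share no instances, so the error measured on `test_set` estimates performance on data the model has never seen.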

---

### What is the purpose of a validation set?

#### PG 31

A validation set is used to rank different models or hyperparameter
configurations.

---

### What is the train-dev set, when do you need it, and how do you use it?

#### PG 32

The train-dev set is used when there is a risk of mismatch between the training 
data and the data used in the validation and test datasets.

The train-dev set is a holdout set from the training set.

1. Train the model on the train set.
1. Run inference on the train-dev set and the validation set.
1. If the model performs well on the training set but not on the train-dev 
set, then the model is likely overfitting.
    1. Address the overfitting.
1. If it performs well on both the training set and the train-dev set, but
not on the validation set, then there is probably a significant data mismatch 
between the training data and the validation/test data.
    1. Improve the training data to make it look more like the 
    validation/test data.
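
The decision procedure above can be written as a small helper. This is only a sketch: the `diagnose` function, its error-rate inputs, and the `tolerance` threshold for what counts as a significant gap are hypothetical choices, not values from the book.

```python
def diagnose(train_error, train_dev_error, validation_error, tolerance=0.05):
    """Interpret error rates measured on the three sets.

    tolerance is an assumed threshold for a "significant" gap in error.
    """
    if train_dev_error - train_error > tolerance:
        # Good on the training set, poor on held-out data drawn from the
        # same source as the training data.
        return "overfitting"
    if validation_error - train_dev_error > tolerance:
        # Good on the training set and the train-dev set, poor on
        # validation data.
        return "data mismatch"
    return "no obvious problem"


print(diagnose(0.02, 0.15, 0.16))
print(diagnose(0.02, 0.03, 0.20))
print(diagnose(0.02, 0.03, 0.04))
```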

---

### What can go wrong if you tune hyperparameters using the test set?

#### PG 31

Tuning hyperparameters on the test set fits the model and its hyperparameters
to that particular set. The test error then becomes an optimistic estimate, and
the model will likely perform worse on new data than the test metrics suggest.