# TYPES OF ML SYSTEMS
### 1. Supervised learning
The training data includes both features and the desired outputs (labels). The model learns to map features to labels so that it can predict labels for new instances.
Examples:
- Linear Regression
- Logistic Regression (classification)
- Support Vector Machines
- Decision Trees and Random Forests
- Neural Networks
- k-Nearest Neighbors
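
As a rough illustration of learning from features and labels, here is a minimal sketch of a one-feature threshold classifier (a decision stump, the simplest possible decision tree). The dataset and the function names `fit_stump` and `predict` are made up for this example.

```python
# Toy labeled dataset (made up): one feature and a binary label per instance.
features = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]


def fit_stump(xs, ys):
    """Pick the threshold that misclassifies the fewest training examples."""
    xs_sorted = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct feature values.
    candidates = [(a + b) / 2 for a, b in zip(xs_sorted, xs_sorted[1:])]

    def errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)


def predict(threshold, x):
    """Predict label 1 when the feature exceeds the learned threshold."""
    return int(x > threshold)


threshold = fit_stump(features, labels)
print(threshold)                                      # 4.5
print([predict(threshold, x) for x in [2.5, 5.0]])    # [0, 1]
```

The labels are what make this supervised: the threshold is chosen by comparing predictions against known answers.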

### 2. Unsupervised learning
The training data has features but no labels. The model looks for structure in the features on its own, such as groups, outliers, or lower-dimensional representations.
Examples:
- Clustering
    - k-Means
    - Hierarchical Cluster Analysis (HCA)
    - DBSCAN
- Anomaly detection and novelty detection
    - One-class SVM
    - Isolation Forest
- Visualization and dimensionality reduction
    - Principal Component Analysis (PCA)
    - Kernel PCA
    - Locally-Linear Embedding (LLE)
    - t-distributed Stochastic Neighbor Embedding (t-SNE)
- Association rule learning
    - Apriori
    - Eclat
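
To make the clustering idea concrete, here is a minimal sketch of k-Means on one-dimensional data, written in plain Python. The data, the starting centroids, and the function name `k_means_1d` are assumptions chosen for illustration; a real project would more likely use a library implementation.

```python
def k_means_1d(points, centroids, n_iters=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(n_iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else old
                     for c, old in zip(clusters, centroids)]
    return centroids, clusters


# Only features, no labels: the two groups are never named in the data.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = k_means_1d(points, centroids=[0.0, 6.0])
print(centroids)   # roughly [1.0, 5.0]
print(clusters)
```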
        
### 3. Semisupervised learning
The training data is mostly unlabeled, with labels for only a small part of it. The model combines the structure found in the unlabeled data with the few available labels.
Examples:
- Deep belief networks (DBNs)
    
### 4. Reinforcement learning
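The learning system (an agent) observes an environment, selects actions, and receives rewards or penalties in return. It learns a policy, that is, which action to choose in each situation, so as to collect the most reward over time.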
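
One of the simplest settings that shows learning from rewards alone is a two-armed bandit with an epsilon-greedy agent. The sketch below is only an illustration: the environment function `pull`, its payoffs, and the value of `epsilon` are hypothetical.

```python
import random


def pull(arm):
    """Hypothetical environment: arm 1 pays more than arm 0."""
    return [0.2, 1.0][arm]


rng = random.Random(0)      # fixed seed so runs are repeatable
value = [0.0, 0.0]          # estimated reward for each action
counts = [0, 0]             # how many times each action was tried
epsilon = 0.1               # probability of exploring a random action

for _ in range(500):
    if rng.random() < epsilon:
        arm = rng.randrange(2)                          # explore
    else:
        arm = max(range(2), key=lambda a: value[a])     # exploit the best estimate
    reward = pull(arm)
    counts[arm] += 1
    # Incremental mean of the rewards observed for this action.
    value[arm] += (reward - value[arm]) / counts[arm]

best_arm = max(range(2), key=lambda a: value[a])
print(best_arm, value)
```

The agent is never told which arm is better; it finds out by occasionally trying both and keeping track of the rewards.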



# BATCH AND ONLINE LEARNING
### Batch learning
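- Train the model with all the available data at once.
- Training usually takes a lot of time and computing resources, so it is typically done offline.
- To learn from new data, train a new version of the model from scratch on the full dataset (old and new data), then replace the old model.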

### Online learning (incremental learning)
- Train the model with data instances sequentially, either individually or by small groups called mini-batches.
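- Each learning step is fast and cheap.
- Able to learn from new data on the fly, as it arrives.
- Able to handle huge datasets that do not fit in memory (out-of-core learning).
- Able to adapt to non-stationary data; the learning rate controls how quickly the model adapts and how quickly it forgets old data.
- Able to run on limited computing resources, since data can be discarded once it has been learned.
- Well suited to systems that receive data as a continuous flow.
- Need to be careful about bad data: it gradually degrades the model's performance.
- Need to monitor the model's performance and react when it drops, for example by pausing learning or reverting to an earlier state.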
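
The following hand-rolled sketch shows the mini-batch idea with stochastic gradient descent on a one-parameter model. The data source `stream_of_mini_batches`, the relationship y = 3x, and the learning rate are assumptions made for this example. In Scikit-Learn, some estimators (such as SGDRegressor) expose a partial_fit() method for the same purpose.

```python
def stream_of_mini_batches(n_cycles):
    """Hypothetical data source: yields small batches of (x, y) pairs with y = 3x."""
    for _ in range(n_cycles):
        yield [(0.5, 1.5), (1.0, 3.0)]
        yield [(1.5, 4.5), (2.0, 6.0)]


w = 0.0                 # single model parameter for y_hat = w * x
learning_rate = 0.1

for batch in stream_of_mini_batches(20):
    # Average gradient of the squared error, computed on this mini-batch only.
    grad = sum((w * x - y) * x for x, y in batch) / len(batch)
    w -= learning_rate * grad
    # The batch can be discarded here; the model never needs to see it again.

print(w)   # close to 3.0
```

Because each update only touches one small batch, the full dataset never has to be held in memory.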

# INSTANCE-BASED VERSUS MODEL-BASED LEARNING
### Instance-based learning
- The model learns the training data by heart.
- The model generalizes to new cases by using a similarity measure to compare them to the learned cases.
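
A minimal sketch of instance-based learning is k-Nearest Neighbors: "training" just stores the examples, and prediction compares a new case to them. The data, the labels, and the choice of squared Euclidean distance as the similarity measure are assumptions for illustration.

```python
from collections import Counter

# "Learning by heart": the stored training examples are the whole model.
training = [
    ((1.0, 1.0), "small"), ((1.5, 2.0), "small"),
    ((5.0, 5.0), "large"), ((6.0, 5.5), "large"), ((5.5, 6.5), "large"),
]


def squared_distance(a, b):
    """Dissimilarity measure: smaller means more similar."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict_knn(x, k=3):
    """Majority vote among the k stored examples closest to x."""
    nearest = sorted(training, key=lambda item: squared_distance(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


print(predict_knn((1.2, 1.4)))   # small
print(predict_knn((5.4, 5.6)))   # large
```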

### Model-based learning
- The system builds a model from the training examples (for example, a line fitted to the data).
- The system generalizes to new cases by using that model to make predictions.
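
By contrast with storing examples, a model-based learner compresses the data into a few parameters. The sketch below fits a straight line by ordinary least squares in plain Python; the data and the function name `fit_line` are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]        # generated from y = 2x + 1

slope, intercept = fit_line(xs, ys)
# After fitting, only the two parameters are needed to predict a new case.
prediction = slope * 10.0 + intercept
print(slope, intercept, prediction)   # 2.0 1.0 21.0
```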

# MAIN CHALLENGES OF ML
### Insufficient quantity of training data
- Most ML algorithms need a lot of data to work well; with too little, the model cannot reliably find the underlying pattern.
- More data makes real patterns easier to detect and makes outliers easier to tell apart from them.
- More data also costs more: more time and computing resources to train and evaluate the model.

### Nonrepresentative training data
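- The training data needs to be representative of the new cases the model will be used on.
- If the population has distinct strata (subgroups), each stratum should appear in the training data in roughly its true proportion.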
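
One common way to keep each stratum represented is stratified sampling. The sketch below is a plain-Python illustration; the function name `stratified_sample`, the population, and the 10% fraction are assumptions, not something prescribed by the notes above.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, fraction, seed=0):
    """Sample the same fraction from every stratum so proportions are preserved."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


# Made-up population: 80% "urban" and 20% "rural" records.
population = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]
sample = stratified_sample(population, stratum_of=lambda r: r[0], fraction=0.1)

n_urban = sum(1 for r in sample if r[0] == "urban")
n_rural = sum(1 for r in sample if r[0] == "rural")
print(n_urban, n_rural)   # 8 2
```

A purely random sample of 10 records could easily contain no rural records at all; sampling per stratum rules that out.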

### Poor-quality data
- The training data needs to be clean.
- The training data needs to be consistent.
- The training data needs to be relevant.
- The training data needs to be unbiased.
- The training data needs to be labeled correctly.

### Irrelevant features
- The model needs to be trained with the right features.
- The model needs to be trained with the right number of features.
- The model needs to be trained with the right type of features.
- The model needs to be trained with the right combination of features.

### Overfitting the training data
- Model is too complex relative to the amount and noisiness of the training data.
- Model is trained with too many features.
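
A small experiment can make overfitting visible. The sketch below fits a degree-1 and a degree-9 polynomial to ten noisy points with NumPy; the data, the fixed alternating "noise", and the choice of degrees are all assumptions made so that the run is repeatable.

```python
import numpy as np

# Ten training points from y = 2x plus a fixed, alternating disturbance.
x_train = np.linspace(0.0, 1.0, 10)
noise = 0.3 * (-1.0) ** np.arange(10)
y_train = 2.0 * x_train + noise

# Validation points sit between the training points and carry no noise.
x_val = (x_train[:-1] + x_train[1:]) / 2
y_val = 2.0 * x_val


def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


simple = np.polyfit(x_train, y_train, deg=1)
# Degree 9 has enough parameters to pass through all ten training points.
flexible = np.polyfit(x_train, y_train, deg=9)

print("train:", mse(simple, x_train, y_train), mse(flexible, x_train, y_train))
print("val:  ", mse(simple, x_val, y_val), mse(flexible, x_val, y_val))
```

The flexible model reaches a near-zero training error by fitting the noise, and pays for it with a larger error on the validation points.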

### Underfitting the training data
- Model is too simple to learn the underlying structure of the data.
- Model is trained with too few features.



# SCIKIT-LEARN DESIGN
### Estimators
Any object that can estimate some parameters based on a dataset is called an estimator (for example, an imputer is an estimator). The estimation is performed by the fit() method, which takes only a dataset as a parameter (or two for supervised learning algorithms; the second dataset contains the labels). Any other parameter needed to guide the estimation process is considered a hyperparameter (such as an imputer's strategy), and it must be set as an instance variable (generally via a constructor parameter).
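
As a short sketch of the estimator interface, the example below fits a SimpleImputer (dataset only) and a LinearRegression (dataset plus labels). The arrays are made-up toy data.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression

# Made-up data with missing values (np.nan) in both columns.
X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan], [5.0, 60.0]])

# The hyperparameter (strategy) is set through the constructor.
imputer = SimpleImputer(strategy="median")
imputer.fit(X)                      # unsupervised estimator: dataset only
print(imputer.statistics_)          # learned medians, one per column

# A supervised estimator takes a second argument holding the labels.
X_small = np.array([[1.0], [2.0], [3.0]])
y_small = np.array([2.0, 4.0, 6.0])
regressor = LinearRegression().fit(X_small, y_small)
print(regressor.coef_)
```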
  
### Transformers
Some estimators (such as an imputer) can also transform a dataset; these are called transformers. The transformation is performed by the transform() method, which takes the dataset to transform as a parameter and returns the transformed dataset. This transformation generally relies on the learned parameters, as is the case for an imputer. All transformers also have a convenience method called fit_transform() that is equivalent to calling fit() and then transform() (but sometimes fit_transform() is optimized and runs much faster).
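
A minimal sketch of the transformer interface, using SimpleImputer on made-up data, shows that fit() followed by transform() gives the same result as fit_transform():

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up data with missing values.
X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan], [5.0, 60.0]])

# Two steps: learn the medians, then use them to fill the gaps.
imputer = SimpleImputer(strategy="median")
imputer.fit(X)
X_filled = imputer.transform(X)

# One step: the convenience method on a fresh imputer.
X_filled_again = SimpleImputer(strategy="median").fit_transform(X)

print(X_filled)
print(np.array_equal(X_filled, X_filled_again))
```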

### Predictors
Some estimators are capable of making predictions given a dataset; they are called predictors. For example, a LinearRegression model is a predictor: it can predict life satisfaction given a country's GDP per capita. A predictor has a predict() method that takes a dataset of new instances and returns a dataset of corresponding predictions. It also has a score() method that measures the quality of the predictions given a test set (and the corresponding labels in the case of supervised learning algorithms).
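
The predictor interface can be sketched with LinearRegression on toy data. The numbers below are made up (they follow y = 2x + 1) and have nothing to do with real GDP figures. For regressors, score() returns the coefficient of determination R².

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data following y = 2x + 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X, y)

# predict() takes new instances and returns one prediction per instance.
X_new = np.array([[5.0]])
predictions = model.predict(X_new)
print(predictions)          # approximately [11.]

# score() needs both the instances and their labels.
r2 = model.score(X, y)
print(r2)                   # approximately 1.0 on this noise-free data
```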

### Inspection
All the estimator's hyperparameters are accessible directly via public instance variables (e.g., imputer.strategy), and all the estimator's learned parameters are also accessible via public instance variables with an underscore suffix (e.g., imputer.statistics_).
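
The naming convention can be checked directly. In this sketch (made-up data again), the hyperparameter is readable as soon as the object exists, while the learned parameter with the underscore suffix only appears after fit():

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan], [5.0, 60.0]])
imputer = SimpleImputer(strategy="median")

# Hyperparameter: a plain public attribute, available right away.
print(imputer.strategy)

# Learned parameter: not present until the estimator has been fitted.
has_statistics_before_fit = hasattr(imputer, "statistics_")
print(has_statistics_before_fit)

imputer.fit(X)
print(imputer.statistics_)
```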

### Nonproliferation of classes
Datasets are represented as NumPy arrays or SciPy sparse matrices, instead of homemade classes. Hyperparameters are just regular Python strings or numbers.

### Composition
Existing building blocks are reused as much as possible. For example, it is easy to create a Pipeline estimator from an arbitrary sequence of transformers followed by a final estimator.
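
As an illustration of composition, the sketch below chains two transformers and a final predictor into one Pipeline. The step names ("imputer", "scaler", "regressor") and the data are arbitrary choices for this example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up data with missing values, plus labels.
X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan], [5.0, 60.0]])
y = np.array([1.0, 2.0, 3.0, 5.0])

pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),   # transformer
    ("scaler", StandardScaler()),                    # transformer
    ("regressor", LinearRegression()),               # final estimator
])

# The pipeline is itself an estimator: fit() and predict() run every step in order.
pipeline.fit(X, y)
predictions = pipeline.predict(X)
print(predictions)
```

Because the pipeline exposes the same fit() and predict() methods as its last step, it can be used anywhere a single estimator is expected.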

### Sensible defaults
Scikit-Learn provides reasonable default values for most parameters, making it easy to create a baseline working system quickly.

