## Look at the Big Picture
### Performance Measure

* RMSE vs. MAE
    * RMSE$(\mathbf{X},h)=\sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(h(\mathbf{x}^{(i)})-y^{(i)}\right)^2}$, which corresponds to the $\ell_2$ norm
    * MAE$(\mathbf{X},h)=\frac{1}{m}\sum_{i=1}^{m}\lvert h(\mathbf{x}^{(i)})-y^{(i)}\rvert$, which corresponds to the $\ell_1$ norm
    * The higher the norm index, the more it focuses on large values and neglects small ones. This is why the RMSE is more sensitive to outliers than MAE. But when outliers are exponentially rare, the RMSE performs very well and is generally preferred.
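
To make the outlier sensitivity concrete, here is a minimal NumPy sketch of both measures. The helper names `rmse` and `mae` and the sample values are made up for illustration; they are not taken from the original notes.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error (based on the l2 norm)."""
    errors = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error (based on the l1 norm)."""
    errors = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(errors)))

y_true = np.array([10.0, 12.0, 11.0, 13.0])
y_pred_clean = np.array([11.0, 11.0, 12.0, 12.0])    # every error has magnitude 1
y_pred_outlier = np.array([11.0, 11.0, 12.0, 23.0])  # the last error is 10

print(rmse(y_true, y_pred_clean), mae(y_true, y_pred_clean))
print(rmse(y_true, y_pred_outlier), mae(y_true, y_pred_outlier))
```

With equal-sized errors the two measures agree; with a single large error the RMSE grows noticeably faster than the MAE.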
    
    
Further reading: [Machine Learning Model Evaluation](https://zhuanlan.zhihu.com/p/30721429机器学习模型评估) (in Chinese)

## Get the Data
* Useful pandas functions
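    * `df.info()` for column types and non-null counts, `value_counts()` for category frequencies
    * `df.hist()` for a histogram of each numerical attribute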
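
A quick sketch of these calls on a toy DataFrame. The column names and values are invented for illustration; a real dataset would be loaded from a file instead.

```python
import pandas as pd

# Toy data standing in for a real dataset (values are made up).
df = pd.DataFrame({
    "rooms": [3, 4, None, 5, 2],
    "price": [120.0, 150.0, 135.0, 210.0, 95.0],
    "area": ["inland", "coast", "inland", "coast", "inland"],
})

df.info()                                # dtypes and non-null counts per column
area_counts = df["area"].value_counts()  # frequency of each category
print(area_counts)

# df.hist(bins=50) would draw one histogram per numeric column (needs matplotlib).
```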
### Create a Test Set
* `from sklearn.model_selection import train_test_split`
* `from sklearn.model_selection import StratifiedShuffleSplit`
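
The two splitters can be compared on synthetic data, as in the sketch below. The `strata` array is a hypothetical category (for example an income bucket) with an 80/20 class balance; it is an assumption for this example only.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit, train_test_split

rng = np.random.RandomState(42)
X = rng.rand(100, 3)
strata = np.array([0] * 80 + [1] * 20)  # hypothetical category to stratify on

# Purely random split: 20% of the rows go to the test set.
X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

# Stratified split: the test set keeps the category proportions of the full set.
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(splitter.split(X, strata))
strat_test_share = strata[test_idx].mean()
print(len(X_test), strat_test_share)
```

A purely random split can drift away from the true proportions on small datasets, which is the usual argument for stratifying on an important attribute.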

## Discover and Visualize the Data to Gain Insights

## Prepare the Data for ML Algorithms
### Data Cleaning
* Missing values
```
# SimpleImputer replaces sklearn.preprocessing.Imputer, which was removed from scikit-learn
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy="median")
```
`strategy` could also be `"mean"` or `"most_frequent"`
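
As a small worked example, assuming a scikit-learn version that provides `sklearn.impute.SimpleImputer`, the imputer learns one median per column and uses it to fill the gaps. The array below is made up.

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 10.0],
              [np.nan, 20.0],
              [3.0, np.nan],
              [8.0, 60.0]])

imputer = SimpleImputer(strategy="median")
X_filled = imputer.fit_transform(X)  # learns per-column medians, then fills NaNs
print(imputer.statistics_)           # the learned medians, one per column
print(X_filled)
```

Fitting on the training set only and reusing the same imputer on the test set keeps the test data from leaking into the learned statistics.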

### Text and Categorical Attributes
* **sklearn.preprocessing.LabelEncoder** converts text labels to integer codes.
* **sklearn.preprocessing.OneHotEncoder** converts integer categorical values into one-hot vectors, stored as a SciPy sparse matrix by default.
* **sklearn.preprocessing.LabelBinarizer** applies both transformations in one shot and returns a dense array by default; setting *sparse_output=True* returns a sparse matrix instead.
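
The three encoders can be compared side by side on a few made-up labels. This is only a sketch; the label values are not from the original dataset.

```python
from sklearn.preprocessing import LabelBinarizer, LabelEncoder, OneHotEncoder

categories = ["inland", "coast", "inland", "island"]  # made-up labels

# Text labels -> integer codes (classes are sorted alphabetically).
label_encoder = LabelEncoder()
codes = label_encoder.fit_transform(categories)

# Integer codes -> one-hot vectors; the result is a SciPy sparse matrix by default.
onehot = OneHotEncoder().fit_transform(codes.reshape(-1, 1))

# Text labels -> dense one-hot array in a single step.
binarized = LabelBinarizer().fit_transform(categories)

print(codes)
print(onehot.toarray())
print(binarized)
```

Recent scikit-learn versions of `OneHotEncoder` also accept string categories directly, so the intermediate integer step is often unnecessary.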

### Custom Transformers
The more combinations you can try out automatically, the more likely you are to find a great feature combination.

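```
from sklearn.base import BaseEstimator, TransformerMixin

class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self):  # hyperparameters go here as keyword arguments
        ...
    def fit(self, X, y=None):
        return self  # nothing to learn from the data
    def transform(self, X, y=None):
        ...  # return X with the combined attributes appended
```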
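
One way the skeleton might be filled in is sketched below. The hyperparameters `numerator_ix`, `denominator_ix`, and `add_ratio` are hypothetical names chosen for this example, and the transformer simply appends one ratio column; the notebook's actual version may differ.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class CombinedAttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, numerator_ix=0, denominator_ix=1, add_ratio=True):
        # Hypothetical hyperparameters: which columns to combine, and whether to.
        self.numerator_ix = numerator_ix
        self.denominator_ix = denominator_ix
        self.add_ratio = add_ratio

    def fit(self, X, y=None):
        return self  # stateless: nothing is learned from the data

    def transform(self, X, y=None):
        X = np.asarray(X, dtype=float)
        if not self.add_ratio:
            return X
        ratio = X[:, self.numerator_ix] / X[:, self.denominator_ix]
        return np.c_[X, ratio]  # append the new column on the right

X = np.array([[6.0, 2.0], [9.0, 3.0]])
X_extra = CombinedAttributesAdder().fit_transform(X)
print(X_extra)
```

Exposing the on/off switch as a constructor argument is what lets a hyperparameter search decide whether the extra attribute helps.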

### Feature scaling
* Standardization vs. min-max scaling: standardization is much less affected by outliers, but unlike min-max scaling it does not bound values to a fixed range.
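
The effect of a single outlier can be sketched as follows. The data is synthetic: 1000 legitimate values between 0 and 15, plus one erroneous value of 100.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

inliers = np.linspace(0.0, 15.0, 1000)
x = np.append(inliers, 100.0).reshape(-1, 1)  # one erroneous value of 100

x_minmax = MinMaxScaler().fit_transform(x)  # (x - min) / (max - min)
x_std = StandardScaler().fit_transform(x)   # (x - mean) / std

# How much room the 1000 legitimate values occupy after scaling.
spread_minmax = x_minmax[:-1].max() - x_minmax[:-1].min()  # out of a total range of 1
spread_std = x_std[:-1].max() - x_std[:-1].min()           # in standard deviations
print(spread_minmax, spread_std)
```

Without the outlier the legitimate values would fill the whole 0 to 1 range under min-max scaling and span roughly 3.5 standard deviations under standardization. With it, min-max scaling squeezes them into 0 to 0.15, while the standardized spread only drops to roughly 2.9.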

### Transformation pipelines
* `from sklearn.pipeline import Pipeline`
* `from sklearn.pipeline import FeatureUnion`
* Use a `'selector'` step to restrict a pipeline to a subset of the features, and `FeatureUnion()` to combine the numerical and categorical feature pipelines
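
A minimal sketch of this layout follows. The `ColumnSelector` class is a hypothetical helper written for this example, and the DataFrame columns are made up.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

class ColumnSelector(BaseEstimator, TransformerMixin):
    """Hypothetical helper: pick DataFrame columns and return a NumPy array."""
    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X[self.columns].values

df = pd.DataFrame({
    "rooms": [3.0, 4.0, np.nan, 5.0],
    "income": [2.5, 3.1, 4.0, 8.2],
    "area": ["inland", "coast", "inland", "coast"],
})

num_pipeline = Pipeline([
    ("selector", ColumnSelector(["rooms", "income"])),
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", StandardScaler()),
])
cat_pipeline = Pipeline([
    ("selector", ColumnSelector(["area"])),
    ("onehot", OneHotEncoder()),
])
full_pipeline = FeatureUnion([
    ("num", num_pipeline),
    ("cat", cat_pipeline),
])

prepared = full_pipeline.fit_transform(df)
print(prepared.shape)  # rows x (2 numeric columns + 2 one-hot columns)
```

Recent scikit-learn versions also offer `sklearn.compose.ColumnTransformer`, which covers the same use case without a custom selector.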

## Select and Train a Model
### Better Evaluation Using Cross-Validation
* `from sklearn.model_selection import cross_val_score`
* `scores = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=10)` returns negated MSE values (higher is better), so the per-fold RMSE is `np.sqrt(-scores)`
* Save all the models/predictions/scores
```
import joblib  # sklearn.externals.joblib was removed from recent scikit-learn releases
joblib.dump(my_model, "my_model.pkl")
# and later...
my_model_loaded = joblib.load("my_model.pkl")
```
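
Putting cross-validation and model persistence together, a self-contained sketch might look like this. The data is synthetic, `LinearRegression` is just a stand-in model, and a temporary directory is used so the example leaves no file behind.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=200)  # synthetic target

model = LinearRegression()
scores = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=10)
rmse_scores = np.sqrt(-scores)  # flip the sign, then take the square root
print(rmse_scores.mean(), rmse_scores.std())

# Fit on all the data, save the model, and load it back.
model.fit(X, y)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_model.pkl")
    joblib.dump(model, path)
    my_model_loaded = joblib.load(path)
```

Reporting the standard deviation alongside the mean gives a rough idea of how precise the performance estimate is.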

## Fine-Tune Your Model
### Grid Search
* `from sklearn.model_selection import GridSearchCV`
* Some of the data preparation steps can be treated as hyperparameters using the `CombinedAttributesAdder` transformer.
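
The idea of searching over preparation choices can be sketched with a small pipeline. Here the imputation strategy stands in for the preparation hyperparameter, and `Ridge` stands in for the model; both are assumptions for this example, and the data is synthetic.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

rng = np.random.RandomState(0)
X = rng.rand(120, 3)
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=120)
X[rng.rand(120, 3) < 0.05] = np.nan  # knock out roughly 5% of the values

pipeline = Pipeline([
    ("imputer", SimpleImputer()),
    ("model", Ridge()),
])
param_grid = {
    "imputer__strategy": ["mean", "median"],  # a data preparation choice
    "model__alpha": [0.1, 1.0, 10.0],         # a model hyperparameter
}
grid_search = GridSearchCV(pipeline, param_grid, cv=5,
                           scoring="neg_mean_squared_error")
grid_search.fit(X, y)
print(grid_search.best_params_)
```

The `step__parameter` naming is how a grid reaches into pipeline steps, so any constructor argument of a custom transformer can be searched the same way.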

### Randomized Search

### Ensemble methods

* Top-feature selection can also be integrated into the data preparation pipeline by adding a custom `TopFeatureSelector(feature_importances, k)` transformer as a Pipeline step.
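
A possible implementation of such a selector is sketched below. It assumes the importances come from an already trained model (for example a random forest's `feature_importances_`); the scores used here are made up, and the notebook's own version may differ in detail.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class TopFeatureSelector(BaseEstimator, TransformerMixin):
    def __init__(self, feature_importances, k):
        self.feature_importances = feature_importances
        self.k = k

    def fit(self, X, y=None):
        # Indices of the k largest importances, kept in original column order.
        importances = np.asarray(self.feature_importances)
        self.feature_indices_ = np.sort(np.argsort(importances)[-self.k:])
        return self

    def transform(self, X):
        return np.asarray(X)[:, self.feature_indices_]

importances = [0.05, 0.40, 0.10, 0.45]  # made-up importance scores
X = np.arange(12).reshape(3, 4)
X_top = TopFeatureSelector(importances, k=2).fit_transform(X)
print(X_top)
```

Because `k` is a constructor argument, it can be tuned by a grid or randomized search like any other hyperparameter.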

## Launch, Monitor, and Maintain Your System

Bring in human expertise. Check data quality regularly, because degraded input data may only show up in model performance much later.

## Resources
* Detailed implementation [notebook](https://github.com/ageron/handson-ml/blob/master/02_end_to_end_machine_learning_project.ipynb) by [Aurélien Géron](https://github.com/ageron)