The blueprint is: **problem definition**, **evaluation**, **feature engineering**, and **fighting overfitting**.

1. **Defining the problem and assembling a dataset**
2. **Choosing a measure of success**  
    Some useful notes:
    * For balanced-classification problems, where every class is equally likely, accuracy and area under the receiver operating characteristic curve (ROC AUC) are common metrics. For class-imbalanced problems, you can use precision and recall. For ranking problems or multilabel classification, you can use mean average precision. 
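
To make the difference between these metrics concrete, here is a minimal pure-Python sketch of accuracy, precision, and recall for a binary problem. The function names and the toy labels are illustrative assumptions, not part of any library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical class-imbalanced labels: nine negatives, one positive.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
y_pred = [0] * 10  # a model that always predicts the majority class

print(accuracy(y_true, y_pred))
print(precision_recall(y_true, y_pred))
```

On this toy data the always-negative model reaches high accuracy while finding none of the positives, which is why accuracy alone can mislead on class-imbalanced problems.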
3. **Deciding on an evaluation protocol**  
    Three common evaluation protocols:
      * Maintaining a hold-out validation set: the way to go when you have plenty of data
      * Doing K-fold cross-validation: the right choice when you have too few samples for hold-out validation to be reliable
      * Doing iterated K-fold validation: for performing highly accurate model evaluation when little data is available
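
As a rough illustration of K-fold cross-validation, the sketch below splits sample indices into K folds and averages a validation score across them. `k_fold_indices`, `k_fold_score`, and the `train_and_evaluate` callback are hypothetical names introduced for this example; the callback stands in for whatever training and scoring code you use.

```python
def k_fold_indices(num_samples, k):
    """Yield (train_indices, val_indices) for each of the k folds."""
    fold_size = num_samples // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs any leftover samples.
        stop = num_samples if fold == k - 1 else start + fold_size
        val = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, num_samples))
        yield train, val


def k_fold_score(num_samples, k, train_and_evaluate):
    """Average the validation score returned by the callback over k folds."""
    scores = [train_and_evaluate(train, val)
              for train, val in k_fold_indices(num_samples, k)]
    return sum(scores) / len(scores)


for train, val in k_fold_indices(10, 3):
    print("train:", train, "validation:", val)
```

Iterated K-fold validation would repeat this procedure several times, shuffling the data before each repetition, and average all the resulting scores.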
4. **Preparing your data**  
    Once you know what you’re training on, what you’re optimizing for, and how to evaluate your approach, you’re almost ready to begin training models. But first, you should format your data in a way that can be fed into a machine-learning model (here, we’ll assume a deep neural network):
    * As you saw previously, your data should be formatted as tensors.
    * The values taken by these tensors should usually be scaled to small values: for example, in the [-1, 1] range or [0, 1] range. 
    * If different features take values in different ranges (heterogeneous data), then the data should be normalized.
    * You may want to do some feature engineering, especially for small-data problems.
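
One common way to normalize heterogeneous features, assumed here for illustration, is to rescale each feature to zero mean and unit standard deviation using statistics computed on the training data only. The helper name `standardize_columns` and the toy rows are hypothetical.

```python
import math


def standardize_columns(train_rows, other_rows):
    """Scale each feature to zero mean and unit std, using training statistics only."""
    num_features = len(train_rows[0])
    means, stds = [], []
    for j in range(num_features):
        column = [row[j] for row in train_rows]
        mean = sum(column) / len(column)
        variance = sum((x - mean) ** 2 for x in column) / len(column)
        means.append(mean)
        # Fall back to 1.0 for constant features to avoid dividing by zero.
        stds.append(math.sqrt(variance) or 1.0)

    def scale(rows):
        return [[(row[j] - means[j]) / stds[j] for j in range(num_features)]
                for row in rows]

    return scale(train_rows), scale(other_rows)


# Two features on very different scales.
train = [[1.0, 100.0], [3.0, 300.0]]
validation = [[2.0, 200.0]]
train_scaled, validation_scaled = standardize_columns(train, validation)
print(train_scaled, validation_scaled)
```

Reusing the training mean and standard deviation for the validation rows keeps information from the validation data out of the preprocessing step.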
 
5. **Developing a model that does better than a baseline**  
    You need to make three key choices to build your first working model:
    * Last-layer activation: this establishes useful constraints on the network’s output. For instance, the IMDB classification example used `sigmoid` in the last layer; the regression example didn’t use any last-layer activation; and so on.
    * Loss function: this should match the type of problem you’re trying to solve. For instance, the IMDB example used `binary_crossentropy`, the regression example used `mse`, and so on.
    * Optimization configuration: what optimizer will you use? What will its learning rate be? In most cases, it’s safe to go with `rmsprop` and its default learning rate.
    
          PROBLEM TYPE                                 LAST-LAYER ACTIVATION  LOSS FUNCTION

          Binary classification                        sigmoid                binary_crossentropy
          Multiclass, single-label classification      softmax                categorical_crossentropy
          Multiclass, multilabel classification        sigmoid                binary_crossentropy
          Regression to arbitrary values               None                   mse
          Regression to values between 0 and 1         sigmoid                mse or binary_crossentropy
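
To show what the activations and losses in the table compute, here is a simplified per-sample sketch in plain Python. These are illustrative re-implementations under the assumption of a single sample, not the Keras functions themselves; the `eps` clipping value is also an assumption.

```python
import math


def sigmoid(x):
    """Squash a real number into (0, 1), usable as a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one binary label y_true (0 or 1) and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Loss for one one-hot label and a predicted probability distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


def mse(y_true, y_pred):
    """Squared error for one regression target."""
    return (y_true - y_pred) ** 2


print(binary_crossentropy(1, sigmoid(0.0)))
print(categorical_crossentropy([0, 1, 0], softmax([1.0, 2.0, 3.0])))
print(mse(3.0, 1.0))
```

The pairing in the table follows from the output constraints: `sigmoid` and `softmax` produce probabilities, which the crossentropy losses expect, while unconstrained regression outputs pair naturally with `mse`.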
                       
6. **Scaling up: developing a model that overfits**  
   The ideal model is one that stands right at the border between underfitting and overfitting; between undercapacity and overcapacity. To figure out where this border lies, first you must cross it.  
   To figure out how big a model you’ll need, you must develop a model that overfits. This is fairly easy:
   1. Add layers.
   2. Make the layers bigger.
   3. Train for more epochs.  
   When you see that the model’s performance on the validation data begins to degrade, you’ve achieved overfitting.
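
One way to spot that point, sketched here under the assumption that you record a validation loss after every epoch, is to look for the epoch after which the loss stops improving. The function name, the `patience` parameter, and the loss values are hypothetical.

```python
def best_epoch_before_overfitting(val_losses, patience=1):
    """Return the 1-based epoch with the lowest validation loss once the loss
    has failed to improve for `patience` consecutive epochs, else None."""
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch
    return None  # validation loss was still improving: no overfitting seen yet


# Made-up validation losses: improving for three epochs, then degrading.
history = [0.90, 0.70, 0.60, 0.65, 0.70]
print(best_epoch_before_overfitting(history, patience=2))
```

A result of `None` suggests the model has not yet crossed into overfitting, so adding capacity or training longer is still worth trying.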
   
7. **Regularizing your model and tuning your hyperparameters**  
   This step will take the most time: you’ll repeatedly modify your model, train it, evaluate on your validation data (not the test data, at this point), modify it again, and repeat, until the model is as good as it can get. These are some things you should try:
   * Add dropout.
   * Try different architectures: add or remove layers. 
   * Add L1 and/or L2 regularization.
   * Try different hyperparameters (such as the number of units per layer or the learning rate of the optimizer) to find the optimal configuration.
   * Optionally, iterate on feature engineering: add new features, or remove features that don’t seem to be informative.
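
For intuition about two of the techniques listed above, the sketch below shows what L1/L2 weight penalties and dropout compute on plain Python lists. It is a simplified illustration, not a framework implementation; the function names, the `factor` values, and the use of inverted dropout (rescaling the kept activations during training) are assumptions of this example.

```python
import random


def l1_penalty(weights, factor):
    """Extra loss proportional to the absolute values of the weights."""
    return factor * sum(abs(w) for w in weights)


def l2_penalty(weights, factor):
    """Extra loss proportional to the squared values of the weights."""
    return factor * sum(w * w for w in weights)


def dropout(activations, rate, rng):
    """Training-time dropout: zero each activation with probability `rate`
    and scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


weights = [1.0, -2.0]
print(l1_penalty(weights, 0.5), l2_penalty(weights, 0.5))

rng = random.Random(0)  # seeded so the example is repeatable
print(dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng))
```

The penalties are added to the training loss so that large weights cost more, while dropout is applied only during training and disabled at evaluation time.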
                         
                        

## Summary
* Define the problem at hand and the data on which you’ll train. Collect this data, or annotate it with labels if need be.
* Choose how you’ll measure success on your problem. Which metrics will you monitor on your validation data?
* Determine your evaluation protocol: hold-out validation? K-fold validation? Which portion of the data should you use for validation?
* Develop a first model that does better than a basic baseline: a model with statistical power.
* Develop a model that overfits.
* Regularize your model and tune its hyperparameters, based on performance on the validation data. A lot of machine-learning research tends to focus only on this step, but keep the big picture in mind.