# <span style="color:purple">GIS and Machine Learning for Object Detection in Satellite Imagery</span>

<img src="img/robot.jpg"></img>

## <span style="color:blue">Step 5: Set up a configuration file containing CNN hyperparameters and a label file containing your object classes</span>

Neural networks are great at learning a wide variety of patterns. How well a network identifies any given pattern is often determined by its architecture. In this step, we will create a configuration file that stores the values, or hyperparameters, that determine the architecture and behavior of the neural network.

To understand hyperparameters, let's briefly (and broadly) revisit how neural networks operate...

<img src="img/simple_neural_network.jpg"></img>

Broadly speaking, the purpose of the architecture of the neural network is to help break down large problems into smaller problems that can be solved by smaller segments of the network. For instance, let's take a look at this simplified neural network that determines if an input image has a dog:

<img src="img/dogneuralnetwork.png"></img>

With respect to simple neural networks, there are two immediate terms that we will cover:

##### FeedForward and Backpropagation

## FeedForward

As data flows through the network, each neuron has a vote in the activation of its connected neurons, until the signals make it to the output layer. The feeding of data from the input layer through the hidden layers to the output layer is an action termed "FeedForward".

Each connection between two neurons starts out with an initial weight, which controls how strongly one neuron's signal influences the next.

<img src="img/nn01.png"></img>
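As a rough illustration of the FeedForward pass described above, here is a minimal sketch in plain Python. The layer sizes, the weights, and the choice of a sigmoid activation are made-up values for illustration only; they are not taken from the network we train later in this workflow.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def feed_forward(inputs, layers):
    """Pass `inputs` through each layer in turn and return the final outputs.

    Each layer is a (weights, biases) pair, where weights[j] holds the
    incoming connection weights of neuron j in that layer.
    """
    activations = inputs
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(neuron_weights, activations)) + b)
            for neuron_weights, b in zip(weights, biases)
        ]
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_layer = ([[0.5, -0.4], [0.3, 0.8]], [0.0, 0.1])
output_layer = ([[1.2, -0.7]], [0.05])

prediction = feed_forward([1.0, 0.5], [hidden_layer, output_layer])
print(prediction)  # a list holding one value between 0 and 1
```

Each neuron sums its weighted inputs, adds a bias, and passes the result through the activation function; the outputs of one layer become the inputs of the next.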

## Backpropagation

The "magic" of Machine Learning is in the learning that happens after each FeedForward pass during training.

During training, we pass our training data through the network's layers and let the network take a guess. Early in training, this guess will almost certainly be well off the mark.

We know the correct answer and we know that the neural network made a wrong guess, so now we need to determine which neural connections and weights contributed the most to the error in our guess.

Our goal with backpropagation is to update each of the weights in the network so that the actual output moves closer to the target output, thereby minimizing the error for each output neuron and for the network as a whole.

We want to adjust the weights so that the error in the output is as low as possible. 

One of the ways this can occur is through gradient descent, which we'll mention briefly. 

## Gradient Descent

<img src="img/gradientdescent.JPG"></img>

Backpropagation works out how much each weight contributed to the error. Gradient descent then uses that information to nudge each weight by a small amount on every iteration, gradually minimizing the error across the entire network.

<img src="img/nnweights.png"></img>
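To see the idea on the smallest possible scale, the sketch below applies gradient descent to a "network" with a single weight and a squared-error loss. The input, target, starting weight, and learning rate are all hypothetical values chosen so the numbers are easy to follow.

```python
def train_single_weight(x, target, weight=0.0, learning_rate=0.1, steps=50):
    """Fit `weight` so that weight * x approaches `target`."""
    for _ in range(steps):
        prediction = weight * x              # FeedForward
        error = prediction - target          # how far off the guess was
        gradient = 2.0 * error * x           # slope of error**2 with respect to weight
        weight -= learning_rate * gradient   # take a small step downhill
    return weight

final_weight = train_single_weight(x=2.0, target=6.0)
print(final_weight)  # converges toward 3.0, since 3.0 * 2.0 == 6.0
```

A real network repeats the same update for every weight at once, using backpropagation to obtain each weight's gradient.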

## Epoch

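FeedForward + Backpropagation over the entire training set = Epoch


Each FeedForward and Backpropagation pass over one batch of training data is a single training iteration (or step). An Epoch is one complete pass of the entire training set through the network, so an epoch usually consists of many iterations.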

To experiment with these concepts interactively, try the TensorFlow Playground:

http://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.75188&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false

# A few examples of hyperparameters

#### Learning rate
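The learning rate controls how large a step the optimization algorithm takes each time it updates a weight. If it is too high, training can overshoot the minimum; if it is too low, training takes a long time to converge.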

<img src="img/gradient.jpg"></img>
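The learning rate does not have to stay fixed for the whole training run. The config file shown later in this step uses an exponentially decaying learning rate, and the sketch below shows how such a schedule is commonly computed. The formula is the standard exponential decay; whether the exponent is rounded down (the `staircase` flag here) depends on the framework settings, so treat that parameter as an assumption rather than a description of what TensorFlow does with this particular config.

```python
def exponential_decay(initial_learning_rate, decay_factor, decay_steps,
                      step, staircase=False):
    """Return the learning rate to use at training step `step`."""
    if staircase:
        exponent = step // decay_steps   # drop only at whole multiples of decay_steps
    else:
        exponent = step / decay_steps    # decay smoothly on every step
    return initial_learning_rate * decay_factor ** exponent

# Values copied from the config file in this step.
for step in (0, 400360, 800720):
    rate = exponential_decay(0.004, 0.95, 800720, step)
    print(step, rate)
```

With `decay_steps` this large, the learning rate barely changes over a typical training run; a smaller `decay_steps` would make it shrink faster.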

#### Number of epochs
The number of epochs is the number of times the entire training set passes through the neural network. We should generally start with a high number of epochs to find the point at which the network stops reducing its error.

#### Batch size
Batch size is the number of training samples processed in one FeedForward and Backpropagation pass. The larger the batch size, the more GPU memory you will need.
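To make the relationship between batch size, iterations, and epochs concrete, here is a small arithmetic sketch. The training set size of 1,000 samples and the 20 epochs are hypothetical; the batch sizes of 10 and 24 are the two values mentioned in this step.

```python
import math

def steps_per_epoch(num_training_samples, batch_size):
    """Number of FeedForward + Backpropagation passes needed to see every sample once."""
    return math.ceil(num_training_samples / batch_size)

def total_steps(num_training_samples, batch_size, num_epochs):
    """Total number of training iterations across all epochs."""
    return steps_per_epoch(num_training_samples, batch_size) * num_epochs

print(steps_per_epoch(1000, 10))   # 100
print(steps_per_epoch(1000, 24))   # 42 (the last batch is only partly full)
print(total_steps(1000, 10, 20))   # 2000
```

A smaller batch size lowers memory use per iteration but increases the number of iterations needed to complete one epoch.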

#### Activation function
The function applied to a neuron's weighted input to determine whether, and how strongly, the neuron will "fire".

<img src="img/relu.png" style="width:50%"></img>
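As an illustration, ReLU and its capped variant ReLU6 (the `RELU_6` activation that appears in the config file in this step) can be sketched in a few lines of plain Python. These are simplified scalar versions for illustration; deep learning frameworks apply the same rule to whole tensors at once.

```python
def relu(x):
    """Pass positive values through unchanged; clamp negatives to zero."""
    return max(0.0, x)

def relu6(x):
    """Same as ReLU, but the output is also capped at 6."""
    return min(max(0.0, x), 6.0)

for value in (-2.0, 0.5, 8.0):
    print(value, relu(value), relu6(value))
```

A neuron with a negative weighted input stays silent (output 0), while a positive input passes through, up to the cap in the ReLU6 case.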

#### Number of hidden layers 
The number of hidden layers is a key part of the neural network's architecture. More layers can let the network learn more complex patterns, but the trade-off is that the network becomes more computationally expensive to train.

#### There are many others...

#### So now, let's store the default hyperparameter values in a configuration file which will be used as an input when we start training the network. 

For help with setting up your own configuration file, refer to the TensorFlow documentation. 

Alternatively, you may use the configuration file we used, which is available in this repo. If you use that config file, be sure to change the PATH_TO_BE_CONFIGURED placeholders so that they reference your workspace.
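If you would rather not edit the placeholders by hand, a simple text substitution will do. The sketch below is one possible approach, not part of the TensorFlow tooling; the function names, the sample line, and the `data` workspace folder are hypothetical.

```python
from pathlib import Path

def fill_in_paths(config_text, workspace):
    """Replace every PATH_TO_BE_CONFIGURED placeholder with `workspace`."""
    return config_text.replace("PATH_TO_BE_CONFIGURED", str(workspace))

def configure_file(template_path, output_path, workspace):
    """Read a template config, fill in the placeholders, and write the result."""
    text = Path(template_path).read_text()
    Path(output_path).write_text(fill_in_paths(text, workspace))

# Hypothetical line from a template config.
sample = 'input_path: "PATH_TO_BE_CONFIGURED/train.record"'
print(fill_in_paths(sample, "data"))  # input_path: "data/train.record"
```

After substituting, open the file and check that every path points at a file that actually exists in your workspace.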

You may also want to change your batch size, depending on your GPU's VRAM. The default of 24 should work on fairly modern systems with powerful graphics cards, but if you run into an out-of-memory error, try a lower batch size (the config below uses 10).

For reference, the following was my config file at the completion of this step:

```
# SSD with Mobilenet v1, configured for the cafo dataset.

model {
  ssd {
    num_classes: 1    # Number of classes in your dataset.
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 10
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: true
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "data/object-detection.pbtxt"
}

eval_config: {
  num_examples: 40
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}
```

The config file is read by the "_build_ssd_model" function in TensorFlow's "model_builder.py" module:

https://github.com/tensorflow/models/blob/a4944a57ad2811e1f6a7a87589a9fc8a776e8d3c/object_detection/builders/model_builder.py#L108

The output of this step should be a .config file stored in your workspace.

https://github.com/Qberto/ML_ObjectDetection_CAFO/blob/master/training/ssd_mobilenet_v1_cafo.config

# Label File

The label file is much simpler. In fact, here are the entire contents of the one we used:

```
item { id: 1 name: 'cafo' }
```

This one can be found at training/object-detection.pbtxt.

https://github.com/Qberto/ML_ObjectDetection_CAFO/blob/master/training/object-detection.pbtxt
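If your own project has more than one object class, the label file simply gains one `item` entry per class. The helper below is a hypothetical sketch for generating that text; the function name and the second class name (`barn`) are made up for illustration. It assumes ids start at 1, which matches the file above, and that the number of entries matches `num_classes` in your config file.

```python
def build_label_map(class_names):
    """Return label map text with one item per class, with ids starting at 1."""
    entries = []
    for class_id, name in enumerate(class_names, start=1):
        entries.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(entries)

# 'cafo' is the class used in this workflow; 'barn' is a made-up second class.
print(build_label_map(["cafo", "barn"]))
```

You could write the returned text to a .pbtxt file in your workspace and point both `label_map_path` settings in the config at it.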