
Filled in 'Dropout' section in 'Layers' (#27)
Brilliant man. looks great
pcatattacks authored and bfortuner committed Jul 6, 2018
1 parent 016f83b commit 954f64c
Showing 3 changed files with 44 additions and 1 deletion.
Binary file added docs/images/dropout.png
Binary file added docs/images/dropout_net.png
45 changes: 44 additions & 1 deletion docs/layers.rst
@@ -41,7 +41,49 @@ Be the first to `contribute! <https://github.com/bfortuner/ml-cheatsheet>`__
Dropout
-------

A dropout layer takes the output of the previous layer's activations and randomly sets a certain fraction (the dropout rate) of the activations to 0, cancelling or 'dropping' them out.

It is a common regularization technique used to prevent overfitting in neural networks.

.. image:: images/dropout_net.png
:align: center

The dropout rate is a tunable hyperparameter, adjusted by measuring performance with different values. It is typically set between 0.2 and 0.5, but may be set arbitrarily.

Dropout is only used during training. At test time, no activations are dropped; instead, the layer's outputs are scaled down by a factor equal to the dropout rate. This accounts for more units being active at test time than during training.
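
A minimal sketch of this train-versus-test behaviour (the ``dropout`` helper, its ``rate`` argument, and the ``training`` flag are illustrative assumptions, not code from the book):

.. code-block:: python

    import numpy as np

    def dropout(activations, rate=0.5, training=True):
        # activations: 2D numpy array of shape (batch_size, num_features)
        if training:
            # Keep each activation with probability (1 - rate); zero it otherwise.
            mask = (np.random.rand(*activations.shape) >= rate).astype(activations.dtype)
            return activations * mask
        # At test time nothing is dropped; outputs are scaled down instead.
        # With rate = 0.5 the factor (1 - rate) equals the dropout rate itself.
        return activations * (1.0 - rate)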

For example:

- A layer in a neural net outputs a tensor (matrix) A of shape (batch_size, num_features).
- The dropout rate of the layer is set to 0.5 (50%).
- A random 50% of the values in A will be set to 0.
- These will then be multiplied by the weight matrix to form the inputs to the next layer (see the sketch after this list).
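
A minimal runnable sketch of these four steps (the sizes and the names ``A``, ``W``, and ``next_inputs`` are made up for illustration):

.. code-block:: python

    import numpy as np

    batch_size, num_features, num_hidden = 4, 3, 2     # small, made-up sizes

    A = np.random.randn(batch_size, num_features)      # output of the previous layer
    mask = np.random.randint(0, high=2, size=A.shape)  # ~50% zeros, ~50% ones
    A *= mask                                          # roughly half the activations are now 0
    W = np.random.randn(num_features, num_hidden)      # next layer's weight matrix
    next_inputs = A @ W                                # inputs to the next layer, shape (4, 2)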

The premise behind dropout is to introduce noise into a layer in order to disrupt any non-significant interdependent learning or coincidental patterns that may occur between units in the layer.

.. rubric:: Code

.. code-block:: python

    import numpy as np

    # layer_output is a 2D numpy matrix (float) of activations
    layer_output *= np.random.randint(0, high=2, size=layer_output.shape)  # dropping out ~50% of the values

    # EITHER scale up at TRAINING time by dividing by the dropout rate,
    # so no scaling needs to be done at test time
    layer_output /= 0.5
    # OR scale down at TEST time
    layer_output *= 0.5
This results in the following operation.

.. image:: images/dropout.png
:align: center

All references, images, and code examples, unless mentioned otherwise, are from section 4.4.3 of `Deep Learning with Python <https://www.manning.com/books/deep-learning-with-python>`_ by François Chollet. [2]_

Linear
@@ -73,3 +115,4 @@ Be the first to `contribute! <https://github.com/bfortuner/ml-cheatsheet>`__
.. rubric:: References

.. [1] http://www.deeplearningbook.org/contents/convnets.html
.. [2] Chollet, François. “4.4.3: Adding Dropout.” `Deep Learning with Python <https://www.manning.com/books/deep-learning-with-python>`_, Manning Publications Co., 2018, pp. 109–110.
