
Merge pull request #136 from aigamedev/cost
Documentation Updates
alexjc committed Nov 22, 2015
2 parents 666a68e + 0b823c7 commit 8bf26e4
Showing 5 changed files with 69 additions and 6 deletions.
4 changes: 2 additions & 2 deletions README.rst
@@ -21,8 +21,8 @@ Thanks to the underlying ``Lasagne`` implementation, the code supports the follo
* **Activation Functions —** ``Sigmoid``, ``Tanh``, ``Rectifier``, ``Softmax``, ``Linear``.
* **Layer Types —** ``Convolution`` (greyscale and color, 2D), ``Dense`` (standard, 1D).
* **Learning Rules —** ``sgd``, ``momentum``, ``nesterov``, ``adadelta``, ``adagrad``, ``rmsprop``, ``adam``.
* **Regularization —** ``L1``, ``L2`` and ``dropout``.
* **Dataset Formats —** ``numpy.ndarray``, ``scipy.sparse``, ``iterators`` (via callback).
* **Regularization —** ``L1``, ``L2``, ``dropout``, and soon batch normalization.
* **Dataset Formats —** ``numpy.ndarray``, ``scipy.sparse``, and iterators (via ``callback``).

If a feature you need is missing, consider opening a `GitHub Issue <https://github.com/aigamedev/scikit-neuralnetwork/issues>`_ with a detailed explanation about the use case and we'll see what we can do.

43 changes: 41 additions & 2 deletions docs/guide_advanced.rst
@@ -6,8 +6,8 @@ The examples in this section help you get more out of ``scikit-neuralnetwork``,

.. _example-pipeline:

Pipeline
--------
sklearn Pipeline
----------------

Typically, neural networks perform better when their inputs have been normalized or standardized. Using scikit-learn's `pipeline <http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_ support is an obvious way to do this, as sketched below.
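
As a minimal sketch, assuming ``X_train`` and ``y_train`` are already loaded and that a single ``Softmax`` output layer is enough for the task at hand (the layer choice and ``n_iter`` value are illustrative assumptions), such a pipeline could look like this:

.. code:: python

    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import MinMaxScaler
    from sknn.mlp import Classifier, Layer

    pipeline = Pipeline([
        # Rescale every input feature into the [0, 1] range before training.
        ('min/max scaler', MinMaxScaler(feature_range=(0.0, 1.0))),
        # The network itself; a single Softmax output layer, for illustration only.
        ('neural network', Classifier(layers=[Layer("Softmax")], n_iter=25))])

    pipeline.fit(X_train, y_train)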

@@ -105,3 +105,42 @@ In the cases when you have large numbers of hyper-parameters that you want to tr
rs.fit(a_in, a_out)
This works for both :class:`sknn.mlp.Classifier` and :class:`sknn.mlp.Regressor`.
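
As a rough sketch of how such a search could be assembled (the parameter ranges and the double-underscore layer parameters such as ``hidden0__units`` are illustrative assumptions, ``a_in``/``a_out`` stand for your training arrays, and ``sklearn.grid_search`` is the module location in scikit-learn releases of this era; newer versions use ``sklearn.model_selection``):

.. code:: python

    from scipy import stats
    from sklearn.grid_search import RandomizedSearchCV
    from sknn.mlp import Regressor, Layer

    nn = Regressor(layers=[Layer("Rectifier", units=10), Layer("Linear")])

    rs = RandomizedSearchCV(nn, n_iter=10, param_distributions={
        'learning_rate': stats.uniform(0.001, 0.05),
        'hidden0__units': stats.randint(4, 12),
        'hidden0__type': ["Rectifier", "Sigmoid", "Tanh"]})
    rs.fit(a_in, a_out)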


Training Callbacks
------------------

You have full access to — and some control over — the internal mechanism of the training algorithm via callback functions. There are six callbacks available:

* ``on_train_start`` — Called when the main training function is entered.
* ``on_epoch_start`` — Called at the very start of each new iteration.
* ``on_batch_start`` — Called before an individual batch is processed.
* ``on_batch_finish`` — Called after that individual batch is processed.
* ``on_epoch_finish`` — Called at the very end of each iteration, once it is done.
* ``on_train_finish`` — Called just before the training function exits.

You can register for callbacks with a single function, for example:

.. code:: python

    from sknn.mlp import Regressor, Layer

    def my_callback(event, **variables):
        print(event)      # The name of the event, as shown in the list above.
        print(variables)  # Full dictionary of local variables from training loop.

    nn = Regressor(layers=[Layer("Linear")],
                   callback=my_callback)

This function will get called for each event, which may be thousands of times depending on your dataset size. An easier way to proceed would be to use specialized callbacks. For example, you can use callbacks on each epoch to mutate or jitter the data for training, or inject new data lazily as it is loaded.

.. code:: python

    def prepare_data(X, y, **other):
        # X and y are variables in the training code. Modify them here to use
        # new data for the next epoch. X_new and y_new are assumed to have been
        # prepared elsewhere, with the same shapes as X and y.
        X[:] = X_new
        y[:] = y_new

    nn = Regressor(layers=[Layer("Linear")],
                   callback={'on_epoch_start': prepare_data})

This callback will only get triggered at the start of each epoch, before any of the data in the set has been processed. You can also prepare the data separately in a thread and inject it into the training loop at the last minute.
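
A minimal sketch of that last idea, assuming a Python 3 environment and using placeholder random arrays where real loading or augmentation would happen, pairs a background producer thread with the ``on_epoch_start`` callback:

.. code:: python

    import queue
    import threading
    import numpy
    from sknn.mlp import Regressor, Layer

    # Background thread fills this queue with freshly prepared (X, y) pairs.
    prepared = queue.Queue(maxsize=1)

    def producer():
        # Stand-in for expensive loading or augmentation done off the training thread.
        # The shapes must match the arrays originally passed to fit().
        while True:
            X_new = numpy.random.uniform(size=(100, 4))
            y_new = numpy.random.uniform(size=(100, 1))
            prepared.put((X_new, y_new))

    threading.Thread(target=producer, daemon=True).start()

    def prepare_data(X, y, **other):
        # Swap the most recently prepared arrays into the training buffers.
        X_new, y_new = prepared.get()
        X[:] = X_new
        y[:] = y_new

    nn = Regressor(layers=[Layer("Linear")],
                   callback={'on_epoch_start': prepare_data})

When ``nn.fit(X, y)`` is later called, each epoch starts by pulling the most recently prepared arrays from the queue.
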
16 changes: 16 additions & 0 deletions docs/guide_beginners.rst
@@ -81,3 +81,19 @@ Working with images as inputs in 2D (as greyscale) or 3D (as RGB) images stored
nn.fit(X_train, y_train)
The neural network here is trained with eight kernels of shared weights in a ``3x3`` matrix, each outputting to its own channel. The rest of the code remains the same, but see the :class:`sknn.mlp.Layer` documentation for supported convolution layer types and parameters.
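
For reference, a network matching that description could be constructed along these lines (a sketch only; the ``Softmax`` output layer and the ``learning_rate`` and ``n_iter`` values are assumptions rather than part of the original example):

.. code:: python

    from sknn.mlp import Classifier, Convolution, Layer

    nn = Classifier(
        layers=[
            # Eight 3x3 kernels of shared weights, each producing its own output channel.
            Convolution("Rectifier", channels=8, kernel_shape=(3, 3)),
            Layer("Softmax")],
        learning_rate=0.02,
        n_iter=5)

    nn.fit(X_train, y_train)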


Per-Sample Weighting
--------------------

When training a classifier with data that has unbalanced labels, it's useful to adjust the weight of the different training samples to prevent bias. This is achieved via a feature called masking. You can specify the weights of each training sample when calling the ``fit()`` function.

.. code:: python

    import numpy

    # One weight per training sample; start from a uniform weight of 1.0.
    w_train = numpy.ones((X_train.shape[0],))
    w_train[y_train == 0] = 1.2
    w_train[y_train == 1] = 0.8

    nn.fit(X_train, y_train, w_train)

In this case, samples of class ``0`` are given weight ``1.2`` and samples of class ``1`` weight ``0.8``. This feature also works for regressors.
4 changes: 2 additions & 2 deletions docs/guide_intermediate.rst
@@ -1,5 +1,5 @@
Misc. Additions
===============
Extra Features
==============

Verbose Mode
------------
8 changes: 8 additions & 0 deletions sknn/mlp.py
@@ -274,6 +274,10 @@ def fit(self, X, y, w=None):
y : array-like, shape (n_samples, n_outputs)
Target values are real numbers used as regression targets.
w : array-like (optional), shape (n_samples)
Floating point weights for each of the training samples, used as a mask to
modify the cost function during optimization.
Returns
-------
self : object
@@ -335,6 +339,10 @@ def fit(self, X, y, w=None):
Target values as integer symbols, for either single- or multi-output
classification problems.
w : array-like (optional), shape (n_samples)
Floating point weights for each of the training samples, used as a mask to
modify the cost function during optimization.
Returns
-------
self : object
