From 582e4b57c7daa667afc4945b187054c92a98b0c8 Mon Sep 17 00:00:00 2001
From: "Alex J. Champandard"
Date: Sun, 22 Nov 2015 17:31:09 +0100
Subject: [PATCH 1/2] Documentation for the per-sample weighting or dataset masking feature.

---
 docs/guide_beginners.rst    | 16 ++++++++++++++++
 docs/guide_intermediate.rst |  4 ++--
 sknn/mlp.py                 |  8 ++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/docs/guide_beginners.rst b/docs/guide_beginners.rst
index f605632..6136930 100644
--- a/docs/guide_beginners.rst
+++ b/docs/guide_beginners.rst
@@ -81,3 +81,19 @@ Working with images as inputs in 2D (as greyscale) or 3D (as RGB) images stored
     nn.fit(X_train, y_train)
 
 The neural network here is trained with eight kernels of shared weights in a ``3x3`` matrix, each outputting to its own channel. The rest of the code remains the same, but see the :class:`sknn.mlp.Layer` documentation for supported convolution layer types and parameters.
+
+
+Per-Sample Weighting
+--------------------
+
+When training a classifier with data that has unbalanced labels, it's useful to adjust the weight of the different training samples to prevent bias. This is achieved via a feature called masking. You can specify the weight of each training sample when calling the ``fit()`` function.
+
+.. code:: python
+
+    w_train = numpy.ones((X_train.shape[0],))
+    w_train[y_train == 0] = 1.2
+    w_train[y_train == 1] = 0.8
+
+    nn.fit(X_train, y_train, w_train)
+
+In this case, samples with class ``0`` are given weight ``1.2`` and samples with class ``1`` weight ``0.8``. This feature also works for regressors.
\ No newline at end of file
diff --git a/docs/guide_intermediate.rst b/docs/guide_intermediate.rst
index 75968d7..0625dad 100644
--- a/docs/guide_intermediate.rst
+++ b/docs/guide_intermediate.rst
@@ -1,5 +1,5 @@
-Misc. Additions
-===============
+Extra Features
+==============
 
 Verbose Mode
 ------------
diff --git a/sknn/mlp.py b/sknn/mlp.py
index f632a41..6adb099 100644
--- a/sknn/mlp.py
+++ b/sknn/mlp.py
@@ -274,6 +274,10 @@ def fit(self, X, y, w=None):
         y : array-like, shape (n_samples, n_outputs)
             Target values are real numbers used as regression targets.
 
+        w : array-like (optional), shape (n_samples)
+            Floating point weights for each of the training samples, used as a mask to
+            modify the cost function during optimization.
+
         Returns
         -------
         self : object
@@ -335,6 +339,10 @@ def fit(self, X, y, w=None):
             Target values as integer symbols, for either single- or multi-output
             classification problems.
 
+        w : array-like (optional), shape (n_samples)
+            Floating point weights for each of the training samples, used as a mask to
+            modify the cost function during optimization.
+
         Returns
         -------
         self : object
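A usage sketch for the weighting feature documented in this patch follows. It assumes the ``sknn.mlp.Classifier`` API and the ``fit(X, y, w)`` signature described above; the random data, the class labels, the layer configuration, and the frequency-based weighting rule are illustrative assumptions, not part of the patch.

.. code:: python

    import numpy
    from sknn.mlp import Classifier, Layer

    # Illustrative, unbalanced binary dataset (assumption, not from the patch):
    # 200 samples with 10 features, roughly 20% of them labelled 1.
    X_train = numpy.random.uniform(size=(200, 10))
    y_train = (numpy.random.uniform(size=(200,)) < 0.2).astype(int)

    # One floating point weight per sample, as in the docstring above; here each
    # class is weighted inversely to its frequency so both contribute equally.
    w_train = numpy.ones((X_train.shape[0],))
    for label in (0, 1):
        mask = (y_train == label)
        w_train[mask] = X_train.shape[0] / (2.0 * mask.sum())

    nn = Classifier(layers=[Layer("Rectifier", units=8), Layer("Softmax")],
                    n_iter=5)
    nn.fit(X_train, y_train, w_train)

Hand-picked constants like ``1.2`` and ``0.8`` from the example above work the same way; any non-negative floats of shape ``(n_samples,)`` should be accepted according to the docstring added in ``sknn/mlp.py``.
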
From 0b823c7aafe961fe4dc9ea7c5a761ffdacf9863b Mon Sep 17 00:00:00 2001
From: "Alex J. Champandard"
Date: Sun, 22 Nov 2015 19:07:10 +0100
Subject: [PATCH 2/2] Updates to the documentation now also include callbacks.

---
 README.rst              |  4 ++--
 docs/guide_advanced.rst | 43 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/README.rst b/README.rst
index 23c5dc7..680cb50 100644
--- a/README.rst
+++ b/README.rst
@@ -21,8 +21,8 @@ Thanks to the underlying ``Lasagne`` implementation, the code supports the follo
 
 * **Activation Functions —** ``Sigmoid``, ``Tanh``, ``Rectifier``, ``Softmax``, ``Linear``.
 * **Layer Types —** ``Convolution`` (greyscale and color, 2D), ``Dense`` (standard, 1D).
 * **Learning Rules —** ``sgd``, ``momentum``, ``nesterov``, ``adadelta``, ``adagrad``, ``rmsprop``, ``adam``.
-* **Regularization —** ``L1``, ``L2`` and ``dropout``.
-* **Dataset Formats —** ``numpy.ndarray``, ``scipy.sparse``, ``iterators`` (via callback).
+* **Regularization —** ``L1``, ``L2``, ``dropout``, and soon batch normalization.
+* **Dataset Formats —** ``numpy.ndarray``, ``scipy.sparse``, and iterators (via ``callback``).
 
 If a feature you need is missing, consider opening a `GitHub Issue `_ with a detailed explanation about the use case and we'll see what we can do.
diff --git a/docs/guide_advanced.rst b/docs/guide_advanced.rst
index 4610e2d..6c0501e 100644
--- a/docs/guide_advanced.rst
+++ b/docs/guide_advanced.rst
@@ -6,8 +6,8 @@ The examples in this section help you get more out of ``scikit-neuralnetwork``,
 
 .. _example-pipeline:
 
-Pipeline
---------
+sklearn Pipeline
+----------------
 
 Typically, neural networks perform better when their inputs have been normalized or standardized. Using a scikit-learn's `pipeline `_ support is an obvious choice to do this.
@@ -105,3 +105,42 @@ In the cases when you have large numbers of hyper-parameters that you want to tr
 
     rs.fit(a_in, a_out)
 
 This works for both :class:`sknn.mlp.Classifier` and :class:`sknn.mlp.Regressor`.
+
+
+Training Callbacks
+------------------
+
+You have full access to — and some control over — the internal mechanism of the training algorithm via callback functions. There are six callbacks available:
+
+* ``on_train_start`` — Called when the main training function is entered.
+* ``on_epoch_start`` — Called the first thing when a new iteration starts.
+* ``on_batch_start`` — Called before an individual batch is processed.
+* ``on_batch_finish`` — Called after that individual batch is processed.
+* ``on_epoch_finish`` — Called the very last thing when the iteration is done.
+* ``on_train_finish`` — Called just before the training function exits.
+
+You can register for callbacks with a single function, for example:
+
+.. code:: python
+
+    def my_callback(event, **variables):
+        print(event)      # The name of the event, as shown in the list above.
+        print(variables)  # Full dictionary of local variables from training loop.
+
+    nn = Regressor(layers=[Layer("Linear")],
+                   callback=my_callback)
+
+This function will get called for each event, which may be thousands of times depending on your dataset size. An easier way to proceed would be to use specialized callbacks. For example, you can use callbacks on each epoch to mutate or jitter the data for training, or inject new data lazily as it is loaded.
+
+.. code:: python
+
+    def prepare_data(X, y, **other):
+        # X and y are variables in the training code. Modify them
+        # here to use new data for the next epoch.
+        X[:] = X_new
+        y[:] = y_new
+
+    nn = Regressor(layers=[Layer("Linear")],
+                   callback={'on_epoch_start': prepare_data})
+
+This callback will only get triggered at the start of each epoch, before any of the data in the set has been processed. You can also prepare the data separately in a thread and inject it into the training loop at the last minute.
\ No newline at end of file
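A usage sketch for the callbacks documented above follows. It assumes the ``sknn.mlp.Regressor`` API and the ``callback`` dictionary form described in this patch, where the training loop's local variables are passed to the callback as keyword arguments; the toy data, the layer sizes, and the epoch counter are illustrative assumptions, not part of the patch.

.. code:: python

    import numpy
    from sknn.mlp import Regressor, Layer

    # Illustrative regression data (assumption, not from the patch): targets are
    # the row sums of the inputs, shaped (n_samples, n_outputs).
    X_train = numpy.random.uniform(size=(100, 4))
    y_train = X_train.sum(axis=1, keepdims=True)

    # Specialized callback registered only for 'on_epoch_finish'; the training
    # loop's local variables arrive as keyword arguments and are ignored here.
    def report_progress(**variables):
        report_progress.epoch += 1
        print("Finished epoch %i." % report_progress.epoch)
    report_progress.epoch = 0

    nn = Regressor(layers=[Layer("Rectifier", units=8), Layer("Linear")],
                   n_iter=10,
                   callback={'on_epoch_finish': report_progress})
    nn.fit(X_train, y_train)

Registering via the dictionary keeps the function from firing on every batch event, which matters when a dataset produces thousands of batches per epoch.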