.. commit 2a4a0c2 by Evizero, Feb 7, 2017 (parent a4d46a0):
   "readthedocs: work on background and dev section"
   docs/developer/design.rst
Developer Documentation
=========================

.. warning::

   This section is still under development and thus in an
   unfinished state.

In this part of the documentation we will focus on some of the
internal design aspects of this library. Consequently, the
target audience of this section and its sub-sections is
primarily people interested in contributing to this package.
As such, the information provided here should be of little to
no relevance to users who simply want to use this package, and
can safely be skipped.

Abstract Superclasses
--------------------------

We have shown in previous sections that many families of loss
functions are implemented as immutable types with free
parameters. One such family is :class:`L1EpsilonInsLoss`, which
represents all the :math:`\epsilon`-insensitive loss functions
for each possible value of :math:`\epsilon`.
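
To make this concrete, the following sketch instantiates two
members of that family. It assumes that the value of
:math:`\epsilon` is passed as the sole positional constructor
argument, which may not match the actual constructor signature.

.. code-block:: julia

   # hypothetical sketch: epsilon as the only constructor argument
   loss_a = L1EpsilonInsLoss(0.1)  # epsilon = 0.1
   loss_b = L1EpsilonInsLoss(1.0)  # epsilon = 1.0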

Aside from these special families, there are a handful of more
generic families that together contain most of the implemented
loss functions. These are implemented as abstract types in the
type tree. From an end-user's perspective, these are useful for
dispatching on the particular kind of prediction problem that
they are intended for (regression vs classification). From an
implementation perspective, these abstract types allow us to
implement shared functionality and fall-back methods.

Most of the implemented losses fall under the category of
supervised losses. As such, we rarely mention other types of
losses anywhere in this documentation.

.. class:: SupervisedLoss

   Abstract subtype of :class:`Loss`.

   As mentioned in the background section, a supervised loss
   represents a function of two parameters, namely the true
   targets and the predicted outcomes. A loss is considered
   **supervised** if all the information needed to compute
   ``value(loss, features, target, output)`` is contained in
   ``target`` and ``output``, thus allowing for the
   simplification ``value(loss, target, output)``.

.. class:: DistanceLoss

   Abstract subtype of :class:`SupervisedLoss`. A supervised loss
   that can be simplified to ``value(loss, target, output)`` =
   ``value(loss, output - target)`` is considered distance-based.
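
The distance-based simplification can be sketched as follows,
using :class:`L2DistLoss` as a concrete distance-based loss
(a sketch of the relationship, not a definitive test):

.. code-block:: julia

   # the binary form reduces to the unary form on output - target
   target, output = 2.0, 3.5
   value(L2DistLoss(), target, output) == value(L2DistLoss(), output - target)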

.. class:: MarginLoss

   Abstract subtype of :class:`SupervisedLoss`. A supervised
   loss, where the targets are in :math:`\{-1, 1\}`, and which
   can be simplified to ``value(loss, target, output)`` =
   ``value(loss, target * output)``, is considered margin-based.

Shared Interface
-------------------

We can further divide the supervised losses into two useful
sub-categories: :class:`DistanceLoss` for regression and
:class:`MarginLoss` for classification.

Losses for Regression
~~~~~~~~~~~~~~~~~~~~~~

Supervised losses that can be expressed as a univariate function
of ``output - target`` are referred to as distance-based losses.
Distance-based losses are typically utilized for regression
problems. That said, there are also other losses that are useful
for regression problems that don't fall into this category, such
as the :class:`PeriodicLoss`.
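
For example, using :class:`L2DistLoss` as the concrete
distance-based loss (a usage sketch; the exact numeric results
depend on the loss definition):

.. code-block:: julia

   difference = [-1.0, 0.0, 2.0]    # output - target for each observation
   value(L2DistLoss(), difference)  # element-wise loss values
   deriv(L2DistLoss(), difference)  # element-wise derivatives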

.. function:: value(loss, difference)

   Computes the value of the loss function for each observation
   in ``difference`` individually and returns the result as an
   array of the same size as the parameter.

   :param loss: An instance of the loss we are interested in.
   :type loss: :class:`DistanceLoss`
   :param difference: The result of subtracting the true targets from
                      the predicted outputs.
   :type difference: ``AbstractArray``
   :return: The value of the loss function for the elements in
            ``difference``.
   :rtype: ``AbstractArray``

.. function:: deriv(loss, difference)

   Computes the derivative of the loss function for each
   observation in ``difference`` individually and returns the
   result as an array of the same size as the parameter.

   :param loss: An instance of the loss we are interested in.
   :type loss: :class:`DistanceLoss`
   :param difference: The result of subtracting the true targets from
                      the predicted outputs.
   :type difference: ``AbstractArray``
   :return: The derivatives of the loss function for the elements in
            ``difference``.
   :rtype: ``AbstractArray``

.. function:: value_deriv(loss, difference)

   Returns the results of :func:`value` and :func:`deriv` as a
   tuple. In some cases this function can yield better
   performance, because the losses can make use of shared
   variables when computing the values.

.. note::

   In the literature that this package is partially based on,
   the convention for the distance-based losses is ``target - output``
   (see [STEINWART2008]_ p. 38). We chose to diverge from this
   definition because it would force a difference between the
   results for the unary and the binary version of the derivative.
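
A usage sketch for the tuple form, again assuming
:class:`L2DistLoss` as the concrete loss:

.. code-block:: julia

   # value_deriv may reuse intermediate results shared by both
   # computations, which is why it can be cheaper than two calls
   vals, derivs = value_deriv(L2DistLoss(), [-1.0, 0.5])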

Losses for Classification
~~~~~~~~~~~~~~~~~~~~~~~~~~
Margin-based losses are supervised losses where the values of the
targets are restricted to be in :math:`\{1,-1\}`, and which can
be expressed as a univariate function ``output * target``.
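
A minimal sketch of evaluating a margin-based loss, using
:class:`L1HingeLoss` as the concrete loss:

.. code-block:: julia

   # agreement holds target * output for each observation
   agreement = [-0.5, 0.2, 1.5]
   value(L1HingeLoss(), agreement)  # element-wise loss values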

.. function:: value(loss, agreement)

   Computes the value of the loss function for each observation
   in ``agreement`` individually and returns the result as an
   array of the same size as the parameter.

   :param loss: An instance of the loss we are interested in.
   :type loss: :class:`MarginLoss`
   :param agreement: The result of multiplying the true targets with
                     the predicted outputs.
   :type agreement: ``AbstractArray``
   :return: The value of the loss function for the elements in
            ``agreement``.
   :rtype: ``AbstractArray``

.. function:: deriv(loss, agreement)

   Computes the derivative of the loss function for each
   observation in ``agreement`` individually and returns the
   result as an array of the same size as the parameter.

   :param loss: An instance of the loss we are interested in.
   :type loss: :class:`MarginLoss`
   :param agreement: The result of multiplying the true targets with
                     the predicted outputs.
   :type agreement: ``AbstractArray``
   :return: The derivatives of the loss function for the elements in
            ``agreement``.
   :rtype: ``AbstractArray``

.. note::

   Throughout the codebase we refer to the result of
   ``output * target`` as ``agreement``. The discussion that
   led to this convention can be found in
   `issue #9 <https://github.com/JuliaML/LossFunctions.jl/issues/9#issuecomment-190321549>`_.

Margin-based losses are usually used for binary classification.
In contrast to other formalisms, they do not natively provide
probabilities as output.

.. function:: value_deriv(loss, agreement)

   Returns the results of :func:`value` and :func:`deriv` as a
   tuple. In some cases this function can yield better
   performance, because the losses can make use of shared
   variables when computing the values.

Writing Tests
----------------