Skip to content
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
107 lines (61 sloc) 3.8 KB

Risk Models

The prediction model together with a (compatible) loss function constitutes a risk model, which can be expressed as loss(f(x; \theta), y).

In this package, we use a type SupervisedRiskModel to capture this:

abstract RiskModel

immutable SupervisedRiskModel{PM<:PredictionModel,L<:Loss} <: RiskModel

We also provide a function to construct a risk model:

.. function:: riskmodel(pm, loss)

    Construct a risk model, given the predictio model ``pm`` and a loss function ``loss``.

    Here, ``pm`` and ``loss`` need to be *compatible*, which means that the output of the prediction and the first argument of the loss function should have the same number of dimensions.

    Actually, the definition of ``riskmodel`` explicitly enforces this consistency:

    .. code-block:: julia

        riskmodel{N,M}(pm::PredictionModel{N,M}, loss::Loss{M}) =
            SupervisedRiskModel{typeof(pm), typeof(loss)}(pm,loss)


We may provide other risk model in addition to supervised risk model in future. Currently, the supervised risk models, which evaluate the risk by comparing the predictions and the desired responses, are what we focus on.

Common Methods

When a set of inputs and the corresponding outputs are given, the risk model can be considered as a function of the parameter \theta.

The package provides methods for computing the total risk and the derivative of the total risk w.r.t. the parameter.

.. function:: value(rmodel, theta, x, y)

    Compute the total risk *w.r.t.* the risk model `rmodel`, given

    - the prediction parameter ``theta``;
    - the inputs ``x``; and
    - the desired responses ``y``.

    Here, ``x`` and ``y`` can be a single sample or matrices comprised of a set of samples.


    .. code-block:: julia

        # constructs a risk model, with a linear prediction
        # and a squared loss.
        #   risk := (theta'x - y)^2 / 2
        rmodel = riskmodel(LinearPred(5), SqrLoss())

        theta = randn(5)  # parameter
        x = randn(5)      # a single input
        y = randn()       # a single output

        value(rmodel, theta, x, y)  # evaluate risk on a single sample (x, y)

        X = randn(5, 8)   # a matrix of 8 inputs
        Y = randn(8)      # corresponding outputs

        value(rmodel, theta, X, Y)  # evaluate the total risk on (X, Y)

.. function:: value_and_addgrad!(rmodel, beta, g, alpha, theta, x, y)

    Compute the total risk on ``x`` and ``y``, and its gradient *w.r.t.* the parameter ``theta``, and add it to ``g`` in the following manner:

    .. math::

        g \leftarrow \beta g + \alpha \nabla_\theta \mathrm{Risk}(x, y; \theta)

    Here, ``x`` and ``y`` can be a single sample or a set of multiple samples. The function returns both the evaluated value and ``g`` as a 2-tuple.

    .. note::

        When ``beta`` is zero, the computed gradient (or its scaled version) will be written to ``g`` without using the original data in ``g`` (in this case, ``g`` need not be initialized).

.. function:: value_and_grad(rmodel, theta, x, y)

    Compute and return the gradient of the total risk on ``x`` and ``y``, *w.r.t.* the parameter ``g``.

    This is just a thin wrapper of ``value_and_addgrad!``.

Note that the addgrad! method is provided for risk model with certain combinations of prediction models and loss functions. Below is a list of combinations that we currently support:

  • LinearPred + UnivariateLoss
  • AffinePred + UnivariateLoss
  • MvLinearPred + MultivariateLoss
  • MvAffinePred + MultivariateLoss

If you have a new prediction model that is not defined by the package, you can write your own addgrad! method, based on the description above.

You can’t perform that action at this time.