Improve usage overview docs (#76)
* Add some references

* Suppress warning related to notebook index file

* Improve equation number location

* Improve inverse problem documentation (relevant to #64)

* Add todo sections

* Add description of x-update ADMM solvers

* Added description of step size methods in optimizer docs

* Formatting fix

* Edits for consistency

* CSS improvement

* Typo fixes

* Remove _k on the regularizer

* Update overview.rst

Implement @Michael-T-McCann's suggestion.

Co-authored-by: Michael-T-McCann <michael.thompson.mccann@gmail.com>
Co-authored-by: crstngc <cristina.cgarcia@gmail.com>
3 people committed Nov 5, 2021
1 parent ff97e6d commit 384a2bf
Showing 6 changed files with 315 additions and 26 deletions.
11 changes: 11 additions & 0 deletions docs/source/_static/scico.css
@@ -75,6 +75,17 @@ dl.citation > dd {
}


/* Fix equation number location */

.math {
  text-align: left;
}

.eqno {
  float: right;
}


/* Style for autosummary API docs */

dl.field-list.simple {
3 changes: 3 additions & 0 deletions docs/source/classes.rst
@@ -1,3 +1,5 @@
.. _classes:

Important Classes
=================

@@ -6,3 +8,4 @@ Important Classes

operator
functional
optimizer
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -26,6 +26,7 @@ SCICO Documentation
:hidden:

api
examples/index.ipynb


Indices
234 changes: 234 additions & 0 deletions docs/source/optimizer.rst
@@ -0,0 +1,234 @@
.. _optimizer:

Optimization Algorithms
=======================


.. raw:: html

   <style type='text/css'>
   div.document ul blockquote {
     margin-bottom: 8px !important;
   }
   div.document li > p {
     margin-bottom: 4px !important;
   }
   div.document ul > li {
     list-style: square outside !important;
     margin-left: 1em !important;
     margin-bottom: 1em !important;
   }
   section {
     padding-bottom: 1em;
   }
   ul {
     margin-bottom: 1em;
   }
   </style>



ADMM
----

The Alternating Direction Method of Multipliers (ADMM) :cite:`glowinski-1975-approximation` :cite:`gabay-1976-dual`
is an algorithm for solving problems of the form

.. math::
   :label: eq:admm_prob

   \argmin_{\mb{x}, \mb{z}} \; f(\mb{x}) + g(\mb{z}) \; \text{such that}
   \; \acute{A} \mb{x} + \acute{B} \mb{z} = \mb{c} \;,

where :math:`f` and :math:`g` are convex (but not necessarily smooth)
functions, :math:`\acute{A}` and :math:`\acute{B}` are linear operators,
and :math:`\mb{c}` is a constant vector. (For a thorough introduction and
overview, see :cite:`boyd-2010-distributed`.)

The SCICO ADMM solver, :class:`.ADMM`, solves problems of the form

.. math::

   \argmin_{\mb{x}} \; f(\mb{x}) + \sum_{i=1}^N g_i(C_i \mb{x}) \;,

where :math:`f` and the :math:`g_i` are instances of :class:`.Functional`,
and the :math:`C_i` are instances of :class:`.LinearOperator`, by defining

.. math::

   g(\mb{z}) = \sum_{i=1}^N g_i(\mb{z}_i) \qquad \mb{z}_i = C_i \mb{x}

in :eq:`eq:admm_prob`, corresponding to defining

.. math::

   \acute{A} = \left( \begin{array}{c} C_1 \\ C_2 \\ C_3 \\
   \vdots \end{array} \right) \quad
   \acute{B} = \left( \begin{array}{cccc}
   -I & 0 & 0 & \ldots \\
   0 & -I & 0 & \ldots \\
   0 & 0 & -I & \ldots \\
   \vdots & \vdots & \vdots & \ddots
   \end{array} \right) \quad
   \mb{z} = \left( \begin{array}{c} \mb{z}_1 \\ \mb{z}_2 \\ \mb{z}_3 \\
   \vdots \end{array} \right) \quad
   \mb{c} = \left( \begin{array}{c} 0 \\ 0 \\ 0 \\
   \vdots \end{array} \right) \;.

In :class:`.ADMM`, :math:`f` is a :class:`.Functional`, typically a
:class:`.Loss` corresponding to the forward model of an imaging problem,
and the :math:`g_i` are instances of :class:`.Functional`, typically
corresponding to regularization terms or constraints. Each of the
:math:`g_i` must have a defined proximal operator. It is also possible to
set ``f = None``, which corresponds to defining :math:`f = 0`, i.e., the
zero function.
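
As an illustration, a minimal sketch of constructing an :class:`.ADMM`
solver for a hypothetical denoising problem (an identity forward model
with an anisotropic total variation regularizer) might look like the
following. The data, operators, and parameter values are placeholders,
and the exact constructor signature should be checked against the
:class:`.ADMM` API documentation:

.. code-block:: python

   import scico.numpy as snp
   from scico import functional, linop, loss
   from scico.admm import ADMM

   y = snp.zeros((256, 256))         # placeholder for measured data
   A = linop.Identity(y.shape)       # identity forward model (denoising)

   f = loss.SquaredL2Loss(y=y, A=A)            # data fidelity term f
   g_list = [0.1 * functional.L1Norm()]        # a single regularizer g_1
   C_list = [linop.FiniteDifference(y.shape)]  # with C_1 a difference operator

   solver = ADMM(f=f, g_list=g_list, C_list=C_list, rho_list=[1.0],
                 x0=y, maxiter=50)
   x_hat = solver.solve()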


Subproblem Solvers
^^^^^^^^^^^^^^^^^^

The most computationally expensive component of the ADMM iterations is typically
the :math:`\mb{x}`-update,

.. math::
   :label: eq:admm_x_step

   \argmin_{\mb{x}} \; f(\mb{x}) + \sum_i \frac{\rho_i}{2}
   \norm{\mb{z}^{(k)}_i - \mb{u}^{(k)}_i - C_i \mb{x}}_2^2 \;.

The available solvers for this problem are:

* :class:`.admm.GenericSubproblemSolver`

  This is the default subproblem solver, as it is applicable in all cases.
  However, it is only suitable for relatively small-scale problems, as it
  makes use of :func:`.solver.minimize`, which wraps
  :func:`scipy.optimize.minimize`.


* :class:`.admm.LinearSubproblemSolver`

  This subproblem solver can be used when :math:`f` takes the form
  :math:`\norm{\mb{A} \mb{x} - \mb{y}}^2_W`. It makes use of the conjugate
  gradient method, and is significantly more efficient than
  :class:`.admm.GenericSubproblemSolver` when it can be used.

* :class:`.admm.CircularConvolveSolver`

  This subproblem solver can be used when :math:`f` takes the form
  :math:`\norm{\mb{A} \mb{x} - \mb{y}}^2_W` and :math:`\mb{A}` and all of
  the :math:`C_i` are circulant (i.e., diagonalizable in a Fourier basis).


For more details of these solvers and how to specify them, see the API
reference page for :mod:`scico.admm`.
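
For instance, continuing the hypothetical sketch above (and reusing its
``f``, ``g_list``, ``C_list``, and ``y``), the conjugate gradient based
solver could be selected along these lines, since that loss has the
required :math:`\norm{\mb{A} \mb{x} - \mb{y}}^2_W` form:

.. code-block:: python

   from scico.admm import ADMM, LinearSubproblemSolver

   # f, g_list, C_list, and y as in the earlier sketch
   solver = ADMM(f=f, g_list=g_list, C_list=C_list, rho_list=[1.0],
                 x0=y, maxiter=50,
                 subproblem_solver=LinearSubproblemSolver())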



PGM
---

The Proximal Gradient Method (PGM) :cite:`daubechies-2004-iterative`
:cite:`beck-2010-gradient` and the Accelerated Proximal Gradient Method
(AcceleratedPGM) :cite:`beck-2009-fast` are algorithms for solving
problems of the form

.. math::

   \argmin_{\mb{x}} \; f(\mb{x}) + g(\mb{x}) \;,

where :math:`g` is convex and :math:`f` is smooth and convex. The
corresponding SCICO solvers are :class:`PGM` and :class:`AcceleratedPGM`,
respectively. In most cases :class:`AcceleratedPGM` is expected to provide
faster convergence. In both of these classes, :math:`f` and :math:`g` are
both of type :class:`.Functional`, where :math:`f` must be differentiable,
and :math:`g` must have a proximal operator defined.

While ADMM provides significantly more flexibility than PGM, and often
converges faster, PGM is preferred when solving the ADMM
:math:`\mb{x}`-step is very computationally expensive, such as in the case
of :math:`f(\mb{x}) = \norm{\mb{A} \mb{x} - \mb{y}}^2_W` where
:math:`\mb{A}` is large and does not have any special structure that would
allow an efficient solution of :eq:`eq:admm_x_step`.
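
A corresponding sketch for :class:`AcceleratedPGM`, again with placeholder
data and parameter values (note that :math:`g` is applied directly to
:math:`\mb{x}` here, with no operator :math:`C`, and that ``L0`` is an
initial estimate of :math:`L`):

.. code-block:: python

   import scico.numpy as snp
   from scico import functional, linop, loss
   from scico.pgm import AcceleratedPGM

   y = snp.zeros((256, 256))         # placeholder for measured data
   A = linop.Identity(y.shape)       # placeholder forward model

   f = loss.SquaredL2Loss(y=y, A=A)  # smooth, differentiable term
   g = 0.1 * functional.L1Norm()     # term with a proximal operator

   solver = AcceleratedPGM(f=f, g=g, L0=1e2, x0=y, maxiter=100)
   x_hat = solver.solve()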



Step Size Options
^^^^^^^^^^^^^^^^^

The step size (usually referred to in terms of its reciprocal, :math:`L`) for the gradient descent in :class:`PGM` can be adapted via
Barzilai-Borwein methods (also called spectral methods) and iterative
line search methods.

The available step size policy classes are:

* :class:`BBStepSize`

  This implements step size adaptation based on the Barzilai-Borwein
  method :cite:`barzilai-1988-stepsize`. The step size :math:`\alpha` is
  estimated as

  .. math::

     \mb{\Delta x} = \mb{x}_k - \mb{x}_{k-1} \; \\
     \mb{\Delta g} = \nabla f(\mb{x}_k) - \nabla f (\mb{x}_{k-1}) \; \\
     \alpha = \frac{\mb{\Delta x}^T \mb{\Delta g}}{\mb{\Delta g}^T
     \mb{\Delta g}} \;\;.

  Since the PGM solver uses the reciprocal of the step size, the value
  :math:`L = 1 / \alpha` is returned.


* :class:`AdaptiveBBStepSize`

  This implements the adaptive Barzilai-Borwein method introduced in
  :cite:`zhou-2006-adaptive`. The adaptive step size rule computes

  .. math::

     \mb{\Delta x} = \mb{x}_k - \mb{x}_{k-1} \; \\
     \mb{\Delta g} = \nabla f(\mb{x}_k) - \nabla f (\mb{x}_{k-1}) \; \\
     \alpha^{\mathrm{BB1}} = \frac{\mb{\Delta x}^T \mb{\Delta x}}
     {\mb{\Delta x}^T \mb{\Delta g}} \; \\
     \alpha^{\mathrm{BB2}} = \frac{\mb{\Delta x}^T \mb{\Delta g}}
     {\mb{\Delta g}^T \mb{\Delta g}} \;\;.

  The new step size is then selected via the rule

  .. math::

     \alpha = \left\{ \begin{matrix} \alpha^{\mathrm{BB2}} \;, &
     \mathrm{~if~} \alpha^{\mathrm{BB2}} / \alpha^{\mathrm{BB1}}
     < \kappa \; \\
     \alpha^{\mathrm{BB1}} \;, & \mathrm{~otherwise} \end{matrix}
     \right. \;\;,

  with :math:`\kappa \in (0, 1)`.

  Since the PGM solver uses the reciprocal of the step size, the value
  :math:`L = 1 / \alpha` is returned.


* :class:`LineSearchStepSize`

  This implements the line search strategy described in
  :cite:`beck-2009-fast`. This strategy estimates :math:`L` such that
  :math:`f(\mb{x}) \leq \hat{f}_{L}(\mb{x})` is satisfied, with
  :math:`\hat{f}_{L}` a quadratic approximation to :math:`f` defined as

  .. math::

     \hat{f}_{L}(\mb{x}, \mb{y}) = f(\mb{y}) + \nabla f(\mb{y})^H
     (\mb{x} - \mb{y}) + \frac{L}{2} \left\| \mb{x} - \mb{y}
     \right\|_2^2 \;\;,

  with :math:`\mb{x}` the potential new update and :math:`\mb{y}` the
  current solution or current extrapolation (if using
  :class:`AcceleratedPGM`).


* :class:`RobustLineSearchStepSize`

  This implements the robust line search strategy described in
  :cite:`florea-2017-robust`. This strategy estimates :math:`L` such that
  :math:`f(\mb{x}) \leq \hat{f}_{L}(\mb{x})` is satisfied, with
  :math:`\hat{f}_{L}` a quadratic approximation to :math:`f` defined as

  .. math::

     \hat{f}_{L}(\mb{x}, \mb{y}) = f(\mb{y}) + \nabla f(\mb{y})^H
     (\mb{x} - \mb{y}) + \frac{L}{2} \left\| \mb{x} - \mb{y}
     \right\|_2^2 \;\;,

  with :math:`\mb{x}` the potential new update and :math:`\mb{y}` the
  auxiliary extrapolation state. Note that this policy should only be used
  with :class:`AcceleratedPGM`.


For more details of these step size policies and how to specify them, see
the API reference page for :mod:`scico.pgm`.
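
As a sketch, a step size policy is selected by passing an instance of one
of these classes to the solver constructor; the ``step_size`` keyword is
assumed here, so check the :mod:`scico.pgm` reference for the exact
parameter name:

.. code-block:: python

   from scico.pgm import AcceleratedPGM, RobustLineSearchStepSize

   # f, g, and y as in the earlier PGM sketch
   solver = AcceleratedPGM(f=f, g=g, L0=1e2, x0=y, maxiter=100,
                           step_size=RobustLineSearchStepSize())
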
38 changes: 20 additions & 18 deletions docs/source/overview.rst
@@ -61,23 +61,23 @@ please cite :cite:`pfister-2021-scico`
in the source distribution).


Inverse Problems
----------------

SCICO is designed to solve inverse problems such as

.. math::

   \argmin_{\mb{x}} \; f(\mb{x}) + g(C \mb{x}) \;,

where :math:`\mb{x} \in \mathbb{R}^{n}` represents the reconstruction,
e.g., a 1D signal, 2D image, 3D volume, or 3D+time volume sequence,
:math:`C: \mathbb{R}^{n} \to \mathbb{R}^{m}`
is a regularization operator,
and :math:`f: \mathbb{R}^{n} \to \mathbb{R}`
and :math:`g: \mathbb{R}^{m} \to \mathbb{R}`
are functionals associated with the data fidelity
and regularization term, respectively.

In a typical computational imaging problem,
we have a `forward model` that describes the data acquisition process.
@@ -89,7 +89,7 @@ in some appropriate sense.
This discrepancy is measured in our data fidelity, or `loss`, function

.. math::

   f(\mb{x}) = L(A(\mb{x}), \mb{y}) \;,

where :math:`L` is a `functional` that maps a vector (or array)
into a scalar.
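
For example, with a squared :math:`\ell_2` loss, this term might be
constructed as in the following sketch, where the forward operator ``A``
and data ``y`` are assumed to be defined elsewhere, and the scaling
convention should be checked against :mod:`.loss`:

.. code-block:: python

   from scico import loss

   # f(x) = (1/2) || A(x) - y ||_2^2 (up to the module's scale convention)
   f = loss.SquaredL2Loss(y=y, A=A)
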
@@ -104,15 +104,17 @@ and the functional :math:`L` in a single object.
Libraries of functionals and losses are implemented
in :mod:`.functional` and :mod:`.loss`, respectively.

The functionals :math:`g(\cdot)` or :math:`g(C (\cdot))`
are regularization functionals, and the :math:`C` are operators,
usually linear operators.
Together, these terms encourage solutions that are "simple" in some sense.
A popular choice is to let :math:`g = \norm{ \cdot }_1`
and :math:`C` be a :class:`.FiniteDifference` linear operator,
which promotes piecewise smooth solutions
to the inverse problem.
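
In code, this popular combination might be written as the following
sketch, where ``x_shape`` is a placeholder for the reconstruction array
shape and ``0.1`` a placeholder regularization weight:

.. code-block:: python

   from scico import functional, linop

   x_shape = (256, 256)                 # placeholder shape
   g = 0.1 * functional.L1Norm()        # g = 0.1 * || . ||_1
   C = linop.FiniteDifference(x_shape)  # difference operator C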

For more detail on these classes, see :ref:`classes`.


Usage Examples
--------------