Generic diagonal scaling api (#198)

* add rho dependent box cone projection * rename vars in direct solver * starting work * fix * broken * fixed * checking in broken state * fix memory error * restore test settings * cleanup * better test logging * bump back to 0.5 * bump to 3.1.0 * tweak * switch to other resids * use u - u_t * temp * revert version update for now, make enforce cone boundaries take func pointer * revert to previous scaling logic * use enforce cone boundaries everywhere * revert test changes * cleanup * propagate changes to indirect * minor cleanup * update comments * use cone info to set R_y * clang-format -style="{BasedOnStyle: llvm, IndentWidth: 2, AllowShortFunctionsOnASingleLine: None, KeepEmptyLinesAtTheStartOfBlocks: false}" -i *.{c,h} * add comment * working on gpu code * working gpu * use TAU_FACTOR again * comment * comment * minor minunit output tweak * comment * comment * cleanup dot_r * minor rename var * minor rename var * move update_dual_vars to after update_scale * Minor comment. * this commit (almost) reproduces previous behaviour * Revert "this commit (almost) reproduces previous behaviour" This reverts commit 0014084. * fmt * add comment * aãdd lin_sys_solver field to info, update docs * docs tweaks * fix lin sys docs, add const to diag_r * add comment
cvxgrp · Jan 11, 2022 · 5a40b44 · 5a40b44
1 parent 3f5d251
commit 5a40b44
Show file tree

Hide file tree

Showing 33 changed files with 526 additions and 457 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,3 +1,4 @@
+*.gcno
 *.csv
 *.iml
 docs/*

diff --git a/docs/src/algorithm/scale.rst b/docs/src/algorithm/scale.rst
@@ -17,8 +17,8 @@ algorithm then becomes:
   w^{k+1} &= w^k + u^{k+1} - \tilde u^{k+1} \\
   \end{align}
 
-which yields :math:`w^k \rightarrow u^\star + R^{-1} \mathcal{Q}(u^\star)` where :math:`0
-\in \mathcal{Q}(u^\star) + N_{\mathcal{C}_+}(u^\star)`.
+which yields :math:`w^k \rightarrow u^\star + R^{-1} \mathcal{Q}(u^\star)` where
+:math:`0 \in \mathcal{Q}(u^\star) + N_{\mathcal{C}_+}(u^\star)`.
 
 This changes the first two steps of the procedure. The :ref:`linear projection
 <linear_solver>` and the cone projection (explained next).
@@ -77,6 +77,40 @@ The quantity :math:`\rho_x` is determined by the :ref:`setting <settings>` value
 described in :ref:`Dynamic scale updating <updating_scale>`. Finally,  :math:`d`
 is determined by :code:`TAU_FACTOR` in the code defined in :code:`glbopts.h`.
 
+Root-plus function
+------------------
+
+Finally, the :code:`root_plus` function is modified to be the solution
+of the following quadratic equation:
+
+.. math::
+  \tau^2 (d + r^\top R_{-1} r) + \tau (r^\top R_{-1} \mu^k - 2 r^\top R_{-1} p^k - d \eta^k) + p^k R_{-1} (p^k - \mu^k) = 0,
+
+where :math:`R_{-1}` corresponds to the first :math:`n+m` entries of :math:`R`.
+Other than when computing :math:`\kappa` (which does not affect the algorithm)
+this is the *only* place where :math:`d` appears, so we have a lot of
+flexibility in how to choose it and it can even change from iteration to
+iteration. It is an open question on how best to select this parameter.
+
+Moreau decomposition
+--------------------
+The projection onto a cone under a diagonal scaling also satisfies a
+Moreau-style decomposition identity, as follows:
+
+.. math::
+   x + R^{-1} \Pi_\mathcal{C^*}^{R^{-1}} ( - R x ) = \Pi_\mathcal{C}^R ( x )
+
+where :math:`\Pi_\mathcal{C}^R` denotes the projection onto convex cone
+:math:`\mathcal{C}` under the :math:`R`-norm, which is defined as
+
+.. math::
+  \|x\|_R = \sqrt{x^\top R x}.
+
+This identity is useful when deriving cone projection routines, though most
+cones are either invariant to this or we enforce that :math:`R` is constant
+across them. Note that the two components of the decomposition are
+:math:`R`-orthogonal.
+
 Dual vector
 -----------
 
@@ -91,39 +125,14 @@ and we have
 .. math::
   \begin{align}
   v^{k+1} &= R( u^{k+1} + w^k - 2 \tilde u^{k+1} ) \\
-          &= R( \Pi_{\mathcal{C}_+} (2 \tilde u^{k+1} - w^k) + w^k - 2 \tilde u^{k+1}) \\
-          &= R( \Pi_{\mathcal{C}^*_+} (-2 \tilde u^{k+1} + w^k)) \\
-          &= R \tilde v^{k+1} \\
+          &= R( \Pi^R_{\mathcal{C}_+} (2 \tilde u^{k+1} - w^k) + w^k - 2 \tilde u^{k+1}) \\
+          &= R( R^{-1} \Pi^{R^{-1}}_{\mathcal{C}^*_+} (R(w^k -2 \tilde u^{k+1}))) \\
+          &= \Pi^{R^{-1}}_{\mathcal{C}^*_+} (R(w^k -2 \tilde u^{k+1})) \\
           &\in \mathcal{C}^*_+
   \end{align}
 
-by Moreau and the fact that :math:`R` is chosen to be constant
-within each sub-cone :math:`\mathcal{K}`. Finally note that :math:`v^k \perp
-u^k`, since
-
-.. math::
-  \begin{align}
-  (u^k)^\top v^k &= (u^k)^\top R \tilde v^{k+1}  \\
-   &= \sum_{\mathcal{K} \in \mathcal{C}_+} r_\mathcal{K} (u^k_\mathcal{K})^\top \tilde v^k_\mathcal{K} \\
-   &= 0
-  \end{align}
-
-since :math:`\tilde v^k \perp u^k` and :math:`R` is chosen to be constant
-within each sub-cone :math:`\mathcal{K}`.
-
-Root-plus function
-------------------
-
-Finally, the :code:`root_plus` function is modified to be the solution
-of the following quadratic equation:
-
-.. math::
-  \tau^2 (d + r^\top R r) + \tau (r^\top R \mu^k - 2 r^\top R p^k - d \eta^k) + p^k R (p^k - \mu^k) = 0.
-
-Other than when computing :math:`\kappa` (which does not affect the algorithm)
-this is the *only* place where :math:`d` appears, so we have a lot of
-flexibility in how to choose it and it can even change from iteration to
-iteration. It is an open question on how best to select this parameter.
+by Moreau, and finally note that :math:`v^k \perp
+u^k` from the fact that the Moreau decomposition is :math:`R`-orthogonal.
 
 .. _updating_scale:
 

diff --git a/docs/src/api/c.rst b/docs/src/api/c.rst
@@ -9,7 +9,8 @@ C / C++
 Main solver API
 ---------------
 
-The main C API is imported from the header :code:`scs.h`.
+The C API is imported from the header :code:`scs.h`, available `here
+<https://github.com/cvxgrp/scs/blob/master/include/scs.h>`_.
 
 .. doxygenfunction:: scs
 

diff --git a/docs/src/api/info.rst b/docs/src/api/info.rst
@@ -19,6 +19,9 @@ the following fields.
    * - :code:`status`
      - :code:`char *`
      - Status string (e.g., 'solved')
+   * - :code:`lin_sys_solver`
+     - :code:`char *`
+     - Linear system solver used
    * - :code:`status_val`
      - :code:`scs_int`
      - Status integer :ref:`exit flag <exit_flags>`

diff --git a/docs/src/examples/c.rst b/docs/src/examples/c.rst
@@ -11,15 +11,16 @@ C code to solve this is below.
    :language: c
 
 After following the CMake :ref:`install instructions <c_install>`, we can
-compile the code using:
+compile the code (assuming the library was installed in
+:code:`/usr/local/`) using:
 
 .. code::
 
 	  gcc -I/usr/local/include/scs -L/usr/local/lib/ -lscsdir qp.c -o qp.out
 
 .. ./qp.out > qp.c.out
 
-Then running the code yields output
+Then running the binary yields output:
 
 .. literalinclude:: qp.c.out
    :language: none

diff --git a/docs/src/examples/matlab.rst b/docs/src/examples/matlab.rst
@@ -11,7 +11,7 @@ Python code to solve this is below.
    :language: matlab
 
 After following the matlab :ref:`install instructions <matlab_install>`, we can
-run the code yielding output
+run the code yielding output:
 
 .. matlab -r "run ~/git/scs/docs/src/examples/qp.m;exit"
 

diff --git a/docs/src/examples/python.rst b/docs/src/examples/python.rst
@@ -11,7 +11,7 @@ Python code to solve this is below.
    :language: python
 
 After following the python :ref:`install instructions <python_install>`, we can
-run the code yielding output
+run the code yielding output:
 
 .. python qp.py > qp.py.out
 

diff --git a/docs/src/examples/qp.c b/docs/src/examples/qp.c
@@ -14,9 +14,9 @@ int main(int argc, char **argv) {
   int A_p[3] = {0, 2, 4};
   double b[3] = {-1., 0.3, -0.5};
   double c[2] = {-1., -1.};
-  /* data shape */
-  int n = 2;
-  int m = 3;
+  /* data shapes */
+  int n = 2; /* number of variables */
+  int m = 3; /* number of constraints */
 
   /* Allocate SCS structs */
   ScsCone *k = (ScsCone *)calloc(1, sizeof(ScsCone));
@@ -25,7 +25,7 @@ int main(int argc, char **argv) {
   ScsSolution *sol = (ScsSolution *)calloc(1, sizeof(ScsSolution));
   ScsInfo *info = (ScsInfo *)calloc(1, sizeof(ScsInfo));
 
-  /* Fill in structs */
+  /* Fill in data struct */
   d->m = m;
   d->n = n;
   d->b = b;
@@ -36,7 +36,7 @@ int main(int argc, char **argv) {
   /* Cone */
   k->l = m;
 
-  /* Utility to set some default settings */
+  /* Utility to set default settings */
   scs_set_default_settings(stgs);
 
   /* Modify tolerances */
@@ -49,8 +49,9 @@ int main(int argc, char **argv) {
   /* Verify that SCS solved the problem */
   printf("SCS solved successfully: %i\n", exitflag == SCS_SOLVED);
 
-  /* Print iterations taken */
-  printf("SCS took %i iters\n", info->iter);
+  /* Print some info about the solve */
+  printf("SCS took %i iters, using the %s linear solver.\n", info->iter,
+         info->lin_sys_solver);
 
   /* Print solution x */
   printf("Optimal solution vector x*:\n");

diff --git a/docs/src/examples/qp.c.out b/docs/src/examples/qp.c.out
@@ -13,21 +13,21 @@ lin-sys:  sparse-direct
 ------------------------------------------------------------------
  iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
 ------------------------------------------------------------------
-     0| 1.78e+00  1.00e+00  1.42e+00 -1.04e+00  1.00e-01  2.14e-05 
-   175| 3.34e-11  4.17e-10  4.07e-10  1.24e+00  2.08e+01  1.25e-04 
+     0| 1.78e+00  1.00e+00  1.42e+00 -1.04e+00  1.00e-01  2.09e-05 
+   175| 4.30e-11  5.94e-10  5.70e-10  1.24e+00  2.08e+01  1.27e-04 
 ------------------------------------------------------------------
 status:  solved
-timings: total: 2.01e-04s = setup: 7.52e-05s + solve: 1.26e-04s
-	 lin-sys: 2.56e-05s, cones: 1.43e-05s, accel: 2.20e-05s
+timings: total: 2.11e-04s = setup: 8.29e-05s + solve: 1.28e-04s
+	 lin-sys: 2.44e-05s, cones: 1.23e-05s, accel: 2.31e-05s
 ------------------------------------------------------------------
 objective = 1.235000
 ------------------------------------------------------------------
 SCS solved successfully: 1
-SCS took 175 iters
+SCS took 175 iters, using the sparse-direct linear solver.
 Optimal solution vector x*:
 x[0] = 0.300000
 x[1] = -0.700000
 Optimal dual vector y*:
 y[0] = 2.700000
 y[1] = 2.100000
-y[2] = 0.000000
+y[2] = -0.000000
diff --git a/docs/src/examples/qp.prob b/docs/src/examples/qp.prob
@@ -3,7 +3,7 @@ In this example we shall solve the following small quadratic program:
 
 .. math::
     \begin{array}{ll}
-    \mbox{minimize} & \frac{1}{2} x^T \begin{bmatrix}3 & -1\\ -1 & 2 \end{bmatrix}
+    \mbox{minimize} & (1/2) x^T \begin{bmatrix}3 & -1\\ -1 & 2 \end{bmatrix}
     x + \begin{bmatrix}-1 \\ -1\end{bmatrix}^T x \\
     \mbox{subject to} & \begin{bmatrix} -1 & 1\\ 1 & 0\\ 0 & 1\end{bmatrix} x \leq \begin{bmatrix}-1 \\ 0.3 \\ -0.5\end{bmatrix}
     \end{array}
@@ -12,11 +12,11 @@ over variable :math:`x \in \mathbf{R}^2`. This problem corresponds to data:
 
 .. math::
     \begin{array}{cccc}
-    P = \begin{bmatrix}3 & -1\\ -1 & 2 \end{bmatrix}, & 
-    A = \begin{bmatrix}-1 & 1\\ 1 & 0\\ 0 & 1\end{bmatrix}, &  
+    P = \begin{bmatrix}3 & -1\\ -1 & 2 \end{bmatrix}, &
+    A = \begin{bmatrix}-1 & 1\\ 1 & 0\\ 0 & 1\end{bmatrix}, &
     b = \begin{bmatrix}-1 \\ 0.3 \\ -0.5\end{bmatrix}, &
-    c = \begin{bmatrix}-1 \\ -1\end{bmatrix} 
+    c = \begin{bmatrix}-1 \\ -1\end{bmatrix}.
     \end{array}
 
-The cone :math:`\mathcal{K}` is simply the positive orthant :code:`l` of
+And the cone :math:`\mathcal{K}` is the positive orthant :code:`l` of
 dimension 3.
diff --git a/docs/src/index.rst b/docs/src/index.rst
@@ -41,13 +41,13 @@ over variables
    :header-rows: 0
 
    * - :math:`x \in \mathbf{R}^n`
-     - primal variable 
-
-   * - :math:`s \in \mathbf{R}^m`
-     - slack variable 
+     - primal variable
 
    * - :math:`y \in \mathbf{R}^m`
-     - dual variable 
+     - dual variable
+
+   * - :math:`s \in \mathbf{R}^m`
+     - slack variable
 
 with data
 
@@ -56,13 +56,13 @@ with data
    :header-rows: 0
 
    * - :math:`A \in \mathbf{R}^{m \times n}`
-     - sparse data matrix
+     - sparse data matrix, see :ref:`matrices`
    * - :math:`P \in \mathbf{S}_+^{n}`
-     - sparse **symmetric positive semidefinite** matrix
+     - sparse, symmetric positive semidefinite matrix
    * - :math:`c \in \mathbf{R}^n`
-     - dense primal cost vector 
+     - dense primal cost vector
    * - :math:`b \in \mathbf{R}^m`
-     - dense dual cost vector 
+     - dense dual cost vector
    * - :math:`\mathcal{K} \subseteq \mathbf{R}^m`
      - nonempty, closed, convex cone, see :ref:`cones`
    * - :math:`\mathcal{K}^* \subseteq \mathbf{R}^m`

diff --git a/docs/src/install/c.rst b/docs/src/install/c.rst
@@ -54,9 +54,9 @@ project
   make
 
 The CMake build-system exports two CMake targets called :code:`scs::scsdir` and
-:code:`scs::scsindir` as well as a header file `scs.h` that defines the API. The
-libraries can be imported using the find_package CMake command and used by
-calling target_link_libraries as in the following example:
+:code:`scs::scsindir` as well as a header file :code:`scs.h` that defines the
+API. The libraries can be imported using the find_package CMake command and used
+by calling target_link_libraries as in the following example:
 
 .. code:: bash
 
@@ -73,7 +73,7 @@ calling target_link_libraries as in the following example:
 
 Makefile
 ^^^^^^^^
-Alternatively you can use the Makefile and manage the libraries yourself 
+Alternatively you can use the Makefile and manage the libraries yourself
 
 .. code:: bash
 
@@ -92,7 +92,7 @@ To compile and run the tests execute
 If make completes successfully, it will produce two static library files,
 :code:`libscsdir.a`, :code:`libscsindir.a`, and two dynamic library files
 :code:`libscsdir.ext`, :code:`libscsindir.ext` (where :code:`.ext` extension is
-platform dependent) in the :code:`out` folder.  
+platform dependent) in the :code:`out` folder.
 
 If you have a GPU and have CUDA installed, you can also execute make gpu to
 compile SCS to run on the GPU which will create additional libraries and demo
@@ -108,5 +108,5 @@ binaries in the out folder corresponding to the GPU version.  Note that the GPU
 To use the libraries in your own source code, compile your code with the linker
 option :code:`-L(PATH_TO_SCS_LIBS)` and :code:`-lscsdir` or :code:`-lscsindir`
 (as needed). The API and required data structures are defined in the file
-:code:`include/scs.h`. The four main API functions are:
+:code:`include/scs.h`.
 
diff --git a/docs/src/linear_solver/index.rst b/docs/src/linear_solver/index.rst
@@ -7,8 +7,8 @@ At each iteration :math:`k` SCS solves the following set of linear equations:
 
 .. math::
   \begin{bmatrix}
-  \rho_x I + P  &  A^\top \\
-  A &  -\mathrm{diag}(\rho_y)   \\
+  R_x + P  &  A^\top \\
+  A &  -R_y   \\
   \end{bmatrix}
   \begin{bmatrix}
   x \\
@@ -20,14 +20,14 @@ At each iteration :math:`k` SCS solves the following set of linear equations:
   z^k_y
   \end{bmatrix}
 
-for a particular right-hand side :math:`z^k \in \mathbf{R}^{n+m}`
-(the presence of the diagonal scaling :math:`\rho` terms is explained in
-:ref:`scaling`). Note that the matrix does not change from iteration to
-iteration, which is a major advantage of this approach. Moreover, this system
-can be solved approximately so long as the errors satisfy a summability
-condition, which permits the use of approximate solvers that can scale to
-very large problems.
-
+for a particular right-hand side :math:`z^k \in \mathbf{R}^{n+m}`. The presence
+of the diagonal scaling :math:`R \in \mathbf{R}^{(n+m) \times (n+m)}` matrix is
+explained in :ref:`scaling`. The :math:`R_y` term is negated to make the matrix
+quasi-definite; we can recover the solution to the original problem from this
+modified one. Note that the matrix does not change from iteration to iteration,
+which is a major advantage of this approach. Moreover, this system can be solved
+approximately so long as the errors satisfy a summability condition, which
+permits the use of approximate solvers that can scale to very large problems.
 
 
 Available linear solvers
@@ -62,8 +62,8 @@ The indirect method solves the above linear system approximately with a
 .. math::
 
   \begin{align}
-  (\rho_x I + P + A^\top \mathrm{diag}(\rho_y)^{-1} A) r_x & = q_x - A^\top \mathrm{diag}(\rho_y)^{-1} q_y \\
-                            r_y & = \mathrm{diag}(\rho_y)^{-1}(A z_x + q_y).
+  (R_x + P + A^\top R_y^{-1} A) x & = z^k_x + A^\top R_y^{-1} z^k_y \\
+                            y & = R_y^{-1}(A x - z^k_y).
   \end{align}
 
 then solves the positive definite system using using `conjugate gradients