We want to solve a 2-d problem with Gaussian increments over time and Laplacian increments over trials, with loss function (negative log-likelihood):
\begin{equation}
L(x,z;\Delta N) = \sum\limits_{k=1}^{K} \sum\limits_{r=1}^{R} \left[ -\Delta N_{k,r} \, (x_k + z_r) + \log(1+e^{x_k + z_r}) \right] + \frac{1}{2\sigma} \sum\limits_{k=2}^{K} (x_k - x_{k-1})^2 + \lambda \sum\limits_{r=2}^{R} |z_r - z_{r-1}|
\end{equation}

The MAP estimation problem for this model solves
\begin{equation}
\min_{x,z} \sum\limits_{k=1}^{K} \sum\limits_{r=1}^{R} \left[ -\Delta N_{k,r} \, (x_k + z_r) + \log(1+e^{x_k + z_r}) \right] + \frac{1}{2\sigma} \sum\limits_{k=2}^{K} (x_k - x_{k-1})^2 + \lambda \sum\limits_{r=2}^{R} |z_r - z_{r-1}|
\end{equation}
where $x = (x_1,\ldots, x_K)'$ and $z = (z_1,\ldots, z_R)'$.

We propose to solve this optimization problem with an iterative algorithm: starting from an initial guess $x^{(0)}, z^{(0)}$, we approach the solution of the above problem by iteratively solving

\begin{equation}
x^{(l+1)} = \arg \min _x Q(x; x^{(l)}, z^{(l)})
\end{equation}

\begin{align}
Q(x; x^{(l)}, z^{(l)}) &= \sum\limits_{k=1}^{K} \sum\limits_{r=1}^{R} \left[ -\Delta N_{k,r} \, (x_k + z_r^{(l)}) + \log(1+e^{x_k^{(l)} + z_r^{(l)}}) + (1 + e^{-(x^{(l)}_k + z^{(l)}_r)})^{-1} (x_k - x_k^{(l)}) + \frac{1}{8} (x_k - x_k^{(l)})^2 \right] \\
&+ \frac{1}{2\sigma} \sum\limits_{k=2}^{K} (x_k - x_{k-1})^2 + \lambda \sum\limits_{r=2}^{R} |z^{(l)}_r - z^{(l)}_{r-1}|\\
&= \sum\limits_{k=1}^{K} \sum\limits_{r=1}^{R} \left[ -\Delta N_{k,r} \, x_k + \frac{x_k - x_k^{(l)}}{1 + e^{-(x_k^{(l)} + z_r^{(l)})}} + \frac{(x_k - x_k^{(l)})^2}{8} \right] + \frac{1}{2\sigma} \sum\limits_{k=2}^{K} (x_k - x_{k-1})^2 + \mathrm{const}.
\end{align}

and
\begin{equation}
z^{(l+1)} = \arg \min _z Q(z; x^{(l)}, z^{(l)})
\end{equation}


\begin{equation}
Q(z; x^{(l)}, z^{(l)}) = \sum\limits_{k=1}^{K} \sum\limits_{r=1}^{R} \left[ -\Delta N_{k,r} z_r + \frac{z_r - z^{(l)}_r}{1 + e^{-(x^{(l)}_k + z^{(l)}_r)}} + \frac{(z_r - z^{(l)}_r)^2}{8} \right] + \frac{\lambda}{2} \sum\limits_{r=2}^{R} \frac{(z_r - z_{r-1})^2}{|z^{(l)}_r - z^{(l)}_{r-1}|} + \mathrm{const}
\end{equation}

Heuristically, this iterative procedure rests on two approximations. First, the absolute value is replaced by a reweighted quadratic using
\begin{equation}
|z_r - z_{r-1}| \le \frac{(z_r - z_{r-1})^2}{2|z^{(l)}_r - z^{(l)}_{r-1}|} + \frac{|z^{(l)}_r - z^{(l)}_{r-1}|}{2},
\end{equation}
which holds with equality at $z = z^{(l)}$ whenever $z^{(l)}_r \neq z^{(l)}_{r-1}$; the last term does not depend on $z$ and is absorbed into the constant. Second, $\log(1+e^{x_k + z_r})$ is replaced by a second-order Taylor expansion at $x_k^{(l)}$ and $z_r^{(l)}$. Since we want the remainder of the Taylor expansion to be nonpositive, so that the quadratic surrogate upper-bounds the original term, we choose a conservative quadratic approximation based on
\begin{equation}
\frac{d^2}{dx^2}(\log(1 + e^x)) = \frac{1}{e^x + 2 + e^{-x}} \leq \frac{1}{4}.
\end{equation}
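
As a quick check of the constant in the quadratic terms of $Q$, the following sketch applies Taylor's theorem with the Lagrange remainder. The shorthand $u = x_k + z_r$, $u_0 = x_k^{(l)} + z_r^{(l)}$ and the intermediate point $\xi$ are introduced here for illustration only. For some $\xi$ between $u_0$ and $u$,
\begin{align}
\log(1+e^{u}) &= \log(1+e^{u_0}) + \frac{u - u_0}{1 + e^{-u_0}} + \frac{1}{2}\,\frac{(u-u_0)^2}{e^{\xi} + 2 + e^{-\xi}} \\
&\le \log(1+e^{u_0}) + \frac{u - u_0}{1 + e^{-u_0}} + \frac{(u-u_0)^2}{8},
\end{align}
because $e^{\xi} + e^{-\xi} \ge 2$. With $z_r$ held at $z_r^{(l)}$ this gives the $\frac{1}{8}(x_k - x_k^{(l)})^2$ term in $Q(x; x^{(l)}, z^{(l)})$, and symmetrically for $z$.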

Each iteration of this algorithm alternately solves for the MAP estimate in a one-dimensional linear Gaussian state-space model (LGSSM), first for $x$ and then for $z$.
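
To make the LGSSM connection concrete, the following sketch completes the square in $Q(x; x^{(l)}, z^{(l)})$. The symbols $p_{k,r}$ and $y_k$ are shorthand introduced here and are not part of the notation above; the state-space reading also assumes a diffuse prior on $x_1$. Define
\begin{equation}
p_{k,r} = \frac{1}{1 + e^{-(x_k^{(l)} + z_r^{(l)})}}, \qquad y_k = x_k^{(l)} + \frac{4}{R} \sum\limits_{r=1}^{R} \left( \Delta N_{k,r} - p_{k,r} \right).
\end{equation}
Then, collecting the terms in $x_k$,
\begin{equation}
Q(x; x^{(l)}, z^{(l)}) = \frac{R}{8} \sum\limits_{k=1}^{K} (x_k - y_k)^2 + \frac{1}{2\sigma} \sum\limits_{k=2}^{K} (x_k - x_{k-1})^2 + \mathrm{const}.
\end{equation}
Under these assumptions, $x^{(l+1)}$ is the smoothed state of a random walk with increment variance $\sigma$, observed through pseudo-observations $y_k$ with noise variance $4/R$, which is the kind of problem a Kalman smoother solves exactly.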

We now show that this sequence of iteratively reweighted Kalman smoothers converges to the minimizer $(x^*, z^*)$ of $L(x,z;\Delta N)$.

Following [Lange, 1993], we can rewrite $L(x,z;\Delta N)$ and $Q(z; x^{(l)}, z^{(l)})$ as
\begin{equation}
L(x,z;\Delta N) = \sum\limits_{k=1}^{K} \sum\limits_{r=1}^{R} \left[ -\Delta N_{k,r} \, (x_k + z_r) + \log(1+e^{x_k + z_r}) \right] + \frac{1}{2\sigma} \sum\limits_{k=2}^{K} (x_k - x_{k-1})^2 + \frac{1}{2} k(\delta^2(z_r))
\end{equation}

and

\begin{equation}
Q(z; x^{(l)}, z^{(l)}) = \sum\limits_{k=1}^{K} \sum\limits_{r=1}^{R} \left[ -\Delta N_{k,r} z_r + \frac{z_r - z^{(l)}_r}{1 + e^{-(x^{(l)}_k + z^{(l)}_r)}} + \frac{(z_r - z^{(l)}_r)^2}{8} \right] + \frac{1}{2} k'(\delta^2(z^{(l)}_r)) z_r' \Sigma_z^{-1} z_r + \mathrm{const}
\end{equation}
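
One reading of this notation that reproduces the Laplacian penalty is sketched below. It is stated as an assumption, not as the definition used in [Lange, 1993]: take $\delta^2(z_r) = (z_r - z_{r-1})^2$ and $k(s) = 2\lambda\sqrt{s}$, with the penalty summed over $r = 2, \ldots, R$. Then $k'(s) = \lambda / \sqrt{s}$ and
\begin{equation}
\frac{1}{2} k(\delta^2(z_r)) = \lambda |z_r - z_{r-1}|, \qquad \frac{1}{2} k'(\delta^2(z^{(l)}_r)) \, \delta^2(z_r) = \frac{\lambda}{2} \, \frac{(z_r - z_{r-1})^2}{|z^{(l)}_r - z^{(l)}_{r-1}|}.
\end{equation}
Because $k$ is concave, $k(s) \le k(s^{(l)}) + k'(s^{(l)})(s - s^{(l)})$ with equality at $s = s^{(l)}$, so under this reading the reweighted quadratic upper-bounds the absolute-value penalty up to an additive constant and touches it at $z^{(l)}$.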

Since $Q(z; x^{(l)}, z^{(l)})$ is quadratic, its minimizer $z^{(l+1)}$ can be computed in a finite number of steps. We want to show that (1) $(z^{(l)})_{l=1}^{\infty}$ is bounded, (2) the limit $\bar z = \lim_{l \rightarrow \infty} z^{(l)}$ exists, and (3) $\bar z$ is also the minimizer of $L(x^{(l)},z;\Delta N)$:

(1) From [Lange, 1993], $\frac{1}{2} k'(\delta^2(z^{(l)}_r)) z_r' \Sigma_z^{-1} z_r - \frac{1}{2} k(\delta^2(z_r))$ attains its minimum at $z^{(l)}$. Therefore $-\frac{1}{2} k'(\delta^2(z^{(l)}_r)) z_r' \Sigma_z^{-1} z_r + \frac{1}{2} k(\delta^2(z_r))$ attains its maximum at $z^{(l)}$. This implies:

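\begin{align}
L(x^{(l)},z^{(l)};\Delta N) &= Q(z^{(l)}; x^{(l)}, z^{(l)}) + \left[ L(x^{(l)},z^{(l)};\Delta N) - Q(z^{(l)}; x^{(l)}, z^{(l)}) \right]\\
&\ge Q(z^{(l)}; x^{(l)}, z^{(l)}) + \left[ L(x^{(l)},z^{(l+1)};\Delta N) - Q(z^{(l+1)}; x^{(l)}, z^{(l)}) \right]\\
&\ge Q(z^{(l+1)}; x^{(l)}, z^{(l)}) + \left[ L(x^{(l)},z^{(l+1)};\Delta N) - Q(z^{(l+1)}; x^{(l)}, z^{(l)}) \right]\\
&= L(x^{(l)},z^{(l+1)};\Delta N)
\end{align}
The first inequality holds because $L - Q$ is maximized at $z^{(l)}$, and the second because $z^{(l+1)}$ minimizes $Q(\,\cdot\,; x^{(l)}, z^{(l)})$. Hence, with $x$ held fixed at $x^{(l)}$, the loss does not increase from one $z$-iterate to the next.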

Using an argument similar to that in ``Robust Estimation of State-Space Models by Iterative $l_2$ Approximations'', we can show that $L(x^{(l)},z;\Delta N)$ is coercive in $z$. Together with the descent property above, this implies that $(z^{(l)})_{l=1}^{\infty}$ is bounded and therefore has a convergent subsequence.

The optimality condition for the minimizer $z^{(l+1)}$ of $Q(z; x^{(l)}, z^{(l)})$ is

\begin{equation}
\nabla_z Q(z; x^{(l)}, z^{(l)}) |_{z = z^{(l+1)}} = 0
\end{equation}

Written componentwise for $r = 1, \ldots, R$, this is equivalent to
\begin{equation}
\sum\limits_{k=1}^{K} \left[ -\Delta N_{k,r} + \frac{e^{x^{(l)}_k + z^{(l)}_r}}{1+e^{x^{(l)}_k + z^{(l)}_r}} + \frac{1}{4}(z_r -z_r^{(l)})\right]  + k'(\delta^2(z^{(l)}_r))\Sigma_z^{-1} z_r |_{z = z^{(l+1)}} = 0
\end{equation}

Taking limits $z^{(l)}, z^{(l+1)} \rightarrow \bar z$ and invoking continuity, we obtain
\begin{equation}
\sum\limits_{k=1}^{K} \left[ -\Delta N_{k,r} + \frac{e^{x^{(l)}_k + z_r}}{1+e^{x^{(l)}_k + z_r}} \right]  + k'(\delta^2(z_r))\Sigma_z^{-1} z_r |_{z = \bar z} = 0
\end{equation}

On the other hand, every minimizer $z^*$ of $L(x^{(l)},z;\Delta N)$ satisfies the first-order necessary conditions

\begin{equation}
\sum\limits_{k=1}^{K} \left[ -\Delta N_{k,r} + \frac{e^{x^{(l)}_k + z_r}}{1+e^{x^{(l)}_k + z_r}} \right]  + k'(\delta^2(z_r))\Sigma_z^{-1} z_r |_{z = z^*} = 0
\end{equation}

Using an argument similar to that in ``Robust Estimation of State-Space Models by Iterative $l_2$ Approximations'', we can show that $L(x^{(l)},z;\Delta N)$ is strictly convex in $z$. Therefore the first-order conditions are also sufficient, there exists a unique minimizer $z^*$ of $L(x^{(l)},z;\Delta N)$, and $\bar z = z^*$.

Note: because the penalty $|z_r - z_{r-1}|$ is not differentiable at zero, subgradients might be needed in place of gradients in the above proof.

The same result for $Q(x; x^{(l)}, z^{(l)})$ can be proved with a similar argument.

In order to prove convergence of block coordinate descent based on Theorem 5.1 from [Tseng, 2001], we verify Assumptions B1-B3 and Assumption C1. 