# Lecture 13

We compare several methods for computing the equilibrium in a TU-logit model.

Take a toy example where $x$ is the age of the man and $y$ is the age of the woman. There are
25 age categories in the studied population, from 16 to 40, and the
surplus is
\begin{align*}
\Phi\left(  x,y\right)  =-\left\vert x-y\right\vert
\end{align*}
(rescaled by a factor $1/20$ in the code below), and the heterogeneities are i.i.d. Gumbel.

The masses $n_{x}$ and $m_{y}$ are taken from Choo and Siow's ACS data.

In [10]:
library(nloptr)
library(nleqslv)
library(microbenchmark)
library(Matrix)
library(gurobi)

thepath = getwd()
load(paste0(thepath,"/ChooSiowData/nSinglesv4.RData"), verbose = FALSE)
load(paste0(thepath,"/ChooSiowData/nMarrv4.RData"), verbose = FALSE)
load(paste0(thepath,"/ChooSiowData/nAvailv4.RData"), verbose = FALSE)

nbCateg = 25 # keep only the 16-40 yo population
nSingles = nSingles70n
marr = marr70nN
nAvail = avail70n

muhatx0 = nSingles[1:nbCateg,1]
muhat0y = nSingles[1:nbCateg,2]
muhatxy = marr[[1]][1:nbCateg,1:nbCateg]
then = c(nAvail[[1]][1:nbCateg,1])
them = c(nAvail[[1]][1:nbCateg,2])
nbIndiv = sum(then)+sum(them)
then = then / nbIndiv
them = them / nbIndiv

nbX = length(then)
nbY = length(them)

Xs = (1:nbCateg)+15
Ys = (1:nbCateg)+15

thephi =   - abs (matrix(Xs,nbX,nbY) - matrix(Ys,nbX,nbY,byrow=T)) /20

In this course, we have discussed a number of techniques to compute the equilibrium:
* Gradient descent

* Newton descent

* Jacobi iteration (IPFP)

* Linear programming (after discretization of heterogeneity)

We will now design them, implement them and benchmark them.



## Equilibrium as an optimization problem

Recall the dual formulation
\begin{equation}
\min_{\left(  U_{xy}\right)  }\left\{  G\left(  U\right)  +H\left(
\Phi-U\right)  \right\}  \label{EdgeOptim}%
\end{equation}
where
\begin{align*}
G\left(  U\right)   &  =\sum_{x\in\mathcal{X}}n_{x}\log\left(  1+\sum
_{y\in\mathcal{Y}}\exp\left(  U_{xy}\right)  \right) \\
H\left(  V\right)   &  =\sum_{y\in\mathcal{Y}}m_{y}\log\left(  1+\sum
_{x\in\mathcal{X}}\exp\left(  V_{xy}\right)  \right)
\end{align*}

This is a convex optimization problem in dimension $\left\vert
\mathcal{X}\right\vert \times\left\vert \mathcal{Y}\right\vert $.

## Gradient descent: generalities

To solve the optimization problem
\begin{align*}
\min W\left(  U\right)
\end{align*}
where $W$ is convex, gradient descent consists in iterating
\begin{align*}
U_{t+1}=U_{t}-\epsilon_{t}\nabla W\left(  U_{t}\right)
\end{align*}
for $\epsilon_{t}>0$ small enough.

Intuition: with enough smoothness
\begin{align*}
W\left(  U_{t+1}\right)   &  =W\left(  U_{t}\right)  +\left\langle \nabla
W\left(  U_{t}\right)  ,U_{t+1}-U_{t}\right\rangle +O\left(  \left\Vert
U_{t+1}-U_{t}\right\Vert ^{2}\right) \\
&  =W\left(  U_{t}\right)  -\epsilon_{t}\left\Vert \nabla W\left(
U_{t}\right)  \right\Vert ^{2}+O\left(  \epsilon_{t}^{2}\right)  .
\end{align*}


If $W\left(  U\right)  =G\left(  U\right)  +H\left(  \Phi-U\right)  $,
then
\begin{align*}
\nabla W\left(  U\right)  =\nabla G\left(  U\right)  -\nabla H\left(
\Phi-U\right)
\end{align*}
which can be interpreted as the *market imbalance* (supply minus demand for $xy$ matches).

Here, with logit heterogeneities,
\begin{align*}
\frac{\partial W\left(  U\right)  }{\partial U_{xy}}=\frac{n_{x}\exp\left(
U_{xy}\right)  }{1+\sum_{y^{\prime}\in\mathcal{Y}}\exp\left(  U_{xy^{\prime}}\right)  }%
-\frac{m_{y}\exp\left(  \Phi_{xy}-U_{xy}\right)  }{1+\sum_{x^{\prime}\in\mathcal{X}%
}\exp\left(  \Phi_{x^{\prime}y}-U_{x^{\prime}y}\right)  }.
\end{align*}
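As a rough illustration of the update rule, the following sketch runs plain gradient descent with a fixed step on a small 3-by-3 market, using only base R. The masses `n_toy` and `m_toy`, the surplus `Phi_toy`, the step `eps` and the stopping tolerance are all illustrative assumptions (they are not the Choo and Siow data), and the fixed step is only meant to suit this toy problem.

```r
# Toy market: hypothetical masses and surplus, not the Choo-Siow data
n_toy   = c(0.20, 0.30, 0.10)              # masses of men by type
m_toy   = c(0.15, 0.25, 0.20)              # masses of women by type
Phi_toy = -abs(outer(1:3, 1:3, "-"))       # surplus -|x-y|

# Market imbalance: supply minus demand for each xy match
imbalance = function(U, Phi, n, m) {
  V = Phi - U
  supply = exp(U) * (n / (1 + rowSums(exp(U))))          # dG/dU_xy
  demand = t(t(exp(V)) * (m / (1 + colSums(exp(V)))))    # dH/dV_xy
  supply - demand
}

U   = Phi_toy / 2        # starting point
eps = 2                  # fixed step: an assumption suited to this toy problem
for (iter in 1:50000) {
  g = imbalance(U, Phi_toy, n_toy, m_toy)
  if (max(abs(g)) < 1e-10) break           # stop once supply matches demand
  U = U - eps * g                          # move against the imbalance
}

# Matches and singles implied by the final U
mu   = exp(U) * (n_toy / (1 + rowSums(exp(U))))
mux0 = n_toy - rowSums(mu)
mu0y = m_toy - colSums(mu)
print(max(abs(imbalance(U, Phi_toy, n_toy, m_toy))))
```

At the solution, `mu` should agree with $\sqrt{\mu_{x0}\mu_{0y}}\exp\left(\Phi_{xy}/2\right)$ up to numerical error, which gives a simple sanity check.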

In [11]:
edgeGradient = function(Phi,n,m, xtol_rel = 1e-8 ,ftol_rel=1e-15)
{
  nbX = length(n)
  nbY = length(m)
  eval_f <- function(theU)
  {
    theU = matrix(theU,nbX,nbY)
    theV = Phi-theU
    denomG = 1 + apply(exp(theU),1,sum)
    denomH = 1 + apply(exp(theV),2,sum)
    valG = sum(n * log(denomG))
    valH = sum(m * log(denomH))
    gradG = exp( theU ) * (n / denomG)
    gradH =  t( t(exp(theV)) * (m / denomH) ) 
    #
    ret = list(objective = valG + valH,
               gradient = c(gradG - gradH ))
    #
    return(ret)
  }
  U_init = Phi / 2
  
  resopt = nloptr(x0 = U_init, eval_f = eval_f,
                  opt = list("algorithm" = "NLOPT_LD_LBFGS",
                             "xtol_rel"= xtol_rel,
                             "ftol_rel"= ftol_rel))
  Usol = matrix(resopt$solution,nbX,nbY)
  mu = exp( Usol ) * (n / (1 + apply(exp(Usol),1,sum)))
  mux0 = n - apply(mu,1,sum)
  mu0y = m - apply(mu,2,sum)
  val = sum(mu * Phi) - 2*sum(mu*log(mu / sqrt(matrix(n,ncol=1) %*% matrix(m,nrow=1)))) - sum(mux0*log(mux0 / n)) - sum(mu0y*log(mu0y / m))
  return(list(mu = mu, mux0 = mux0, mu0y = mu0y, val = val, iter = resopt$iterations))
}

## Newton Descent

Newton descent consists in doing
\begin{align*}
U_{t+1}=U_{t}-\epsilon_{t}\left(  D^{2}W\left(  U_{t}\right)  \right)
^{-1}\nabla W\left(  U_{t}\right)
\end{align*}
for $\epsilon_{t}>0$ small enough.

Intuition: when $\epsilon_{t}\rightarrow0$, $\left(  U_{t}\right)  $
tends to the solution of the ODE
\begin{align*}
\frac{dU_{t}}{dt}=-\epsilon_{t}\left(  D^{2}W\left(  U_{t}\right)  \right)
^{-1}\nabla W\left(  U_{t}\right)
\end{align*}
that is,
\begin{align*}
\frac{d}{dt}\nabla W\left(  U_{t}\right)  =-\epsilon_{t}\nabla W\left(
U_{t}\right)
\end{align*}
which has solution
\begin{align*}
\nabla W\left(  U_{t}\right)  =\nabla W\left(  U_{0}\right)  \exp\left(
-\int_{0}^{t}\epsilon_{s}ds\right)  ,
\end{align*}
hence $\left\Vert \nabla W\left(  U_{t}\right)  \right\Vert \rightarrow0$ as
$t\rightarrow+\infty$.

When $W\left(  U\right) =G\left(  U\right)  +H\left(  \Phi-U\right)$,
then
\begin{align*}
D^{2}W\left(  U\right)  =D^{2}G\left(  U\right)  +D^{2}H\left(  \Phi-U\right)
\end{align*}
which is called the *market curvature* in the hedonic equilibrium literature.

Here, with logit heterogeneities,
\begin{align*}
\frac{\partial^{2}G\left(  U\right)  }{\partial U_{xy}\partial U_{xy}}  &
=\mu_{xy}-\frac{\mu_{xy}^{2}}{n_{x}}\\
\frac{\partial^{2}G\left(  U\right)  }{\partial U_{xy}\partial U_{xy^{\prime}%
}}  &  =-\frac{\mu_{xy}\mu_{xy^{\prime}}}{n_{x}}\text{ for }y\neq y^{\prime}%
\end{align*}
where $\mu_{xy}=\partial G\left(  U\right)  /\partial U_{xy}$; while $\partial^{2}G\left(  U\right)  /\partial U_{xy}\partial U_{x^{\prime}y^{\prime}}=0$ for $x\neq x^{\prime}$.

Similar formulas hold for $D^{2}H$.
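To make the block structure of $D^{2}W$ concrete, here is a minimal base R sketch of Newton iterations on a 3-by-3 toy market. The data (`n_toy`, `m_toy`, `Phi_toy`), the helper names (`supplyOf`, `demandOf`, `hessW`) and the step-halving safeguard are assumptions made for illustration; the safeguard is not part of the method as described above.

```r
# Toy market: hypothetical masses and surplus, not the Choo-Siow data
n_toy   = c(0.20, 0.30, 0.10)
m_toy   = c(0.15, 0.25, 0.20)
Phi_toy = -abs(outer(1:3, 1:3, "-"))
nbX = length(n_toy); nbY = length(m_toy)

# Matching flows implied by U on each side of the market
supplyOf = function(U) exp(U) * (n_toy / (1 + rowSums(exp(U))))
demandOf = function(U) {
  V = Phi_toy - U
  t(t(exp(V)) * (m_toy / (1 + colSums(exp(V)))))
}
# Objective W(U) = G(U) + H(Phi - U)
W = function(U) sum(n_toy * log(1 + rowSums(exp(U)))) +
  sum(m_toy * log(1 + colSums(exp(Phi_toy - U))))

# Hessian of W, with rows and columns ordered as in c(U): (x,y) -> x + nbX*(y-1)
hessW = function(U) {
  muG = supplyOf(U); muH = demandOf(U)
  hess = matrix(0, nbX * nbY, nbX * nbY)
  for (x in 1:nbX) {                       # D2G: couples entries sharing the same x
    idx = x + nbX * (0:(nbY - 1))
    hess[idx, idx] = hess[idx, idx] + diag(muG[x, ]) - outer(muG[x, ], muG[x, ]) / n_toy[x]
  }
  for (y in 1:nbY) {                       # D2H: couples entries sharing the same y
    idx = (1:nbX) + nbX * (y - 1)
    hess[idx, idx] = hess[idx, idx] + diag(muH[, y]) - outer(muH[, y], muH[, y]) / m_toy[y]
  }
  hess
}

U = Phi_toy / 2
for (iter in 1:100) {
  g = c(supplyOf(U) - demandOf(U))                 # gradient of W, vectorized
  if (max(abs(g)) < 1e-10) break
  d = matrix(solve(hessW(U), g), nbX, nbY)         # Newton direction
  step = 1
  # Safeguard (an assumption, not in the text): halve the step while W increases
  while (W(U - step * d) > W(U) + 1e-12) step = step / 2
  U = U - step * d
}
print(iter)
```

Solving the linear system with `solve(hessW(U), g)` rather than forming the inverse explicitly is a design choice of this sketch; the system has dimension $\left\vert \mathcal{X}\right\vert \times\left\vert \mathcal{Y}\right\vert$, which is what makes edge Newton costly on larger markets.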

In [12]:
edgeNewton = function(Phi,n,m, xtol = 1e-5 )
{
  nbX = length(n)
  nbY = length(m)
  Z <- function(theU)
  {
    theU = matrix(theU,nbX,nbY)
    theV = Phi-theU
    denomG = 1 + apply(exp(theU),1,sum)
    denomH = 1 + apply(exp(theV),2,sum)
    gradG <<- exp( theU ) * (n / denomG)
    gradH <<- t( t(exp(theV)) * (m / denomH) ) 
    #
    return(c(gradG - gradH ))
  }
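  JZ <- function(theU)
  {
    # Relies on gradG and gradH cached by the latest call to Z (via <<-).
    # Indices follow the column-major vectorization c(U): (x,y) -> x + nbX*(y-1).
    hessG = hessH = matrix(0,nbX*nbY,nbX*nbY)
    #
    for(x in 1:nbX){
      for(y in 1:nbY){
        # D2G couples (x,y) and (x,yprime): same man type x, normalized by n[x]
        for(yprime in 1:nbY){
          hessG[x+nbX*(y-1),x+nbX*(yprime-1)] = ifelse(y==yprime,
                                                       gradG[x,y]*(1-gradG[x,y]/n[x]),
                                                       -gradG[x,y]*gradG[x,yprime]/n[x])
        }
        # D2H couples (x,y) and (xprime,y): same woman type y, normalized by m[y]
        for(xprime in 1:nbX){
          hessH[x+nbX*(y-1),xprime+nbX*(y-1)] = ifelse(x==xprime,
                                                       gradH[x,y]*(1-gradH[x,y]/m[y]),
                                                       -gradH[x,y]*gradH[xprime,y]/m[y])
        }
      }
    }
    return(hessG+hessH)
  }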
  
  U_init = Phi / 2
  
  sol = nleqslv(x = U_init,
                fn = Z, jac = JZ,
                method = "Broyden", # "Newton"
                control = list(xtol=xtol))
                
  
  Usol = matrix(sol$x,nbX,nbY)
  mu = exp( Usol ) * (n / (1 + apply(exp(Usol),1,sum)))
  mux0 = n - apply(mu,1,sum)
  mu0y = m - apply(mu,2,sum)
  val = sum(mu * Phi) - 2*sum(mu*log(mu / sqrt(matrix(n,ncol=1) %*% matrix(m,nrow=1)))) - sum(mux0*log(mux0 / n)) - sum(mu0y*log(mu0y / m))
  return(list(mu = mu, mux0 = mux0, mu0y = mu0y, val = val, iter = sol$iter))
}

## Nodal methods

In Choo-Siow's TU-logit model, one has
\begin{align*}
\mu_{xy}=\sqrt{\mu_{x0}\mu_{0y}}\exp\left(  \frac{\Phi_{xy}}{2}\right)
\end{align*}
which allowed us to reformulate equilibrium as a problem over variables
$\mu_{x0}$ and $\mu_{0y}$.

Set $a_{x}=-\log\mu_{x0}$ and $b_{y}=-\log\mu_{0y}$. Then one can view
the problem as a *nodal problem* <a name="NodalOptim"></a>
\begin{align}
\min\sum_{x\in\mathcal{X}}n_{x}a_{x}+\sum_{y\in\mathcal{Y}}m_{y}b_{y}+E\left(
a,b\right)
\end{align} 
where $E\left(  a,b\right)  =2\sum_{xy}e^{\frac{\Phi_{xy}-a_{x}-b_{y}}{2}}+\sum_{x}e^{-a_{x}}+\sum_{y}e^{-b_{y}}$.

The problem has become an optimization problem in dimension $\left\vert
\mathcal{X}\right\vert +\left\vert \mathcal{Y}\right\vert $.

This is a huge dimensionality reduction, from $\left\vert \mathcal{X}\right\vert
\times\left\vert \mathcal{Y}\right\vert $ to $\left\vert \mathcal{X}%
\right\vert +\left\vert \mathcal{Y}\right\vert $. However, nodal methods apply
only in the logit case, while the previous methods
(*edge methods*) work for any heterogeneity.

We will solve for the equilibrium $\left(  \mu_{x0},\mu_{0y}\right)  $ using:
* Gradient descent on [nodal problem](#NodalOptim)

* Newton descent on [nodal problem](#NodalOptim)

* IPFP

## Gradient descent, nodal version

If $F\left(  a,b\right)  =\sum_{x\in\mathcal{X}}n_{x}a_{x}+\sum
_{y\in\mathcal{Y}}m_{y}b_{y}+E\left(  a,b\right)  $, then
\begin{align*}
\frac{\partial F}{\partial a_{x}}  &  =n_{x}-\sum_{y\in\mathcal{Y}}
e^{\frac{\Phi_{xy}-a_{x}-b_{y}}{2}}-e^{-a_{x}}\\
\frac{\partial F}{\partial b_{y}}  &  =m_{y}-\sum_{x\in\mathcal{X}}
e^{\frac{\Phi_{xy}-a_{x}-b_{y}}{2}}-e^{-b_{y}}%
\end{align*}
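which can be interpreted as another measure of market imbalance: the mass of individuals of each type minus the mass of matched and single individuals implied by $\left(  a,b\right)  $.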
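One way to gain confidence in these formulas is to compare them with a finite-difference approximation of $F$. The sketch below does this in base R on a 3-by-3 toy market; the data, the function names `Fnodal` and `gradFnodal`, the evaluation point and the increment `h` are illustrative assumptions.

```r
# Toy market: hypothetical masses and surplus, not the Choo-Siow data
n_toy   = c(0.20, 0.30, 0.10)
m_toy   = c(0.15, 0.25, 0.20)
Phi_toy = -abs(outer(1:3, 1:3, "-"))
K   = exp(Phi_toy / 2)
nbX = length(n_toy); nbY = length(m_toy)

# Nodal objective F(a,b), with ab = c(a,b)
Fnodal = function(ab) {
  a = ab[1:nbX]; b = ab[nbX + (1:nbY)]
  sum(n_toy * a) + sum(m_toy * b) +
    2 * sum(exp(-a / 2) * (K %*% exp(-b / 2))) + sum(exp(-a)) + sum(exp(-b))
}

# Analytical gradient: mass of each type minus implied matched and single masses
gradFnodal = function(ab) {
  a = ab[1:nbX]; b = ab[nbX + (1:nbY)]
  mu = exp(-a / 2) * t(exp(-b / 2) * t(K))     # implied matches mu_xy
  c(n_toy - rowSums(mu) - exp(-a), m_toy - colSums(mu) - exp(-b))
}

# Central finite differences at an arbitrary point
ab0 = -c(log(n_toy / 2), log(m_toy / 2))
h   = 1e-6
numGrad = sapply(seq_along(ab0), function(k) {
  e = replace(numeric(length(ab0)), k, h)      # perturb the k-th coordinate only
  (Fnodal(ab0 + e) - Fnodal(ab0 - e)) / (2 * h)
})
print(max(abs(numGrad - gradFnodal(ab0))))     # should be close to zero
```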

In gradient descent, $a_{x}$ and $b_{y}$ adjust proportionally to market imbalance.

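The implementation below uses `nloptr` with the L-BFGS algorithm (`NLOPT_LD_LBFGS`), a quasi-Newton method that only requires the objective and its gradient.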

In [13]:
nodalGradient = function(Phi,n,m, xtol_rel = 1e-8 ,ftol_rel=1e-15)
{
  K = exp(Phi / 2)
  tK = t(K)
  nbX = length(n)
  nbY = length(m)
  eval_f=function(ab)
  {
    a = ab[1:nbX]
    b = ab[(1+nbX):(nbX+nbY)]
    A = exp(-a / 2)
    B = exp(-b / 2)
    A2 = A * A
    B2 = B * B
    val = sum(n*a)+sum(m*b) + 2 * matrix(A,nrow=1) %*% K %*% B + sum(A2) + sum(B2)
    grada = n - A * (K %*% B) - A2
    gradb = m - B * (tK %*% A) - B2
    grad = c(grada,gradb)
    return(list(objective = val, 
                gradient = grad))
  }
  ab_init = -c(log(n /2),log(m/2))
  
  resopt = nloptr(x0 = ab_init, eval_f = eval_f,
                  opt = list("algorithm" = "NLOPT_LD_LBFGS",
                             "xtol_rel"= xtol_rel,
                             "ftol_rel"= ftol_rel))
  absol = resopt$solution
  a = absol[1:nbX]
  b = absol[(1+nbX):(nbX+nbY)]
  A = exp(-a / 2)
  B = exp(-b / 2)
  mu = c(A) * t( c(B) * tK  )
  mux0 = n - apply(mu,1,sum)
  mu0y = m - apply(mu,2,sum)
  val = sum(mu * Phi) - 2*sum(mu*log(mu / sqrt(matrix(n,ncol=1) %*% matrix(m,nrow=1)))) - sum(mux0*log(mux0 / n)) - sum(mu0y*log(mu0y / m))
  return(list(mu = mu, mux0 = mux0, mu0y = mu0y, val = val, iter = resopt$iterations))
}

## Newton descent, nodal version

If $F\left(  a,b\right)  =\sum_{x\in\mathcal{X}}n_{x}a_{x}+\sum
_{y\in\mathcal{Y}}m_{y}b_{y}+E\left(  a,b\right)  $, then
\begin{align*}
\frac{\partial^{2}F}{\partial a_{x}\partial a_{x^{\prime}}}  &  =1_{\left\{
x=x^{\prime}\right\}  }\left\{  \frac{1}{2}\sum_{y\in\mathcal{Y}}e^{\frac
{\Phi_{xy}-a_{x}-b_{y}}{2}}+e^{-a_{x}}\right\} \\
\frac{\partial^{2}F}{\partial b_{y}\partial b_{y^{\prime}}}  &  =1_{\left\{
y=y^{\prime}\right\}  }\left\{  \frac{1}{2}\sum_{x\in\mathcal{X}}e^{\frac
{\Phi_{xy}-a_{x}-b_{y}}{2}}+e^{-b_{y}}\right\} \\
\frac{\partial^{2}F}{\partial a_{x}\partial b_{y}}  &  =\frac{1}{2}%
e^{\frac{\Phi_{xy}-a_{x}-b_{y}}{2}}%
\end{align*}
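The block structure of this Hessian (two diagonal blocks and the off-diagonal block $\mu_{xy}/2$) can be assembled in a few lines. The following base R sketch runs Newton iterations on a 3-by-3 toy market; the data, the name `Fnodal`, the tolerance and the step-halving safeguard are assumptions made for illustration, not part of the method as stated.

```r
# Toy market: hypothetical masses and surplus, not the Choo-Siow data
n_toy   = c(0.20, 0.30, 0.10)
m_toy   = c(0.15, 0.25, 0.20)
Phi_toy = -abs(outer(1:3, 1:3, "-"))
K   = exp(Phi_toy / 2)
nbX = length(n_toy); nbY = length(m_toy)

# Nodal objective F(a,b), used only by the safeguard below
Fnodal = function(ab) {
  a = ab[1:nbX]; b = ab[nbX + (1:nbY)]
  sum(n_toy * a) + sum(m_toy * b) +
    2 * sum(exp(-a / 2) * (K %*% exp(-b / 2))) + sum(exp(-a)) + sum(exp(-b))
}
# Matches implied by (a,b): mu_xy = exp((Phi_xy - a_x - b_y)/2)
muOf = function(a, b) exp(-a / 2) * t(exp(-b / 2) * t(K))

ab = -c(log(n_toy / 2), log(m_toy / 2))       # starting point
for (iter in 1:100) {
  a = ab[1:nbX]; b = ab[nbX + (1:nbY)]
  mu = muOf(a, b)
  g = c(n_toy - rowSums(mu) - exp(-a), m_toy - colSums(mu) - exp(-b))
  if (max(abs(g)) < 1e-10) break
  # Hessian: diagonal blocks plus off-diagonal blocks mu/2 and t(mu)/2
  J = rbind(cbind(diag(rowSums(mu) / 2 + exp(-a)), mu / 2),
            cbind(t(mu) / 2, diag(colSums(mu) / 2 + exp(-b))))
  d = solve(J, g)                             # Newton direction
  step = 1
  # Safeguard (an assumption, not in the text): halve the step while F increases
  while (Fnodal(ab - step * d) > Fnodal(ab) + 1e-12) step = step / 2
  ab = ab - step * d
}

a = ab[1:nbX]; b = ab[nbX + (1:nbY)]
mu = muOf(a, b)
print(max(abs(rowSums(mu) + exp(-a) - n_toy)))   # men's market clearing residual
```

Here the linear system has dimension $\left\vert \mathcal{X}\right\vert +\left\vert \mathcal{Y}\right\vert$ only, which is the practical payoff of the nodal formulation.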

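The implementation below uses `nleqslv` to solve the first-order conditions $\nabla F\left(  a,b\right)  =0$, supplying the Hessian above as the Jacobian; the solver is called with `method = "Broyden"`, and `"Newton"` is the alternative.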

In [14]:
nodalNewton = function(Phi,n,m, xtol = 1e-8 )
{
  K = exp(Phi / 2)
  tK = t(K)
  nbX = length(n)
  nbY = length(m)
  Z=function(ab)
  {
    a = ab[1:nbX]
    b = ab[(1+nbX):(nbX+nbY)]
    A <<- exp(-a / 2)
    B <<- exp(-b / 2)
    A2 <<- A * A
    B2 <<- B * B
    sumx <<- A * (K %*% B)
    sumy <<- B * (tK %*% A)
    grada = n - sumx - A2
    gradb = m - sumy - B2
    grad = c(grada,gradb)
    return( grad)
  }
  JZ = function(ab)
  {
    J11 = diag(c(.5*sumx+A2))
    J22 = diag(c(.5*sumy+B2))
    J12 = .5 * c(A) * t( c(B) * tK  ) 
    J21 = t(J12)
    J = rbind(cbind(J11,J12),cbind(J21,J22))
    return(J)
  }
  
  ab_init = -c(log(n /2),log(m/2))

  sol = nleqslv(x = ab_init,
                fn = Z, jac = JZ,
                method = "Broyden", # "Newton"
                control = list(xtol=xtol))
  
  absol = sol$x
  a = absol[1:nbX]
  b = absol[(1+nbX):(nbX+nbY)]
  A = exp(-a / 2)
  B = exp(-b / 2)
  mu = c(A) * t( c(B) * tK  )
  mux0 = n - apply(mu,1,sum)
  mu0y = m - apply(mu,2,sum)
  val = sum(mu * Phi) - 2*sum(mu*log(mu / sqrt(matrix(n,ncol=1) %*% matrix(m,nrow=1)))) - sum(mux0*log(mux0 / n)) - sum(mu0y*log(mu0y / m))
  return(list(mu = mu, mux0 = mux0, mu0y = mu0y, val = val, iter = sol$iter))
}

## IPFP

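Recall that, setting $a_{x}=\sqrt{\mu_{x0}}$ and $b_{y}=\sqrt{\mu_{0y}}$
(a change of notation from the nodal sections; these are the quantities `A` and `B` in the code)
and $K_{xy}=\exp\left(  \Phi_{xy}/2\right)  $, one can employ the following iterative scheme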
\begin{align*}
\left\{
\begin{array}
[c]{l}%
a_{x}^{2t+1}=\sqrt{n_{x}+\left(  \sum_{y\in\mathcal{Y}}b_{y}^{2t}%
K_{xy}/2\right)  ^{2}}-\sum_{y\in\mathcal{Y}}b_{y}^{2t}K_{xy}/2\\
b_{y}^{2t+2}=\sqrt{m_{y}+\left(  \sum_{x\in\mathcal{X}}a_{x}^{2t+1}%
K_{xy}/2\right)  ^{2}}-\sum_{x\in\mathcal{X}}a_{x}^{2t+1}K_{xy}/2
\end{array}
\right.
\end{align*}

The algorithm converges for any starting point $b^{0}$; the code below starts from $b_{y}^{0}=\sqrt{m_{y}}$.
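Each half-step has a simple interpretation. With $\mu_{xy}=a_{x}b_{y}K_{xy}$ and $\mu_{x0}=a_{x}^{2}$, market clearing for men of type $x$ reads $a_{x}^{2}+a_{x}\sum_{y}b_{y}K_{xy}=n_{x}$, a quadratic equation in $a_{x}$ whose positive root is the update above. The short check below verifies this numerically for one update on a toy market; the data and variable names are illustrative assumptions.

```r
# Toy market: hypothetical masses and surplus, not the Choo-Siow data
n_toy   = c(0.20, 0.30, 0.10)
m_toy   = c(0.15, 0.25, 0.20)
Phi_toy = -abs(outer(1:3, 1:3, "-"))
K = exp(Phi_toy / 2)

b = sqrt(m_toy)                          # current guess for b_y
s = c(K %*% b)                           # s_x = sum_y K_xy b_y
a = sqrt(n_toy + (s / 2)^2) - s / 2      # IPFP update for a_x

# After the update, the men's side clears exactly given b
residual = a^2 + a * s - n_toy
print(max(abs(residual)))
```

The update for $b_{y}$ is symmetric, using the women's market clearing condition given the freshly updated $a$.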

In [15]:
ipfp = function(Phi,n,m, tol = 1e-6)
{
  K = exp(Phi / 2)
  tK = t(K)
  B  = sqrt(m)
  cont = T
  iter=0
  while (cont)
  {
    iter = iter + 1
    KBover2 = K %*% B /2
    A = sqrt(n + KBover2 * KBover2) - KBover2
    tKAover2 = tK %*% A / 2
    B = sqrt(m + tKAover2 * tKAover2) - tKAover2
    
    discrepancy = max(abs( A * ( K %*% B + A) - n ) / n)
    if (discrepancy<tol)
    {cont = F}
  }
  mu = c(A) * t( c(B) * tK  )
  mux0 = n - apply(mu,1,sum)
  mu0y = m - apply(mu,2,sum)
  val = sum(mu * Phi) - 2*sum(mu*log(mu / sqrt(matrix(n,ncol=1) %*% matrix(m,nrow=1)))) - sum(mux0*log(mux0 / n)) - sum(mu0y*log(mu0y / m))
  return(list(mu = mu, mux0 = mux0, mu0y = mu0y, val = val, iter = iter))
}

## Linear programming

Recall that the matching surplus between man $i$ and woman $j$ is
\begin{align*}
\tilde{\Phi}_{ij}=\Phi_{x_{i}y_{j}}+\varepsilon_{iy_{j}}+\eta_{x_{i}j}%
\end{align*}
where $\Phi_{xy}=\alpha_{xy}+\gamma_{xy}$. The value of the optimal matching is
thus, in its dual form,
\begin{align*}
\min_{u_{i},v_{j}}  &  \sum_{i\in\mathcal{I}}u_{i}+\sum_{j\in\mathcal{J}}%
v_{j}\\
s.t.~  &  u_{i}+v_{j}\geq\Phi_{x_{i}y_{j}}+\varepsilon_{iy_{j}}+\eta_{x_{i}%
j}\\
&  u_{i}\geq\varepsilon_{i0}\\
&  v_{j}\geq\eta_{j0}%
\end{align*}

Here, $\varepsilon_{iy_{j}}$ and $\eta_{x_{i}j}$ are i.i.d. draws from
standard Gumbel distributions.
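The link between simulated draws and the closed-form function $G$ can be checked on one side of the market. The sketch below draws recentered Gumbel shocks with the same expression as the code in this lecture, `digamma(1) - log(-log(runif(.)))`, and compares the simulated average indirect utility of one type with $\log\left(  1+\sum_{y}\exp\left(  U_{xy}\right)  \right)  $. The utilities `U_toy`, the number of draws, the seed and the helper name `rgumbel0` are illustrative assumptions.

```r
set.seed(777)                      # arbitrary seed, for reproducibility only
nbDraws = 1e5
U_toy = c(0.5, -0.2, 0.1)          # hypothetical systematic utilities U_xy for one type x

# Recentered Gumbel draws: digamma(1) equals minus Euler's constant,
# so each draw has mean zero
rgumbel0 = function(k) digamma(1) - log(-log(runif(k)))

eps_y = matrix(rgumbel0(nbDraws * length(U_toy)), nbDraws, length(U_toy))
eps_0 = rgumbel0(nbDraws)

# Each simulated individual picks the best partner type, or stays single
u_i = pmax(apply(sweep(eps_y, 2, U_toy, "+"), 1, max), eps_0)

simulated  = mean(u_i)
closedForm = log(1 + sum(exp(U_toy)))
print(c(simulated = simulated, closedForm = closedForm))
```

The two numbers should agree up to simulation error, which shrinks at rate $1/\sqrt{\text{nbDraws}}$; this slow rate is one reason a simulated approach needs many draws to approach the logit solution.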

This simulation-based approach is not competitive in terms of speed, but it is worth keeping in mind.

This is done via `Gurobi`.

In [16]:
simulatedLinprogr = function (Phi,n,m,nbDraws=1e3,seed=777)
{
  nbX = length (n)
  nbY = length (m)
  nbI = nbX * nbDraws
  nbJ = nbY * nbDraws
  #
  set.seed(seed) # use the seed argument so that the simulated draws are reproducible
  epsilon_iy = matrix(digamma(1) - log(-log(runif(nbI*nbY))),nbI,nbY)
  epsilon0_i = c(digamma(1) - log(-log(runif(nbI))))
  
  I_ix = matrix(0,nbI,nbX)
  for (x in 1:nbX)
  {
    I_ix[(nbDraws*(x-1)+1):(nbDraws*x),x] = 1
  }
  
  eta_xj = matrix(digamma(1) - log(-log(runif(nbX*nbJ))),nbX,nbJ)
  eta0_j = c(digamma(1) - log(-log(runif(nbJ))))  # one draw per simulated woman j
  
  I_yj = matrix(0,nbY,nbJ)
  for (y in 1:nbY)
  {
    I_yj[y,(nbDraws*(y-1)+1):(nbDraws*y)] = 1
  }
  
    
  ni = c(I_ix %*% n)/nbDraws
  mj = c(m %*% I_yj)/nbDraws
  #
  # based on this, can compute aggregated equilibrium in LP 
  #
  A_11 = suppressMessages( Matrix::kronecker(matrix(1,nbY,1),sparseMatrix(1:nbI,1:nbI,x=1)) )
  A_12 = sparseMatrix(i=NULL,j=NULL,dims=c(nbI*nbY,nbJ),x=0)
  A_13 = suppressMessages( Matrix::kronecker(sparseMatrix(1:nbY,1:nbY,x=-1),I_ix) )
  
  A_21 = sparseMatrix(i=NULL,j=NULL,dims=c(nbX*nbJ,nbI),x=0)
  A_22 = suppressMessages( Matrix::kronecker(sparseMatrix(1:nbJ,1:nbJ,x=1),matrix(1,nbX,1)) )
  A_23 = suppressMessages( Matrix::kronecker(t(I_yj),sparseMatrix(1:nbX,1:nbX,x=1)) )
  
  A_1  = cbind(A_11,A_12,A_13)
  A_2  = cbind(A_21,A_22,A_23)
  
  A    = rbind(A_1,A_2)
  # 
  nbconstr = dim(A)[1]
  nbvar = dim(A)[2]
  #
  lb  = c(epsilon0_i,t(eta0_j), rep(-Inf,nbX*nbY))
  rhs = c(epsilon_iy, eta_xj+Phi %*% I_yj)
  obj = c(ni,mj,rep(0,nbX*nbY))
  sense = rep(">=",nbconstr)
  modelsense = "min"
  #
  result = gurobi(list(obj=obj,A=A,modelsense=modelsense,rhs=rhs,sense=sense,lb=lb),params=list(OutputFlag=0))
  #
  muiy = matrix(result$pi[1:(nbI*nbY)],nrow=nbI)
  mu = t(I_ix) %*% muiy
  val = sum(ni*result$x[1:nbI]) + sum(mj*result$x[(nbI+1):(nbI+nbJ)])
  
  mux0 = n - apply(mu,1,sum)
  mu0y = m - apply(mu,2,sum)
  return(list(mu = mu, mux0 = mux0, mu0y = mu0y, val = val, iter = NA))
}

Benchmarking is done via the `microbenchmark` package.

In [17]:
printStats = function(n,m,mu,phi,lambda)
{
  avgAbsDiff = -sum(mu * phi) / sum(mu) # average of -phi over matches: the average absolute age difference, in the units of phi (thephi is scaled by 1/20)
  fractionMarried = 2 * sum(mu) / (sum(n)+sum(m)) # fraction of married individuals
  # print(paste0("Value of lambda= ",lambda))
  print(paste0("Average absolute age difference between matched partners= ",avgAbsDiff))
  print(paste0("Fraction of married individuals= ",fractionMarried))
}

thelambda = 1
res_edgeGradient = edgeGradient(thelambda*thephi,then,them)
res_edgeNewton = edgeNewton(thelambda*thephi,then,them)
res_nodalGradient = nodalGradient(thelambda*thephi,then,them)
res_nodalNewton = nodalNewton(thelambda*thephi,then,them)
res_ipfp = ipfp(thelambda*thephi,then,them)
res_simulatedLinprogr = simulatedLinprogr(thelambda*thephi,then,them)

printStats(then,them,res_ipfp$mu,thephi,thelambda)

print("Values returned")
print(paste0("Edge gradient  = ",res_edgeGradient$val))
print(paste0("Edge Newton    = ",res_edgeNewton$val))
print(paste0("Nodal gradient = ",res_nodalGradient$val))
print(paste0("Nodal Newton   = ",res_nodalNewton$val))
print(paste0("IPFP           = ",res_ipfp$val))
print(paste0("Linear progr   = ",res_simulatedLinprogr$val))

print("Number of iterations")
print(paste0("Edge gradient  = ",res_edgeGradient$iter))
print(paste0("Edge Newton    = ",res_edgeNewton$iter))
print(paste0("Nodal gradient = ",res_nodalGradient$iter))
print(paste0("Nodal Newton   = ",res_nodalNewton$iter))
print(paste0("IPFP           = ",res_ipfp$iter))


res = microbenchmark(edgeGradient(thelambda*thephi,then,them),edgeNewton(thelambda*thephi,then,them),nodalGradient(thelambda*thephi,then,them),nodalNewton(thelambda*thephi,then,them),ipfp(thelambda*thephi,then,them),times=10)
print(res)

[1] "Average absolute age difference between matched partners= 0.303955662667604"
[1] "Fraction of married individuals= 0.910811811292031"
[1] "Values returned"
[1] "Edge gradient  = 2.71553056764454"
[1] "Edge Newton    = 2.71553056763396"
[1] "Nodal gradient = 2.71553056764911"
[1] "Nodal Newton   = 2.71553056764975"
[1] "IPFP           = 2.71553056764864"
[1] "Linear progr   = 2.71257516760806"
[1] "Number of iterations"
[1] "Edge gradient  = 71"
[1] "Edge Newton    = 28"
[1] "Nodal gradient = 39"
[1] "Nodal Newton   = 52"
[1] "IPFP           = 41"
Unit: microseconds
                                          expr        min         lq
  edgeGradient(thelambda * thephi, then, them)  32902.028  33532.362
    edgeNewton(thelambda * thephi, then, them) 855042.906 857466.817
 nodalGradient(thelambda * thephi, then, them)   2808.359   2935.477
   nodalNewton(thelambda * thephi, then, them)   4151.918   4345.224
          ipfp(thelambda * thephi, then, them)    604.354    636.829
        m

IPFP is the clear winner. Even though R is slow with
loops, it beats the next most efficient procedure (nodal gradient) by a factor of about 5.

Edge Newton is penalized by large matrix inversions: the Hessian is of size $625\times625$ here. However, it
converges in a remarkably low number of iterations, and it could be sped up if the matrix
inversion were done efficiently.