Packages:

- [NLopt](https://github.com/JuliaOpt/NLopt.jl): Powell's derivative-free algorithms
- [Optim](https://github.com/JuliaNLSolvers/Optim.jl): gradient-based algorithms, automatic differentiation support

In [38]:
using NLopt
using Optim
using Printf

# Nonlinear unconstrained optimization

Example: projective measurements on the singlet state $\left| \Psi^- \right> = \left( \left| 01 \right> - \left| 10 \right> \right)/\sqrt{2} \in \mathcal{H}_\text{A} \otimes \mathcal{H}_\text{B}$.

The subscripts A and B stand for Alice and Bob.

We parameterize local measurements by qubit states.

$$ \left| \vec{\phi} \right> = \begin{pmatrix} \cos \phi_1 \\ e^{i \phi_2} \sin \phi_1 \end{pmatrix}, \qquad \left| \vec{\phi} \right>_\perp = \begin{pmatrix} -\sin \phi_1 \\ e^{i \phi_2} \cos \phi_1 \end{pmatrix}  $$
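The parameterization above can be checked numerically. The following is a minimal sketch, assuming two hypothetical helper functions `qubit_state` and `qubit_state_perp` that are not defined elsewhere in this notebook; it verifies that the two vectors are normalized and mutually orthogonal for an arbitrary pair of angles.

```julia
using LinearAlgebra

# Hypothetical helpers (not defined in the notebook) for the two basis vectors
qubit_state(phi1, phi2) = [cos(phi1), exp(im * phi2) * sin(phi1)]
qubit_state_perp(phi1, phi2) = [-sin(phi1), exp(im * phi2) * cos(phi1)]

phi1, phi2 = 0.3, 1.1          # arbitrary test angles
v = qubit_state(phi1, phi2)
w = qubit_state_perp(phi1, phi2)

# `dot` conjugates its first argument, so dot(v, w) is the inner product <v|w>
println("norms:   ", norm(v), ", ", norm(w))
println("overlap: ", abs(dot(v, w)))
```

Both norms should equal 1 and the overlap should vanish up to floating-point rounding, so each pair of angles defines a valid projective measurement.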

There are two measurement settings on $\mathcal{H}_\text{A}$, indexed by $s=1,2$, corresponding to the two orthonormal bases $\{\left| \alpha_1 \right>, \left| \alpha_1 \right>_\perp\}$ and $\{\left| \alpha_2 \right>, \left| \alpha_2 \right>_\perp \}$. Likewise, there are two measurement settings on $\mathcal{H}_\text{B}$, indexed by $t=1,2$, corresponding to $\{\left| \beta_1 \right>, \left| \beta_1 \right>_\perp\}$ and $\{\left| \beta_2 \right>, \left| \beta_2 \right>_\perp\}$.

The measurement outcomes are 1-based (as in Julia and Matlab): for Alice, $a=1$ corresponds to $\left | \alpha_s \right >$ and $a=2$ to $\left | \alpha_s \right >_\perp$. Likewise for Bob, $b=1$ corresponds to $\left | \beta_t \right>$ and $b=2$ to $\left| \beta_t \right>_\perp$.

We then consider the joint conditional probability distribution $P_\text{AB|ST}(a,b|s,t)$, which we compute using Born's rule:

$$ P_\text{AB|ST}(1,1|s,t) = \left| \left( \left< \alpha_s \right | \otimes \left< \beta_t \right | \right ) \left | \Psi^- \right > \right|^2 $$
$$ P_\text{AB|ST}(2,1|s,t) = \left| \left( \left< \alpha_s \right |_\perp \otimes \left< \beta_t \right | \right ) \left | \Psi^- \right > \right|^2 $$
$$ P_\text{AB|ST}(1,2|s,t) = \left| \left( \left< \alpha_s \right | \otimes \left< \beta_t \right |_\perp \right ) \left | \Psi^- \right > \right|^2 $$
$$ P_\text{AB|ST}(2,2|s,t) = \left| \left( \left< \alpha_s \right |_\perp \otimes \left< \beta_t \right |_\perp \right ) \left | \Psi^- \right > \right|^2 $$
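As a sanity check on Born's rule, here is a small sketch for a single pair of settings, restricted to real bases ($\phi_2 = 0$). The helper names `basis` and `born_probs` are hypothetical and only serve this illustration. Each probability is the squared modulus of the overlap with the singlet, and the four outcomes should sum to one.

```julia
using LinearAlgebra

# Singlet state (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>
psi = [0, 1, -1, 0] / sqrt(2)

# Real measurement basis for angle theta (phi_2 = 0); hypothetical helper
basis(theta) = ([cos(theta), sin(theta)], [-sin(theta), cos(theta)])

# Born's rule for one pair of settings: entry (a, b) is |(<a| ⊗ <b|) |psi>|^2
function born_probs(alpha, beta)
    A = basis(alpha)
    B = basis(beta)
    return [abs2(dot(kron(A[a], B[b]), psi)) for a in 1:2, b in 1:2]
end

alpha, beta = 0.4, 1.3          # arbitrary test angles
P = born_probs(alpha, beta)
println(P)
println("total probability: ", sum(P))
```

For real bases, expanding the overlap by hand gives $P(1,1) = \sin^2(\alpha - \beta)/2$ and $P(1,2) = \cos^2(\alpha - \beta)/2$, which the numerical values should reproduce.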

We want to maximize the CHSH expression:

$$ C = \sum_{abst} (-1)^{(a-1)+(b-1)+(s-1)(t-1)} P_\text{AB|ST}(a,b|s,t) $$
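To get a feel for the sign pattern in $C$, the sketch below evaluates the sum on a deterministic distribution in which Alice and Bob always output 1, whatever their settings. The function name `chsh_sum` is hypothetical and is only a compact restatement of the formula above.

```julia
# Deterministic distribution: Alice and Bob always output 1, whatever the settings
P = zeros(2, 2, 2, 2)          # indices are (a, b, s, t)
P[1, 1, :, :] .= 1.0

# Hypothetical compact form of the CHSH sum
chsh_sum(P) = sum((-1)^((a-1) + (b-1) + (s-1)*(t-1)) * P[a, b, s, t]
                  for a in 1:2, b in 1:2, s in 1:2, t in 1:2)

println(chsh_sum(P))
```

Only the term with $s = t = 2$ carries a minus sign, so this distribution gives $C = 1 + 1 + 1 - 1 = 2$, the well-known local bound of the CHSH inequality.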

In [3]:
function singlet_proj_prob(x)
# Computes the joint probability distribution given by two projective measurements on each subsystem of a singlet state
    P = zeros(eltype(x)::Type, (2, 2, 2, 2));
    A = zeros(Complex{eltype(x)::Type}, (2, 2, 2));
    B = zeros(Complex{eltype(x)::Type}, (2, 2, 2));
    A[:,1,1] = [ cos(x[1])
                 sin(x[1])];
    A[:,2,1] = [-sin(x[1])
                 cos(x[1])];
    A[:,1,2] = [ cos(x[2])
                 sin(x[2])];
    A[:,2,2] = [-sin(x[2])
                 cos(x[2])];
    B[:,1,1] = [ cos(x[3])
                 sin(x[3])];
    B[:,2,1] = [-sin(x[3])
                 cos(x[3])];
    B[:,1,2] = [ cos(x[4])
                 sin(x[4])];
    B[:,2,2] = [-sin(x[4])
                 cos(x[4])];
    for s = 1:2
        for t = 1:2
            for a = 1:2
                for b = 1:2
                    ov = kron(A[:,a,s], B[:,b,t])' * [0; 1; -1; 0]/sqrt(2);
                    P[a,b,s,t] = real(conj(ov)*ov);
                end
            end
        end
    end
    return P
end

singlet_proj_prob (generic function with 1 method)
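A note on the `zeros(eltype(x), ...)` allocations: they make the output arrays follow the number type of the input. This matters for the Optim cells that request `autodiff = :forward`, since forward-mode automatic differentiation calls the objective with a special number type rather than `Float64`. The sketch below illustrates the pattern on a hypothetical toy function, `unit_vector`, using `BigFloat` as a stand-in for a non-`Float64` number type.

```julia
# Hypothetical toy function using the same allocation pattern as singlet_proj_prob
function unit_vector(x)
    v = zeros(eltype(x), 2)    # element type follows the input
    v[1] = cos(x[1])
    v[2] = sin(x[1])
    return v
end

println(eltype(unit_vector([0.5])))          # same element type as the input
println(eltype(unit_vector(big.([0.5]))))    # BigFloat input, BigFloat output
```

Had the array been allocated with a plain `zeros(2)`, the `BigFloat` values would be rounded to `Float64` on assignment, and number types that cannot be converted to `Float64` would raise an error.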

In [4]:
function chsh(P)
# Computes the CHSH expression from a joint probability distribution
    C = 0;
    for s = 1:2
        for t = 1:2
            for a = 1:2
                for b = 1:2
                    C = C + (-1)^((a-1)+(b-1)+(s-1)*(t-1)) * P[a,b,s,t];
                end
            end
        end
    end
    return C
end

chsh (generic function with 1 method)

In [104]:
using Optim
# Set up objective function, minus sign for maximization
f(x) = -chsh(singlet_proj_prob(x))
# Random initial point
x0 = rand(Float64, 4)*2*pi
verbose = true;

In [105]:
# Derivative-free, and possibly not differentiable: Nelder-Mead recommended
res = Optim.optimize(f, x0, Optim.NelderMead(), Optim.Options(f_tol = 1e-8, show_trace = verbose, show_every = 10))
@printf("\n\nMaximum: %f in %d iterations\n", -Optim.minimum(res), Optim.iterations(res))
@printf("Diff: %0.1E\n", abs(Optim.minimum(res) + 2*sqrt(2)))

Iter     Function value    √(Σ(yᵢ-ȳ)²)/n 
------   --------------    --------------
     0    -1.797219e+00     9.927339e-01
 * time: 0.00016021728515625
    10    -2.191993e+00     2.288082e-01
 * time: 0.002167224884033203
    20    -2.745744e+00     6.492282e-02
 * time: 0.0038650035858154297
    30    -2.812342e+00     4.326778e-03
 * time: 0.005515098571777344
    40    -2.826882e+00     1.381557e-03
 * time: 0.006715059280395508
    50    -2.828178e+00     8.475774e-05
 * time: 0.0071620941162109375
    60    -2.828390e+00     3.208715e-05
 * time: 0.007607221603393555
    70    -2.828419e+00     4.963490e-06
 * time: 0.008007049560546875
    80    -2.828425e+00     7.722723e-07
 * time: 0.00844717025756836
    90    -2.828427e+00     7.894150e-08
 * time: 0.008884191513061523
   100    -2.828427e+00     4.873629e-08
 * time: 0.00929403305053711


Maximum: 2.828427 in 103 iterations
Diff: 2.9E-09


In [111]:
# Derivative-free but differentiable: Powell's BOBYQA building a quadratic model

# Uses NLopt, so different calling convention.
# Note: Powell's method pays off on medium-scale problems; on this 4-parameter problem it performs much like Nelder-Mead

function vis(x::Vector, grad::Vector)
# we fake a "grad" argument to make NLopt happy
   val = -chsh(singlet_proj_prob(x))
   return val
end

opt = Opt(:LN_BOBYQA, 4)
opt.lower_bounds = zeros(4)
opt.upper_bounds = fill(2*pi, 4)
opt.min_objective = vis
opt.ftol_rel = 1e-8
(minf, minx, ret) = NLopt.optimize(opt, x0)
@printf("\n\nMaximum: %f in %d iterations\n", -minf, opt.numevals)
@printf("Diff: %0.1E\n", abs(minf + 2*sqrt(2)))



Maximum: 2.828427 in 60 iterations
Diff: 3.3E-07
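The two-argument signature of `vis` follows the NLopt.jl callback convention: the objective receives the point `x` and a vector `grad`, which has length zero for derivative-free algorithms and must be overwritten in place when a gradient-based algorithm asks for it. Below is a minimal sketch of that convention on a hypothetical objective, `sphere`, called directly so that it runs without NLopt installed.

```julia
# Hypothetical objective written in the NLopt.jl callback convention
function sphere(x::Vector, grad::Vector)
    if length(grad) > 0
        grad .= 2 .* x         # fill the gradient in place when requested
    end
    return sum(abs2, x)
end

# Called the way a derivative-free algorithm would: empty gradient vector
println(sphere([1.0, 2.0], Float64[]))

# Called the way a gradient-based algorithm would: gradient is written into g
g = zeros(2)
println(sphere([1.0, 2.0], g), "  ", g)
```

Since BOBYQA is derivative-free, `vis` can ignore `grad` entirely, which is why the notebook only "fakes" the argument.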


In [107]:
# Newton's method uses Hessian information, here obtained by automatic differentiation
res = Optim.optimize(f, x0, Optim.Newton(), Optim.Options(f_tol = 1e-8, show_trace = verbose, show_every = 10); autodiff = :forward)
@printf("\n\nMaximum: %f in %d iterations\n", -Optim.minimum(res), Optim.iterations(res))
@printf("Diff: %0.1E\n", abs(Optim.minimum(res) + 2*sqrt(2)))

Iter     Function value   Gradient norm 
     0    -7.078627e-02     3.782969e+00
 * time: 5.698204040527344e-5


Maximum: 2.828427 in 5 iterations
Diff: 4.4E-16


In [108]:
# Only gradients provided: BFGS recommended (medium scale), the Hessian is approximated, gradients come from automatic differentiation
res = Optim.optimize(f, x0, Optim.BFGS(), Optim.Options(f_tol = 1e-8, show_trace = verbose, show_every = 10); autodiff = :forward)
@printf("\n\nMaximum: %f in %d iterations\n", -Optim.minimum(res), Optim.iterations(res))
@printf("Diff: %0.1E\n", abs(Optim.minimum(res) + 2*sqrt(2)))

Iter     Function value   Gradient norm 
     0    -7.078627e-02     3.782969e+00
 * time: 5.0067901611328125e-5
    10    -2.828427e+00     2.889560e-07
 * time: 0.0009119510650634766


Maximum: 2.828427 in 11 iterations
Diff: 4.4E-16


In [109]:
# Gradients provided, large scale, use limited memory BFGS
res = Optim.optimize(f, x0, Optim.LBFGS(), Optim.Options(show_trace = verbose, show_every = 10); autodiff = :forward)
@printf("\n\nMaximum: %f in %d iterations\n", -Optim.minimum(res), Optim.iterations(res))
@printf("Diff: %0.1E\n", abs(Optim.minimum(res) + 2*sqrt(2)))

Iter     Function value   Gradient norm 
     0    -7.078627e-02     3.782969e+00
 * time: 6.604194641113281e-5
    10    -2.828427e+00     1.388879e-05
 * time: 0.002519845962524414


Maximum: 2.828427 in 12 iterations
Diff: 4.4E-16


# Exercises

- Find a "nice" algebraic expression for the maximizer corresponding to the objective value $2 \sqrt{2}$
- Add support for the complex phase $\phi_2$ and see whether it makes any difference (real vs. complex quantum mechanics!).
- Parameterize the quantum state as $\left| \gamma \right> = \cos \gamma \left| 01 \right> + \sin \gamma \left| 10 \right>$. Which state leads to the largest violation?

## Bonus open-ended exercises

- Plot the objective value against the iteration number; either use the trace, a callback, or just stop the algorithm after $n$ steps.
- (Beware package incompatibilities) Parameterize $\cos \alpha, \sin \alpha$ as $x_1, x_2$ with $x_1^2 + x_2^2 = 1$ for each angle, and rewrite the above as a polynomial optimization problem. Solve with JuMP and the polynomial optimization extensions. Can you certify that $2 \sqrt{2}$ is a global maximum?