Quantum optimal control methods formalize the problem of finding "control fields" that steer the time evolution of a quantum system to achieve some physical objective, possibly subject to constraints. The most direct example is a state-to-state transition, that is, for a known quantum state at time zero to evolve to a specific target state at final time T, controlling, e.g., a chemical reaction TannorJCP1985. Another example is the realization of quantum gates, the building blocks of a quantum computer. In this case, the states forming a computational basis must transform according to a unitary transformation NielsenChuang.
. The control fields might be the amplitudes of a laser pulse, for the control of a molecular system, RF fields for nuclear magnetic resonance, or microwave fields for superconducting circuits. There may be multiple independent controls involved in the dynamics, such as different color lasers used in the excitation of a Rydberg atom, or different polarization components of an electric field.
The quantum control methods build on a rich field of classical control theory BellmanBook,PontryaginBook. This includes Krotov's method Krotov.book, which was originally formulated to optimize the soft landing of a spacecraft from orbit to the surface of a planet KonnovARC99, before being applied to quantum mechanical problems SklarzPRA2002. Fundamentally, these methods rely on the variational principle, that is, the minimization of a functional of the states and controls that includes any required constraints. In rare cases, the resulting control equations can be solved in closed form, based on Pontryagin's maximum principle PontryaginBook. Numerical methods are required in any other case. These start from an initial guess control (or set of guess controls, if there are multiple controls), and calculate an update to these controls that will decrease the value of the functional. The updated controls then become the guess for the next iteration of the algorithm, until the value of the functional is sufficiently small, or convergence is reached.
Mathematically, Krotov’s method, when applied to quantum systems Tannor92,ReichJCP12, minimizes a functional of the most general form

$$J[\{\ket{\phi_k^{(i)}(t)}\}, \{\epsilon_l^{(i)}(t)\}] = J_T(\{\ket{\phi_k^{(i)}(T)}\}) + \sum_l \int_0^T g_a(\epsilon_l^{(i)}(t)) \,\mathrm{d}t + \int_0^T g_b(\{\phi^{(i)}_k(t)\}) \,\mathrm{d}t\,,$$

where the $\{\ket{\phi_k^{(i)}(T)}\}$ are the time-evolved initial states $\{\ket{\phi_k}\}$ under the controls $\{\epsilon_l^{(i)}(t)\}$ of the i'th iteration.
The functional consists of three parts:
- A final-time functional JT. This is the "main" part of the functional, and we can usually think of J as being an auxiliary functional in the optimization of JT.
- A running cost on the control fields, ga. The most commonly used expression (and the only one currently supported by the krotov package) is PalaoPRA2003
$$g_a(\epsilon_l^{(i+1)}(t)) = \frac{\lambda_{a,l}}{S_l(t)} \Delta\epsilon_l^2(t)\,, \quad \Delta\epsilon_l(t) \equiv \epsilon_l^{(i+1)}(t) - \epsilon_l^{(i)}(t)\,.$$ This introduces two parameters for each control, the (inverse) Krotov "step width" λa, l and the update-shape function Sl(t) ∈ [0, 1]. Δϵl(t) is the update of the control in a single iteration of the optimization algorithm. As we will see below, λa, l determines the overall magnitude of Δϵl(t), and Sl(t) can be used to ensure boundary conditions on ϵl(i + 1)(t).
- An optional state-dependent running cost, gb, can be employed, e.g., for penalizing population in a subspace PalaoPRA2008. This is rarely used, as there are other methods to achieve the same effect, like using a non-Hermitian Hamiltonian to remove population from the forbidden subspace during the time evolution. Currently, the krotov package only supports gb ≡ 0.
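The running cost ga can be evaluated by discretizing its integral over the time grid. The following is a plain-numpy sketch of that evaluation (the function name and signature are illustrative, not the krotov package's internal code); it shows how λa and S(t) weight the update Δϵ(t):

```python
import numpy as np

def g_a_integral(eps_new, eps_old, S, lambda_a, dt):
    """Discretized integral of g_a = (lambda_a / S(t)) * (Delta eps)^2.

    Illustrative sketch only. Points where S(t) = 0 are excluded:
    there, the update equation forces Delta eps = 0, so they
    contribute nothing to the cost.
    """
    delta = np.asarray(eps_new) - np.asarray(eps_old)
    S = np.asarray(S, dtype=float)
    mask = S > 0
    return float(np.sum(lambda_a / S[mask] * delta[mask] ** 2) * dt)

# A constant update of 0.5 on 5 intervals, with S = 1, lambda_a = 2, dt = 0.1:
cost = g_a_integral(0.5 * np.ones(5), np.zeros(5), np.ones(5), lambda_a=2.0, dt=0.1)
# cost = 2 * 0.25 * 5 * 0.1 = 0.25
```

Note that a larger λa directly increases the cost of any pulse change, which is why it acts as an inverse step width.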
The most commonly used final-time functionals (cf. krotov.functionals) optimize for a set of initial states $\{\ket{\phi_k}\}$ to evolve to a set of target states $\{\ket{\phi_k^{\text{tgt}}}\}$. The functionals are then expressed in terms of the complex overlaps of the target states with the final-time states under the given controls,

$$\tau_k = \langle \phi_k^{\text{tgt}} \vert \phi_k(T) \rangle$$

in Hilbert space, or

$$\tau_k = \operatorname{tr}\left[\hat{\rho}_k^{\text{tgt}\,\dagger} \hat{\rho}_k(T)\right]$$

in Liouville space.
The following functionals JT can be formed from these complex overlaps, taking into account that any optimization functional JT must be real. They differ by the way they treat the phases φk in the physical optimization goal PalaoPRA2003:
- Optimize for simultaneous state-to-state transitions, with completely arbitrary phases φk,
  $$J_{T,\text{ss}} = 1- \frac{1}{N} \sum_{k=1}^{N} \Abs{\tau_k}^2\,,$$
  cf. J_T_ss.
- Optimize for simultaneous state-to-state transitions, with an arbitrary global phase, i.e., φk = φglobal for all k with arbitrary φglobal,
  $$J_{T,\text{sm}} = 1- \frac{1}{N^2} \Abs{\sum_{k=1}^{N} \tau_k}^2 = 1- \frac{1}{N^2} \sum_{k=1}^{N} \sum_{k'=1}^{N} \tau_{k'}^* \tau_{k}\,,$$
  cf. J_T_sm.
- Optimize for simultaneous state-to-state transitions, with a global phase of zero, i.e., φk = 0 for all k,
  $$J_{T,\text{re}} = 1-\frac{1}{N} \Re \left[\, \sum_{k=1}^{N} \tau_k \,\right]\,,$$
  cf. J_T_re.
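These three functionals can be written directly in terms of the overlaps τk. A minimal numpy sketch (the corresponding functions in krotov.functionals operate on states and objectives rather than on precomputed overlaps):

```python
import numpy as np

# The three final-time functionals, expressed in the complex overlaps tau_k.
def J_T_ss(tau):   # arbitrary phases phi_k
    tau = np.asarray(tau)
    return 1 - np.mean(np.abs(tau) ** 2)

def J_T_sm(tau):   # arbitrary global phase
    tau = np.asarray(tau)
    return 1 - np.abs(np.sum(tau)) ** 2 / len(tau) ** 2

def J_T_re(tau):   # global phase zero
    tau = np.asarray(tau)
    return 1 - np.mean(np.real(tau))

# Perfect transfer up to a *relative* phase between the two objectives:
# J_T_ss does not penalize it, while J_T_sm and J_T_re do.
tau = np.array([1.0, 1.0j])
```

For τ = (1, i), J_T,ss evaluates to 0, while J_T,sm and J_T,re both evaluate to 0.5, illustrating the different phase sensitivities.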
Krotov’s method is based on a rigorous examination of the conditions for calculating the updated fields ϵl(i + 1)(t) such that $J[\{\ket{\phi_k^{(i+1)}(t)}\}, \{\epsilon_l^{(i+1)}(t)\}] \leq J[\{\ket{\phi_k^{(i)}(t)}\}, \{\epsilon_l^{(i)}(t)\}]$ is true by construction Krotov.book,KonnovARC99,PalaoPRA2003,ReichJCP12,SklarzPRA2002. It achieves this by adding a vanishing quantity to the functional that disentangles the implicit dependence of the states on the controls in the variational calculus, resulting in the condition

$$\frac{\partial g_a}{\partial \epsilon}\bigg|_{(i+1)} = 2 \Im \left[\, \sum_{k=1}^{N} \left\langle \chi_k^{(i)}(t) \bigg\vert \frac{\partial \hat{H}}{\partial \epsilon}\bigg|_{(i+1)} \bigg\vert \phi_k^{(i+1)}(t) \right\rangle \right]\,.$$

For ga as in Eq. g_a, this condition becomes the Krotov update equation Tannor92,PalaoPRA2003,SklarzPRA2002,

$$\epsilon^{(i+1)}(t) = \epsilon^{(i)}(t) + \frac{S(t)}{\lambda_a} \Im \left[\, \sum_{k=1}^{N} \left\langle \chi_k^{(i)}(t) \bigg\vert \frac{\partial \hat{H}}{\partial \epsilon}\bigg|_{(i+1)} \bigg\vert \phi_k^{(i+1)}(t) \right\rangle \right]\,,$$

with the equation of motion for the forward propagation of $\ket{\phi_k^{(i+1)}(t)}$ under the optimized controls of iteration (i + 1),

$$\frac{\partial}{\partial t} \ket{\phi_k^{(i+1)}(t)} = -\frac{i}{\hbar} \hat{H}^{(i+1)}(t) \ket{\phi_k^{(i+1)}(t)}\,.$$
For the moment, we have assumed unitary dynamics; the generalization to open system dynamics will be discussed later in this section. The co-states $\ket{\chi_k^{(i)}(t)}$ are propagated backwards in time under the guess controls of iteration (i),

$$\frac{\partial}{\partial t} \ket{\chi_k^{(i)}(t)} = -\frac{i}{\hbar} \hat{H}^{\dagger\,(i)}(t) \ket{\chi_k^{(i)}(t)} + \left.\frac{\partial g_b}{\partial \bra{\phi_k}}\right|_{(i)}\,,$$

with the boundary condition

$$\ket{\chi_k^{(i)}(T)} = -\left.\frac{\partial J_T}{\partial \bra{\phi_k(T)}}\right|_{(i)}\,.$$
Here, and in the following, we have dropped the index l of the controls and the associated λa, l and Sl(t); all equations are valid for each individual control.
Frequently, the control field ϵ(t) is required to be zero at t = 0 and t = T in order to smoothly switch on and off. To ensure that the update maintains this behavior, S(t) ∈ [0, 1] is chosen as a function with those same boundary conditions. A typical example is a flattop function,

$$S(t) = \begin{cases} B(t; t_0=0, t_1=2 t_{\text{on}}) & 0 \le t < t_{\text{on}} \\ 1 & t_{\text{on}} \le t \le T - t_{\text{off}} \\ B(t; t_0=T-2 t_{\text{off}}, t_1=T) & T - t_{\text{off}} < t \le T\,, \end{cases}$$

with the blackman shape B(t; t0, t1), which is similar to a Gaussian, but exactly zero at t = t0, t1.
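Such an update shape can be sketched in a few lines of numpy. This sketch uses the standard Blackman window coefficients (0.42, 0.5, 0.08); the exact piecewise definition in krotov.shapes.flattop may differ in details:

```python
import numpy as np

def blackman(t, t0, t1):
    """Blackman shape B(t; t0, t1): similar to a Gaussian, exactly 0 at t0, t1."""
    x = (np.asarray(t) - t0) / (t1 - t0)
    B = 0.42 - 0.5 * np.cos(2 * np.pi * x) + 0.08 * np.cos(4 * np.pi * x)
    return np.where((x >= 0) & (x <= 1), B, 0.0)

def flattop(t, T, t_rise):
    """Update shape S(t): Blackman ramp on/off, flat (= 1) in between."""
    t = np.asarray(t, dtype=float)
    S = np.ones_like(t)
    ramp_up = t < t_rise
    ramp_down = t > T - t_rise
    S[ramp_up] = blackman(t[ramp_up], 0.0, 2 * t_rise)
    S[ramp_down] = blackman(t[ramp_down], T - 2 * t_rise, T)
    return S

t = np.linspace(0, 10, 101)
S = flattop(t, T=10.0, t_rise=2.0)  # zero at t=0 and t=10, one on [2, 8]
```

Multiplying the update by this S(t) guarantees that ϵ(i + 1)(t) keeps the same zero boundary values as the guess pulse.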
The scaling factor λa controls the overall magnitude of the pulse update, thereby taking the role of an (inverse) "step size". Values that are too large will change ϵ(i)(t) by only a small amount in every iteration, causing slow convergence. Values that are too small will result in numerical instability, see ChoiceOfLambdaA.
The coupled equations krotov_first_order_update–chi_boundary can be generalized to open system dynamics by replacing Hilbert space states with density matrices, Ĥ with iL̂, and inner products with Hilbert-Schmidt products, ⟨ ⋅ | ⋅ ⟩ → ⟨⟨ ⋅ | ⋅ ⟩⟩. In full generality, Ĥ in Eq. krotov_first_order_update is the operator H on the right-hand side of whatever the equation of motion for the forward propagation of the states is, written in the form iℏϕ̇ = Hϕ, cf. Eq. fw_eqm, see krotov.mu. Note also that the backward propagation Eq. bw_eqm uses the adjoint operator, which is relevant both for a dissipative Liouvillian BartanaJCP93,OhtsukiJCP99,GoerzNJP2014 and a non-Hermitian Hamiltonian MullerQIP11,GoerzQST2018.
The condition krotov_proto_update and the update Eq. krotov_first_order_update are based on a first-order expansion of the auxiliary potential Φ with respect to the states. This first order is sufficient if the equation of motion is linear (Ĥ does not depend on the states), the functional JT is convex, and no state-dependent constraints are used (gb ≡ 0). When any of these conditions are not fulfilled, it is still possible to derive an optimization algorithm with monotonic convergence via a second-order expansion KonnovARC99,ReichJCP12,

$$\frac{\partial g_a}{\partial \epsilon}\bigg|_{(i+1)} = 2 \Im \left[\, \sum_{k=1}^{N} \left\langle \chi_k^{(i)}(t) \bigg\vert \frac{\partial \hat{H}}{\partial \epsilon}\bigg|_{(i+1)} \bigg\vert \phi_k^{(i+1)}(t) \right\rangle + \frac{1}{2} \sigma(t) \left\langle \Delta\phi_k(t) \bigg\vert \frac{\partial \hat{H}}{\partial \epsilon}\bigg|_{(i+1)} \bigg\vert \phi_k^{(i+1)}(t) \right\rangle \right]\,,$$

with

$$\ket{\Delta\phi_k(t)} \equiv \ket{\phi_k^{(i+1)}(t)} - \ket{\phi_k^{(i)}(t)}\,.$$

This second term is the "non-linear" or "second order" contribution. The corresponding update equation is, assuming Eq. g_a,

$$\epsilon^{(i+1)}(t) = \epsilon^{(i)}(t) + \frac{S(t)}{\lambda_a} \Im \left[\, \sum_{k=1}^{N} \left\langle \chi_k^{(i)}(t) \bigg\vert \frac{\partial \hat{H}}{\partial \epsilon}\bigg|_{(i+1)} \bigg\vert \phi_k^{(i+1)}(t) \right\rangle + \frac{1}{2} \sigma(t) \left\langle \Delta\phi_k(t) \bigg\vert \frac{\partial \hat{H}}{\partial \epsilon}\bigg|_{(i+1)} \bigg\vert \phi_k^{(i+1)}(t) \right\rangle \right]\,.$$

The prefactor σ(t) to the second order update is a scalar function that must be chosen appropriately to ensure monotonic convergence.
As shown in Ref. ReichJCP12, it is possible to numerically approximate σ(t). In Refs. WattsPRA2015,GoerzPRA2015, non-convex final-time functionals that depend higher than quadratically on the states are considered, for a standard equation of motion given by a linear Schrödinger equation. In this case, σ(t) can be chosen as a constant,

$$\sigma(t) \equiv -\max\left(\varepsilon_A, 2A + \varepsilon_A\right)\,,$$

where εA is a small non-negative number that can be used to enforce strict inequality in the second order optimality condition. The optimal value for A in each iteration can be approximated numerically as ReichJCP12

$$A = \frac{\sum_{k=1}^{N} 2 \Re\left[\langle \chi_k(T) \vert \Delta\phi_k(T) \rangle\right] + \Delta J_T}{\sum_{k=1}^{N} \langle \Delta\phi_k(T) \vert \Delta\phi_k(T) \rangle}\,,$$

cf. krotov.second_order.numerical_estimate_A, with

$$\Delta J_T \equiv J_T(\{\phi_k^{(i+1)}(T)\}) - J_T(\{\phi_k^{(i)}(T)\})\,.$$
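The estimate for A and the resulting constant σ can be sketched directly in numpy. The helper names below are illustrative (the package provides krotov.second_order.numerical_estimate_A):

```python
import numpy as np

def estimate_A(chis_T, delta_phis_T, delta_J_T):
    """A ~ (sum_k 2 Re<chi_k(T)|Dphi_k(T)> + DJ_T) / sum_k <Dphi_k(T)|Dphi_k(T)>.

    Sketch of the numerical estimate; chis_T and delta_phis_T are lists
    of final-time co-states and state differences, one per objective.
    """
    num = sum(
        2 * np.vdot(chi, dphi).real for chi, dphi in zip(chis_T, delta_phis_T)
    ) + delta_J_T
    den = sum(np.vdot(dphi, dphi).real for dphi in delta_phis_T)
    return num / den

def sigma_const(A, eps_A=0.0):
    """sigma(t) = -max(eps_A, 2A + eps_A), constant in time."""
    return -max(eps_A, 2 * A + eps_A)

# Single objective, small final-time state difference:
A = estimate_A(
    chis_T=[np.array([1.0 + 0j, 0.0])],
    delta_phis_T=[np.array([0.1 + 0j, 0.0])],
    delta_J_T=0.0,
)  # = (2 * 0.1) / 0.01 = 20
```

A nonzero εA shifts σ away from zero and thereby enforces the strict inequality mentioned above.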
See the notebook /notebooks/07_example_PE.ipynb for an example.
Note

Even when the second order update equation is mathematically required to guarantee monotonic convergence, very often an optimization with the first-order update equation krotov_first_order_update will give converging results. Since the second order update requires more numerical resources (calculation and storage of the states Δϕk(t)), a good strategy is to attempt the optimization with the first-order update first, and to switch to the second-order update only if monotonic convergence fails in practice.
Figure (figkrotovscheme): Sequential update scheme in Krotov’s method on a time grid.
The derivation of Krotov’s method assumes time-continuous control fields; only in this case is monotonic convergence mathematically guaranteed. However, for practical numerical applications, we have to consider controls on a discrete time grid with nt points running from t = 0 to t = T, with a time step dt. The states are defined on the points of the time grid, while the controls are assumed to be constant on the intervals of the time grid. See the notebook Time Discretization in Quantum Optimal Control for details. This discretization yields the numerical scheme shown in Fig. figkrotovscheme. It proceeds as follows PalaoPRA2003:
1. Construct the states $\ket{\chi^{(i)}_k(T)}$ according to Eq. chi_boundary. These typically depend on the states $\{\ket{\phi^{(i)}_k(T)}\}$ forward-propagated under the optimized pulse from the previous iteration, that is, the guess pulse in the current iteration.
2. Perform a backward propagation using Eq. bw_eqm as the equation of motion over the entire time grid. The resulting state at each point in the time grid must be stored in memory.
3. Starting from the known initial states $\ket{\phi_k} = \ket{\phi_k(t=0)}$, calculate the pulse update for the first time step according to Eq. krotov_first_order_update, with t = dt/2 on the left-hand side (representing the first interval in the time grid, on which the control pulse is defined), and t = 0 on the right-hand side (representing the first point on the time grid). This approximation of t ≈ t + dt/2 is what constitutes the "time discretization" mathematically. It resolves the seeming contradiction in the time-continuous Eq. krotov_first_order_update that the calculation of ϵ(i + 1)(t) requires knowledge of the states $\ket{\phi_k^{(i+1)}(t)}$ obtained from a propagation under ϵ(i + 1)(t).
4. Use the updated field ϵ(i + 1)(dt/2) for the first interval to propagate $\ket{\phi_k(t=0)}$ for a single time step to $\ket{\phi_k^{(i+1)}(t=dt)}$, with Eq. fw_eqm as the equation of motion. The updates then proceed sequentially, until the final forward-propagated state $\ket{\phi^{(i+1)}_k(T)}$ is reached.
5. The updated control field becomes the guess control for the next iteration of the algorithm, starting again at step 1. The optimization continues until the value of the functional JT falls below some predefined threshold, or convergence is reached, i.e., ΔJT approaches zero so that no further significant improvement of JT is to be expected.
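The scheme above can be sketched end-to-end for a single state-to-state transfer |0⟩ → |1⟩ in a two-level system. This is a plain-numpy illustration under simplifying assumptions (ℏ = 1, S(t) ≡ 1, JT = JT,ss with a single objective, χ(T) = τ|ϕtgt⟩ up to normalization conventions, arbitrary parameter values), not the krotov package itself:

```python
import numpy as np

# Two-level system H(t) = H0 + eps(t)*H1; all parameters are illustrative.
H0 = np.array([[1, 0], [0, -1]], dtype=complex)   # drift (sigma_z)
H1 = np.array([[0, 1], [1, 0]], dtype=complex)    # control operator mu (sigma_x)
phi0 = np.array([1, 0], dtype=complex)            # initial state |0>
phi_tgt = np.array([0, 1], dtype=complex)         # target state |1>

T, nt = 5.0, 201
dt = T / (nt - 1)
lambda_a = 5.0
S = np.ones(nt - 1)             # update shape (boundary ramps omitted here)
eps = 0.5 * np.ones(nt - 1)     # guess pulse, constant on each interval

def U(eps_n):
    # Exact one-step propagator exp(-i H dt) for the traceless 2x2 Hamiltonian
    # H = sigma_z + eps*sigma_x, using H^2 = (1 + eps^2) * identity.
    H = H0 + eps_n * H1
    omega = np.sqrt(1 + eps_n ** 2)
    return np.cos(omega * dt) * np.eye(2) - 1j * np.sin(omega * dt) * H / omega

J_values = []                   # J_T_ss = 1 - |<phi_tgt|phi(T)>|^2 per iteration
for iteration in range(10):
    # forward propagation under the current guess pulse
    phi = phi0
    for n in range(nt - 1):
        phi = U(eps[n]) @ phi
    J_values.append(1 - abs(np.vdot(phi_tgt, phi)) ** 2)
    # steps 1-2: boundary co-state chi(T) = tau * |phi_tgt> (for J_T_ss),
    # backward-propagated under the guess pulse and stored on the full grid
    chis = [None] * nt
    chis[-1] = np.vdot(phi_tgt, phi) * phi_tgt
    for n in range(nt - 2, -1, -1):
        chis[n] = U(eps[n]).conj().T @ chis[n + 1]
    # steps 3-4: sequential sweep, updating each interval before propagating
    phi = phi0
    for n in range(nt - 1):
        eps[n] += (S[n] / lambda_a) * np.imag(np.vdot(chis[n], H1 @ phi))
        phi = U(eps[n]) @ phi
    # step 5: the updated eps is the guess pulse for the next iteration
```

With a sufficiently fine time grid and a moderate λa, the recorded J_values decrease monotonically from iteration to iteration, as the method promises.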
For multiple objectives, the scheme can run in parallel, and each objective contributes a term to the update. Summation of these terms yields the sum in Eq. krotov_first_order_update. See krotov.parallelization for details. For a second-order update, the forward propagated states from step 4, both for the current iteration and the previous iteration, must be stored in memory over the entire time grid.
The monotonic convergence of Krotov's method is only guaranteed in the continuous limit; a coarse time step must be compensated by larger values of the inverse step size λa, slowing down convergence. Generally, choosing λa too small will lead to numerical instabilities and unphysical features in the optimized pulse. A lower limit for λa can be determined from the requirement that the change Δϵ(t) should be at most of the same order of magnitude as the guess pulse ϵ(i)(t) for that iteration. The Cauchy-Schwarz inequality applied to the update equation yields the estimate

$$\Norm{\Delta\epsilon(t)}_{\infty} \le \frac{\Norm{S(t)}_{\infty}}{\lambda_a} \sum_{k} \Norm{\chi_k^{(i)}(t)} \, \Norm{\frac{\partial \hat{H}}{\partial \epsilon}} \, \Norm{\phi_k^{(i+1)}(t)} \overset{!}{\lesssim} \Norm{\epsilon^{(i)}(t)}_{\infty}\,,$$

where ‖ ⋅ ‖∞ denotes the maximum amplitude over the time grid, and ‖∂Ĥ/∂ϵ‖ the norm of the control operator. Solving this inequality for λa gives its lower limit.
From a practical point of view, the best strategy is to start the optimization with a comparatively large value of λa, and after a few iterations lower λa as far as possible without introducing numerical instabilities. The value of λa may be adjusted dynamically with respect to the rate of convergence. Generally, the optimal choice of λa requires some trial and error.
When using the rotating wave approximation (RWA), it is important to remember that the target states are usually defined in the lab frame, not in the rotating frame. This is relevant for the construction of the optimization functional JT: the easiest approach is to transform the target states to the rotating frame before the optimization. Another solution would be to transform the result of the forward propagation $\ket{\phi_k(T)}$ from the rotating frame back to the lab frame before calculating the overlap with the target states.
When the RWA is used, the control fields are complex-valued. In this case, the Krotov update equation is valid for both the real and the imaginary part independently. The most straightforward implementation of the method is for real controls only, requiring that any complex control Hamiltonian is rewritten as two independent control Hamiltonians, one for the real part and one for the imaginary part of the control field. For example,

$$\epsilon^*(t) \hat{a} + \epsilon(t) \hat{a}^\dagger = \epsilon_{\text{re}}(t) \left(\hat{a} + \hat{a}^\dagger\right) + \epsilon_{\text{im}}(t) \left(i \hat{a}^\dagger - i \hat{a}\right)$$

with two independent control fields ϵre(t) = ℜ[ϵ(t)] and ϵim(t) = ℑ[ϵ(t)].
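This splitting is a simple operator identity, which can be verified numerically. The truncated ladder operator and the control value below are arbitrary illustration choices:

```python
import numpy as np

# Verify that a complex control coupling eps*(t) a + eps(t) a^dag splits into
# two *real* controls, each coupling to a Hermitian control operator.
N = 4
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # truncated annihilation operator
ad = a.conj().T

eps = 0.3 - 0.7j                              # arbitrary complex control value
H_complex = np.conj(eps) * a + eps * ad

eps_re, eps_im = eps.real, eps.imag
H1 = a + ad                                   # couples to eps_re
H2 = 1j * (ad - a)                            # couples to eps_im
H_split = eps_re * H1 + eps_im * H2
```

Both H1 and H2 are Hermitian, so the real fields ϵre(t) and ϵim(t) can be optimized independently with the standard (real-valued) update equation.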
See the notebook /notebooks/02_example_lambda_system_rwa_complex_pulse.ipynb for an example.
The control equations have been written in the notation of Hilbert space. However, they are equally valid for a gate optimization in Liouville space, by replacing Hilbert space states with density matrices, Ĥ with iL̂ (cf. krotov.mu), and inner products with Hilbert-Schmidt products, ⟨ ⋅ | ⋅ ⟩ → ⟨⟨ ⋅ | ⋅ ⟩⟩, cf., e.g., Ref. GoerzNJP2014.
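The Hilbert-Schmidt product ⟨⟨A|B⟩⟩ = tr[A†B] is straightforward to evaluate; for pure states it reduces to the squared Hilbert space overlap, which illustrates the correspondence between the two formulations:

```python
import numpy as np

def hs_product(A, B):
    """Hilbert-Schmidt product <<A|B>> = tr(A^dag B) of two operators."""
    return np.trace(A.conj().T @ B)

# For pure states rho_k = |phi_k><phi_k|, the Hilbert-Schmidt product
# reduces to |<phi_1|phi_2>|^2.
phi1 = np.array([1, 0], dtype=complex)
phi2 = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho1 = np.outer(phi1, phi1.conj())
rho2 = np.outer(phi2, phi2.conj())

overlap = hs_product(rho1, rho2)   # = |<phi1|phi2>|^2 = 0.5
```

Note that ⟨⟨ρ1|ρ2⟩⟩ for Hermitian ρ1, ρ2 is real, consistent with its use in the real-valued functionals above.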
See the /notebooks/04_example_dissipative_qubit_reset.ipynb
for an example.