# Double-bracket Gradient Descent Strategies
This notebook demonstrates gradient descent strategies for double-bracket rotations. The method numerically estimates the gradient of the cost function with respect to the diagonal operator and uses it to vary the diagonal operator of the rotation.
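
To make the idea concrete, here is a minimal NumPy sketch of a finite-difference gradient with respect to the diagonal entries of $D$. It is not the code behind `gradient_descent` in qibo. The helper names (`off_diagonal_norm`, `dbr_step`, `numerical_gradient`), the sign convention $e^{sW} H e^{-sW}$ with $W=[D,H]$, the fixed step `s`, and the learning rate are all assumptions made for illustration.

```python
import numpy as np

def off_diagonal_norm(h):
    """Hilbert-Schmidt norm of the off-diagonal part of h."""
    return np.linalg.norm(h - np.diag(np.diag(h)))

def dbr_step(h, d, s):
    """One rotation e^{sW} h e^{-sW} with W = [d, h] (assumed sign convention)."""
    w = d @ h - h @ d                       # anti-Hermitian when d and h are Hermitian
    evals, evecs = np.linalg.eigh(1j * w)   # 1j * w is Hermitian, so eigh applies
    u = (evecs * np.exp(-1j * s * evals)) @ evecs.conj().T  # u = exp(s * w)
    return u @ h @ u.conj().T

def numerical_gradient(h, params, s, delta=1e-4):
    """Forward-difference gradient of the cost w.r.t. the diagonal entries of D."""
    base = off_diagonal_norm(dbr_step(h, np.diag(params), s))
    grad = np.zeros(len(params))
    for i in range(len(params)):
        shifted = params.copy()
        shifted[i] += delta
        cost = off_diagonal_norm(dbr_step(h, np.diag(shifted), s))
        grad[i] = (cost - base) / delta
    return grad

# toy Hermitian matrix on 2 qubits (illustrative data only)
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
h = (a + a.conj().T) / 2

params = np.real(np.diag(h)).copy()     # start from the diagonal of h
grad = numerical_gradient(h, params, s=0.01)
params_new = params - 0.1 * grad        # one plain gradient-descent update of D
```

Each gradient entry costs one extra evaluation of the rotated Hamiltonian, so the cost of a gradient grows with the number of parameters used to describe $D$.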

Finding the gradient requires the parameterization of the diagonal operator, and there are two ways of doing so:

1. Pauli-basis: $D(B,J)= \sum B_i Z_i + \sum J_{ij}Z_iZ_j + ...$
2. Computational-basis: $D(A)=\sum A_i|i\rangle\langle i|$
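
The two parameterizations can be sketched in plain NumPy as follows. This is only an illustration on 2 qubits: the helpers `z_string_diagonal` and `pauli_diagonal` and the coefficient values are hypothetical, and the qubit ordering of the Kronecker product is an assumption that may differ from the one used by `generate_pauli_operator_dict`.

```python
import numpy as np
from functools import reduce

Z_DIAG = np.array([1.0, -1.0])   # diagonal of Pauli Z
ONES = np.ones(2)                # diagonal of the identity

def z_string_diagonal(nqubits, qubits):
    """Diagonal of the product of Z operators acting on the given qubits."""
    factors = [Z_DIAG if q in qubits else ONES for q in range(nqubits)]
    return reduce(np.kron, factors)

def pauli_diagonal(b, j, nqubits):
    """Diagonal of D(B, J) = sum_i B_i Z_i + sum_{i<j} J_ij Z_i Z_j."""
    diag = np.zeros(2**nqubits)
    for i, b_i in enumerate(b):
        diag += b_i * z_string_diagonal(nqubits, (i,))
    for pair, j_ij in j.items():
        diag += j_ij * z_string_diagonal(nqubits, pair)
    return diag

nqubits = 2
b = [1.0, 2.0]          # single-qubit coefficients B_i
j = {(0, 1): 0.5}       # two-qubit coefficients J_ij

# 1. Pauli basis: a few coefficients fix all 2**nqubits diagonal entries
d_pauli_diag = pauli_diagonal(b, j, nqubits)

# 2. Computational basis: one coefficient A_i per basis state |i><i|
d_computational = np.diag(d_pauli_diag)
```

Truncating the Pauli expansion at first order leaves only `nqubits` parameters, whereas the computational basis always has `2**nqubits`.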

In [None]:
from qibo.models.dbi.double_bracket import DoubleBracketIteration, DoubleBracketGeneratorType, DoubleBracketScheduling, DoubleBracketCostFunction
from qibo.models.dbi.utils import generate_pauli_operator_dict, decompose_into_pauli_basis, params_to_diagonal_operator, ParameterizationTypes
from copy import deepcopy
from qibo.models.dbi.utils_dbr_strategies import gradient_descent
import numpy as np
from qibo import set_backend, hamiltonians
from qibo.hamiltonians import Hamiltonian
from qibo.quantum_info import random_hermitian
import matplotlib.pyplot as plt

In [None]:
def visualize_matrix(matrix, title=""):
    """Visualize hamiltonian in a heatmap form."""
    fig, ax = plt.subplots(figsize=(5,5))
    ax.set_title(title)
    try:
        im = ax.imshow(np.absolute(matrix), cmap="inferno")
    except TypeError:
        im = ax.imshow(np.absolute(matrix.get()), cmap="inferno")
    fig.colorbar(im, ax=ax)

def s_hist_to_plot(s_hist):
    # convert the list of step durations into cumulative flow durations for plotting
    s_plot = [0] * len(s_hist)
    for i in range(len(s_hist)):
        if i != 0:
            s_plot[i] = s_plot[i-1] + s_hist[i]
    return s_plot
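
As a quick illustration of what `s_hist_to_plot` returns: it is a running sum with the first point pinned at 0, so when the step history already starts with 0 it matches `np.cumsum`. The step values below are hypothetical, and whether `gradient_descent` returns a history starting with 0 is an assumption here.

```python
import numpy as np

# hypothetical step durations; the first entry stands for the initial state
s_hist = [0, 0.1, 0.2, 0.05]

# total flow duration reached after each step
s_plot = np.cumsum(s_hist)
```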

# Random Hamiltonian

In [None]:
# set the qibo backend (we suggest qibojit if N >= 20)
set_backend("qibojit", platform="numba")

# hamiltonian parameters
nqubits = 5
seed = 10

# define the hamiltonian
h0 = random_hermitian(2**nqubits, seed=seed)
dbi = DoubleBracketIteration(
    Hamiltonian(nqubits, h0),
    mode=DoubleBracketGeneratorType.single_commutator,
    scheduling=DoubleBracketScheduling.hyperopt,
    cost=DoubleBracketCostFunction.off_diagonal_norm
)
# visualize the matrix
visualize_matrix(dbi.h.matrix, title="Target hamiltonian")

Then we set up the required parameters for gradient descent.

In [None]:
# Pauli-basis
pauli_operator_dict = generate_pauli_operator_dict(nqubits)
pauli_operators = list(pauli_operator_dict.values())
# let the initial d be a first-order Pauli approximation of $\Delta(H)$
d_coef_pauli = decompose_into_pauli_basis(dbi.diagonal_h_matrix, pauli_operators=pauli_operators)
d_pauli = sum([d_coef_pauli[i]*pauli_operators[i] for i in range(nqubits)])

# Computational basis
d_coef_computational_partial = d_pauli.diagonal()
d_coef_computational_full = dbi.diagonal_h_matrix.diagonal()
d_computational_partial = params_to_diagonal_operator(d_coef_computational_partial, nqubits, ParameterizationTypes.computational, normalize=False)
d_computational_full = params_to_diagonal_operator(d_coef_computational_full, nqubits, ParameterizationTypes.computational, normalize=False)

plt.plot(d_coef_computational_partial, label="computational basis partial")
plt.plot(d_coef_computational_full, label=r"computational basis full = $\Delta(H)$")
plt.legend()
plt.title(r"Diagonal entries of $D$")


Now we want to compare 3 scenarios:

1. Pauli-basis: an approximation to the diagonal of $H$
2. Computational-partial: same as 1. in the computational basis.
3. Computational-full: a full parameterization of the diagonal of $H$ in the computational basis.
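
The difference between the partial and the full starting points can be seen in a small NumPy sketch. It projects a diagonal onto the single-qubit $Z_i$ operators and rebuilds the "partial" diagonal from those coefficients. The normalization by $2^n$ and the helper `z_diagonal` are assumptions for illustration, not necessarily what `decompose_into_pauli_basis` does internally, and the random diagonal merely stands in for $\Delta(H)$.

```python
import numpy as np
from functools import reduce

def z_diagonal(nqubits, qubit):
    """Diagonal of Z acting on one qubit and the identity elsewhere."""
    factors = [np.array([1.0, -1.0]) if q == qubit else np.ones(2) for q in range(nqubits)]
    return reduce(np.kron, factors)

nqubits = 3
rng = np.random.default_rng(1)
full_diag = rng.normal(size=2**nqubits)   # stand-in for the diagonal of H

z_diags = [z_diagonal(nqubits, q) for q in range(nqubits)]

# coefficient of Z_i: Tr(Z_i Delta(H)) / 2^n, written for diagonal matrices
coefs = [np.dot(z, full_diag) / 2**nqubits for z in z_diags]

# "partial" diagonal: only the first-order Pauli terms are kept
partial_diag = sum(c * z for c, z in zip(coefs, z_diags))

# what the first-order approximation misses (identity and higher-order terms)
residual = full_diag - partial_diag
```

Scenarios 1 and 2 start from `partial_diag` and differ only in how many parameters gradient descent may vary afterwards (`nqubits` versus `2**nqubits`), while scenario 3 starts from `full_diag`.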

In [None]:
# 1. Pauli-basis
NSTEPS = 5
dbi_pauli = deepcopy(dbi)
loss_hist_pauli, d_params_hist_pauli, s_hist_pauli = gradient_descent(dbi_pauli, NSTEPS, d_coef_pauli, ParameterizationTypes.pauli, pauli_operator_dict=pauli_operator_dict)

In [None]:
# 2. Computational_partial
dbi_computational_partial = deepcopy(dbi)
loss_hist_computational_partial, d_params_hist_computational_partial, s_computational_partial = gradient_descent(dbi_computational_partial, NSTEPS, d_coef_computational_partial, ParameterizationTypes.computational)

In [None]:
# 3. Computational_full
dbi_computational_full = deepcopy(dbi)
loss_hist_computational_full, d_params_hist_computational_full, s_computational_full = gradient_descent(dbi_computational_full, NSTEPS, d_coef_computational_full, ParameterizationTypes.computational)

In [None]:
s_plot_pauli = s_hist_to_plot(s_hist_pauli)
s_plot_computational_partial = s_hist_to_plot(s_computational_partial)
s_plot_computational_full = s_hist_to_plot(s_computational_full)

In [None]:
plt.plot(s_plot_pauli, loss_hist_pauli, label="pauli basis", marker="o")
plt.plot(s_plot_computational_partial, loss_hist_computational_partial, label="computational partial", marker="o")
plt.plot(s_plot_computational_full, loss_hist_computational_full, label="computational full", marker="o")
plt.legend()
plt.title("Off-diagonal norm")
plt.ylabel(r"$||\sigma(H)||_{HS}$")
plt.xlabel("s")


# TFIM

In [None]:
# hamiltonian parameters
nqubits = 5
h_field = 3  # transverse field strength

# define the hamiltonian
h = hamiltonians.TFIM(nqubits=nqubits, h=h_field)
dbi = DoubleBracketIteration(
    h,
    mode=DoubleBracketGeneratorType.single_commutator,
    scheduling=DoubleBracketScheduling.hyperopt
)
# visualize the matrix
visualize_matrix(dbi.h.matrix, title="Target hamiltonian")

In [None]:
# Pauli-basis
pauli_operator_dict = generate_pauli_operator_dict(nqubits)
pauli_operators = list(pauli_operator_dict.values())
# let the initial d be a first-order Pauli approximation of $\Delta(H)$
d_coef_pauli = decompose_into_pauli_basis(dbi.diagonal_h_matrix, pauli_operators=pauli_operators)
d_pauli = sum([d_coef_pauli[i]*pauli_operators[i] for i in range(nqubits)])

# Computational basis
d_coef_computational_partial = d_pauli.diagonal()
d_coef_computational_full = dbi.diagonal_h_matrix.diagonal()
d_computational_partial = params_to_diagonal_operator(d_coef_computational_partial, nqubits, ParameterizationTypes.computational, normalize=False)
d_computational_full = params_to_diagonal_operator(d_coef_computational_full, nqubits, ParameterizationTypes.computational, normalize=False)

plt.plot(d_coef_computational_partial, label="computational basis partial")
plt.plot(d_coef_computational_full, label=r"computational basis full = $\Delta(H)$")
plt.legend()
plt.title(r"Diagonal entries of $D$")


In [None]:
# 1. Pauli-basis
NSTEPS = 3
dbi_pauli = deepcopy(dbi)
loss_hist_pauli, d_params_hist_pauli, s_hist_pauli = gradient_descent(dbi_pauli, NSTEPS, d_coef_pauli, ParameterizationTypes.pauli, pauli_operator_dict=pauli_operator_dict)

In [None]:
# 2. Computational_partial
dbi_computational_partial = deepcopy(dbi)
loss_hist_computational_partial, d_params_hist_computational_partial, s_computational_partial = gradient_descent(dbi_computational_partial, NSTEPS, d_coef_computational_partial, ParameterizationTypes.computational)

In [None]:
# 3. Computational_full
dbi_computational_full = deepcopy(dbi)
loss_hist_computational_full, d_params_hist_computational_full, s_computational_full = gradient_descent(dbi_computational_full, NSTEPS, d_coef_computational_full, ParameterizationTypes.computational)

In [None]:
s_plot_pauli = s_hist_to_plot(s_hist_pauli)
s_plot_computational_partial = s_hist_to_plot(s_computational_partial)
s_plot_computational_full = s_hist_to_plot(s_computational_full)

In [None]:
plt.plot(s_plot_pauli, loss_hist_pauli, label="pauli basis", marker="o")
plt.plot(s_plot_computational_partial, loss_hist_computational_partial, label="computational partial", marker="o")
plt.plot(s_plot_computational_full, loss_hist_computational_full, label="computational full", marker="o")
plt.legend()
plt.title("Off-diagonal norm")
plt.ylabel(r"$||\sigma(H)||_{HS}$")
plt.xlabel("s")


Across the cost functions and scheduling methods we tried, the Pauli-basis parameterization quite consistently diagonalizes the Hamiltonian best. The computational-partial parameterization, which starts from the same initial operator as the Pauli one, performs very similarly for the first few iterations and diverges later on.

In [None]:
nqubits = 3
pauli_operator_dict = generate_pauli_operator_dict(
    nqubits, parameterization_order=1
)
params = [1, 2, 3]
operator_pauli = sum([
    params[i] * list(pauli_operator_dict.values())[i] for i in range(nqubits)
])
assert (
    operator_pauli
    == params_to_diagonal_operator(
        params, nqubits=nqubits, parameterization=ParameterizationTypes.pauli, pauli_operator_dict=pauli_operator_dict
    )
).all()
operator_element = params_to_diagonal_operator(
    params, nqubits=nqubits, parameterization=ParameterizationTypes.computational
)
assert (operator_element.diagonal() == params).all()