# Delta Estimation through an Ancilla system

@anglin2000quantumjosephsonhamiltonianphase is relevant-ish to this page.

We have a system described by:
- $H_S = -J_S J_x \otimes \mathbb{1} + \delta_S J_z \otimes \mathbb{1} + U_S J_z^2 \otimes \mathbb{1}$ describe's our system's @hamiltonian.
- $H_A = -J_A \mathbb{1} \otimes J_x + \delta_A \mathbb{1} \otimes J_z + U_A \mathbb{1} \otimes J_z^2$ describe's our ancilla's @hamiltonian.
- $H_{int} = \alpha_{xx} J_x \otimes J_x + \alpha_{xz} J_x \otimes J_z + \alpha_{zx} J_z \otimes J_x + \alpha_{zz} J_z \otimes J_z$ describes the interaction terms.
- $\rho = (\ket{0} \otimes \ket{\alpha} ) ( \bra{\alpha} \otimes \bra{0} )$ is the initial state. We choose form to allow the ancilla state $\ket{\alpha}$ to be controlled, while keeping it a pure state.

After letting the system evolve, we can measure the state of $H_S$ to inform ourselves about the underlying parameter $\delta_S$ guiding the system. In particular, we can quantify our uncertainty of $\delta_S$ through the ratio $\frac{\Delta A}{\frac{\partial}{\partial \delta} \braket{A}}$, where $A:= e^{-iHt} \rho e^{iHt}$ describes our observable. In our setup, we will consider:
1. Evolving only the system, and measure our uncertainty of $\delta_S$.
2. Evolving both the system and ancilla coupled, optimizing the coupling to improve our information $\delta_S$. As a first trial, we assume the system and ancilla have the same parameters

In [1]:
from matplotlib import pyplot as plt
from tqdm import tqdm
import numpy as np
import torch

plt.figure(figsize=(20,3))

# Setup GPU use
# https://github.com/pytorch/tutorials/issues/3263#issue-2811049983

device_name = ""


if torch.cuda.is_available():
    device = torch.device("cuda")
    device_name = torch.cuda.get_device_name(0)
else:
    device = torch.device("cpu")
    # device_name = torch.cpu ... # TODO: Find the cpu device name

torch.set_default_device(device)
print(f"Using {device} device: {device_name}")

Using cuda device: NVIDIA GeForce RTX 3050 Ti Laptop GPU


<Figure size 2000x300 with 0 Axes>

We are able to ignore the contributions of $U_S$ and $U_A$ in the 2 dimentional setup below, as in 2 dimentions $J_z^2 = \mathbb{1}$, which has no impact on the evolution of the system

In [2]:
def generate_hamiltonian(
    j_s: torch.Tensor = torch.Tensor([0]),
    delta_s: torch.Tensor = torch.Tensor([0]),
    j_a: torch.Tensor = torch.Tensor([0]),
    delta_a: torch.Tensor = torch.Tensor([0]),
    alpha_xx: torch.Tensor = torch.Tensor([0]),
    alpha_xz: torch.Tensor = torch.Tensor([0]),
    alpha_zx: torch.Tensor = torch.Tensor([0]),
    alpha_zz: torch.Tensor = torch.Tensor([0]),
) -> torch.Tensor:
    j_x = torch.tensor([[0,1], [1,0]], requires_grad=False)/2
    j_z = torch.tensor([[1,0], [0,-1]], requires_grad=False)/2


    hamiltonian_system = torch.kron(j_x, torch.eye(2, requires_grad=False))  * j_s.cuda() +\
                         torch.kron(j_z, torch.eye(2, requires_grad=False)) * delta_s.cuda()

    hamiltonian_ancilla = torch.kron(torch.eye(2, requires_grad=False), j_x)  * j_a.cuda() +\
                         torch.kron(torch.eye(2, requires_grad=False), j_z) * delta_a.cuda()

    hamiltonian_interactions = torch.kron(j_x, j_x) * alpha_xx.cuda() +\
                               torch.kron(j_x, j_z) * alpha_xz.cuda() +\
                               torch.kron(j_z, j_x) * alpha_zx.cuda() +\
                               torch.kron(j_z, j_z) * alpha_zz.cuda()


    return hamiltonian_system + hamiltonian_ancilla + hamiltonian_interactions

### Sanity check: Generated Hamiltonian

We want to ensure that our generated hamiltonian matches the expected values

In [3]:
generate_hamiltonian(j_s = torch.Tensor([1])).cpu()

tensor([[0.0000, 0.0000, 0.5000, 0.0000],
        [0.0000, 0.0000, 0.0000, 0.5000],
        [0.5000, 0.0000, 0.0000, 0.0000],
        [0.0000, 0.5000, 0.0000, 0.0000]])

In [4]:
generate_hamiltonian(delta_s = torch.Tensor([1])).cpu()

tensor([[ 0.5000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.5000,  0.0000,  0.0000],
        [ 0.0000,  0.0000, -0.5000,  0.0000],
        [ 0.0000,  0.0000,  0.0000, -0.5000]])

In [5]:
generate_hamiltonian(j_a = torch.Tensor([1])).cpu()

tensor([[0.0000, 0.5000, 0.0000, 0.0000],
        [0.5000, 0.0000, 0.0000, 0.0000],
        [0.0000, 0.0000, 0.0000, 0.5000],
        [0.0000, 0.0000, 0.5000, 0.0000]])

In [6]:
generate_hamiltonian(delta_a = torch.Tensor([1])).cpu()

tensor([[ 0.5000,  0.0000,  0.0000,  0.0000],
        [ 0.0000, -0.5000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.5000,  0.0000],
        [ 0.0000,  0.0000,  0.0000, -0.5000]])

### As a Neural Network Layer

We write the above as a Layer that represents the evolution of the system

In [7]:
class SingleSystem(torch.nn.Module):
    def __init__(
            self,
            j_s: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            delta_s: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            j_a: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            delta_a: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            alpha_xx: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            alpha_xz: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            alpha_zx: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            alpha_zz: torch.Tensor = torch.nn.Parameter(torch.Tensor([0])),
            time: torch.Tensor = torch.nn.Parameter(torch.Tensor([0]))
    ):
        super().__init__()
        self.n_s = 2 # System size
        self.n_a = 2 # Ancilla size
        self.j_s = j_s
        self.delta_s = delta_s
        self.j_a = j_a
        self.delta_a = delta_a
        self.alpha_xx = alpha_xx
        self.alpha_xz = alpha_xz
        self.alpha_zx = alpha_zx
        self.alpha_zz = alpha_zz
        self.time = time

    def forward(self, x): # x is our quantum state
        with torch.device(device):
            j_x = torch.tensor([[0,1], [1,0]], requires_grad=False)/2
            j_z = torch.tensor([[1,0], [0,-1]], requires_grad=False)/2

            hamiltonian_system = torch.kron(j_x, torch.eye(2, requires_grad=False))  * self.j_s +\
                                 torch.kron(j_z, torch.eye(2, requires_grad=False)) * self.delta_s

            hamiltonian_ancilla = torch.kron(torch.eye(2, requires_grad=False), j_x)  * self.j_a +\
                                 torch.kron(torch.eye(2, requires_grad=False), j_z) * self.delta_a

            hamiltonian_interactions = torch.kron(j_x, j_x) * self.alpha_xx +\
                                       torch.kron(j_x, j_z) * self.alpha_xz +\
                                       torch.kron(j_z, j_x) * self.alpha_zx +\
                                       torch.kron(j_z, j_z) * self.alpha_zz

            hamiltonian = hamiltonian_system + hamiltonian_ancilla + hamiltonian_interactions

            left_operator = torch.linalg.matrix_exp(hamiltonian * self.time * torch.tensor([-1 * 1j]))
            right_operator = torch.linalg.matrix_exp(hamiltonian * self.time * torch.tensor([1 * 1j]))
            final_state = torch.matmul(
                left_operator,
                torch.matmul(
                    x.type(torch.complex64),
                    right_operator
                )
            )
            traced_state = torch.einsum(
                'ijkl->ik',
                final_state.view(self.n_s, self.n_a, self.n_s, self.n_a)
            )
        return traced_state

    def string(self):
        return f"e^(iH*{self.time:.3f}), V(x) = .5 * ( {self.omega:.2f}^2 (x - {self.center:.2f})^2)"

### Sanity check: Tracing function

We attempt a trivial partial trace to ensure our usage of `pytorch.view` and `pytorch.einsum` correctly replicates a partial trace

In [8]:
torch.einsum(
    'ijkl->ik',
    generate_hamiltonian(
        j_s=torch.Tensor([2]),
        delta_s=torch.Tensor([3])
    ).view(2,2,2,2)
).cpu()

tensor([[ 3.,  2.],
        [ 2., -3.]])

# Setup optimization routine

1. Setup the optimizer
2. Build the loss function ( $\frac{\Delta A}{\frac{\partial}{\partial \delta} \braket{A}}$ )
3. Optimize away :)

We numerically approximate $\frac{\partial}{\partial \delta} \braket{A}$ as the derivative isn't analytically available with the current setup. As such we approximate it as $\frac{\braket{A_{\delta_S + \epsilon}} - \braket{A_{\delta_S - \epsilon}}}{2 \epsilon}$, giving us the rsulting formula $2 \epsilon \frac{\braket{A_{\delta_S + \epsilon}} - \braket{A_{\delta_S - \epsilon}}}{\Delta A}$. Note that our observable is given by $\rho_t J_z$, as we masure the angular momentum $J_z$ of the final state $\rho_t$.

In [9]:
class SensitivityNetwork(torch.nn.Module):
    def __init__(
            self,
            j_s: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            delta_s: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            j_a: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            delta_a: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            alpha_xx: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            alpha_xz: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            alpha_zx: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            alpha_zz: torch.Tensor = torch.nn.Parameter(torch.rand([1])),
            time: torch.Tensor = torch.nn.Parameter(torch.Tensor([10]))
    ):
        super().__init__()
        self.n_s = 2 # System size
        self.n_a = 2 # Ancilla size
        self.epsilon = 1e-4
        self.j_s = j_s
        self.delta_s = delta_s
        self.j_a = j_a
        self.delta_a = delta_a
        self.alpha_xx = alpha_xx
        self.alpha_xz = alpha_xz
        self.alpha_zx = alpha_zx
        self.alpha_zz = alpha_zz
        self.time = time
        self.ancilla_state: torch.Tensor = torch.nn.Parameter(torch.Tensor([[1,0],[0,0]])) # Initialize as |0> state

    def forward(self): # x is our quantum state
        initial_state = torch.kron(
            torch.tensor([[1,0],[0,0]]),
            self.ancilla_state
        )

        final_state_less = SingleSystem(
            j_s=self.j_s,
            delta_s=self.delta_s - self.epsilon,
            j_a=self.j_a,
            delta_a=self.delta_a,
            alpha_xx=self.alpha_xx,
            alpha_xz=self.alpha_xz,
            alpha_zx=self.alpha_zx,
            alpha_zz=self.alpha_zz,
            time=self.time,
        )(initial_state)
        final_state = SingleSystem(
            j_s=self.j_s,
            delta_s=self.delta_s,
            j_a=self.j_a,
            delta_a=self.delta_a,
            alpha_xx=self.alpha_xx,
            alpha_xz=self.alpha_xz,
            alpha_zx=self.alpha_zx,
            alpha_zz=self.alpha_zz,
            time=self.time,
        )(initial_state)
        final_state_more = SingleSystem(
            j_s=self.j_s,
            delta_s=self.delta_s + self.epsilon,
            j_a=self.j_a,
            delta_a=self.delta_a,
            alpha_xx=self.alpha_xx,
            alpha_xz=self.alpha_xz,
            alpha_zx=self.alpha_zx,
            alpha_zz=self.alpha_zz,
            time=self.time,
        )(initial_state)

        j_z = (torch.tensor([[1,0], [0,-1]], requires_grad=False)/2).type(torch.complex64)
        expectation_more = torch.trace(torch.matmul(final_state_more, j_z))
        expectation_less = torch.trace(torch.matmul(final_state_less, j_z))
        observable = torch.matmul(final_state, j_z)
        variance = torch.trace(torch.matmul(observable, observable)) - torch.trace(observable)**2

        # Below we have the inverse of our target, chosen to minimize our loss

        return torch.tensor([
            (expectation_more - expectation_less).real,
            (2 * self.epsilon * variance).real,
            variance.real,
            observable[0][0],
            observable[0][1],
            observable[1][0],
            observable[1][1],
        ], requires_grad=True)

With everything prepared, we run the optimization loop

In [10]:
model = SensitivityNetwork().cuda()

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-3, #TODO: Optimize the learning rate https://discuss.pytorch.org/t/get-the-best-learning-rate-automatically/58269/4
)

for t in tqdm(range(1000)):

    output = model()
    sensitivity = torch.div(1, output[1]).real
    optimizer.zero_grad()
    sensitivity.backward()
    optimizer.step()

100%|██████████| 1000/1000 [00:09<00:00, 107.30it/s]


In [11]:
# model = SensitivityNetwork().cuda()
#
# optimizer = torch.optim.Adam(
#     model.parameters(),
#     lr=1e-3, #TODO: Optimize the learning rate https://discuss.pytorch.org/t/get-the-best-learning-rate-automatically/58269/4
# )

# output = model()
# sensitivity = torch.div(1, output[1]).real
#
# sensitivity

In [12]:
for name, param in model.named_parameters():
    print(f"{name}: {param.data}")

j_s: tensor([0.9689], device='cuda:0')
delta_s: tensor([0.4526], device='cuda:0')
j_a: tensor([0.4923], device='cuda:0')
delta_a: tensor([0.6583], device='cuda:0')
alpha_xx: tensor([0.9528], device='cuda:0')
alpha_xz: tensor([0.3967], device='cuda:0')
alpha_zx: tensor([0.6671], device='cuda:0')
alpha_zz: tensor([0.1065], device='cuda:0')
time: tensor([10.], device='cuda:0')
ancilla_state: tensor([[1., 0.],
        [0., 0.]], device='cuda:0')


In [13]:
sensitivity

tensor(-6.7109e+11, device='cuda:0', grad_fn=<SelectBackward0>)

In [14]:
output

tensor([ 8.0615e-05+0.0000e+00j, -1.4901e-12+0.0000e+00j,
        -7.4506e-09+0.0000e+00j,  1.9031e-01-1.1548e-07j,
        -3.0608e-01-5.0155e-02j,  3.0608e-01-5.0155e-02j,
        -5.0550e-01+3.7253e-08j], device='cuda:0', requires_grad=True)

In [15]:
obs = torch.tensor([[3.6754e-01-2.9802e-08j, -2.6959e-01-1.2645e-01j], [2.6959e-01-1.2645e-01j, -2.4125e-01-3.9116e-08j]])
obs.cpu()

tensor([[ 0.3675-2.9802e-08j, -0.2696-1.2645e-01j],
        [ 0.2696-1.2645e-01j, -0.2412-3.9116e-08j]])

In [16]:
torch.matmul(obs, obs).cpu()

tensor([[ 0.0464-2.2161e-08j, -0.0340-1.5969e-02j],
        [ 0.0340-1.5969e-02j, -0.0305+1.9064e-08j]])

In [17]:
torch.trace(torch.matmul(obs, obs))

tensor(0.0160-3.0973e-09j, device='cuda:0')

In [18]:
torch.trace(obs)**2

tensor(0.0159-1.7407e-08j, device='cuda:0')

In [19]:
torch.trace(torch.matmul(obs, obs)) - torch.trace(obs)**2

tensor(1.3132e-06+1.4310e-08j, device='cuda:0')