# <center>Reinforcement Learning to Disentangle Multiqubit Quantum States<br>from Partial Observations</center>
---
### <center>*Interactive Demos*</center>

This notebook demonstrates the disentangling abilities of our RL agents. We show only the 4-qubit and 5-qubit agents, because they produce circuits short enough for this presentation format. Because the agent's policy is modeled with a Transformer architecture, we also show the attention scores (for the 4-qubit agent only). In each Episode Step you can apply a single-qubit or two-qubit rotation to the current state. This lets you explore the generalization capabilities of the Deep Reinforcement Learning framework. (Note that the rotations are not shown in the Quantum Circuit.) The amplitudes of the state are also visible and can be modified through text boxes (click the `Set` button to apply your changes).

The policy of the agent is shown as a bar chart. Below the policy (negative Y) we show, for each action $(i,j)$, what the average reduction in entanglement

$$\Delta S_\mathrm{avg} = \frac{1}{L}\sum_{i=1}^{L}{\Delta S_\mathrm{ent}(\rho^{(i)})}$$

would be if the agent were to take action $(i,j)$, where the sum runs over the single-qubit reduced density matrices $\rho^{(i)}$ of all $L$ qubits. This information shows that our agent is not greedy: there are examples where the action taken is not the one that minimizes $\Delta S_\mathrm{avg}$. You can also input your own states to the agent. See the cells below.
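The quantity $\Delta S_\mathrm{avg}$ can be illustrated with a minimal sketch. It assumes that $S_\mathrm{ent}$ is the von Neumann entropy (natural logarithm) of each single-qubit reduced density matrix and that $\Delta S$ is a signed change (after minus before), so a negative value means less entanglement. The helper names `single_qubit_entropies` and `delta_s_avg` are hypothetical and are not part of `demo_impl`; the "after" state is hand-picked for illustration and is not produced by the agent's gate.

```python
import numpy as np


def single_qubit_entropies(psi, eps=1e-12):
    """Entropy S_ent(rho^(i)) of every single-qubit reduced density matrix."""
    psi = np.asarray(psi, dtype=np.complex128)
    psi = psi / np.linalg.norm(psi)
    n_qubits = psi.size.bit_length() - 1  # psi holds 2**n_qubits amplitudes
    tensor = psi.reshape((2,) * n_qubits)
    entropies = []
    for q in range(n_qubits):
        # Bring qubit q to the front and flatten the rest: shape (2, 2**(n-1)).
        m = np.moveaxis(tensor, q, 0).reshape(2, -1)
        rho = m @ m.conj().T  # reduced density matrix of qubit q
        evals = np.linalg.eigvalsh(rho)
        evals = evals[evals > eps]  # drop zeros so that 0 * log(0) counts as 0
        entropies.append(float(-np.sum(evals * np.log(evals))))
    return np.array(entropies)


def delta_s_avg(psi_before, psi_after):
    """Mean per-qubit entropy change (assumed sign: after minus before)."""
    diff = single_qubit_entropies(psi_after) - single_qubit_entropies(psi_before)
    return float(np.mean(diff))


zero = np.array([1.0, 0.0])
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
psi_before = np.kron(np.kron(zero, bell), zero)  # |0>|Bell>|0>
psi_after = np.zeros(16)
psi_after[0] = 1.0  # |0000>, i.e. the Bell pair fully disentangled

print(single_qubit_entropies(psi_before))
print(delta_s_avg(psi_before, psi_after))
```

In this example only the two qubits of the Bell pair carry entropy $\ln 2$, so fully disentangling them changes the average by $-\ln(2)/2$.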

In [None]:
%matplotlib widget

import numpy as np
from demo_impl import start_demo_4q, start_demo_5q

#### Use this cell to extend the dropdown menu with your custom initial states
Default initial states are also provided. For example, `|0>|Bell>|0>` stands for $|0_1\rangle|\mathrm{Bell_{23}}\rangle|0_4\rangle$, `|GHZ>|Bell>` for $\mathrm{|GHZ_{123}\rangle|Bell_{45}\rangle}$, `|RRRR>|R>` for $\mathrm{|R_{1234}\rangle|R_5\rangle}$, and so on.

In [None]:
# States are normalized and converted to np.complex64 automatically.
# You can add both 4q and 5q states here; each one will appear
# in the dropdown menu of the matching demo below.
my_initial_states = {
    "My4qState": np.random.randn(16) + 1j * np.random.randn(16),
    "My5qState": np.random.randn(32) + 1j * np.random.randn(32),
}
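Structured states can be built in the same way as the random ones above. The sketch below assembles product states with `np.kron` and then normalizes and casts them, which is assumed to mirror what the demo does automatically. The helper `prepare_state`, the dictionary `example_states`, and the qubit ordering (leftmost Kronecker factor is qubit 1) are illustrative assumptions, not taken from `demo_impl`.

```python
import numpy as np


def prepare_state(amplitudes):
    """Normalize to unit norm and cast to complex64 (assumed preprocessing)."""
    psi = np.asarray(amplitudes, dtype=np.complex64)
    norm = np.linalg.norm(psi)
    if norm == 0:
        raise ValueError("state vector must have at least one nonzero amplitude")
    return (psi / norm).astype(np.complex64)


zero = np.array([1, 0])
bell = np.array([1, 0, 0, 1])  # left unnormalized on purpose

example_states = {
    "|Bell>|Bell>": np.kron(bell, bell),                       # 16 amplitudes (4 qubits)
    "|0>|Bell>|Bell>": np.kron(zero, np.kron(bell, bell)),     # 32 amplitudes (5 qubits)
}

for name, amplitudes in example_states.items():
    psi = prepare_state(amplitudes)
    print(name, psi.shape, psi.dtype, float(np.linalg.norm(psi)))
```

Entries like these could be added to `my_initial_states` as raw amplitude arrays, since normalization is handled for you.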

### Demo with 4-qubit agent

- The Quantum Circuit is shown in the top-left subfigure; the policy of the agent (positive Y) and the average entanglement reductions $\Delta S_\mathrm{avg}$ for each action $(i,j)$ (negative Y) are shown in the bottom-left subfigure; the attention scores are shown in the right subfigure.

- Use the `Step` button to advance the episode step. The action taken is always the one with the highest probability.
- Use the `Undo` button to remove the last applied gate (undo the action taken).
- Use the `Reset` button to clear everything and go back to the state selected in the $\mathrm{Initial\ state}$ dropdown menu.
- Use `Set` to apply the modifications to the z-basis amplitudes.
- Use **Single qubit rotation** to apply a rotation to qubit $i$ (again selected through a dropdown menu). You cannot "chain" rotations: within the current $\mathrm{Episode\ Step}$, each qubit $i$ can be rotated around only one axis.
- Use **Two-qubit rotation** to apply a rotation to qubits $(i,j)$. Again, you cannot apply a series of rotations in the current $\mathrm{Episode\ Step}$.
- The entanglement of each qubit $S_\mathrm{ent}(\rho^{(i)})$ with the rest of the system is shown in the status box **Single qubit entanglements**. Whenever a qubit is disentangled, its numeric value turns <mark style="background-color:lightgreen;">green</mark>.
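To make the single-qubit rotation concrete, here is a minimal sketch of how such a rotation could act on a state vector. It assumes the standard parametrization $R_a(\theta) = e^{-i\theta\sigma_a/2}$ and zero-based qubit indices; the demo's actual convention may differ, and all function names are hypothetical. The sketch also checks a general fact: a rotation on one qubit is a local unitary, so it leaves the purity $\mathrm{Tr}\,(\rho^{(i)})^2$ of every single-qubit reduced density matrix, and hence every single-qubit entanglement, unchanged.

```python
import numpy as np

PAULI = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def rotation(axis, theta):
    """R_axis(theta) = exp(-i * theta * sigma_axis / 2) (assumed convention)."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * PAULI[axis]


def apply_single_qubit_gate(psi, gate, qubit):
    """Apply a 2x2 gate to one qubit (zero-based index) of a state vector."""
    n_qubits = psi.size.bit_length() - 1
    tensor = psi.reshape((2,) * n_qubits)
    # Contract the gate's input index with the target qubit's axis.
    out = np.tensordot(gate, tensor, axes=([1], [qubit]))
    # tensordot puts the gate's output index first; move it back into place.
    return np.moveaxis(out, 0, qubit).reshape(-1)


def qubit_purity(psi, qubit):
    """Tr(rho^2) of one qubit's reduced density matrix (1 means disentangled)."""
    n_qubits = psi.size.bit_length() - 1
    m = np.moveaxis(psi.reshape((2,) * n_qubits), qubit, 0).reshape(2, -1)
    rho = m @ m.conj().T
    return float(np.real(np.trace(rho @ rho)))


rng = np.random.default_rng(0)
psi = rng.standard_normal(16) + 1j * rng.standard_normal(16)
psi = psi / np.linalg.norm(psi)

rotated = apply_single_qubit_gate(psi, rotation("y", 0.7), qubit=2)
purities_before = [qubit_purity(psi, q) for q in range(4)]
purities_after = [qubit_purity(rotated, q) for q in range(4)]
print(np.allclose(purities_before, purities_after))
```

A two-qubit rotation acting on $(i,j)$ is not local to either qubit, so it can change these purities.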

In [None]:
start_demo_4q(my_initial_states)

### Demo with 5-qubit agent
- Instructions are the same as in the 4-qubit case. Here, attention scores are not shown, simply because they take up too much space.

In [None]:
start_demo_5q(my_initial_states)