# Example 1: Reinforcement Learning Training Workflow

In [1]:
# Importing Packages

import numpy as np
import matplotlib.pyplot as plt
import time

import jax
import jax.numpy as jnp
from jax import jit, vmap, block_until_ready, config

from algos.ppo import PPO_make_train



In [2]:
# Seeding all Random Number Generation during the RL Training for Reproducibility

seed = 30

rng = jax.random.PRNGKey(seed)
rng, _rng = jax.random.split(rng)

## Photon Langevin Env

For this notebook, we work with the Photon Langevin Environment, where the goal is to improve Single Qubit Readout of a Superconducting Transmon Qubit. The physical system is a Transmon Qubit Dispersively Coupled to a Microwave Cavity, i.e. a Readout Resonator. Treating the Transmon Qubit as a Non-Linear Quantum Harmonic Oscillator (QHO) and the Readout Resonator as a QHO, we can model the dynamics of the open system under an arbitrary input drive pulse on the resonator using Input-Output Theory and the Quantum Langevin Equation.

To simplify simulations and analysis, we use the Coherent Langevin Equations, where we assume the Resonator is in a Coherent State. This is a valid approximation in the regimes labs typically operate in, for example on IBM Quantum devices. By solving the differential equations given by the Coherent Langevin Equations, we can simulate both the phase and the photon population of the Resonator as a function of time. From these we extract figures of merit, such as the maximum photon population, the maximum fidelity, and the time to reset the resonator, which we pass to our reward function to obtain an overall RL Environment to train on!
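
The coherent-state model described above can be illustrated with a minimal, dependency-free sketch. It is not the environment's implementation. It assumes the common rotating-frame form dα/dt = -(κ/2 ± iχ)α - iε(t), where the sign depends on the qubit state, and it omits the Kerr term and any qubit decay. The function names, the drive normalization, and the convention that `chi` is the full frequency shift per qubit state are assumptions made for illustration; `kappa` and `chi` reuse the values from this notebook, with units assumed to be consistent with the time axis.

```python
import cmath


def field_rhs(alpha, drive, kappa, chi, qubit_sign):
    """Assumed coherent Langevin equation for the resonator field alpha."""
    return -(0.5 * kappa + 1j * qubit_sign * chi) * alpha - 1j * drive


def simulate_field(drive_fn, kappa, chi, qubit_sign, t_final, n_steps):
    """Integrate the field from vacuum (alpha = 0) with fixed-step RK4."""
    dt = t_final / n_steps
    alpha = 0j
    trajectory = [alpha]
    for step in range(n_steps):
        t = step * dt
        k1 = field_rhs(alpha, drive_fn(t), kappa, chi, qubit_sign)
        k2 = field_rhs(alpha + 0.5 * dt * k1, drive_fn(t + 0.5 * dt), kappa, chi, qubit_sign)
        k3 = field_rhs(alpha + 0.5 * dt * k2, drive_fn(t + 0.5 * dt), kappa, chi, qubit_sign)
        k4 = field_rhs(alpha + dt * k3, drive_fn(t + dt), kappa, chi, qubit_sign)
        alpha = alpha + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        trajectory.append(alpha)
    return trajectory


def constant_drive(t):
    """Hypothetical unit-amplitude square pulse."""
    return 1.0


kappa, chi = 25.0, 0.65  # values from this notebook

# One trajectory per qubit state; which sign is "ground" is a convention.
traj_plus = simulate_field(constant_drive, kappa, chi, +1, t_final=1.0, n_steps=2000)
traj_minus = simulate_field(constant_drive, kappa, chi, -1, t_final=1.0, n_steps=2000)

# Photon population and phase of the resonator field over time.
photons_plus = [abs(a) ** 2 for a in traj_plus]
phases_plus = [cmath.phase(a) for a in traj_plus]

# Distance between the two pointer states, a rough proxy for distinguishability.
separation = [abs(a - b) for a, b in zip(traj_plus, traj_minus)]

print(f"Final photon number: {photons_plus[-1]:.6f}")
print(f"Final phase (rad): {phases_plus[-1]:.4f}")
print(f"Final pointer-state separation: {separation[-1]:.6f}")
```

Under these assumptions, a constant drive ε brings the field to the steady state α = -iε/(κ/2 ± iχ), so the photon number should settle at ε²/(κ²/4 + χ²). Comparing the end of the trajectory with that value is a quick sanity check on the integrator.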

In [None]:
# Defining Cairo Params and RL Params

tau_0 = 0.398
kappa = 25.0
chi = 0.65
kerr = 0.002
gamma = 1/100
time_coeff = 10.0
snr_coeff = 10.0
rough_max_photons = 31
rough_max_amp_scaled = 1/0.43
actual_max_photons = rough_max_photons * (1 - jnp.exp(-0.5 * kappa * tau_0))**2
print(f"Rough Max Photons: {rough_max_photons}")
print(f"Actual Max Photons: {actual_max_photons}")
ideal_photon = 1e-2
scaling_factor = 7.5
gamma_I = 1/500
num_t1 = 8.0
photon_gamma = 1/2000
init_fid = 1 - 1e-3

batchsize = 64
num_envs = 8
num_updates = 2500
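# Note: this rebinds the name `config`, shadowing the `config` object imported
# from jax above; use `jax.config` explicitly if JAX settings are needed later.
config = {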
    "LR": 3e-3,
    "NUM_ENVS": num_envs,
    "NUM_STEPS": batchsize,
    "NUM_UPDATES": num_updates,
    "UPDATE_EPOCHS": 4,
    "NUM_MINIBATCHES": int(batchsize * num_envs / 64),
    "CLIP_EPS": 0.2,
    "VALUE_CLIP_EPS": 0.2,
    "ENT_COEF": 0.0,
    "VF_COEF": 0.5,
    "MAX_GRAD_NORM": 0.5,
    "ACTIVATION": "relu6",
    "LAYER_SIZE": 256,
    "ENV_NAME": "photon_langevin_readout_env",
    "ANNEAL_LR": False,
    "DEBUG": True,
    "DEBUG_ACTION": False,
    "PRINT_RATE": 100,
    "ACTION_PRINT_RATE": 100,
}
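
The quantities implied by the settings above can be worked out with plain arithmetic. The sketch below assumes that `PPO_make_train` follows the common PPO convention in which each update collects `NUM_STEPS` transitions from each of `NUM_ENVS` parallel environments and splits them into `NUM_MINIBATCHES` minibatches for each of `UPDATE_EPOCHS` epochs. That convention is not confirmed by this notebook. The names `samples_per_update`, `minibatch_size`, `gradient_steps_per_update`, `total_env_steps`, and `ringup_factor` are introduced here for illustration only.

```python
import math

# Values copied from the cell above.
batchsize, num_envs, num_updates, update_epochs = 64, 8, 2500, 4
kappa, tau_0, rough_max_photons = 25.0, 0.398, 31

# Transitions gathered per update (assumed: NUM_STEPS per environment).
samples_per_update = batchsize * num_envs

# Same expression as the "NUM_MINIBATCHES" entry in the config.
num_minibatches = int(samples_per_update / 64)
minibatch_size = samples_per_update // num_minibatches

# Assumed: every epoch makes one pass over all minibatches.
gradient_steps_per_update = update_epochs * num_minibatches
total_env_steps = samples_per_update * num_updates

# Factor applied to rough_max_photons in the cell above.
ringup_factor = (1 - math.exp(-0.5 * kappa * tau_0)) ** 2
actual_max_photons = rough_max_photons * ringup_factor

print(f"Samples per update: {samples_per_update}")
print(f"Minibatches x size: {num_minibatches} x {minibatch_size}")
print(f"Gradient steps per update: {gradient_steps_per_update}")
print(f"Total environment steps: {total_env_steps}")
print(f"Ring-up factor: {ringup_factor:.4f}")
```

Under these assumptions, each update uses 512 transitions in 8 minibatches of 64, and the full run collects 1,280,000 environment steps. The factor `(1 - exp(-kappa * tau_0 / 2))**2` has the form of the photon-number ring-up of a linear resonator under a constant drive, which may be why `actual_max_photons` sits slightly below `rough_max_photons`; that interpretation of `tau_0` is a guess.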