## 1. Scale up your Quantum job

In the previous section we discussed how to adiabatically prepare ordered phases in tightly packed arrays of Rydberg atoms in 2D. If you want to extract statistical information about the ordered phases you are studying, or information about the phase transition that occurs when preparing an ordered phase, you will likely want to scale up the system sizes you are analyzing. While these dynamical processes can be simulated on classical platforms, doing so is not easy. At best, you will need to develop expertise in time-dependent tensor network methods. For some more complicated dynamical scenarios, you may not be able to simulate the process at large scales classically at all. This is when learning how to operate quantum hardware and run experiments directly becomes imperative.

In this section we will discuss how to use Bloqade to scale up your quantum job to run on real quantum hardware, manage your jobs, and analyze your results. There are plenty of subtleties in using a cloud-based device, but we have your back!

### 1a. Before you start using Hardware

Using a cloud-based quantum device requires a few extra steps to get started. Presently, there are two ways of accessing Aquila, QuEra's analog quantum computer, on the cloud: either on Amazon Braket, or via qBraid's platform (which ultimately uses the Amazon Braket service). If you are a regular Braket user running jobs locally, you will need to sign in and authenticate your device to have access to Braket; see [here](https://docs.aws.amazon.com/signin/latest/userguide/what-is-sign-in.html) for more details on how to set up your AWS account. Braket accounts for jobs by distinguishing "tasks", which are different calls for a problem to be solved, typically each with a distinct Hamiltonian, from "shots", which are the repetitions of a given task used to build statistics. Tasks and shots are accounted for and [priced differently](https://aws.amazon.com/braket/pricing/).

To set up a Bloqade notebook or script to submit jobs from your AWS account, all you have to do is set your AWS access key ID, secret access key, region, and session token (if you are part of an organization that requires it). To set these as environment variables in a notebook, just include:

```
%env AWS_ACCESS_KEY_ID = ...
%env AWS_SECRET_ACCESS_KEY = ... 
%env AWS_SESSION_TOKEN = ...
%env AWS_DEFAULT_REGION = us-east-1
```
at the beginning of your notebook, or

```
import os
os.environ["AWS_ACCESS_KEY_ID"] = "..."
os.environ["AWS_SECRET_ACCESS_KEY"] = "..."
os.environ["AWS_SESSION_TOKEN"] = "..."
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"
```
in your Python script.

Once you have authenticated your device/environment you should be able to submit tasks, retrieve results, and access hardware capabilities.
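
Before submitting anything, it can help to confirm that the variables above are actually visible to your Python process. The following is a minimal sketch using only the standard library; the helper name `missing_aws_variables` is hypothetical, and the check only tests that the variables are present, not that the credentials are valid or unexpired.

```python
import os

# Variables set in the snippets above. AWS_SESSION_TOKEN is left out of the
# required list because only some organizations need it.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_variables(environ=None):
    """Return the required AWS variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_variables()
    if missing:
        print("Missing AWS settings:", ", ".join(missing))
    else:
        print("All required AWS variables are set.")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without touching your real environment.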

> NOTE: for security reasons, your AWS credentials expire and must be refreshed periodically. Because we access Aquila asynchronously by submitting jobs to its queue, remember that you will need to renew your credentials if you fetch results a long time after submission. Also, submitting a large batch can itself take a long time, so make sure your credentials remain valid for the whole submission.

For qBraid users, authentication is handled automatically by activating a virtual environment; see [here](https://docs.qbraid.com/projects/lab/en/latest/lab/quantum_jobs.html) for more details. You won't have to worry about refreshing credentials.

### 1b. Scaling up your Quantum job

Now that we have covered the formalities, we can get down to business. The 2D square lattice checkerboard phase of our previous chapter will serve as our test case.

Going back to the previous chapter, notice that we set `L=3` to simulate a 3x3 array of Rydberg atoms. If we are interested in studying the physical properties of the checkerboard "phase", a humble goal, this system is likely too small. The checkerboard phase is a _bulk_ property of a system. That means that we can only confidently talk about a "phase" of a material if we are looking far from the boundaries of the system. In a 3x3 system, 1 atom is in the bulk of the square lattice, and 8 are on the boundary. We can hardly talk about bulk properties of such a system, and surface effects will dominate the phenomenology.

In [None]:
FreeResponseQuestion("exercise31b")

So we will have to prepare phases with many more qubits. While some of this can easily be done classically, realism can be lost, and a certain degree of specialization in high-performance simulation methods will be needed. While those methods are good, and in fact complementary, we will just aim at doing a proper calculation with a real quantum computer, and face all questions regarding errors and loss of fidelity of all kinds, from dynamics to measurement to atom placement. We can increase `L` to simulate a larger array of atoms on the hardware. After all, that is what quantum computers were invented for.
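
To put numbers on the bulk-versus-boundary argument, here is a small sketch that counts interior and edge sites of an `L` x `L` square lattice. The function name `bulk_and_boundary` is hypothetical and not part of Bloqade; "bulk" here simply means sites that are not on the outer edge of the array.

```python
def bulk_and_boundary(L):
    """Count interior (bulk) and edge (boundary) sites of an L x L square lattice."""
    if L < 1:
        raise ValueError("L must be a positive integer")
    total = L * L
    bulk = max(L - 2, 0) ** 2  # sites with a full set of neighbors on all sides
    return bulk, total - bulk


if __name__ == "__main__":
    for L in (3, 11):
        bulk, boundary = bulk_and_boundary(L)
        print(f"{L}x{L}: {bulk} bulk, {boundary} boundary, "
              f"bulk fraction {bulk / (L * L):.2f}")
```

For `L=3` this reproduces the 1 bulk atom and 8 boundary atoms quoted above, while for `L=11` the bulk holds 81 of the 121 atoms.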

Aquila can operate up to 256 qubits in a single task. Let's use this power to simulate an adiabatically prepared 11x11 square checkerboard phase with Rydberg atoms. By construction, we can achieve that by simply taking our previous example and setting `L=11`. Revisiting the parameter setting, we have:

In [1]:
# Section imports
from bloqade.atom_arrangement import Square
import numpy as np
from bokeh.io import output_notebook
output_notebook()

# Change the lattice spacing to vary the atom separation a, and thus also Rb/a
delta_end = 2 * np.pi * 6.8  # final detuning
omega_max = 2 * np.pi * 2.5  # maximum Rabi amplitude
lattice_spacing = 7.0  # edge length of the square lattice

C6 = 2 * np.pi * 862690  # van der Waals interaction coefficient
Rb = (C6 / omega_max) ** (1 / 6)  # blockade radius R_b during the bulk of the protocol
print("Rb/a: ", Rb / lattice_spacing)

print("Delta/Omega: ", delta_end / omega_max)

Rb/a:  1.1964312624669644
Delta/Omega:  2.72
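
For reference, the blockade radius printed above comes from setting the van der Waals interaction equal to the maximum Rabi amplitude, exactly as the `Rb` line of the cell does. Written out with the numbers used here, and assuming Bloqade's usual units (rad/µs for frequencies, µm for lengths), which this section does not state explicitly:

$$
R_b = \left(\frac{C_6}{\Omega_{\max}}\right)^{1/6}
    = \left(\frac{2\pi \times 862690}{2\pi \times 2.5}\right)^{1/6}\,\mu\text{m}
    = (345076)^{1/6}\,\mu\text{m} \approx 8.375\,\mu\text{m},
\qquad
\frac{R_b}{a} \approx \frac{8.375}{7.0} \approx 1.196 .
$$

Since $1 < R_b/a < \sqrt{2}$, nearest neighbors (at distance $a$) sit inside the blockade radius while diagonal neighbors (at distance $\sqrt{2}\,a \approx 9.9\,\mu\text{m}$) sit outside it.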


If it helps, revisit Chapter 2 Sec. 3a to see again how these parameters fix a Hamiltonian well within the checkerboard phase. Also, if you forgot some of the specific hardware constraints, or would like to organize your code and scales according to Aquila's dynamic range, you can use Bloqade's handy `get_capabilities` function to help you. For example:

In [2]:
from bloqade import get_capabilities
from decimal import Decimal

max_rabi = get_capabilities().capabilities.rydberg.global_.rabi_frequency_max
print(max_rabi/Decimal(2*np.pi)) # our standard maximum value for Rabi amplitude in MHz

2.514648100851946403173667416


Note that working with `Decimal` numbers is recommended in Bloqade to maximize precision. For a complete list of the available fields, see the Bloqade documentation [here](https://bloqade.quera.com/latest/reference/hardware-capabilities/).
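
As a brief illustration of why `Decimal` helps with precision, the sketch below uses only Python's standard library (nothing Bloqade-specific) to compare binary floats with decimal arithmetic.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)               # False
print(decimal_sum == Decimal("0.3"))  # True

# Building a Decimal from a float keeps the float's rounding error,
# so prefer string input when the value is known in decimal form.
print(Decimal(0.1) == Decimal("0.1"))  # False
```

The last line is worth keeping in mind: converting an existing float to `Decimal` does not recover digits that the float never had.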

### 1c. Defining the Bloqade Program

Let's do the same adiabatic protocol from the previous chapter, but build two different geometries, small and large, so we can draw comparisons. Let's do so via some Bloqade tricks:

In [4]:
sweep_time = 2.4  # duration of the detuning sweep (middle segment)
rabi_amplitude_values = [0.0, omega_max, omega_max, 0.0]
rabi_detuning_values = [-delta_end, -delta_end, delta_end, delta_end]
durations = [0.8, sweep_time, 0.8]  # ramp up, sweep, ramp down

# Two geometries, small (3x3) and large (11x11), for comparison
geometries = {
    1: Square(3, lattice_spacing=lattice_spacing),
    2: Square(11, lattice_spacing=lattice_spacing),
}

# One program per geometry, all sharing the same waveforms
prog_list = {
    idx: (
        geometry.rydberg.rabi.amplitude.uniform.piecewise_linear(
            durations, rabi_amplitude_values
        )
        .detuning.uniform.piecewise_linear(durations, rabi_detuning_values)
    )
    for idx, geometry in geometries.items()
}

So there we go. We have a dictionary of programs (named `prog_list`), one for each geometry of interest, for comparison purposes.

In [5]:
prog_list[1].parse_register().show()
prog_list[2].parse_register().show()
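
To sanity-check the waveforms defined in this subsection without touching hardware, the sketch below evaluates a piecewise-linear waveform by hand. It is plain Python rather than Bloqade's own implementation, the helper name `piecewise_linear_value` is hypothetical, and the parameter values are copied from the cells above.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def piecewise_linear_value(durations, values, t):
    """Evaluate a piecewise-linear waveform at time t.

    `values` holds one more entry than `durations`: values[i] and
    values[i + 1] are the start and end values of segment i.
    """
    if len(values) != len(durations) + 1:
        raise ValueError("need len(values) == len(durations) + 1")
    edges = [0.0, *accumulate(durations)]  # segment boundaries in time
    if not 0.0 <= t <= edges[-1]:
        raise ValueError("t is outside the waveform")
    # Index of the segment containing t (the final time maps to the last segment)
    i = min(bisect_right(edges, t) - 1, len(durations) - 1)
    fraction = (t - edges[i]) / durations[i]
    return values[i] + fraction * (values[i + 1] - values[i])


# Values copied from the cells above
omega_max = 2 * math.pi * 2.5
delta_end = 2 * math.pi * 6.8
durations = [0.8, 2.4, 0.8]
rabi_amplitude_values = [0.0, omega_max, omega_max, 0.0]
rabi_detuning_values = [-delta_end, -delta_end, delta_end, delta_end]

if __name__ == "__main__":
    print("total duration:", sum(durations))
    print("Rabi amplitude halfway up the ramp:",
          piecewise_linear_value(durations, rabi_amplitude_values, 0.4))
    print("detuning at mid-sweep:",
          piecewise_linear_value(durations, rabi_detuning_values, 2.0))
```

Halfway through the first ramp the amplitude is half of `omega_max`, and the detuning crosses zero in the middle of the sweep, which is what the protocol intends.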

### 1d. Submission to Hardware

Submitting the program to hardware is just as simple as running on the emulator. The only difference is that the hardware job is submitted to a queue and may take some time to run. Accordingly, a best practice is to submit with the `run_async` method instead of `run`. The main reason is that `run_async` returns a results object containing metadata about the job(s) that have been submitted to Aquila. This way, you can easily retrieve the corresponding data without mixing up files, and you can also recover the details of the specific program you requested in a given submission, in case you forget something or mix up file names.

So let's compare the different job submission processes for classical emulation and for Aquila:

_Classical:_
```python
emulation_3x3_10shots_results=prog_list[1].bloqade.python().run(shots=10)
```
_Aquila:_
```python
from bloqade import save

Aquila_3x3_10shots_results = prog_list[1].braket.aquila().run_async(shots=10)
save(Aquila_3x3_10shots_results, "checkerboard_3x3_10shots.json")
```

To run a job for an 11x11 lattice, all that you have to change is `prog_list[1]` -> `prog_list[2]`. But mind you: if you try running a classical emulation of this in Bloqade, you will likely freeze your computer.

The main differences are:
* `.bloqade.python()` -> `.braket.aquila()`, i.e., the Bloqade service is exchanged for the AWS Braket service, and the Python backend is substituted by the Aquila backend
* `.run` -> `.run_async`, for the reasons explained above
* When running on hardware (as well as when doing large simulations), it is imperative to save your data, using file names that clearly identify the corresponding task. This is done using Bloqade's `save` function. The saved file can be loaded later using the `load` function. The corresponding object contains a list of all the jobs that have been submitted to the hardware, along with the task IDs used to retrieve the results from the hardware.


> Make sure to use best practices for simplicity and clarity when naming your files. You should follow the same standards you would follow when submitting jobs to a regular high-performance computing cluster, if you have ever used one. Clear identification will help you retrieve and analyze data. For serious experimentation and research, this kind of effective management is crucial for success. Also note that batches take time to submit to Aquila. A rule of thumb to better manage that is to keep batch sizes to 20-50 tasks.
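
The naming and batching advice above can be sketched with two small helpers. Both names, `result_filename` and `split_into_batches`, are hypothetical and not part of Bloqade; the naming pattern simply mirrors the file names used in this section, and the default batch size of 20 is the lower end of the rule of thumb.

```python
def result_filename(label, lattice_size, shots, suffix="json"):
    """Build a descriptive file name such as 'checkerboard_3x3_10shots.json'."""
    return f"{label}_{lattice_size}x{lattice_size}_{shots}shots.{suffix}"


def split_into_batches(tasks, batch_size=20):
    """Split a sequence of tasks into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [tasks[i:i + batch_size] for i in range(0, len(tasks), batch_size)]


if __name__ == "__main__":
    print(result_filename("checkerboard", 11, 100))
    batches = split_into_batches(list(range(45)), batch_size=20)
    print([len(batch) for batch in batches])
```

Saving each batch under its own file name keeps the metadata for different submissions separate, which makes later retrieval less error-prone.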

### 1e. Parallelizing registers

The task submission examples in sub-section 1d use very few shots, to make sure no one is caught out by pricey quantum computing runs. At current prices, 1 task with 10 shots amounts to 40 cents on Amazon Braket. But these few shots also limit our statistics. If you go back to Chapter 2 Section 3, you will notice that even our simulations for the 3x3 system were made with 100 shots, which is a better starting point for accounting for quantum statistics, noise, etc.
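
To see how the cost scales with tasks and shots, here is a small estimator. The per-task and per-shot prices are assumptions chosen to reproduce the 40-cent figure quoted above; they may change, so check the Braket pricing page before relying on them. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Assumed prices in USD, consistent with "1 task and 10 shots = 40 cents".
# These are NOT read from AWS; confirm them on the Braket pricing page.
PRICE_PER_TASK = Decimal("0.30")
PRICE_PER_SHOT = Decimal("0.01")


def estimate_cost(n_tasks, shots_per_task):
    """Estimate the cost in USD of n_tasks tasks with shots_per_task shots each."""
    return n_tasks * (PRICE_PER_TASK + shots_per_task * PRICE_PER_SHOT)


if __name__ == "__main__":
    print(estimate_cost(1, 10))   # 0.40
    print(estimate_cost(1, 100))  # 1.30
```

Under these assumed prices, going from 10 to 100 shots on a single task raises the cost from 0.40 to 1.30 USD, because the per-task fee is paid only once.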

Now, 3x3 systems are small enough that they can be parallelized inside Aquila's field of view, effectively increasing our throughput. While one could define parallelized registers by hand, let's use a final Bloqade trick, as the package can help you achieve parallelization and data analysis much more easily.

To parallelize with Bloqade, all we have to do is choose the distance we'd like between patches. Here, we will use

In [1]:
inter_patch=11.16 #um

and to submit a parallelized job, all we need to change in our call is to write

```python
from bloqade import save

Aquila_3x3_10shots_9parallelized_results = prog_list[1].parallelize(inter_patch).braket.aquila().run_async(shots=10)
save(Aquila_3x3_10shots_9parallelized_results, "checkerboard_3x3_10shots_9parallelized.json")
```
That is, add the `parallelize(D)` method to the program before the choice of service, with `D` being the safety distance between patches in a given program (which always depends on other energy scales, such as the Rabi drive).
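
To get a feeling for how many copies of a patch fit in the field of view, the sketch below does the geometry by hand. It rests on two assumptions that are not stated in this section: a usable area of 75 µm by 76 µm (confirm the real values with `get_capabilities`), and that the inter-patch distance is measured between the edges of neighboring patches. The helper name `copies_per_axis` is hypothetical.

```python
import math

# Assumed field of view in micrometers; confirm with get_capabilities().
FIELD_WIDTH_UM = 75.0
FIELD_HEIGHT_UM = 76.0


def copies_per_axis(patch_extent, inter_patch, field_extent):
    """Number of patches of size patch_extent that fit along one axis.

    n patches and (n - 1) gaps must satisfy
    n * patch_extent + (n - 1) * inter_patch <= field_extent.
    """
    if patch_extent > field_extent:
        return 0
    return math.floor((field_extent + inter_patch) / (patch_extent + inter_patch))


lattice_spacing = 7.0   # um, as in the cells above
inter_patch = 11.16     # um, as in the cell above
patch_extent = (3 - 1) * lattice_spacing  # a 3x3 patch spans two lattice spacings

n_x = copies_per_axis(patch_extent, inter_patch, FIELD_WIDTH_UM)
n_y = copies_per_axis(patch_extent, inter_patch, FIELD_HEIGHT_UM)

if __name__ == "__main__":
    print("copies:", n_x * n_y, "atoms in total:", n_x * n_y * 9)
```

Under these assumptions, 3 patches fit along each axis, giving 9 copies (81 atoms), which is consistent with the `9parallelized` label in the file name above.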

In [None]:
FreeResponseQuestion("exercise31e")

In what follows, we will see the potential consequences of parallelization, good practices for choosing distances, and how to analyze the corresponding data.

## _Submission checklist_

Many details must be chosen correctly to ensure a quantum job does what we want. Here is a checklist to help you verify things before committing to spend money on data:

* Is the geometry setup correct? Check:
    * number of atoms
    * correct interatomic distance scales
    * correct shape of the register

* Are the waveform settings correct? Check:
    * shape and maximum values of Rabi drive
    * shape and maximum values of detuning drive
    * total duration of the program

* Are the quantum statistics settings adequate? Check:
    * number of shots/repetitions
    * whether parallelization is being used and, if so, whether the correct distance between patches was chosen
    
* Are the general submission details correct? Check:
    * whether appropriately named files are being created to save the metadata and the actual results
    * when batching multiple tasks, whether all of the items above are correct for every single task being automated
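
Parts of this checklist can be automated. The following is a rough sketch of a pre-submission check, not a Bloqade feature: the function name `submission_problems` is hypothetical, the 256-atom limit is the one quoted in this chapter, and the 4 µs maximum duration is an assumption that should be confirmed with `get_capabilities`.

```python
MAX_ATOMS = 256        # limit quoted in this chapter
MAX_DURATION_US = 4.0  # assumed limit; confirm with get_capabilities()


def submission_problems(n_atoms, durations, shots, filename):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not 0 < n_atoms <= MAX_ATOMS:
        problems.append(f"number of atoms {n_atoms} is outside 1..{MAX_ATOMS}")
    total = sum(durations)
    if total > MAX_DURATION_US + 1e-9:  # small tolerance for float round-off
        problems.append(f"total duration {total} us exceeds {MAX_DURATION_US} us")
    if shots < 1:
        problems.append("number of shots must be at least 1")
    if not filename.endswith(".json"):
        problems.append("results file name should end in .json")
    return problems


if __name__ == "__main__":
    issues = submission_problems(
        n_atoms=11 * 11,
        durations=[0.8, 2.4, 0.8],
        shots=100,
        filename="checkerboard_11x11_100shots.json",
    )
    print(issues or "no problems found")
```

A check like this only catches mechanical mistakes; the shape of the register and of the waveforms still deserves a visual inspection before spending money on data.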
