# Example for execution of multiple circuits in QPUs

Before executing, you must set up and `qraise` the QPUs; check the `README.md` for instructions. For these examples it is best to have more than one QPU, at least one of them with an ideal AerSimulator.

### Importing and adding paths to `sys.path`

In [1]:
import os, sys

# path to access c++ files
installation_path = os.getenv("INSTALL_PATH")
sys.path.append(installation_path)

print(installation_path)

/mnt/netapp1/Store_CESGA/home/cesga/jvazquez/works/cunqa/installation


### Let's get the QPUs that we q-raised!

In [2]:
from cunqa import getQPUs

qpus  = getQPUs()

for q in qpus:
    print(f"QPU {q.id}, backend: {q.backend.name}, simulator: {q.backend.simulator}, version: {q.backend.version}.")


info: Logger created.
debug: File accessed correctly.
debug: Object for QPU 0 created correctly.
debug: Object for QPU 1 created correctly.
debug: Object for QPU 2 created correctly.
debug: Object for QPU 3 created correctly.
debug: Object for QPU 4 created correctly.
debug: Object for QPU 5 created correctly.
debug: Object for QPU 6 created correctly.
debug: Object for QPU 7 created correctly.
debug: Object for QPU 8 created correctly.
debug: Object for QPU 9 created correctly.
debug: 10 QPU objects were created.
QPU 0, backend: BasicAer, simulator: AerSimulator, version: 0.0.1.
QPU 1, backend: BasicAer, simulator: AerSimulator, version: 0.0.1.
QPU 2, backend: BasicAer, simulator: AerSimulator, version: 0.0.1.
QPU 3, backend: BasicAer, simulator: AerSimulator, version: 0.0.1.
QPU 4, backend: BasicAer, simulator: AerSimulator, version: 0.0.1.
QPU 5, backend:

The function `getQPUs()` accesses the information of the raised QPUs and instantiates one `qpu.QPU` object for each, returning them as a list. If you are working in a Jupyter notebook, we recommend calling this function just once.

About the `qpu.QPU` objects:

- `QPU.id`: identifier of the virtual QPU; identifiers are assigned from 0 to n-1.


- `QPU.backend`: object `backend.Backend` that has information about the simulator and backend for the given QPU.
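
These attributes make it easy to pick a subset of the QPUs. The sketch below is only illustrative: `FakeBackend`, `FakeQPU` and `select_qpus` are hypothetical stand-ins that mirror the attributes printed above (`id`, `backend.name`, `backend.simulator`, `backend.version`), so it runs without any raised QPU. They are not the real `cunqa` classes.

```python
from dataclasses import dataclass


# Hypothetical stand-ins that only mirror the attributes used in this notebook.
# They are NOT the real cunqa classes.
@dataclass
class FakeBackend:
    name: str
    simulator: str
    version: str


@dataclass
class FakeQPU:
    id: int
    backend: FakeBackend


def select_qpus(qpus, simulator):
    """Return the QPUs whose backend reports the given simulator name."""
    return [q for q in qpus if q.backend.simulator == simulator]


fake_qpus = [
    FakeQPU(0, FakeBackend("BasicAer", "AerSimulator", "0.0.1")),
    FakeQPU(1, FakeBackend("OtherBackend", "OtherSimulator", "0.0.1")),
    FakeQPU(2, FakeBackend("BasicAer", "AerSimulator", "0.0.1")),
]

aer_qpus = select_qpus(fake_qpus, "AerSimulator")
print([q.id for q in aer_qpus])  # [0, 2]
```

With real objects, the same list comprehension should work on the list returned by `getQPUs()`, assuming the attributes behave as printed above.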


### Let's create a circuit to run in our QPUs!

We can create the circuit using `qiskit` or by writing the instructions in the `json` format specific to `cunqa` (check the `README.md`); `OpenQASM2` is also supported. Here we keep things simple and create a `qiskit.QuantumCircuit`:

In [3]:
from qiskit import QuantumCircuit
from qiskit.circuit.library import QFT

n = 5 # number of qubits

qc = QuantumCircuit(n)

qc.x(0); qc.x(n-1); qc.x(n-2)

qc.append(QFT(n), range(n))

qc.append(QFT(n).inverse(), range(n))

qc.measure_all()

display(qc.draw())
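
Since the circuit applies a QFT followed by its inverse, the two cancel and only the three X gates determine the measured state. The expected bitstring can be worked out by hand, as in this small sketch, which assumes Qiskit's convention of printing the highest-index qubit first:

```python
n = 5  # number of qubits, as in the cell above
flipped = {0, n - 1, n - 2}  # qubits that receive an X gate

# Build the bitstring from the highest-index qubit down to qubit 0.
expected = "".join("1" if q in flipped else "0" for q in reversed(range(n)))

print(expected)               # 11001
print(hex(int(expected, 2)))  # 0x19
```

An ideal simulator should therefore return this single bitstring for every shot.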

### Execution time! Let's do it sequentially

In [4]:
counts = []

for qpu in qpus:

    print(f"For QPU {qpu.id}, with backend {qpu.backend.name}:")

    # 1) submit the circuit
    qjob = qpu.run(qc, transpile=True, shots=1000)  # non-blocking call

    # 2) wait for the result
    result = qjob.result()  # blocking call

    # 3) read the simulation time and the counts
    time_taken = qjob.time_taken()
    counts.append(result.get_counts())

    print(f"Result: \n{result.get_counts()}\n Time taken: {time_taken} s.")

For QPU 0, with backend BasicAer:
debug: Transpilation done.
debug: A QuantumCircuit was provided.
debug: Translating to dict for AerSimulator...
debug: QJob created.
debug:  {"config":{"shots": 1000, "method": "statevector", "memory_slots": 5, "seed": 188}, "instructions":[{"name": "x", "qubits": [0], "params": []}, {"name": "x", "qubits": [3], "params": []}, {"name": "u2", "qubits": [4], "params": [-3.141592653589793, -3.141592653589793]}, {"name": "cp", "qubits": [4, 3], "params": [1.5707963267948966]}, {"name": "h", "qubits": [3], "params": []}, {"name": "cp", "qubits": [4, 2], "params": [0.7853981633974483]}, {"name": "cp", "qubits": [3, 2], "params": [1.5707963267948966]}, {"name": "h", "qubits": [2], "params": []}, {"name": "cp", "qubits": [4, 1], "params": [0.39269908169872414]}, {"name": "cp", "qubits": [3, 1], "params": [0.7853981633974483]}, {"name": "cp", "qubits": [2, 1], "params": [1.5707963267948966]}, {"name": "h", "qubits":

debug: Circuit was sent.
debug: Qjob submitted to QPU 2.
debug: Result received: {'backend_name': '', 'backend_version': '', 'date': '', 'job_id': '', 'metadata': {'max_gpu_memory_mb': 0, 'max_memory_mb': 1031551, 'omp_enabled': True, 'parallel_experiments': 1, 'time_taken_execute': 0.000890594, 'time_taken_parameter_binding': 1.1937e-05}, 'qobj_id': '', 'results': [{'data': {'counts': {'0x19': 1000}}, 'metadata': {'active_input_qubits': [0, 1, 2, 3, 4], 'batched_shots_optimization': False, 'device': 'CPU', 'fusion': {'applied': False, 'enabled': True, 'max_fused_qubits': 5, 'threshold': 14}, 'input_qubit_map': [[4, 4], [3, 3], [2, 2], [1, 1], [0, 0]], 'max_memory_mb': 1031551, 'measure_sampling': True, 'method': 'statevector', 'noise': 'ideal', 'num_bind_params': 1, 'num_clbits': 5, 'num_qubits': 5, 'parallel_shots': 1, 'parallel_state_update': 2, 'remapped_qubits': False, 'required_memory_mb': 1, 'runtime_parameter_bind': False, 'sample_measure_time': 0.0002

debug: Results correctly loaded.
Result: 
{'11001': 1000}
 Time taken: 0.000890106 s.
For QPU 5, with backend BasicAer:
debug: Transpilation done.
debug: A QuantumCircuit was provided.
debug: Translating to dict for AerSimulator...
debug: QJob created.
debug:  {"config":{"shots": 1000, "method": "statevector", "memory_slots": 5, "seed": 188}, "instructions":[{"name": "x", "qubits": [0], "params": []}, {"name": "x", "qubits": [3], "params": []}, {"name": "u2", "qubits": [4], "params": [-3.141592653589793, -3.141592653589793]}, {"name": "cp", "qubits": [4, 3], "params": [1.5707963267948966]}, {"name": "h", "qubits": [3], "params": []}, {"name": "cp", "qubits": [4, 2], "params": [0.7853981633974483]}, {"name": "cp", "qubits": [3, 2], "params": [1.5707963267948966]}, {"name": "h", "qubits": [2], "params": []}, {"name": "cp", "qubits": [4, 1], "params": [0.39269908169872414]}, {"name": "cp", "qubits": [3, 1], "params": [0.7853981633974

debug: Circuit was sent.
debug: Qjob submitted to QPU 7.
debug: Result received: {'backend_name': '', 'backend_version': '', 'date': '', 'job_id': '', 'metadata': {'max_gpu_memory_mb': 0, 'max_memory_mb': 1031551, 'omp_enabled': True, 'parallel_experiments': 1, 'time_taken_execute': 0.000888361, 'time_taken_parameter_binding': 1.1392e-05}, 'qobj_id': '', 'results': [{'data': {'counts': {'0x19': 1000}}, 'metadata': {'active_input_qubits': [0, 1, 2, 3, 4], 'batched_shots_optimization': False, 'device': 'CPU', 'fusion': {'applied': False, 'enabled': True, 'max_fused_qubits': 5, 'threshold': 14}, 'input_qubit_map': [[4, 4], [3, 3], [2, 2], [1, 1], [0, 0]], 'max_memory_mb': 1031551, 'measure_sampling': True, 'method': 'statevector', 'noise': 'ideal', 'num_bind_params': 1, 'num_clbits': 5, 'num_qubits': 5, 'parallel_shots': 1, 'parallel_state_update': 2, 'remapped_qubits': False, 'required_memory_mb': 1, 'runtime_parameter_bind': False, 'sample_measure_time': 0.0002

debug: Results correctly loaded.
Result: 
{'11001': 1000}
 Time taken: 0.000874751 s.


1. First we run the circuit with the method `QPU.run()`, passing the circuit, transpilation options and other run parameters. Note that `transpile` defaults to `False`: if we do not pass `transpile=True`, the user is responsible for transpiling the circuit according to the native gates and topology of the backend. This method returns a `qjob.QJob` object. The key point is that `QPU.run()` is **asynchronous**.


2. To get the results of the simulation, we call the method `QJob.result()`, which returns a `qjob.Result` object that stores the information in its class attributes. Depending on the simulator, more or less information will be available. Note that this is a **synchronous** method.


3. Once we have the `qjob.Result` object, we can obtain the counts dictionary with `Result.get_counts()`. The simulation time in seconds is available regardless of the simulator: the cells in this notebook read it through `QJob.time_taken()` and through the `time_taken` attribute of the result.
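
The raw result in the debug log above reports counts keyed by hexadecimal strings (`'0x19'`), while `Result.get_counts()` prints bitstrings (`'11001'`). How the two relate can be sketched as follows. `hex_counts_to_bitstrings` is a hypothetical helper written for this illustration, not a `cunqa` function, and the register width is taken from the `num_clbits` value in the log.

```python
def hex_counts_to_bitstrings(counts, num_clbits):
    """Convert {'0x19': 1000}-style keys into zero-padded bitstrings."""
    return {
        format(int(key, 16), f"0{num_clbits}b"): value
        for key, value in counts.items()
    }


raw_counts = {"0x19": 1000}  # as it appears in the debug log
print(hex_counts_to_bitstrings(raw_counts, 5))  # {'11001': 1000}
```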

In [None]:
%matplotlib inline

from qiskit.visualization import plot_histogram
import matplotlib.pyplot as plt
plot_histogram(counts, figsize=(10, 5), bar_labels=False)
plt.legend([f"QPU {i}" for i in range(len(qpus))])
plt.show()
# plt.savefig(f"counts_{len(qpus)}_qpus.png", dpi=200)

### Cool, isn't it? But this circuit is too simple, let's try a more complex one!

In [None]:
import json

# importing from examples/circuits/
with open("circuits/circuit_15qubits_10layers.json", "r") as file:
    circuit = json.load(file)

We provide example circuits in `json` format so you can create your own, but as we said, this is not necessary since `qiskit.QuantumCircuit` and `OpenQASM2` are supported.

### This circuit has 15 qubits and 10 intermediate measurements, let's run it in AerSimulator

In [8]:
for qpu in qpus:
    if qpu.backend.name == "BasicAer":
        qpu0 = qpu
        break

qjob = qpu0.run(circuit, transpile = True, shots = 1000)

result = qjob.result()  # blocking call

time_taken = qjob.time_taken()

counts.append(result.get_counts())

print(f"Result: Time taken: {time_taken} s.")

Result: Time taken: 13.459400542 s.


### Takes much longer ... let's parallelize n executions in n different QPUs

Remember that sending a circuit to a given QPU is a **non-blocking call**, so we can submit the circuits in a loop, keeping the `qjob.QJob` objects in a list.

Then we can wait for all the jobs to finish with the `gather()` function. Let's measure the time to check that we are parallelizing:

In [9]:
import time
from cunqa import gather

qjobs = []
n = len(qpus)

tick = time.time()

for qpu in qpus:
    qjobs.append(qpu.run(circuit, transpile = True, shots = 1000))
    
results = gather(qjobs)  # this is a blocking call
tack = time.time()
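
The submit-then-gather pattern of the cell above can be tried without any QPU. The sketch below is only an analogy built on Python's standard `concurrent.futures`: `fake_simulation` is a hypothetical stand-in for a circuit execution, `submit()` plays the role of the non-blocking run call and `result()` the role of the blocking wait. It does not use or reproduce `cunqa` internals.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_simulation(duration):
    """Stand-in for a circuit execution: sleeps, then returns its duration."""
    time.sleep(duration)
    return duration


durations = [0.2, 0.2, 0.2]

with ThreadPoolExecutor(max_workers=len(durations)) as pool:
    tick = time.perf_counter()
    # submit() returns immediately, like a non-blocking run call
    futures = [pool.submit(fake_simulation, d) for d in durations]
    # result() blocks until each job is done, like gathering the jobs
    results = [f.result() for f in futures]
    elapsed = time.perf_counter() - tick

print(f"elapsed: {elapsed:.2f} s, sum of durations: {sum(durations):.2f} s")
```

Because all jobs are submitted before any result is requested, the elapsed time stays close to the longest job instead of the sum of all of them.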

In [10]:
print(f"Time taken to run {n} circuits in parallel: {tack - tick} s.")
print("Time for each execution:")
for i, result in enumerate(results):
    print(f"For QJob {i}, time taken: {result.time_taken} s.")

Time taken to run 2 circuits in parallel: 13.609957695007324 s.
Time for each execution:
For QJob 0, time taken: 13.411032974 s.
For QJob 1, time taken: 13.45488732 s.


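The total wall-clock time (about 13.6 s) is close to the time of a single execution (about 13.4 s) rather than to the sum of both, which confirms that the circuits were run in parallel.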
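
The comparison can be made explicit with a few lines of arithmetic on the numbers printed above. The "sequential estimate" is simply the sum of the individual execution times, an assumption about what a one-after-another run would cost:

```python
wall_time = 13.609957695007324         # total time measured around gather()
job_times = [13.411032974, 13.45488732]  # time taken by each QJob

sequential_estimate = sum(job_times)
speedup = sequential_estimate / wall_time

print(f"Sequential estimate: {sequential_estimate:.2f} s")  # 26.87 s
print(f"Speedup: {speedup:.2f}x")                           # 1.97x
```

A speedup close to the number of QPUs used (2 in this run) is what we expect when the executions overlap almost completely.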