In [None]:
# Setup: install Qiskit (runs automatically in Colab, no-op in Binder)
!pip install -q qiskit qiskit-aer qiskit-ibm-runtime pylatexenc

In [None]:
# Additional dependencies for this notebook
!pip install -q qiskit-serverless qiskit-ibm-catalog

# Transpilation optimizations with SABRE
*Usage estimate: under one minute on a Heron r2 processor (NOTE: This is an estimate only. Your runtime might vary.)*
## Background
Transpilation is a critical step in Qiskit that converts quantum circuits into forms compatible with specific quantum hardware. It involves two main stages: **qubit layout** (mapping logical qubits to physical qubits on the device) and **gate routing** (ensuring that multi-qubit gates respect the device connectivity by inserting SWAP gates as needed).
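To make the link between layout and routing concrete, here is a minimal sketch in plain Python. It assumes a hypothetical five-qubit linear device (`line`) and two hand-picked layouts; the helper names `distance` and `swaps_needed` are illustrative and not part of Qiskit. It only counts the SWAPs a single two-qubit gate would need in isolation, which is a much simpler question than the one SABRE solves.

```python
from collections import deque


def distance(coupling_edges, start, goal):
    """Shortest-path length between two physical qubits (BFS, undirected)."""
    neighbors = {}
    for a, b in coupling_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("qubits are not connected")


def swaps_needed(coupling_edges, layout, logical_a, logical_b):
    """SWAPs needed to run one two-qubit gate, ignoring every other gate."""
    return distance(coupling_edges, layout[logical_a], layout[logical_b]) - 1


# Hypothetical 5-qubit linear device: 0 - 1 - 2 - 3 - 4
line = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Layouts map logical qubit -> physical qubit.
trivial_layout = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
better_layout = {0: 3, 1: 1, 2: 2, 3: 0, 4: 4}  # logical 0 and 4 are neighbors

print(swaps_needed(line, trivial_layout, 0, 4))  # 3
print(swaps_needed(line, better_layout, 0, 4))   # 0
```

The same gate costs three SWAPs under one layout and none under another, which is why the layout and routing stages are optimized together.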

SABRE (*SWAP-Based Bidirectional heuristic search algorithm*) is a powerful optimization tool for both layout and routing. It is especially effective for **large-scale circuits** (100+ qubits) and devices with complex coupling maps, such as the **IBM&reg; Heron**, where the exponential growth in possible qubit mappings demands efficient solutions.
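To get a feel for how quickly the number of candidate mappings grows, the following sketch counts one-to-one assignments of logical to physical qubits. It assumes a 110-qubit circuit on a 127-qubit device, the sizes used later in this tutorial; the helper name `count_layouts` is illustrative.

```python
import math


def count_layouts(num_logical, num_physical):
    """Number of one-to-one maps from logical qubits to physical qubits."""
    return math.perm(num_physical, num_logical)


print(count_layouts(3, 5))  # 60

n_layouts = count_layouts(110, 127)
print(f"roughly 10^{len(str(n_layouts)) - 1} candidate layouts")
```

Exhaustive search over a space of this size is out of reach, which is why a heuristic such as SABRE samples and refines only a small number of layouts.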

### Why use SABRE?
SABRE minimizes the number of SWAP gates and reduces circuit depth, which improves circuit performance on real hardware. Its heuristic approach makes it well suited to advanced hardware and large, complex circuits. Recent improvements introduced in the [LightSABRE](https://arxiv.org/abs/2409.08368) algorithm further optimize SABRE's performance, offering faster runtimes and fewer SWAP gates. These enhancements make it even more effective for large-scale circuits.

### What you will learn
This tutorial is divided into two parts:
1. Learn to use SABRE with **Qiskit patterns** for advanced optimization of large circuits.
2. Use **qiskit_serverless** to maximize SABRE's potential for scalable and efficient transpilation.

You will:
- Optimize SABRE for circuits with 100+ qubits, surpassing default transpilation settings such as `optimization_level=3`.
- Explore **LightSABRE enhancements** that improve runtime and reduce gate counts.
- Customize key SABRE parameters (`swap_trials`, `layout_trials`, `max_iterations`, `heuristic`) to balance **circuit quality** and **transpilation runtime**.
## Requirements
Before starting this tutorial, be sure you have the following installed:
- Qiskit SDK v1.0 or later, with [visualization](https://docs.quantum.ibm.com/api/qiskit/visualization) support
- Qiskit Runtime v0.28 or later (`pip install qiskit-ibm-runtime`)
- Serverless (`pip install qiskit-ibm-catalog qiskit_serverless`)
## Setup

In [1]:
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
from qiskit_ibm_catalog import QiskitServerless, QiskitFunction
from qiskit_ibm_runtime import QiskitRuntimeService
from qiskit_ibm_runtime import EstimatorOptions
from qiskit_ibm_runtime import EstimatorV2 as Estimator
from qiskit.transpiler import CouplingMap
from qiskit.transpiler.passes import SabreLayout, SabreSwap
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
import matplotlib.pyplot as plt
import numpy as np
import time

## Part I. Using SABRE with Qiskit patterns

SABRE can be used in Qiskit to optimize quantum circuits by handling both the qubit layout and gate routing stages. In this section, we guide you through the **minimal example** of using SABRE with Qiskit patterns, with the primary focus on step 2 of optimization.

To run SABRE, you need:
- A **DAG** (Directed Acyclic Graph) representation of your quantum circuit.
- The **coupling map** from the backend, which specifies how qubits are physically connected.
- The **SABRE pass**, which applies the algorithm to optimize the layout and routing.

For this part, we focus on the **SabreLayout** pass. It performs both layout and routing trials, working to find the most efficient initial layout while minimizing the number of SWAP gates. Importantly, `SabreLayout` on its own optimizes both layout and routing internally by keeping the solution that adds the fewest SWAP gates. Note that when using only **SabreLayout**, we cannot change SABRE's heuristic, but we can customize the number of `layout_trials`.

### Step 1: Map classical inputs to a quantum problem

A **GHZ (Greenberger-Horne-Zeilinger)** circuit is a quantum circuit that prepares an entangled state that is an equal superposition of the `|0...0⟩` and `|1...1⟩` states. The GHZ state for $n$ qubits is mathematically represented as:
$$ |\text{GHZ}\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle^{\otimes n} + |1\rangle^{\otimes n} \right) $$

It is constructed by:
1. Applying a Hadamard gate to the first qubit to create superposition.
2. Applying a series of CNOT gates to entangle the remaining qubits with the first.
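The two construction steps can be checked on a small example with a plain-Python state-vector sketch. The helpers `apply_h`, `apply_cx`, and `star_ghz_state` are illustrative names, not Qiskit functions, and the approach only scales to a handful of qubits.

```python
import math


def apply_h(state, qubit):
    """Apply a Hadamard gate to `qubit` of a state stored as a list of amplitudes."""
    new_state = [0.0] * len(state)
    bit = 1 << qubit
    for index, amp in enumerate(state):
        if amp == 0.0:
            continue
        sign = -1.0 if index & bit else 1.0
        new_state[index & ~bit] += amp / math.sqrt(2)
        new_state[index | bit] += sign * amp / math.sqrt(2)
    return new_state


def apply_cx(state, control, target):
    """Apply a CNOT gate: flip `target` wherever `control` is 1."""
    new_state = [0.0] * len(state)
    for index, amp in enumerate(state):
        if index & (1 << control):
            index ^= 1 << target
        new_state[index] = amp
    return new_state


def star_ghz_state(num_qubits):
    """Hadamard on qubit 0, then a CNOT from qubit 0 to every other qubit."""
    state = [0.0] * (2**num_qubits)
    state[0] = 1.0
    state = apply_h(state, 0)
    for target in range(1, num_qubits):
        state = apply_cx(state, 0, target)
    return state


state = star_ghz_state(4)
nonzero = {
    format(i, "04b"): round(a, 4) for i, a in enumerate(state) if abs(a) > 1e-12
}
print(nonzero)  # {'0000': 0.7071, '1111': 0.7071}
```

Only the all-zeros and all-ones basis states carry amplitude, each with weight $1/\sqrt{2}$, matching the formula above.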

For this example, we intentionally construct a **star-topology GHZ circuit** instead of a linear one. In the star topology, the first qubit acts as a "hub," and all other qubits are entangled directly with it using CNOT gates. This choice is deliberate: the **linear-topology GHZ circuit** can theoretically be implemented in $ O(N) $ depth on a linear coupling map without any SWAP gates, so SABRE would trivially find an optimal solution by mapping a 100-qubit GHZ circuit to a subgraph of the backend's heavy-hex coupling map.

The **star-topology GHZ circuit** poses a significantly more challenging problem. Although it can still theoretically be implemented in $ O(N) $ depth without SWAP gates, finding this solution requires identifying an optimal initial layout, which is much harder because of the circuit's non-linear connectivity. This topology serves as a better test case for evaluating SABRE, because it shows how configuration parameters affect layout and routing performance under more complex conditions.

![ghz_star_topology.png](../docs/images/tutorials/transpilation-optimizations-with-sabre/ghz_star_topology.avif)

Notably:
- The **HighLevelSynthesis** tool can produce the optimal $ O(N) $ depth solution for the star-topology GHZ circuit without introducing SWAP gates, as shown in the image above.
- Alternatively, the **StarPrerouting** pass can further reduce the depth by guiding SABRE's routing decisions, though it might still introduce some SWAP gates. However, StarPrerouting increases the runtime and requires integration into the initial transpilation process.

For the purposes of this tutorial, we exclude both HighLevelSynthesis and StarPrerouting to isolate and highlight the direct impact of SABRE configuration on runtime and circuit depth. By measuring the expectation value $ \langle Z_0 Z_i \rangle $ for each qubit pair, we analyze:
- How well SABRE reduces SWAP gates and circuit depth.
- The effect of these optimizations on the fidelity of the executed circuit, where deviations from $ \langle Z_0 Z_i \rangle = 1 $ indicate loss of entanglement.

In [2]:
# set seed for reproducibility
seed = 42
num_qubits = 110

# Create GHZ circuit
qc = QuantumCircuit(num_qubits)
qc.h(0)
for i in range(1, num_qubits):
    qc.cx(0, i)

qc.measure_all()

Next, we map the operators of interest to evaluate the behavior of the system. Specifically, we use `ZZ` operators between qubits to examine how the entanglement degrades as the qubits get farther apart. This analysis is critical because inaccuracies in the expectation values $\langle Z_0 Z_i \rangle$ for distant qubits can reveal the impact of noise and errors in the circuit execution. By studying these deviations, we gain insight into how well the circuit preserves entanglement under different SABRE configurations and how effectively SABRE minimizes the impact of hardware constraints.
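As an illustration of what $\langle Z_0 Z_i \rangle$ measures, the sketch below estimates it from measured bitstrings: shots where qubits 0 and $i$ agree contribute $+1$, and shots where they differ contribute $-1$. This tutorial obtains the values from the `Estimator` primitive instead; the helper `zz_expectation` and the counts are made up for the example.

```python
def zz_expectation(counts, i):
    """Estimate <Z_0 Z_i> from {bitstring: shots}.

    Bitstrings are read with qubit 0 as the rightmost character.
    """
    total = sum(counts.values())
    value = 0
    for bitstring, shots in counts.items():
        bit_0 = int(bitstring[-1])
        bit_i = int(bitstring[-1 - i])
        value += shots if bit_0 == bit_i else -shots
    return value / total


# Made-up counts for a 4-qubit GHZ state, with and without errors.
ideal_counts = {"0000": 500, "1111": 500}
noisy_counts = {"0000": 480, "1111": 470, "0111": 30, "1000": 20}

print(zz_expectation(ideal_counts, 3))  # 1.0
print(zz_expectation(noisy_counts, 3))  # 0.9
print(zz_expectation(noisy_counts, 1))  # 1.0
```

In the noisy example only qubit 3 disagrees with qubit 0, so $\langle Z_0 Z_3 \rangle$ drops below 1 while $\langle Z_0 Z_1 \rangle$ stays at 1.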

In [3]:
# ZZII...II, ZIZI...II, ... , ZIII...IZ
operator_strings = [
    "Z" + "I" * i + "Z" + "I" * (num_qubits - 2 - i)
    for i in range(num_qubits - 1)
]
print(operator_strings)
print(len(operator_strings))

operators = [SparsePauliOp(operator) for operator in operator_strings]

['ZZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIIIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIIIIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIIIIIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIIIIIIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIIIIIIIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII', 'ZIIIIIIIIZIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

### Step 2: Optimize problem for quantum hardware execution

In this step, we focus on optimizing the circuit layout for execution on a specific quantum hardware device with 127 qubits. This is the main focus of the tutorial, as we perform **SABRE optimizations and transpilation** to achieve the best circuit performance. Using the `SabreLayout` pass, we determine an initial qubit mapping that minimizes the need for SWAP gates during routing. By passing the `coupling_map` of the target backend, `SabreLayout` adapts the layout to the device's connectivity constraints.

We will use `generate_preset_pass_manager` with `optimization_level=3` for the transpilation process and customize the `SabreLayout` pass with different configurations. The goal is to find a setup that produces a transpiled circuit with the **lowest size and/or depth**, demonstrating the impact of SABRE optimizations.

#### Why Are Circuit Size and Depth Important?

- **Lower size (gate count):** Reduces the number of operations, minimizing opportunities for errors to accumulate.
- **Lower depth:** Shortens the overall execution time, which is critical for avoiding decoherence and maintaining quantum state fidelity.

By optimizing these metrics, we improve the circuit’s reliability and execution accuracy on noisy quantum hardware.
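The two metrics can be illustrated on a circuit written as a plain list of gates. This is a sketch with illustrative helper names (`circuit_size`, `two_qubit_depth`), assuming each gate is given as a tuple of the qubit indices it acts on; it mirrors in spirit what `size()` and a depth count restricted to two-qubit gates report.

```python
def circuit_size(gates):
    """Total number of gates."""
    return len(gates)


def two_qubit_depth(gates, num_qubits):
    """Number of layers of two-qubit gates; single-qubit gates are ignored."""
    level = [0] * num_qubits
    for qubits in gates:
        if len(qubits) != 2:
            continue
        layer = max(level[q] for q in qubits) + 1
        for q in qubits:
            level[q] = layer
    return max(level, default=0)


# Star-shaped GHZ on 5 qubits: one H, then CX from qubit 0 to each other qubit.
star = [(0,)] + [(0, t) for t in range(1, 5)]
# Two gates on disjoint qubits share a layer; the third depends on both.
partly_parallel = [(0, 1), (2, 3), (1, 2)]

print(circuit_size(star), two_qubit_depth(star, 5))                        # 5 4
print(circuit_size(partly_parallel), two_qubit_depth(partly_parallel, 4))  # 3 2
```

Gates that act on disjoint qubits can share a layer, so depth can be smaller than size; every SWAP added by routing increases size and may also increase depth.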

Select the backend.

In [4]:
service = QiskitRuntimeService()
# backend = service.least_busy(
#    operational=True, simulator=False, min_num_qubits=127
# )
backend = service.backend("ibm_boston")
print(f"Using backend: {backend.name}")

Using backend: ibm_boston



In [5]:
# Get the coupling map from the backend
cmap = CouplingMap(backend.configuration().coupling_map)

# Create the SabreLayout passes for the custom configurations
sl_2 = SabreLayout(
    coupling_map=cmap,
    seed=seed,
    max_iterations=4,
    layout_trials=200,
    swap_trials=200,
)
sl_3 = SabreLayout(
    coupling_map=cmap,
    seed=seed,
    max_iterations=8,
    layout_trials=200,
    swap_trials=200,
)

# Create the pass managers, need to first create then configure the SabreLayout passes
pm_1 = generate_preset_pass_manager(
    optimization_level=3, backend=backend, seed_transpiler=seed
)
pm_2 = generate_preset_pass_manager(
    optimization_level=3, backend=backend, seed_transpiler=seed
)
pm_3 = generate_preset_pass_manager(
    optimization_level=3, backend=backend, seed_transpiler=seed
)

Now we can configure the `SabreLayout` pass in the custom pass managers. For the default `generate_preset_pass_manager` at `optimization_level=3`, the `SabreLayout` pass is at index 2 of the layout stage, because it runs after the `SetLayout` and `VF2Layout` passes. We can access this pass and replace it with our configured versions.

In [6]:
pm_2.layout.replace(index=2, passes=sl_2)
pm_3.layout.replace(index=2, passes=sl_3)

To evaluate the impact of different configurations on circuit optimization, we create three pass managers, each with unique settings for the `SabreLayout` pass. These configurations help analyze the trade-off between circuit quality and transpilation time.

#### Key parameters
- **`max_iterations`**: The number of forward-backward routing iterations used to refine the layout and reduce routing costs.
- **`layout_trials`**: The number of random initial layouts tested; the one that minimizes SWAP gates is selected.
- **`swap_trials`**: The number of routing trials for each layout, refining gate placement for better routing.

Increase `layout_trials` and `swap_trials` to perform a more thorough optimization, at the cost of increased transpilation time.
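The effect of running more trials can be sketched with a toy random search. This is not SABRE's algorithm: it assumes a hypothetical eight-qubit linear device, a star-shaped circuit, and a simplified cost (`routing_cost`) that counts the extra hops each two-qubit gate would need under a fixed layout.

```python
import random


def routing_cost(layout, gates):
    """Toy cost on a linear device: extra hops needed by each two-qubit gate."""
    return sum(abs(layout[a] - layout[b]) - 1 for a, b in gates)


def best_of_trials(num_trials, num_qubits, gates, seed):
    """Try `num_trials` random layouts and keep the cheapest one."""
    rng = random.Random(seed)
    best_layout, best_cost = None, float("inf")
    for _ in range(num_trials):
        positions = list(range(num_qubits))
        rng.shuffle(positions)  # positions[logical] = physical qubit on the line
        cost = routing_cost(positions, gates)
        if cost < best_cost:
            best_layout, best_cost = positions, cost
    return best_layout, best_cost


star_gates = [(0, t) for t in range(1, 8)]
_, cost_20 = best_of_trials(20, 8, star_gates, seed=42)
_, cost_200 = best_of_trials(200, 8, star_gates, seed=42)
print(f"best cost after 20 trials: {cost_20}, after 200 trials: {cost_200}")
```

With the same seed, the first 20 layouts of the longer run are the same as those of the shorter run, so the 200-trial result can never be worse. The price is ten times as many evaluations, which is the same quality-versus-time trade-off described above.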

#### Configurations in this tutorial
1. **`pm_1`**: Default settings with `optimization_level=3`.
   - `max_iterations=4`
   - `layout_trials=20`
   - `swap_trials=20`

2. **`pm_2`**: Increases the number of trials for better exploration.
   - `max_iterations=4`
   - `layout_trials=200`
   - `swap_trials=200`

3. **`pm_3`**: Extends `pm_2` by increasing the number of iterations for further refinement.
   - `max_iterations=8`
   - `layout_trials=200`
   - `swap_trials=200`

By comparing the results of these configurations, we aim to determine which achieves the best balance between circuit quality (for example, size and depth) and computational cost.

In [7]:
# Transpile the circuit with each pass manager and measure the time
t0 = time.time()
tqc_1 = pm_1.run(qc)
t1 = time.time() - t0
t0 = time.time()
tqc_2 = pm_2.run(qc)
t2 = time.time() - t0
t0 = time.time()
tqc_3 = pm_3.run(qc)
t3 = time.time() - t0

# Obtain the depths and the total number of gates (circuit size)
depth_1 = tqc_1.depth(lambda x: x.operation.num_qubits == 2)
depth_2 = tqc_2.depth(lambda x: x.operation.num_qubits == 2)
depth_3 = tqc_3.depth(lambda x: x.operation.num_qubits == 2)
size_1 = tqc_1.size()
size_2 = tqc_2.size()
size_3 = tqc_3.size()

# Transform the observables to match the backend's ISA
operators_list_1 = [op.apply_layout(tqc_1.layout) for op in operators]
operators_list_2 = [op.apply_layout(tqc_2.layout) for op in operators]
operators_list_3 = [op.apply_layout(tqc_3.layout) for op in operators]

# Compute improvements compared to pass manager 1 (default)
depth_improvement_2 = ((depth_1 - depth_2) / depth_1) * 100
depth_improvement_3 = ((depth_1 - depth_3) / depth_1) * 100
size_improvement_2 = ((size_1 - size_2) / size_1) * 100
size_improvement_3 = ((size_1 - size_3) / size_1) * 100
time_increase_2 = ((t2 - t1) / t1) * 100
time_increase_3 = ((t3 - t1) / t1) * 100

print(
    f"Pass manager 1 (4,20,20)  : Depth {depth_1}, Size {size_1}, Time {t1:.4f} s"
)
print(
    f"Pass manager 2 (4,200,200): Depth {depth_2}, Size {size_2}, Time {t2:.4f} s"
)
print(f"  - Depth improvement: {depth_improvement_2:.2f}%")
print(f"  - Size improvement: {size_improvement_2:.2f}%")
print(f"  - Time increase: {time_increase_2:.2f}%")
print(
    f"Pass manager 3 (8,200,200): Depth {depth_3}, Size {size_3}, Time {t3:.4f} s"
)
print(f"  - Depth improvement: {depth_improvement_3:.2f}%")
print(f"  - Size improvement: {size_improvement_3:.2f}%")
print(f"  - Time increase: {time_increase_3:.2f}%")

Pass manager 1 (4,20,20)  : Depth 439, Size 2346, Time 0.5775 s
Pass manager 2 (4,200,200): Depth 395, Size 2070, Time 3.9927 s
  - Depth improvement: 10.02%
  - Size improvement: 11.76%
  - Time increase: 591.43%
Pass manager 3 (8,200,200): Depth 375, Size 1873, Time 2.3079 s
  - Depth improvement: 14.58%
  - Size improvement: 20.16%
  - Time increase: 299.67%



In [8]:
# Plot the results of the metrics
times = [t1, t2, t3]
depths = [depth_1, depth_2, depth_3]
sizes = [size_1, size_2, size_3]
pm_names = [
    "pm_1 (4 iter, 20 trials)",
    "pm_2 (4 iter, 200 trials)",
    "pm_3 (8 iter, 200 trials)",
]
colors = plt.cm.viridis(np.linspace(0.2, 0.8, len(pm_names)))

# Create a figure with three subplots
fig, axs = plt.subplots(3, 1, figsize=(6, 9), sharex=True)
axs[0].bar(pm_names, times, color=colors)
axs[0].set_ylabel("Time (s)", fontsize=12)
axs[0].set_title("Transpilation Time", fontsize=14)
axs[0].grid(axis="y", linestyle="--", alpha=0.7)
axs[1].bar(pm_names, depths, color=colors)
axs[1].set_ylabel("Depth", fontsize=12)
axs[1].set_title("Circuit Depth", fontsize=14)
axs[1].grid(axis="y", linestyle="--", alpha=0.7)
axs[2].bar(pm_names, sizes, color=colors)
axs[2].set_ylabel("Size", fontsize=12)
axs[2].set_title("Circuit Size", fontsize=14)
axs[2].set_xticks(range(len(pm_names)))
axs[2].set_xticklabels(pm_names, fontsize=10, rotation=15)
axs[2].grid(axis="y", linestyle="--", alpha=0.7)

# Add some spacing between subplots
plt.tight_layout()
plt.show()

<Image src="../docs/images/tutorials/transpilation-optimizations-with-sabre/extracted-outputs/818a8997-d2c7-4661-a6ea-f58eac376bf8-0.avif" alt="Output of the previous code cell" />

With each pass manager configured, we run the transpilation process for each one. To compare results, we track key metrics: the transpilation time, the depth of the circuit (measured as two-qubit gate depth), and the total number of gates in the transpiled circuits.

In [9]:
options = EstimatorOptions()
options.resilience_level = 2
options.dynamical_decoupling.enable = True
options.dynamical_decoupling.sequence_type = "XY4"

# Create an Estimator object
estimator = Estimator(backend, options=options)

In [10]:
# Submit the circuit to Estimator
job_1 = estimator.run([(tqc_1, operators_list_1)])
job_1_id = job_1.job_id()
print(job_1_id)

job_2 = estimator.run([(tqc_2, operators_list_2)])
job_2_id = job_2.job_id()
print(job_2_id)

job_3 = estimator.run([(tqc_3, operators_list_3)])
job_3_id = job_3.job_id()
print(job_3_id)

d5k0qs7853es738dab6g
d5k0qsf853es738dab70
d5k0qsf853es738dab7g


In [11]:
# Retrieve the results (each call blocks until the job finishes)
result_1 = job_1.result()[0]
print("Job 1 done")
result_2 = job_2.result()[0]
print("Job 2 done")
result_3 = job_3.result()[0]
print("Job 3 done")

Job 1 done
Job 2 done
Job 3 done



### Step 3: Execute using Qiskit primitives
In this step, we use the `Estimator` primitive to calculate the expectation values $\langle Z_0 Z_i \rangle$ for the `ZZ` operators, evaluating the entanglement and execution quality of the transpiled circuits. To align with typical user workflows, we submit the job for execution and apply error suppression using **dynamical decoupling**, a technique that mitigates decoherence by inserting gate sequences that preserve qubit states. Additionally, we specify a resilience level to counteract noise, with higher levels providing more accurate results at the cost of increased processing time. This approach evaluates the performance of each pass manager configuration under realistic execution conditions.
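The reason an inserted decoupling sequence leaves the computation unchanged can be checked with a few lines of matrix arithmetic. The sketch below assumes an idealized XY4 sequence of four instantaneous pulses X, Y, X, Y; the pulse signs and timings used on real hardware may differ, and `matmul` is a small illustrative helper.

```python
def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
        for r in range(2)
    ]


X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

# Pulses are applied in time order X, Y, X, Y; later pulses multiply from the left.
sequence = [X, Y, X, Y]
total = [[1, 0], [0, 1]]
for pulse in sequence:
    total = matmul(pulse, total)

print(total)
```

The product equals the identity up to a global phase of $-1$, so an idle qubit returns to its original state while slowly varying noise accumulated between the pulses tends to cancel.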

In [12]:
data = list(range(1, len(operators) + 1))  # Distance between the Z operators

values_1 = list(result_1.data.evs)
values_2 = list(result_2.data.evs)
values_3 = list(result_3.data.evs)

plt.plot(
    data,
    values_1,
    marker="o",
    label="pm_1 (iters=4, swap_trials=20, layout_trials=20)",
)
plt.plot(
    data,
    values_2,
    marker="s",
    label="pm_2 (iters=4, swap_trials=200, layout_trials=200)",
)
plt.plot(
    data,
    values_3,
    marker="^",
    label="pm_3 (iters=8, swap_trials=200, layout_trials=200)",
)
plt.xlabel("Distance between qubits $i$")
plt.ylabel(r"$\langle Z_0 Z_i \rangle$")
plt.legend()
plt.show()

<Image src="../docs/images/tutorials/transpilation-optimizations-with-sabre/extracted-outputs/bc6cb36f-4bf2-4275-baf5-9557fcba520a-0.avif" alt="Output of the previous code cell" />

### Analysis of Results

The plot shows the expectation values $\langle Z_0 Z_i \rangle$ as a function of the distance between qubits for three pass manager configurations with increasing levels of optimization. In the ideal case, these values remain close to 1, indicating strong correlations across the circuit. As the distance increases, noise and accumulated errors lead to a decay in correlations, revealing how well each transpilation strategy preserves the underlying structure of the state.

Among the three configurations, `pm_1` clearly performs the worst. Its correlation values decay rapidly as the distance increases and approach zero much earlier than the other two configurations. This behavior is consistent with its larger circuit depth and gate count, where accumulated noise quickly degrades long-range correlations.

Both `pm_2` and `pm_3` represent significant improvements over `pm_1` across essentially all distances. On average, `pm_3` exhibits the strongest overall performance, maintaining higher correlation values over longer distances and showing a more gradual decay. This aligns with its more aggressive optimization, which produces shallower circuits that are generally more robust to noise accumulation.

That said, `pm_2` shows noticeably better accuracy at short distances compared to `pm_3`, despite having a slightly larger depth and gate count. This suggests that circuit depth alone does not fully determine performance; the specific structure produced by the transpilation, including how entangling gates are arranged and how errors propagate through the circuit, also plays an important role. In some cases, the transformations applied by `pm_2` appear to better preserve local correlations, even if they do not scale as well to longer distances.

Taken together, these results highlight a trade-off between circuit compactness and circuit structure. While increased optimization generally improves long-range stability, the best performance for a given observable depends on both reducing circuit depth and producing a structure that is well matched to the noise characteristics of the hardware.

## Part II. Configuring the heuristic in SABRE and using Serverless

In addition to adjusting trial numbers, SABRE supports customization of the routing heuristic used during transpilation. By default, `SabreLayout` employs the decay heuristic, which dynamically weights qubits based on their likelihood of being swapped. To use a different heuristic (such as the `lookahead` heuristic), you can create a custom `SabreSwap` pass and connect it to `SabreLayout` by running a `PassManager` with `FullAncillaAllocation`, `EnlargeWithAncilla`, and `ApplyLayout`. When using `SabreSwap` as a parameter for `SabreLayout`, only one layout trial is performed by default. To efficiently run multiple layout trials, we leverage the serverless runtime for parallelization. For more about serverless, see the [Serverless documentation](/docs/guides/serverless).

### How to Change the Routing Heuristic
1. Create a custom `SabreSwap` pass with the desired heuristic.
2. Use this custom `SabreSwap` as the routing method for the `SabreLayout` pass.

While it is possible to run multiple layout trials using a loop, the serverless runtime is the better choice for large-scale, more demanding experiments. Serverless supports parallel execution of layout trials, significantly speeding up the optimization of larger circuits and large experimental sweeps. This makes it especially valuable when working with resource-intensive tasks or when time efficiency is critical.
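The fan-out pattern behind parallel layout trials can be sketched locally with the standard library. This is only an analogy and not Qiskit Serverless itself: `run_trial` is a hypothetical stand-in for one seeded transpilation, and its cost is a toy number (extra hops for a star-shaped circuit on an eight-qubit line) rather than a real circuit depth.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_trial(seed):
    """Stand-in for one seeded transpilation; returns (seed, toy_cost)."""
    rng = random.Random(seed)
    positions = list(range(8))
    rng.shuffle(positions)  # random placement of 8 logical qubits on a line
    cost = sum(abs(positions[0] - positions[t]) - 1 for t in range(1, 8))
    return seed, cost


seed_list = [42 + i for i in range(20)]

# Each seed is an independent task, so the trials can be dispatched together.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_trial, seed_list))

best_seed, best_cost = min(results, key=lambda item: item[1])
print(f"best seed: {best_seed}, toy cost: {best_cost}")
```

Threads are used here only to show the structure: one independent task per seed, followed by a reduction that keeps the best result. For real transpilation workloads the tasks would run on separate workers.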

This section focuses solely on step 2 of optimization: minimizing circuit size and depth to achieve the best possible transpiled circuit. Building on the earlier results, we now explore how heuristic customization and serverless parallelization can further enhance optimization performance, making it suitable for large-scale quantum circuit transpilation.

### Results without serverless runtime (1 layout trial)

In [17]:
swap_trials = 1000

# Default PassManager with `SabreLayout` and `SabreSwap`, using heuristic "decay"
sr_default = SabreSwap(
    coupling_map=cmap, heuristic="decay", trials=swap_trials, seed=seed
)
sl_default = SabreLayout(
    coupling_map=cmap, routing_pass=sr_default, seed=seed
)
pm_default = generate_preset_pass_manager(
    optimization_level=3, backend=backend, seed_transpiler=seed
)
pm_default.layout.replace(index=2, passes=sl_default)
pm_default.routing.replace(index=1, passes=sr_default)

t0 = time.time()
tqc_default = pm_default.run(qc)
t_default = time.time() - t0
size_default = tqc_default.size()
depth_default = tqc_default.depth(lambda x: x.operation.num_qubits == 2)


# Custom PassManager with `SabreLayout` and `SabreSwap`, using heuristic "lookahead"
sr_custom = SabreSwap(
    coupling_map=cmap, heuristic="lookahead", trials=swap_trials, seed=seed
)
sl_custom = SabreLayout(coupling_map=cmap, routing_pass=sr_custom, seed=seed)
pm_custom = generate_preset_pass_manager(
    optimization_level=3, backend=backend, seed_transpiler=seed
)
pm_custom.layout.replace(index=2, passes=sl_custom)
pm_custom.routing.replace(index=1, passes=sr_custom)

t0 = time.time()
tqc_custom = pm_custom.run(qc)
t_custom = time.time() - t0
size_custom = tqc_custom.size()
depth_custom = tqc_custom.depth(lambda x: x.operation.num_qubits == 2)

print(
    f"Default (heuristic='decay')    : Depth {depth_default}, Size {size_default}, Time {t_default}"
)
print(
    f"Custom  (heuristic='lookahead'): Depth {depth_custom}, Size {size_custom}, Time {t_custom}"
)

Default (heuristic='decay')    : Depth 443, Size 3115, Time 1.034372091293335
Custom  (heuristic='lookahead'): Depth 432, Size 2856, Time 0.6669301986694336


Here we see that the `lookahead` heuristic performs better than the `decay` heuristic in terms of circuit depth, size, and time. This improvement highlights how SABRE can be tuned beyond trials and iterations for your specific circuit and hardware constraints. Note that these results are based on a single layout trial. For more reliable results, we recommend running multiple layout trials, which can be done efficiently using the serverless runtime.

### Results with serverless runtime (multiple layout trials)

Qiskit Serverless requires setting up your workload’s `.py` files into a dedicated directory. The following code cell is a Python file in the `source_files` directory named `transpile_remote.py`. This file contains the function that runs the transpilation process.

In [18]:
# This cell is hidden from users, it makes sure the `source_files` directory exists
from pathlib import Path

Path("source_files").mkdir(exist_ok=True)

In [26]:
%%writefile source_files/transpile_remote.py
import time
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit.transpiler.passes import SabreLayout, SabreSwap
from qiskit.transpiler import CouplingMap
from qiskit_serverless import get_arguments, save_result, distribute_task, get
from qiskit_ibm_runtime import QiskitRuntimeService

@distribute_task(target={
    "cpu": 1,
    "mem": 1024 * 1024 * 1024
})
def transpile_remote(qc, optimization_level, backend_name, seed, swap_trials, heuristic):
    """Transpiles an abstract circuit into an ISA circuit for a given backend."""

    service = QiskitRuntimeService()
    backend = service.backend(backend_name)

    pm = generate_preset_pass_manager(
        optimization_level=optimization_level,
        backend=backend,
        seed_transpiler=seed
    )

    # Changing the `SabreLayout` and `SabreSwap` passes to use the custom configurations
    cmap = CouplingMap(backend.configuration().coupling_map)
    sr = SabreSwap(coupling_map=cmap, heuristic=heuristic, trials=swap_trials, seed=seed)
    sl = SabreLayout(coupling_map=cmap, routing_pass=sr, seed=seed)
    pm.layout.replace(index=2, passes=sl)
    pm.routing.replace(index=1, passes=sr)

    # Measure the transpile time
    start_time = time.time()  # Start timer
    tqc = pm.run(qc)  # Transpile the circuit
    end_time = time.time()  # End timer

    transpile_time = end_time - start_time  # Calculate the elapsed time
    return tqc, transpile_time  # Return both the transpiled circuit and the transpile time


# Get program arguments
arguments = get_arguments()
circuit = arguments.get("circuit")
backend_name = arguments.get("backend_name")
optimization_level = arguments.get("optimization_level")
seed_list = arguments.get("seed_list")
swap_trials = arguments.get("swap_trials")
heuristic = arguments.get("heuristic")

# Transpile the circuits
transpile_worker_references = [
    transpile_remote(circuit, optimization_level, backend_name, seed, swap_trials, heuristic)
    for seed in seed_list
]

results_with_times = get(transpile_worker_references)

# Separate the transpiled circuits and their transpile times
transpiled_circuits = [result[0] for result in results_with_times]
transpile_times = [result[1] for result in results_with_times]

# Save both results and transpile times
save_result({"transpiled_circuits": transpiled_circuits, "transpile_times": transpile_times})

Overwriting source_files/transpile_remote.py


The following cell uploads the `transpile_remote.py` file as a Qiskit Serverless program under the name `transpile_remote_serverless`.

In [27]:
serverless = QiskitServerless()

transpile_remote_demo = QiskitFunction(
    title="transpile_remote_serverless",
    entrypoint="transpile_remote.py",
    working_dir="./source_files/",
)
serverless.upload(transpile_remote_demo)
transpile_remote_serverless = serverless.load("transpile_remote_serverless")

### Step 4: Post-process and return result in desired classical format
Once the job is complete, we analyze the results by plotting the expectation values $\langle Z_0 Z_i \rangle$ for each qubit. In an ideal simulation, all $\langle Z_0 Z_i \rangle$ values should be 1, reflecting perfect entanglement across the qubits. However, because of noise and hardware limitations, the expectation values typically decrease as `i` increases, revealing how entanglement degrades over distance.

In this step, we compare the results of each pass manager configuration with the ideal simulation. By examining the deviation of $\langle Z_0 Z_i \rangle$ from 1 for each configuration, we can quantify how well each pass manager preserves entanglement and mitigates the effects of noise. This analysis directly assesses the impact of the SABRE optimizations on execution fidelity and highlights which configuration best balances optimization quality and execution performance.

The results are visualized to highlight differences across pass managers and to show how improvements in layout and routing influence the final circuit execution on noisy quantum hardware.
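
As a rough illustration of the quantity being plotted, the sketch below estimates $\langle Z_0 Z_i \rangle$ from measurement counts. It assumes the counts are available as a dictionary of bitstrings in Qiskit's little-endian order (qubit 0 is the rightmost character). The function name `zz_expectation` and the counts themselves are hypothetical and are not the tutorial's data; the tutorial's own workflow may obtain these values differently, for example through an estimator.

```python
def zz_expectation(counts, i, j=0):
    """Estimate <Z_j Z_i> from a dict mapping bitstrings to shot counts.

    Assumes Qiskit's little-endian convention: qubit k is the k-th
    character counted from the right end of the bitstring.
    """
    shots = sum(counts.values())
    total = 0
    for bitstring, n in counts.items():
        bit_j = int(bitstring[-1 - j])
        bit_i = int(bitstring[-1 - i])
        # Z has eigenvalue +1 for bit 0 and -1 for bit 1, so the product
        # is +1 when the two bits agree and -1 when they differ.
        total += n if bit_i == bit_j else -n
    return total / shots


# Hypothetical counts for a 4-qubit GHZ state (illustration only).
ideal_counts = {"0000": 512, "1111": 512}
# Same state with a few shots in which qubit 3 was flipped.
noisy_counts = {"0000": 480, "1111": 480, "1000": 32, "0111": 32}

ideal = [zz_expectation(ideal_counts, i) for i in range(4)]
noisy = [zz_expectation(noisy_counts, i) for i in range(4)]
deviation = [1 - value for value in noisy]

print("ideal    :", ideal)
print("noisy    :", noisy)
print("deviation:", deviation)
```

In this toy data only the correlator involving the flipped qubit drops below 1, which is the kind of deviation the comparison in this step is meant to expose.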

In [28]:
num_seeds = 20  # represents the different layout trials
seed_list = [seed + i for i in range(num_seeds)]

![Output of the previous code cell](../docs/images/tutorials/transpilation-optimizations-with-sabre/extracted-outputs/bc6cb36f-4bf2-4275-baf5-9557fcba520a-0.avif)

### Analysis of results
The plot shows the expectation values $\langle Z_0 Z_i \rangle / \langle Z_0 Z_0 \rangle$ as a function of the distance between qubits for three pass manager configurations with increasing optimization levels. In the ideal case, these values remain close to 1, indicating strong correlations across the circuit. As the distance increases, noise and accumulated errors lead to a decay in correlations, revealing how well each transpilation strategy preserves the underlying structure of the state.

Among the three configurations, `pm_1` clearly performs the worst. Its correlation values fall rapidly as the distance increases and approach zero much sooner than those of the other two configurations. This behavior is consistent with its larger circuit depth and gate count, where accumulated noise quickly degrades long-range correlations.

Both `pm_2` and `pm_3` are significant improvements over `pm_1` across essentially all distances. On average, `pm_3` shows the strongest overall performance, maintaining higher correlation values over longer distances and showing a more gradual decay. This is consistent with its more aggressive optimization, which produces shallower circuits that are generally more robust to noise accumulation.

That said, `pm_2` shows noticeably better accuracy at short distances compared to `pm_3`, despite a slightly larger depth and gate count. This suggests that circuit depth alone does not fully determine performance; the specific structure produced by transpilation, including how entangling gates are arranged and how errors propagate through the circuit, also plays an important role. In some cases, the transformations applied by `pm_2` appear to preserve local correlations better, even if they do not scale as well over longer distances.

Taken together, these results highlight a trade-off between circuit compactness and circuit structure. While increased optimization generally improves long-range stability, the best performance for a given observable depends both on reducing circuit depth and on producing a structure that matches the noise characteristics of the hardware well.
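
One way to put a number on "how quickly the correlations decay" is to normalize each curve by its first value and look for the first distance at which it falls below a chosen threshold. The following is a minimal sketch under that assumption; the helper names, the threshold of 0.5, and the two curves are made up for illustration and do not reproduce the tutorial's measurements.

```python
def normalize(values):
    """Divide every <Z_0 Z_i> by <Z_0 Z_0> so that the curve starts at 1."""
    return [value / values[0] for value in values]


def first_distance_below(values, threshold):
    """Return the smallest distance whose normalized value is below threshold.

    Returns None if the curve never drops below the threshold.
    """
    for distance, value in enumerate(normalize(values)):
        if value < threshold:
            return distance
    return None


# Made-up expectation values indexed by distance (illustration only).
curves = {
    "config_a": [0.96, 0.70, 0.40, 0.20, 0.10],
    "config_b": [0.98, 0.90, 0.80, 0.65, 0.52],
}

crossing = {
    name: first_distance_below(values, threshold=0.5)
    for name, values in curves.items()
}
print(crossing)
```

A configuration that crosses the threshold at a smaller distance loses long-range correlations sooner; a result of `None` means the curve stayed above the threshold over the sampled range.
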
## Part II. Configure the heuristic in SABRE and use Serverless
Besides adjusting the number of trials, SABRE supports customizing the routing heuristic used during transpilation. By default, `SabreLayout` uses the decay heuristic, which dynamically weights qubits based on their likelihood of being swapped. To use a different heuristic (such as the `lookahead` heuristic), you can create a custom `SabreSwap` pass and connect it to `SabreLayout` by running a `PassManager` with `FullAncillaAllocation`, `EnlargeWithAncilla`, and `ApplyLayout`. When `SabreSwap` is passed as a parameter to `SabreLayout`, only one layout trial is performed by default. To run multiple layout trials efficiently, we use the serverless runtime for parallelization. For more about serverless, see the [Serverless documentation](/guides/serverless).
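
To make the difference between the heuristics more concrete, here is a toy, pure-Python sketch of how a candidate SWAP could be scored, following the cost functions described in the original SABRE paper: `basic` sums front-layer distances, `lookahead` adds a weighted term for upcoming gates, and `decay` multiplies the lookahead score by a penalty for recently swapped qubits. This is not Qiskit's implementation; the line-shaped device, the helper names, the weight of 0.5, and the exaggerated decay value are all assumptions chosen for illustration.

```python
def line_distance(a, b):
    """Distance between physical qubits on a hypothetical line-shaped device."""
    return abs(a - b)


def apply_swap(layout, swap):
    """Return a new logical->physical layout after swapping two physical qubits."""
    p, q = swap
    exchange = {p: q, q: p}
    return {logical: exchange.get(phys, phys) for logical, phys in layout.items()}


def mean_distance(gates, layout):
    """Average physical distance over a list of two-qubit gates (logical pairs)."""
    if not gates:
        return 0.0
    return sum(line_distance(layout[a], layout[b]) for a, b in gates) / len(gates)


def score(swap, layout, front, extended, decay, heuristic, weight=0.5):
    """Score a candidate SWAP on physical qubits; lower is better."""
    trial = apply_swap(layout, swap)
    if heuristic == "basic":
        return sum(line_distance(trial[a], trial[b]) for a, b in front)
    lookahead = mean_distance(front, trial) + weight * mean_distance(extended, trial)
    if heuristic == "lookahead":
        return lookahead
    if heuristic == "decay":
        return max(decay[swap[0]], decay[swap[1]]) * lookahead
    raise ValueError(f"unknown heuristic: {heuristic}")


layout = {0: 0, 1: 1, 2: 2, 3: 3}         # logical -> physical
front = [(0, 2)]                          # gates that must be routed now
extended = [(0, 3)]                       # gates coming up next
decay = {0: 1.3, 1: 1.0, 2: 1.0, 3: 1.0}  # physical qubit 0 was swapped recently

for heuristic in ("basic", "lookahead", "decay"):
    scores = {
        swap: score(swap, layout, front, extended, decay, heuristic)
        for swap in [(0, 1), (1, 2)]
    }
    print(heuristic, scores)
```

In this example the `basic` heuristic cannot distinguish the two SWAPs (both score 1), `lookahead` prefers `(0, 1)` (2.0 versus 2.5) because it also shortens the upcoming gate, and `decay` prefers `(1, 2)` (2.5 versus 2.6) because it avoids the recently swapped qubit.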

### How to change the routing heuristic
1. Create a custom `SabreSwap` pass with the desired heuristic.
2. Use this custom `SabreSwap` as the routing method for the `SabreLayout` pass.

While it is possible to run multiple layout trials in a loop, the serverless runtime is the better choice for large-scale and more demanding experiments. Serverless supports parallel execution of layout trials, which significantly speeds up the optimization of larger circuits and large experimental sweeps. This makes it especially valuable when working with resource-intensive tasks or when time efficiency is critical.
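
The difference between a serial loop and a fan-out/gather pattern can be mimicked locally with the standard library. The sketch below is only an analogy: `run_trial` is a hypothetical stand-in that sleeps instead of transpiling, and `ThreadPoolExecutor` plays the role that the serverless runtime plays in this tutorial. Real transpilation is CPU-bound, so threads would not give this kind of speedup for it.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_trial(seed):
    """Stand-in for one layout trial; returns (result, elapsed seconds)."""
    start = time.perf_counter()
    time.sleep(0.05)  # pretend to transpile
    result = f"circuit-for-seed-{seed}"
    return result, time.perf_counter() - start


seed_list = list(range(8))

# Serial: trials run one after another in a plain loop.
start = time.perf_counter()
serial = [run_trial(seed) for seed in seed_list]
serial_wall = time.perf_counter() - start

# Parallel: submit every trial first, then gather the results in order.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(seed_list)) as pool:
    futures = [pool.submit(run_trial, seed) for seed in seed_list]
    parallel = [future.result() for future in futures]
parallel_wall = time.perf_counter() - start

print(f"serial: {serial_wall:.2f}s, parallel: {parallel_wall:.2f}s")
```

Both variants return the same list of results; only the wall-clock time differs, because the parallel version pays roughly the cost of the slowest trial rather than the sum of all trials.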

This section focuses only on step 2 of optimization: minimizing circuit size and depth to achieve the best possible transpiled circuit. Building on the earlier results, we now explore how heuristic customization and serverless parallelization can further improve optimization performance, making it suitable for large-scale quantum circuit transpilation.
### Results without serverless runtime (1 layout trial):

In [29]:
job_lookahead = transpile_remote_serverless.run(
    circuit=qc,
    backend_name=backend.name,
    optimization_level=3,
    seed_list=seed_list,
    swap_trials=swap_trials,
    heuristic="lookahead",
)

In [30]:
job_lookahead.job_id

'15767dfc-e71d-4720-94d6-9212f72334c2'

In [31]:
job_lookahead.status()

'QUEUED'

Receive the logs and results from the serverless runtime.

In [21]:
logs_lookahead = job_lookahead.logs()
print(logs_lookahead)

No logs yet.


Once a program is `DONE`, you can use `job.result()` to fetch the result stored in `save_result()`.
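
If you prefer to check on a job explicitly rather than block on the result, a small polling helper is one option. The sketch below is an assumption-laden illustration: it relies only on a job object exposing `status()` and `result()`, as used in this tutorial, and it treats `"ERROR"` and `"CANCELED"` as failure statuses, which is an assumption you should check against your serverless version. `wait_until_done` and `FakeJob` are hypothetical names; `FakeJob` exists only so the example runs without a serverless connection.

```python
import time


def wait_until_done(job, poll_seconds=5.0, timeout=600.0, failed=("ERROR", "CANCELED")):
    """Poll job.status() until it reports DONE, then return job.result()."""
    deadline = time.monotonic() + timeout
    while True:
        status = job.status()
        if status == "DONE":
            return job.result()
        if status in failed:
            raise RuntimeError(f"Job stopped with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status} after {timeout} seconds")
        time.sleep(poll_seconds)


class FakeJob:
    """Hypothetical stand-in that walks through a fixed list of statuses."""

    def __init__(self, statuses, payload):
        self._statuses = list(statuses)
        self._payload = payload

    def status(self):
        # Report each status once, then keep repeating the last one.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]

    def result(self):
        return self._payload


fake = FakeJob(["QUEUED", "RUNNING", "DONE"], {"transpile_times": [1.2, 1.4]})
result = wait_until_done(fake, poll_seconds=0.0)
print(result["transpile_times"])
```

With a real job, you would pass the job object returned by `run(...)` instead of `FakeJob` and keep a non-zero `poll_seconds` to avoid hammering the service.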

In [32]:
# Wait for the lookahead job to finish and time how long the result takes
start_time = time.time()
results_lookahead = job_lookahead.result()
end_time = time.time()

job_lookahead_time = end_time - start_time

The following cell runs the same `transpile_remote_serverless` program again, this time with the `decay` heuristic.

In [33]:
job_decay = transpile_remote_serverless.run(
    circuit=qc,
    backend_name=backend.name,
    optimization_level=3,
    seed_list=seed_list,
    swap_trials=swap_trials,
    heuristic="decay",
)

In [34]:
job_decay.job_id

'00418c76-d6ec-4bd8-9f70-05d0fa14d4eb'

In [35]:
logs_decay = job_decay.logs()
print(logs_decay)

No logs yet.


In [36]:
# Wait for the decay job to finish and time how long the result takes
start_time = time.time()
results_decay = job_decay.result()
end_time = time.time()

job_decay_time = end_time - start_time

In [37]:
# Extract transpilation times
transpile_times_decay = results_decay["transpile_times"]
transpile_times_lookahead = results_lookahead["transpile_times"]

# Calculate total transpilation time for serial execution
total_transpile_time_decay = sum(transpile_times_decay)
total_transpile_time_lookahead = sum(transpile_times_lookahead)

# Print total transpilation time
print("=== Total Transpilation Time (Serial Execution) ===")
print(f"Decay Heuristic    : {total_transpile_time_decay:.2f} seconds")
print(f"Lookahead Heuristic: {total_transpile_time_lookahead:.2f} seconds")

# Print serverless job time (parallel execution)
print("\n=== Serverless Job Time (Parallel Execution) ===")
print(f"Decay Heuristic    : {job_decay_time:.2f} seconds")
print(f"Lookahead Heuristic: {job_lookahead_time:.2f} seconds")

# Calculate and print average runtime per transpilation
avg_transpile_time_decay = total_transpile_time_decay / num_seeds
avg_transpile_time_lookahead = total_transpile_time_lookahead / num_seeds
avg_job_time_decay = job_decay_time / num_seeds
avg_job_time_lookahead = job_lookahead_time / num_seeds

print("\n=== Average Time Per Transpilation ===")
print(f"Decay Heuristic (Serial)    : {avg_transpile_time_decay:.2f} seconds")
print(f"Decay Heuristic (Serverless): {avg_job_time_decay:.2f} seconds")
print(
    f"Lookahead Heuristic (Serial)    : {avg_transpile_time_lookahead:.2f} seconds"
)
print(
    f"Lookahead Heuristic (Serverless): {avg_job_time_lookahead:.2f} seconds"
)

# Calculate and print serverless improvement percentage
decay_improvement_percentage = (
    (total_transpile_time_decay - job_decay_time) / total_transpile_time_decay
) * 100
lookahead_improvement_percentage = (
    (total_transpile_time_lookahead - job_lookahead_time)
    / total_transpile_time_lookahead
) * 100

print("\n=== Serverless Improvement ===")
print(f"Decay Heuristic    : {decay_improvement_percentage:.2f}%")
print(f"Lookahead Heuristic: {lookahead_improvement_percentage:.2f}%")

=== Total Transpilation Time (Serial Execution) ===
Decay Heuristic    : 112.37 seconds
Lookahead Heuristic: 85.37 seconds

=== Serverless Job Time (Parallel Execution) ===
Decay Heuristic    : 5.72 seconds
Lookahead Heuristic: 5.85 seconds

=== Average Time Per Transpilation ===
Decay Heuristic (Serial)    : 5.62 seconds
Decay Heuristic (Serverless): 0.29 seconds
Lookahead Heuristic (Serial)    : 4.27 seconds
Lookahead Heuristic (Serverless): 0.29 seconds

=== Serverless Improvement ===
Decay Heuristic    : 94.91%
Lookahead Heuristic: 93.14%


These results demonstrate the substantial efficiency gains from using serverless execution for quantum circuit transpilation. Compared to serial execution, serverless execution dramatically reduces overall runtime for both the `decay` and `lookahead` heuristics by parallelizing independent transpilation trials. While serial execution reflects the full cumulative cost of exploring multiple layout trials, the serverless job times highlight how parallel execution collapses this cost into a much shorter wall-clock time. As a result, the effective time per transpilation is reduced to a small fraction of that required in the serial setting, largely independent of the heuristic used. This capability is particularly important for optimizing SABRE to its fullest potential. Many of SABRE’s strongest performance gains come from increasing the number of layout and routing trials, which can be prohibitively expensive when executed sequentially. Serverless execution removes this bottleneck, enabling large-scale parameter sweeps and deeper exploration of heuristic configurations with minimal overhead.
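
The same numbers can also be read as a speedup factor and compared against the ideal case in which all 20 trials run fully in parallel. The short calculation below uses only the rounded times printed above, so its results can differ slightly in the last digit from values computed on the unrounded measurements. Note also that the job time was measured around `job.result()` only, so it may not include time the job had already spent running before that call; treat these ratios as rough indicators.

```python
num_trials = 20  # num_seeds in the tutorial

# Times copied from the printed output above, in seconds.
serial_total = {"decay": 112.37, "lookahead": 85.37}
job_wall = {"decay": 5.72, "lookahead": 5.85}

speedup = {}
efficiency = {}
for name in serial_total:
    # How many times faster the parallel job finished than the serial total.
    speedup[name] = serial_total[name] / job_wall[name]
    # Fraction of the ideal speedup (one worker per trial, no overhead).
    efficiency[name] = speedup[name] / num_trials
    print(
        f"{name:<9}: {speedup[name]:.1f}x speedup, "
        f"{efficiency[name]:.0%} of the ideal {num_trials}x"
    )
```

For these figures the speedup is about 19.6x for `decay` and about 14.6x for `lookahead`, which corresponds to roughly 98% and 73% of the ideal 20x.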

Overall, these findings show that serverless execution is key to scaling SABRE optimization, making aggressive experimentation and refinement practical compared to serial execution.

Obtain the results from the serverless runtime and compare the lookahead and decay heuristics. We will compare the circuit sizes and the two-qubit depths.

In [38]:
# Extract sizes and depths
sizes_lookahead = [
    circuit.size() for circuit in results_lookahead["transpiled_circuits"]
]
depths_lookahead = [
    circuit.depth(lambda x: x.operation.num_qubits == 2)
    for circuit in results_lookahead["transpiled_circuits"]
]
sizes_decay = [
    circuit.size() for circuit in results_decay["transpiled_circuits"]
]
depths_decay = [
    circuit.depth(lambda x: x.operation.num_qubits == 2)
    for circuit in results_decay["transpiled_circuits"]
]


def create_scatterplot(x, y1, y2, xlabel, ylabel, title, labels, colors):
    plt.figure(figsize=(8, 5))
    plt.scatter(
        x, y1, label=labels[0], color=colors[0], alpha=0.8, edgecolor="k"
    )
    plt.scatter(
        x, y2, label=labels[1], color=colors[1], alpha=0.8, edgecolor="k"
    )
    plt.xlabel(xlabel, fontsize=12)
    plt.ylabel(ylabel, fontsize=12)
    plt.title(title, fontsize=14)
    plt.legend(fontsize=10)
    plt.grid(axis="y", linestyle="--", alpha=0.7)
    plt.tight_layout()
    plt.show()


create_scatterplot(
    seed_list,
    sizes_lookahead,
    sizes_decay,
    "Seed",
    "Size",
    "Circuit Size",
    ["lookahead", "Decay"],
    ["blue", "red"],
)
create_scatterplot(
    seed_list,
    depths_lookahead,
    depths_decay,
    "Seed",
    "Depth",
    "Circuit Depth",
    ["lookahead", "Decay"],
    ["blue", "red"],
)

<Image src="../docs/images/tutorials/transpilation-optimizations-with-sabre/extracted-outputs/4cf9588b-8ea6-4761-b544-14bef8f0be85-0.avif" alt="Output of the previous code cell" />

<Image src="../docs/images/tutorials/transpilation-optimizations-with-sabre/extracted-outputs/4cf9588b-8ea6-4761-b544-14bef8f0be85-1.avif" alt="Output of the previous code cell" />

Each point in the scatter plots above represents a layout trial, with the x-axis indicating the seed and the y-axis indicating the circuit size (first plot) or the two-qubit depth (second plot). The results reveal that the lookahead heuristic generally outperforms the decay heuristic in minimizing circuit depth and circuit size. In practical applications, the goal is to identify the best layout trial for your chosen heuristic, whether you prioritize depth or size. This can be achieved by selecting the trial with the lowest value for the desired metric. Importantly, increasing the number of layout trials improves the chances of achieving a better result in terms of size or depth, but it comes at the cost of higher computational overhead.
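
Selecting the best trial can be sketched in a few lines. The helper below is hypothetical, and the seeds, depths, and sizes are made-up numbers rather than the tutorial's results; it picks the trial with the lowest value of the preferred metric and uses the other metric to break ties.

```python
def best_trial(seeds, depths, sizes, prefer="depth"):
    """Return (seed, depth, size) for the trial with the lowest preferred metric.

    Ties on the preferred metric are broken by the other metric.
    """
    trials = list(zip(seeds, depths, sizes))
    if prefer == "depth":
        return min(trials, key=lambda trial: (trial[1], trial[2]))
    if prefer == "size":
        return min(trials, key=lambda trial: (trial[2], trial[1]))
    raise ValueError(f"unknown metric: {prefer}")


# Made-up trial results (illustration only).
seeds = [42, 43, 44, 45]
depths = [410, 402, 402, 420]
sizes = [2500, 2480, 2460, 2440]

best_by_depth = best_trial(seeds, depths, sizes, prefer="depth")
best_by_size = best_trial(seeds, depths, sizes, prefer="size")
print("best by depth:", best_by_depth)
print("best by size :", best_by_size)
```

Keeping the seed alongside the metrics makes it possible to look up the winning transpiled circuit afterwards, since the results are returned in the same order as `seed_list`.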

In [39]:
min_depth_lookahead = min(depths_lookahead)
min_depth_decay = min(depths_decay)
min_size_lookahead = min(sizes_lookahead)
min_size_decay = min(sizes_decay)
print(
    "Lookahead: Min Depth",
    min_depth_lookahead,
    "Min Size",
    min_size_lookahead,
)
print("Decay:     Min Depth", min_depth_decay, "Min Size", min_size_decay)

Lookahead: Min Depth 399 Min Size 2452
Decay:     Min Depth 415 Min Size 2611



In [40]:
# This cell is hidden from users, it cleans up the `source_files` directory
from pathlib import Path

Path("source_files/transpile_remote.py").unlink()
Path("source_files").rmdir()

## Conclusion

In this tutorial, we explored how to optimize large circuits using SABRE in Qiskit. We demonstrated how to configure the `SabreLayout` pass with different parameters to balance circuit quality and transpilation runtime. We also showed how to customize the routing heuristic in SABRE and use the `QiskitServerless` runtime to parallelize layout trials efficiently when a custom `SabreSwap` pass is involved. By adjusting these parameters and heuristics, you can optimize the layout and routing of large circuits, ensuring they are executed efficiently on quantum hardware.

## Tutorial survey

Please take this short survey to provide feedback on this tutorial. Your insights will help us improve our content offerings and user experience.

[Link to survey](https://your.feedback.ibm.com/jfe/form/SV_d9YWUSQIAvU9HXE)