# A block encoding circuit for a D-dimensional Laplacian with periodic boundary conditions

___


This notebook implements the block encoding circuits from the paper **Efficient and Explicit Block Encoding of Finite Difference Discretizations of the Laplacian** (https://arxiv.org/abs/2509.02429).

### **Introduction**

Block encoding is a well-known technique in quantum computing for embedding non-unitary matrices in a quantum computer, which only allows unitary evolution.

***Definition:***  
    Let $a, n, m \in \mathbb{N}$ such that $m = a + n$. An $m$-qubit unitary $U$ is said to be an $(\alpha, a)$-block-encoding of an $n$-qubit operator $A$ if  

$$
\tilde{A} = \left( \langle 0 |^{\otimes a} \otimes I_n \right) U \left( |0 \rangle^{\otimes a} \otimes I_n \right)
\tag{1}
$$  

where $A = \alpha \tilde{A}$. The parameters $\alpha$ and $a$ are, respectively, the *subnormalization factor* (which allows matrices of arbitrary norm to be encoded) and the *number of ancilla qubits* used in the block-encoding scheme.
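
As a minimal numerical illustration of the definition (not the construction used in the paper), the sketch below builds a one-ancilla unitary dilation of a small Hermitian matrix with NumPy and reads the encoded block back from the top-left corner. The helper name `hermitian_dilation` and the test matrix are assumptions made for this example.

```python
import numpy as np


def hermitian_dilation(a_tilde: np.ndarray) -> np.ndarray:
    """Return U = [[A, S], [S, -A]] with S = sqrt(I - A^2), for Hermitian A with norm <= 1."""
    evals, evecs = np.linalg.eigh(a_tilde)
    # Clipping guards against tiny negative values caused by round-off.
    s_diag = np.sqrt(np.clip(1.0 - evals**2, 0.0, None))
    s = evecs @ np.diag(s_diag) @ evecs.conj().T
    return np.block([[a_tilde, s], [s, -a_tilde]])


# A is a 1-qubit (n = 1) Hermitian operator with spectral norm 3, so alpha = 3 works.
A = np.array([[1.0, 2.0], [2.0, 1.0]])
alpha = 3.0
U = hermitian_dilation(A / alpha)  # acts on m = a + n = 2 qubits, with a = 1

# (<0| ⊗ I) U (|0> ⊗ I) is the top-left block when the ancilla is the leading qubit.
dim = A.shape[0]
recovered = alpha * U[:dim, :dim]

print(np.allclose(U.conj().T @ U, np.eye(2 * dim)))  # unitarity check
print(np.allclose(recovered, A))  # the block reproduces A after rescaling
```

This generic dilation needs the full eigendecomposition of $A$, so it is only a classical check of the definition; the circuits discussed in this notebook avoid that cost by exploiting the structure of the Laplacian.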

Efficiently block encoding arbitrary matrices is a hard problem, and the task remains nontrivial even for well-structured sparse matrices. The paper by Sturm and Schillo (2025) (https://arxiv.org/abs/2509.02429) provides efficient quantum circuits for block encoding $D$-dimensional Laplacians with periodic boundary conditions along all dimensions. Such circuits are useful in many applications, especially those involving linear-algebra problems. Moreover, given an efficient block encoding of a matrix $\tilde{A}$, it is possible to efficiently construct block encodings of certain polynomials of $\tilde{A}$ through the quantum singular value transformation (QSVT).

### **Notebook contents**
- ##### Block encoding circuits for the D-dimensional Laplacian matrix with periodic boundary conditions along all dimensions.
___

## Discretizing the $D$-Dimensional Laplacian with Periodic Boundary Conditions

We consider the Laplacian operator on the $D$-dimensional unit hypercube  
$$
\Omega_D = [0,1]^D ,
$$
where a point is written as  
$$
\mathbf{x} = (x^{(0)}, x^{(1)}, \ldots, x^{(D-1)}).
$$


The $D$-dimensional Laplacian is the sum of second derivatives along each coordinate axis:
\begin{equation}
L_D = \sum_{d=0}^{D-1} \frac{\partial^2}{\partial (x^{(d)})^2}.
\tag{1}
\end{equation}

We impose **periodic boundary conditions**, meaning the function repeats itself across opposite sides of the domain:
$$
v(x^{(0)},\ldots,0,\ldots,x^{(D-1)}) 
= 
v(x^{(0)},\ldots,1,\ldots,x^{(D-1)}).
$$

This ensures the domain “wraps around” in **every dimension**.


To approximate the operator on a computer, we replace each interval with a **uniform grid** (*we assume equidistant grid points as per the paper*).  

For one dimension, choose
$$
\Omega_{1,h} = \{ jh \mid j=0,1,\dots,N-1 \},
\qquad h = \frac{1}{N}.
$$
Because of periodicity, the point at $ x=1 $ is identical to $ x=0 $, so we only keep $ N $ samples.

<p align="center">
  <img src="1d_discretization.png" width="40%">
</p>

To build a $D$-dimensional grid, take the Cartesian product:
$$
\Omega_{D,h} = \Omega_{1,h} \times \cdots \times \Omega_{1,h}.
$$

Each grid point is indexed by a $D$-tuple:
$$
(j^{(0)}, j^{(1)},\ldots, j^{(D-1)}),
\qquad j^{(d)} \in \{0,\dots,N-1\}.
$$

The total number of points is  
$$
N_D = N^D.
$$


To represent the function values as a vector, we map each integer index $j^{(d)}$ to a computational basis state $\ket{j^{(d)}}$ of a $2^n$-dimensional Hilbert space, where $N=2^n$. Thus,
$$
(j^{(0)},j^{(1)},\ldots,j^{(D-1)})
\quad \longmapsto \quad
\ket{j} = \ket{j^{(D-1)}}\otimes \cdots \otimes\ket{j^{(1)}}\otimes\ket{j^{(0)}}.
$$

This ordering turns the entire grid $\Omega_{D,h}$ into a vector $\mathbf{\Omega_{D,h}}$ of size $N^D= 2^{nD}$, whose entries pair each grid point with its basis state:

$$
\mathbf{\Omega_{D,h}}
= \left( j^{(0)} h,\; j^{(1)} h,\; \ldots,\; j^{(D-1)} h \right)
\,\lvert j^{(D-1)} \rangle \cdots \lvert j^{(1)} \rangle \lvert j^{(0)} \rangle .
$$
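
The ordering can be checked numerically. The sketch below (the values of `N`, `D` and the example index tuple are assumptions chosen for illustration) builds the tensor product of basis vectors and confirms that the single non-zero entry sits at the flat index $\sum_d j^{(d)} N^d$, so dimension 0 is the least-significant digit.

```python
import numpy as np

N, D = 4, 3  # N = 2**n points per dimension (n = 2), D dimensions
index = (3, 0, 2)  # hypothetical grid index (j0, j1, j2)


def basis(j: int, size: int) -> np.ndarray:
    """Computational basis vector |j> of the given dimension."""
    e = np.zeros(size)
    e[j] = 1.0
    return e


# |j^(D-1)> ⊗ ... ⊗ |j^(1)> ⊗ |j^(0)>: the last coordinate is the leftmost factor.
state = np.array([1.0])
for j_d in reversed(index):
    state = np.kron(state, basis(j_d, N))

flat = int(np.argmax(state))
expected = sum(j_d * N**d for d, j_d in enumerate(index))
print(flat, expected)

# Physical coordinates of the same grid point, with h = 1/N.
coords = tuple(j_d / N for j_d in index)
print(coords)
```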

___

#### Building the 1D Finite-Difference Laplacian Matrix (Periodic Boundary Conditions)

To construct the discrete Laplacian on a 1D periodic grid, we approximate the second
derivative using the standard centered finite-difference stencil.


We discretize the interval $[0,1]$ using  
$$
x_j = jh, \quad j = 0,1,\dots, N-1, \qquad h = \frac{1}{N}.
$$

Because the boundary is periodic, the point at $x=1$ is *identified* with $x=0$.
So the grid has exactly $N$ degrees of freedom.



For a smooth function $u(x)$, the second derivative at grid point $x_j$ is approximated by

$$
\frac{d^2u}{dx^2}(x_j)
\approx 
\frac{u_{j-1} - 2u_j + u_{j+1}}{h^2}.
$$

This is the classic *three-point centered stencil*.

Without periodicity, $u_{j-1}$ and $u_{j+1}$ would be undefined at the boundary points $j=0$ and $j=N-1$.  
With periodicity, the grid “wraps around,” so:

$$
u_{-1} \equiv u_{N-1}, \qquad 
u_{N} \equiv u_0.
$$

This single idea gives the Laplacian its circulant structure.
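
A short NumPy sketch of the periodic stencil, under the assumption of a smooth test function $u(x) = \sin(2\pi x)$ chosen for this example: `np.roll` supplies the wrap-around neighbours, and the result is compared with the exact second derivative $-(2\pi)^2 u$.

```python
import numpy as np

N = 64
h = 1.0 / N
x = h * np.arange(N)  # x_j = j h for j = 0..N-1 (x = 1 is identified with x = 0)
u = np.sin(2 * np.pi * x)  # smooth periodic test function

# np.roll(u, 1)[j] equals u[j-1] and np.roll(u, -1)[j] equals u[j+1], with wrap-around.
d2u = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / h**2

exact = -((2 * np.pi) ** 2) * u
max_error = np.max(np.abs(d2u - exact))
print(max_error)  # shrinks like h**2 as N grows
```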


For each point $j$, the finite-difference rule says:

- coefficient of $u_{j-1}$ is $+1/h^2$
- coefficient of $u_{j}$ is $-2/h^2$
- coefficient of $u_{j+1}$ is $+1/h^2$

Putting these coefficients into matrix form gives an $N \times N$ matrix where:

- the main diagonal contains $-2/h^2$
- the first upper and first lower diagonal contain $1/h^2$
- periodicity adds *two extra wrap-around entries*

Specifically:

- entry $(0, N-1)$ gets $+1/h^2$ (from $u_{-1}=u_{N-1}$)
- entry $(N-1, 0)$ gets $+1/h^2$ (from $u_N=u_0$)


Putting this together, the discrete $1D$ Laplacian is

\begin{equation}
L_{1,h} = \frac{1}{h^2}
\begin{pmatrix}
-2 & 1 & 0 & \cdots & 0 & 1\\
1 & -2 & 1 & \cdots & 0 & 0\\
0 & 1 & -2 & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
1 & 0 & 0 & \cdots & 1 & -2
\end{pmatrix}.
\tag{2}
\end{equation}

This matrix is the building block for higher-dimensional Laplacians via Kronecker sums.
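
The matrix in equation (2) can be assembled directly from shifted identities. The following is a sketch with a hypothetical helper name `laplacian_1d_unscaled`; it checks the two wrap-around entries and that the matrix acts on a vector exactly like the periodic three-point stencil.

```python
import numpy as np


def laplacian_1d_unscaled(N: int) -> np.ndarray:
    """Periodic three-point finite-difference Laplacian L_{1,h} with h = 1/N."""
    h = 1.0 / N
    identity = np.eye(N)
    # Rolling the rows of the identity puts ones on the wrapped sub/super-diagonals.
    lower = np.roll(identity, 1, axis=0)  # entries (j+1 mod N, j)
    upper = np.roll(identity, -1, axis=0)  # entries (j-1 mod N, j)
    return (lower - 2.0 * identity + upper) / h**2


N = 8
h = 1.0 / N
L = laplacian_1d_unscaled(N)

# Wrap-around entries required by periodicity.
print(L[0, N - 1] * h**2, L[N - 1, 0] * h**2)

# The matrix reproduces the stencil (u_{j-1} - 2 u_j + u_{j+1}) / h^2.
u = np.random.default_rng(0).normal(size=N)
stencil = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / h**2
print(np.allclose(L @ u, stencil))
```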

___


#### Building the D-dimensional Finite-Difference Laplacian Matrix (Periodic Boundary Conditions)


The discrete Laplacian on the full $D$-dimensional grid is a **Kronecker sum** of 1D Laplacians:

\begin{equation}
L_{D,h} 
= 
\sum_{d=0}^{D-1}
I_N^{\otimes (D-1-d)} \otimes L_{1,h} \otimes I_N^{\otimes d}.
\tag{3}
\end{equation}

Each term contributes the second derivative along one dimension, while identities preserve all other coordinates.  
This operator is sparse: every grid point couples only to its $2D$ nearest periodic neighbors.
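
Equation (3) translates almost literally into nested `np.kron` calls. The sketch below (helper names `periodic_laplacian_1d` and `kronecker_sum_laplacian` are assumptions for this example) builds the $D = 2$ operator on a $4 \times 4$ grid and counts the off-diagonal couplings per row.

```python
from functools import reduce

import numpy as np


def periodic_laplacian_1d(N: int) -> np.ndarray:
    """Unscaled periodic 1D Laplacian with h = 1/N."""
    h = 1.0 / N
    eye = np.eye(N)
    return (np.roll(eye, 1, axis=0) - 2.0 * eye + np.roll(eye, -1, axis=0)) / h**2


def kronecker_sum_laplacian(N: int, D: int) -> np.ndarray:
    """L_{D,h} = sum_d I^{⊗(D-1-d)} ⊗ L_{1,h} ⊗ I^{⊗d}, following equation (3)."""
    L1, eye = periodic_laplacian_1d(N), np.eye(N)
    total = np.zeros((N**D, N**D))
    for d in range(D):
        factors = [eye] * (D - 1 - d) + [L1] + [eye] * d
        total += reduce(np.kron, factors)
    return total


L2 = kronecker_sum_laplacian(N=4, D=2)
# Non-zeros per row minus the diagonal entry gives the number of coupled neighbours.
neighbours_per_row = np.count_nonzero(L2, axis=1) - 1
print(L2.shape, set(neighbours_per_row.tolist()))
```

For $N \geq 3$ every row should show exactly $2D$ neighbours; for $N = 2$ the left and right neighbours coincide, so the count drops.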


The largest eigenvalue magnitude of the discrete Laplacian is known analytically (it is attained for even $N$, which always holds for $N = 2^n$ with $n \ge 1$):
$$
\lambda_{D, \max} = \frac{4D}{h^2}.
$$

To make the matrix have operator norm 1 (required for several quantum algorithms), define the **scaled Laplacian**:
$$
\widetilde{L}_{D,h}
= 
\frac{1}{\lambda_{D,\max}} L_{D,h}
= 
\frac{1}{4D/h^2} L_{D,h}.
$$

Plugging this into the Kronecker-sum representation yields

\begin{equation}
\widetilde{L}_{D,h}
=
\frac{1}{D}
\sum_{d=0}^{D-1}
I_N^{\otimes (D-1-d)} \otimes \widetilde{L}_{1,h} \otimes I_N^{\otimes d},
\tag{4}
\end{equation}
where
$$
\widetilde{L}_{1,h} = \frac{h^2}{4} L_{1,h}
$$
is the 1D Laplacian scaled to have norm 1.
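
The scaling can be verified numerically. The sketch below, with `N = 8` and `D = 2` chosen as assumptions for the example, checks that the largest eigenvalue magnitude is $4D/h^2$, that equation (4) agrees with dividing $L_{D,h}$ by that value, and that the scaled operator has spectral norm 1.

```python
from functools import reduce

import numpy as np

N, D = 8, 2
h = 1.0 / N
eye = np.eye(N)
L1 = (np.roll(eye, 1, axis=0) - 2.0 * eye + np.roll(eye, -1, axis=0)) / h**2
L1_scaled = (h**2 / 4.0) * L1

# Equation (3): unscaled Kronecker sum.
LD = sum(reduce(np.kron, [eye] * (D - 1 - d) + [L1] + [eye] * d) for d in range(D))

# Equation (4): average of the scaled 1D operators.
LD_scaled = (
    sum(reduce(np.kron, [eye] * (D - 1 - d) + [L1_scaled] + [eye] * d) for d in range(D))
    / D
)

lam_max = np.max(np.abs(np.linalg.eigvalsh(LD)))
print(np.isclose(lam_max, 4 * D / h**2))
print(np.allclose(LD_scaled, LD / lam_max))
print(np.isclose(np.linalg.norm(LD_scaled, 2), 1.0))
```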

We now provide the block encoding of the scaled Laplacian matrix. 
___

#### Block encoding the 1D scaled Laplacian matrix

With periodic boundary conditions, the 1D Laplacian matrix has a circulant structure as shown (for $N= 16$ grid points): 


<p align="center">
  <img src="1D_structure.png" width="30%">
</p>

This structure can be block encoded via a simple quantum circuit shown below: 

<p align="center">
  <img src="1D_Lap_BE.png" width="30%">
</p>

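Here $S_{-}$ and $S_{+}$ are unitary operations that take $\ket{j}$ to $\ket{(j+1) \bmod N}$ and $\ket{(j-1) \bmod N}$, respectively. (The helper functions `shift_plus` and `shift_minus` defined in the code below use the opposite naming; the Laplacian is unaffected because it is symmetric in the two shifts.)

The scaled 1D Laplacian can be written as a linear combination of unitaries: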

\begin{equation}
\widetilde{L}_{1,h} \;=\; \frac{1}{4}\,(S_- - 2I + S_+).
\tag{5}
\end{equation}
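
To see why equation (5) leads to a block encoding with two ancilla qubits, the following NumPy sketch builds a generic textbook LCU unitary (PREPARE, SELECT, un-PREPARE) from the four equally weighted unitaries $S_-, -I, -I, S_+$. This is an illustrative matrix-level model under that assumption, not necessarily the gate-level circuit shown in the figure above.

```python
import numpy as np

N = 8
eye = np.eye(N)
S_minus = np.roll(eye, 1, axis=0)  # |j> -> |(j+1) mod N>, text convention
S_plus = np.roll(eye, -1, axis=0)  # |j> -> |(j-1) mod N>
L1_scaled = 0.25 * (S_minus - 2.0 * eye + S_plus)  # equation (5)

# Four unitaries with equal weight 1/4; their average is equation (5).
unitaries = [S_minus, -eye, -eye, S_plus]

# SELECT = sum_l |l><l| ⊗ U_l, with the two-qubit ancilla register leading.
select = np.zeros((4 * N, 4 * N))
for l, U_l in enumerate(unitaries):
    projector = np.zeros((4, 4))
    projector[l, l] = 1.0
    select += np.kron(projector, U_l)

# PREPARE: Hadamards put the ancillas into the uniform superposition.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
prep = np.kron(np.kron(H, H), eye)

U = prep @ select @ prep  # H is self-inverse, so un-PREPARE equals PREPARE
block = U[:N, :N]  # (<00| ⊗ I) U (|00> ⊗ I)

print(np.allclose(U.T @ U, np.eye(4 * N)))  # U is unitary (real orthogonal)
print(np.allclose(block, L1_scaled))  # top-left block is the scaled Laplacian
```

Because the four weights sum to 1, the subnormalization factor of this sketch is $\alpha = 1$ with $a = 2$ ancilla qubits.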
___




#### Necessary modules and global functions

In [1]:
# import all necessary modules
import builtins
import math
import random
import time
from typing import List

import matplotlib.pyplot as plt
import numpy as np

import classiq
from classiq import *
from classiq.qmod.symbolic import *

# np.set_printoptions(precision=2, suppress=True, linewidth=120, threshold=10000)   # print options for neatness


# post-processing function to get reduced state vector after projection on ancilla=0
def get_projected_state_vector(
    execution_result,
    measured_var: str,
    projections: dict,
) -> np.ndarray:
    """
    This function returns a reduced statevector from execution results.
    measured var: the name of the reduced variable
    projections: on which values of the other variables to project, e.g., {"anc_M": 1}
    Note: For this function to work properly all variables, except auxiliary qubits, must be declared as
    output of the model.
    """
    projected_size = len(execution_result.output_qubits_map[measured_var])
    proj_statevector = np.zeros(2**projected_size).astype(complex)
    for sample in execution_result.parsed_state_vector:
        if all(
            int(sample.state[key]) == projections[key] for key in projections.keys()
        ):
            value = int(sample.state[measured_var])
            proj_statevector[value] += sample.amplitude
    global_phase = np.angle(proj_statevector[0])
    return np.real(proj_statevector / np.exp(1j * global_phase))


def fidelity_with_phase_alignment(psi, phi, align=True):
    """
    Compute the fidelity |<psi|phi>|^2 between two pure-state vectors.
    If `align` is True, the global phase of `phi` is first removed so that the
    inner product with `psi` is real and positive (this also fixes an overall sign).
    Inputs may be unnormalized; they are normalized inside.
    Returns the fidelity as a float.
    """

    psi = np.asarray(psi, dtype=complex).ravel()
    phi = np.asarray(phi, dtype=complex).ravel()
    if psi.size != phi.size:
        raise ValueError("State vectors must have same dimension")

    npsi = np.linalg.norm(psi)
    nphi = np.linalg.norm(phi)
    if npsi == 0 or nphi == 0:
        raise ValueError("Zero vector provided")

    psi = psi / npsi
    phi = phi / nphi

    overlap = np.vdot(psi, phi)  # <psi|phi>
    if align:
        phase = np.angle(overlap)
        phi_aligned = phi * np.exp(-1j * phase)
        overlap_aligned = np.vdot(psi, phi_aligned)
        fidelity = float(np.abs(overlap_aligned) ** 2)
    else:
        phase = 0.0
        phi_aligned = phi
        fidelity = float(np.abs(overlap) ** 2)

    return fidelity

In [2]:
# classical function to generate scaled 1D Laplacian matrix with periodic boundary conditions


def shift_plus(N):
    S = np.zeros((N, N), dtype=complex)
    for j in range(N):
        S[(j + 1) % N, j] = 1.0
    return S


def shift_minus(N):
    S = np.zeros((N, N), dtype=complex)
    for j in range(N):
        S[(j - 1) % N, j] = 1.0
    return S


def laplacian_1d(N):
    r"""Scaled 1D Laplacian \tilde L on N grid points (periodic)."""
    return 0.25 * (shift_minus(N) - 2 * np.eye(N) + shift_plus(N))

In [3]:
N = 32
n = int(math.log2(N))
lap_scaled = laplacian_1d(N)


@qfunc
def adder_mod(a: QNum):
    a += 1


@qfunc
def subtractor_mod(a: QNum):
    a += -1


@qfunc
def main(j: Output[QNum], l: Output[QNum]):
    allocate(n, j)
    allocate(2, l)

    # we chose the input state |0> for testing

    apply_to_all(H, l)
    apply_to_all(Z, l)

    l_arr = QArray()
    bind(l, l_arr)

    control(l_arr[0] == 0, lambda: subtractor_mod(j))
    control(l_arr[1] == 1, lambda: adder_mod(j))

    bind(l_arr, l)

    apply_to_all(H, l)
    apply_to_all(Z, l)


qmod = create_model(main)
backend_preferences = ClassiqBackendPreferences(backend_name="simulator_statevector")
qmod = set_execution_preferences(
    qmod,
    execution_preferences=ExecutionPreferences(
        num_shots=1, backend_preferences=backend_preferences
    ),
)
constraints = Constraints(
    max_width=18, optimization_parameter=OptimizationParameter.DEPTH
)
qmod = set_constraints(qmod, constraints)
qprog = synthesize(qmod)
# show(qprog)

# write_qmod(qmod, "1D_Periodic_Laplacian_BE")

print("Circuit Width:", qprog.data.width)
print("Circuit Depth:", qprog.transpiled_circuit.depth)
print("Gate counts:", qprog.transpiled_circuit.count_ops)

job = execute(qprog)
results = job.result()[0].value

# Post processing
reduced_state = get_projected_state_vector(results, "j", {"l": 0})
print("The reduced state vector for j when l=0 is:")
print(reduced_state)

# theoretical reduced state vector

b = [0 for _ in range(N)]
b[0] = 1
theoretical_reduced_state = np.matmul(lap_scaled, b)
print("Expected theoretical state vector:")
print(theoretical_reduced_state)
print("********")

Circuit Width: 7
Circuit Depth: 113
Gate counts: {'u': 111, 'cx': 100}
The reduced state vector for j when l=0 is:
[ 5.00000000e-01 -2.50000000e-01  2.03934950e-16  2.60524259e-16
 -3.00593240e-16 -3.80160209e-17  8.15406091e-17 -4.52150298e-17
 -6.30058944e-17  1.21989910e-16 -4.46287283e-18  5.15617198e-17
 -8.55801898e-17 -1.61909579e-17  2.01956495e-17 -1.05456418e-17
 -1.37383090e-16  9.49128933e-17 -1.17137621e-18 -3.88260210e-17
  2.36256948e-17  3.94809326e-17 -3.59332343e-17 -9.83622130e-18
  7.68836822e-17 -2.74072948e-18 -4.98416893e-17 -3.17034970e-16
  2.84043112e-16  9.53120283e-17 -1.58750885e-16 -2.50000000e-01]
Expected theoretical state vector:
[-0.5 +0.j  0.25+0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j
  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j
  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j
  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j  0.  +0.j
  0.  +0.j  0.  +0.j  0.  +0.j  0.25+0.j]
********

In [4]:
fidelity = fidelity_with_phase_alignment(reduced_state, theoretical_reduced_state)
print(
    f"Fidelity between theoretical and obtained reduced state vector: {np.round(fidelity, 4)}"
)

Fidelity between theoretical and obtained reduced state vector: 1.0


#### **Note:** The state vectors should match up to an overall global phase. Here the simulated vector is the theoretical one multiplied by $-1$, which is why the fidelity equals 1.
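
The insensitivity of the fidelity to a global phase can be checked in a few lines. The sketch below uses a compact stand-alone `fidelity` helper (an assumption for this example, separate from the notebook's own function) and a smaller grid with $N = 8$.

```python
import numpy as np

# First column of the scaled 1D Laplacian for N = 8: -1/2 on the diagonal, 1/4 on both neighbours.
expected = np.zeros(8)
expected[[0, 1, -1]] = [-0.5, 0.25, 0.25]
simulated = -expected  # same state up to a global phase of -1


def fidelity(psi: np.ndarray, phi: np.ndarray) -> float:
    """|<psi|phi>|^2 computed on normalized copies of the two vectors."""
    psi = psi / np.linalg.norm(psi)
    phi = phi / np.linalg.norm(phi)
    return float(np.abs(np.vdot(psi, phi)) ** 2)


print(fidelity(simulated, expected))  # the global sign drops out of |<psi|phi>|^2

# A genuinely different vector gives a fidelity below 1.
other = np.zeros(8)
other[0] = 1.0
print(fidelity(other, expected))
```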

___

#### Block encoding D-dimensional scaled Laplacian matrix


The scaled $D$-dimensional Laplacian from equation (4) is:

$$
\widetilde{L}_{D,h}
=
\frac{1}{D}
\sum_{d=0}^{D-1}
I_N^{\otimes (D-1-d)} \otimes \widetilde{L}_{1,h} \otimes I_N^{\otimes d},
$$

where we can use the expression from equation (5) for $\widetilde{L}_{1,h}$.

The structure of the matrices for $D=2$ and $D=3$ is shown below:

<div style="display:flex; justify-content:center; gap:20px; align-items:center;">
  <img src="2D_Lap_structure.png" alt="D-dim Lap BE (left)" style="width:40%; object-fit:contain;">
  <img src="3D_Lap_structure.png" alt="D-dim Lap BE (right)" style="width:40%; object-fit:contain;">
</div>



Thanks to this decomposition, the 1D Laplacian block encoding extends directly to a $D$-dimensional Laplacian: an additional control register selects the dimension on which the 1D block encoding acts, as follows: 

<p align="center">
  <img src="Ddim_Lap_BE.png" width="50%">
</p>

Here $\hat{d} = \lceil \log_2 D \rceil$ is the number of qubits in the dimension-selection register.
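
The idea can be modelled at the matrix level for $D = 2$, where $\hat{d} = 1$. The sketch below is an illustrative NumPy model under the assumption of a generic LCU construction (the helper names `embed`, `shifts` and `block_encode_dim` are hypothetical): a one-qubit dimension register in uniform superposition selects which per-dimension block encoding is applied, and the top-left block is compared with equation (4).

```python
from functools import reduce

import numpy as np

Ns = [2, 4]  # N_0, N_1 (powers of two); D = 2, so d_hat = 1
D = len(Ns)
size = int(np.prod(Ns))
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)


def embed(op: np.ndarray, d: int) -> np.ndarray:
    """Place `op` on dimension d, identities elsewhere; dimension 0 is the rightmost factor."""
    factors = [op if i == d else np.eye(Ns[i]) for i in reversed(range(D))]
    return reduce(np.kron, factors)


def shifts(N: int):
    """Return the two cyclic shifts and the identity on N points."""
    eye = np.eye(N)
    return np.roll(eye, 1, axis=0), np.roll(eye, -1, axis=0), eye


def block_encode_dim(d: int) -> np.ndarray:
    """LCU block encoding of the scaled 1D Laplacian on dimension d (two ancillas lead)."""
    up, down, eye = shifts(Ns[d])
    select = sum(
        np.kron(np.diag(np.eye(4)[l]), embed(V, d))
        for l, V in enumerate([up, -eye, -eye, down])
    )
    prep = np.kron(np.kron(H, H), np.eye(size))
    return prep @ select @ prep


# The dimension register (one qubit here) selects which block encoding is applied.
select_dim = sum(np.kron(np.diag(np.eye(D)[d]), block_encode_dim(d)) for d in range(D))
prep_dim = np.kron(H, np.eye(select_dim.shape[0] // 2))
U = prep_dim @ select_dim @ prep_dim
block = U[:size, :size]  # all ancillas (dimension register and LCU register) in |0>

# Equation (4) assembled classically for comparison.
target = sum(
    embed(0.25 * (shifts(N)[0] - 2.0 * np.eye(N) + shifts(N)[1]), d)
    for d, N in enumerate(Ns)
) / D
print(np.allclose(block, target))
```

The Hadamard on the dimension register contributes a factor $1/2^{\hat{d}}$, which coincides with the $1/D$ of equation (4) only when $D$ is a power of two, as it is here.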

___

In [5]:
# classical function to generate the scaled D-dimensional Laplacian matrix with periodic boundary conditions


def laplacian_multiD(Ns):
    """
    Build D-dimensional Laplacian with periodic BCs.
    Ns = [N0, N1, ..., N_{D-1}], where Ni = 2**n_i.
    """
    D = len(Ns)
    N_total = np.prod(Ns)
    L_total = np.zeros((N_total, N_total), dtype=complex)

    for i, Ni in enumerate(Ns):
        Li = laplacian_1d(Ni)
        # One Kronecker factor per dimension: L_i on dimension i, identity elsewhere
        kron_terms = []
        for j, Nj in enumerate(Ns):
            if j == i:
                kron_terms.append(Li)
            else:
                kron_terms.append(np.eye(Nj, dtype=complex))
        # reverse list for little-endian convention
        kron_terms = kron_terms[::-1]
        Li_full = kron_terms[0]
        for term in kron_terms[1:]:
            Li_full = np.kron(Li_full, term)
        L_total += Li_full

    d = math.ceil(math.log2(builtins.max(1, D)))

    # scale by 1/2**d, which equals 1/D when D is a power of two
    return L_total * (1 / (2**d))

In [7]:
# Classical Inputs: N_i, D
D = 2  # number of dimensions
d = math.ceil(math.log2(builtins.max(1, D)))  # d hat of the paper
N = [2, 4]  # grid size in each dimension: N_0 = 2, N_1 = 4
n = [int(math.log2(i)) for i in N]  # number of qubits in each dimension
total_system_size = np.sum(n)
lap_scaled = laplacian_multiD(N)


@qfunc
def adder_mod(a: QNum):
    a += 1


@qfunc
def subtractor_mod(a: QNum):
    a += -1


def control_adder_mod(j: List[QNum], l: QArray, control_bit: QNum):
    for i in range(D):
        control(
            control_bit == i,
            lambda: [
                control(l[0] == 0, lambda: subtractor_mod(j[i])),
                control(l[1] == 1, lambda: adder_mod(j[i])),
            ],
        )


@qfunc
def main(j: Output[QNum], l: Output[QNum], k: Output[QNum]):
    allocate(total_system_size, j)
    allocate(2, l)
    allocate(d, k)

    # we chose the input state |0> for testing
    apply_to_all(H, l)
    apply_to_all(Z, l)
    apply_to_all(H, k)

    j_regs = [QNum(f"j{i}", n[i], False, 0) for i in range(len(n))]
    l_arr = QArray()

    bind(j, j_regs)
    bind(l, l_arr)

    control_adder_mod(j_regs, l_arr, k)

    bind(l_arr, l)
    bind(j_regs, j)

    apply_to_all(H, l)
    apply_to_all(H, k)


qmod = create_model(main)
qmod = set_execution_preferences(
    qmod,
    execution_preferences=ExecutionPreferences(
        num_shots=1, backend_preferences=backend_preferences
    ),
)
qmod = set_constraints(qmod, constraints)
qprog = synthesize(qmod)
# show(qprog)

write_qmod(qmod, "ND_Periodic_Laplacian_BE")

print("Circuit Width:", qprog.data.width)
print("Circuit Depth:", qprog.transpiled_circuit.depth)
print("Gate counts:", qprog.transpiled_circuit.count_ops)

job = execute(qprog)
results = job.result()[0].value

# Post processing
reduced_state = get_projected_state_vector(results, "j", {"l": 0, "k": 0})
print("The reduced state vector for j when l=0, k=0 is:")
print(reduced_state)

# #theoretical reduced state vector
b = [0 for i in range(N[0] * N[1])]
b[0] = 1
theoretical_reduced_state = np.matmul(lap_scaled, b)
print(f"Expected theoretical state vector:")
print(theoretical_reduced_state)
print("********")

Circuit Width: 6
Circuit Depth: 78
Gate counts: {'u': 56, 'cx': 48}
The reduced state vector for j when l=0, k=0 is:
[ 5.00000000e-01 -2.50000000e-01 -1.25000000e-01 -3.61302947e-17
 -1.71584428e-16  3.26277430e-17 -1.25000000e-01  5.27811857e-17]
Expected theoretical state vector:
[-0.5  +0.j  0.25 +0.j  0.125+0.j  0.   +0.j  0.   +0.j  0.   +0.j
  0.125+0.j  0.   +0.j]
********


In [8]:
fidelity = fidelity_with_phase_alignment(reduced_state, theoretical_reduced_state)
print(
    f"Fidelity between theoretical and obtained reduced state vector: {np.round(fidelity, 4)}"
)

Fidelity between theoretical and obtained reduced state vector: 1.0


### Reference

- Sturm, A., & Schillo, N. (2025). Efficient and Explicit Block Encoding of Finite Difference Discretizations of the Laplacian. arXiv preprint https://arxiv.org/abs/2509.02429