# Block encoding of select structured matrices
___

This notebook implements the constructions from the paper **Block-encoding structured matrices for data input in quantum computing** (https://quantum-journal.org/papers/q-2024-01-11-1226/).

### **Introduction**

Block encoding is a well-known technique in quantum computing for embedding non-unitary matrices on a quantum computer, which only allows unitary evolution.

***Definition:***  
    Let $ a, n, m \in \mathbb{N} $ such that $ m = a + n $. An $ m $-qubit unitary $ U $ is said to be an $(\alpha, a)$-block-encoding of an $ n $-qubit operator $ A $ if

$$
\tilde{A} = \left( \langle 0 |^{\otimes a} \otimes I_n \right) U \left( |0 \rangle^{\otimes a} \otimes I_n \right)
\tag{1}
$$  

where $A = \alpha \tilde{A}$. The parameters $(\alpha, a)$ represent the *subnormalization factor* (which allows matrices of any norm to be encoded) and the *number of ancilla qubits* used in the block-encoding scheme, respectively.
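
As a minimal numerical sketch of Eq. (1), the following builds a unitary whose top-left block is $\tilde{A} = A/\alpha$ for a diagonal single-qubit operator $A$. The diagonal example, the choice $\alpha = 2$ and the helper names are assumptions made here for illustration only; they are not taken from the paper.

```python
import math


def block_encode_diagonal(diag, alpha):
    """Return a 2N x 2N real unitary whose top-left N x N block is diag(diag)/alpha.

    Assumes |a| <= alpha for every entry a in diag. The single ancilla qubit is
    the most significant one, so the top-left block is the <0| ... |0> block.
    """
    n = len(diag)
    U = [[0.0] * (2 * n) for _ in range(2 * n)]
    for k, a in enumerate(diag):
        c = a / alpha                # entry of A-tilde
        s = math.sqrt(1.0 - c * c)   # completes each 2x2 block to a reflection
        U[k][k] = c
        U[k][k + n] = s
        U[k + n][k] = s
        U[k + n][k + n] = -c
    return U


def times_own_transpose(U):
    """Compute U @ U^T for a real square matrix given as nested lists."""
    size = len(U)
    return [
        [sum(U[r][t] * U[c][t] for t in range(size)) for c in range(size)]
        for r in range(size)
    ]


A_diag = [0.6, -1.5]   # illustrative operator A = diag(0.6, -1.5)
alpha = 2.0            # any alpha >= max |A_kk| works here
U = block_encode_diagonal(A_diag, alpha)

# Rescaling the top-left block by alpha recovers A.
recovered = [alpha * U[k][k] for k in range(len(A_diag))]
print(recovered)
```

In this toy case $U$ is a $(2, 1)$-block-encoding of $A$: one ancilla qubit and subnormalization factor $2$.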

Efficiently block encoding arbitrary matrices is a hard problem, and the task is nontrivial even for well-structured, sparse matrices. The paper by Sünderhauf et al. (2024) (https://quantum-journal.org/papers/q-2024-01-11-1226/) provides efficient quantum circuits for block encoding arithmetically structured matrices. This is useful in many applications, especially those involving linear algebra problems. Moreover, given an efficient block encoding of a matrix $\tilde{A}$, it is possible to efficiently construct a block encoding of certain polynomials of $\tilde{A}$ through the quantum singular value transformation (QSVT).

### **Notebook contents**
- ##### Block encoding circuits for Checkerboard matrix, Toeplitz matrix, Tridiagonal symmetric matrix and 2D Laplacian matrix.
___

### **Notation for Sparse Matrix Structures**

We consider an $N \times N$ sparse matrix $A$ whose nonzero entries are drawn from a fixed set of $D$ distinct data values:
$$
A_d \quad (0 \le d < D)
$$
Multiple positions in the matrix may contain the same data value.

We denote:
- $M$: maximum number of times any of the $D$ values appears in the matrix (multiplicity of a data value)
- $S_c$: maximum number of nonzero entries in any column  
- $S_r$: maximum number of nonzero entries in any row  
- $S = \max(S_c, S_r)$: overall sparsity level (smaller $S$ means a sparser matrix)

We assume the following to hold for this work:
- all $D$ data items have the same multiplicity $M$,
- all rows and columns have equal sparsity: $S_c = S_r = S$,
- all quantities are powers of two.

Then, the total number of nonzero entries satisfies
$$
MD = NS_c = NS_r = \#\text{nonzero}. 
$$

**Note:** If the above equality does not hold, we pad $S$ and/or $D$ until it is satisfied.
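
The counting identity can be checked on a small example. The sketch below computes $D$, $M$, $S_c$, $S_r$ and the number of nonzeros for a $4 \times 4$ checkerboard pattern; the helper `sparsity_stats` and the two entry values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def sparsity_stats(matrix):
    """Return (D, M, S_c, S_r, nnz) for a square matrix given as nested lists."""
    n = len(matrix)
    counts = Counter(v for row in matrix for v in row if v != 0)
    D = len(counts)              # number of distinct nonzero values
    M = max(counts.values())     # largest multiplicity of any value
    S_r = max(sum(1 for v in row if v != 0) for row in matrix)
    S_c = max(sum(1 for r in range(n) if matrix[r][c] != 0) for c in range(n))
    nnz = sum(counts.values())
    return D, M, S_c, S_r, nnz


N = 4
checkerboard = [
    [0.5 if (i + j) % 2 == 0 else -0.25 for j in range(N)] for i in range(N)
]
D, M, S_c, S_r, nnz = sparsity_stats(checkerboard)
print(D, M, S_c, S_r, nnz)
print(M * D == N * S_c == N * S_r == nnz)
```

For this pattern no padding is needed, since both values appear equally often and every row and column is full.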

Each nonzero entry can then be uniquely labeled in three equivalent ways:
1. **By data value and repetition index:** $(d, m)$  
2. **By column index and sparsity index:** $(j, s_c)$  
3. **By row index and sparsity index:** $(i, s_r)$  

These perspectives describe the same sparsity pattern using different coordinate systems.


### **Block encoding ingredients**

We first label each matrix entry by a data label, say $(d,m)$. We then require the following ingredients to block encode the matrix described above.

- Oracle $ O_c$: $$ O_c |d, m\rangle \rightarrow\left|j, s_c\right\rangle $$ where $s_c$ is the column sparsity index, $0 \le s_c < S_c$.
- Oracle $ O_r$: $$ O_r |d, m\rangle \rightarrow\left|i, s_r\right\rangle $$ where $s_r$ is the row sparsity index, $0 \le s_r < S_r$.

The above two oracles are solely dependent on the structure of the matrix in question and can be constructed through appropriate quantum arithmetic. 

- Apart from these, a data-loading oracle encoding the values of the $D$ data items is required:
$$
O_{\text {data }}=\sum_{d=0}^{D-1} R_X\left(2 \arccos \left(A_d /\|A\|_{\max }\right)\right) \otimes|d\rangle\langle d|
$$

The factor $\|A\|_{\max }=\max _d\left|A_d\right|$ is needed to ensure all values are in-range of the arccosine. Provided the first qubit is initialised and postselected as $|0\rangle$, the effect of $O_{\text {data }}$ is loading the correct values:

$$
O_{\text {data }}|0\rangle \otimes|d\rangle=\left(A_d /\|A\|_{\max }|0\rangle-i \sqrt{1-\left(\left|A_d\right| /\|A\|_{\max }\right)^2}|1\rangle\right) \otimes|d\rangle .
$$
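
The effect of the rotation angle $2\arccos(A_d/\|A\|_{\max})$ can be checked numerically. The sketch below uses the standard identity $R_X(\theta)|0\rangle = \cos(\theta/2)|0\rangle - i\sin(\theta/2)|1\rangle$; the data values and helper names are illustrative assumptions.

```python
import math


def rx_on_zero(theta):
    """Amplitudes of R_X(theta)|0> = cos(theta/2)|0> - i sin(theta/2)|1>."""
    return complex(math.cos(theta / 2), 0.0), complex(0.0, -math.sin(theta / 2))


def load_value(a_d, a_max):
    """Amplitudes produced by the rotation angle 2*arccos(a_d / a_max)."""
    theta = 2 * math.acos(a_d / a_max)
    return rx_on_zero(theta)


values = [0.3, -1.2, 0.8]              # illustrative data items A_d
a_max = max(abs(v) for v in values)    # ||A||_max keeps the arccos argument in [-1, 1]

for a_d in values:
    amp0, amp1 = load_value(a_d, a_max)
    # The |0> amplitude, rescaled by ||A||_max, reproduces the data value.
    print(a_d, amp0.real * a_max, amp1)
```

Negative values are handled as well, because $\arccos$ maps $[-1, 1]$ onto $[0, \pi]$ and the $|1\rangle$ amplitude stays on the negative imaginary axis.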

### **General Block encoding circuit**

<p align="center">
  <img src="General_BE.png" width="75%">
</p>


#### **Important Note**
- In the illustrations below, we scale the matrix entries by dividing the unscaled $A$ by $||A||_{max}$, so that the entries of the scaled matrix $A_s$ lie between $-1$ and $+1$. We block encode the scaled matrix $A_s$ (where $||A_{s}||_{max}= 1$).

___

#### Necessary modules and global functions

In [1]:
# import all necessary modules
import builtins
import math
import random
import time

import matplotlib.pyplot as plt
import numpy as np

np.set_printoptions(
    precision=2, suppress=True, linewidth=120, threshold=10000
)  # print options for neatness

import classiq
from classiq import *
from classiq.qmod.symbolic import *


# function to extract unique values from structured matrices
def _unique_values_core(matrix, mode, Nx=None):
    """
    helper to extract unique values (as needed for O_data oracle) from structured matrices.

    Parameters
    ----------
    matrix : array-like or sparse-like
        Input matrix (Laplacian, tridiagonal, Toeplitz).
    mode : {'laplacian', 'tridiagonal', 'toeplitz'}
        Extraction mode.
    Nx : int, optional
        Grid size in x-direction (only needed for 'laplacian').

    Returns
    -------
    list
        List of unique values
    """

    mat = matrix

    if mode == "laplacian":
        if Nx is None:
            raise ValueError("Nx must be provided for mode='laplacian'.")
        # 2D Laplacian: pick the main diag, 1-step, and Nx-step diagonals
        vals = [
            mat.diagonal(0)[0],  # A0
            mat.diagonal(1)[0],  # A1
            mat.diagonal(Nx)[0],  # A2
        ]
        return vals

    elif mode == "tridiagonal":
        size = mat.shape[0]
        vals = []

        for i in range(size):
            for j in range(size):
                v = mat[i, j]
                if v not in vals:
                    vals.append(v)

        return vals

    elif mode == "toeplitz":
        N = mat.shape[0]
        vals = []
        for offset in range(-N + 1, N):
            diagonal = np.diag(mat, k=offset)
            if diagonal.size > 0:
                v = diagonal[0]
                if v != 0 and v not in vals:
                    vals.append(v)
        return vals[::-1]

    else:
        raise ValueError(f"Unknown mode '{mode}'.")


# post-processing function to get reduced state vector after projection on ancilla=0
def get_projected_state_vector(
    execution_result,
    measured_var: str,
    projections: dict,
) -> np.ndarray:
    """
    This function returns a reduced statevector from execution results.
    measured var: the name of the reduced variable
    projections: on which values of the other variables to project, e.g., {"anc_M": 1}
    """
    projected_size = len(execution_result.output_qubits_map[measured_var])
    proj_statevector = np.zeros(2**projected_size).astype(complex)
    for sample in execution_result.parsed_state_vector:
        if all(
            int(sample.state[key]) == projections[key] for key in projections.keys()
        ):
            value = int(sample.state[measured_var])
            proj_statevector[value] += sample.amplitude
    global_phase = np.angle(proj_statevector[0])
    return np.real(proj_statevector / np.exp(1j * global_phase))


# general oracle to load data values from a classical array TM into data register using angle encoding
@qfunc
def oracle_d(s: QNum, data: QNum, TM: CArray[CReal]):
    repeat(
        TM.len,
        lambda i: control(
            s == i, lambda: (a_d := TM[i], theta := 2 * acos(a_d), RX(theta, data))
        ),
    )

### 1. Checkerboard Matrix



A checkerboard matrix $A\in\mathbb{R}^{N\times N}$ is any matrix whose entries alternate between two scalar values depending on the parity of the index sum.

Given two values $a_0,a_1\in[-1,1]$, we define
$$
A_{ij} =
\begin{cases}
a_0, & (i+j)\ \text{even},\\[4pt]
a_1, & (i+j)\ \text{odd},
\end{cases} \qquad i,j=0,\dots,N-1.
$$
Thus the matrix contains only two distinct values and a simple periodic pattern
that is useful for testing structure-aware block-encoding. An example heatmap showing the structure is given below:

<p align="center">
  <img src="checkerboard.png" width="25%">
</p>

For convenience, the index $m\in\{0,\dots,N^2/2-1\}$ is decomposed into three parts,
$$
m \;=\; N\,m^{\mathrm{hi}} \;+\; \frac{N}{2}\,m^{\mathrm{mid}} \;+\; m^{\mathrm{lo}},
$$
where $m^{\mathrm{hi}}$ and $m^{\mathrm{lo}}$ are the high and low $\log_2(N/2)$-bit segments and $m^{\mathrm{mid}}$ is the single bit between them.
Given a data index $d\in\{0,1\}$ and multiplicity index $m$, the row and column
indices $(i,j)$ of the checkerboard matrix are obtained as
\begin{align}
i(d,m) &= \left\lfloor \frac{m}{N/2} \right\rfloor
       \;=\; 2\,m^{\mathrm{hi}} + m^{\mathrm{mid}}, \\[6pt]
j(d,m) &= 2\,\big(m \bmod (N/2)\big) \;+\; \big((d + i(d,m)) \bmod 2\big)
       \;=\; 2\,m^{\mathrm{lo}} + \big((d + m^{\mathrm{mid}}) \bmod 2\big).
\end{align}
These relations give a compact arithmetic mapping from the $(d,m)$ labels to the corresponding matrix entry $(i,j)$ in the $N\times N$ checkerboard matrix. It follows that $O_r$ is simply the identity operator; the block-encoding circuit is shown below.
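
The mapping above can be verified classically. The following sketch (the function name is hypothetical) enumerates all labels $(d,m)$ for $N=8$ and checks that every matrix position is reached exactly once and that $d$ matches the parity of $i+j$.

```python
def checkerboard_label_to_index(d, m, N):
    """Map a data label (d, m) to the matrix position (i, j)."""
    half = N // 2
    i = m // half                       # i = 2*m_hi + m_mid
    j = 2 * (m % half) + (d + i) % 2    # j = 2*m_lo + ((d + m_mid) mod 2)
    return i, j


N = 8
positions = {}
for d in (0, 1):
    for m in range(N * N // 2):
        positions[(d, m)] = checkerboard_label_to_index(d, m, N)

# Every matrix position is reached exactly once ...
print(len(set(positions.values())) == N * N)
# ... and the data index d equals the parity of i + j.
print(all((i + j) % 2 == d for (d, _), (i, j) in positions.items()))
```

Using `(d + i) % 2` in place of `(d + m_mid) % 2` is equivalent, because $i = 2m^{\mathrm{hi}} + m^{\mathrm{mid}}$ has the same parity as $m^{\mathrm{mid}}$.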


<p align="center">
  <img src="checkerboard_circuit.png" width="50%">
</p>


Here the subnormalization factor is $N$ and $n+1$ ancilla qubits are used, where $n = \log_2 N$. The cost of using a block encoding grows with both the subnormalization factor and the number of ancilla qubits, so efficient block encodings try to minimize these resources.

In [2]:
def checkerboard_matrix(N, val_even=None, val_odd=None, rng=None, decimals=2):
    """
    Generate an N×N checkerboard matrix with alternating values.

    Parameters
    ----------
    N : int
        Matrix dimension.

    val_even : float, optional
        Value at positions where (i + j) is even. Random in [-1,1] if None.

    val_odd : float, optional
        Value at positions where (i + j) is odd. Random in [-1,1] if None.

    rng : np.random.Generator, optional
        Random generator for reproducibility.

    decimals : int, optional
        Decimal precision for the values.

    Returns
    -------
    np.ndarray, list
        The checkerboard matrix and the list [val_even, val_odd].
    """

    if rng is None:
        rng = np.random.default_rng()

    # Randomly choose the two values if not provided and round them
    if val_even is None:
        val_even = float(rng.uniform(-1, 1))
    else:
        val_even = float(val_even)
    if val_odd is None:
        val_odd = float(rng.uniform(-1, 1))
    else:
        val_odd = float(val_odd)

    val_even = round(val_even, decimals)
    val_odd = round(val_odd, decimals)

    # checkerboard pattern
    i = np.arange(N).reshape(-1, 1)
    j = np.arange(N).reshape(1, -1)
    parity = (i + j) % 2  # 0 for even, 1 for odd positions

    M = np.where(parity == 0, val_even, val_odd).astype(float)

    # Round matrix entries to the requested number of decimals
    M = np.round(M, decimals)

    return M, [val_even, val_odd]


# Test case
if __name__ == "__main__":
    N = 8
    M, values = checkerboard_matrix(N)
    print("Checkerboard matrix:\n")
    print(M)
    print("Values (val_even, val_odd):", values)

Checkerboard matrix:

[[-0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84]
 [-0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08]
 [-0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84]
 [-0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08]
 [-0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84]
 [-0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08]
 [-0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84]
 [-0.84 -0.08 -0.84 -0.08 -0.84 -0.08 -0.84 -0.08]]
Values (val_even, val_odd): [-0.08, -0.84]


In [3]:
@qfunc
def array_swap(j: QNum, k: QNum):
    z_arr1 = QArray("z1")
    z_arr2 = QArray("z2")
    bind(j, z_arr1)
    bind(k, z_arr2)
    repeat(z_arr1.len, lambda i: SWAP(z_arr1[i], z_arr2[i]))
    bind(z_arr1, j)
    bind(z_arr2, k)


@qfunc
def checkerboard_BE(data: QNum, s: QNum, j: QNum):
    n = int(math.log2(N / 2))

    s0 = QNum("s0", 1, False, 0)
    s1 = QNum("s1", n, False, 0)

    j0 = QNum("j0", n, False, 0)
    j1 = QNum("j1", 1, False, 0)

    bind(s, [s0, s1])
    bind(j, [j0, j1])

    array_swap(s1, j0)
    CX(s0, j1)

    oracle_d(
        s0, data, values
    )  # refers to O_data which is common across all block encodings in this notebook

    bind([s0, s1], s)
    bind([j0, j1], j)


@qfunc
def main(data: Output[QNum], s: Output[QNum], j: Output[QNum]):
    n = int(math.log2(N))

    allocate(n, s)
    allocate(n, j)
    allocate(1, data)

    apply_to_all(H, j)

    within_apply(lambda: apply_to_all(H, s), lambda: checkerboard_BE(data, s, j))


preferences = Preferences(timeout_seconds=1500, optimization_level=3)
constraints = Constraints(
    optimization_parameter=OptimizationParameter.DEPTH,
    max_width=10,  # change max_width as needed, minimum width is 2n+1
)
qmod = create_model(main, preferences=preferences)
qmod = set_constraints(qmod, constraints)

# write_qmod(
#     qmod,
#     'checkerboard_BE_N{}'.format(N)
# )

# Synthesize the quantum program
backend_preferences = ClassiqBackendPreferences(backend_name="simulator_statevector")
qmod = set_execution_preferences(
    qmod,
    execution_preferences=ExecutionPreferences(
        num_shots=1, backend_preferences=backend_preferences
    ),
)
qprog = synthesize(qmod)
# show(qprog)
circuit_width = qprog.data.width
circuit_depth = qprog.transpiled_circuit.depth
print("The circuit width is:", circuit_width)
print("The circuit depth is:", circuit_depth)

job = execute(qprog)
results = job.result()[0].value


reduced_state = get_projected_state_vector(results, "j", {"s": 0, "data": 0})
reduced_state = reduced_state * N  # rescale by subnormalization factor N
print("The reduced state vector for j when s=0, data=0 is:")
print(reduced_state)


# theoretical reduced state vector
theoretical_reduced_state = np.matmul(M, [1 / np.sqrt(N) for i in range(N)])
print("Theoretical reduced state vector for j when l=0 and data=0 is:")
print(theoretical_reduced_state)

The circuit width is: 7
The circuit depth is: 11
The reduced state vector for j when s=0, data=0 is:
[1.3 1.3 1.3 1.3 1.3 1.3 1.3 1.3]
Theoretical reduced state vector for j when l=0 and data=0 is:
[-1.3 -1.3 -1.3 -1.3 -1.3 -1.3 -1.3 -1.3]


#### **Note:** The state-vector results should match up to an overall global phase. The simulated and theoretical results above differ only by a global sign, so they agree.
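
A comparison up to a global phase can be automated. The helper below is a hypothetical sketch, not part of the notebook's code; it estimates the relative phase from the largest entry and then compares all entries. The input vectors are the rounded values printed above.

```python
def equal_up_to_global_phase(u, v, tol=1e-9):
    """Return True if u == exp(i*phi) * v for some real phi."""
    # Use the largest entry of v to estimate the relative phase.
    k = max(range(len(v)), key=lambda idx: abs(v[idx]))
    if abs(v[k]) < tol:
        return all(abs(x) < tol for x in u)
    phase = u[k] / v[k]
    if abs(abs(phase) - 1.0) > tol:
        return False  # the ratio is not a pure phase
    return all(abs(a - phase * b) < tol for a, b in zip(u, v))


simulated = [1.3] * 8
theoretical = [-1.3] * 8
print(equal_up_to_global_phase(simulated, theoretical))
```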

____

### 2. Toeplitz Matrix

Consider an $N \times N$ Toeplitz matrix with $D$ diagonals (or $D$ values), offset from the main diagonal by $k$:

$$
\left(\begin{array}{cccccccc}
A_k & A_{k-1} & \cdots & A_0 & & & & \\
A_{k+1} & A_k & A_{k-1} & \cdots & A_0 & & & \\
\vdots & A_{k+1} & A_k & A_{k-1} & \cdots & A_0 & & \\
A_{D-1} & \vdots & A_{k+1} & A_k & A_{k-1} & \cdots & A_0 & \\
& A_{D-1} & \vdots & A_{k+1} & A_k & A_{k-1} & \cdots & A_0 \\
& & A_{D-1} & \vdots & A_{k+1} & A_k & A_{k-1} & \cdots \\
& & & A_{D-1} & \vdots & A_{k+1} & A_k & A_{k-1} \\
& & & & A_{D-1} & \vdots & A_{k+1} & A_k
\end{array}\right)
$$

For the distinct values, we choose $d$ as equal to the subscript of the matrix elements above. For $m$, we simply choose the column index. Arithmetically, the mapping to row ($i$) and column indices ($j$) is then:

$$
i(d, m)=d-k+m, \quad j(d, m)=m
$$

and out-of-range $(d, m)$ pairs are those where an overflow or underflow in the calculation of $i$ occurs, that is when

$$
d-k+m<0 \text { or } d-k+m \geq N \text {. }
$$

Note that the range of the data labels are as follows:

$i,j,m: \{0,1,.., N-1\} , d: \{0,1,..., D-1\}$


For a Toeplitz matrix, the elements in any given row or column all carry distinct data labels $d$. Hence no separate arithmetic is required for $s_c$ and $s_r$: both can be set equal to $d$, since their ranges also match.
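
The label arithmetic, including the out-of-range condition, can be sketched classically as follows. The function names and the four diagonal values are illustrative assumptions; the code simply visits every label $(d, m)$ and writes $A_d$ at position $(i, j)$ whenever the pair is in range.

```python
def toeplitz_label_to_index(d, m, k, N):
    """Return (i, j) for the label (d, m), or None if the pair is out of range."""
    i = d - k + m
    j = m
    if i < 0 or i >= N:
        return None   # underflow or overflow in the row index
    return i, j


def toeplitz_from_labels(values, k, N):
    """Assemble the N x N matrix by visiting every label (d, m)."""
    A = [[0.0] * N for _ in range(N)]
    for d, a_d in enumerate(values):
        for m in range(N):
            pos = toeplitz_label_to_index(d, m, k, N)
            if pos is not None:
                A[pos[0]][pos[1]] = a_d
    return A


values = [0.54, 0.45, 0.86, 0.93]   # illustrative A_0 .. A_3
A = toeplitz_from_labels(values, k=1, N=8)
print(A[0][:3])
print(A[3][:5])
```

With $k=1$, the value $A_1$ lands on the main diagonal, $A_0$ on the superdiagonal, and $A_2$, $A_3$ on the first two subdiagonals, in agreement with the matrix layout shown above.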

#### Block encoding circuit:

<p align="center">
  <img src="toeplitz_circuit.png" width="50%">
</p>

Here the subnormalization factor is $D$ and $2+\log_2 D$ ancilla qubits are used.

In [4]:
# Toeplitz matrix
def generate_toeplitz_matrix(D, N, k):
    """
    Generate a Toeplitz matrix with D diagonals and size NxN.

    Parameters:
    D (int): Number of diagonals, must be a power of 2.
    N (int): Size of the matrix, must be a power of 2 and N >= D.
    k (int): Offset from the main diagonal.

    Returns:
    numpy.ndarray: NxN Toeplitz matrix.
    """
    # Check if D and N are powers of 2
    if not (D & (D - 1) == 0 and D > 0):
        raise ValueError("D must be a power of 2.")
    if not (N & (N - 1) == 0 and N > 0):
        raise ValueError("N must be a power of 2.")
    if N < D:
        raise ValueError("N must be greater than or equal to D.")
    if k < 0 or k >= D:
        raise ValueError("k must be between 0 and D-1.")

    # Generate random diagonal values between -1 and 1
    diagonals = np.random.uniform(-1, 1, D)

    # Generate the Toeplitz matrix
    toeplitz_matrix = np.zeros((N, N))
    for offset in range(D):
        diag_index = k - offset
        if diag_index >= 0:
            np.fill_diagonal(toeplitz_matrix[:, diag_index:], diagonals[offset])
        if diag_index < 0:
            np.fill_diagonal(toeplitz_matrix[-diag_index:, :], diagonals[offset])

    return toeplitz_matrix


def get_unique_toeplitz(toeplitz_matrix):
    """
    Get the list of unique values used in the Toeplitz matrix,
    ordered from 0th to (D-1)th diagonal
    """
    return _unique_values_core(toeplitz_matrix, mode="toeplitz")


# Test case
D = 4  # Number of diagonals (must be a power of 2, for now)
N = 16  # Size of the matrix (must be a power of 2 and >= D)
k = 1  # Offset from the main diagonal, k must be between 0 and D-1

toeplitz_matrix = generate_toeplitz_matrix(D, N, k)
print("Toeplitz Matrix:\n", toeplitz_matrix)

unique_values_toeplitz = get_unique_toeplitz(
    toeplitz_matrix
)  # unique values in the matrix (each has a unique label- d)
print("Unique values:", unique_values_toeplitz)

Toeplitz Matrix:
 [[0.45 0.54 0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.86 0.45 0.54 0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.   0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.   0.   0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.   0.   0.   0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.   0.   0.   0.  ]
 [0.   0.   0.   0.   0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.   0.   0.  ]
 [0.   0.   0.   0.   0.   0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.   0.  ]
 [0.   0.   0.   0.   0.   0.   0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.   0.  ]
 [0.   0.   0.   0.   0.   0.   0.   0.   0.93 0.86 0.45 0.54 0.   0.   0.   0.  ]
 [0.   0.   0.   0.   0.   0.   0.   0.   0.   0.93 0.86 0.45 0.54 0.

In [5]:
@qfunc
def oracle_r(j: QNum, s: QNum, k: CInt):
    inplace_add(s - k, j)


@qfunc
def Toep_BE(data: QNum, s: QNum, j: QNum, dlt: QBit):
    T_M = unique_values_toeplitz
    oracle_d(s, data, T_M)

    tmp = QNum("tmp")
    tmp |= s - k + j

    oracle_r(j, s, k)
    control(tmp < 0, lambda: X(dlt))
    control(tmp >= N, lambda: X(dlt))


@qfunc
def main(data: Output[QNum], s: Output[QNum], j: Output[QNum], dlt: Output[QBit]):
    d = int(math.log2(D))
    n = int(math.log2(N))

    allocate(d, s)
    allocate(n, j)
    allocate(1, data)
    allocate(1, dlt)

    apply_to_all(H, j)

    within_apply(lambda: apply_to_all(H, s), lambda: Toep_BE(data, s, j, dlt))


preferences = Preferences(timeout_seconds=1500, optimization_level=3)
constraints = Constraints(
    optimization_parameter=OptimizationParameter.DEPTH,
    max_width=18,  # change max_width as needed, minimum width as per algo is n+2+ log2(D)- but more might be used by synthesizer to do the arithmetics
)
qmod = create_model(main, preferences=preferences)
qmod = set_constraints(qmod, constraints)

# write_qmod(
#     qmod,
#     'toeplitz_BE_N{}'.format(N)
# )

# Synthesize the quantum program
backend_preferences = ClassiqBackendPreferences(backend_name="simulator_statevector")
qmod = set_execution_preferences(
    qmod,
    execution_preferences=ExecutionPreferences(
        num_shots=1, backend_preferences=backend_preferences
    ),
)
qprog = synthesize(qmod)
# show(qprog)
circuit_width = qprog.data.width
circuit_depth = qprog.transpiled_circuit.depth
print("The circuit width is:", circuit_width)
print("The circuit depth is:", circuit_depth)


job = execute(qprog)
results = job.result()[0].value


reduced_state = get_projected_state_vector(results, "j", {"s": 0, "data": 0, "dlt": 0})
reduced_state = reduced_state * D  # D is the subnormalization factor
print("The reduced state vector for j when s=0, dlt=1 and data=0 is:")
print(reduced_state)


# theoretical reduced state vector
theoretical_reduced_state = np.matmul(
    toeplitz_matrix, [1 / np.sqrt(N) for i in range(N)]
)
print("Theoretical reduced state vector for j when l=0 and data=0 is:")
print(theoretical_reduced_state)



The circuit width is: 18
The circuit depth is: 225
The reduced state vector for j when s=0, dlt=1 and data=0 is:
[0.25 0.46 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.56]
Theoretical reduced state vector for j when l=0 and data=0 is:
[0.25 0.46 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.69 0.56]


____

### 3. Tridiagonal symmetric Matrix

We now consider a tridiagonal matrix which is also symmetric. This matrix has the below structure and sparsity pattern.


<p align="center">
  <img src="symmetric_tridiagonal.png" width="25%">
</p>

The block-encoding technique is similar to the one for the Toeplitz matrix. The oracles $O_c$ and $O_r$ need to be constructed appropriately using quantum arithmetic. The oracle $O_{data}$ for loading the unique data values is identical to the one above. The full block-encoding circuit is shown below:


<p align="center">
  <img src="tridiagonal_symmetric_circuit.png" width="50%">
</p>

Here the subnormalization factor is $3$ and $3$ ancilla qubits are used irrespective of matrix size.

In [6]:
# Tridiagonal symmetric matrix
def generate_tridiagonal_symmetric_matrix(size):
    """
    Generate a symmetric tridiagonal matrix of given size.
    The size must be a power of 2, and diagonal values are distinct random values between -1 and 1.

    Parameters:
        size (int): Size of the matrix (must be a power of 2).

    Returns:
        np.ndarray: A symmetric tridiagonal matrix.

    Raises:
        ValueError: If the size is not a power of 2.
    """
    if size <= 0 or (size & (size - 1)) != 0:
        raise ValueError("Size must be a power of 2.")

    # Generate distinct random diagonal values
    diagonal_values = np.random.uniform(-1, 1, size)
    while len(set(diagonal_values)) < size:
        diagonal_values = np.random.uniform(-1, 1, size)  # Ensure uniqueness

    # Generate random off-diagonal values between -1 and 1 for tridiagonal matrix
    off_diagonal_values = np.random.uniform(-1, 1, size - 1)

    # Construct the tridiagonal symmetric matrix
    matrix = np.zeros((size, size))
    np.fill_diagonal(matrix, diagonal_values)
    np.fill_diagonal(matrix[1:], off_diagonal_values)
    np.fill_diagonal(matrix[:, 1:], off_diagonal_values)

    return matrix


def unique_val_tridiagonal(matrix):
    """
    Extract unique values from a (tri)diagonal matrix in a specific order.
    The value 0 is always placed at the end of the list.
    """
    vals = _unique_values_core(matrix, mode="tridiagonal")
    non_zero = [v for v in vals if v != 0]
    return non_zero + [0]


# Test case for tridiagonal symmetric matrix, let's take a 16x16 matrix
size = 16  # Must be a power of 2
matrix = generate_tridiagonal_symmetric_matrix(size)
matrix = np.round(matrix, 4)
print("Tridiagonal symmetric system", matrix)
unique_values = unique_val_tridiagonal(
    matrix
)  # unique values in the matrix (each has a unique label- d)

print("Unique values in the matrix:", unique_values)

Tridiagonal symmetric system [[-0.67  0.91  0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.91 -0.01  0.48  0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.48 -0.85  0.02  0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.02 -0.18 -0.08  0.    0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.   -0.08  0.99 -0.2   0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.   -0.2  -0.31 -0.61  0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.   -0.61  0.68 -0.1   0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.   -0.1  -0.05 -0.63  0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.   -0.63 -0.5   0.93  0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.93 -0.65  0.58  0.    0.   

In [7]:
@qfunc
def eq_sup(s: QNum):
    inplace_prepare_amplitudes(
        [np.sqrt(1 / 3), np.sqrt(1 / 3), 0, np.sqrt(1 / 3)], 0, s
    )


@qfunc
def main(s: Output[QNum], j: Output[QNum], data: Output[QNum]):
    n = int(math.log2(size))
    allocate(2, s)
    allocate(n, j)
    allocate(1, data)

    apply_to_all(H, j)
    eq_sup(s)
    T_M = unique_values

    d = QNum("d", n + 1, False, 0)
    s0 = QNum("s0", 1, False, 0)
    s1 = QNum("s1", 1, False, 0)
    bind(s, [s0, s1])
    control(s0 == 1, lambda: control(s1 == 0, lambda: inplace_add(-1, j)))

    bind([s0, j], d)
    oracle_d(d, data, T_M)
    bind(d, [s0, j])

    control(s1 == 1, lambda: inplace_add(1, j))
    bind([s0, s1], s)

    invert(lambda: eq_sup(s))


backend_preferences = ClassiqBackendPreferences(backend_name="simulator_statevector")
constraints = Constraints(
    optimization_parameter=OptimizationParameter.DEPTH,
    max_width=10,  # change max_width as needed, minimum width is n+3
)
preferences = Preferences(timeout_seconds=900)

qmod = create_model(main)
qmod = set_execution_preferences(
    qmod,
    execution_preferences=ExecutionPreferences(
        num_shots=100000, backend_preferences=backend_preferences
    ),
)
qmod = set_preferences(qmod, preferences)
qmod = set_constraints(qmod, constraints)
# write_qmod(
#     qmod,
#     'tridiagonal_BE_size{}'.format(size)
# )

qprog = synthesize(qmod)
# show(qprog)
circuit_width = qprog.data.width
circuit_depth = qprog.transpiled_circuit.depth
print("The circuit width is:", circuit_width)
print("The circuit depth is:", circuit_depth)

job = execute(qprog)
results = job.result()[0].value

reduced_state = get_projected_state_vector(results, "j", {"s": 0, "data": 0})
print("The reduced state vector for j when s=0, data=0 is:")
print(reduced_state)

theoretical_reduced_state = (
    np.matmul(matrix, [1 / np.sqrt(size) for i in range(size)]) / 3
)  # here 3 is the subnormalization factor
print("Expected theoretical state vector:")
print(theoretical_reduced_state)
print("********")

The circuit width is: 10
The circuit depth is: 1382
The reduced state vector for j when s=0, data=0 is:
[ 0.02  0.11 -0.03 -0.02  0.06 -0.09 -0.   -0.06 -0.02  0.07  0.04 -0.12 -0.01  0.03 -0.08 -0.11]
Expected theoretical state vector:
[ 0.02  0.11 -0.03 -0.02  0.06 -0.09 -0.   -0.06 -0.02  0.07  0.04 -0.12 -0.01  0.03 -0.08 -0.11]
********


___

### 4. Two-dimensional Laplacian

The discrete Laplacian in one dimension is obtained from the second-order finite difference stencil  
$$
\Delta f(x_i) = \frac{f(x_{i-1}) - 2 f(x_i) + f(x_{i+1})}{(\Delta x)^2},
$$
which corresponds to a tridiagonal Toeplitz matrix with $-2/(\Delta x)^2$ on the diagonal and $1/(\Delta x)^2$ on the off-diagonals.
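
As a quick sanity check of the 1D stencil, the sketch below builds the tridiagonal Toeplitz matrix and applies it to samples of $f(x) = x^2$, whose second derivative is $2$. The helper name, grid size and spacing are illustrative choices.

```python
def laplacian_1d(n, dx):
    """Tridiagonal Toeplitz matrix of the 1D second-difference stencil."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = -2.0 / dx**2
        if i > 0:
            A[i][i - 1] = 1.0 / dx**2
        if i < n - 1:
            A[i][i + 1] = 1.0 / dx**2
    return A


n, dx = 8, 0.5
xs = [i * dx for i in range(n)]
f = [x * x for x in xs]   # f(x) = x^2, so f''(x) = 2
A = laplacian_1d(n, dx)
second_diff = [sum(A[i][j] * f[j] for j in range(n)) for i in range(n)]

# Only interior points see the full three-point stencil.
print(second_diff[1:-1])
```

The first and last rows are truncated stencils (the missing neighbor is treated as zero), so only the interior entries approximate $f''$.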

To generalize this to two dimensions, consider a rectangular grid of size  
$$
N_x \times N_y,
$$
with grid spacings $\Delta x$ and $\Delta y$.  
A grid point is indexed as $(a,b)$ with  
$$
a = 0,\ldots,N_x-1, \qquad b = 0,\ldots,N_y-1.
$$

The 2D finite-difference Laplacian uses the standard 5-point stencil:
$$
\Delta f(x_a, y_b)
=
\frac{f(x_{a-1},y_b) - 2 f(x_a,y_b) + f(x_{a+1},y_b)}{(\Delta x)^2}
+
\frac{f(x_a,y_{b-1}) - 2 f(x_a,y_b) + f(x_a,y_{b+1})}{(\Delta y)^2}.
$$

To write this as a matrix, we flatten the 2D grid into a vector of length  
$$
N = N_x N_y
$$
using **row-major ordering**:
$$
f_{a + b N_x} = f(x_a, y_b).
$$
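
The flattening rule and its inverse can be sketched as below; the function names are hypothetical.

```python
def flatten_index(a, b, Nx):
    """Map the grid point (a, b) to its position a + b*Nx in the flattened vector."""
    return a + b * Nx


def unflatten_index(p, Nx):
    """Recover the grid point (a, b) from the flattened position p."""
    return p % Nx, p // Nx


Nx, Ny = 4, 4
flat = [flatten_index(a, b, Nx) for b in range(Ny) for a in range(Nx)]
print(flat == list(range(Nx * Ny)))
print(unflatten_index(flatten_index(2, 3, Nx), Nx))
```

In this ordering, neighbors in the $x$-direction are one position apart in the vector, while neighbors in the $y$-direction are $N_x$ positions apart.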

With this ordering, the Laplacian becomes an $N \times N$ sparse matrix $A$.  
Each row corresponds to a grid point and couples only to its nearest neighbors.  
The matrix entries take only a few values:
$$
A_{a_1 + b_1 N_x,\; a_2 + b_2 N_x} =
\begin{cases}
A_0 = -2\!\left(\frac{1}{(\Delta x)^2} + \frac{1}{(\Delta y)^2}\right), & a_1=a_2,\; b_1=b_2, \\[6pt]
A_1 = \frac{1}{(\Delta x)^2}, & |a_1-a_2| = 1,\; b_1=b_2, \\[6pt]
A_2 = \frac{1}{(\Delta y)^2}, & |b_1-b_2| = 1,\; a_1=a_2, \\[6pt]
0, & \text{otherwise}.
\end{cases}
$$

This produces the familiar **five-diagonal block structure**:  
- the main diagonal contains $A_0$,  
- horizontal neighbors contribute $A_1$,  
- vertical neighbors contribute $A_2$,  
- all other entries are zero.
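
The case definition above translates directly into an entry function. The sketch below (hypothetical helper name, unit grid spacings assumed for the example) checks that the resulting matrix is symmetric and has at most five nonzero entries per row.

```python
def laplacian_entry(p, q, Nx, dx, dy):
    """Entry A[p, q] of the 2D Laplacian for flattened indices p and q."""
    a1, b1 = p % Nx, p // Nx
    a2, b2 = q % Nx, q // Nx
    if a1 == a2 and b1 == b2:
        return -2.0 * (1.0 / dx**2 + 1.0 / dy**2)   # A_0
    if abs(a1 - a2) == 1 and b1 == b2:
        return 1.0 / dx**2                          # A_1
    if abs(b1 - b2) == 1 and a1 == a2:
        return 1.0 / dy**2                          # A_2
    return 0.0


Nx, Ny = 4, 4
N = Nx * Ny
A = [[laplacian_entry(p, q, Nx, 1.0, 1.0) for q in range(N)] for p in range(N)]

row_counts = [sum(1 for v in row if v != 0) for row in A]
print(max(row_counts))
print(all(A[p][q] == A[q][p] for p in range(N) for q in range(N)))
```

Rows belonging to boundary grid points have fewer than five nonzeros.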

For example, when $N_x = N_y = 4$, we get the below structure:


<p align="center">
  <img src="2D_Lap.png" width="25%">
</p>


Below is the full block encoding circuit (which is also Hermitian in this case):

<p align="center">
  <img src="2D_Laplacian_circuit.png" width="60%">
</p>

Here the subnormalization factor is $5$ and $5$ ancilla qubits are used, irrespective of matrix size.  





In [8]:
# Laplacian Matrix
def generate_laplacian_matrix(Nx, Ny, delta_x, delta_y):
    N = Nx * Ny
    A = np.zeros((N, N))

    A0 = -2 / (delta_x**2) - 2 / (delta_y**2)
    A1 = 1 / (delta_x**2)
    A2 = 1 / (delta_y**2)

    for a1 in range(Nx):
        for b1 in range(Ny):
            index = a1 + b1 * Nx

            A[index, index] = A0

            # off-diagonal in x-direction
            if a1 > 0:
                A[index, index - 1] = A1
            if a1 < Nx - 1:
                A[index, index + 1] = A1

            # off-diagonal in y-direction
            if b1 > 0:
                A[index, index - Nx] = A2
            if b1 < Ny - 1:
                A[index, index + Nx] = A2

    return A


def unique_val_lap(laplacian, Nx):
    """Extract unique diagonal values from a 2D Laplacian matrix."""
    return _unique_values_core(laplacian, mode="laplacian", Nx=Nx)


# Test case for 2D Laplacian matrix
Nx = 4  # Number of grid points in x-direction (must be a power of 2)
Ny = 4  # Number of grid points in y-direction (must be a power of 2)
delta_x = 1.0  # Grid spacing in x-direction
delta_y = 1.0  # Grid spacing in y-direction
laplacian = generate_laplacian_matrix(Nx, Ny, delta_x, delta_y)
print("2D Laplacian Matrix:\n", laplacian)
lap_scaled = laplacian / np.max(np.abs(laplacian))

unique_values_laplacian = unique_val_lap(lap_scaled, Nx)

2D Laplacian Matrix:
 [[-4.  1.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 1. -4.  1.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  1. -4.  1.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1. -4.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 1.  0.  0.  0. -4.  1.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  1. -4.  1.  0.  0.  1.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0.  0.  1. -4.  1.  0.  0.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  1. -4.  0.  0.  0.  1.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.  0. -4.  1.  0.  0.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  1.  0.  0.  1. -4.  1.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  1. -4.  1.  0.  0.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  1. -4.  0.  0.  0.  1.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0. -4.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  1. -4.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0

In [9]:
j1 = int(math.log2(Nx))
j2 = int(math.log2(Ny))


@qfunc
def eq_sup_lap(s: QNum):
    inplace_prepare_state([1 / 5, 0, 1 / 5, 1 / 5, 1 / 5, 1 / 5, 0, 0], 0, s)


@qfunc
def oracle_org(s: QArray, j: QNum, dlt: QBit):
    j1 = int(math.log2(Nx))  # register sizes must be integers
    j2 = int(math.log2(Ny))
    j1reg = QNum("j1", j1, False, 0)
    j2reg = QNum("j2", j2, False, 0)
    s0 = QNum("s0", 1, False, 0)
    s1 = QNum("s1", 1, False, 0)
    s2 = QNum("s2", 1, False, 0)
    bind(s, [s0, s1, s2])
    bind(j, [j1reg, j2reg])
    control(s0 == 1, lambda: control(s1 == 0, lambda: control(s2 == 0, lambda: X(dlt))))
    control(s1 == 1, lambda: control(j1reg == 0, lambda: X(dlt)))
    control(s2 == 1, lambda: control(j2reg == 0, lambda: X(dlt)))
    bind([j1reg, j2reg], j)
    bind([s0, s1, s2], s)


@qfunc
def laplacian_BE(s: QNum, j: QNum, data: QNum, dlt: QNum):
    j1reg = QNum("j1", j1, False, 0)
    j2reg = QNum("j2", j2, False, 0)
    s0 = QNum("s0", 1, False, 0)
    s1 = QNum("s1", 1, False, 0)
    s2 = QNum("s2", 1, False, 0)
    bind(s, [s0, s1, s2])
    bind(j, [j1reg, j2reg])
    control(s0 == 0, lambda: control(s2 == 1, lambda: inplace_add(1, j2reg)))
    bind([j1reg, j2reg], j)
    control(s1 == 1, lambda: control(s0 == 0, lambda: inplace_add(1, j)))
    bind([s0, s1, s2], s)

    oracle_org(s, j, dlt)

    bind(s, [s0, s1, s2])
    d = QNum("d", 2, False, 0)
    bind([s1, s2], d)
    Z(data)
    oracle_d(d, data, unique_values_laplacian)
    bind(d, [s1, s2])

    control(s1 == 1, lambda: X(s0))
    control(s2 == 1, lambda: X(s0))

    # s0 was flipped above on the s1 == 1 and s2 == 1 branches, so these
    # controls now act on the branches that were not shifted before the oracles.
    # TODO: confirm the behaviour at j = 0 when j is in superposition.
    control(s1 == 1, lambda: control(s0 == 0, lambda: inplace_add(-1, j)))
    bind(j, [j1reg, j2reg])
    control(s2 == 1, lambda: control(s0 == 0, lambda: inplace_add(-1, j2reg)))
    bind([j1reg, j2reg], j)
    bind([s0, s1, s2], s)


@qfunc
def main(s: Output[QNum], j: Output[QNum], data: Output[QNum], dlt: Output[QBit]):
    allocate(3, s)
    allocate(1, dlt)
    allocate(1, data)
    allocate(j1 + j2, j)
    apply_to_all(H, j)

    within_apply(lambda: eq_sup_lap(s), lambda: laplacian_BE(s, j, data, dlt))


# print(states)
qmod = create_model(main)
backend_preferences = ClassiqBackendPreferences(backend_name="simulator_statevector")
qmod = set_execution_preferences(
    qmod,
    execution_preferences=ExecutionPreferences(
        num_shots=1000000, backend_preferences=backend_preferences
    ),
)
constraints = Constraints(
    optimization_parameter=OptimizationParameter.DEPTH,
    max_width=10,  # change max_width as needed, minimum width is j1+j2+5
)
qmod = set_constraints(qmod, constraints)

# write_qmod(
#     qmod,
#     'laplacian_BE_Nx{}_Ny{}'.format(Nx, Ny)
# )

qprog = synthesize(qmod)
# show(qprog)
circuit_width = qprog.data.width
circuit_depth = qprog.transpiled_circuit.depth
print("The circuit width is:", circuit_width)
print("The circuit depth is:", circuit_depth)

job = execute(qprog)
results = job.result()[0].value

# Post processing
reduced_state = get_projected_state_vector(results, "j", {"s": 0, "data": 0, "dlt": 0})
print("The reduced state vector for j when s=0, data=0 is:")
print(reduced_state)

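# Theoretical reduced state vector: scaled Laplacian applied to the uniform
# superposition, divided by the subnormalization factor 5
uniform_state = np.full(Nx * Ny, 1 / np.sqrt(Nx * Ny))
theoretical_reduced_state = np.matmul(lap_scaled, uniform_state) / 5
print("Expected theoretical state vector:")
print(theoretical_reduced_state)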
print("********")

The circuit width is: 10
The circuit depth is: 324
The reduced state vector for j when s=0, data=0 is:
[0.03 0.01 0.01 0.03 0.01 0.   0.   0.01 0.01 0.   0.   0.01 0.03 0.01 0.01 0.03]
Expected theoretical state vector:
[-0.03 -0.01 -0.01 -0.03 -0.01  0.    0.   -0.01 -0.01  0.    0.   -0.01 -0.03 -0.01 -0.01 -0.03]
********
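
As a cross-check on the printed values, the expected vector can be recomputed with NumPy alone. The sketch below is illustrative: it rebuilds the Laplacian for the $4 \times 4$ grid with unit spacing, applies the same scaling as the cell above (division by the largest absolute entry, then by the subnormalization factor $5$), and compares it with the simulated vector up to an overall sign. The helper `matches_up_to_sign` and the hard-coded `simulated` array (copied from the rounded output above) are assumptions made for illustration. The check makes no claim about where the sign difference between the two printed vectors comes from.

```python
import numpy as np

nx = ny = 4
off = np.ones(nx - 1)
t = -2.0 * np.eye(nx) + np.diag(off, 1) + np.diag(off, -1)
lap = np.kron(np.eye(ny), t) + np.kron(t, np.eye(nx))  # unit grid spacing

uniform = np.full(nx * ny, 1 / np.sqrt(nx * ny))  # state after H on every j qubit
expected = (lap / np.max(np.abs(lap))) @ uniform / 5


def matches_up_to_sign(u, v, atol):
    """True if u equals v or -v elementwise within atol."""
    return bool(np.allclose(u, v, atol=atol) or np.allclose(u, -v, atol=atol))


# Rounded values copied from the simulator output above.
simulated = np.array([0.03, 0.01, 0.01, 0.03, 0.01, 0.0, 0.0, 0.01,
                      0.01, 0.0, 0.0, 0.01, 0.03, 0.01, 0.01, 0.03])

distinct = np.unique(np.round(expected, 4))  # corner, edge and interior values
print(distinct)
print(matches_up_to_sign(simulated, expected, atol=6e-3))
```

The three distinct expected amplitudes are $-0.025$ at the corners, $-0.0125$ on the edges and $0$ in the interior, which is consistent with the two-decimal values printed above.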


____

### Reference

- Sunderhauf, C., Campbell, E., Camps, J.: Block-encoding structured matrices for data input in quantum computing. Quantum 8, 1226 (2024) https://doi.org/10.22331/q-2024-01-11-1226.

___