
Model description

JustWhit3 edited this page Jan 25, 2024 · 5 revisions

What is unfolding?

Unfolding (also called deconvolution or the matrix-inversion problem) is a statistical method used to reconstruct the true underlying distribution from the corresponding measured, biased/smeared one.

We start with the following equation:

$\vec{d}=R \vec{x}$

in which $\vec{d}$ is the measured distribution, $\vec{x}$ is the true (reconstructed) one and $R$ is the response matrix, which encodes the detector response and is used to extract the unfolded distribution from the measured one.

The most intuitive way to solve this problem is to invert the response matrix and obtain the reconstructed distribution directly:

$\vec{x}=R^{-1} \vec{d}$

The problem is that the response matrix is often ill-conditioned or not invertible at all, so other numerical or approximate approaches are used instead; the most common classical ones are iterative Bayesian (D'Agostini) unfolding and SVD-based (Tikhonov-regularized) unfolding.
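Even when $R$ is formally invertible, the naive inversion amplifies statistical fluctuations in the measured data. A minimal NumPy sketch with toy numbers (illustrative only, not QUnfold code):

```python
import numpy as np

# Toy 4-bin response matrix with bin-to-bin migrations:
# 50% of the events stay in their bin, 25% leak to each neighbour.
n = 4
R = 0.5 * np.eye(n) + 0.25 * (np.eye(n, k=1) + np.eye(n, k=-1))

x_true = np.array([100.0, 200.0, 200.0, 100.0])
d_exact = R @ x_true  # smeared distribution, no statistical noise

rng = np.random.default_rng(42)
d_noisy = d_exact + rng.normal(0.0, 5.0, size=n)  # add statistical noise

x_exact = np.linalg.solve(R, d_exact)  # recovers x_true exactly
x_noisy = np.linalg.solve(R, d_noisy)  # small noise -> large oscillations

print("condition number of R:", np.linalg.cond(R))
print("unfolded (exact data):", x_exact)
print("unfolded (noisy data):", x_noisy)
```

The larger the condition number of $R$, the more the inversion magnifies the statistical noise; the regularization discussed in the next section is what tames exactly this effect.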

Model construction

The mathematical model used in QUnfold starts from the likelihood formulation of the classical unfolding problem, which is represented by the log-likelihood maximization of the following expression:

$\vec{x} = \arg\max_{\vec{x}}\left(\log \mathcal{L}(\vec{d} \mid \vec{x})+\lambda\, \mathcal{S}(\vec{x})\right)$

where $\vec{x}$ and $\vec{d}$ have the same meaning as before, while $\lambda$ is a Lagrange multiplier called the regularization parameter and $\mathcal{S}$ is the regularization function.

Assuming the entries in each bin follow a normal distribution (Gaussian approximation), the previous expression can be reformulated as a minimization problem:

$\vec{x} = \arg\min_{\vec{x}}\left(\|R \vec{x}-\vec{d}\|^2+\lambda\|G \vec{x}\|^2\right)$

where $R$ is the usual response matrix, while $G$ is, in the case of QUnfold, the Laplacian operator obtained from the discrete second-order derivative (average curvature). One is of course free to choose a different derivative order. The second term of the minimized expression, $\lambda\|G \vec{x}\|^2$, is usually called the Tikhonov regularization term.
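As a classical reference point, this minimization can be solved directly via the normal equations $(R^T R + \lambda G^T G)\,\vec{x} = R^T \vec{d}$. A short NumPy sketch (function names are illustrative, not QUnfold's API):

```python
import numpy as np

def laplacian(n: int) -> np.ndarray:
    """Discrete second-order derivative operator G (zero rows at the edges)."""
    G = np.zeros((n, n))
    for i in range(1, n - 1):
        G[i, i - 1], G[i, i], G[i, i + 1] = 1.0, -2.0, 1.0
    return G

def tikhonov_unfold(R, d, lam):
    """Minimize ||R x - d||^2 + lam * ||G x||^2 via the normal equations."""
    G = laplacian(len(d))
    return np.linalg.solve(R.T @ R + lam * G.T @ G, R.T @ d)

# Toy smearing matrix and a smooth true spectrum.
n = 5
R = 0.6 * np.eye(n) + 0.2 * (np.eye(n, k=1) + np.eye(n, k=-1))
x_true = np.array([50.0, 120.0, 200.0, 120.0, 50.0])
d = R @ x_true

x_unfolded = tikhonov_unfold(R, d, lam=1e-4)
print(x_unfolded)  # close to x_true for a small regularization strength
```

Larger values of $\lambda$ trade fidelity to the data for smoothness of the unfolded spectrum.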

Embedding into QUBO model

Up to this moment, nothing quantum has been introduced yet.

Our aim now is to convert the previous minimization problem into an Ising Hamiltonian, the standard Hamiltonian used by a quantum annealer to solve optimization problems. This formulation is also known as the Quadratic Unconstrained Binary Optimization (QUBO) model.

The Hamiltonian we construct has the following form:

$H(\vec{x})=\sum_i a_i x_i+\sum_{i, j} b_{i j} x_i x_j=\vec{a} \cdot \vec{x}+\vec{x}^T B \vec{x}$

where:

$\vec{a}=-2 R^T \vec{d}$

$B=R^T R+\lambda G^T G$

Input vectors (distributions) of this Hamiltonian have to be encoded in binary form, using a logarithmic binary encoder, which in our case is provided by the PyQUBO library. In this way the problem is converted into a Binary Quadratic Model (BQM) and can be solved by the D-Wave quantum annealer.
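The coefficients $\vec{a}$ and $B$ can be checked against the original least-squares objective with plain NumPy (a standalone consistency check, not QUnfold's implementation): for any $\vec{x}$ the two expressions differ only by the constant $\vec{d}\cdot\vec{d}$, which does not affect the minimizer.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
R = 0.5 * np.eye(n) + 0.25 * (np.eye(n, k=1) + np.eye(n, k=-1))
d = rng.uniform(10.0, 50.0, size=n)
lam = 0.1

# Laplacian from the discrete second-order derivative.
G = np.zeros((n, n))
for i in range(1, n - 1):
    G[i, i - 1], G[i, i], G[i, i + 1] = 1.0, -2.0, 1.0

# QUBO coefficients: H(x) = a . x + x^T B x
a = -2.0 * R.T @ d
B = R.T @ R + lam * G.T @ G

def objective(x):
    """Original regularized least-squares objective."""
    return np.sum((R @ x - d) ** 2) + lam * np.sum((G @ x) ** 2)

def qubo(x):
    """QUBO Hamiltonian evaluated on a real-valued x."""
    return a @ x + x @ B @ x

# For any x, objective(x) = qubo(x) + d . d  (a constant offset).
x = rng.uniform(0.0, 100.0, size=n)
print(objective(x) - qubo(x), d @ d)  # the two numbers match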

Known limits of the model

The complexity of the problem grows with the number of bins $N_{\text{bins}}$ of the considered distribution, since we require at least one logical qubit per bin, and with the base-2 logarithm of the number of entries in each bin. More precisely, we need:

$N_{\text{log-qubits}} \propto \sum_{i=1}^{N_{\text{bins}}} \log_2\left(N_{\text{entries}}^{i}\right)$

where $N_{\text{entries}}^{i}$ is the number of entries in the $i$-th bin.

This is directly obtained from the binary encoding procedure.

However, this requirement on the number of logical qubits is easily met in practice: for example, a distribution with 20 bins (already a large number for particle physics) and 5,000,000 entries needs $N_{\text{log-qubits}} \approx 350$, while the D-Wave quantum annealer currently offers 5000+ qubits, so the unfolding can be solved in essentially any real-world application.
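The estimate above can be reproduced with a few lines of Python (a counting sketch; the helper name is illustrative, not QUnfold code):

```python
import math

def logical_qubits(entries_per_bin):
    """Qubits needed to log-binary-encode each bin's integer count:
    a count of up to N needs floor(log2(N)) + 1 bits."""
    return sum(math.floor(math.log2(n)) + 1 for n in entries_per_bin)

# 20 bins sharing 5,000,000 entries uniformly: 250,000 entries per bin.
print(logical_qubits([250_000] * 20))  # -> 360 logical qubits
```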
