# Probability Loading Benchmark.

$$\newcommand{\braket}[2]{\left\langle{#1}\middle|{#2}\right\rangle}$$
$$\newcommand{\ket}[1]{\left|{#1}\right\rangle}$$
$$\newcommand{\bra}[1]{\left\langle{#1}\right|}$$


In [None]:
import sys
sys.path.append("../")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import random

## 00. The QPU

To simulate the quantum circuits generated by the functions presented in this notebook, a configured myQLM (or QLM) **Quantum Processing Unit (QPU)** is mandatory.

The **QPU** can execute an ideal simulation or simulate the quantum circuits under a noisy hardware model (noisy simulation). To handle both kinds of simulation easily, the *select_qpu* function from the **tnbs.qpu.select_qpu** module was developed. The input of this function is a Python dictionary that allows the user to configure a **QPU** easily.

In the present notebook, only ideal simulation is used. Please refer to the **02_NoisySimulation_BTC_01_PL.ipynb** notebook for configuring noisy models and the corresponding noisy **QPU**s.

In [None]:
sys.path.append("../../../")
from qpu.select_qpu import select_qpu

The minimum Python dictionary for configuring an ideal **QPU** is presented in the following cell. In this case, the user only has to provide a value to the *qpu_type* key. Depending on the type of simulator desired the following strings should be provided:

* *qlmass_linalg*: to use the **LinAlg Quantum Learning Machine (QLM)** algebra simulator. In this case, the computation will be sent to the **QLM** by using the Qaptiva QLM as a Service.
* *qlmass_mps*: to use the **MPS QLM** simulator. In this case, the computation will be sent to the **QLM** by using the Qaptiva QLM as a Service.
* *python*: to use the PyLinalg algebra simulator.
* *c*: to use the CLinalg algebra simulator.
* *linalg*: to use the **LinAlg QLM** simulator. In this case, the user should be inside an **EVIDEN QLM**.
* *mps*: to use the **MPS QLM** simulator. In this case, the user should be inside an **EVIDEN QLM**.

In [None]:
# List with the strings that can be provided for an ideal QPU
ideal_qpus = ["c", "python", "linalg", "mps", "qlmass_linalg", "qlmass_mps"]
qpu_config = {
    #the following strings can be used:
    #c,python, linalg, mps, qlmass_linalg, qlmass_mps
    "qpu_type": ideal_qpus[0], 
    # The following keys are used for configuring noisy simulations
    "t_gate_1qb" : None,
    "t_gate_2qbs" : None,
    "t_readout": None,
    "depol_channel" : {
        "active": False,
        "error_gate_1qb" : None,
        "error_gate_2qbs" : None
    },
    "idle" : {
        "amplitude_damping": False,
        "dephasing_channel": False,
        "t1" : None,
        "t2" : None
    },
    "meas": {
        "active":False,
        "readout_error": None
    }
}
qpu = select_qpu(qpu_config)
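
Since only the *qpu_type* key changes between ideal configurations, the dictionary above can be generated by a small helper. The following is a minimal sketch: `ideal_qpu_config` and `IDEAL_QPUS` are hypothetical names introduced here for illustration and are not part of the **tnbs.qpu.select_qpu** module.

```python
# Hypothetical helper (not part of tnbs.qpu.select_qpu): it rebuilds the
# dictionary shown above with every noise-related option disabled.
IDEAL_QPUS = ("c", "python", "linalg", "mps", "qlmass_linalg", "qlmass_mps")


def ideal_qpu_config(qpu_type="c"):
    """Return a configuration dictionary for an ideal simulation."""
    if qpu_type not in IDEAL_QPUS:
        raise ValueError("Unknown ideal qpu_type: {!r}".format(qpu_type))
    return {
        "qpu_type": qpu_type,
        # Keys below are only meaningful for noisy simulations
        "t_gate_1qb": None,
        "t_gate_2qbs": None,
        "t_readout": None,
        "depol_channel": {
            "active": False,
            "error_gate_1qb": None,
            "error_gate_2qbs": None,
        },
        "idle": {
            "amplitude_damping": False,
            "dephasing_channel": False,
            "t1": None,
            "t2": None,
        },
        "meas": {"active": False, "readout_error": None},
    }


config = ideal_qpu_config("python")
print(config["qpu_type"])
```

The returned dictionary has the same keys as *qpu_config*, so under that assumption it could be passed to *select_qpu* in the same way.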

## 01. Kernel

The **PL Kernel** can be defined mathematically as follows:

Let $\mathbf{V}$ be a normalised vector of complex values:

\begin{equation}\label{eq:vector}
    \mathbf{V} = \{v_0, v_1, \cdots, v_{2^n-1} \}, v_i\in \mathbb{C} 
\end{equation}

such that

\begin{equation}\label{eq:vector_norm}
    \sum_{i=0}^{2^n-1}|v_i|^2 =1
\end{equation}

The main task of the **PL Kernel** is the creation of an operator $\mathbf{U}$, from the normalised vector $\mathbf{V}$, which satisfies the equation:

\begin{equation}
    \mathbf{U}|0\rangle_n = \sum_{i=0}^{2^n-1} v_i|i\rangle_n
\end{equation}

In the case of the **TNBS**, a discrete probability density function (**pdf**), $\mathbf{P}$, is used as the input, so the amplitudes to load are $v_i = \sqrt{p_i}$:

\begin{equation}\label{eq:probabilities}
    \mathbf{P} = \{p_0, p_1, \cdots, p_{2^n-1} \}, p_i\in [0,1] 
\end{equation}

where:

\begin{equation}\label{eq:prob_norm}
    \sum_{i=0}^{2^n-1} p_i =1
\end{equation}

For this particular case:

\begin{equation}\label{eq:problem_pl2}
    \mathbf{U}_p|0\rangle_n = \sum_{i=0}^{2^n-1} \sqrt{p_i}|i\rangle_n
\end{equation}
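
The relation between the probabilities and the loaded amplitudes can be checked classically. The snippet below is a minimal sketch with illustrative toy values (not taken from the benchmark): it builds the amplitudes $\sqrt{p_i}$ and verifies that the state is normalised and that measuring it would return the original probabilities.

```python
import math

# Toy 2-qubit probability vector (illustrative values only)
p = [0.1, 0.2, 0.3, 0.4]

# Amplitudes that U_p must prepare: sqrt(p_i) on each basis state |i>
amplitudes = [math.sqrt(p_i) for p_i in p]

# The resulting state has unit norm because the p_i sum to 1
norm_squared = sum(a ** 2 for a in amplitudes)

# Measuring the state returns |amplitude|^2, i.e. the original p_i
measured = [a ** 2 for a in amplitudes]

print(norm_squared)
print(measured)
```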

## 02. Benchmark Test Case.


The associated **BTC** for the **PL** benchmark will be the loading of a Gaussian function. The procedure will be:

1. Create the discrete probability density function.
2. Creating the probability loading unitary operator $\mathbf{U}_p$
3. Execution of the quantum program and measuring of the quantum probability distribution.
4. Metrics computation.

### 1. Create the discrete probability density function.

We need to create the discrete probability density function. The **TNBS** fixes the following procedure:

* Take a normal distribution whose mean, $\tilde{\mu}$, and standard deviation, $\tilde{\sigma}$, are drawn uniformly at random from the following ranges:
    * $\tilde{\mu} \in [-2, 2]$
    * $\tilde{\sigma} \in [0.1, 2]$
* The normal **PDF** is then: $N_{\tilde{\mu},\tilde{\sigma}} (x)$
* Set the number of qubits to $n$.
* Create an array of $2^n$ values: $\mathbf{x}=\{x_0, x_1, x_2, \cdots, x_{2^n-1}\}$ where
    * $x_0$ such that $$\int _{-\infty} ^{x_0} N_{\tilde{\mu},\tilde{\sigma}}(x)dx = 0.05$$
    * $x_{2^n-1}$ such that $$\int _{-\infty} ^{x_{2^n-1}}N_{\tilde{\mu},\tilde{\sigma}}(x) dx = 0.95$$
    * $x_{i+1} = x_i + \Delta x$
    * $\Delta x = \frac{x_{2^n-1}-x_0}{2^n}$
* Create a $2^n$ values array, $\mathbf{P}$ from $\mathbf{x}$ by:  
    $$\mathbf{P}(\mathbf{x}) = \{ P(x_0), P(x_1), \cdots, P(x_{2^n-1}) \} = \{N_{\tilde{\mu},\tilde{\sigma}}(x_0), N_{\tilde{\mu},\tilde{\sigma}}(x_1), \cdots, N_{\tilde{\mu},\tilde{\sigma}}(x_{2^n-1}) \}$$
* Normalize the $\mathbf{P}$ array: 
    $$\mathbf{P_{norm}}(\mathbf{x}) = \{ P_{norm}(x_0), P_{norm}(x_1), \cdots, P_{norm}(x_{2^n-1}) \}$$
    where $$P_{norm}(x_{i}) = \frac{P(x_i)}{\sum_{j=0}^{2^n-1} P(x_j)}$$
* Compute the number of shots $n_{shots}$   as:
    $$n_{shots} = \min(10^6, \frac{100}{\min(\mathbf{P_{norm}}(\mathbf{x}))})$$
    
This whole procedure is implemented by the *get_theoric_probability* function from the **PL/data\_loading** module. The function takes as inputs:

* *n_qbits*: number of qubits for discretization of Gaussian probability distribution.
* *mean*: mean of the Gaussian probability distribution ($\tilde{\mu}$).
* *sigma*: standard deviation of the Gaussian probability distribution ($\tilde{\sigma}$).

and returns the following outputs:

* Domain discretization in $2^{\text{n\_qbits}}$ points: $\mathbf{x}$
* Discretization of the Gaussian probability distribution: $P_{norm}(x)$
* Domain discretization step: $\Delta x$
* Number of shots for executing the quantum circuit: $n_{shots}$
* Scipy function with the configured Gaussian probability distribution: $N_{\tilde{\mu},\tilde{\sigma}}$

For the benchmark test case, the following conditions should be taken into account:

* $\tilde{\mu} \in [-2, 2]$ 
* $\tilde{\sigma} \in [0.1, 2]$
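
The discretization procedure can be sketched with the standard library alone. The code below is an illustration, not the code of *get_theoric_probability*: `discretize_gaussian` is a hypothetical name, `statistics.NormalDist` replaces the scipy distribution, and the grid is assumed to contain both end points, which gives a step of $(x_{2^n-1}-x_0)/(2^n-1)$; the module may follow a different convention.

```python
from statistics import NormalDist


def discretize_gaussian(n_qbits, mean, sigma):
    """Hypothetical sketch of the TNBS discretization of a Gaussian pdf."""
    dist = NormalDist(mean, sigma)
    size = 2 ** n_qbits
    # End points: 5% and 95% quantiles of the normal distribution
    x_first = dist.inv_cdf(0.05)
    x_last = dist.inv_cdf(0.95)
    # Assumption: equally spaced grid that includes both end points
    delta_x = (x_last - x_first) / (size - 1)
    x = [x_first + i * delta_x for i in range(size)]
    # Evaluate the pdf on the grid and normalise it so that it sums to 1
    p = [dist.pdf(x_i) for x_i in x]
    total = sum(p)
    p_norm = [p_i / total for p_i in p]
    # Number of shots: min(10^6, 100 / min(P_norm))
    shots = int(min(1.0e6, 100.0 / min(p_norm)))
    return x, p_norm, delta_x, shots


x_toy, p_toy, delta_toy, shots_toy = discretize_gaussian(5, 0.0, 1.0)
print(len(x_toy), sum(p_toy), shots_toy)
```

With this rule the least probable bin is expected to be measured about 100 times, unless the cap of $10^6$ shots is reached.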

In [None]:
from data_loading import get_theoric_probability

In [None]:
mu = random.uniform(-2., 2.)
sigma = random.uniform(0.1, 2.)
x, pnx, deltax, shots, normx = get_theoric_probability(5, mu, sigma)

muy = random.uniform(-2., 2.)
sigmay = random.uniform(0.1, 2.)
y, pny, deltay, shotsy, normy = get_theoric_probability(5, muy, sigmay)

In [None]:
plt.plot(x, pnx, '-o')
plt.plot(y, pny, '-o')

### 2. Creating the probability loading unitary operator $\mathbf{U}_p$.

Once the discrete probability distribution is obtained, the unitary operator $\mathbf{U}_p$ for loading it into a quantum state should be created. This operator $\mathbf{U}_p$ acts in the following way:

\begin{equation}
    \mathbf{U}_p|0\rangle_n = \sum_{i=0}^{2^n-1} \sqrt{p_i}|i\rangle_n
\end{equation}

The *load_probability* function from the **PL/data\_loading** module creates this operator $\mathbf{U}_p$ given the discrete probability distribution as an input array. The function needs 2 inputs:
* the normalised discrete probability array.
* method: string for selecting the algorithm for creating $\mathbf{U}_p$. The algorithm is the one presented in *Grover, L., & Rudolph, T. (2002). Creating superpositions that correspond to efficiently integrable probability distributions*. In this algorithm, controlled rotations by state are needed to load the probability distribution into the quantum state. The method selects among different implementations of these controlled rotations by state:
    * *brute\_force*: uses the direct implementation of controlled rotation by state.
    * *multiplexor*: the controlled rotations are implemented using **quantum multiplexors** as explained in: *V.V. Shende and S.S. Bullock and I.L. Markov. Synthesis of quantum-logic circuits*.
    * *KPTree*: **myqlm** implementation of the *Grover and Rudolph* algorithm using **quantum multiplexors**.
    
The output of the function is a **myqlm** gate with the circuit implementation of the $\mathbf{U}_p$ operator.
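
To give an idea of what the controlled rotations encode, the following sketch computes the rotation angles of the Grover and Rudolph construction classically and rebuilds the distribution from them. It is an illustration under stated assumptions, not the implementation of *load_probability*: the function names are hypothetical, no circuit is built, and the first qubit is assumed to hold the most significant bit.

```python
import math


def grover_rudolph_angles(probs):
    """Rotation angles per tree level for a probability vector of length 2^n."""
    n_qbits = len(probs).bit_length() - 1
    levels = []
    for level in range(n_qbits):
        block = len(probs) >> level  # number of entries under each parent
        angles = []
        for k in range(2 ** level):
            parent = sum(probs[k * block:(k + 1) * block])
            left = sum(probs[k * block:k * block + block // 2])
            # cos^2(theta / 2) = probability that the next bit is 0,
            # conditioned on the bits already fixed (index k)
            ratio = left / parent if parent > 0 else 1.0
            angles.append(2.0 * math.acos(math.sqrt(min(1.0, ratio))))
        levels.append(angles)
    return levels


def probabilities_from_angles(levels):
    """Rebuild the distribution produced by the controlled rotations."""
    n_qbits = len(levels)
    probs = []
    for i in range(2 ** n_qbits):
        p_i = 1.0
        for level in range(n_qbits):
            prefix = i >> (n_qbits - level)          # bits fixed so far
            bit = (i >> (n_qbits - level - 1)) & 1   # bit set at this level
            theta = levels[level][prefix]
            p_i *= math.sin(theta / 2) ** 2 if bit else math.cos(theta / 2) ** 2
        probs.append(p_i)
    return probs


p_example = [0.05, 0.10, 0.20, 0.25, 0.15, 0.10, 0.10, 0.05]
angles_example = grover_rudolph_angles(p_example)
rebuilt_example = probabilities_from_angles(angles_example)
print(rebuilt_example)
```

Level $l$ needs $2^l$ angles, one per value of the already loaded bits. This is why the rotations must be controlled by state, and the methods listed above differ only in how those controlled rotations are compiled.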

In [None]:
from data_loading import load_probability

In [None]:
Up_BF = load_probability(pnx, "brute_force")
Up_QMF = load_probability(pnx, "multiplexor")
Up_KPtree = load_probability(pnx, "KPTree")

In [None]:
%qatdisplay Up_BF --depth 2 --svg

In [None]:
%qatdisplay Up_QMF --depth 2 --svg

In [None]:
%qatdisplay Up_KPtree --depth 2 --svg

### 3. Execution of the quantum program and measuring of the quantum probability distribution.

Execute the quantum program $\mathbf{U}_p|0\rangle_n$ and measure all the $n$ qubits a number of times equal to $n_{shots}$. Store the number of times each state $|i\rangle_n$ is obtained, $m_i$, and compute the probability of obtaining it as $$Q_i = \frac{m_i}{n_{shots}} \quad \forall i \in \{0, 1, \cdots, 2^n-1\}$$

This is done by the *get_qlm_probability* function from the **data_loading** module, which executes steps 2 and 3. The inputs are:

* array with the normalised discrete probability array
* method: string for selecting the algorithm for creating $\mathbf{U}_p$. The algorithm is the one presented in *Grover, L., & Rudolph, T. (2002). Creating superpositions that correspond to efficiently integrable probability distributions*. In this algorithm, controlled rotations by state are needed to load the probability distribution into the quantum state. The method selects among different implementations of these controlled rotations by state:
    * *brute\_force*: uses the direct implementation of controlled rotation by state.
    * *multiplexor*: the controlled rotations are implemented using **quantum multiplexors** as explained in: *V.V. Shende and S.S. Bullock and I.L. Markov. Synthesis of quantum-logic circuits*.
    * *KPTree*: **myqlm** implementation of the *Grover and Rudolph* algorithm using **quantum multiplexors**.
* shots: number of times, $n_{shots}$, the circuit should be executed and measured.
* qpu: **myqlm** quantum processing unit (**QPU**) for executing the computation.

The outputs of the function are:
* result: pandas DataFrame with the results of the measurements by possible state. Columns are:
    * States: the quantum states measured
    * Int_lsb: integer representation of the States using the least significant bit convention
    * Probability: the measured probability of the quantum states: this is $Q_i$
    * Amplitude: amplitude of the quantum states (only for exact simulation).
    * Int: integer representation of the States
* circuit: complete executed circuit in **myqlm** format
* quantum_time: time needed for obtaining the complete quantum distribution.
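
The estimator $Q_i = m_i / n_{shots}$ can be emulated classically by sampling the target distribution. This is only a stand-in for the measurement statistics of an ideal **QPU**: `sample_distribution` is a hypothetical helper and does not call **myqlm**.

```python
import random
from collections import Counter


def sample_distribution(p_norm, shots, seed=None):
    """Emulate n_shots measurements of a state loaded with p_norm."""
    rng = random.Random(seed)
    states = range(len(p_norm))
    # Each shot returns the state i with probability p_norm[i]
    outcomes = rng.choices(states, weights=p_norm, k=shots)
    counts = Counter(outcomes)  # m_i: number of times state i was obtained
    # Q_i = m_i / n_shots (0 for states that were never measured)
    return [counts.get(i, 0) / shots for i in states]


p_target = [0.1, 0.2, 0.3, 0.4]
q_sampled = sample_distribution(p_target, shots=10000, seed=0)
print(q_sampled)
```

The sampled frequencies fluctuate around the target values, which is why the metrics of the next step are needed to quantify the agreement.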

In [None]:
from data_loading import get_qlm_probability

In [None]:
result, circuit, qtime = get_qlm_probability(pnx, "multiplexor", shots, qpu)

In [None]:
result

In [None]:
%qatdisplay circuit --depth  --svg

### 4. Metrics Computation


Finally, we need to compare the theoretical probability distribution $N_{\tilde{\mu},\tilde{\sigma}}(x)$ with the measured quantum one ($Q$).
This is done using 2 different metrics:

1. The Kolmogorov-Smirnov (*KS*) distance.
2. The Kullback-Leibler (*KL*) divergence.


### 4.1 The Kolmogorov-Smirnov (*KS*) distance.

To compute the **KS** distance, the following steps should be done:

1. Map each measured quantum state $\ket{i}$ to the corresponding original value $x_i$.
2. Each $x_i$ now has an associated probability $Q_i$.
3. Now compute the **KS** distance using:$$KS = \max_{\substack{x}} \left| F^{Q}_n(x) - \int_{-\infty}^xN_{\tilde{\mu},\tilde{\sigma}}(y)dy\right|$$ where$$F^{Q}_n(x) = \sum_{i=0}^{2^n-1} \left\{
\begin{array}{ll}
      Q_i & x_i \leq x \\
      0 & x_i > x \\
\end{array}
\right.$$
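
As a toy illustration of the formula above, the sketch below evaluates the distance on the grid points only. The helper name `ks_distance` and the numbers are hypothetical. The pairs $(x_i, Q_i)$ are sorted by $x_i$ before accumulating, because the cumulative sum is only meaningful for ascending $x_i$.

```python
from statistics import NormalDist


def ks_distance(x, q, mean, sigma):
    """KS distance between measured probabilities and a normal CDF."""
    dist = NormalDist(mean, sigma)
    cumulative = 0.0
    distance = 0.0
    # Sort by x so that the running sum is the empirical CDF F_n^Q(x_i)
    for x_i, q_i in sorted(zip(x, q)):
        cumulative += q_i
        distance = max(distance, abs(cumulative - dist.cdf(x_i)))
    return distance


x_grid = [-1.5, -0.5, 0.5, 1.5]
q_grid = [0.1, 0.4, 0.4, 0.1]
ks_toy = ks_distance(x_grid, q_grid, mean=0.0, sigma=1.0)
print(ks_toy)
```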

In [None]:
# 1. Map each measured quantum state |i> to the corresponding original x_i value.
result["x"] = x[result["Int_lsb"]]

In [None]:
# Now we have the relation between x_i and Q_i
result

In [None]:
# Compute KS
ks = np.abs(
    result["Probability"].cumsum() - normx.cdf(result["x"])
).max()

In [None]:
print("The Kolmogorov-Smirnov distance is: {}".format(ks))

#### Testing KS with the scipy package

The proposed **KS** implementation can be compared with the **KS** implementation in the scipy package, which compares samples drawn from a distribution with a theoretical distribution. The samples obtained in the quantum routine are therefore needed. They can be rebuilt using the information in the *result* DataFrame:

In [None]:
# Rebuild the quantum sampling from the measured probabilities
import itertools
medidas = list(itertools.chain(
    *result.apply(
        lambda x : [x["x"]] * int(round(x["Probability"] * shots)), 
        axis=1
    )
))

In [None]:
# Using the KS test from scipy
from scipy.stats import entropy, kstest

In [None]:
scipy_ks = kstest(medidas, normx.cdf)
print("KS using scipy: "+str(scipy_ks.statistic))

In [None]:
# Error between scipy and proposed implementations
ks - scipy_ks.statistic

### 4.2 The Kullback-Leibler (*KL*) divergence.

To compute the **KL** divergence the following formula should be used:

$$KL = \sum_{i=0}^{2^n-1} P_{norm}(x_i) \ln \frac{P_{norm}(x_i)}{\max(\epsilon, Q_i)}$$ where $$\epsilon = 10^{-5} \min_{i} P_{norm}(x_i)$$
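
The role of $\epsilon$ is easier to see on a toy case where one state is never measured. The sketch below uses hypothetical numbers and a hypothetical helper name, and assumes every $P_{norm}(x_i)$ is strictly positive, as is the case for a Gaussian.

```python
import math


def kl_divergence(p_norm, q):
    """KL divergence with the measured probabilities floored at epsilon."""
    epsilon = min(p_norm) * 1.0e-5
    return sum(
        p_i * math.log(p_i / max(epsilon, q_i))
        for p_i, q_i in zip(p_norm, q)
    )


p_theory = [0.1, 0.2, 0.3, 0.4]
# Identical distributions: the divergence vanishes
kl_same = kl_divergence(p_theory, p_theory)
# State 0 was never measured (Q_0 = 0): the floor keeps the result finite
kl_missing = kl_divergence(p_theory, [0.0, 0.25, 0.35, 0.4])
print(kl_same, kl_missing)
```

Without the floor, the term with $Q_0 = 0$ would involve a division by zero. With it, a missing state contributes a large but finite penalty.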

In [None]:
epsilon = pnx.min() * 1.0e-5
kl_pdf = pd.merge(
    pd.DataFrame(
        [x, pnx], index=["x", "p_th"]
    ).T,
    result[["x", "Probability"]],
    on = ["x"],
    how = "outer"
).fillna(epsilon)

In [None]:
kl = kl_pdf["p_th"] @ np.log(kl_pdf["p_th"] / kl_pdf["Probability"])

print("The Kullback-Leibler divergence is: "+str(kl))

The *entropy* function from scipy also allows us to compute the **KL** divergence:

In [None]:
from scipy.stats import entropy

In [None]:
scipy_kl = entropy(kl_pdf["p_th"], kl_pdf["Probability"])
print("The scipy Kullback-Leibler divergence is: "+str(scipy_kl))

In [None]:
kl - scipy_kl

We can graphically compare the measured quantum distribution with the theoretical discretized one:

In [None]:
plt.plot(x, pnx, '-')
plt.plot(x, result["Probability"], 'o')
plt.legend(["theoretical pdf", "quantum pdf"])

## 03. The *LoadProbabilityDensity* class

The *LoadProbabilityDensity* Python class inside the **PL/load_probabilities** module allows the user to run the procedure explained in section 02 of the notebook easily and directly. When the class is instantiated, a Python dictionary that configures the **BTC** execution should be provided. The mandatory keys are:

* load_method: string with the method for implementing the $\mathbf{U}_p$
* number_of_qbits: number of qubits for discretizing the domain.
* qpu: the QPU for executing the quantum circuits

Additionally, the user can provide a fixed mean and a standard deviation by providing the following keys:

* mean: float
* sigma: float

If these keys are not provided, random values will be used:

* $\tilde{\mu} \in [-2, 2]$ 
* $\tilde{\sigma} \in [0.1, 2]$

In [None]:
from load_probabilities import LoadProbabilityDensity

In [None]:
configuration = {
    "load_method": "brute_force", "number_of_qbits": 8, "qpu": qpu,
    "mean": 1.2, "sigma": 0.2
}
btc_pl = LoadProbabilityDensity(**configuration)

For executing the procedure the *exe* method of the class should be invoked.

In [None]:
btc_pl.exe()

The following attributes can be accessed:

* data: numpy array with the theoretical pdf.
* result: pandas DataFrame with the quantum pdf
* circuit: *myqlm* circuit
* mean: mean of the theoretical Gaussian distribution
* sigma: standard deviation of the theoretical Gaussian distribution
* ks: Kolmogorov-Smirnov metric
* kl: Kullback-Leibler divergence

In [None]:
print("The mean of the Gaussian pdf is: {}".format(btc_pl.mean))
print("The standard deviation of the Gaussian pdf is: {}".format(btc_pl.sigma))

# Use the metrics stored in the object, not the variables computed in section 02
print("The Kolmogorov-Smirnov distance is: {}".format(btc_pl.ks))
print("The Kullback-Leibler divergence is: {}".format(btc_pl.kl))

In [None]:
plt.plot(btc_pl.x_, btc_pl.data, '-')
plt.plot(btc_pl.x_, btc_pl.result["Probability"], 'o', alpha=0.3)
plt.legend(["theoretical pdf", "quantum pdf"])

In [None]:
circuit = btc_pl.circuit
%qatdisplay circuit --depth --svg

Finally, the *summary* method creates a pandas DataFrame (*pdf* attribute) with the complete information of the execution.


In [None]:
btc_pl.summary()

In [None]:
btc_pl.pdf

If *mean* and *sigma* are not provided, random values will be used:

In [None]:
configuration = {
    "load_method": "brute_force", "number_of_qbits": 8, "qpu": qpu,
}

pdf = []
for i in range(4):
    btc_pl = LoadProbabilityDensity(**configuration)
    btc_pl.exe()
    pdf.append(btc_pl.pdf)
    
pdf = pd.concat(pdf)

In [None]:
pdf

### Command line execution

The complete **BTC** can be executed by invoking the module **PL/load_probabilities** from the command line. The input arguments can be listed with the following command:

    python load_probabilities.py -h

    usage: load_probabilities.py [-h] [-n_qbits N_QBITS] [-mean MEAN] [-sigma SIGMA] [-method METHOD]
                                 [-json_qpu JSON_QPU] [-id ID] [-name BASE_NAME] [-folder FOLDER] [--save] [--count]
                                 [--print] [--exe]

    optional arguments:
      -h, --help          show this help message and exit
      -n_qbits N_QBITS    Number of qbits for interval discretization.
      -mean MEAN          Mean for the Gaussian Distribution
      -sigma SIGMA        Standar Deviation for the Gaussian Distribution
      -method METHOD      For selecting the load method: multiplexor, brute_force, KPTree
      -json_qpu JSON_QPU  JSON with the qpu configuration
      -id ID              For executing only one element of the list
      -name BASE_NAME     Additional name for the generated files. Only with --save
      -folder FOLDER      Path for storing folder. Only with --save
      --save              For saving staff
      --count             For counting elements on the list
      --print             For printing
      --exe               For executing program
  
  
The qpu configuration should be provided as a JSON file. In the folder **tnbs/qpu/** two examples of JSON files can be found:
* *tnbs/qpu/qpu_ideal.json*: This JSON configures qpus for ideal simulation.
* *tnbs/qpu/qpu_noisy.json*: This JSON configures qpus for noisy simulation (only valid if the user is connected to an EVIDEN QLM).

These JSON files allow the user to configure several qpus at the same time and to select which one to use.


#### **--count** argument

The **--count** argument tells the user how many qpus are configured in the given JSON qpu configuration file.

If the *tnbs/qpu/qpu_ideal.json* file has not been modified, the following command will return 6 (because 6 different qpus are configured originally in the JSON file):

    python load_probabilities.py -n_qbits 10 -method multiplexor -json_qpu ../../qpu/qpu_ideal.json --count


#### **--print** argument

The **--print** argument, in combination with the -id ID argument, shows the user the configuration of the QPU, the number of qubits and the method that will be used.

If the *tnbs/qpu/qpu_ideal.json* file has not been modified, the following command:

    python load_probabilities.py -n_qbits 10 -method multiplexor -json_qpu ../../qpu/qpu_ideal.json -id 0 --print

will return:

    {'load_method': 'multiplexor', 'number_of_qbits': 10, 'mean': None, 'sigma': None, 'qpu': {'qpu_type': 'c', 't_gate_1qb': None, 't_gate_2qbs': None, 't_readout': None, 'depol_channel': {'active': False, 'error_gate_1qb': None, 'error_gate_2qbs': None}, 'idle': {'amplitude_damping': False, 'dephasing_channel': False, 't1': None, 't2': None}, 'meas': {'active': False, 'readout_error': None}}}
    
Meanwhile the command 

    python load_probabilities.py -n_qbits 10 -method multiplexor -json_qpu ../../qpu/qpu_ideal.json -id 3 --print

will return:

    {'load_method': 'multiplexor', 'number_of_qbits': 10, 'qpu': {'qpu_type': 'mps', 't_gate_1qb': None, 't_gate_2qbs': None, 't_readout': None, 'depol_channel': {'active': False, 'error_gate_1qb': None, 'error_gate_2qbs': None}, 'idle': {'amplitude_damping': False, 'dephasing_channel': False, 't1': None, 't2': None}, 'meas': {'active': False, 'readout_error': None}}}
    
    
#### **--exe** argument

The **--exe** argument, in combination with the -id ID argument, allows the user to solve the desired probability loading problem with the selected qpu.

If the *tnbs/qpu/qpu_ideal.json* file has not been modified, the following command:

    python load_probabilities.py -n_qbits 10 -method multiplexor -json_qpu ../../qpu/qpu_ideal.json  -id 0 --exe
    
will solve the **PL** for a 10-qubit probability density discretization, using the **multiplexor** method for building the $\mathbf{U}_p$ operator and the **CLinalg** qpu.

#### **--save** argument

The **--save** argument, in combination with the -id ID and **--exe** arguments, will execute the PL and save the following items:
* The summary pandas DataFrame (*pdf* attribute of the object)
* The probability pandas DataFrame (*kl_pdf* attribute of the object)
* The qiskit export of the quantum circuit (*circuit* attribute of the object as a qiskit circuit)

The storage folder can be provided using *-folder FOLDER*, and *-name BASE_NAME* lets the user add a name to the generated files.

## 04. my\_benchmark\_execution

A complete benchmark execution following the **TNBS** guidelines can be performed by using the **my\_benchmark\_execution.py** module in the **BTC_01_PL** folder.

The probability loading algorithm can be configured in the *kernel_configuration* dictionary at the end of the file. Additionally, the number of qubits for executing the complete benchmark can be provided as a list to the key *list_of_qbits* of the *benchmark_arguments*.

To change the folder where the files generated by the benchmark are stored, provide the path in the *saving_folder* key of the *benchmark_arguments*.

## 05. Generating the JSON file.

Once the files from a complete benchmark execution are generated, the information should be formatted following the **NEASQC JSON schema**. The **neasqc_benchmark.py** module can be used for this. At the end of that file, the path to the folder where all the benchmark files are stored should be assigned to the variable **folder**.

To create the JSON file, the following command should be executed:

    python neasqc_benchmark.py

## 06. Complete Workflow.

The bash script **benchmark_exe.sh** automates the execution of the benchmark and the JSON file generation (once *my_benchmark_execution.py* and *neasqc_benchmark.py* are properly configured).

    bash benchmark_exe.sh