# Predicting Qiskit Single-Qubit Noise Model Parameters with Machine Learning <a class="tocSkip">
   
#### Milestone 1 Review <a class="tocSkip">
    
© Noam Siegel, Raphael Buzaglo, Gadi Aleksandrowicz, Shelly Garion
    
***
    

# Problem Setup

## Notation
Let $\{c_1, c_2, ..., c_N\}$ be a collection of circuits. Each circuit consists of a one-qubit $U_3(\theta, \phi, \lambda)$ gate followed by a classical measurement in the standard basis.
    
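Let $\{y_1, y_2, ..., y_N\}$ be a collection of noise model parameter vectors, i.e. $y_i = {\theta}_{i}^{j=1...k}$ lists the $k$ noise parameters applied to circuit $c_i$ (not to be confused with the gate angle $\theta$ of $U_3$).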
    
Let $\{f_1, f_2, ..., f_N\}$ be the resulting outcome frequencies from measuring the noisy circuits.

Then define:

__Features__ := $X = \{x_1, x_2, ..., x_N\}$ = $\{(c_1, f_1), (c_2, f_2), ..., (c_N, f_N)\}$

__Labels__ := $Y = \{y_1, y_2, ..., y_N\}$ = $\{{\theta}_{1}^{j=1...k}, {\theta}_{2}^{j=1...k}, ..., {\theta}_{N}^{j=1...k}\}$

Thus, $X$ consists of pairs of (circuit, outcome frequency), and $Y$ consists of noise model parameters, the types of which will be specified later.

__The aim is to learn to predict the noise model parameters using machine learning.__
 
    
## Relationship Between Variables

__Question__. What can we say about the relationships between the variables?

__Answer 1__. It is likely that we can find an explicit description of the function $(c_i, y_i) \to f_i$ using the theoretical framework for noisy quantum circuits (e.g. state vectors and matrix multiplication).
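
Under a simplified noise model, this map can be written down directly. The following is a sketch only. It assumes a depolarizing error after the gate (in the Qiskit convention $\rho \mapsto (1-p)\rho + p\,I/2$) followed by a readout error, omits thermal relaxation, takes the qubit to start in $|0\rangle$, and reads $f_i$ as the expected frequency of outcome 1. Here $p_{0\mid 1}$ is the probability of reading 0 when the state is 1, and $p_{1\mid 0}$ the reverse.

$$q_{\text{ideal}} = \sin^2(\theta/2)$$

$$q_{\text{depol}} = (1-p)\,q_{\text{ideal}} + \tfrac{p}{2}$$

$$f_i = (1 - p_{0\mid 1})\,q_{\text{depol}} + p_{1\mid 0}\,(1 - q_{\text{depol}})$$

In these formulas $\theta$ is the first angle of the $U_3$ gate in $c_i$. Under the stated assumptions, $\phi$ and $\lambda$ do not affect a standard-basis measurement, and a finite number of shots yields a frequency that fluctuates around this expected value.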

__Answer 2__. It is hard to talk about the inverse mapping $x_i = (c_i, f_i) \to y_i$: there could be multiple $y_i$'s for a given $x_i$ (depending on our noise meta-model). For example, a depolarizing error does not change the outcome frequencies of the Hadamard circuit, which stay at 50/50 with or without it. In this scenario we cannot tell from the frequency counts whether the channel is noisy or ideal. Therefore, it makes more sense to talk about learning a _statistical estimate_ (a probability density function over $y_i$) rather than discovering an exact mapping.
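
The Hadamard example can be checked numerically. The sketch below assumes a simplified noise model: a depolarizing error in the convention $\rho \mapsto (1-p)\rho + p\,I/2$ plus a readout error, with no thermal relaxation. The helper `prob_one` is hypothetical (it is not a Qiskit function) and returns the expected probability of reading outcome 1.

```python
import math


def prob_one(theta, p_depol=0.0, p0_given_1=0.0, p1_given_0=0.0):
    """Expected probability of reading 1 after U3(theta, phi, lambda) on |0>.

    Assumed simplified model: depolarizing error, then readout error.
    phi and lambda do not affect a standard-basis measurement here.
    """
    q_ideal = math.sin(theta / 2) ** 2
    # Depolarizing channel mixes the state with the maximally mixed state.
    q_depol = (1 - p_depol) * q_ideal + p_depol / 2
    # Readout error flips the classical outcome with asymmetric probabilities.
    return (1 - p0_given_1) * q_depol + p1_given_0 * (1 - q_depol)


depol_values = (0.0, 0.3, 1.0)

# U3(pi/2, 0, pi) is the Hadamard gate: the outcome is 50/50 for every p.
probs_hadamard = [prob_one(math.pi / 2, p_depol=p) for p in depol_values]

# U3(pi, 0, pi) is the X gate: here the outcome does depend on p.
probs_x = [prob_one(math.pi, p_depol=p) for p in depol_values]

print(probs_hadamard)
print(probs_x)
```

The Hadamard probabilities stay at one half for every value of $p$, while the X-gate probabilities move from 1 toward one half as $p$ grows. This suggests that the choice of circuits matters for whether a given parameter can be recovered at all.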


# Approach

## Statistical Model

We wish to construct an appropriate [statistical model](https://en.wikipedia.org/wiki/Statistical_model#Formal_definition) to represent the data-generating process. We then sample data from this model using a quantum simulator. Finally, machine-learning-based statistical estimators can be used to try to infer the unknown noise parameters on unseen data.

    
### Statistical Model Design
 
We wish to define a statistical model which is admissible with respect to some simplified [quantum noise](https://en.wikipedia.org/wiki/Quantum_noise) model.

We define two subspaces of the sample space:
> The feature space is the box $F = [0, 2\pi]^3 \times [0, 1]$, and the label space is some box $L = I_1 \times ... \times I_k $.
> Together, they form the sample space $S = F \times L$.

We propose choices for $I_1, ..., I_k$, although these depend on the statistical model, which is not yet final.


#### Proposal 1
##### Noise Model
We define the following six variables:
* $p$ - depolarizing error parameter
* $t_1, t_2, pop$ - thermal relaxation error parameters
* $p_{0\mid 1}, p_{1 \mid 0}$ - readout error parameters

These variables are the ones defined in the [Qiskit documentation](https://qiskit.org/documentation/tutorials/simulators/3_building_noise_models.html).

The sample space for these variables is:
* $I_1 = I_p = [0,1]$
* $I_2 = I_{t_1} = [34000,\infty)$ (empirically determined)
* $I_3 = I_{t_2} = [6070,\infty)$ (empirically determined)
* $I_4 = I_{pop} =[0,1]$  
* $I_5 = I_{p_{0\mid 1}} =[0,1]$  
* $I_6 = I_{p_{1 \mid 0}} =[0,1]$  

### Sampling Data

We generate a dataset $D$ of $N$ samples from $S$ using the [Qiskit Qasm Simulator](https://qiskit.org/documentation/stubs/qiskit.providers.aer.QasmSimulator.html). The circuit and noise parameters are drawn uniformly; the only data not uniformly distributed are the outcome frequencies $f_i$, which are generated by Qiskit.
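
The sampling loop can be sketched without a simulator. The example below is a stand-in, not the Qiskit pipeline: it replaces the Qasm Simulator with an analytic outcome probability followed by binomial shot sampling, and it covers only the depolarizing and readout parameters (thermal relaxation is omitted). The function `sample_example`, the shot count, and the dataset size are hypothetical choices for illustration.

```python
import math
import random


def sample_example(rng, shots=1024):
    """Draw one (features, labels) pair under an assumed simplified noise model."""
    # Circuit parameters: U3 angles drawn uniformly from [0, 2*pi].
    theta, phi, lam = (rng.uniform(0.0, 2.0 * math.pi) for _ in range(3))
    # Labels: depolarizing and readout parameters drawn uniformly from [0, 1].
    p, p0_given_1, p1_given_0 = (rng.random() for _ in range(3))

    # Analytic stand-in for the simulator: expected probability of reading 1.
    q = math.sin(theta / 2) ** 2
    q = (1 - p) * q + p / 2
    q = (1 - p0_given_1) * q + p1_given_0 * (1 - q)

    # Finite-shot measurement: the observed frequency of outcome 1.
    ones = sum(rng.random() < q for _ in range(shots))
    freq = ones / shots

    features = (theta, phi, lam, freq)    # a point in [0, 2*pi]^3 x [0, 1]
    labels = (p, p0_given_1, p1_given_0)  # a point in the (reduced) label box
    return features, labels


rng = random.Random(0)  # fixed seed for reproducibility
dataset = [sample_example(rng) for _ in range(100)]
print(dataset[0])
```

Each feature tuple lies in the box $[0, 2\pi]^3 \times [0, 1]$. Note that the unbounded intervals proposed for $t_1$ and $t_2$ would need finite upper limits before they could be sampled uniformly in the same way.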
    
    
__Question__. Under what conditions is  $D$  learnable?

__Answer 1__. This depends on the statistical model chosen. It must be engineered in such a way as to satisfy statistical [identifiability](https://en.wikipedia.org/wiki/Identifiability). This is an open question we must address.
    
__Answer 2__. Once a statistical model is defined, several factors could affect the learnability of the sampled dataset, for instance the sample resolution (~ $1/N$) and the sampling distribution.

## Statistical Learning
    
The remaining question is the machine-learning architecture. This is the current challenge in our work, and it is an open research question.

Some points to think about:

1. For a given instance $x_i$, are we trying to [estimate](https://en.wikipedia.org/wiki/Estimator) all the plausible values of noise model parameters, or create a predictor function which returns some value $y_i$ which happens to work?

2. Do we need to use deep learning methods, or are traditional regression models sufficient? In the latter case, what are the feature engineering procedures we must apply? 
       
3. How can [Statistical Learning Theory](http://maxim.ece.illinois.edu/teaching/SLT/SLT.pdf) help us?

4. A long shot: can we incorporate the methods described in [Quantum Detection and Estimation Theory / C. W. Helstrom et al.](https://ntrs.nasa.gov/api/citations/19690016211/downloads/19690016211.pdf) into our architecture?
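
As a baseline for point 2, a traditional regression with one engineered feature can already recover a single parameter in a restricted setting. The sketch below assumes depolarizing noise only (convention $\rho \mapsto (1-p)\rho + p\,I/2$, no readout or thermal relaxation error), so the expected frequency of outcome 1 is $f = (1-p)\,s + p/2$ with $s = \sin^2(\theta/2)$, which rearranges to $f - s = p\,(1/2 - s)$. The function `estimate_depolarizing` is a hypothetical helper, and the synthetic data stands in for simulator output.

```python
import math
import random


def estimate_depolarizing(thetas, freqs):
    """Least-squares fit of p in f - s = p * (0.5 - s), with s = sin^2(theta/2)."""
    num = 0.0
    den = 0.0
    for theta, f in zip(thetas, freqs):
        s = math.sin(theta / 2) ** 2
        feature = 0.5 - s  # engineered feature derived from the circuit angle
        num += feature * (f - s)
        den += feature ** 2
    if den < 1e-12:
        # All circuits behave like Hadamard: p cannot be identified.
        raise ValueError("depolarizing parameter is not identifiable from these circuits")
    # Clip to the valid parameter range [0, 1].
    return min(1.0, max(0.0, num / den))


# Synthetic data under the assumed model, with a known ground-truth value.
rng = random.Random(1)
true_p = 0.2
shots = 1024
thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(200)]
freqs = []
for theta in thetas:
    s = math.sin(theta / 2) ** 2
    q = (1 - true_p) * s + true_p / 2
    freqs.append(sum(rng.random() < q for _ in range(shots)) / shots)

p_hat = estimate_depolarizing(thetas, freqs)
print(p_hat)
```

This only works because the assumed model is linear in $p$ and uses many circuits that share the same noise parameter. With several interacting noise parameters, or one circuit per label as in the dataset described above, such a closed form may not be available.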
    
__Useful reading__
    
5. When do we stop talking about statistics and start talking about machine learning? What is [the difference](https://towardsdatascience.com/the-actual-difference-between-statistics-and-machine-learning-64b49f07ea3)?

6. Where does [machine learning meet quantum computing](https://arxiv.org/ftp/arxiv/papers/1903/1903.03516.pdf)?
