# Inside Quantum Classifiers

This tutorial offers a deep dive into the internals of quantum classifiers and several exercises on solving a simple classification problem from scratch. You will learn how to build a simple classifier by hand and how to train a simple model using the [quantum machine learning library](https://docs.microsoft.com/azure/quantum/user-guide/libraries/machine-learning/) that is part of the Microsoft Quantum Development Kit.

The companion Python notebook [Exploring Quantum Classification Library](./ExploringQuantumClassificationLibrary.ipynb) offers a high-level walk-through of solving the same classification problem.

### Prerequisites

We recommend that you get familiar with the basics of quantum computing and quantum programming before attempting this tutorial. It relies on the concepts of superposition and measurement, as well as basic knowledge of quantum gates and circuits.

* You can use the [other tutorials](../../index.ipynb) to learn the basic concepts of quantum computing and their representation in Q#.
* Alternatively, [Microsoft Quantum Development Kit documentation](https://docs.microsoft.com/azure/quantum) covers a lot of introductory topics.

The necessary functionality is provided in the [Microsoft.Quantum.MachineLearning](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning) package.

## 1. Circuit-centric Quantum Classifiers: an Overview

Circuit-centric quantum classifiers have the following structure:

1. Data encoding.  
The features of a data sample are encoded as the amplitudes of a quantum state. This state further serves as the input to the classification circuit. The encoding is parameter-free and uses only the raw data of the sample.

2. Classification model circuit.  
The classification circuit is a small-depth circuit of single-qubit rotations and two-qubit controlled rotations. 
The circuit geometry (the sequence of the rotations and the qubits to which they are applied) is fixed for the model, and the parameters of the model (the rotation angles) are learned during the training phase.

3. Measurement.  
Applying the classification circuit to the data encoded as a quantum state yields the final quantum state. 
Measuring the "output" qubit of that state gives us a classical result: 0 or 1. 
However, that is not the final classification result yet...

4. Result interpretation.  
The steps 1-3 are repeated multiple times to estimate the probability of getting 0 or 1 in the final measurement.
This probability is compared with a threshold (classifier bias) to produce the final classification result.

You can read more about the theory behind circuit-centric quantum classifiers in the [Microsoft QDK documentation](https://docs.microsoft.com/azure/quantum/user-guide/libraries/machine-learning/intro) or in the [original paper](https://arxiv.org/abs/1804.00633). 
In this tutorial we will focus on walking through each step of the classification and training process and implementing them.

## 2. The Raw Data

As in the companion tutorial, we start by preparing the training and validation datasets. 

> Q# is a domain-specific programming language, designed to express quantum algorithms, but it supports a subset of classical language features sufficient for performing simple computations, such as data generation. When solving a real problem, you'll want to load the data from an external source before calling Q# code to process it.

This Q# code follows the same data generation logic as the [Python code](./ExploringQuantumClassificationLibrary.ipynb#The-Data) in the companion notebook. 
There are two features, each a real number in the range $[0, 1)$.

In [1]:
open Microsoft.Quantum.Math;
open Microsoft.Quantum.Random;

operation SampleData (samplesNumber : Int, separationAngles : Double[]) : (Double[][], Int[]) {
    mutable features = new Double[][samplesNumber];
    mutable labels = new Int[samplesNumber];
    for i in 0 .. samplesNumber - 1 {
        // Draw two random features between 0.0 and 1.0
        let sample = [DrawRandomDouble(0.0, 1.0), DrawRandomDouble(0.0, 1.0)];
        // Polar angle of the point (sample[0], sample[1])
        let angle = ArcTan2(sample[1], sample[0]);
        set features w/= i <- sample;
        // Label 1 if the angle lies between the two separation angles, 0 otherwise
        set labels w/= i <- (angle < separationAngles[0] or angle > separationAngles[1]) ? 0 | 1;
    }
    return (features, labels);
}

operation SampleDataDemo () : Unit {
    let trainingData = SampleData(5, [PI() / 6.0, PI()/ 3.0]);
    Message($"{trainingData}");
}

`%simulate` cells run the Q# code on a quantum simulator. Executing the following cell shows a small sample of data generated by the code above.

In [2]:
%simulate SampleDataDemo

([[0.41558553018401634,0.8455760422374942],[0.4469224612447072,0.16762420449760937],[0.4400985573605162,0.05275775680912554],[0.1748506683739138,0.9069414613335121],[0.8807629122774875,0.7115352604126256]], [0,0,0,0,1])


()

## 3. Data Encoding

The first step of the quantum classification process is encoding the raw feature data into the amplitudes of a quantum state. 

> If you need a refresher on quantum state representation, see the [Multi-qubit Systems tutorial](https://github.com/microsoft/QuantumKatas/tree/main/tutorials/MultiQubitSystems).

An $n$-qubit quantum state can be described by $2^n$ amplitudes. 
If the data has $M$ features, it can be encoded in the amplitudes of a state with $n = \lceil \log_2 M \rceil$ qubits. 
In the general case the amplitudes can be complex, but for the purposes of data encoding it's sufficient to use only real amplitudes.

In our case $M = 2$, so we can encode the features $(x_0, x_1)$ in the state of one qubit as the amplitudes of the basis states $|0\rangle$ and $|1\rangle$, respectively. 
The sum of squares of the amplitudes of the basis states has to be 1, so we'll have to normalize our data:

$$(x_0, x_1) \rightarrow |\psi(x_0, x_1) \rangle = \tilde{x}_0 |0\rangle + \tilde{x}_1 |1\rangle \text{, where }
\tilde{x}_0 = \frac{x_0}{\sqrt{x_0^2 + x_1^2}}\text{ and }\tilde{x}_1 = \frac{x_1}{\sqrt{x_0^2 + x_1^2}}$$
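
The normalization above can be checked classically. The following Python sketch is only an illustration, not the library's encoder: the helper name `encode_amplitudes`, the zero-padding of the feature vector to a power-of-two length, and the use of at least one qubit are assumptions made for this example.

```python
import math

def encode_amplitudes(features):
    """Return (number of qubits, normalized amplitudes) for a feature vector.

    Assumption of this sketch: features are padded with zeros up to the
    next power of two, and at least one qubit is always used.
    """
    m = len(features)
    n_qubits = max(1, math.ceil(math.log2(m)))
    padded = list(features) + [0.0] * (2 ** n_qubits - m)
    norm = math.sqrt(sum(x * x for x in padded))
    if norm == 0.0:
        raise ValueError("cannot encode the all-zero feature vector")
    return n_qubits, [x / norm for x in padded]

# Two features fit in the amplitudes of a single qubit.
n_qubits, amplitudes = encode_amplitudes([0.3, 0.4])
print(n_qubits, amplitudes)
# The squared amplitudes sum to 1 (up to floating-point rounding).
print(sum(a * a for a in amplitudes))
```

For the two-feature data of this tutorial the sketch uses one qubit, and the amplitudes are the features divided by $\sqrt{x_0^2 + x_1^2}$.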

> Note that this encoding will lose part of the information: if we plot the data, multiple points $(x_0, x_1)$ will be encoded in the same $|\psi(x_0, x_1) \rangle$. 
> Effectively, only the angular data (the polar angle $\alpha$ of the point $(x_0, x_1)$) is preserved, and the data about the distance to the origin is lost. 
>
> If we plot our data on a plane with the X and Y axes corresponding to the amplitudes of the $|0\rangle$ and $|1\rangle$ states, our normalized data will belong to the unit circle:
> <img src="./img/1-data-encoding.PNG" width=300 alt="Two distinct data points are encoded as the same state" />
>
> If we need to preserve information about the distance to the origin (which is typically the case), we need to pre-process the data, adding an extra feature.
> In this tutorial we'll omit this step for simplicity; the synthetic data we're using is chosen so that only the angular data defines the class of the sample.
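
To make the information loss concrete, here is a small Python sketch. It first shows that two points on the same ray from the origin normalize to the same amplitudes, and then shows one possible pre-processing step that keeps the distance information. The helper `encode_with_extra_feature` and the choice of a constant extra feature equal to 1.0 are assumptions of this example, not necessarily what the library or the original paper does.

```python
import math

def normalize(vector):
    """Scale a vector so that the sum of squares of its entries is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def encode_with_extra_feature(x0, x1, extra=1.0):
    """Hypothetical pre-processing: append a constant feature before normalizing.

    Three features need two qubits, so the vector is padded to four amplitudes.
    """
    return normalize([x0, x1, extra, 0.0])

# Two points with the same polar angle but different distance to the origin.
near, far = (0.2, 0.1), (0.8, 0.4)

# Plain normalization maps both points to (almost exactly) the same amplitudes.
print(normalize(near), normalize(far))

# With the extra feature the two encodings differ.
print(encode_with_extra_feature(*near), encode_with_extra_feature(*far))
```

The price of the extra feature in this sketch is a second qubit, which is why the tutorial keeps the simpler single-qubit encoding.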

To implement data encoding in Q#, we can use the library routines 
[InputEncoder](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning.inputencoder) or 
[ApproximateInputEncoder](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning.approximateinputencoder).
Here is what the results will look like for one sample:

In [3]:
open Microsoft.Quantum.Arithmetic;
open Microsoft.Quantum.Diagnostics;
open Microsoft.Quantum.Math;
open Microsoft.Quantum.MachineLearning;
open Microsoft.Quantum.Random;

operation EncodeDataDemo () : Unit {
    let sample = [DrawRandomDouble(0.0, 1.0), DrawRandomDouble(0.0, 1.0)];
    Message($"Raw data: {sample}");
    
    let norm = Sqrt(sample[0] ^ 2.0 + sample[1] ^ 2.0);
    Message($"Normalized data: [{sample[0] / norm}, {sample[1] / norm}]");
    
    use q = Qubit();
    let (_, encoder) = (InputEncoder(sample))!;
    encoder(LittleEndian([q]));
    Message("Encoded as a quantum state:");
    DumpMachine();
    Reset(q);
}

In [4]:
%simulate EncodeDataDemo

Raw data: [0.2991085887416772,0.2821816500659015]
Normalized data: [0.7273891172943013, 0.6862252341919649]
Encoded as a quantum state:


Qubit IDs: 0

| Basis state (little endian) | Amplitude | Meas. Pr. | Phase |
|---|---|---|---|
| $\lvert 0\rangle$ | $0.7274 + 0.0000 i$ | 52.9095% | ↑ |
| $\lvert 1\rangle$ | $0.6862 + 0.0000 i$ | 47.0905% | ↑ |


()

## 4. Classification Model Circuit

The classification circuit is a small-depth circuit of single-qubit rotations and two-qubit controlled rotations.
It can be described using two types of data:

1. The circuit geometry is the sequence of the gates that comprise the circuit, the types of rotations each of them uses and the qubits to which each of them is applied.  
Similarly to the model architecture in traditional machine learning, the circuit geometry is fixed for a model. 
In our case we have only one qubit to build the circuit on, so our choices are limited to only three single-qubit rotation gates: 
[$R_x$](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.intrinsic.rx), 
[$R_y$](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.intrinsic.ry) and
[$R_z$](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.intrinsic.rz). 
(You can read more about these gates in the [single-qubit gates tutorial](https://github.com/microsoft/QuantumKatas/tree/main/tutorials/SingleQubitGates).)
We will make an educated guess and decide to use a single [$R_y$](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.intrinsic.ry) gate - we will see why it works for our data later. Here is our model circuit:

<img src="./img/2-classification-circuit.PNG" width=250 alt="Circuit consisting of rotation gate" />

2. The parameters of each gate (the rotation angles).  
These parameters are learned during the training phase.
In our case the model has exactly one parameter: the rotation angle of our $R_y$ gate. 

> Following the same data visualization scheme as before, applying the gate $R_y(\theta)$ rotates the vector describing the state of the qubit counter-clockwise by $\frac{\theta}{2}$ radians.
> <img src="./img/3-rotation-gate.PNG" width=300 alt="Rotation gate rotates the state vector counter-clockwise" />
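
The effect of the rotation can be verified with a few lines of classical Python. This is a sketch that assumes the state has real amplitudes and uses the standard matrix of the $R_y(\theta)$ gate; the helper name `apply_ry` is introduced here for illustration only.

```python
import math

def apply_ry(theta, state):
    """Apply R_y(theta) to a single-qubit state with real amplitudes (a0, a1).

    R_y(theta) = [[cos(theta/2), -sin(theta/2)],
                  [sin(theta/2),  cos(theta/2)]]
    """
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    a0, a1 = state
    return (c * a0 - s * a1, s * a0 + c * a1)

alpha = math.pi / 6   # polar angle of the encoded data point
theta = math.pi / 2   # parameter of the rotation gate

rotated = apply_ry(theta, (math.cos(alpha), math.sin(alpha)))
new_angle = math.atan2(rotated[1], rotated[0])

# The state vector has moved counter-clockwise by theta / 2.
print(new_angle - alpha, theta / 2)
```

After the rotation, the probability of measuring $1$ is the square of the second amplitude, $\sin^2\left(\alpha + \frac{\theta}{2}\right)$.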

## 5. Measurement and Result Interpretation

The last step of the classification run is measuring the output qubit. 
This step is probabilistic: unless the qubit ends up in one of the basis states, the measurement can yield both $0$ and $1$ in different runs. 
To get a useful interpretation of the measurement result, we will repeat all steps (data encoding, applying classifier circuit and measurement) multiple times to estimate the *probabilities* of measuring $0$ and $1$.

Finally, we will compare this probability with a threshold $0.5-b$, where $b$ is the *classifier bias* - another parameter of the classification process that is learned during the training phase. 
We assign the label to the sample we're classifying depending on whether the probability is below or above this threshold.
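
The repeated-measurement procedure can be imitated classically. The Python sketch below assumes the single-qubit model of this tutorial, in which a sample with polar angle $\alpha$ followed by $R_y(\theta)$ yields $1$ with probability $\sin^2\left(\alpha + \frac{\theta}{2}\right)$. It also assumes the convention that label 1 is assigned when the estimated probability of measuring $1$ exceeds $0.5 - b$; both function names are hypothetical.

```python
import math
import random

def estimate_probability_of_one(alpha, theta, n_shots, rng):
    """Estimate P(1) by simulating n_shots independent measurements."""
    p_one = math.sin(alpha + theta / 2) ** 2
    ones = sum(1 for _ in range(n_shots) if rng.random() < p_one)
    return ones / n_shots

def classify(probability_of_one, bias):
    """Assumed convention: label 1 if P(1) is above the threshold 0.5 - bias."""
    return 1 if probability_of_one > 0.5 - bias else 0

rng = random.Random(42)          # fixed seed to make the run repeatable
alpha, theta, bias = 0.2, math.pi / 2, 0.0

estimate = estimate_probability_of_one(alpha, theta, 10000, rng)
exact = math.sin(alpha + theta / 2) ** 2
print(f"estimated P(1) = {estimate:.4f}, exact P(1) = {exact:.4f}")
print("label:", classify(estimate, bias))
```

More measurements give a tighter estimate of the probability, at the cost of more runs of the circuit.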

## 6. Using the QML Library

We could implement the whole training and classification process by hand, but this would quickly become tedious. 
Indeed, traditional machine learning uses well-developed libraries to abstract away the low-level implementation details and to allow the user to focus on high-level properties of the model.
The `Microsoft.Quantum.MachineLearning` library shipped with the Quantum Development Kit does just that for the quantum classifier we discuss in this tutorial. Let's see how this model looks when using the library.

### 6.1. Representing the circuit geometry

To describe the model in the format required for using the library, we'll need to represent the circuit geometry as an array of [ControlledRotation](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning.controlledrotation)s, one per gate. 
You can read more about designing classifier circuits in [the documentation](https://docs.microsoft.com/azure/quantum/user-guide/libraries/machine-learning/design).

In our case the array contains a single rotation around the `PauliY` axis, applied to the qubit with index 0, with an empty array of control qubits, and using the model parameter with index 0 as its rotation angle. The following function does exactly that:

In [5]:
open Microsoft.Quantum.MachineLearning;

function ClassifierStructure () : ControlledRotation[] {
    return [
        ControlledRotation((0, new Int[0]), PauliY, 0)
    ];
}

### 6.2. Training the model

Library operation [TrainSequentialClassifier](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning.trainsequentialclassifier) 
encapsulates all the model training logic.

In [6]:
open Microsoft.Quantum.Arrays;
open Microsoft.Quantum.MachineLearning;
open Microsoft.Quantum.Math;

operation TrainModel (
    trainingVectors : Double[][],
    trainingLabels : Int[],
    initialParameters : Double[][]
) : (Double[], Double) {
    // Combine training data and labels into a single data structure
    let samples = Mapped(
        LabeledSample,
        Zipped(trainingVectors, trainingLabels)
    );
    
    // Define a set of models we're going to try training;
    // in this case all models have the same structure but differ in the value of initial parameters
    let models = Mapped(
        SequentialModel(ClassifierStructure(), _, 0.0),
        initialParameters
    );
    
    // use all samples both for training and for validation
    let defaultSchedule = SamplingSchedule([0..Length(samples) - 1]);
    let (optimizedModel, nMisses) = TrainSequentialClassifier(
        models,
        samples,
        DefaultTrainingOptions()
            w/ LearningRate <- 2.0
            w/ Tolerance <- 0.0005,
        defaultSchedule,
        defaultSchedule
    );
    Message($"Training complete, found optimal parameters: {optimizedModel::Parameters}, {optimizedModel::Bias} with {nMisses} misses");
    return (optimizedModel::Parameters, optimizedModel::Bias);
}

operation TrainModelDemo () : Unit {
    // generate the training data
    let (features, labels) = SampleData(150, [PI() / 6.0, PI()/ 3.0]);
    let (parameters, bias) = TrainModel(features, labels, [[1.0], [2.0]]);
}

In [7]:
%simulate TrainModelDemo

Training complete, found optimal parameters: [1.5743999999999962], -0.43235 with 0 misses


()
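
As a sanity check, the parameters reported by this training run can be plugged into a classical model of the circuit. The Python sketch below uses exact probabilities instead of repeated measurements, rounds the rotation angle to 1.5744, and assumes that label 1 is assigned when the probability of measuring $1$ exceeds $0.5 - b$. It illustrates why a single $R_y$ gate suits this data; it does not describe how the library trains or validates models.

```python
import math

theta, bias = 1.5744, -0.43235           # values reported by the training run
low, high = math.pi / 6, math.pi / 3     # separation angles passed to SampleData

def true_label(alpha):
    """Label used by SampleData for a point with polar angle alpha."""
    return 0 if (alpha < low or alpha > high) else 1

def predicted_label(alpha):
    """Exact P(1) after R_y(theta), compared with the threshold 0.5 - bias."""
    p_one = math.sin(alpha + theta / 2) ** 2
    return 1 if p_one > 0.5 - bias else 0

# Evaluate the model on an evenly spaced grid of polar angles in [0, pi/2).
angles = [i * (math.pi / 2) / 1000 for i in range(1000)]
hits = sum(true_label(a) == predicted_label(a) for a in angles)
print(f"accuracy on the angle grid: {hits / len(angles):.3f}")
```

The rotation by roughly $\frac{\pi}{4}$ moves the middle of the class-1 sector onto the $|1\rangle$ axis, so samples of class 1 have the highest probability of measuring $1$, and the learned bias places the threshold close to the edges of that sector. Only angles very near the class boundaries can end up on the wrong side of the threshold.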

### 6.3. Using the trained model for classification/validation

Now that we have trained the model, we can use it either for classifying data using [EstimateClassificationProbabilities](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning.estimateclassificationprobabilities) and 
[InferredLabels](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning.inferredlabels) library operations, 
or for validating the model using [ValidateSequentialClassifier](https://docs.microsoft.com/qsharp/api/qsharp/microsoft.quantum.machinelearning.validatesequentialclassifier).

In the [Python notebook](./ExploringQuantumClassificationLibrary.ipynb) we used the trained model to classify validation data and to plot it afterwards. 
In this case we don't want to build any plots, so let's just validate the model using another set of data, generated using the same procedure.

In [8]:
open Microsoft.Quantum.Arrays;
open Microsoft.Quantum.Convert;
open Microsoft.Quantum.MachineLearning;
open Microsoft.Quantum.Math;

operation ValidateModel (
    validationVectors : Double[][],
    validationLabels : Int[],
    parameters : Double[],
    bias : Double
) : Double {
    // Combine validation data and labels into a single data structure
    let samples = Mapped(
        LabeledSample,
        Zipped(validationVectors, validationLabels)
    );

    let tolerance = 0.005;
    let nMeasurements = 10000;
    // use all data for validation
    let defaultSchedule = SamplingSchedule([0..Length(samples) - 1]);    
    let results = ValidateSequentialClassifier(
        SequentialModel(ClassifierStructure(), parameters, bias),
        samples,
        tolerance,
        nMeasurements,
        defaultSchedule
    );
    return IntAsDouble(results::NMisclassifications) / IntAsDouble(Length(samples));
}

operation TrainAndValidateModelDemo () : Unit {
    // generate the training data
    let (trainingVectors, trainingLabels) = SampleData(150, [PI() / 6.0, PI()/ 3.0]);
    let (parameters, bias) = TrainModel(trainingVectors, trainingLabels, [[1.0], [2.0]]);
    
    // generate the validation data
    let (validationVectors, validationLabels) = SampleData(50, [PI() / 6.0, PI()/ 3.0]);
    let missRate = ValidateModel(validationVectors, validationLabels, parameters, bias);
    Message($"Miss rate: {missRate * 100.0}%");
}

In [9]:
%simulate TrainAndValidateModelDemo

Training complete, found optimal parameters: [1.5632000000000161], -0.43415 with 0 misses
Miss rate: 0%


()