# Quantum Machine Learning: an implementation of PCA using Amazon Braket

Recent areas of computing such as machine learning have had to overcome several obstacles to achieve their successes. One of them is the dimensionality of the data that must be processed in today's digital age. In a highly connected world, processing high-dimensional data is difficult because of its resource demands, even for existing supercomputers. For this reason, different techniques have been created to extract the most relevant information from high-dimensional data sets, such as **Principal Component Analysis** (PCA). In a nutshell, PCA seeks to reduce the dimensionality of a data set while preserving as much variability (i.e., statistical information) as possible [1].

These techniques for improving data processing and extracting the most relevant information were conceived under the paradigm of classical computing (so called to distinguish it from quantum computing, which leverages the laws of quantum mechanics to exploit properties that are not available in classical computing [2]). However, these and other machine learning techniques are starting to be rethought using new approaches such as quantum processing. This new branch is known as Quantum Machine Learning [3].

The purpose of this notebook is to implement PCA on a quantum processor by using Amazon Braket. We will use the well-known PCA scenario of housing prices in the United States. The idea behind this scenario is to show how a single variable (i.e., the price of the house) can be a function of many other features, such as the number of bedrooms, the number of bathrooms, whether the kitchen has been recently remodeled, whether it has a balcony, the square footage, whether it has a sea view, the lot size, the age of the construction, the location of the house (neighborhood), whether it has an attic or basement, the number of parking spots, etc. With so many features, it is important to reduce the data to the directions that capture the largest variance. This is where PCA comes in: it aims to determine which combinations of features capture the largest variance in the data [1].

For our case, let's consider the number of bedrooms (first feature) and the square footage (second feature) of several houses for sale in Los Alamos (following the same example as [4]). Here is the raw data, taken from __[Zillow](http://www.zillow.com)__, for 15 houses:

Number of bedrooms:
\begin{align}
X_1 = [4,3,4,4,3,3,3,3,4,4,4,5,4,3,4]
\end{align}

Square footage:

\begin{align}
X_2 = [3028, 1365, 2726, 2538, 1318, 1693, 1412, 1632, 2875, 3564, 4412, 4444, 4278, 3064, 3857]
\end{align}

## Principal Component Analysis

PCA has three main steps. The first one is called standardization, which means adjusting the data to a common scale. We're going to subtract the mean from both features and divide $X_2$ by $1000$ so that both features are on a comparable scale:

In [1]:
import numpy as np

X_1 = [4,3,4,4,3,3,3,3,4,4,4,5,4,3,4]
X_2 = [3028,1365,2726,2538,1318,1693,1412,1632,2875,3564,4412,4444,4278,3064,3857]
X_1 = X_1 - np.average(X_1)
X_2 = (X_2 - np.average(X_2)) / 1000

In [2]:
print('The rescaled feature vectors are')
print('X_1 = ', X_1)
print('X_2 = ', X_2)

The rescaled feature vectors are
X_1 =  [ 0.33333333 -0.66666667  0.33333333  0.33333333 -0.66666667 -0.66666667
 -0.66666667 -0.66666667  0.33333333  0.33333333  0.33333333  1.33333333
  0.33333333 -0.66666667  0.33333333]
X_2 =  [ 0.21426667 -1.44873333 -0.08773333 -0.27573333 -1.49573333 -1.12073333
 -1.40173333 -1.18173333  0.06126667  0.75026667  1.59826667  1.63026667
  1.46426667  0.25026667  1.04326667]


The next step for PCA is calculating the covariance matrix, which is defined below. Here $\sigma[X]$ is the standard deviation of $X$, and note that $cov[X,X]=\sigma^2[X]=var[X]$, i.e., the covariance of a variable with itself is its variance:

\begin{align}
\Sigma = 
\begin{pmatrix}
var[X_1] & cov[X_1, X_2] \\
cov[X_2, X_1] & var[X_2]
\end{pmatrix}
\end{align}

We're going to use the pandas library to calculate the covariance matrix we need (pandas normalizes by $N-1$, i.e., it returns the sample covariance):

In [3]:
import pandas as pd

df = pd.DataFrame(
    {'X_1': X_1,
    'X_2': X_2}
    )

In [4]:
sigma = df.cov()
print(sigma)

          X_1       X_2
X_1  0.380952  0.573476
X_2  0.573476  1.296934


Then, our covariance matrix is:

\begin{align}
\Sigma = 
\begin{pmatrix}
0.380952 & 0.573476 \\
0.573476 & 1.296934
\end{pmatrix}
\end{align}
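
As a cross-check, the same matrix can be reproduced with NumPy alone. The sketch below is self-contained and only assumes the raw data listed earlier; the names `bedrooms`, `sqft`, and `sigma_check` are introduced here for illustration and are not part of the notebook.

```python
import numpy as np

# Raw data copied from the notebook
bedrooms = np.array([4, 3, 4, 4, 3, 3, 3, 3, 4, 4, 4, 5, 4, 3, 4], dtype=float)
sqft = np.array([3028, 1365, 2726, 2538, 1318, 1693, 1412, 1632,
                 2875, 3564, 4412, 4444, 4278, 3064, 3857], dtype=float)

# Same standardization as in the notebook: center both features, rescale the second
x1 = bedrooms - bedrooms.mean()
x2 = (sqft - sqft.mean()) / 1000

# np.cov treats each row as one variable and, like pandas, divides by N - 1
sigma_check = np.cov(np.vstack([x1, x2]))
print(sigma_check)
```

The off-diagonal entries are equal because the covariance is symmetric in its two arguments.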

Now, the final step is to find the eigenvalues and eigenvectors of the covariance matrix so we can find the principal components of the data. The size of each eigenvalue tells us how much variance is captured by the corresponding principal component. In our 2-feature case, we would keep the component with the largest eigenvalue. We can compute these eigenvalues classically with NumPy's `linalg` module:

In [5]:
sigma_eigenvalues, sigma_eigenvectors = np.linalg.eig(sigma)
print('sigma_eigenvalues: ', sigma_eigenvalues)
print('sigma_eigenvectors: ', sigma_eigenvectors)

sigma_eigenvalues:  [0.1050286  1.57285742]
sigma_eigenvectors:  [[-0.90112103 -0.43356764]
 [ 0.43356764 -0.90112103]]


Then, our eigenvalues (calculated with classical computation) are:

\begin{align}
e_1 = 1.57285742, e_2 = 0.1050286
\end{align}
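
To make "how much variance is captured" concrete, the following sketch computes the fraction of the total variance carried by each component. It assumes the covariance values printed earlier, rounded to six decimals; `sigma_mat` and `explained` are illustrative names. It uses `np.linalg.eigh`, which is meant for symmetric matrices, instead of the `eig` call used in the notebook.

```python
import numpy as np

# Covariance matrix (values copied from the notebook output)
sigma_mat = np.array([[0.380952, 0.573476],
                      [0.573476, 1.296934]])

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(sigma_mat)

# Reorder so that the largest-variance component comes first
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of the total variance captured by each principal component
explained = eigvals / eigvals.sum()
print(eigvals)
print(explained)
```

With these numbers the first component carries roughly 94% of the variance, which is why keeping only that component is reasonable here.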

## Quantum algorithm to compute PCA

We're going to implement the same quantum algorithm proposed by [4] which has the following steps:

    1. Classical pre-processing.
    2. State preparation.
    3. Purity calculation.
    4. Classical post-processing.

### 1. Classical pre-processing

Before talking about the classical pre-processing step, it's important to understand what a density matrix is. Density matrices are used to represent quantum states that are not pure but mixed. Suppose we don't know the exact state of our quantum system, but we know that it is in one of $M$ states $\ket{\psi_i}$, each with probability $p_i$. Then we can define the density matrix of the quantum system to be:

\begin{align}
\rho = \sum_{i=1}^{M} p_i \ket{\psi_i} \bra{\psi_i}
\end{align}

Note that this definition reduces to $\rho = \ket{\psi}\bra{\psi}$ when we know the quantum system is in a specific state $\ket{\psi}$ with $p=1$. From the density matrix definition it can also be seen that:

    i) it's positive semi-definite
    ii) it has unit trace
    
In fact, **any matrix that satisfies these two properties can be interpreted as a density matrix** [4] (see the reference for more details on this interpretation). This important fact is what lets us encode the classical information of the covariance matrix into a quantum state. 

Then, our density matrix from our covariance matrix in this case of two features would be (after normalization): 

\begin{align}
\rho = \frac{\Sigma}{Tr(\Sigma)}
\end{align}

In [6]:
rho = sigma / np.trace(sigma)
print(np.array(rho))

[[0.22704306 0.34178495]
 [0.34178495 0.77295694]]


Our density matrix $\rho$ would be:

\begin{align}
\rho = 
\begin{pmatrix}
0.227043 & 0.341785 \\
0.341785 & 0.772957
\end{pmatrix}
\end{align}
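
The two defining properties can be checked numerically. This is a minimal sketch that assumes the rounded covariance values from earlier; the variable names are illustrative only.

```python
import numpy as np

sigma_mat = np.array([[0.380952, 0.573476],
                      [0.573476, 1.296934]])
rho_mat = sigma_mat / np.trace(sigma_mat)

# Property ii): unit trace
has_unit_trace = np.isclose(np.trace(rho_mat), 1.0)

# Property i): positive semi-definite, i.e. symmetric with non-negative eigenvalues
is_symmetric = np.allclose(rho_mat, rho_mat.T)
is_psd = bool(np.all(np.linalg.eigvalsh(rho_mat) >= -1e-12))

print(has_unit_trace, is_symmetric, is_psd)
```

A covariance matrix is always symmetric and positive semi-definite, so dividing by its trace is all that is needed to satisfy both properties.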

### 2. State preparation

Before the state preparation step, we must talk about the purification process. Density matrices are a more general way to represent quantum systems, since they can describe both pure and mixed states, whereas a state vector can only represent pure states. However, even a mixed state can be seen as part of a larger system that is in a pure state. This process of converting a mixed state into a pure state of an enlarged system is called **purification** [4].

Recalling the definition of the density matrix, and using the Schrödinger–HJW theorem, it can be shown that any density matrix $\rho$ can be purified, which means it can be seen as the partial trace of a pure state defined in a larger Hilbert space. In other words, it's always possible to find an ancillary Hilbert space $\mathcal{H_a}$ and a pure state $\ket{\psi_{sa}} \in \mathcal{H_s} \otimes \mathcal{H_a}$ such that $\rho = Tr_a(\ket{\psi_{sa}}\bra{\psi_{sa}})$, and such a state can be written (for some orthonormal basis $\{\ket{a_i}\}$ of $\mathcal{H_a}$) as: 

\begin{align}
\ket{\psi_{sa}} = \sum_i \sqrt{p_i} \ket{\psi_i} \otimes \ket{a_i}
\end{align}

For the state preparation process, note that in our case of only two features, $\Sigma$ and $\rho$ are $2 \times 2$ matrices (one-qubit states), so $\rho$ can be purified to a pure state $\ket{\psi}$ on two qubits. Strictly speaking, we should design a state preparation circuit for this purification, as is done in detail in [4]. For this notebook, however, we calculate the purified state directly in order to focus on the quantum PCA process.

First, we find the eigenvalues and eigenvectors of our density matrix $\rho$. 

In [7]:
rho_eig_val, rho_eig_vec = np.linalg.eig(rho)
print('rho_eig_val: ', rho_eig_val)
print('rho_eig_vec: ', rho_eig_vec)

rho_eig_val:  [0.06259579 0.93740421]
rho_eig_vec:  [[-0.90112103 -0.43356764]
 [ 0.43356764 -0.90112103]]


Then, our eigenvectors (which `np.linalg.eig` already returns normalized) would be:

\begin{align}
\vec{v_1} = 
\begin{pmatrix}
-0.90112103 \\
0.43356764 
\end{pmatrix},
\vec{v_2} = 
\begin{pmatrix}
-0.43356764 \\
-0.90112103
\end{pmatrix}
\end{align}


Following the definition of the pure state, we are going to use the computational basis, which is an orthonormal basis (i.e., $\{\ket{a_i}\} = \{\ket{0}, \ket{1}\}$), for the ancilla. We also need the square roots of the eigenvalues:

In [8]:
sqrt_eig_val = np.sqrt(rho_eig_val)
print(sqrt_eig_val)

[0.25019151 0.96819637]


The square roots of our eigenvalues would be:

\begin{align}
\sqrt{\lambda_1} = 0.25019151, \sqrt{\lambda_2} = 0.96819637 
\end{align}

Using the previous definition, our pure state would be: 

\begin{align}
\ket{\psi} = \sum_i \sqrt{\lambda_i} \ket{v_i} \otimes \ket{a_i}
\end{align}

\begin{align}
\ket{\psi} = \sqrt{\lambda_1} \ket{v_1} \otimes \ket{0} + \sqrt{\lambda_2} \ket{v_2} \otimes \ket{1}
\end{align}

\begin{align}
\ket{\psi} = 0.25019151 
\begin{pmatrix}
-0.90112103 \\
0.43356764 
\end{pmatrix} \otimes 
\begin{pmatrix}
1 \\
0
\end{pmatrix} + 0.96819637 
\begin{pmatrix}
-0.43356764 \\
-0.90112103
\end{pmatrix} \otimes 
\begin{pmatrix}
0 \\
1
\end{pmatrix}
\end{align}

\begin{align}
\ket{\psi} = 0.25019151 
\begin{pmatrix}
-0.90112103 \\
0 \\
0.43356764  \\
0
\end{pmatrix} + 0.96819637 
\begin{pmatrix}
0 \\
-0.43356764 \\
0  \\
-0.90112103
\end{pmatrix}
\end{align}

Note that `np.linalg.eig` returns the eigenvectors as the *columns* of its second output, not the rows, so we select them by column index:

In [9]:
tensor_product1 = np.vstack((rho_eig_vec[:, 0], np.zeros(2))).ravel('F')  # v_1 tensor |0>
tensor_product2 = np.vstack((np.zeros(2), rho_eig_vec[:, 1])).ravel('F')  # v_2 tensor |1>
print(tensor_product1)
print(tensor_product2)

[-0.90112103  0.          0.43356764  0.        ]
[ 0.         -0.43356764  0.         -0.90112103]


In [10]:
psi = sqrt_eig_val[0]*tensor_product1 + sqrt_eig_val[1]*tensor_product2
print(psi)

[-0.22545283 -0.41977861  0.10847494 -0.8724621 ]


Our purified state, written as a two-qubit state vector in the basis $\ket{00}, \ket{01}, \ket{10}, \ket{11}$, would be: 

\begin{align}
\ket{\psi} = 
\begin{pmatrix}
-0.22545283 \\
-0.41977861 \\
0.10847494 \\
-0.8724621
\end{pmatrix}
\end{align}

To confirm that the state above is a purification, we check that tracing out the ancilla returns the original density matrix $\rho$. 

\begin{align}
\rho = Tr_a(\ket{\psi}\bra{\psi}) 
\end{align}

In [11]:
# Full two-qubit density matrix |psi><psi|; the partial trace over the ancilla is taken afterwards
psi_density = np.outer(psi, psi)
print(psi_density)

[[ 0.05082898  0.09464028 -0.02445598  0.19669905]
 [ 0.09464028  0.17621408 -0.04553546  0.36624093]
 [-0.02445598 -0.04553546  0.01176681 -0.09464028]
 [ 0.19669905  0.36624093 -0.09464028  0.76119012]]


As the reader can verify, taking the partial trace of the previous matrix over the ancilla leads us back to the original mixed state with density matrix $\rho$, which confirms that $\ket{\psi}$ is a purification of $\rho$. 

\begin{align}
Tr_a(\ket{\psi}\bra{\psi}) = \begin{pmatrix}
0.227043 & 0.341785 \\
0.341785 & 0.772957
\end{pmatrix}
= \rho
\end{align}
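
The partial trace itself can be carried out in a few lines. The sketch below is self-contained and assumes the purified state vector printed by the notebook; `psi_vec`, `coeffs`, and `rho_from_psi` are illustrative names. Because the state was built as (system) tensor (ancilla), the vector index is `2*s + a`, so reshaping to a 2x2 array gives one axis per qubit.

```python
import numpy as np

# Purified state copied from the notebook output
psi_vec = np.array([-0.22545283, -0.41977861, 0.10847494, -0.8724621])

# coeffs[s, a]: s indexes the system qubit, a indexes the ancilla
coeffs = psi_vec.reshape(2, 2)

# Partial trace over the ancilla: rho[s, t] = sum_a coeffs[s, a] * conj(coeffs[t, a])
rho_from_psi = np.einsum('sa,ta->st', coeffs, coeffs.conj())
print(rho_from_psi)
```

The result should agree with the density matrix $\rho$ computed in the pre-processing step up to rounding.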

### 3. Purity calculation

Before getting into the details of calculating the purity, we must discuss what the purity is and how to find it in our scenario. The purity measures how mixed a quantum state is, and is defined by (where $p_i$ are the eigenvalues of $\rho$): 

\begin{align}
P = Tr(\rho^2) = \sum_{i} p_i^2
\end{align}

Using the definitions and theory provided by [5], we can see that the purity of a single-qubit state can be expressed in terms of the length $r = || \vec{r} ||$ of the **Bloch vector** $\vec{r} = (x, y, z)$: 

\begin{align}
P = Tr(\rho^2) = \frac{1 + r^2}{2}
\end{align}

\begin{align}
2P - 1 = r^2
\end{align}

\begin{align}
\sqrt{2P - 1} = || \vec{r} ||
\end{align}

Using the expression for the eigenvalues of the density matrix $\rho$ in terms of the length $r$ of the Bloch vector:

\begin{align}
\lambda_{1,2} =  \frac{(1 \pm \sqrt{(x^2 + y^2 + z^2)})}{2} = \frac{(1 \pm ||\vec{r} ||)}{2} = \frac{(1 \pm \sqrt{2P-1})}{2}
\end{align}

The previous equation states that we can calculate the eigenvalues of our density matrix $\rho$ by measuring the purity of our quantum state. Multiplying them by $Tr(\Sigma)$ then gives the eigenvalues of our covariance matrix (i.e., the variances captured by the principal components). 
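
The relation between purity and eigenvalues can be checked classically before involving any quantum circuit. This sketch assumes the density matrix values printed earlier; `purity_exact`, `lam_plus`, and `lam_minus` are illustrative names.

```python
import numpy as np

# Density matrix copied from the notebook output
rho_mat = np.array([[0.22704306, 0.34178495],
                    [0.34178495, 0.77295694]])

# Purity P = Tr(rho^2)
purity_exact = np.trace(rho_mat @ rho_mat)

# Bloch vector length and the two eigenvalues (1 +/- r) / 2
r = np.sqrt(2 * purity_exact - 1)
lam_plus, lam_minus = (1 + r) / 2, (1 - r) / 2

print(purity_exact, lam_plus, lam_minus)
print(np.linalg.eigvalsh(rho_mat))  # direct diagonalization, for comparison
```

For a single qubit the agreement is exact, because a 2x2 unit-trace matrix has only one free eigenvalue and the purity fixes it.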

Now, following the same procedure presented by [6], the purity is equal to the expectation value of the SWAP operator between two copies of the purification (using also the previous definitions):

\begin{align}
\rho = \sum_{i} p_i \ket{v_i} \bra{v_i}
\end{align}

\begin{align}
\ket{\psi} = \sum_{i} \sqrt{p_i} \ket{v_i} \otimes \ket{w_i} = \sum_{i} \sqrt{p_i} \ket{v_i} \ket{w_i}
\end{align}

\begin{align}
P = Tr(\rho^2) = \sum_{i} p_i^2
\end{align}



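Now, let's see what the SWAP operation does to two copies of the purified state. Here $SWAP_{a}$ exchanges the ancilla registers $\ket{w}$ of the two copies, and the last step uses the orthonormality of $\{\ket{v_i}\}$ and $\{\ket{w_i}\}$, which forces $i=j=k=l$:

\begin{align}
\bra{\psi} \bra{\psi} SWAP_{a} \ket{\psi} \ket{\psi} = 
\sum_{ijkl} \sqrt{p_i p_j p_k p_l} \, \langle v_i | v_k \rangle \langle w_i | w_l \rangle \langle v_j | v_l \rangle \langle w_j | w_k \rangle = 
\sum_{i} p_i^2
\end{align}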

Then, 

\begin{align}
P = Tr(\rho^2) = \bra{\psi} \bra{\psi} SWAP_{a} \ket{\psi} \ket{\psi}
\end{align}
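
This identity can be verified numerically without building a circuit. The sketch below is self-contained and assumes the purified state vector printed by the notebook; it applies the SWAP by permuting tensor axes, and all variable names are illustrative.

```python
import numpy as np

# Purified state copied from the notebook output (index = 2*s + a)
psi_vec = np.array([-0.22545283, -0.41977861, 0.10847494, -0.8724621])

# Two copies of the state as a tensor with axes (s1, a1, s2, a2)
two_copies = np.kron(psi_vec, psi_vec).reshape(2, 2, 2, 2)

# SWAP acting on the two ancilla registers: exchange axes a1 and a2
swapped = two_copies.transpose(0, 3, 2, 1)

# Expectation value <psi psi| SWAP |psi psi>
swap_expectation = np.sum(two_copies.conj() * swapped)

# Purity computed directly from the reduced density matrix, for comparison
coeffs = psi_vec.reshape(2, 2)
rho_mat = coeffs @ coeffs.conj().T
purity_exact = np.trace(rho_mat @ rho_mat)

print(swap_expectation, purity_exact)
```

Swapping the two system registers instead (axes 0 and 2) gives the same number, since both reduced states share the same nonzero eigenvalues.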

The previous equation states that the purity can be measured using the Hadamard test on a controlled-SWAP gate. We're going to use the Amazon Braket service to find the purity of our quantum system, and then convert it into the eigenvalues we're looking for in our PCA scenario.

To do this, we're going to follow the algorithm proposed by [4], which involves two processes: the state preparation and the purity calculation. We're going to explore two ways of implementing this quantum algorithm on Amazon Braket. The first is through the Qiskit provider for Amazon Braket (to learn more about this feature see: https://aws.amazon.com/blogs/quantum-computing/introducing-the-qiskit-provider-for-amazon-braket/). In a few words, it is a library that allows AWS customers to run quantum algorithms written with Qiskit (a widely used open-source quantum programming SDK). The second is through the Amazon Braket SDK directly. 

In [12]:
# AWS imports: Import Braket SDK modules
from braket.circuits import Circuit
from braket.devices import LocalSimulator
from braket.circuits import Gate
from braket.circuits import Observable
import matplotlib.pyplot as plt
from braket.aws import AwsDevice

## First implementation: Qiskit

The Qiskit implementation is a fast way to simulate our quantum circuit and check that we're on the right track, because we don't need to worry too much about the state preparation process (i.e., no need to manually convert gates to unitary matrices). However, this initial approach has certain limits that we'll address with the second implementation. 

We just need to create a quantum circuit with 5 qubits: two copies of the two-qubit purified state (4 qubits) plus a fifth qubit, known as an ancilla qubit, from which we read out the answer. Each pair of qubits must be initialized in the quantum state that we already found in the previous part of this notebook (i.e., $\ket{\psi}$), and the ancilla qubit just needs to be initialized in the $\ket{0}$ state. Then, we apply a Hadamard gate to the ancilla qubit to put it in superposition before applying the controlled-SWAP gate (here the ancilla is the control qubit, and the targets are one qubit from each copy of the quantum state). Finally, we apply another Hadamard gate to the ancilla and measure it to recover the answer we're looking for. 

In [13]:
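from qiskit import QuantumCircuit, Aer  # newer Qiskit releases ship Aer separately: from qiskit_aer import Aer

circ_qiskit = QuantumCircuit(5, 1)

# ancilla qubit (control of the controlled-SWAP), initialized to |0>
circ_qiskit.initialize([1,0], (0,))

# two copies of the purified state |psi> (qubits 1 and 3 are the swap targets)
circ_qiskit.initialize(psi, (1,2))
circ_qiskit.initialize(psi, (3,4))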

circ_qiskit.h(0)
circ_qiskit.cswap(0,1,3)
circ_qiskit.h(0)

#measurement
circ_qiskit.measure(0,0)
circ_qiskit.draw()

Then, we just need to run the quantum circuit on the Aer simulator.

In [14]:
aer_sim = Aer.get_backend('aer_simulator')
counts = aer_sim.run(circ_qiskit, shots=1000000).result().get_counts()
print(counts)

{'1': 58542, '0': 941458}


After running our simulation, we can use the measurement counts to calculate the quantities we need (we'll also use the equations above to obtain the eigenvalues from the purity). 
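
One way to see why the ancilla statistics give the purity is to follow the state through the circuit. The following is a short sketch of the standard swap-test argument, writing $\ket{\Psi} = \ket{\psi}\ket{\psi}$ for the two copies and $S$ for the SWAP operator (both symbols are introduced here only for this derivation). The first Hadamard, the controlled-SWAP, and the second Hadamard act as:

\begin{align}
\ket{0}\ket{\Psi} \rightarrow \frac{1}{\sqrt{2}} (\ket{0} + \ket{1}) \ket{\Psi} \rightarrow \frac{1}{\sqrt{2}} (\ket{0}\ket{\Psi} + \ket{1} S \ket{\Psi}) \rightarrow \frac{1}{2} (\ket{0} (I + S) \ket{\Psi} + \ket{1} (I - S) \ket{\Psi})
\end{align}

Since $S$ is Hermitian and $S^2 = I$, the outcome probabilities of the ancilla are:

\begin{align}
P(0) = \frac{1 + \bra{\Psi} S \ket{\Psi}}{2}, \quad P(1) = \frac{1 - \bra{\Psi} S \ket{\Psi}}{2}, \quad P(0) - P(1) = \bra{\Psi} S \ket{\Psi} = Tr(\rho^2)
\end{align}

For the counts printed above this gives $0.941458 - 0.058542 = 0.882916$ as the purity estimate.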

In [15]:
total_counts = counts['0'] + counts['1']
print('probability for 0 outcome: ', counts['0']/total_counts)
print('probability for 1 outcome: ', counts['1']/total_counts)

probability for 0 outcome:  0.941458
probability for 1 outcome:  0.058542


In [16]:
purity = (counts['0'] - counts['1']) / (counts['0'] + counts['1'])
e_1 = (1 + np.sqrt(2 * purity - 1)) / 2 * np.trace(sigma)
e_2 = (1 - np.sqrt(2 * purity - 1)) / 2 * np.trace(sigma)
print('The first eigenvalue obtained by the quantum PCA using Qiskit provider is: \n', e_1)
print('The second eigenvalue obtained by the quantum PCA using Qiskit provider is: \n', e_2)

The first eigenvalue obtained by the quantum PCA using Qiskit provider is: 
 1.5731173711237019
The second eigenvalue obtained by the quantum PCA using Qiskit provider is: 
 0.10476864792391717
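
The post-processing can be wrapped in a small function that also reports the statistical uncertainty coming from the finite number of shots. This is a sketch under stated assumptions: `eigenvalues_from_counts` is a hypothetical helper (not part of Qiskit or the Braket SDK), the counts are the ones printed above, and the trace of $\Sigma$ is taken from the rounded covariance values.

```python
import numpy as np

def eigenvalues_from_counts(counts, trace):
    """Hypothetical helper: convert ancilla counts into covariance eigenvalues."""
    shots = counts.get('0', 0) + counts.get('1', 0)
    p0 = counts.get('0', 0) / shots
    purity_est = 2 * p0 - 1                            # P(0) - P(1)
    purity_err = 2 * np.sqrt(p0 * (1 - p0) / shots)    # binomial standard error
    # Shot noise can push 2P - 1 slightly below zero, so clip before the square root
    r = np.sqrt(max(2 * purity_est - 1, 0.0))
    return (1 + r) / 2 * trace, (1 - r) / 2 * trace, purity_est, purity_err

# Counts copied from the simulator output; trace from the covariance matrix
trace_sigma = 0.380952 + 1.296934
e_big, e_small, purity_est, purity_err = eigenvalues_from_counts(
    {'1': 58542, '0': 941458}, trace_sigma)
print(e_big, e_small, purity_est, purity_err)
```

With one million shots the standard error of the purity is below $10^{-3}$, which is consistent with the small deviation from the classical eigenvalues.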


Recall the eigenvalues we found using linear algebra at the beginning of this notebook were:

\begin{align}
e_1 = 1.57285742, e_2 = 0.1050286
\end{align}

In [41]:
perc_e1 = abs((e_1-1.57285742)/1.57285742)*100
print('percent error first eigenvalue: ', perc_e1)

percent error first eigenvalue:  0.016527316487581375


## Second implementation: Amazon Braket SDK

Now, we'd like to run our quantum circuit not just on the Aer quantum simulator but on Amazon Braket devices. To do so, we implement each step using the Amazon Braket SDK. Because not all quantum hardware providers offer the same gates, we need to decompose complex gates (e.g., the controlled-SWAP gate) into simple gates that any quantum hardware provider can run. 

Let's start by defining the quantum gate that we'll use many times in our quantum circuit: the unitary gate. This is a generic single-qubit rotation gate with 3 Euler angles, defined by:

\begin{align}
U(\theta, \phi, \lambda) = 
\begin{pmatrix}
cos(\frac{\theta}{2}) & -e^{i\lambda}sin(\frac{\theta}{2}) \\
e^{i\phi}sin(\frac{\theta}{2}) & e^{i(\phi+\lambda)}cos(\frac{\theta}{2}) \\
\end{pmatrix}
\end{align}

In [18]:
def ret_matrix(value):
    # U(theta, phi, lambda) with all three Euler angles set to `value`
    c, s, phase = np.cos(value/2), np.sin(value/2), np.exp(1j*value)
    return np.array([[c, -phase*s], [phase*s, phase**2*c]])
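
For reference, the general three-angle form of the matrix defined above can be written out and checked for unitarity. This is only an illustrative sketch: `u3` is a name chosen here, not a Braket SDK function, and the angle is one of the values the notebook's circuit uses.

```python
import numpy as np

def u3(theta, phi, lam):
    """Generic single-qubit rotation U(theta, phi, lambda) from the formula above."""
    return np.array([
        [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
        [np.exp(1j * phi) * np.sin(theta / 2),
         np.exp(1j * (phi + lam)) * np.cos(theta / 2)],
    ])

# The notebook's ret_matrix(value) corresponds to u3(value, value, value)
angle = 0.465
gate = u3(angle, angle, angle)

# A valid gate must satisfy U^dagger U = I
is_unitary = np.allclose(gate.conj().T @ gate, np.eye(2))
print(is_unitary)
```

Checking unitarity before passing a matrix to a circuit is a cheap way to catch typos in hand-written gate definitions.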

Because our case has only 2 features, it is a special case, and the following values can be calculated using the above equations and data. First, we need to take the basis states to the quantum state $\ket{\psi}$ we found earlier (this is the state preparation process, using unitary matrices, Hadamard and CNOT gates). Then, for the purity calculation we need to decompose the controlled-SWAP operation into simple gates (how to do this is beyond the scope of this notebook; for further details about the decomposition using the Toffoli gate, please see the references provided by [4] on this topic). Finally, we use the same ancilla-qubit strategy to measure the quantum circuit and get the eigenvalues we're looking for.
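
Although the full gate-level decomposition is out of scope, its first step can be checked numerically: a controlled-SWAP equals a Toffoli gate sandwiched between two CNOTs. The sketch below verifies this textbook identity with plain permutation matrices; it does not reproduce the exact gate sequence used in the circuit of this notebook, and `permutation_matrix` is a helper defined here for illustration.

```python
import numpy as np

def permutation_matrix(update):
    """Build the 8x8 matrix of a classical reversible map on bits (c, a, b)."""
    mat = np.zeros((8, 8), dtype=int)
    for index in range(8):
        c, a, b = (index >> 2) & 1, (index >> 1) & 1, index & 1
        c2, a2, b2 = update(c, a, b)
        mat[(c2 << 2) | (a2 << 1) | b2, index] = 1
    return mat

# Controlled-SWAP: exchange a and b only when the control c is 1
cswap = permutation_matrix(lambda c, a, b: (c, b, a) if c else (c, a, b))

# CNOT with control b and target a
cnot_b_to_a = permutation_matrix(lambda c, a, b: (c, a ^ b, b))

# Toffoli with controls c and a, target b
toffoli = permutation_matrix(lambda c, a, b: (c, a, b ^ (c & a)))

# CNOT, then Toffoli, then CNOT
decomposed = cnot_b_to_a @ toffoli @ cnot_b_to_a
print(np.array_equal(decomposed, cswap))
```

The Toffoli gate is then broken down further into CNOT, Hadamard, T and inverse-T gates, which are the gate types that appear in the cells that follow.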

In [19]:
circuit = Circuit()

circuit.unitary(matrix=ret_matrix(0.465), targets=[1])
circuit.unitary(matrix=ret_matrix(0.465), targets=[2])
circuit.h(3)

circuit.cnot(1,0)
circuit.cnot(2,4)

circuit.unitary(matrix=ret_matrix(1.570), targets=[0])
circuit.unitary(matrix=ret_matrix(1.950), targets=[1])
circuit.unitary(matrix=ret_matrix(1.950), targets=[2])
circuit.unitary(matrix=ret_matrix(1.570), targets=[4])
print(circuit)

T  : |0|1|2|
            
q0 : ---X-U-
        |   
q1 : -U-C-U-
            
q2 : -U-C-U-
        |   
q3 : -H-|---
        |   
q4 : ---X-U-

T  : |0|1|2|


In [20]:
circuit.h(1)
circuit.h(2)

circuit.cnot(2,1)

circuit.h(1)
circuit.h(2)

circuit.h(1)

circuit.cnot(2,1)
print(circuit)

T  : |0|1|2|3|4|5|6|7|
                      
q0 : ---X-U-----------
        |             
q1 : -U-C-U-H-X-H-H-X-
              |     | 
q2 : -U-C-U-H-C-H---C-
        |             
q3 : -H-|-------------
        |             
q4 : ---X-U-----------

T  : |0|1|2|3|4|5|6|7|


In [21]:
circuit.ti(1)

circuit.cnot(2,1)

circuit.cnot(3,2)

circuit.cnot(2,1)

circuit.t(1)
circuit.cnot(3,2)

circuit.cnot(2,1)

circuit.ti(1)

circuit.cnot(2,1)

circuit.cnot(3,2)

circuit.cnot(2,1)

circuit.t(1)
circuit.cnot(3,2)

circuit.h(1)
circuit.t(2)

circuit.h(1)
circuit.cnot(3,2)

circuit.ti(2)
circuit.t(3)

circuit.cnot(3,2)

circuit.h(2)
circuit.h(3)

circuit.cnot(2,1)

#measurement part of the quantum circuit
circuit.expectation(Observable.Z(), target=[3])
circuit.sample(observable=Observable.Z(), target=3)
circuit.probability(target=3)

circuit.h(2)
circuit.h(1)

print(circuit)

T  : |0|1|2|3|4|5|6|7|8 |9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|24|25|            Result Types            |
                                                                                                                
q0 : ---X-U-----------------------------------------------------------------------------------------------------
        |                                                                                                       
q1 : -U-C-U-H-X-H-H-X-Ti-X----X--T--X--Ti-X-----X--T--H--H-----------X--H---------------------------------------
              |     |    |    |     |     |     |                    |                                          
q2 : -U-C-U-H-C-H---C----C-X--C--X--C-----C--X--C--X--T--X--Ti-X--H--C--H---------------------------------------
        |                  |     |           |     |     |     |                                                
q3 : -H-|------------------C-----C-----------C-----C-----C--T--C--H--------Expectation(Z)-Sample

## Amazon Braket local simulator

In [22]:
local_device = LocalSimulator()

local_result = local_device.run(circuit, shots=10000).result()
local_counts = local_result.measurement_counts
print(local_counts)

Counter({'01100': 1554, '11100': 1081, '01101': 1044, '11101': 691, '11001': 641, '10101': 614, '10001': 573, '00101': 537, '01001': 480, '10100': 478, '11000': 477, '01000': 422, '00100': 374, '10000': 246, '00001': 223, '10110': 130, '11010': 119, '00111': 113, '01011': 110, '00000': 93})


In [23]:
print("Probability:", local_result.values[2])

Probability: [0.9528 0.0472]


In [24]:
purity_sdk = local_result.values[2][0] - local_result.values[2][1]
print(purity_sdk)

0.9056


In [25]:
v_1 = (1 + np.sqrt(2 * purity_sdk - 1)) / 2 * np.trace(sigma)
v_2 = (1 - np.sqrt(2 * purity_sdk - 1)) / 2 * np.trace(sigma)
print('The first eigenvalue obtained by the quantum PCA using Amazon Braket SDK is: \n', v_1)
print('The second eigenvalue obtained by the quantum PCA using Amazon Braket SDK is: \n', v_2)

The first eigenvalue obtained by the quantum PCA using Amazon Braket SDK is: 
 1.5945508064417984
The second eigenvalue obtained by the quantum PCA using Amazon Braket SDK is: 
 0.08333521260582065


In [40]:
perc_v1 = abs((v_1-1.57285742)/1.57285742)*100
print('percent error first eigenvalue: ', perc_v1)

percent error first eigenvalue:  1.3792341356534594


## Amazon Braket SV1 device

In [27]:
sv1_device = AwsDevice("arn:aws:braket:::device/quantum-simulator/amazon/sv1")

sv1_result = sv1_device.run(circuit, shots=10000).result()
sv1_counts = sv1_result.measurement_counts
print(sv1_counts)

Counter({'01100': 1599, '01101': 1128, '11100': 1049, '11101': 701, '10101': 607, '11001': 601, '10001': 565, '00101': 499, '11000': 471, '01001': 443, '10100': 442, '00100': 393, '01000': 379, '00001': 256, '10000': 243, '11010': 142, '00111': 136, '01011': 133, '10110': 116, '00000': 97})


In [28]:
print("Probability:", sv1_result.values[2])

Probability: [0.9473 0.0527]


In [29]:
purity_sv1 = sv1_result.values[2][0] - sv1_result.values[2][1]
print(purity_sv1)

0.8946000000000001


In [30]:
a_1 = (1 + np.sqrt(2 * purity_sv1 - 1)) / 2 * np.trace(sigma)
a_2 = (1 - np.sqrt(2 * purity_sv1 - 1)) / 2 * np.trace(sigma)
print('The first eigenvalue obtained by the quantum PCA using Amazon Braket SV1 is: \n', a_1)
print('The second eigenvalue obtained by the quantum PCA using Amazon Braket SV1 is: \n', a_2)

The first eigenvalue obtained by the quantum PCA using Amazon Braket SV1 is: 
 1.5842342174098423
The second eigenvalue obtained by the quantum PCA using Amazon Braket SV1 is: 
 0.09365180163777674


In [42]:
perc_sv1 = abs((a_1-1.57285742)/1.57285742)*100
print('percent error first eigenvalue: ', perc_sv1)

percent error first eigenvalue:  0.723320325490292


## Amazon Braket DM1 device

In [32]:
dm1_device = AwsDevice("arn:aws:braket:::device/quantum-simulator/amazon/dm1")

dm1_result = dm1_device.run(circuit, shots=10000).result()
dm1_counts = dm1_result.measurement_counts
print(dm1_counts)

Counter({'01100': 1583, '11100': 1110, '01101': 1022, '11101': 686, '11001': 629, '10101': 616, '10001': 551, '00101': 507, '11000': 491, '01001': 484, '10100': 467, '01000': 388, '00100': 361, '00001': 232, '10000': 207, '11010': 153, '00111': 144, '01011': 139, '10110': 135, '00000': 95})


In [33]:
print("Probability:", dm1_result.values[2])

Probability: [0.9429 0.0571]


In [34]:
purity_dm1 = dm1_result.values[2][0] - dm1_result.values[2][1]
print(purity_dm1)

0.8857999999999999


In [35]:
b_1 = (1 + np.sqrt(2 * purity_dm1 - 1)) / 2 * np.trace(sigma)
b_2 = (1 - np.sqrt(2 * purity_dm1 - 1)) / 2 * np.trace(sigma)
print('The first eigenvalue obtained by the quantum PCA using Amazon Braket DM1 is: \n', b_1)
print('The second eigenvalue obtained by the quantum PCA using Amazon Braket DM1 is: \n', b_2)

The first eigenvalue obtained by the quantum PCA using Amazon Braket DM1 is: 
 1.5758769672048174
The second eigenvalue obtained by the quantum PCA using Amazon Braket DM1 is: 
 0.10200905184280168


In [43]:
perc_dm1 = abs((b_1-1.57285742)/1.57285742)*100
print('percent error first eigenvalue: ', perc_dm1)

percent error first eigenvalue:  0.19197844422651741
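
To compare the runs side by side, the percent errors can be recomputed in one place. The sketch below only uses numbers already printed in this notebook; the dictionary keys are labels chosen here for readability.

```python
# Largest eigenvalue from the classical np.linalg.eig computation
reference = 1.57285742

# Largest eigenvalue reported by each run in this notebook
estimates = {
    'Qiskit Aer': 1.5731173711237019,
    'Braket local simulator': 1.5945508064417984,
    'Braket SV1': 1.5842342174098423,
    'Braket DM1': 1.5758769672048174,
}

percent_errors = {name: abs(value - reference) / reference * 100
                  for name, value in estimates.items()}

for name, err in percent_errors.items():
    print(f'{name}: {err:.3f}%')
```

Note that the Qiskit run used one million shots while the Braket runs used ten thousand, so part of the difference between them is plain sampling noise.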


# Conclusions

To summarize, throughout this notebook we implemented the Principal Component Analysis technique under the quantum computation paradigm. We applied PCA to the well-known problem of house pricing in the US, where our final goal was to find the eigenvalues of the covariance matrix of our data (i.e., the variances captured by the principal components). The implementation was divided into four stages: i) classical pre-processing, ii) state preparation, iii) purity calculation, iv) classical post-processing. After explaining each step with the necessary mathematical and physical details, we implemented the quantum algorithm in two ways: first, via the Qiskit provider for Amazon Braket with a simplified quantum algorithm but with some limitations, and then with the Amazon Braket SDK on the local simulator and the Amazon Braket SV1 and DM1 devices. The largest eigenvalue we obtained agrees with the one computed classically, with a percent error of about 0.02% for the Qiskit simulation, 1.38% for the local simulator, 0.72% for SV1, and 0.19% for DM1. For this particular scenario, the PCA technique was therefore correctly implemented using quantum algorithms, circuits and devices. The technology is still at an early stage, but showing that we can take these small steps brings hope and optimism for the future of quantum computing and its applications. 

# References

1. Ian T. Jolliffe and Jorge Cadima. Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374, April 2016.

2. C. He, J. Li, W. Liu, J. Peng, and Z. J. Wang. A low-complexity quantum principal component analysis algorithm. IEEE Transactions on Quantum Engineering, 3(3):1–13, 2022.

3. Lin Jie et al. An improved quantum principal component analysis algorithm based on the quantum singular threshold method. Physics Letters A, 383:2862–68, August 2019.

4. Abhijith J. et al. Quantum Algorithm Implementations for Beginners. ACM Transactions on Quantum Computing, 3(4):1–92, December 2022. https://doi.org/10.1145/3517340.

5. Roman Schmied. Quantum State Tomography of a Single Qubit: Comparison of Methods. Journal of Modern Optics, 63(18):1744–58, October 2016. https://doi.org/10.1080/09500340.2016.1142018.

6. Special thanks to the GitHub repository of Haokai Zhang, available at: https://github.com/Haokai-Zhang/ExampleQPCA/blob/master/5qubit-qPCA.ipynb
