# Basic usage

The following example shows how the Falcondale SDK can be used to benchmark some well-known quantum machine learning (QML) approaches, starting from nothing more than a Pandas DataFrame. Once the SDK is installed, we start by loading a sample dataset.

In [18]:
from sklearn import datasets

# import some data to play with
X, y = datasets.load_breast_cancer(return_X_y=True, as_frame=True)
X

,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,...,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension
0,17.99,10.38,122.80,1001.0,0.11840,0.27760,0.30010,0.14710,0.2419,0.07871,...,25.380,17.33,184.60,2019.0,0.16220,0.66560,0.7119,0.2654,0.4601,0.11890
1,20.57,17.77,132.90,1326.0,0.08474,0.07864,0.08690,0.07017,0.1812,0.05667,...,24.990,23.41,158.80,1956.0,0.12380,0.18660,0.2416,0.1860,0.2750,0.08902
2,19.69,21.25,130.00,1203.0,0.10960,0.15990,0.19740,0.12790,0.2069,0.05999,...,23.570,25.53,152.50,1709.0,0.14440,0.42450,0.4504,0.2430,0.3613,0.08758
3,11.42,20.38,77.58,386.1,0.14250,0.28390,0.24140,0.10520,0.2597,0.09744,...,14.910,26.50,98.87,567.7,0.20980,0.86630,0.6869,0.2575,0.6638,0.17300
4,20.29,14.34,135.10,1297.0,0.10030,0.13280,0.19800,0.10430,0.1809,0.05883,...,22.540,16.67,152.20,1575.0,0.13740,0.20500,0.4000,0.1625,0.2364,0.07678
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
564,21.56,22.39,142.00,1479.0,0.11100,0.11590,0.24390,0.13890,0.1726,0.05623,...,25.450,26.40,166.10,2027.0,0.14100,0.21130,0.4107,0.2216,0.2060,0.07115
565,20.13,28.25,131.20,1261.0,0.09780,0.10340,0.14400,0.09791,0.1752,0.05533,...,23.690,38.25,155.00,1731.0,0.11660,0.19220,0.3215,0.1628,0.2572,0.06637
566,16.60,28.08,108.30,858.1,0.08455,0.10230,0.09251,0.05302,0.1590,0.05648,...,18.980,34.12,126.70,1124.0,0.11390,0.30940,0.3403,0.1418,0.2218,0.07820
567,20.60,29.33,140.10,1265.0,0.11780,0.27700,0.35140,0.15200,0.2397,0.07016,...,25.740,39.42,184.60,1821.0,0.16500,0.86810,0.9387,0.2650,0.4087,0.12400


We can check that it is a binary classification task.

In [2]:
y.unique()

array([0, 1])

Since all columns are numerical, this is a simple case to explore with the Falcondale SDK. We will walk through its main functionalities using this binary classification use case.

In [7]:
from falcondale import Project

# Work on a copy so that X itself is left without the target column
dataset = X.copy()
dataset["target"] = y

myproject = Project(dataset, target="target")

In [4]:
myproject.list_options()

Welcome to Falcondale SDK! you will be able to:
 - Perform Quantum Feature Selection using Quantum Annealing or QAOA
 - Train quantum-enhanced models such as QSVC or QNN
 - Perform Quantum Clustering by pure quantum and quantum-inspired techniques


We might want to preprocess the data, for example to remove duplicates or to make the dataset scale invariant. Another important feature is the ability to reduce dimensionality. Let's start with the default options.

In [9]:
myproject.preprocess()
myproject.show_features()

,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,...,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension
0,0.521037,0.022658,0.545989,0.363733,0.593753,0.792037,0.703140,0.731113,0.686364,0.605518,...,0.620776,0.141525,0.668310,0.450698,0.601136,0.619292,0.568610,0.912027,0.598462,0.418864
1,0.643144,0.272574,0.615783,0.501591,0.289880,0.181768,0.203608,0.348757,0.379798,0.141323,...,0.606901,0.303571,0.539818,0.435214,0.347553,0.154563,0.192971,0.639175,0.233590,0.222878
2,0.601496,0.390260,0.595743,0.449417,0.514309,0.431017,0.462512,0.635686,0.509596,0.211247,...,0.556386,0.360075,0.508442,0.374508,0.483590,0.385375,0.359744,0.835052,0.403706,0.213433
3,0.210090,0.360839,0.233501,0.102906,0.811321,0.811361,0.565604,0.522863,0.776263,1.000000,...,0.248310,0.385928,0.241347,0.094008,0.915472,0.814012,0.548642,0.884880,1.000000,0.773711
4,0.629893,0.156578,0.630986,0.489290,0.430351,0.347893,0.463918,0.518390,0.378283,0.186816,...,0.519744,0.123934,0.506948,0.341575,0.437364,0.172415,0.319489,0.558419,0.157500,0.142595
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
564,0.690000,0.428813,0.678668,0.566490,0.526948,0.296055,0.571462,0.690358,0.336364,0.132056,...,0.623266,0.383262,0.576174,0.452664,0.461137,0.178527,0.328035,0.761512,0.097575,0.105667
565,0.622320,0.626987,0.604036,0.474019,0.407782,0.257714,0.337395,0.486630,0.349495,0.113100,...,0.560655,0.699094,0.520892,0.379915,0.300007,0.159997,0.256789,0.559450,0.198502,0.074315
566,0.455251,0.621238,0.445788,0.303118,0.288165,0.254340,0.216753,0.263519,0.267677,0.137321,...,0.393099,0.589019,0.379949,0.230731,0.282177,0.273705,0.271805,0.487285,0.128721,0.151909
567,0.644564,0.663510,0.665538,0.475716,0.588336,0.790197,0.823336,0.755467,0.675253,0.425442,...,0.633582,0.730277,0.668310,0.402035,0.619626,0.815758,0.749760,0.910653,0.497142,0.452315
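The preprocessed values shown above all lie between 0 and 1, which is what min-max scaling produces. The sketch below illustrates that transformation on a single column in plain Python. It is only an illustration: the helper name `min_max_scale` is hypothetical, and which scaler `preprocess()` applies by default is an assumption rather than something documented here.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# First five "mean radius" values from the raw table shown earlier.
mean_radius = [17.99, 20.57, 19.69, 11.42, 20.29]
scaled = min_max_scale(mean_radius)
print(scaled)
```

Because this sketch scales over five rows only, its numbers differ from the table above, which was computed over the full dataset.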


One of the first things we could do is reduce the size of our dataset. We will run Quantum Feature Selection and see which features it keeps.

In [10]:
features = myproject.feature_selection(max_cols=10)
features

Using Simulated Bifurcation for large datasets


['mean radius',
 'mean perimeter',
 'mean area',
 'mean compactness',
 'mean concavity',
 'mean concave points',
 'compactness error',
 'worst radius',
 'worst perimeter',
 'worst area',
 'worst concavity',
 'worst concave points']
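Feature selection for quantum annealers or QAOA is commonly written as a quadratic binary optimization problem: reward features that are relevant to the target and penalize pairs of features that are redundant with each other. The sketch below shows that cost function on toy numbers and solves it by brute force. It is an assumption about the general idea, not Falcondale's implementation: the function `select_features`, the `relevance` and `redundancy` inputs, and the weight `alpha` are all hypothetical, and the subset size is enforced by enumeration rather than by a penalty term.

```python
from itertools import combinations


def select_features(relevance, redundancy, k, alpha=0.5):
    """Return the k-feature subset with the lowest QUBO-style cost (brute force)."""
    best_subset, best_cost = None, float("inf")
    for subset in combinations(range(len(relevance)), k):
        # Linear term: total relevance of the chosen features.
        gain = sum(relevance[i] for i in subset)
        # Quadratic term: redundancy between every chosen pair.
        overlap = sum(redundancy[i][j] for i, j in combinations(subset, 2))
        cost = -alpha * gain + (1 - alpha) * overlap
        if cost < best_cost:
            best_subset, best_cost = subset, cost
    return list(best_subset), best_cost


# Toy inputs (made up for illustration): features 0 and 1 are both
# relevant but nearly duplicate each other.
relevance = [0.9, 0.85, 0.3, 0.6]
redundancy = [
    [0.0, 0.95, 0.1, 0.2],
    [0.95, 0.0, 0.1, 0.2],
    [0.1, 0.1, 0.0, 0.1],
    [0.2, 0.2, 0.1, 0.0],
]

selected, cost = select_features(relevance, redundancy, k=2)
print(selected, cost)
```

With these toy numbers the two most relevant features are not picked together, because their mutual redundancy outweighs the extra relevance. Brute force is only feasible for a handful of features, which is why heuristic or quantum solvers are used on larger problems.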

Feature selection also modifies the project's internal dataset, so only the selected features are used from here on. Because the target column has already been identified, simple classification tasks can now be run directly.

In [11]:
model = myproject.evaluate("qsvc")
model.print_report()

              precision    recall  f1-score   support

           0       0.78      0.95      0.86        60
           1       0.97      0.86      0.91       111

    accuracy                           0.89       171
   macro avg       0.88      0.90      0.88       171
weighted avg       0.90      0.89      0.89       171
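To make the report easier to read, the sketch below recomputes its figures from a confusion matrix. The four counts are hypothetical: they were chosen to be consistent with the rounded values and supports in the report above, but the actual counts are not shown here. The helper `per_class_metrics` is likewise only illustrative.

```python
def per_class_metrics(tp, fp, fn):
    """Precision, recall and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical confusion counts (assumed, consistent with the rounded report):
# rows are true classes, columns are predicted classes.
true0_pred0, true0_pred1 = 57, 3    # support of class 0: 60
true1_pred0, true1_pred1 = 16, 95   # support of class 1: 111

p0, r0, f1_0 = per_class_metrics(tp=true0_pred0, fp=true1_pred0, fn=true0_pred1)
p1, r1, f1_1 = per_class_metrics(tp=true1_pred1, fp=true0_pred1, fn=true1_pred0)

total = true0_pred0 + true0_pred1 + true1_pred0 + true1_pred1
accuracy = (true0_pred0 + true1_pred1) / total
# Balanced accuracy is the unweighted mean of the per-class recalls.
balanced_accuracy = (r0 + r1) / 2

print(round(p0, 2), round(r0, 2), round(f1_0, 2))
print(round(p1, 2), round(r1, 2), round(f1_1, 2))
print(round(accuracy, 2), round(balanced_accuracy, 2))
```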



In [12]:
model.list_metrics()

['sensitivity',
 'recall',
 'precision',
 'f1',
 'accuracy',
 'balanced accuracy',
 'auc']

In [13]:
model.metric("auc")

0.9521021021021021
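The AUC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it that way on a toy example. The function `pairwise_auc` and its inputs are hypothetical, and how the SDK computes its `"auc"` metric internally is not shown here.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and classifier scores, made up for illustration.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = pairwise_auc(labels, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```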

Models can be saved and loaded again later. Make sure the data you pass to the prediction function has the same dataset structure as the data the model was trained on.

In [14]:
myproject.save_model(model, "my_qsvc.model")

In [15]:
model = myproject.load_model("my_qsvc.model")
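One way to guard against a structure mismatch before predicting is to compare column names. The sketch below assumes that "same dataset structure" means the same columns in the same order, which is an interpretation rather than documented SDK behaviour. The helper `compare_columns` is hypothetical.

```python
def compare_columns(expected, actual):
    """Report columns that are missing, unexpected, or out of order."""
    missing = [c for c in expected if c not in actual]
    extra = [c for c in actual if c not in expected]
    same_order = not missing and not extra and list(expected) == list(actual)
    return missing, extra, same_order


# Illustrative column lists: the new data lacks one training column
# and still carries the target column.
training_columns = ["mean radius", "mean area", "worst area"]
new_columns = ["mean radius", "worst area", "target"]

missing, extra, same_order = compare_columns(training_columns, new_columns)
print(missing, extra, same_order)
```

With a Pandas DataFrame, the `actual` argument could be `list(df.columns)`.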

In [20]:
from sklearn.metrics import classification_report

y_pred = model.predict(X)

print(classification_report(y, y_pred))

              precision    recall  f1-score   support

           0       0.93      0.99      0.96       212
           1       0.99      0.96      0.97       357

    accuracy                           0.97       569
   macro avg       0.96      0.97      0.96       569
weighted avg       0.97      0.97      0.97       569

