## QKMTuner: A Hyperparameter Optimization Pipeline

In the previous section we observed that many hyperparameters are involved when building a quantum kernel method. Let's differentiate between quantum and classical hyperparameters.

"Quantum hyperparameters":
- encoding circuit in general
    - num_qubits
    - num_layers
- feature_range for data preprocessing (or bandwidth)
- $k$-RDM and/or choice of measurement operator in the PQK approach

"Classical hyperparameters":
- QKRR: Regularization parameter $\alpha$
- QSVR: Regularization parameters $C$ and $\epsilon$
- outer_kernel hyperparameters, e.g., $\gamma$ in the Gaussian kernel and $\nu$ in the Matérn kernel

As in classical ML, finding suitable hyperparameters for an ML model on a given dataset is a challenging and, in general, computationally expensive task. The same holds for QML. To address this problem, we developed <b>QKMTuner</b>, a hyperparameter optimization pipeline for quantum kernel methods (QSVC, QSVR and QKRR) that is based on sQUlearn and on Optuna, a hyperparameter optimization framework. Optuna features an imperative, *define-by-run* style user API and thus allows for dynamically constructing search spaces for the hyperparameters. In QKMTuner we particularly use
    
- sQUlearn's and Optuna's compatibility with scikit-learn
- Optuna's efficient state-of-the-art optimization algorithms for sampling hyperparameters
- Optuna's quick visualization tools

A schematic overview of the QKMTuner implementation and its functionality is given in Fig. 6.

<center>

<img src="./schematic.png" alt="QKMTuner" width=1000>


*Figure 6: Schematic illustration of our QKMTuner hyperparameter optimization pipeline*
</center>

### Finding the best quantum kernel model

To find the best hyperparameters for a quantum kernel model for a given encoding circuit, QKMTuner provides the evaluate_best_model() method, which searches for

- optimal num_qubits and num_layers (within given boundaries for the corresponding search space)
- optimal configuration settings (e.g., feature_range) of a given preprocessing method for rescaling features
- optimal kernel algorithm hyperparameters (e.g., $C$ and $\epsilon$ for QSVR)
- optimal kernel hyperparameters (e.g., $\gamma$ for the Gaussian kernel)

#### Imports and data loading

In [2]:
import os
import sys
import joblib
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from optuna.samplers import TPESampler

# necessary sQUlearn imports
from squlearn import Executor
from squlearn.encoding_circuit import YZ_CX_EncodingCircuit

sys.path.append("./../src/")
from qkm_tuner import QKMTuner

In [3]:
file_friedman = os.path.join("./", "make_friedman1_dataset_num_features5.csv")

df = pd.read_csv(file_friedman)
x = df.iloc[:,:-1].to_numpy()
y = df.iloc[:,-1].to_numpy()

# split into training and test data
xtrain, xtest, ytrain, ytest = train_test_split(
    x,
    y,
    test_size=0.2,
    random_state=42
)

#### How to set up the method

In this simulation we use sQUlearn's default PQK within a QKRR. For demonstration purposes, we restrict ourselves to evaluating the best model for the YZ_CX_EncodingCircuit only. The following hyperparameters are optimized:

- min_range and max_range, which define the feature_range in MinMaxScaler(feature_range=(min_range, max_range)) used for feature rescaling, as well as whether clip=True or clip=False should be used therein
- num_qubits of YZ_CX_EncodingCircuit
- num_layers of YZ_CX_EncodingCircuit
- gamma of outer_kernel="gaussian"
- alpha, i.e., the QKRR regularization parameter
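
The effect of feature_range and clip can be seen in a small standalone sketch. The values of `min_range` and `max_range` below are hypothetical placeholders (in the pipeline they would be sampled by Optuna), and the toy arrays are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical values; in a hyperparameter search these would be sampled
min_range, max_range = -0.6, 0.85

x_train = np.array([[0.0], [0.5], [1.0]])
x_test = np.array([[-0.2], [1.3]])  # lies outside the training range

scaled_test = {}
for clip in (False, True):
    scaler = MinMaxScaler(feature_range=(min_range, max_range), clip=clip)
    x_train_scaled = scaler.fit_transform(x_train)
    # With clip=True, transformed test values are cut off at the feature_range
    scaled_test[clip] = scaler.transform(x_test)
    print(f"clip={clip}: train={x_train_scaled.ravel()}, test={scaled_test[clip].ravel()}")
```

The training data always fill the interval [min_range, max_range] exactly. Unseen data can leave this interval unless clip=True is set, which is why both settings are worth including in the search.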

Furthermore, one can specify:

- the boundaries of Optuna's hyperparameter search space (num_qubits_max and num_layers_max)
- different specifications of the PQK (measurement and outer_kernel)
- optuna_sampler and optuna_pruner

In [4]:
# Define QKMTuner instance
qkm_tuner_inst = QKMTuner(
    xtrain=xtrain,
    xtest=xtest,
    ytrain=ytrain,
    ytest=ytest,
    scaler_method=MinMaxScaler(),
    optimize_scaler=True,
    label_scaler=MinMaxScaler(),
    quantum_kernel="PQK",
    quantum_kernel_method="QKRR",
    executor=Executor("pennylane"),
    parameter_seed=0
)


# Define encoding circuits for which one wants to compute the best models
encoding_circuits = [YZ_CX_EncodingCircuit]

# Call QKMTuner's evaluate_best_model() method and define parameters as desired
qkm_tuner_inst.evaluate_best_model(
    encoding_circuits=encoding_circuits,
    measurement = "XYZ",
    outer_kernel = "gaussian",
    num_qubits_max = 10,
    num_layers_max = 8,
    optuna_sampler = TPESampler(seed=0),
    optuna_pruner = None, 
    n_trials = 1, # one has to define more, of course
    outdir = './results_demo_evaluate_best_model/',
    file_identifier = 'friedman_num_features5'
)

            Thus, make sure to pip install the same versions or consider using RDBs to 
            save/reload a study accross different Optuna versions, cf.
            https://optuna.readthedocs.io/en/stable/tutorial/20_recipes/001_rdb.html#rdb
[I 2024-09-06 14:12:26,071] A new study created in memory with name: optuna_study_evaluate_best_model_QKRR_PQK_YZ_CX_EncodingCircuit_friedman_num_features5
[I 2024-09-06 14:12:44,834] Trial 0 finished with value: -99.40174397464125 and parameters: {'num_qubits': 10, 'num_layers': 4, 'min_range': -0.6239778297350675, 'max_range': 0.8559005023838371, 'alpha': 1.2129910377408882e-05, 'gamma': 7.505241622349544}. Best is trial 0 with value: -99.40174397464125.


The evaluate_best_model() method of QKMTuner automatically saves the following intermediate and final simulation results:

- For each EncodingCircuit in the list of encoding_circuits, the Optuna study object is saved to a *.pkl file
- For each EncodingCircuit in the list of encoding_circuits, the optimal circuit from the Optuna study is saved to a *.pkl file
- The final summary is saved as a *.pkl file, which contains, for each EncodingCircuit:
    
    - best_params (i.e. num_qubits, num_layers, etc.) determined within optuna optimization
    - best_trial (optuna object to resume study)
    - best_obj_val
    - best_feature_range
    - ktrain
    - ktesttrain
    - ypred_train
    - ypred_test
    - mse_train
    - rmse_train
    - mae_train
    - r2_train
    - mse_test
    - rmse_test
    - mae_test 
    - r2_test

For classification tasks, we use accuracy_score, roc_auc_score and f1_score instead.
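
For reference, the four regression metrics listed above can be computed with scikit-learn as in the following sketch. The helper name `regression_metrics` and the toy values are assumptions for illustration; this is not taken from the QKMTuner source.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for a set of predictions."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mse": float(mse),
        "rmse": float(np.sqrt(mse)),  # RMSE is the square root of the MSE
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "r2": float(r2_score(y_true, y_pred)),
    }


metrics = regression_metrics([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
print(metrics)
```

Note that R2 can become negative on test data when a model predicts worse than the mean of the labels.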

These results are organized in two directories, which were also created by the previous simulation. Let us take a closer look at their content:

In [5]:
print("'/cache_optuna_studies_evaluate_best_model/' contains: ", os.listdir("./results_demo_evaluate_best_model/cache_optuna_studies_evaluate_best_model") )
print("'/results_evaluate_best_model/' contains: ", os.listdir("results_demo_evaluate_best_model/results_evaluate_best_model/"))

'/cache_optuna_studies_evaluate_best_model/' contains:  ['optimal_circuit_from_optuna_study_evaluate_best_model_QKRR_PQK_YZ_CX_EncodingCircuit_friedman_num_features5.pkl', 'optuna_study_evaluate_best_model_QKRR_PQK_YZ_CX_EncodingCircuit_friedman_num_features5.pkl']
'/results_evaluate_best_model/' contains:  ['results_evaluate_best_model_study_friedman_num_features5_QKRR_PQK_summary.pkl', 'results_evaluate_best_model_study_friedman_num_features5_QKRR_PQK_YZ_CX_EncodingCircuit.pkl']


This means that *.pkl files are created in a cache directory, containing the optimal encoding circuit as well as the complete Optuna study itself, from which the best model configurations can be extracted. Moreover, the Optuna study pickle can be used to resume the study and start a new study from there.

#### Analyzing the results

We can investigate the importance of hyperparameters by loading the *.pkl file containing the corresponding Optuna study object and subsequently using Optuna's visualization tools.

Recall that within this simulation we optimized the following hyperparameters:

- min_range and max_range, which define the feature_range in MinMaxScaler(feature_range=(min_range, max_range)) used for feature rescaling, as well as whether clip=True or clip=False should be used therein
- num_qubits of YZ_CX_EncodingCircuit
- num_layers of YZ_CX_EncodingCircuit
- gamma of outer_kernel="gaussian"
- alpha, i.e., the QKRR regularization parameter

In [6]:
file_study = "./results_demo_evaluate_best_model/cache_optuna_studies_evaluate_best_model/optuna_study_evaluate_best_model_QKRR_PQK_YZ_CX_EncodingCircuit_friedman_num_features5.pkl"
study = joblib.load(file_study)

In [8]:
"""
Visualize hyperparameter importances. Apparently, this only works for n_trials > 1
"""
#from optuna.visualization import plot_param_importances
#plot_param_importances(study)

'\nVisualize hyperparameter importances. Apparently, this only works for n_trials > 1\n'

To build up a large database of quantum kernel experiments, all final results are saved as *.pkl files.

In [9]:
df = pd.read_pickle("./results_demo_evaluate_best_model/results_evaluate_best_model/results_evaluate_best_model_study_friedman_num_features5_QKRR_PQK_YZ_CX_EncodingCircuit.pkl")
df

| Unnamed: 0 | best_params | best_trial | best_obj_val | best_feature_range | ktrain | ktesttrain | ypred_train | ypred_test | mse_train | rmse_train | mae_train | r2_train | mse_test | rmse_test | mae_test | r2_test |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YZ_CX_EncodingCircuit | {'num_qubits': 10, 'num_layers': 4, 'min_range... | FrozenTrial(number=0, state=TrialState.COMPLET... | -99.401744 | (-0.6239778297350675, 0.8559005023838371) | [[1.0, 1.197097162078793e-07, 0.00030380654500... | [[1.050376185762618e-08, 2.6310618894618183e-0... | [0.4180767742051808, 0.618264002151646, 0.3605... | [0.22387087462243443, 0.1233089725828915, 0.02... | 2.66462e-11 | 5e-06 | 5e-06 | 1.0 | 0.184601 | 0.429652 | 0.394143 | -2.901853 |


### Hyperparameter optimization within grid search

To systematically analyze and compare quantum kernel methods and to provide general design advice, an extensive quantum kernel study is ideally required, which considers many different datasets and encoding circuits. To this end, QKMTuner provides the evaluate_grid() method, which

- for each circuit within a list of encoding_circuits sets up a predefined grid of different num_qubits and num_layers configurations and, for each configuration, performs a hyperparameter search to determine

    - hyperparameters of the quantum kernel method
    - hyperparameters of PQKs (e.g., $\gamma$ for the Gaussian kernel)
    - the optimal configuration (e.g., feature_range) of a given feature preprocessing routine (e.g., MinMaxScaler)

The evaluate_grid() method automatically saves the following intermediate and final results:

- Optuna study object as a *.pkl file for each encoding circuit and each (num_qubits, num_layers) configuration
- Best-trial Optuna object as a *.pkl file for each encoding circuit within the given (num_qubits, num_layers) grid, corresponding to the setting with the highest R2 score on the test data
- The corresponding sQUlearn EncodingCircuit object as a *.pkl file
- Kernel matrices for each encoding circuit and each (num_qubits, num_layers) configuration after Optuna optimization in a *.csv file
- A final summary saved as a *.csv file, which contains, for each EncodingCircuit:

    - best_param_mat
    - best_trial_mat
    - best_objective_value_mat
    - feature_range_mat
    - mse_train_mat 
    - rmse_train_mat
    - mae_train_mat
    - r2_train_mat
    - mse_test_mat
    - rmse_test_mat
    - mae_test_mat
    - r2_test_mat

For classification tasks, we use accuracy_score, roc_auc_score and f1_score instead.

Below we provide the code for using the evaluate_grid() method of the QKMTuner pipeline but, for the sake of runtime, do not execute it in full. Instead, we focus on how to use and analyze the generated results.

##### Imports and data loading

In [11]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

from squlearn import Executor
from squlearn.encoding_circuit import YZ_CX_EncodingCircuit

For demonstration purposes, in the following we restrict ourselves to the YZ_CX_EncodingCircuit only. The same holds for the code snippet below, which is given for PQK.

First, we load the data.

In [12]:
# again, load the friedman dataset
file_friedman = os.path.join("./", "make_friedman1_dataset_num_features5.csv")

df = pd.read_csv(file_friedman)
x = df.iloc[:,:-1].to_numpy()
y = df.iloc[:,-1].to_numpy()

# split into training and test data
xtrain, xtest, ytrain, ytest = train_test_split(
    x,
    y,
    test_size=0.2,
    random_state=42
)

##### How to set up the method

QKMTuner is initialized such that

- a given feature preprocessing (MinMaxScaler()) is used and, since optimize_scaler=True, its feature_range is optimized
- the labels are additionally rescaled using MinMaxScaler()

One can also use different scalers for both features and labels and choose whether the feature preprocessing method itself is optimized (cf. the optimize_scaler argument). Moreover, one can change the quantum_kernel and quantum_kernel_method parameters.

To finally call the evaluate_grid() method, one has to specify the encoding_circuits upon which the grid is built. This grid is defined by the layers_list and qubits_list arguments. Beyond that, QKMTuner allows for different PQK settings (i.e., different measurement and outer_kernel attributes).

For FQKs, one merely has to change the quantum_kernel parameter in the code example below.

In [13]:
# Define QKMTuner instance
qkm_tuner_inst = QKMTuner(
    xtrain=xtrain,
    xtest=xtest,
    ytrain=ytrain,
    ytest=ytest,
    scaler_method=MinMaxScaler(),
    optimize_scaler=True,
    label_scaler=MinMaxScaler(),
    quantum_kernel="PQK",
    quantum_kernel_method="QKRR",
    executor=Executor("pennylane"),
    parameter_seed=0
)

# Define the grid search ranges
encoding_circuits = [YZ_CX_EncodingCircuit]
layer_list = [1,2]
qubit_list = [5]
# set up the evaluate_grid() method
qkm_tuner_inst.evaluate_grid(
    encoding_circuits=encoding_circuits,
    measurement="XYZ",
    outer_kernel="gaussian",
    qubits_list=qubit_list,
    layers_list=layer_list,
    optuna_sampler=TPESampler(seed=0),
    n_trials=2, # need to specify more -> 100 = default
    outdir="./results_grid/",
    file_identifier="friedman_num_features5_grid"
)

            Thus, make sure to pip install the same versions or consider using RDBs to 
            save/reload a study accross different Optuna versions, cf.
            https://optuna.readthedocs.io/en/stable/tutorial/20_recipes/001_rdb.html#rdb
[I 2024-09-06 14:26:14,326] A new study created in memory with name: optuna_study_evaluate_grid_QKRR_PQK_YZ_CX_EncodingCircuit_num_qubits5_num_layers1_friedman_num_features5_grid
[I 2024-09-06 14:26:15,183] Trial 0 finished with value: -8.073707242448101 and parameters: {'min_range': -0.7087220907304183, 'max_range': 1.123416829660566, 'alpha': 0.0017106474441850336, 'gamma': 1.8590843630169627}. Best is trial 0 with value: -8.073707242448101.
[I 2024-09-06 14:26:16,181] Trial 1 finished with value: -140.84854803747305 and parameters: {'min_range': -0.9053209241643161, 'max_range': 1.014568100303551, 'alpha': 1.7825697616116485e-05, 'gamma': 224.2012371372442}. Best is trial 0 with value: -8.073707242448101.
[I 2024-09-06 14:26:16,415] A new 

The evaluate_grid() method of QKMTuner automatically saves many intermediate as well as final simulation results. This is useful for various post-processing analyses as well as for subsequent studies. Moreover, saving Optuna study objects allows a study to be resumed. The following shows the format of the final simulation results and how to further process and analyze them.

In [14]:

df_qkrr_pqk = pd.read_pickle("results_grid/results_evaluate_grid/results_evaluate_grid_study_friedman_num_features5_grid_QKRR_PQK_YZ_CX_EncodingCircuit.pkl")
df_qkrr_pqk

| Unnamed: 0 | best_param_mat | best_trial_mat | best_objective_value_mat | feature_range_mat | mse_train_mat | rmse_train_mat | mae_train_mat | r2_train_mat | mse_test_mat | rmse_test_mat | mae_test_mat | r2_test_mat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YZ_CX_EncodingCircuit | [[{'min_range': -0.7087220907304183, 'max_rang... | [[FrozenTrial(number=0, state=TrialState.COMPL... | [[-8.073707242448101, -10.916303245171914]] | [[(-0.7087220907304183, 1.123416829660566), (-... | [[3.150145156551879e-05, 0.015110023147217985]] | [[0.005612615394405606, 0.12292283411644064]] | [[0.003787573159020415, 0.0982739360758632]] | [[0.9993538211514481, 0.6900530968055647]] | [[0.01156683946464205, 0.017197051324471826]] | [[0.10754924204587427, 0.13113752828413316]] | [[0.07963710725459643, 0.10164948422741427]] | [[0.7555154474994237, 0.6365114766012664]] |
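
As a small post-processing sketch, the best grid configuration can be read off from one of the `*_mat` entries. The `r2_test_mat` values below are copied from the output above. That rows correspond to `qubit_list` and columns to `layer_list` is an assumption, although it is consistent with the shape (1, 2) of the matrix and with the order of the studies in the log.

```python
import numpy as np

# Grid used in the evaluate_grid() call above
qubit_list = [5]
layer_list = [1, 2]

# r2_test_mat entry copied from the results table above
# (assumption: rows index qubit_list, columns index layer_list)
r2_test_mat = np.array([[0.7555154474994237, 0.6365114766012664]])

# Position of the highest test R2 score within the grid
i, j = np.unravel_index(np.argmax(r2_test_mat), r2_test_mat.shape)
best_config = (qubit_list[i], layer_list[j])

print(f"best (num_qubits, num_layers) = {best_config}, r2_test = {r2_test_mat[i, j]:.4f}")
```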


### Comments

At this point, let us remark that QKMTuner also allows for performing KTA within the pipeline, using either any or only the optimal cached result.

Moreover, we note that in this demonstrator we did not make use of the QKMTuner pipeline's full functionality. As such, the following is missing:

- Investigation and optimization of preprocessing routines (MinMaxScaler(), MaxAbsScaler(), StandardScaler(), RobustScaler(), PowerTransformer() and Bandwidth-Scaling implemented)
- Systematic comparison of QSVR vs. QKRR
- Application to classification problems (QSVC implemented)
- Use of more encoding circuits to enlarge data foundation and improve statistics
- Shot-based simulations