### Ahmed Osama Abdellatif Alkadi
### Alexandria ITI

## **Hyperopt: Distributed Asynchronous Hyper-parameter Optimization**

1. **Introduction:**
- Hyperopt is a powerful Python library for hyperparameter optimization, particularly popular in the machine learning and data science communities. It offers an efficient and flexible framework for automatically tuning the hyperparameters of machine learning models, aiming to find the best configuration for a given optimization objective.

2. **Key Objectives**
- To use the "hyperopt" optimizer to tune a quantum machine learning (QML) problem based on a Support Vector Classifier (SVC).
- To examine how the chosen hyperparameters affect the test results and the accuracy.
- To gain insight into suitable ranges of the hyperparameters for SVC-based QML.

3. **Algorithms:**
- Random Search: This algorithm randomly samples hyperparameter configurations from the search space and evaluates them independently. While simple and easy to implement, random search may not be the most efficient method for finding optimal hyperparameters, especially in high-dimensional spaces.

- Tree-structured Parzen Estimator (TPE): TPE is a Bayesian optimization algorithm. Rather than modeling the objective function directly, it fits two probability densities over the hyperparameters, one for configurations that performed well and one for the rest, and proposes new configurations that are likely under the first and unlikely under the second. These densities are refined after every trial, which focuses the search on promising regions of the hyperparameter space.

- Adaptive TPE (ATPE): This variant of TPE adapts the search space dynamically during optimization based on the performance of previously evaluated configurations. It aims to allocate more samples to promising regions of the search space, potentially improving the convergence speed.
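
As a rough illustration of the simplest of these strategies, the sketch below implements random search with the standard library only. The quadratic `toy_loss`, the bounds, and the helper name `random_search` are hypothetical stand-ins and not part of this study. In Hyperopt itself, random search is selected by passing `rand.suggest` instead of `tpe.suggest` to `fmin`.

```python
import random

def toy_loss(x: float) -> float:
    """Hypothetical objective with its minimum at x = 2."""
    return (x - 2.0) ** 2

def random_search(loss, low: float, high: float, n_trials: int, seed: int = 0):
    """Sample n_trials points uniformly in [low, high] and keep the best one."""
    rng = random.Random(seed)
    best_x, best_loss = None, float("inf")
    for _ in range(n_trials):
        x = rng.uniform(low, high)   # each sample ignores all previous results
        value = loss(x)
        if value < best_loss:
            best_x, best_loss = x, value
    return best_x, best_loss

best_x, best_loss = random_search(toy_loss, low=-5.0, high=5.0, n_trials=200)
print(f"best x = {best_x:.3f}, loss = {best_loss:.5f}")
```

Because every sample is drawn without looking at earlier results, the only way to improve the outcome is to draw more samples. TPE differs precisely in that it uses the history of trials to decide where to sample next.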

4. **Other Features:**
- One of the key advantages of Hyperopt is its ability to handle both discrete and continuous hyperparameters, as well as conditional hyperparameter spaces. This flexibility allows it to efficiently explore a wide range of hyperparameter configurations, making it suitable for various machine learning tasks.


- More information is available in the Hyperopt documentation:
 https://hyperopt.github.io/hyperopt/

- How to write an objective function in "hyperopt":
https://github.com/hyperopt/hyperopt/wiki/FMin



## Parameters to optimize

In this study, the following parameters were optimized.

- rep: This parameter represents the number of repetitions for the feature map used in the quantum kernel. It determines the complexity or richness of the feature map, which can impact the expressiveness of the quantum model. Optimizing this parameter allows finding the optimal trade-off between model complexity and generalization performance.

- C: This is the regularization parameter of the Support Vector Classifier (SVC) used in the classical part of the hybrid quantum-classical model (stored in the variable `c` in the code). The strength of the regularization is inversely proportional to C: a small C tolerates more misclassified training points and helps prevent overfitting, while a large C fits the training data more closely. Optimizing this parameter helps the model generalize well to unseen data.

- entangle: This parameter specifies the type of entanglement used in the feature map of the quantum kernel. Entanglement is a fundamental property of quantum systems and can affect the quantum circuit's representation power. By optimizing this parameter, the most suitable type of entanglement for the given task can be determined, potentially improving model performance.

- shots: This parameter denotes the number of shots (measurement repetitions) for executing quantum circuits on a quantum device or simulator. More shots can lead to more accurate results but also require more computational resources. Optimizing this parameter involves finding the balance between the accuracy of quantum measurements and the computational cost, ensuring efficient utilization of resources.
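
The trade-off described for `shots` can be made concrete with a small simulation. The sketch below assumes a simple binomial model in which each kernel entry is estimated as a measurement frequency; the probability `true_p = 0.3` and both helper functions are hypothetical and serve only as an illustration. Under this assumption the standard error of the estimate shrinks as one over the square root of the number of shots.

```python
import math
import random

def standard_error(p: float, shots: int) -> float:
    """Standard error of a probability estimated from `shots` Bernoulli samples."""
    return math.sqrt(p * (1.0 - p) / shots)

def estimate_probability(p: float, shots: int, rng: random.Random) -> float:
    """Simulate `shots` measurements and return the observed frequency."""
    hits = sum(1 for _ in range(shots) if rng.random() < p)
    return hits / shots

rng = random.Random(12345)
true_p = 0.3  # hypothetical outcome probability, for illustration only
for shots in (100, 400, 1000):
    estimate = estimate_probability(true_p, shots, rng)
    error = standard_error(true_p, shots)
    print(f"shots={shots:5d}  estimate={estimate:.3f}  std. error={error:.4f}")
```

Quadrupling the number of shots only halves the statistical error, so the benefit of additional shots diminishes while the cost grows linearly.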

### Import libraries

In [131]:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import model_selection
from sklearn.svm import SVC
from qiskit.circuit.library import ZZFeatureMap
from qiskit.primitives import Sampler
from qiskit_machine_learning.kernels import FidelityQuantumKernel
from qiskit_algorithms.state_fidelities import ComputeUncompute
from qiskit_algorithms.utils import algorithm_globals
from hyperopt import hp, tpe, Trials, fmin, space_eval

seed = 12345
np.random.seed(seed)
algorithm_globals.random_seed = seed


### Generate a synthetic ad-hoc dataset with specified characteristics


- Purpose: The `ad_hoc_data` function generates synthetic datasets with specific characteristics, such as dimensionality, class separability, and sample sizes, which can be used for experimentation and testing in machine learning tasks.

- Returns:
1. train_features: Features of the training dataset.
2. train_labels: Labels of the training dataset.
3. test_features: Features of the test dataset.
4. test_labels: Labels of the test dataset.
5. adhoc_total: The sample total, returned because `include_sample_total=True`; it is not used in the rest of this study.


In [132]:
from qiskit_machine_learning.datasets import ad_hoc_data

adhoc_dimension = 3
train_features, train_labels, test_features, test_labels, adhoc_total = ad_hoc_data(
    training_size=50,                    # The number of samples to generate for the training dataset.
    test_size=20,                        # The number of samples to generate for the test dataset.
    n=adhoc_dimension,                   # The dimensionality of the features in the dataset.
    gap=0.25,                            # The gap between classes in the dataset, which controls the separability of the classes.
    plot_data=False,                     # A boolean flag to indicate whether to plot the generated dataset.
    one_hot=False,                       # A boolean flag to indicate whether to use one-hot encoding for the labels.
    include_sample_total=True,           # A boolean flag to indicate whether to also return the sample total.
)

### Implement the objective function for the optimization

In [133]:
trials_info = []  # Initialize an empty list to store information about each trial.

def objective(params: dict) -> float: 
    """
    Objective function for hyperparameter optimization.
    
    Args:
        params (dict): Dictionary containing the hyperparameters to be optimized.
        
    Returns:
        float: Negative of the accuracy obtained by the model.
    """
    
    # Extract hyperparameters from the parameter dictionary.
    rep = int(params['rep'])  # Number of repetitions for the feature map.
    c = params['C']           # Regularization parameter for the SVC.
    entangle = params['entanglement']  # Type of entanglement in the feature map.
    shots = int(params['shots'])   # Number of shots for executing quantum circuits (hp.quniform returns a float).
    
    # Calculate the number of qubits based on the dimensionality of the training features.
    num_qubits = train_features.shape[1]
    
    # Create a feature map using the specified hyperparameters.
    feature_map = ZZFeatureMap(feature_dimension=num_qubits, reps=rep, entanglement=entangle)
    
    # Create a sampler with the specified number of shots.
    sampler = Sampler(options={"shots": shots})
    
    # Initialize a fidelity object for computing quantum state fidelities.
    fidelity = ComputeUncompute(sampler=sampler)
    
    # Initialize a quantum kernel based on the fidelity and feature map.
    adhoc_kernel = FidelityQuantumKernel(fidelity=fidelity, feature_map=feature_map)
    
    # Compute the Gram matrix for the training data using the quantum kernel.
    gram_train = adhoc_kernel.evaluate(x_vec=train_features, y_vec=train_features)
    
    # Initialize a support vector classifier with a precomputed kernel.
    classifier_obj = SVC(kernel="precomputed", shrinking=True, C=c, random_state=seed)
    
    # Perform cross-validation to estimate the accuracy of the classifier.
    score = model_selection.cross_val_score(classifier_obj, gram_train, train_labels, n_jobs=10, cv=5)
    
    # Compute the mean accuracy obtained from cross-validation.
    accuracy = score.mean()
    
    # Store information about the current trial in the trials_info list.
    trial_info = {'rep': rep, 'C': c, 'entanglement': entangle, 'shots': shots, 'accuracy': accuracy}
    trials_info.append(trial_info)
    
    # Return the negative of the accuracy (as hyperopt minimizes the objective function).
    return -accuracy


### Hyperparameter search space
- This cell defines the search space for hyperparameter optimization using the hyperopt library. Each hyperparameter is defined with a specific distribution or set of choices.

In [134]:
space = {
    'rep': hp.quniform('rep', 1, 5, 1),                              # Feature-map repetitions: quantized uniform in [1, 5] with step 1 (returned as a float).
    'C': hp.quniform('C', .1, 1000, 10),                             # SVC regularization parameter: quantized uniform, multiples of 10 (rounding can yield 0.0, which SVC rejects).
    'entanglement': hp.choice('entanglement', ["full", "linear"]),   # Type of entanglement in the feature map (categorical choice).
    'shots': hp.quniform('shots', 100, 1000, 10)                     # Number of shots: quantized uniform in [100, 1000] with step 10 (returned as a float).
}
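
The Hyperopt documentation describes `hp.quniform(label, low, high, q)` as returning a value like `round(uniform(low, high) / q) * q`. The sketch below mimics that rule with the standard library only, to show which values the `rep` and `C` entries above can take. The helper `quniform_sample` is a hypothetical re-implementation for illustration and not Hyperopt's own code.

```python
import random

def quniform_sample(low: float, high: float, q: float, rng: random.Random) -> float:
    """Mimic the documented hp.quniform rule: round(uniform(low, high) / q) * q."""
    return float(round(rng.uniform(low, high) / q) * q)

rng = random.Random(0)
c_samples = [quniform_sample(0.1, 1000, 10, rng) for _ in range(10_000)]
rep_samples = [quniform_sample(1, 5, 1, rng) for _ in range(10_000)]

print("distinct rep values:", sorted(set(rep_samples)))
print("smallest C sampled :", min(c_samples))
print("C == 0 occurrences :", c_samples.count(0.0))
```

Two consequences follow from the rounding. The end values of a quantized range (here `rep` equal to 1 or 5) are drawn about half as often as the interior values, and any draw of `C` below 5 is rounded down to 0.0, which `SVC` does not accept because it requires a strictly positive `C`. Raising the lower bound to 10, or sampling `C` with `hp.loguniform`, would be possible guards.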

### Performs the optimization 
- This cell performs hyperparameter optimization using the hyperopt library. It uses the Trials object to keep track of the trials during optimization.

In [135]:
trials = Trials()              # Stores information about each trial.
best = fmin(fn=objective,      # Objective function to be minimized.
            space=space,       # Search space defined earlier.
            algo=tpe.suggest,  # Optimization algorithm: Tree-structured Parzen Estimator (TPE).
            max_evals=20,      # Maximum number of evaluations (trials).
            trials=trials,     # Trials object that records every evaluation.
            )

100%|██████████| 20/20 [11:40<00:00, 35.02s/trial, best loss: -0.9199999999999999]


### Show the best results

In [152]:
# Print the best hyperparameters found by the optimization process.
print("The best hyperparameters are:", space_eval(space, best))

The best hyperparameters are: {'C': 550.0, 'entanglement': 'full', 'rep': 2.0, 'shots': 600.0}
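
For an `hp.choice` entry, `fmin` returns the index of the selected option rather than the option itself, which is why `space_eval` is used to print readable values. The sketch below shows the same decoding step by hand. The helper `decode_best` is hypothetical, and `raw_best` is reconstructed from the printed output above, where 'full' is option 0 of `["full", "linear"]`.

```python
def decode_best(raw_best: dict, choices: dict) -> dict:
    """Replace hp.choice indices in `raw_best` with the options they point to."""
    decoded = dict(raw_best)  # copy, so the raw result stays untouched
    for name, options in choices.items():
        decoded[name] = options[int(raw_best[name])]
    return decoded

# Illustrative raw result: the entanglement entry is an index, not a string.
raw_best = {'C': 550.0, 'entanglement': 0, 'rep': 2.0, 'shots': 600.0}
choices = {'entanglement': ["full", "linear"]}

decoded = decode_best(raw_best, choices)
print(decoded)
```

The numeric entries are returned as floats, so values such as `rep` and `shots` still need an explicit `int(...)` cast before they are passed to code that expects integers.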


### Run the optimized case

In [149]:
# Convert 'rep' to integer from the best parameters obtained from hyperparameter optimization.
rep = int(best['rep'])

# Determine the number of qubits from the shape of the training features.
num_qubits = train_features.shape[1]

# Retrieve the best value for the regularization parameter 'C'.
c = best['C']

# Translate the encoded entanglement choice to its corresponding string value.
entangle = ["full", "linear"][best['entanglement']]

# Construct the feature map using the specified parameters.
feature_map = ZZFeatureMap(feature_dimension=num_qubits, reps=rep, entanglement=entangle)

# Set up the sampler with the optimal number of shots (cast to int, since hp.quniform returns a float).
shots = int(best['shots'])
sampler = Sampler(options={"shots": shots})

# Initialize the fidelity computation method with the chosen sampler.
fidelity = ComputeUncompute(sampler=sampler)

# Create the fidelity-based quantum kernel using the feature map and fidelity method.
adhoc_kernel = FidelityQuantumKernel(fidelity=fidelity, feature_map=feature_map)

# Evaluate the quantum kernel on the training dataset.
gram_train = adhoc_kernel.evaluate(x_vec=train_features, y_vec=train_features)

# Evaluate the quantum kernel on the test dataset.
gram_test = adhoc_kernel.evaluate(x_vec=test_features, y_vec=train_features)

# Initialize the Support Vector Classifier (SVC) with the precomputed kernel.
classifier_obj = SVC(kernel="precomputed", shrinking=True, C=c, random_state=seed)

# Fit the SVC model to the training data.
classifier_obj.fit(gram_train, train_labels)

# Calculate and print the training accuracy of the SVC model.
print("Training accuracy:", classifier_obj.score(gram_train, train_labels))

# Calculate and print the test accuracy of the SVC model.
print("Test accuracy:", classifier_obj.score(gram_test, test_labels))


Training accuracy: 1.0
Test accuracy: 1.0
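
With `kernel="precomputed"`, `SVC.fit` expects a square Gram matrix of shape (n_train, n_train), while `score` and `predict` expect a matrix of shape (n_test, n_train), with one row per test point and one column per training point. The sketch below illustrates these shapes with plain Python lists. The classical RBF `toy_kernel` is only a stand-in for the quantum kernel, and the data points are hypothetical.

```python
import math

def toy_kernel(a, b, gamma: float = 1.0) -> float:
    """Classical RBF kernel, used here only as a stand-in for the quantum kernel."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def gram_matrix(x_vec, y_vec):
    """Return a len(x_vec) x len(y_vec) matrix of kernel values."""
    return [[toy_kernel(x, y) for y in y_vec] for x in x_vec]

train = [(0.0, 0.1), (0.9, 1.0), (0.2, 0.0)]   # hypothetical training points
test = [(0.1, 0.1), (1.0, 0.9)]                # hypothetical test points

gram_train = gram_matrix(train, train)   # square: train x train
gram_test = gram_matrix(test, train)     # rectangular: test x train

print(len(gram_train), "x", len(gram_train[0]))
print(len(gram_test), "x", len(gram_test[0]))
```

This is why the test kernel above is evaluated with `x_vec=test_features` and `y_vec=train_features`: swapping the two arguments would produce a matrix with the wrong orientation for the fitted classifier.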


### Define the draw_parallel_coordinates function
- Hyperparameter Optimization VS Accuracy

In [160]:
def draw_parallel_coordinates(trials_info: list):
    """
    Draw a parallel coordinates plot based on the provided trials information.

    Args:
        trials_info (list): List of dictionaries containing information about each trial.

    Returns:
        None
    """

    import plotly.express as px

    # Convert the trials information into a DataFrame
    df = pd.DataFrame(trials_info)

    # Map 'entanglement' to numerical values (0 for 'linear', 1 for 'full')
    df['Entanglement'] = df['entanglement'].map({'linear': 0, 'full': 1})

    # Reorder the columns for better visualization
    df = df[["rep", "C", "shots", "Entanglement", "accuracy"]]

    # Plotting parallel coordinates
    fig = px.parallel_coordinates(df, color="accuracy", labels={"Entanglement": "Full Entanglement"})

    # Add annotation for 'linear Entanglement'
    fig.add_annotation(
        text="linear Entanglement",  # Annotation text
        xref="paper", yref="paper",  # Reference point for the text position
        x=0.83, y=-0.15,  # Position of the annotation
        showarrow=False,  # Hide the arrow
        font=dict(
            family="Arial",
            size=12,
            color="black"
        ))

    # Update layout with a title
    fig.update_layout(title="Hyperparameter Optimization VS Accuracy")

    # Show the plot
    fig.show()


### Visualize and trace the accuracy as the parameter values change

In [167]:
# Draw the parallel coordinates plot based on the trials information.
draw_parallel_coordinates(trials_info)

### Graph outcomes:
- The entanglement type strongly affects the accuracy of the model, as shown by the dominance of "Full entanglement" among the high-accuracy trials.

- The high-accuracy trials have relatively high values of "shots", from 350 to 850, which suggests that repeating the measurement more often yields higher accuracy.
- Surprisingly, "C" has no significant effect on the accuracy: the sampled values of C are spread over a wide range, from 50 to 750, and appear in both high- and low-accuracy trials.
- The repetition parameter "rep" is inversely related to the accuracy: most of the high-accuracy trials occurred with rep from 2 to 3, which indicates that the lighter models were more accurate.
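
Observations like these are read off the plot by eye; one way to cross-check them is to average the accuracy per value of a parameter. The sketch below does this with the standard library for a list in the same format as `trials_info`. The helper `mean_accuracy_by` and the four example trials are hypothetical and are not the data of this study.

```python
from collections import defaultdict
from statistics import mean

def mean_accuracy_by(trials_info: list, key: str) -> dict:
    """Average the 'accuracy' field of the trials, grouped by the value of `key`."""
    groups = defaultdict(list)
    for trial in trials_info:
        groups[trial[key]].append(trial['accuracy'])
    return {value: mean(scores) for value, scores in groups.items()}

# Hypothetical trials in the same format as trials_info (not the study's data).
example_trials = [
    {'rep': 2, 'C': 550.0, 'entanglement': 'full', 'shots': 600.0, 'accuracy': 0.90},
    {'rep': 4, 'C': 120.0, 'entanglement': 'linear', 'shots': 300.0, 'accuracy': 0.60},
    {'rep': 2, 'C': 700.0, 'entanglement': 'full', 'shots': 800.0, 'accuracy': 0.80},
    {'rep': 3, 'C': 50.0, 'entanglement': 'linear', 'shots': 450.0, 'accuracy': 0.70},
]

print(mean_accuracy_by(example_trials, 'entanglement'))
print(mean_accuracy_by(example_trials, 'rep'))
```

With only 20 trials, such averages rest on a handful of points per group, so they indicate trends rather than establish them.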


### Visualize and trace the history of the accuracy over the trials and mark the best values of the run

In [171]:
def Trails_history(trials_info: list, n: int):
    """
    Plot the objective value (accuracy) of each trial and the best value so far against the trial number.

    Args:
        trials_info (list): List of dictionaries containing information about each trial.
        n (int): Number of trials; must equal len(trials_info).

    Returns:
    None
    """
    import plotly.graph_objects as go
    import numpy as np

    # Extract the accuracy of every trial.
    accuracies = [trial['accuracy'] for trial in trials_info]

    # Running maximum: the best accuracy seen up to and including each trial.
    best_values = np.maximum.accumulate(accuracies)

    trial_numbers = np.arange(1, n + 1)

    # Create traces for both lines
    trace1 = go.Scatter(x=trial_numbers, y=best_values, mode='lines', name='best value')
    trace2 = go.Scatter(x=trial_numbers, y=accuracies, mode='markers', name='objective value')

    # Create the layout
    layout = go.Layout(title='Objective value vs. Trials',
                       xaxis=dict(title='Trials'),
                       yaxis=dict(title='Objective value (Accuracy)'))

    # Create the figure and add both traces
    fig = go.Figure(data=[trace1, trace2], layout=layout)

    # Show the plot
    fig.show()


In [172]:
# Draw the history plot based on the trials information and the number of trials.
Trails_history(trials_info, 20)

### Graph outcomes:
- The red dots indicate the trials carried out to optimize this model.

- The blue line indicates the highest accuracy found so far in the run.
- The optimizer found the optimum values of the model after only 8 trials, which demonstrates the strength of the "hyperopt" optimizer and the TPE algorithm.
- Although the optimum was found quickly, most of the remaining trials reached lower accuracy.
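
The "best value so far" line and the trial at which the optimum first appears can also be computed directly from the list of accuracies. The sketch below uses only the standard library. Both helper functions and the example accuracies are hypothetical; for the actual run, the accuracies could be recovered as `[-loss for loss in trials.losses()]`, since the objective returns the negative accuracy.

```python
from itertools import accumulate

def running_best(accuracies: list) -> list:
    """Best accuracy seen up to and including each trial."""
    return list(accumulate(accuracies, max))

def first_trial_reaching_best(accuracies: list) -> int:
    """1-based number of the first trial that attains the overall best accuracy."""
    return accuracies.index(max(accuracies)) + 1

# Hypothetical accuracies, for illustration only (not the values from this run).
accuracies = [0.62, 0.70, 0.66, 0.88, 0.74, 0.92, 0.80]

print(running_best(accuracies))
print(first_trial_reaching_best(accuracies))
```

Trials that fall below the running best after the optimum has been found are expected to some degree, because the optimizer keeps exploring other regions of the search space instead of only resampling the best configuration.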