# Automated Machine Learning from Scratch

Group 18 Members:

- Clara Pichler, 11917694
- Hannah Knapp, 11901857 
- Sibel Toprakkiran, 09426341

### Overview

1. Our Implementation

2. Data Sets

3. Evaluation
- Iris Dataset
- Congressional Voting Dataset
- Airfoil Noise Dataset
- Abalone Dataset


The comparison with TPOT and auto-sklearn will be done in the files `tpot.ipynb` and `auto_sklearn.ipynb`.

In [1]:
from sklearn import datasets
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, mean_squared_error, classification_report, mean_absolute_error, r2_score
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVC
import time
from sklearn.dummy import DummyClassifier, DummyRegressor
from sklearn.experimental import enable_iterative_imputer 
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.preprocessing import LabelEncoder

In [2]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

## Our Implementation

Our automated machine learning class is deliberately simple: it covers only model selection and hyperparameter tuning, both driven by simulated annealing.

Simulated annealing is a probabilistic optimization algorithm. It is used to find an __approximate global optimum__ for a given function in a large search space. The key features are:

1. __Temperature__: Starts at a high value and decreases over time (cooling schedule). At higher temperatures, the algorithm is more likely to accept worse solutions, allowing it to explore more of the solution space.
2. __Acceptance of Worse Solutions__: To avoid getting stuck in local optima, worse solutions are accepted with a probability that shrinks the worse they are and the lower the current temperature is.
3. __Cooling Schedule__: The cooling rate determines how quickly the temperature decreases.
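
The cooling schedule can be illustrated with a minimal sketch. It assumes geometric cooling, where the temperature is multiplied by the cooling rate after every step; the helper name `temperature_schedule` is hypothetical and not part of our class.

```python
def temperature_schedule(initial_temp, cooling_rate, n_steps):
    """Return the temperatures T_k = initial_temp * cooling_rate**k for k = 0..n_steps-1."""
    temperatures = []
    temperature = initial_temp
    for _ in range(n_steps):
        temperatures.append(temperature)
        temperature *= cooling_rate  # geometric cooling: shrink by a constant factor
    return temperatures


# Illustrative values matching the defaults of the class below.
schedule = temperature_schedule(initial_temp=100, cooling_rate=0.99, n_steps=4)
print([round(t, 3) for t in schedule])
```

With a cooling rate close to 1 the temperature falls slowly, so the search keeps accepting worse solutions for many steps before it settles.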



In [3]:
class AutoML_18:
    def __init__(self, initial_temp=100, cooling_rate=0.99, max_iterations=100, min_training_time=3600, classifier = True):
        self.initial_temp = initial_temp
        self.cooling_rate = cooling_rate
        self.max_iterations = max_iterations
        self.min_training_time = min_training_time
        self.classifier = classifier

        self.algorithms_classifier = {
            "MLPClassifier": {
                "class": MLPClassifier,
                "parameters": ["max_iter", "activation", "solver", "alpha"],
                "values": [[1000, 2000, 3000], ['relu', 'tanh', 'logistic'], ['adam', 'sgd'], [0.0001, 0.001, 0.01]]
            },
            "RandomForestClassifier": {
                "class": RandomForestClassifier,
                "parameters": ["n_estimators", "max_depth", "min_samples_split", "max_features", "criterion"],
                "values": [[10, 25, 50, 100, 150], [5, 10, 15], [2, 3, 3, 4], ['sqrt', 'log2', None], ['gini', 'log_loss', 'entropy']]
            },
            "KNClassifier": {
                "class": KNeighborsClassifier,
                "parameters": ["n_neighbors", "weights", "leaf_size"],
                "values": [[3, 5, 7, 9, 11], ['uniform', 'distance'], [10, 20, 30, 40, 50]]
            },
            "SVM": {
                "class": SVC,
                "parameters": ["C", "kernel", "gamma"],
                "values": [[1, 10, 100, 1000], ['linear', 'poly', 'rbf', 'sigmoid'], ['scale', 'auto']]
            },
            "AdaBoostClassifier": {
                "class": AdaBoostClassifier,
                "parameters": ["n_estimators", "learning_rate"],
                "values": [[10, 25, 50, 100, 150], [0.1, 0.5, 1, 1.5, 2]]
            },
        }

        self.algorithms_regressor = {
            'RandomForestRegressor': {
                'class': RandomForestRegressor,
                'parameters': ["n_estimators", "max_depth", "min_samples_split", "max_features", "criterion"],
                'values': [[10, 25, 50, 100, 150], [5, 10, 15], [2, 3, 3, 4], ['sqrt', 'log2', None], ['squared_error', 'absolute_error']]
            },
            'GradientBoostingRegressor': {
                'class': GradientBoostingRegressor,
                'parameters': ["n_estimators", "learning_rate", "loss"],
                'values': [[10, 25, 50, 100, 150], [0.1, 0.5, 1, 1.5, 2], ['squared_error', 'absolute_error', 'huber']] 
            },
            'LinearRegression': {
                'class': LinearRegression,
                'parameters': ['n_jobs'],
                'values': [[3, 5, 7, 9]]
            },
            'LassoRegression': {
                'class': Lasso,
                'parameters': ["alpha", "max_iter"],
                'values': [[0.1, 0.5, 1, 1.5, 2], [1000, 2000, 3000, 4000, 5000]]
            },
            'KNRegressor': {
                'class': KNeighborsRegressor,
                'parameters': ["n_neighbors", "weights", "algorithm", "leaf_size"],
                'values': [[3, 5, 7, 9, 11], ['uniform', 'distance'], ['auto', 'ball_tree', 'kd_tree', 'brute'], [10, 20, 30, 40, 50]]
            },
        }
        
        self.best_solution = None
        self.best_score = 0
        self.model = None
        
    def eval(self, model, X_train, y_train, X_val, y_val):
        model.fit(X_train, y_train) 
        predictions = model.predict(X_val)  
        if self.classifier:
            score = accuracy_score(y_val, predictions) 
        else:
            score = -mean_squared_error(y_val, predictions)
      
        return score
    
    def generate_neighborhood(self, current_solution):
        algorithm_dict = self.algorithms_classifier if self.classifier else self.algorithms_regressor
        algorithm_name = current_solution[0]
        algorithm_info = algorithm_dict[algorithm_name]
        
        new_solution = current_solution[:]
        
        if not algorithm_info['parameters']:
            new_solution[0] = np.random.choice(list(algorithm_dict.keys()))
            return new_solution
        
        while len(new_solution) < len(algorithm_info['parameters']) + 1:
            param_index = len(new_solution) - 1
            new_solution.append(np.random.choice(algorithm_info['values'][param_index]))

        param_idx = np.random.randint(1, len(new_solution)) 
        param_values = algorithm_info['values'][param_idx - 1]  

        new_solution[param_idx] = np.random.choice(param_values)

        # 10% probability that a new algorithm is chosen
        if np.random.rand() < 0.1:
            new_solution[0] = np.random.choice(list(algorithm_dict.keys()))
            new_algorithm_info = algorithm_dict[new_solution[0]]
            new_solution = [new_solution[0]] + [
                np.random.choice(values) for values in new_algorithm_info["values"]
            ]

        return new_solution




    def create_model(self, solution):
        algorithm_name = solution[0]
        hyperparameters = solution[1:]

        # The baseline models are not part of the search dictionaries,
        # so they must be handled before the dictionary lookup.
        if algorithm_name == 'DummyClassifier':
            return DummyClassifier(strategy='most_frequent')
        if algorithm_name == 'DummyRegressor':
            return DummyRegressor(strategy='mean')

        algorithm_dict = self.algorithms_classifier if self.classifier else self.algorithms_regressor
        if algorithm_name not in algorithm_dict:
            print(f"Algorithm {algorithm_name} not found in dictionary!")
            return None

        algorithm_info = algorithm_dict[algorithm_name]
        # Pair each hyperparameter name with the value stored at the same
        # position in the solution and pass them as keyword arguments.
        params = dict(zip(algorithm_info['parameters'], hyperparameters))
        return algorithm_info['class'](**params)
        

    def fit(self, X_train, y_train, X_val, y_val):
        self.X_train = X_train
        self.y_train = y_train
        self.X_val = X_val
        self.y_val = y_val
        self.simulated_annealing()

    def predict(self, X):
        if self.model is None:
            raise ValueError("The model has not been fit yet. Please call the fit method first.")
        return self.model.predict(X)
    
    def simulated_annealing(self):
        start_time = time.time()  
        
        zero_r_model = DummyClassifier(strategy='most_frequent') if self.classifier else DummyRegressor(strategy='mean')
        current_solution = ['DummyClassifier'] if self.classifier else ['DummyRegressor']
        algorithms_dict = self.algorithms_classifier if self.classifier else self.algorithms_regressor

        current_score = self.eval(zero_r_model, self.X_train, self.y_train, self.X_val, self.y_val)
        best_solution = current_solution
        best_score = current_score
    
        temperature = self.initial_temp
    
        while time.time() - start_time < self.min_training_time:
            for _ in range(self.max_iterations):  # run exactly max_iterations steps per temperature
                    
                if current_solution[0] in ['DummyClassifier', 'DummyRegressor']:
                    new_solution = self.generate_neighborhood(['KNClassifier' if self.classifier else 'KNRegressor'])
                else:
                    new_solution = self.generate_neighborhood(current_solution)

                new_model = self.create_model(new_solution)
                new_score = self.eval(new_model, self.X_train, self.y_train, self.X_val, self.y_val)

                if new_score > current_score or np.random.rand() < np.exp((new_score - current_score) / max(temperature, 1e-3)):
                    current_solution = new_solution
                    current_score = new_score
                    if new_score > best_score:
                        best_solution = new_solution
                        best_score = new_score
    
            temperature *= self.cooling_rate
    
        self.best_solution = best_solution
        self.best_score = best_score
        self.model = self.create_model(best_solution)
        self.model.fit(self.X_train, self.y_train)
        
        algorithm_name = best_solution[0]
        hyperparameters = best_solution[1:]
        if hyperparameters:
            param_str = ', '.join(
                f"{param}={round(value, 4) if isinstance(value, float) else value}"
                for param, value in zip(algorithms_dict[algorithm_name]['parameters'], hyperparameters)
            )
            formatted_solution = f"{algorithm_name}({param_str})"
        else:
            formatted_solution = algorithm_name  

        print(f"The best model is {formatted_solution} with a score of {round(best_score, 4)}")
        

The most important function is `simulated_annealing`, which runs the main loop and calls the other methods. The function `generate_neighborhood` creates a variation of the `current_solution` by resampling one randomly chosen hyperparameter. With a small probability (10%), an entirely new algorithm with freshly sampled hyperparameters is chosen instead. This combines exploration within the local neighborhood of a solution (small hyperparameter changes) with occasional jumps to entirely different models, which reduces the risk of getting stuck in a local optimum and gives the search a chance of reaching the global optimum.
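
The neighborhood move described above can be sketched on a toy search space. This is a simplified illustration, not the method of our class: `SEARCH_SPACE`, `random_solution` and `neighbor` are hypothetical names, the search space is reduced to two algorithms, and the sketch either switches the algorithm or resamples one hyperparameter.

```python
import random

# Hypothetical, reduced search space used only for this illustration.
SEARCH_SPACE = {
    "KNClassifier": {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]},
    "SVM": {"C": [1, 10, 100], "kernel": ["linear", "rbf"]},
}


def random_solution(algorithm, rng):
    """Draw every hyperparameter of `algorithm` at random."""
    space = SEARCH_SPACE[algorithm]
    return (algorithm, {name: rng.choice(values) for name, values in space.items()})


def neighbor(solution, rng, switch_prob=0.1):
    """Return a neighboring solution.

    With probability `switch_prob` a new algorithm is drawn with fresh
    hyperparameters; otherwise one hyperparameter is resampled.
    """
    algorithm, params = solution
    if rng.random() < switch_prob:
        return random_solution(rng.choice(list(SEARCH_SPACE)), rng)
    new_params = dict(params)  # copy so the current solution stays untouched
    name = rng.choice(list(new_params))
    new_params[name] = rng.choice(SEARCH_SPACE[algorithm][name])
    return (algorithm, new_params)


rng = random.Random(0)
current = random_solution("KNClassifier", rng)
candidate = neighbor(current, rng)
print(current, "->", candidate)
```

Copying the parameters before changing them matters: the current solution must stay intact in case the candidate is rejected.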

If the new solution has a better score, it is always accepted. If it has a worse score, it is accepted with probability `exp((new_score - current_score) / temperature)`. Since we maximize the score, the difference is negative for a worse solution, so the probability shrinks as the score gap grows and as the temperature falls.
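
The following sketch evaluates this acceptance rule for a few temperatures. The function name `acceptance_probability` and the example scores are illustrative assumptions; the formula and the lower bound of `1e-3` on the temperature follow the condition used in `simulated_annealing`.

```python
import math


def acceptance_probability(current_score, new_score, temperature):
    """Acceptance probability for a maximization problem."""
    if new_score > current_score:
        return 1.0  # improvements are always accepted
    # Worse (or equal) solutions: the exponent is <= 0, so the result lies in (0, 1].
    return math.exp((new_score - current_score) / max(temperature, 1e-3))


# Hypothetical scores: the candidate is 0.05 worse than the current solution.
for temperature in (100, 1, 0.01):
    p = acceptance_probability(current_score=0.95, new_score=0.90, temperature=temperature)
    print(f"T={temperature}: p={p:.4f}")
```

Accuracy differences are at most 1, so at a temperature of 100 a candidate that is 0.05 worse is still accepted with a probability above 0.99. The search therefore behaves almost like a random walk until the temperature has dropped to the scale of the score differences.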

## Data sets

In [4]:
iris = datasets.load_iris()
iris_data = pd.DataFrame(data= np.c_[iris['data'], iris['target']], columns= iris['feature_names'] + ['target'])
iris_data['target'] = iris_data['target'].map({0: 'setosa', 1: 'versicolor', 2: 'virginica'})

df_voting = pd.read_csv('data/CongressionalVotingID.shuf.lrn.csv')

df_airfoil = pd.read_csv("data/airfoil_noise_data.csv")

url='./data/abalone.csv'
column_names = ["Sex", "Length", "Diameter", "Height", "Whole_weight", "Shucked_weight", "Viscera_weight", "Shell_weight", "Rings"]
df_abalone = pd.read_csv(url, header=0, names=column_names)
df_abalone = df_abalone[df_abalone.Height != 0]
df_abalone = pd.get_dummies(df_abalone, columns=['Sex'], drop_first=False)


### Pre-processing

In [5]:
pd.set_option('future.no_silent_downcasting', True)
df_voting = df_voting.replace({"democrat": 0,"republican": 1,"n": 0,"y": 1,"unknown": np.nan})
df_voting = df_voting.drop(columns=['ID'])

imp = IterativeImputer(max_iter=10, random_state=0)
df_voting = pd.DataFrame(imp.fit_transform(df_voting), columns=df_voting.columns)



### Train-validation-test split

In [7]:
X_iris = iris_data.drop(['target'], axis=1)
y_iris = iris_data['target']

X_train_iris, X_temp, y_train_iris, y_temp = train_test_split(X_iris, y_iris, test_size=0.6, random_state=42)
X_val_iris, X_test_iris, y_val_iris, y_test_iris = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

In [8]:
X_voting = df_voting.drop(['class'], axis=1)
y_voting = df_voting['class']

X_train_voting, X_temp, y_train_voting, y_temp = train_test_split(X_voting, y_voting, test_size=0.6, random_state=42)
X_val_voting, X_test_voting, y_val_voting, y_test_voting = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

In [9]:
X_airfoil = df_airfoil.drop(['y'], axis=1)
y_airfoil = df_airfoil['y']

X_train_airfoil, X_temp, y_train_airfoil, y_temp = train_test_split(X_airfoil, y_airfoil, test_size=0.6, random_state=42)
X_val_airfoil, X_test_airfoil, y_val_airfoil, y_test_airfoil = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

In [12]:
X_abalone_reg = df_abalone.drop(['Rings'], axis=1)
y_abalone_reg = df_abalone['Rings']

X_train_abalone_reg, X_temp_reg, y_train_abalone_reg, y_temp_reg = train_test_split(X_abalone_reg, y_abalone_reg, test_size=0.6, random_state=42)
X_val_abalone_reg, X_test_abalone_reg, y_val_abalone_reg, y_test_abalone_reg = train_test_split(X_temp_reg, y_temp_reg, test_size=0.5, random_state=42)

## Evaluation

### Iris

In [13]:
automl = AutoML_18(min_training_time=3600, max_iterations=50)

print("Fitting the AutoML algorithm")
automl.fit(X_train_iris, y_train_iris, X_val_iris, y_val_iris)

print("\nEvaluating on the test data")
predictions = automl.predict(X_test_iris)

test_accuracy = accuracy_score(y_test_iris, predictions)
print(f"Test Accuracy: {test_accuracy:.4f}")
print("\nClassification Report:")
print(classification_report(y_test_iris, predictions))

Fitting the AutoML algorithm
The best model is KNClassifier(n_neighbors=9, weights=distance, leaf_size=20) with a score of 0.9778

Evaluating on the test data
Test Accuracy: 0.9778

Classification Report:
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        18
  versicolor       1.00      0.91      0.95        11
   virginica       0.94      1.00      0.97        16

    accuracy                           0.98        45
   macro avg       0.98      0.97      0.97        45
weighted avg       0.98      0.98      0.98        45





### Congressional Voting

In [None]:
automl = AutoML_18(min_training_time=3600)

print("Fitting the AutoML algorithm")
automl.fit(X_train_voting, y_train_voting, X_val_voting, y_val_voting)

print("\nEvaluating on the test data")
predictions = automl.predict(X_test_voting)

test_accuracy = accuracy_score(y_test_voting, predictions)
print(f"Test Accuracy: {test_accuracy:.4f}")
print("\nClassification Report:")
print(classification_report(y_test_voting, predictions))

Fitting the AutoML algorithm
Iteration 50, Temperature 100.000, Best Evaluation 0.95385
Iteration 50, Temperature 99.000, Best Evaluation 1.00000
Iteration 50, Temperature 98.010, Best Evaluation 1.00000
...
Iteration 50, Temperature 6.902, Best Evaluation 1.00000




Iteration 50, Temperature 6.833, Best Evaluation 1.00000




Iteration 50, Temperature 6.764, Best Evaluation 1.00000




Iteration 50, Temperature 6.697, Best Evaluation 1.00000
Iteration 50, Temperature 6.630, Best Evaluation 1.00000




Iteration 50, Temperature 6.564, Best Evaluation 1.00000




Iteration 50, Temperature 6.498, Best Evaluation 1.00000
Iteration 50, Temperature 6.433, Best Evaluation 1.00000
Iteration 50, Temperature 6.369, Best Evaluation 1.00000




Iteration 50, Temperature 6.305, Best Evaluation 1.00000
Iteration 50, Temperature 6.242, Best Evaluation 1.00000




Iteration 50, Temperature 6.179, Best Evaluation 1.00000
Iteration 50, Temperature 6.118, Best Evaluation 1.00000
Iteration 50, Temperature 6.056, Best Evaluation 1.00000




Iteration 50, Temperature 5.996, Best Evaluation 1.00000




Iteration 50, Temperature 5.936, Best Evaluation 1.00000




Iteration 50, Temperature 5.877, Best Evaluation 1.00000




Iteration 50, Temperature 5.818, Best Evaluation 1.00000




Iteration 50, Temperature 5.760, Best Evaluation 1.00000
Iteration 50, Temperature 5.702, Best Evaluation 1.00000
Iteration 50, Temperature 5.645, Best Evaluation 1.00000




Iteration 50, Temperature 5.589, Best Evaluation 1.00000




Iteration 50, Temperature 5.533, Best Evaluation 1.00000




Iteration 50, Temperature 5.477, Best Evaluation 1.00000
Iteration 50, Temperature 5.423, Best Evaluation 1.00000




Iteration 50, Temperature 5.368, Best Evaluation 1.00000




Iteration 50, Temperature 5.315, Best Evaluation 1.00000




Iteration 50, Temperature 5.262, Best Evaluation 1.00000




Iteration 50, Temperature 5.209, Best Evaluation 1.00000




Iteration 50, Temperature 5.157, Best Evaluation 1.00000




Iteration 50, Temperature 5.105, Best Evaluation 1.00000




Iteration 50, Temperature 5.054, Best Evaluation 1.00000
Iteration 50, Temperature 5.004, Best Evaluation 1.00000
Iteration 50, Temperature 4.954, Best Evaluation 1.00000
Iteration 50, Temperature 4.904, Best Evaluation 1.00000
Iteration 50, Temperature 4.855, Best Evaluation 1.00000
Iteration 50, Temperature 4.806, Best Evaluation 1.00000




Iteration 50, Temperature 4.758, Best Evaluation 1.00000




Iteration 50, Temperature 4.711, Best Evaluation 1.00000




Iteration 50, Temperature 4.664, Best Evaluation 1.00000




Iteration 50, Temperature 4.617, Best Evaluation 1.00000




Iteration 50, Temperature 4.571, Best Evaluation 1.00000




Iteration 50, Temperature 4.525, Best Evaluation 1.00000




Iteration 50, Temperature 4.480, Best Evaluation 1.00000
Iteration 50, Temperature 4.435, Best Evaluation 1.00000




Iteration 50, Temperature 4.391, Best Evaluation 1.00000




Iteration 50, Temperature 4.347, Best Evaluation 1.00000




Iteration 50, Temperature 4.303, Best Evaluation 1.00000
Iteration 50, Temperature 4.260, Best Evaluation 1.00000




Iteration 50, Temperature 4.218, Best Evaluation 1.00000




Iteration 50, Temperature 4.176, Best Evaluation 1.00000




Iteration 50, Temperature 4.134, Best Evaluation 1.00000




Iteration 50, Temperature 4.093, Best Evaluation 1.00000




Iteration 50, Temperature 4.052, Best Evaluation 1.00000
Iteration 50, Temperature 4.011, Best Evaluation 1.00000




Iteration 50, Temperature 3.971, Best Evaluation 1.00000




Iteration 50, Temperature 3.931, Best Evaluation 1.00000
Iteration 50, Temperature 3.892, Best Evaluation 1.00000
Iteration 50, Temperature 3.853, Best Evaluation 1.00000




Iteration 50, Temperature 3.815, Best Evaluation 1.00000




Iteration 50, Temperature 3.776, Best Evaluation 1.00000
Iteration 50, Temperature 3.739, Best Evaluation 1.00000




Iteration 50, Temperature 3.701, Best Evaluation 1.00000
Iteration 50, Temperature 3.664, Best Evaluation 1.00000
Iteration 50, Temperature 3.628, Best Evaluation 1.00000
Iteration 50, Temperature 3.591, Best Evaluation 1.00000
Iteration 50, Temperature 3.555, Best Evaluation 1.00000




Iteration 50, Temperature 3.520, Best Evaluation 1.00000




Iteration 50, Temperature 3.485, Best Evaluation 1.00000




Iteration 50, Temperature 3.450, Best Evaluation 1.00000




Iteration 50, Temperature 3.415, Best Evaluation 1.00000




Iteration 50, Temperature 3.381, Best Evaluation 1.00000
Iteration 50, Temperature 3.347, Best Evaluation 1.00000




Iteration 50, Temperature 3.314, Best Evaluation 1.00000




Iteration 50, Temperature 3.281, Best Evaluation 1.00000




Iteration 50, Temperature 3.248, Best Evaluation 1.00000
Iteration 50, Temperature 3.215, Best Evaluation 1.00000
Iteration 50, Temperature 3.183, Best Evaluation 1.00000




Iteration 50, Temperature 3.151, Best Evaluation 1.00000
Iteration 50, Temperature 3.120, Best Evaluation 1.00000




Iteration 50, Temperature 3.089, Best Evaluation 1.00000




Iteration 50, Temperature 3.058, Best Evaluation 1.00000




Iteration 50, Temperature 3.027, Best Evaluation 1.00000
Iteration 50, Temperature 2.997, Best Evaluation 1.00000




Iteration 50, Temperature 2.967, Best Evaluation 1.00000




Iteration 50, Temperature 2.937, Best Evaluation 1.00000




Iteration 50, Temperature 2.908, Best Evaluation 1.00000




Iteration 50, Temperature 2.879, Best Evaluation 1.00000
Iteration 50, Temperature 2.850, Best Evaluation 1.00000
Iteration 50, Temperature 2.822, Best Evaluation 1.00000
Iteration 50, Temperature 2.793, Best Evaluation 1.00000




Iteration 50, Temperature 2.765, Best Evaluation 1.00000




Iteration 50, Temperature 2.738, Best Evaluation 1.00000




Iteration 50, Temperature 2.710, Best Evaluation 1.00000




Iteration 50, Temperature 2.683, Best Evaluation 1.00000




Iteration 50, Temperature 2.656, Best Evaluation 1.00000




Iteration 50, Temperature 2.630, Best Evaluation 1.00000
Iteration 50, Temperature 2.604, Best Evaluation 1.00000




Iteration 50, Temperature 2.578, Best Evaluation 1.00000




Iteration 50, Temperature 2.552, Best Evaluation 1.00000
Iteration 50, Temperature 2.526, Best Evaluation 1.00000




Iteration 50, Temperature 2.501, Best Evaluation 1.00000
Iteration 50, Temperature 2.476, Best Evaluation 1.00000
Iteration 50, Temperature 2.451, Best Evaluation 1.00000
Iteration 50, Temperature 2.427, Best Evaluation 1.00000




Iteration 50, Temperature 2.402, Best Evaluation 1.00000




Iteration 50, Temperature 2.378, Best Evaluation 1.00000




Iteration 50, Temperature 2.355, Best Evaluation 1.00000




Iteration 50, Temperature 2.331, Best Evaluation 1.00000
Iteration 50, Temperature 2.308, Best Evaluation 1.00000




Iteration 50, Temperature 2.285, Best Evaluation 1.00000




Iteration 50, Temperature 2.262, Best Evaluation 1.00000




Iteration 50, Temperature 2.239, Best Evaluation 1.00000




Iteration 50, Temperature 2.217, Best Evaluation 1.00000




Iteration 50, Temperature 2.195, Best Evaluation 1.00000




Iteration 50, Temperature 2.173, Best Evaluation 1.00000
Iteration 50, Temperature 2.151, Best Evaluation 1.00000
Iteration 50, Temperature 2.130, Best Evaluation 1.00000
Iteration 50, Temperature 2.108, Best Evaluation 1.00000




Iteration 50, Temperature 2.087, Best Evaluation 1.00000




Iteration 50, Temperature 2.066, Best Evaluation 1.00000
Iteration 50, Temperature 2.046, Best Evaluation 1.00000




Iteration 50, Temperature 2.025, Best Evaluation 1.00000
Iteration 50, Temperature 2.005, Best Evaluation 1.00000




Iteration 50, Temperature 1.985, Best Evaluation 1.00000




Iteration 50, Temperature 1.965, Best Evaluation 1.00000




Iteration 50, Temperature 1.945, Best Evaluation 1.00000
The best model is MLPClassifier(max_iter=1000, activation=logistic, solver=adam, alpha=0.0001) with a score of 1.0

Evaluating on the test data
Test Accuracy: 0.9545

Classification Report:
              precision    recall  f1-score   support

         0.0       0.97      0.95      0.96        40
         1.0       0.93      0.96      0.94        26

    accuracy                           0.95        66
   macro avg       0.95      0.96      0.95        66
weighted avg       0.96      0.95      0.95        66



### Airfoil

In [13]:
# Airfoil is a regression task, so classifier=False
automl = AutoML_18(min_training_time=3600, max_iterations=10, classifier=False)

print("Fitting the AutoML algorithm")
automl.fit(X_train_airfoil, y_train_airfoil, X_val_airfoil, y_val_airfoil)

print("\nEvaluating on the test data")
predictions = automl.predict(X_test_airfoil)

# Error metrics on the held-out test set
test_mse = mean_squared_error(y_test_airfoil, predictions)
test_rmse = np.sqrt(test_mse)  # RMSE is in the same units as the target
test_mae = mean_absolute_error(y_test_airfoil, predictions)
test_r2 = r2_score(y_test_airfoil, predictions)

print(f"Test MSE: {test_mse:.4f}")
print(f"Test RMSE: {test_rmse:.4f}")
print(f"Test MAE: {test_mae:.4f}")
print(f"Test R^2: {test_r2:.4f}")

Fitting the AutoML algorithm
The best model is RandomForestRegressor(n_estimators=25, max_depth=15, min_samples_split=2, max_features=None, criterion=absolute_error) with a score of -4.8266

Evaluating on the test data
Test MSE: 5.1866
Test RMSE: 2.2774
Test MAE: 1.6701
Test R^2: 0.8964
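
The four numbers above are related: RMSE is the square root of MSE (sqrt(5.1866) ≈ 2.2774), and R^2 compares the squared error of the model with that of always predicting the mean. The following is a minimal, dependency-free sketch of these formulas. The helper name `regression_report` and the toy values are hypothetical and only for illustration; the notebook itself uses the `sklearn.metrics` functions.

```python
import math

def regression_report(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences (illustrative helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n                               # mean squared error
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # In this sketch R^2 is reported as NaN when the targets are constant
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}

# Toy values, not taken from any of the data sets
report = regression_report([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
for name, value in report.items():
    print(f"Test {name.upper()}: {value:.4f}")
```

An R^2 close to 1, as in the Airfoil run, means the model explains most of the variance of the target on the test set.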


### Abalone (Regression)

In [None]:
# Abalone as a regression task, so classifier=False
automl = AutoML_18(min_training_time=3600, classifier=False)

print("Fitting the AutoML algorithm")
automl.fit(X_train_abalone_reg, y_train_abalone_reg, X_val_abalone_reg, y_val_abalone_reg)

print("\nEvaluating on the test data")
predictions = automl.predict(X_test_abalone_reg)

# Error metrics on the held-out test set
test_mse = mean_squared_error(y_test_abalone_reg, predictions)
test_rmse = np.sqrt(test_mse)  # RMSE is in the same units as the target
test_mae = mean_absolute_error(y_test_abalone_reg, predictions)
test_r2 = r2_score(y_test_abalone_reg, predictions)

print(f"Test MSE: {test_mse:.4f}")
print(f"Test RMSE: {test_rmse:.4f}")
print(f"Test MAE: {test_mae:.4f}")
print(f"Test R^2: {test_r2:.4f}")

Fitting the AutoML algorithm
Temperature 100.000
Temperature 99.000
Temperature 98.010
Temperature 97.030
Temperature 96.060
Temperature 95.099
Temperature 94.148
Temperature 93.207
Temperature 92.274
Temperature 91.352
Temperature 90.438
Temperature 89.534
Temperature 88.638
Temperature 87.752
Temperature 86.875
Temperature 86.006
Temperature 85.146
Temperature 84.294
Temperature 83.451
Temperature 82.617
Temperature 81.791
Temperature 80.973
Temperature 80.163
Temperature 79.361
Temperature 78.568
Temperature 77.782
Temperature 77.004
Temperature 76.234
Temperature 75.472
Temperature 74.717
Temperature 73.970
Temperature 73.230
Temperature 72.498
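
The temperatures logged above shrink by a constant factor of 0.99 per step (100, 99, 98.01, ...), which is a geometric cooling schedule. The sketch below reproduces that sequence. The function name `cooling_schedule` and its parameter names are hypothetical, and the values 100 and 0.99 are read off the log output rather than taken from the `AutoML_18` source.

```python
def cooling_schedule(initial_temperature=100.0, cooling_rate=0.99, steps=33):
    """Return the first `steps` temperatures of a geometric cooling schedule.

    Illustrative helper: the defaults are inferred from the logged output.
    """
    temperature = initial_temperature
    temperatures = []
    for _ in range(steps):
        temperatures.append(temperature)
        temperature *= cooling_rate  # multiply by a constant factor each step
    return temperatures

temps = cooling_schedule()
for t in temps[:4]:
    print(f"Temperature {t:.3f}")
# Temperature 100.000
# Temperature 99.000
# Temperature 98.010
# Temperature 97.030
```

With this rate the temperature only falls to about 72.5 after 33 steps, so the search still accepts worse candidates fairly often at the point where the log ends.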


### Abalone (Classification)

In [14]:
# Classification task: the classifier argument is left at its default
automl = AutoML_18(min_training_time=60, max_iterations=50)

print("Fitting the AutoML algorithm")
automl.fit(X_train_abalone_class, y_train_abalone_class, X_val_abalone_class, y_val_abalone_class)

print("\nEvaluating on the test data")
predictions = automl.predict(X_test_abalone_class)

# Overall accuracy plus per-class precision, recall and F1 on the test set
test_accuracy = accuracy_score(y_test_abalone_class, predictions)
print(f"Test Accuracy: {test_accuracy:.4f}")
print("\nClassification Report:")
print(classification_report(y_test_abalone_class, predictions))

Fitting the AutoML algorithm
The best model is MLPClassifier(max_iter=3000, activation=tanh, solver=adam, alpha=0.001) with a score of 0.588

Evaluating on the test data
Test Accuracy: 0.5502

Classification Report:
              precision    recall  f1-score   support

           F       0.52      0.35      0.42       278
           I       0.64      0.85      0.73       267
           M       0.46      0.47      0.47       291

    accuracy                           0.55       836
   macro avg       0.54      0.56      0.54       836
weighted avg       0.54      0.55      0.53       836

