Marcin Wardyński  
Tuesday, 9:45

## Laboratory 7
### 7.2 DBN

Here are some more helper functions:

Loading the datasets:

In [3]:
import warnings
warnings.filterwarnings("ignore")

import importlib
import lab7_utils as utils
importlib.reload(utils)

seed = 42

A wrapper around an RBM that disables learning. It is useful when we want to train only the newly added RBM layers while keeping the weights of the layers that are already trained.

In [6]:
from sklearn.base import TransformerMixin, BaseEstimator

class FrozenRBM(TransformerMixin, BaseEstimator):
    def __init__(self, rbm):
        self.rbm = rbm

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return self.rbm.transform(X)
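
As a quick sanity check of the freezing idea, the sketch below fits a small RBM, wraps it in `FrozenRBM`, and then fits a two-layer pipeline on top of it. The synthetic data, the layer sizes, and the names `X_demo`, `first_rbm`, and `stacked` are assumptions made for illustration only; they are not part of the lab code.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline


class FrozenRBM(TransformerMixin, BaseEstimator):
    """Same wrapper as above: fit is a no-op, transform delegates to the RBM."""

    def __init__(self, rbm):
        self.rbm = rbm

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return self.rbm.transform(X)


rng = np.random.RandomState(0)
X_demo = rng.rand(50, 16)  # synthetic values in [0, 1], not one of the lab datasets

# Train the first layer on its own and remember its weights.
first_rbm = BernoulliRBM(n_components=8, n_iter=3, random_state=0).fit(X_demo)
weights_before = first_rbm.components_.copy()

# Fitting the stack trains only the second layer; the first one is frozen.
stacked = Pipeline([
    ("rbm_L1", FrozenRBM(first_rbm)),
    ("rbm_L2", BernoulliRBM(n_components=4, n_iter=3, random_state=0)),
])
stacked.fit(X_demo)

weights_unchanged = np.array_equal(weights_before, first_rbm.components_)
output_shape = stacked.transform(X_demo).shape
print(weights_unchanged, output_shape)
```

If the wrapper works as intended, the weights of the first layer are identical before and after fitting the pipeline, and the output has as many columns as the second layer has hidden units.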

The functions below train the individual layers of the DBN classifier, one layer at a time. The output consists of three RBM layers and four logistic regressions, to be used with 0, 1, 2, and 3 RBM layers respectively.

In [7]:
from sklearn.base import clone
from sklearn.pipeline import Pipeline

def wrap_rbm_snapshots(rbms_snapshots):
    # Name the snapshots rbm_L1, rbm_L2, ... so they can be used as Pipeline steps.
    return [(f"rbm_L{i}", rbm) for i, rbm in enumerate(rbms_snapshots, start=1)]


def create_dbn(hidden_dims, rbm_base, log_reg_base, X_train, y_train):
    rbm_snapshots = []
    log_regs = []

    # Depth 0: logistic regression trained directly on the raw inputs.
    log_reg = clone(log_reg_base)
    pipeline = Pipeline([("log_reg", log_reg)])
    pipeline.fit(X_train, y_train)
    log_regs.append(log_reg)

    for hidden_dim in hidden_dims:
        log_reg = clone(log_reg_base)
        rbm = clone(rbm_base)
        rbm.n_components = hidden_dim

        # Earlier layers are frozen; only the new RBM and its head are fitted.
        pipeline_def = []
        pipeline_def.extend(wrap_rbm_snapshots(rbm_snapshots))
        pipeline_def.append((f"rbm_L{len(rbm_snapshots)+1}", rbm))
        pipeline_def.append(("log_reg", log_reg))

        pipeline = Pipeline(pipeline_def)
        pipeline.fit(X_train, y_train)

        # Freeze the freshly trained layer for all later stages.
        rbm_snapshots.append(FrozenRBM(rbm))
        log_regs.append(log_reg)
    
    return rbm_snapshots, log_regs
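
To make the bookkeeping in `create_dbn` easier to follow, the sketch below lists, for every training stage, which pipeline steps are frozen and which are fitted. The helper `training_stages` and the example layer widths are hypothetical; the step names follow the `rbm_L{i}` and `log_reg` convention used above.

```python
def training_stages(hidden_dims):
    """Return (frozen step names, fitted step names) for each training stage."""
    # Stage 0: a logistic regression on the raw inputs, with no RBM in front.
    stages = [([], ["log_reg"])]
    for depth in range(1, len(hidden_dims) + 1):
        frozen = [f"rbm_L{i}" for i in range(1, depth)]  # already trained layers
        fitted = [f"rbm_L{depth}", "log_reg"]            # new layer plus its head
        stages.append((frozen, fitted))
    return stages


stages = training_stages([512, 192, 64])  # illustrative layer widths
for frozen, fitted in stages:
    print("frozen:", frozen, "| fitted:", fitted)
```

With three hidden layers this gives four stages, which matches the four logistic regression heads returned by `create_dbn`.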


The code below tests the classification quality of the already trained DBNs of different depths, pairing each of them with the matching logistic regression head.

In [8]:
from sklearn.metrics import accuracy_score, classification_report

def test_dbn(rbms, log_regs, X_test, y_test, print_report):
    accuracies = []
    for i in range(len(rbms)+1):
        pipeline_def = []
        pipeline_def.extend(wrap_rbm_snapshots(rbms[:i]))
        pipeline_def.append(("log_reg", log_regs[i]))

        pipeline = Pipeline(pipeline_def)
        y_pred = pipeline.predict(X_test)

        accuracy = accuracy_score(y_test, y_pred)
        accuracies.append(accuracy)

        if print_report:
            print(f"Warstwy modelu: {pipeline.named_steps.keys()}")
            print(classification_report(y_test, y_pred))
            
    return accuracies

The next two code blocks contain functions that tie the whole DBN evaluation process together and use the `Optuna` optimizer to check several relevant hyperparameter configurations.

By hyperparameters I mean here the sizes of the individual DBN layers; the remaining hyperparameters were taken from the best models for each dataset in the previous task.

In [9]:
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM

def check_dbn(dataset_name, model_params, l1_h_dim, l2_h_dim, l3_h_dim, print_report):
    X_train, X_test, y_train, y_test = utils.get_dataset(dataset_name)

    rbm_base = BernoulliRBM(learning_rate=model_params['learning_rate'], batch_size=model_params['batch_size'], random_state=seed)
    log_reg_base = LogisticRegression(max_iter=model_params['max_iter'], solver=model_params['solver'], C=model_params['C'])

    hidden_dims = [l1_h_dim, l2_h_dim, l3_h_dim]

    rbms, log_regs = create_dbn(hidden_dims, rbm_base, log_reg_base, X_train, y_train)

    accuracies = test_dbn(rbms, log_regs, X_test, y_test, print_report)

    return max(accuracies)

In [10]:
import optuna

def objective(trial, dataset_name, model_params):

    l1_h_dim = trial.suggest_categorical('l1_h_dim', [128, 256, 512])
    l2_h_dim = trial.suggest_categorical('l2_h_dim', [64, 128, 192])
    l3_h_dim = trial.suggest_categorical('l3_h_dim', [32, 64, 96])
    
    return check_dbn(dataset_name, model_params, l1_h_dim, l2_h_dim, l3_h_dim, print_report=False)

def optimize_dbn(dataset_name, model_params, n_trials):

    study = optuna.create_study(direction='maximize')
    study.optimize(lambda trial: objective(trial, dataset_name, model_params), n_trials=n_trials, show_progress_bar=True)

    print(f"Best parameters: {study.best_params}")
    print_best_dbn_summary(dataset_name, model_params, study.best_params)

def print_best_dbn_summary(dataset_name, model_params, l_h_dims):
    check_dbn(dataset_name, model_params, l_h_dims['l1_h_dim'], l_h_dims['l2_h_dim'], l_h_dims['l3_h_dim'], print_report=True)
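
For orientation, the size of the search space defined in `objective` can be worked out directly. The sketch below enumerates every combination of the three categorical choices; the names `search_space` and `all_configs` are introduced only for this illustration.

```python
from itertools import product

# The same categorical choices as in objective().
search_space = {
    "l1_h_dim": [128, 256, 512],
    "l2_h_dim": [64, 128, 192],
    "l3_h_dim": [32, 64, 96],
}

# Cartesian product of all layer-size choices.
all_configs = [
    dict(zip(search_space, combo)) for combo in product(*search_space.values())
]
n_configs = len(all_configs)
print(n_configs)
```

The space holds 3 × 3 × 3 = 27 configurations, so a study with fewer trials than that cannot cover all of them.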

Each dataset is examined over 15 Optuna trials. The sampler may repeat a configuration, so fewer than 15 distinct hyperparameter configurations may actually be evaluated.

In addition, I changed the logistic regression solver to `lbfgs`, replacing the best one found in the previous task (`sag`), because `sag` ran very slowly and its running time translated only marginally into better classification quality.

*Additional note:* The size of the largest RBM layer was capped at 512 neurons, so whenever a DBN achieves its best result with a single RBM layer, that result may be slightly worse than the one from the previous task.

In [11]:
n_trials = 15
solver = 'lbfgs'

### MNIST

In [12]:
model_params = {
    'C': 0.5,
    'solver': solver,
    'max_iter': 1000,
    'batch_size': 20,
    'learning_rate': 0.05,
}

optimize_dbn(utils.Dataset_Select.MNIST.value, model_params, n_trials)

[I 2025-01-05 13:39:47,819] A new study created in memory with name: no-name-e94cd863-4a09-45f7-aa21-4a22387e7d59
Best trial: 0. Best value: 0.9684:   7%|▋         | 1/15 [03:37<50:44, 217.48s/it]

[I 2025-01-05 13:43:25,342] Trial 0 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 64}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  13%|█▎        | 2/15 [05:35<34:27, 159.03s/it]

[I 2025-01-05 13:45:23,459] Trial 1 finished with value: 0.9658 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  20%|██        | 3/15 [06:42<23:23, 116.95s/it]

[I 2025-01-05 13:46:30,340] Trial 2 finished with value: 0.9538 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 128, 'l3_h_dim': 96}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  27%|██▋       | 4/15 [07:50<17:53, 97.59s/it] 

[I 2025-01-05 13:47:38,235] Trial 3 finished with value: 0.9538 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 128, 'l3_h_dim': 96}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  33%|███▎      | 5/15 [11:15<22:44, 136.49s/it]

[I 2025-01-05 13:51:03,695] Trial 4 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  40%|████      | 6/15 [13:08<19:14, 128.31s/it]

[I 2025-01-05 13:52:56,128] Trial 5 finished with value: 0.96 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 128, 'l3_h_dim': 64}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  47%|████▋     | 7/15 [15:11<16:53, 126.68s/it]

[I 2025-01-05 13:54:59,448] Trial 6 finished with value: 0.9658 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 192, 'l3_h_dim': 32}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  53%|█████▎    | 8/15 [16:22<12:42, 108.90s/it]

[I 2025-01-05 13:56:10,298] Trial 7 finished with value: 0.9613 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 192, 'l3_h_dim': 64}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  60%|██████    | 9/15 [17:30<09:36, 96.17s/it] 

[I 2025-01-05 13:57:18,453] Trial 8 finished with value: 0.9538 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 128, 'l3_h_dim': 32}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  67%|██████▋   | 10/15 [21:02<10:59, 131.90s/it]

[I 2025-01-05 14:00:50,372] Trial 9 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 128, 'l3_h_dim': 32}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  73%|███████▎  | 11/15 [24:27<10:16, 154.25s/it]

[I 2025-01-05 14:04:15,293] Trial 10 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  80%|████████  | 12/15 [27:40<08:18, 166.20s/it]

[I 2025-01-05 14:07:28,839] Trial 11 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 64}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  87%|████████▋ | 13/15 [31:27<06:08, 184.37s/it]

[I 2025-01-05 14:11:15,025] Trial 12 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684:  93%|█████████▎| 14/15 [34:55<03:11, 191.65s/it]

[I 2025-01-05 14:14:43,496] Trial 13 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 96}. Best is trial 0 with value: 0.9684.


Best trial: 0. Best value: 0.9684: 100%|██████████| 15/15 [38:38<00:00, 154.56s/it]


[I 2025-01-05 14:18:26,296] Trial 14 finished with value: 0.9684 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 64}. Best is trial 0 with value: 0.9684.
Best parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 64}
Warstwy modelu: dict_keys(['log_reg'])
              precision    recall  f1-score   support

           0       0.94      0.97      0.95       980
           1       0.96      0.98      0.97      1135
           2       0.92      0.89      0.90      1032
           3       0.90      0.91      0.90      1010
           4       0.92      0.92      0.92       982
           5       0.88      0.85      0.87       892
           6       0.94      0.94      0.94       958
           7       0.92      0.92      0.92      1028
           8       0.87      0.87      0.87       974
           9       0.90      0.90      0.90      1009

    accuracy                           0.92     10000
   macro avg       0.92      0.92      0.92     10000
weighted avg       0

The best result comes from a DBN with just one, albeit very large, RBM layer, which reaches `accuracy=0.97`; adding further layers lowers `accuracy` to 0.96 and 0.95 with two and three layers respectively. MNIST is clearly a fairly simple dataset, for which the extra model capacity gained from additional layers does not necessarily bring the desired effect.
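
As a cross-check, the MNIST trial values can be grouped by the width of the first layer. The tuples below were copied by hand from the Optuna log above; the names `mnist_trials` and `values_by_l1` are mine and serve only this illustration.

```python
from collections import defaultdict

# (l1_h_dim, l2_h_dim, l3_h_dim, reported value) for trials 0..14 of the MNIST study.
mnist_trials = [
    (512, 192, 64, 0.9684), (256, 192, 96, 0.9658), (128, 128, 96, 0.9538),
    (128, 128, 96, 0.9538), (512, 192, 96, 0.9684), (256, 128, 64, 0.96),
    (256, 192, 32, 0.9658), (128, 192, 64, 0.9613), (128, 128, 32, 0.9538),
    (512, 128, 32, 0.9684), (512, 64, 64, 0.9684), (512, 192, 64, 0.9684),
    (512, 192, 96, 0.9684), (512, 64, 96, 0.9684), (512, 192, 64, 0.9684),
]

# Collect the distinct objective values seen for each first-layer width.
values_by_l1 = defaultdict(set)
for l1, _l2, _l3, acc in mnist_trials:
    values_by_l1[l1].add(acc)

for l1 in sorted(values_by_l1):
    print(l1, sorted(values_by_l1[l1]))
```

For `l1_h_dim=512` every trial reports the same value regardless of the deeper layers, which is what one would expect when the single-layer model determines the maximum returned by `check_dbn`. For widths 128 and 256 the value changes with `l2_h_dim`, so for narrower first layers the second layer appears to influence the best score.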

### Fashion_MNIST

In [13]:
model_params = {
    'C': 0.5,
    'solver': solver,
    'max_iter': 5000,
    'batch_size': 10,
    'learning_rate': 0.01,
}

optimize_dbn(utils.Dataset_Select.F_MNIST.value, model_params, n_trials)

[I 2025-01-05 14:22:49,458] A new study created in memory with name: no-name-20a7044c-707a-4b3f-b5da-dab23c64e9ad
Best trial: 0. Best value: 0.7932:   7%|▋         | 1/15 [01:48<25:21, 108.66s/it]

[I 2025-01-05 14:24:38,111] Trial 0 finished with value: 0.7932 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 192, 'l3_h_dim': 32}. Best is trial 0 with value: 0.7932.


Best trial: 1. Best value: 0.8135:  13%|█▎        | 2/15 [05:28<37:40, 173.87s/it]

[I 2025-01-05 14:28:17,639] Trial 1 finished with value: 0.8135 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 128, 'l3_h_dim': 64}. Best is trial 1 with value: 0.8135.


Best trial: 1. Best value: 0.8135:  20%|██        | 3/15 [08:54<37:42, 188.57s/it]

[I 2025-01-05 14:31:43,691] Trial 2 finished with value: 0.8135 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 1 with value: 0.8135.


Best trial: 1. Best value: 0.8135:  27%|██▋       | 4/15 [10:57<29:51, 162.88s/it]

[I 2025-01-05 14:33:47,192] Trial 3 finished with value: 0.7932 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 192, 'l3_h_dim': 32}. Best is trial 1 with value: 0.8135.


Best trial: 4. Best value: 0.8299:  33%|███▎      | 5/15 [18:03<42:55, 257.53s/it]

[I 2025-01-05 14:40:52,547] Trial 4 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 128, 'l3_h_dim': 64}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  40%|████      | 6/15 [20:48<33:56, 226.23s/it]

[I 2025-01-05 14:43:38,009] Trial 5 finished with value: 0.8135 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 64, 'l3_h_dim': 96}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  47%|████▋     | 7/15 [27:49<38:40, 290.01s/it]

[I 2025-01-05 14:50:39,316] Trial 6 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  53%|█████▎    | 8/15 [35:10<39:26, 338.05s/it]

[I 2025-01-05 14:58:00,236] Trial 7 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 96}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  60%|██████    | 9/15 [41:59<36:00, 360.09s/it]

[I 2025-01-05 15:04:48,810] Trial 8 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  67%|██████▋   | 10/15 [44:38<24:50, 298.03s/it]

[I 2025-01-05 15:07:27,874] Trial 9 finished with value: 0.8135 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 64, 'l3_h_dim': 32}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  73%|███████▎  | 11/15 [49:34<19:49, 297.29s/it]

[I 2025-01-05 15:12:23,477] Trial 10 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 128, 'l3_h_dim': 64}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  80%|████████  | 12/15 [54:26<14:47, 295.73s/it]

[I 2025-01-05 15:17:15,626] Trial 11 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 128, 'l3_h_dim': 96}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  87%|████████▋ | 13/15 [59:31<09:57, 298.62s/it]

[I 2025-01-05 15:22:20,912] Trial 12 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299:  93%|█████████▎| 14/15 [1:04:47<05:03, 303.91s/it]

[I 2025-01-05 15:27:37,041] Trial 13 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 128, 'l3_h_dim': 96}. Best is trial 4 with value: 0.8299.


Best trial: 4. Best value: 0.8299: 100%|██████████| 15/15 [1:10:01<00:00, 280.10s/it]


[I 2025-01-05 15:32:50,987] Trial 14 finished with value: 0.8299 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 64}. Best is trial 4 with value: 0.8299.
Best parameters: {'l1_h_dim': 512, 'l2_h_dim': 128, 'l3_h_dim': 64}
Warstwy modelu: dict_keys(['log_reg'])
              precision    recall  f1-score   support

           0       0.74      0.76      0.75      1000
           1       0.93      0.95      0.94      1000
           2       0.69      0.67      0.68      1000
           3       0.79      0.78      0.78      1000
           4       0.68      0.70      0.69      1000
           5       0.86      0.88      0.87      1000
           6       0.54      0.51      0.52      1000
           7       0.86      0.88      0.87      1000
           8       0.91      0.90      0.90      1000
           9       0.91      0.92      0.91      1000

    accuracy                           0.79     10000
   macro avg       0.79      0.79      0.79     10000
weighted avg       0

Here too, a very large first RBM layer gives the best result, `accuracy=0.83`, while further layers lower it to 0.78 and 0.75 with two and three layers respectively.

### K_MNIST

In [14]:
model_params = {
    'C': 1.0,
    'solver': solver,
    'max_iter': 1000,
    'batch_size': 10,
    'learning_rate': 0.1,
}

optimize_dbn(utils.Dataset_Select.K_MNIST.value, model_params, n_trials)

[I 2025-01-05 17:47:19,653] A new study created in memory with name: no-name-c8ba785f-e959-43ab-a36d-878bd1d20937
  0%|          | 0/15 [00:00<?, ?it/s]2025-01-05 17:47:36.622527: I tensorflow/core/kernels/data/tf_record_dataset_op.cc:376] The default buffer size is 262144, which is overridden by the user specified `buffer_size` of 8388608
2025-01-05 17:47:42.740981: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
2025-01-05 17:47:43.860298: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
Best trial: 0. Best value: 0.8191:   7%|▋         | 1/15 [03:10<44:28, 190.60s/it]

[I 2025-01-05 17:50:30,259] Trial 0 finished with value: 0.8191 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 128, 'l3_h_dim': 32}. Best is trial 0 with value: 0.8191.


2025-01-05 17:50:37.351978: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
Best trial: 0. Best value: 0.8191:  13%|█▎        | 2/15 [04:37<28:03, 129.47s/it]

[I 2025-01-05 17:51:56,930] Trial 1 finished with value: 0.777 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 128, 'l3_h_dim': 64}. Best is trial 0 with value: 0.8191.


Best trial: 2. Best value: 0.8584:  20%|██        | 3/15 [10:53<48:26, 242.19s/it]

[I 2025-01-05 17:58:13,270] Trial 2 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 2 with value: 0.8584.


2025-01-05 17:58:24.701051: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
Best trial: 2. Best value: 0.8584:  27%|██▋       | 4/15 [16:23<50:42, 276.63s/it]

[I 2025-01-05 18:03:42,690] Trial 3 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 32}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  33%|███▎      | 5/15 [19:29<40:42, 244.22s/it]

[I 2025-01-05 18:06:49,454] Trial 4 finished with value: 0.8294 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  40%|████      | 6/15 [22:30<33:25, 222.79s/it]

[I 2025-01-05 18:09:50,652] Trial 5 finished with value: 0.8191 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 128, 'l3_h_dim': 32}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  47%|████▋     | 7/15 [28:15<35:00, 262.59s/it]

[I 2025-01-05 18:15:35,185] Trial 6 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 32}. Best is trial 2 with value: 0.8584.


2025-01-05 18:15:41.909606: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
Best trial: 2. Best value: 0.8584:  53%|█████▎    | 8/15 [30:55<26:49, 229.96s/it]

[I 2025-01-05 18:18:15,279] Trial 7 finished with value: 0.8294 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 192, 'l3_h_dim': 32}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  60%|██████    | 9/15 [36:52<26:57, 269.50s/it]

[I 2025-01-05 18:24:11,723] Trial 8 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  67%|██████▋   | 10/15 [40:03<20:27, 245.50s/it]

[I 2025-01-05 18:27:23,483] Trial 9 finished with value: 0.8294 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  73%|███████▎  | 11/15 [41:37<13:16, 199.09s/it]

[I 2025-01-05 18:28:57,322] Trial 10 finished with value: 0.7994 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 192, 'l3_h_dim': 64}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  80%|████████  | 12/15 [47:30<12:17, 245.98s/it]

[I 2025-01-05 18:34:50,579] Trial 11 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 96}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  87%|████████▋ | 13/15 [52:52<08:57, 268.75s/it]

[I 2025-01-05 18:40:11,719] Trial 12 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 96}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584:  93%|█████████▎| 14/15 [58:44<04:53, 293.93s/it]

[I 2025-01-05 18:46:03,816] Trial 13 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 32}. Best is trial 2 with value: 0.8584.


Best trial: 2. Best value: 0.8584: 100%|██████████| 15/15 [1:04:58<00:00, 259.93s/it]


[I 2025-01-05 18:52:18,651] Trial 14 finished with value: 0.8584 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 2 with value: 0.8584.
Best parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 96}


2025-01-05 18:52:26.465015: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence


Warstwy modelu: dict_keys(['log_reg'])
              precision    recall  f1-score   support

           0       0.85      0.75      0.80      1000
           1       0.64      0.67      0.65      1000
           2       0.51      0.63      0.56      1000
           3       0.79      0.76      0.77      1000
           4       0.64      0.64      0.64      1000
           5       0.74      0.70      0.72      1000
           6       0.67      0.71      0.69      1000
           7       0.74      0.56      0.64      1000
           8       0.62      0.74      0.68      1000
           9       0.70      0.64      0.67      1000

    accuracy                           0.68     10000
   macro avg       0.69      0.68      0.68     10000
weighted avg       0.69      0.68      0.68     10000

Warstwy modelu: dict_keys(['rbm_L1', 'log_reg'])
              precision    recall  f1-score   support

           0       0.89      0.88      0.89      1000
           1       0.85      0.85      0.85 

Once again, a single layer of the largest size gives the best classification, with `accuracy=0.86`.

This dataset and the previous one show that even for a dataset more complex than MNIST, a sufficiently large single RBM layer can be the best feature extractor.

### KUZ_49

In [None]:
model_params = {
    'C': 0.5,
    'solver': solver,
    'max_iter': 1000,
    'batch_size': 20,
    'learning_rate': 0.1,
}

optimize_dbn(utils.Dataset_Select.KUZ_49.value, model_params, n_trials)

[I 2025-01-05 19:04:22,772] A new study created in memory with name: no-name-bdf9d0c7-2a7a-469a-b851-7f97545c689a
Best trial: 0. Best value: 0.779753:   7%|▋         | 1/15 [12:13<2:51:11, 733.69s/it]

[I 2025-01-05 19:16:36,458] Trial 0 finished with value: 0.7797529843418944 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 64, 'l3_h_dim': 32}. Best is trial 0 with value: 0.7797529843418944.


Best trial: 0. Best value: 0.779753:  13%|█▎        | 2/15 [21:40<2:17:40, 635.45s/it]

[I 2025-01-05 19:26:03,138] Trial 1 finished with value: 0.7797529843418944 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 128, 'l3_h_dim': 96}. Best is trial 0 with value: 0.7797529843418944.


Best trial: 0. Best value: 0.779753:  20%|██        | 3/15 [28:06<1:44:19, 521.66s/it]

[I 2025-01-05 19:32:29,392] Trial 2 finished with value: 0.7513048421270219 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 192, 'l3_h_dim': 96}. Best is trial 0 with value: 0.7797529843418944.


Best trial: 0. Best value: 0.779753:  27%|██▋       | 4/15 [33:45<1:22:24, 449.53s/it]

[I 2025-01-05 19:38:08,356] Trial 3 finished with value: 0.7032711487778409 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 64, 'l3_h_dim': 32}. Best is trial 0 with value: 0.7797529843418944.


Best trial: 0. Best value: 0.779753:  33%|███▎      | 5/15 [44:39<1:27:12, 523.29s/it]

[I 2025-01-05 19:49:02,398] Trial 4 finished with value: 0.7797529843418944 and parameters: {'l1_h_dim': 256, 'l2_h_dim': 128, 'l3_h_dim': 32}. Best is trial 0 with value: 0.7797529843418944.


Best trial: 5. Best value: 0.84161:  40%|████      | 6/15 [1:03:07<1:48:19, 722.20s/it] 

[I 2025-01-05 20:07:30,743] Trial 5 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  47%|████▋     | 7/15 [1:20:36<1:50:30, 828.86s/it]

[I 2025-01-05 20:24:59,179] Trial 6 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 96}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  53%|█████▎    | 8/15 [1:40:33<1:50:21, 945.95s/it]

[I 2025-01-05 20:44:55,851] Trial 7 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 192, 'l3_h_dim': 64}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  60%|██████    | 9/15 [1:48:16<1:19:29, 794.96s/it]

[I 2025-01-05 20:52:38,802] Trial 8 finished with value: 0.7032711487778409 and parameters: {'l1_h_dim': 128, 'l2_h_dim': 64, 'l3_h_dim': 32}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  67%|██████▋   | 10/15 [2:07:30<1:15:29, 905.80s/it]

[I 2025-01-05 21:11:52,813] Trial 9 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 128, 'l3_h_dim': 96}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  73%|███████▎  | 11/15 [2:24:43<1:02:59, 944.78s/it]

[I 2025-01-05 21:29:05,968] Trial 10 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  80%|████████  | 12/15 [2:39:14<46:07, 922.53s/it]  

[I 2025-01-05 21:43:37,627] Trial 11 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  87%|████████▋ | 13/15 [2:55:14<31:07, 933.69s/it]

[I 2025-01-05 21:59:36,973] Trial 12 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 96}. Best is trial 5 with value: 0.8416102527001189.


Best trial: 5. Best value: 0.84161:  93%|█████████▎| 14/15 [3:08:22<14:49, 889.69s/it]

[I 2025-01-05 22:12:45,009] Trial 13 finished with value: 0.8416102527001189 and parameters: {'l1_h_dim': 512, 'l2_h_dim': 64, 'l3_h_dim': 64}. Best is trial 5 with value: 0.8416102527001189.
