# 🖥️ MNIST Experiment
#
In this notebook, we train:
1. The Bayes by Backprop model on MNIST for every combination of (π, σ₁, σ₂).  
2. A standard feedforward neural network (two hidden layers) optimized with SGD and no regularization.  
3. The same feedforward network, but with Dropout applied to both hidden layers, still optimized with SGD.  
#
During each training loop, we display progress bars for both epochs and batches, along with printed outputs of key metrics.

### 🔧 Environment Setup

This section initializes the environment for training a PyTorch model:

- Imports necessary libraries including PyTorch, NumPy, and torchvision for data handling.
- Uses `tqdm` for progress bars during training.
- Automatically selects the available device: GPU (`cuda`) if present, otherwise defaults to CPU.
- Prints the selected device to confirm which hardware will be used for computations.

In [7]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import random
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
from tqdm import trange, tqdm
import math

# Set the device (GPU if available, otherwise CPU)
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
print(f"🚀 Using device: {device}")

🚀 Using device: cuda


### 📚 Dataset Preparation: TensorFlow and PyTorch Versions of MNIST

This section performs the following:

#### TensorFlow
- Loads the MNIST dataset using `tf.keras.datasets`.
- Normalizes pixel values to [0, 1].
- Flattens 28×28 images into 784-dimensional vectors.

#### PyTorch
- Downloads and transforms the MNIST dataset using `torchvision`.
- Applies `ToTensor()` and reshapes the images to 1D vectors.
- Sets up `DataLoader` objects for training and testing with appropriate batch sizes and performance optimizations.

#### Shared Hyperparameters
- Defines standard values for learning rate, batch size, input/output dimensions, and number of epochs.
- Logs dataset sizes and input dimensions for confirmation.

In [8]:
# %% [code]
# 📦 TensorFlow + TF Probability
import tensorflow as tf

# Load MNIST dataset
(x_train_tf, y_train_tf), (x_test_tf, y_test_tf) = tf.keras.datasets.mnist.load_data()

# Normalize and reshape the data
x_train_tf = x_train_tf.astype('float32') / 255.0
x_test_tf  = x_test_tf.astype('float32') / 255.0
x_train_tf = x_train_tf.reshape(-1, 28 * 28)
x_test_tf  = x_test_tf.reshape(-1, 28 * 28)

print(f"✅ TF MNIST: {x_train_tf.shape[0]} train, {x_test_tf.shape[0]} test, input_dim = {x_train_tf.shape[1]}")

# %% [code]
# 🔥 PyTorch
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Transform: convert to Tensor and flatten
transform = transforms.Compose([
    transforms.ToTensor(),                              # Converts [0,255] to [0,1] and to FloatTensor C×H×W
    transforms.Lambda(lambda x: x.view(-1))             # Flattens 28×28 image to a vector of 784
])

# Download and prepare datasets
train_dataset_torch = datasets.MNIST(root='./data', train=True,  download=True, transform=transform)
test_dataset_torch  = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create DataLoaders
BATCH_SIZE_TORCH = 512
train_loader_torch = DataLoader(train_dataset_torch, batch_size=BATCH_SIZE_TORCH,
                                shuffle=True,  num_workers=4, pin_memory=True)
test_loader_torch  = DataLoader(test_dataset_torch,  batch_size=1000,
                                shuffle=False, num_workers=2, pin_memory=True)

print(f"✅ Torch MNIST: {len(train_dataset_torch)} train, {len(test_dataset_torch)} test, input_dim = {28*28}")

# Hyperparameters
NUM_EPOCHS = 600
LR = 1e-3
BATCH_SIZE = 1024
INPUT_DIM = 28 * 28
HIDDEN_DIM = 800
OUTPUT_DIM = 10
INPUT_DIM = 28 * 28
N_TRAIN = len(train_dataset_torch)
N_TEST  = len(test_dataset_torch)
print(f"✅ Loaded MNIST: {N_TRAIN} train, {N_TEST} test, input_dim = {INPUT_DIM}")

✅ TF MNIST: 60000 train, 10000 test, input_dim = 784
✅ Torch MNIST: 60000 train, 10000 test, input_dim = 784
✅ Loaded MNIST: 60000 train, 10000 test, input_dim = 784


### 🧠 Model Definitions: Bayesian and Feedforward Networks

This section implements three neural network architectures for MNIST classification:

#### 1. `BayesianLinear` Layer
- Custom fully-connected layer with Bayesian treatment of weights.
- Learns distributions (`μ`, `ρ`) instead of point estimates.
- Samples weights during training and computes log posterior (`log_qw`).

#### 2. `BayesianNet`
- A 3-layer fully connected Bayesian neural network using `BayesianLinear`.
- Uses a mixture of Gaussians prior (π, σ₁, σ₂).
- During forward pass, samples weights and accumulates KL divergence for ELBO.

#### 3. `FFNet`
- A standard feedforward network with two hidden ReLU layers and no regularization.

#### 4. `FFNetDropout`
- Identical to `FFNet`, but applies Dropout (p=0.5) after each hidden layer to prevent overfitting.

#### Other Elements
- `log_mixture_prior`: computes log-probabilities under a mixture Gaussian prior.
- `train_loader` and `test_loader`: PyTorch `DataLoader` objects for efficient batching.
- `criterion`: Cross-entropy loss used for classification.

In [9]:
# ------------------------------------------------------------
# (1) BayesianLinear Layer and BayesianNet Model
# ------------------------------------------------------------
class BayesianLinear(nn.Module):
    """
    Fully-connected Bayesian layer:
     - defines μ and ρ parameters for weights and biases
     - σ = log(1 + exp(ρ)) ensures positivity and stability
     - log_qw is computed during each forward pass
    """
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.in_features  = in_features
        self.out_features = out_features

        # Parameters μ and ρ
        self.mu_weight  = nn.Parameter(torch.zeros(out_features, in_features))
        self.rho_weight = nn.Parameter(torch.ones(out_features, in_features) * -3.0)
        self.mu_bias    = nn.Parameter(torch.zeros(out_features))
        self.rho_bias   = nn.Parameter(torch.ones(out_features) * -3.0)

        # log_qw will be stored after sampling
        self.log_qw = None

    def compute_log_qw(self, w: torch.Tensor, sigma_w: torch.Tensor) -> torch.Tensor:
        """
        Compute log q(w) = -0.5 * sum( log(2πσ²) + ((w - μ)² / σ²) )
        """
        return -0.5 * torch.sum(torch.log(2 * torch.pi * sigma_w**2) +
                                ((w - self.mu_weight) ** 2) / (sigma_w**2))

    def sample_weight_and_bias(self) -> (torch.Tensor, torch.Tensor, torch.Tensor):
        """
        Sample (weights, biases) and return log_qw for KL divergence calculation.
        """
        sigma_w = torch.log1p(torch.exp(self.rho_weight))
        sigma_b = torch.log1p(torch.exp(self.rho_bias))

        eps_w = torch.randn_like(self.mu_weight)
        eps_b = torch.randn_like(self.mu_bias)
        w = self.mu_weight + sigma_w * eps_w
        b = self.mu_bias   + sigma_b * eps_b

        log_qw = self.compute_log_qw(w, sigma_w)
        return w, b, log_qw

    def forward(self, x: torch.Tensor, sample: bool = True) -> torch.Tensor:
        """
        If sample=True and model is training, sample weights and biases;
        otherwise use mean values (μ). Computes and stores log_qw.
        """
        sigma_w = torch.log1p(torch.exp(self.rho_weight))
        sigma_b = torch.log1p(torch.exp(self.rho_bias))

        if self.training and sample:
            eps_w = torch.randn_like(self.mu_weight)
            eps_b = torch.randn_like(self.mu_bias)
            w = self.mu_weight + sigma_w * eps_w
            b = self.mu_bias   + sigma_b * eps_b
        else:
            w = self.mu_weight
            b = self.mu_bias

        self.log_qw = -0.5 * torch.sum(torch.log(2 * torch.pi * sigma_w**2) +
                                       ((w - self.mu_weight)**2) / (sigma_w**2))

        return F.linear(x, w, b)


def log_mixture_prior(w_flat: torch.Tensor, pi: float, sigma1: float, sigma2: float) -> torch.Tensor:
    """
    Compute log[ π N(0,σ1²) + (1−π) N(0,σ2²) ] for each element of w_flat,
    then return the total log-probability.
    """
    log_const1 = 0.5 * math.log(2 * math.pi * sigma1**2)
    log_const2 = 0.5 * math.log(2 * math.pi * sigma2**2)

    log_prob1 = -0.5 * (w_flat ** 2) / (sigma1 ** 2) - log_const1
    log_prob2 = -0.5 * (w_flat ** 2) / (sigma2 ** 2) - log_const2

    log_pi         = math.log(pi)
    log_1_minus_pi = math.log(1.0 - pi)

    t1 = log_prob1 + log_pi
    t2 = log_prob2 + log_1_minus_pi

    max_log = torch.max(t1, t2)
    log_mix = max_log + torch.log(torch.exp(t1 - max_log) + torch.exp(t2 - max_log))

    return log_mix.sum()


class BayesianNet(nn.Module):
    """
    Fully-connected Bayesian network with 3 layers (2 hidden + output).
    """
    def __init__(self, input_dim, hidden_dim, output_dim, pi, sigma1, sigma2):
        super().__init__()
        # Prior parameters
        self.pi     = pi
        self.sigma1 = sigma1
        self.sigma2 = sigma2

        # Bayesian layers
        self.bl1 = BayesianLinear(input_dim,  hidden_dim)
        self.bl2 = BayesianLinear(hidden_dim, hidden_dim)
        self.bl3 = BayesianLinear(hidden_dim, output_dim)

        # KL divergence accumulator
        self.kl = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.kl = 0.0

        ####### LAYER 1 #######
        w1, b1, log_qw1 = self.bl1.sample_weight_and_bias()
        w1_flat = torch.cat([w1.view(-1), b1.view(-1)])
        log_pw1 = log_mixture_prior(w1_flat, self.pi, self.sigma1, self.sigma2)
        self.kl += (log_qw1 - log_pw1)
        x1 = F.relu(F.linear(x, w1, b1))

        ####### LAYER 2 #######
        w2, b2, log_qw2 = self.bl2.sample_weight_and_bias()
        w2_flat = torch.cat([w2.view(-1), b2.view(-1)])
        log_pw2 = log_mixture_prior(w2_flat, self.pi, self.sigma1, self.sigma2)
        self.kl += (log_qw2 - log_pw2)
        x2 = F.relu(F.linear(x1, w2, b2))

        ####### OUTPUT LAYER #######
        w3, b3, log_qw3 = self.bl3.sample_weight_and_bias()
        w3_flat = torch.cat([w3.view(-1), b3.view(-1)])
        log_pw3 = log_mixture_prior(w3_flat, self.pi, self.sigma1, self.sigma2)
        self.kl += (log_qw3 - log_pw3)
        logits = F.linear(x2, w3, b3)

        return logits


# DataLoaders for training and testing
train_loader = DataLoader(
    train_dataset_torch,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=4,
    pin_memory=True
)
test_loader = DataLoader(
    test_dataset_torch,
    batch_size=BATCH_SIZE,
    shuffle=False,
    num_workers=2,
    pin_memory=True
)

class FFNet(nn.Module):
    """
    Standard feedforward neural network with 2 hidden ReLU layers.
    """
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.fc1 = nn.Linear(input_dim,  hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

class FFNetDropout(nn.Module):
    """
    Same architecture as FFNet, but with Dropout (p=0.5) after each hidden layer.
    """
    def __init__(self, input_dim, hidden_dim, output_dim, p: float = 0.5):
        super().__init__()
        self.fc1 = nn.Linear(input_dim,  hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, output_dim)
        self.dropout = nn.Dropout(p)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)
        return x

# Loss function
criterion = nn.CrossEntropyLoss()

### ⚙️ Hyperparameter Configuration

This section defines the training settings for two model types:

#### 🧪 Bayes by Backprop
- **Epochs**: 600
- **Learning rate**: 1e-3
- **Prior parameters**:
  - `π = 0.75`
  - `σ₁ = exp(−2)`
  - `σ₂ = exp(−6)`
- These are used in the mixture Gaussian prior for Bayesian layers.

#### 🧪 Baseline SGD (Standard & Dropout Networks)
- **Epochs**: 20
- **Learning rate**: 1e-3
- Same loss function (`CrossEntropyLoss`) is used for both Bayesian and baseline models.

The goal is to ensure a consistent and fair comparison between the Bayesian and non-Bayesian approaches.

In [10]:
# Hyperparameters for Bayes by Backprop training
NUM_EPOCHS_BBB = 600
LR_BBB = 1e-3
PI_LIST = [0.75]
SIGMA1_LIST = [np.exp(-2)]
SIGMA2_LIST = [np.exp(-6)]
# The list is HardCoded because we found the best parameters in a precedent run and it would have been too long to recompute them
# Hyperparameters for baseline SGD models (aligned to BBB for fair comparison)
NUM_EPOCHS_SGD = 20
LR_SGD = 1e-3

# Loss function for classification
criterion = nn.CrossEntropyLoss()

# Display configured hyperparameters
print("ℹ️ Hyperparameters:")
print(f"  Bayes by Backprop: NUM_EPOCHS = {NUM_EPOCHS_BBB}, LR = {LR_BBB}")
print(f"    PI_LIST     = {PI_LIST}")
print(f"    SIGMA1_LIST = {SIGMA1_LIST}")
print(f"    SIGMA2_LIST = {SIGMA2_LIST}")
print(f"\n  Baseline SGD: NUM_EPOCHS = {NUM_EPOCHS_SGD}, LR = {LR_SGD}\n")

ℹ️ Hyperparameters:
  Bayes by Backprop: NUM_EPOCHS = 600, LR = 0.001
    PI_LIST     = [0.75]
    SIGMA1_LIST = [0.1353352832366127]
    SIGMA2_LIST = [0.0024787521766663585]

  Baseline SGD: NUM_EPOCHS = 20, LR = 0.001



### 🤖 Training Feedforward Models in TensorFlow

This section implements and trains two neural network models using TensorFlow:

#### 🏗️ Model Definitions
- `build_sgd_model()`: A plain feedforward neural network with two hidden ReLU layers.
- `build_dropout_model()`: Same as above, but includes Dropout (p=0.5) after each hidden layer to act as a regularizer.

#### 🏋️ Training
- Both models are compiled with the Adam optimizer and trained using cross-entropy loss.
- A `train_model()` function wraps training logic and logs progress.
- Training is performed on the MNIST dataset using `validation_split=0.1`.

#### 📊 Evaluation
- After training, both models are evaluated on the test set.
- Final test accuracies are printed to compare:
  - **SGD without regularization**
  - **SGD with Dropout**

This setup provides a performance baseline for comparison against Bayesian models.

In [11]:
import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np

# 1️⃣ Model construction
def build_sgd_model():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10)  # Output layer (logits)
    ])

def build_dropout_model():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(10)  # Output layer (logits)
    ])

# 2️⃣ Training function
def train_model(model, name, x_train, y_train, epochs=10):
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )
    print(f"\n🔧 Training {name}")
    model.fit(x_train, y_train,
              batch_size=64,
              epochs=epochs,
              validation_split=0.1,
              verbose=2)
    print(f"✅ Finished training {name}")
    return model

# Train plain SGD model (no regularization)
sgd_model = train_model(build_sgd_model(), "SGD", x_train_tf, y_train_tf, epochs=NUM_EPOCHS_SGD)

# Train SGD + Dropout model
dropout_model = train_model(build_dropout_model(), "Dropout", x_train_tf, y_train_tf, epochs=NUM_EPOCHS_SGD)

# Evaluate both models on the test set
loss_sgd, acc_sgd = sgd_model.evaluate(x_test_tf, y_test_tf, verbose=0)
loss_do,  acc_do  = dropout_model.evaluate(x_test_tf, y_test_tf, verbose=0)

# Print results
print(f"📈 SGD without regularization Accuracy: {acc_sgd:.4f}")
print(f"📈 SGD + Dropout (p=0.5) Accuracy: {acc_do:.4f}")

I0000 00:00:1750416078.141061      35 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 15513 MB memory:  -> device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0



🔧 Training SGD
Epoch 1/20


I0000 00:00:1750416080.161141      81 service.cc:148] XLA service 0x7a74a4003160 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1750416080.161792      81 service.cc:156]   StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
I0000 00:00:1750416080.490903      81 cuda_dnn.cc:529] Loaded cuDNN version 90300
I0000 00:00:1750416081.392622      81 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


844/844 - 6s - 7ms/step - accuracy: 0.9151 - loss: 0.2899 - val_accuracy: 0.9625 - val_loss: 0.1348
Epoch 2/20
844/844 - 2s - 2ms/step - accuracy: 0.9648 - loss: 0.1169 - val_accuracy: 0.9720 - val_loss: 0.1003
Epoch 3/20
844/844 - 2s - 2ms/step - accuracy: 0.9755 - loss: 0.0809 - val_accuracy: 0.9733 - val_loss: 0.0863
Epoch 4/20
844/844 - 2s - 2ms/step - accuracy: 0.9814 - loss: 0.0604 - val_accuracy: 0.9783 - val_loss: 0.0755
Epoch 5/20
844/844 - 2s - 2ms/step - accuracy: 0.9851 - loss: 0.0474 - val_accuracy: 0.9767 - val_loss: 0.0825
Epoch 6/20
844/844 - 2s - 2ms/step - accuracy: 0.9883 - loss: 0.0378 - val_accuracy: 0.9790 - val_loss: 0.0755
Epoch 7/20
844/844 - 2s - 2ms/step - accuracy: 0.9901 - loss: 0.0306 - val_accuracy: 0.9768 - val_loss: 0.0862
Epoch 8/20
844/844 - 2s - 2ms/step - accuracy: 0.9919 - loss: 0.0249 - val_accuracy: 0.9785 - val_loss: 0.0792
Epoch 9/20
844/844 - 2s - 2ms/step - accuracy: 0.9933 - loss: 0.0210 - val_accuracy: 0.9815 - val_loss: 0.0800
Epoch 10/20


### 🔁 Training Bayesian Neural Networks with All Prior Combinations

This block performs a full sweep over different combinations of prior parameters for the **Bayes by Backprop** model:

#### 🔧 What happens:
- For each combination of:
  - π (mixing coefficient)
  - σ₁, σ₂ (standard deviations for the mixture prior)
- A new `BayesianNet` is initialized and trained on MNIST.
- The model uses:
  - **Adam** optimizer
  - **600 epochs** of training
  - **KL-divergence regularization** added to the negative log-likelihood

#### 🧠 Evaluation:
- After each epoch:
  - The model is evaluated on the test set using the **mean weights (μ)** (i.e., deterministic mode).
  - If the current test accuracy exceeds the best so far, it is stored.

#### 📦 Output:
- Metrics such as training loss, accuracy, and test accuracy are printed for every epoch.
- The best test accuracy for each parameter combination is stored in the `results_bbb` list.

This process helps identify which prior configuration leads to the most accurate Bayesian model.

In [12]:
results_bbb = []  # Store (pi, sigma1, sigma2, best_test_accuracy)

# Iterate over combinations of prior parameters
for pi in PI_LIST:
    for sigma1 in SIGMA1_LIST:
        for sigma2 in SIGMA2_LIST:
            # Initialize the Bayesian network with current prior parameters
            model = BayesianNet(INPUT_DIM, hidden_dim=800, output_dim=10,
                                pi=pi, sigma1=sigma1, sigma2=sigma2).to(device)
            optimizer = optim.Adam(model.parameters(), lr=LR_BBB)
            best_test_acc = 0.0

            # Epoch progress bar
            epoch_desc = f"BBB π={pi:.2f} σ₁={sigma1:.1e} σ₂={sigma2:.1e}"
            for epoch in trange(NUM_EPOCHS_BBB, desc=epoch_desc, leave=True):
                model.train()
                running_loss = 0.0
                total = 0
                correct = 0
                num_batches = len(train_loader)

                # Training loop (no tqdm on batches)
                for data, target in train_loader:
                    data, target = data.to(device), target.to(device)
                    optimizer.zero_grad()
                    output = model(data)                              # Bayesian forward pass
                    nll = criterion(output, target)                   # Negative log-likelihood
                    kl_loss = model.kl / N_TRAIN                      # Normalized KL divergence
                    loss = nll + kl_loss
                    loss.backward()
                    optimizer.step()

                    running_loss += loss.item()
                    preds = output.argmax(dim=1)
                    correct += (preds == target).sum().item()
                    total += target.size(0)

                train_acc = correct / total * 100

                # Evaluation on test set (using deterministic weights: μ)
                model.eval()
                correct_test = 0
                total_test = 0
                with torch.no_grad():
                    for data, target in test_loader:
                        data, target = data.to(device), target.to(device)
                        x = F.relu(model.bl1(data, sample=False))
                        x = F.relu(model.bl2(x, sample=False))
                        logits = model.bl3(x, sample=False)
                        preds = logits.argmax(dim=1)
                        correct_test += (preds == target).sum().item()
                        total_test += target.size(0)
                test_acc = correct_test / total_test * 100
                if test_acc > best_test_acc:
                    best_test_acc = test_acc

                # Print metrics at the end of the epoch
                print(
                    f"  Epoch {epoch+1:02d}/{NUM_EPOCHS_BBB} | "
                    f"Train Loss={running_loss/num_batches:.4f} | "
                    f"Train Acc={train_acc:.2f}% | "
                    f"Test Acc={test_acc:.2f}%"
                )

            print(f"✅ BBB π={pi:.2f}, σ₁={sigma1:.1e}, σ₂={sigma2:.1e} → "
                  f"Best Test Acc = {best_test_acc:.2f}%")
            results_bbb.append((pi, sigma1, sigma2, best_test_acc))

BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   0%|          | 1/600 [00:05<57:45,  5.79s/it]

  Epoch 01/600 | Train Loss=16.2266 | Train Acc=10.01% | Test Acc=25.23%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   0%|          | 2/600 [00:11<55:20,  5.55s/it]

  Epoch 02/600 | Train Loss=15.9366 | Train Acc=13.91% | Test Acc=33.36%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   0%|          | 3/600 [00:16<56:09,  5.64s/it]

  Epoch 03/600 | Train Loss=15.2078 | Train Acc=35.45% | Test Acc=52.14%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   1%|          | 4/600 [00:22<54:46,  5.51s/it]

  Epoch 04/600 | Train Loss=14.3975 | Train Acc=58.67% | Test Acc=74.15%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   1%|          | 5/600 [00:27<54:09,  5.46s/it]

  Epoch 05/600 | Train Loss=13.9397 | Train Acc=70.75% | Test Acc=81.84%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   1%|          | 6/600 [00:32<53:28,  5.40s/it]

  Epoch 06/600 | Train Loss=13.5645 | Train Acc=78.87% | Test Acc=86.92%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   1%|          | 7/600 [00:38<53:19,  5.39s/it]

  Epoch 07/600 | Train Loss=13.2743 | Train Acc=82.75% | Test Acc=88.67%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   1%|▏         | 8/600 [00:43<53:08,  5.39s/it]

  Epoch 08/600 | Train Loss=13.0121 | Train Acc=85.15% | Test Acc=90.09%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   2%|▏         | 9/600 [00:49<54:32,  5.54s/it]

  Epoch 09/600 | Train Loss=12.7785 | Train Acc=86.68% | Test Acc=89.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   2%|▏         | 10/600 [00:54<53:39,  5.46s/it]

  Epoch 10/600 | Train Loss=12.5422 | Train Acc=87.86% | Test Acc=91.12%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   2%|▏         | 11/600 [01:00<53:24,  5.44s/it]

  Epoch 11/600 | Train Loss=12.3247 | Train Acc=88.68% | Test Acc=91.84%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   2%|▏         | 12/600 [01:05<53:09,  5.43s/it]

  Epoch 12/600 | Train Loss=12.1058 | Train Acc=89.57% | Test Acc=92.71%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   2%|▏         | 13/600 [01:11<53:11,  5.44s/it]

  Epoch 13/600 | Train Loss=11.9027 | Train Acc=90.02% | Test Acc=92.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   2%|▏         | 14/600 [01:16<52:42,  5.40s/it]

  Epoch 14/600 | Train Loss=11.6944 | Train Acc=90.63% | Test Acc=93.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   2%|▎         | 15/600 [01:22<54:09,  5.55s/it]

  Epoch 15/600 | Train Loss=11.5042 | Train Acc=90.72% | Test Acc=92.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   3%|▎         | 16/600 [01:27<53:52,  5.54s/it]

  Epoch 16/600 | Train Loss=11.3110 | Train Acc=91.16% | Test Acc=94.09%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   3%|▎         | 17/600 [01:33<53:13,  5.48s/it]

  Epoch 17/600 | Train Loss=11.1055 | Train Acc=91.98% | Test Acc=94.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   3%|▎         | 18/600 [01:38<53:32,  5.52s/it]

  Epoch 18/600 | Train Loss=10.9198 | Train Acc=92.13% | Test Acc=94.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   3%|▎         | 19/600 [01:44<53:18,  5.50s/it]

  Epoch 19/600 | Train Loss=10.7463 | Train Acc=92.33% | Test Acc=94.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   3%|▎         | 20/600 [01:49<53:41,  5.55s/it]

  Epoch 20/600 | Train Loss=10.5785 | Train Acc=92.15% | Test Acc=94.86%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   4%|▎         | 21/600 [01:55<54:12,  5.62s/it]

  Epoch 21/600 | Train Loss=10.3992 | Train Acc=92.67% | Test Acc=94.81%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   4%|▎         | 22/600 [02:01<53:53,  5.59s/it]

  Epoch 22/600 | Train Loss=10.2329 | Train Acc=92.72% | Test Acc=95.01%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   4%|▍         | 23/600 [02:06<53:24,  5.55s/it]

  Epoch 23/600 | Train Loss=10.0713 | Train Acc=92.93% | Test Acc=95.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   4%|▍         | 24/600 [02:12<54:10,  5.64s/it]

  Epoch 24/600 | Train Loss=9.9126 | Train Acc=93.05% | Test Acc=95.30%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   4%|▍         | 25/600 [02:18<54:50,  5.72s/it]

  Epoch 25/600 | Train Loss=9.7592 | Train Acc=93.31% | Test Acc=95.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   4%|▍         | 26/600 [02:24<56:04,  5.86s/it]

  Epoch 26/600 | Train Loss=9.6074 | Train Acc=93.41% | Test Acc=95.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   4%|▍         | 27/600 [02:30<55:08,  5.77s/it]

  Epoch 27/600 | Train Loss=9.4653 | Train Acc=93.51% | Test Acc=95.80%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   5%|▍         | 28/600 [02:35<54:05,  5.67s/it]

  Epoch 28/600 | Train Loss=9.3209 | Train Acc=93.62% | Test Acc=95.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   5%|▍         | 29/600 [02:41<53:38,  5.64s/it]

  Epoch 29/600 | Train Loss=9.1796 | Train Acc=93.61% | Test Acc=95.97%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   5%|▌         | 30/600 [02:46<53:14,  5.60s/it]

  Epoch 30/600 | Train Loss=9.0462 | Train Acc=93.96% | Test Acc=95.86%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   5%|▌         | 31/600 [02:52<52:51,  5.57s/it]

  Epoch 31/600 | Train Loss=8.9129 | Train Acc=94.04% | Test Acc=96.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   5%|▌         | 32/600 [02:58<53:36,  5.66s/it]

  Epoch 32/600 | Train Loss=8.7903 | Train Acc=94.01% | Test Acc=95.95%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   6%|▌         | 33/600 [03:03<52:53,  5.60s/it]

  Epoch 33/600 | Train Loss=8.6670 | Train Acc=94.09% | Test Acc=96.20%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   6%|▌         | 34/600 [03:08<52:31,  5.57s/it]

  Epoch 34/600 | Train Loss=8.5539 | Train Acc=94.14% | Test Acc=96.14%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   6%|▌         | 35/600 [03:14<52:02,  5.53s/it]

  Epoch 35/600 | Train Loss=8.4431 | Train Acc=94.06% | Test Acc=96.37%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   6%|▌         | 36/600 [03:20<52:09,  5.55s/it]

  Epoch 36/600 | Train Loss=8.3250 | Train Acc=94.31% | Test Acc=96.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   6%|▌         | 37/600 [03:25<52:41,  5.62s/it]

  Epoch 37/600 | Train Loss=8.2156 | Train Acc=94.31% | Test Acc=96.25%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   6%|▋         | 38/600 [03:31<54:13,  5.79s/it]

  Epoch 38/600 | Train Loss=8.1132 | Train Acc=94.25% | Test Acc=96.28%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   6%|▋         | 39/600 [03:37<54:21,  5.81s/it]

  Epoch 39/600 | Train Loss=8.0151 | Train Acc=94.37% | Test Acc=96.23%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   7%|▋         | 40/600 [03:43<53:20,  5.71s/it]

  Epoch 40/600 | Train Loss=7.9188 | Train Acc=94.32% | Test Acc=96.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   7%|▋         | 41/600 [03:49<53:13,  5.71s/it]

  Epoch 41/600 | Train Loss=7.8204 | Train Acc=94.51% | Test Acc=96.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   7%|▋         | 42/600 [03:54<52:26,  5.64s/it]

  Epoch 42/600 | Train Loss=7.7272 | Train Acc=94.70% | Test Acc=96.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   7%|▋         | 43/600 [04:00<53:06,  5.72s/it]

  Epoch 43/600 | Train Loss=7.6374 | Train Acc=94.54% | Test Acc=96.69%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   7%|▋         | 44/600 [04:05<52:29,  5.66s/it]

  Epoch 44/600 | Train Loss=7.5544 | Train Acc=94.60% | Test Acc=96.68%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   8%|▊         | 45/600 [04:11<52:16,  5.65s/it]

  Epoch 45/600 | Train Loss=7.4664 | Train Acc=94.66% | Test Acc=96.76%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   8%|▊         | 46/600 [04:17<51:49,  5.61s/it]

  Epoch 46/600 | Train Loss=7.3910 | Train Acc=94.72% | Test Acc=96.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   8%|▊         | 47/600 [04:22<51:12,  5.56s/it]

  Epoch 47/600 | Train Loss=7.3112 | Train Acc=94.75% | Test Acc=96.85%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   8%|▊         | 48/600 [04:27<50:41,  5.51s/it]

  Epoch 48/600 | Train Loss=7.2394 | Train Acc=94.66% | Test Acc=96.68%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   8%|▊         | 49/600 [04:34<52:30,  5.72s/it]

  Epoch 49/600 | Train Loss=7.1668 | Train Acc=94.75% | Test Acc=97.01%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   8%|▊         | 50/600 [04:39<51:02,  5.57s/it]

  Epoch 50/600 | Train Loss=7.0961 | Train Acc=94.77% | Test Acc=96.79%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   8%|▊         | 51/600 [04:44<48:41,  5.32s/it]

  Epoch 51/600 | Train Loss=7.0291 | Train Acc=94.93% | Test Acc=96.83%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   9%|▊         | 52/600 [04:48<47:26,  5.19s/it]

  Epoch 52/600 | Train Loss=6.9700 | Train Acc=94.89% | Test Acc=97.05%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   9%|▉         | 53/600 [04:53<46:08,  5.06s/it]

  Epoch 53/600 | Train Loss=6.9055 | Train Acc=94.65% | Test Acc=96.93%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   9%|▉         | 54/600 [04:58<45:11,  4.97s/it]

  Epoch 54/600 | Train Loss=6.8415 | Train Acc=94.93% | Test Acc=96.91%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   9%|▉         | 55/600 [05:03<44:28,  4.90s/it]

  Epoch 55/600 | Train Loss=6.7790 | Train Acc=95.11% | Test Acc=96.86%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:   9%|▉         | 56/600 [05:08<45:26,  5.01s/it]

  Epoch 56/600 | Train Loss=6.7257 | Train Acc=95.01% | Test Acc=96.78%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  10%|▉         | 57/600 [05:13<44:43,  4.94s/it]

  Epoch 57/600 | Train Loss=6.6710 | Train Acc=95.05% | Test Acc=97.02%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  10%|▉         | 58/600 [05:18<44:14,  4.90s/it]

  Epoch 58/600 | Train Loss=6.6247 | Train Acc=94.85% | Test Acc=97.11%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  10%|▉         | 59/600 [05:22<43:47,  4.86s/it]

  Epoch 59/600 | Train Loss=6.5626 | Train Acc=95.18% | Test Acc=96.92%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  10%|█         | 60/600 [05:27<43:24,  4.82s/it]

  Epoch 60/600 | Train Loss=6.5194 | Train Acc=95.07% | Test Acc=97.24%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  10%|█         | 61/600 [05:32<42:58,  4.78s/it]

  Epoch 61/600 | Train Loss=6.4786 | Train Acc=94.90% | Test Acc=97.10%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  10%|█         | 62/600 [05:37<43:30,  4.85s/it]

  Epoch 62/600 | Train Loss=6.4303 | Train Acc=95.01% | Test Acc=97.18%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  10%|█         | 63/600 [05:42<44:26,  4.97s/it]

  Epoch 63/600 | Train Loss=6.3882 | Train Acc=95.02% | Test Acc=97.14%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  11%|█         | 64/600 [05:47<44:38,  5.00s/it]

  Epoch 64/600 | Train Loss=6.3436 | Train Acc=95.17% | Test Acc=97.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  11%|█         | 65/600 [05:52<44:41,  5.01s/it]

  Epoch 65/600 | Train Loss=6.3075 | Train Acc=94.96% | Test Acc=97.17%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  11%|█         | 66/600 [05:57<43:53,  4.93s/it]

  Epoch 66/600 | Train Loss=6.2656 | Train Acc=95.16% | Test Acc=97.18%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  11%|█         | 67/600 [06:02<43:17,  4.87s/it]

  Epoch 67/600 | Train Loss=6.2237 | Train Acc=95.25% | Test Acc=97.11%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  11%|█▏        | 68/600 [06:06<42:58,  4.85s/it]

  Epoch 68/600 | Train Loss=6.1919 | Train Acc=95.19% | Test Acc=97.37%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  12%|█▏        | 69/600 [06:12<43:56,  4.96s/it]

  Epoch 69/600 | Train Loss=6.1601 | Train Acc=95.11% | Test Acc=97.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  12%|█▏        | 70/600 [06:16<43:24,  4.91s/it]

  Epoch 70/600 | Train Loss=6.1208 | Train Acc=95.24% | Test Acc=97.15%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  12%|█▏        | 71/600 [06:21<42:54,  4.87s/it]

  Epoch 71/600 | Train Loss=6.0944 | Train Acc=95.21% | Test Acc=97.26%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  12%|█▏        | 72/600 [06:26<42:37,  4.84s/it]

  Epoch 72/600 | Train Loss=6.0607 | Train Acc=95.19% | Test Acc=97.29%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  12%|█▏        | 73/600 [06:31<42:28,  4.84s/it]

  Epoch 73/600 | Train Loss=6.0294 | Train Acc=95.23% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  12%|█▏        | 74/600 [06:36<42:08,  4.81s/it]

  Epoch 74/600 | Train Loss=6.0099 | Train Acc=95.05% | Test Acc=97.26%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  12%|█▎        | 75/600 [06:40<41:59,  4.80s/it]

  Epoch 75/600 | Train Loss=5.9804 | Train Acc=95.17% | Test Acc=97.22%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  13%|█▎        | 76/600 [06:46<43:58,  5.03s/it]

  Epoch 76/600 | Train Loss=5.9481 | Train Acc=95.22% | Test Acc=97.33%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  13%|█▎        | 77/600 [06:51<44:15,  5.08s/it]

  Epoch 77/600 | Train Loss=5.9203 | Train Acc=95.40% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  13%|█▎        | 78/600 [06:56<43:11,  4.96s/it]

  Epoch 78/600 | Train Loss=5.9046 | Train Acc=95.14% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  13%|█▎        | 79/600 [07:01<42:40,  4.91s/it]

  Epoch 79/600 | Train Loss=5.8789 | Train Acc=95.23% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  13%|█▎        | 80/600 [07:05<42:13,  4.87s/it]

  Epoch 80/600 | Train Loss=5.8504 | Train Acc=95.35% | Test Acc=97.30%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  14%|█▎        | 81/600 [07:10<41:53,  4.84s/it]

  Epoch 81/600 | Train Loss=5.8248 | Train Acc=95.36% | Test Acc=97.36%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  14%|█▎        | 82/600 [07:15<42:25,  4.91s/it]

  Epoch 82/600 | Train Loss=5.8080 | Train Acc=95.25% | Test Acc=97.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  14%|█▍        | 83/600 [07:20<41:52,  4.86s/it]

  Epoch 83/600 | Train Loss=5.7928 | Train Acc=95.15% | Test Acc=97.26%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  14%|█▍        | 84/600 [07:25<41:38,  4.84s/it]

  Epoch 84/600 | Train Loss=5.7699 | Train Acc=95.30% | Test Acc=97.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  14%|█▍        | 85/600 [07:30<41:32,  4.84s/it]

  Epoch 85/600 | Train Loss=5.7532 | Train Acc=95.20% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  14%|█▍        | 86/600 [07:34<41:11,  4.81s/it]

  Epoch 86/600 | Train Loss=5.7353 | Train Acc=95.18% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  14%|█▍        | 87/600 [07:39<41:07,  4.81s/it]

  Epoch 87/600 | Train Loss=5.7157 | Train Acc=95.32% | Test Acc=97.21%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  15%|█▍        | 88/600 [07:44<41:49,  4.90s/it]

  Epoch 88/600 | Train Loss=5.7037 | Train Acc=95.18% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  15%|█▍        | 89/600 [07:50<43:13,  5.07s/it]

  Epoch 89/600 | Train Loss=5.6853 | Train Acc=95.28% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  15%|█▌        | 90/600 [07:55<42:59,  5.06s/it]

  Epoch 90/600 | Train Loss=5.6691 | Train Acc=95.33% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  15%|█▌        | 91/600 [07:59<42:07,  4.97s/it]

  Epoch 91/600 | Train Loss=5.6489 | Train Acc=95.44% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  15%|█▌        | 92/600 [08:04<41:27,  4.90s/it]

  Epoch 92/600 | Train Loss=5.6467 | Train Acc=95.08% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  16%|█▌        | 93/600 [08:09<41:03,  4.86s/it]

  Epoch 93/600 | Train Loss=5.6259 | Train Acc=95.36% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  16%|█▌        | 94/600 [08:14<40:47,  4.84s/it]

  Epoch 94/600 | Train Loss=5.6098 | Train Acc=95.40% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  16%|█▌        | 95/600 [08:19<41:11,  4.89s/it]

  Epoch 95/600 | Train Loss=5.6001 | Train Acc=95.39% | Test Acc=97.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  16%|█▌        | 96/600 [08:24<41:04,  4.89s/it]

  Epoch 96/600 | Train Loss=5.5905 | Train Acc=95.15% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  16%|█▌        | 97/600 [08:28<40:46,  4.86s/it]

  Epoch 97/600 | Train Loss=5.5684 | Train Acc=95.53% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  16%|█▋        | 98/600 [08:33<40:26,  4.83s/it]

  Epoch 98/600 | Train Loss=5.5662 | Train Acc=95.23% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  16%|█▋        | 99/600 [08:38<40:13,  4.82s/it]

  Epoch 99/600 | Train Loss=5.5494 | Train Acc=95.39% | Test Acc=97.28%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  17%|█▋        | 100/600 [08:43<40:00,  4.80s/it]

  Epoch 100/600 | Train Loss=5.5490 | Train Acc=95.25% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  17%|█▋        | 101/600 [08:48<40:48,  4.91s/it]

  Epoch 101/600 | Train Loss=5.5377 | Train Acc=95.21% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  17%|█▋        | 102/600 [08:53<41:48,  5.04s/it]

  Epoch 102/600 | Train Loss=5.5297 | Train Acc=95.22% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  17%|█▋        | 103/600 [08:58<41:51,  5.05s/it]

  Epoch 103/600 | Train Loss=5.5178 | Train Acc=95.24% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  17%|█▋        | 104/600 [09:03<41:06,  4.97s/it]

  Epoch 104/600 | Train Loss=5.5115 | Train Acc=95.26% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  18%|█▊        | 105/600 [09:08<40:41,  4.93s/it]

  Epoch 105/600 | Train Loss=5.4976 | Train Acc=95.47% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  18%|█▊        | 106/600 [09:13<40:03,  4.87s/it]

  Epoch 106/600 | Train Loss=5.4848 | Train Acc=95.47% | Test Acc=97.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  18%|█▊        | 107/600 [09:17<39:41,  4.83s/it]

  Epoch 107/600 | Train Loss=5.4840 | Train Acc=95.25% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  18%|█▊        | 108/600 [09:22<39:15,  4.79s/it]

  Epoch 108/600 | Train Loss=5.4721 | Train Acc=95.52% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  18%|█▊        | 109/600 [09:27<40:18,  4.93s/it]

  Epoch 109/600 | Train Loss=5.4694 | Train Acc=95.22% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  18%|█▊        | 110/600 [09:32<39:41,  4.86s/it]

  Epoch 110/600 | Train Loss=5.4590 | Train Acc=95.39% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  18%|█▊        | 111/600 [09:37<39:23,  4.83s/it]

  Epoch 111/600 | Train Loss=5.4524 | Train Acc=95.34% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  19%|█▊        | 112/600 [09:42<39:05,  4.81s/it]

  Epoch 112/600 | Train Loss=5.4427 | Train Acc=95.41% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  19%|█▉        | 113/600 [09:47<39:49,  4.91s/it]

  Epoch 113/600 | Train Loss=5.4398 | Train Acc=95.30% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  19%|█▉        | 114/600 [09:52<40:25,  4.99s/it]

  Epoch 114/600 | Train Loss=5.4299 | Train Acc=95.52% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  19%|█▉        | 115/600 [09:57<41:21,  5.12s/it]

  Epoch 115/600 | Train Loss=5.4267 | Train Acc=95.48% | Test Acc=97.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  19%|█▉        | 116/600 [10:02<40:58,  5.08s/it]

  Epoch 116/600 | Train Loss=5.4168 | Train Acc=95.42% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  20%|█▉        | 117/600 [10:07<40:15,  5.00s/it]

  Epoch 117/600 | Train Loss=5.4216 | Train Acc=95.31% | Test Acc=97.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  20%|█▉        | 118/600 [10:12<39:35,  4.93s/it]

  Epoch 118/600 | Train Loss=5.4101 | Train Acc=95.40% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  20%|█▉        | 119/600 [10:17<39:16,  4.90s/it]

  Epoch 119/600 | Train Loss=5.4078 | Train Acc=95.32% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  20%|██        | 120/600 [10:21<38:48,  4.85s/it]

  Epoch 120/600 | Train Loss=5.4101 | Train Acc=95.14% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  20%|██        | 121/600 [10:26<38:48,  4.86s/it]

  Epoch 121/600 | Train Loss=5.3984 | Train Acc=95.38% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  20%|██        | 122/600 [10:32<39:49,  5.00s/it]

  Epoch 122/600 | Train Loss=5.3908 | Train Acc=95.44% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  20%|██        | 123/600 [10:36<39:05,  4.92s/it]

  Epoch 123/600 | Train Loss=5.3914 | Train Acc=95.34% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  21%|██        | 124/600 [10:41<38:23,  4.84s/it]

  Epoch 124/600 | Train Loss=5.3849 | Train Acc=95.34% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  21%|██        | 125/600 [10:46<37:55,  4.79s/it]

  Epoch 125/600 | Train Loss=5.3770 | Train Acc=95.36% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  21%|██        | 126/600 [10:49<35:16,  4.47s/it]

  Epoch 126/600 | Train Loss=5.3725 | Train Acc=95.51% | Test Acc=97.76%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  21%|██        | 127/600 [10:53<33:20,  4.23s/it]

  Epoch 127/600 | Train Loss=5.3752 | Train Acc=95.38% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  21%|██▏       | 128/600 [10:57<32:02,  4.07s/it]

  Epoch 128/600 | Train Loss=5.3675 | Train Acc=95.41% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  22%|██▏       | 129/600 [11:01<31:36,  4.03s/it]

  Epoch 129/600 | Train Loss=5.3635 | Train Acc=95.58% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  22%|██▏       | 130/600 [11:05<30:58,  3.95s/it]

  Epoch 130/600 | Train Loss=5.3581 | Train Acc=95.48% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  22%|██▏       | 131/600 [11:08<30:25,  3.89s/it]

  Epoch 131/600 | Train Loss=5.3598 | Train Acc=95.41% | Test Acc=97.68%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  22%|██▏       | 132/600 [11:12<29:53,  3.83s/it]

  Epoch 132/600 | Train Loss=5.3548 | Train Acc=95.33% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  22%|██▏       | 133/600 [11:16<29:28,  3.79s/it]

  Epoch 133/600 | Train Loss=5.3506 | Train Acc=95.47% | Test Acc=97.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  22%|██▏       | 134/600 [11:19<29:12,  3.76s/it]

  Epoch 134/600 | Train Loss=5.3528 | Train Acc=95.37% | Test Acc=97.64%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  22%|██▎       | 135/600 [11:23<28:56,  3.73s/it]

  Epoch 135/600 | Train Loss=5.3512 | Train Acc=95.27% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  23%|██▎       | 136/600 [11:27<28:50,  3.73s/it]

  Epoch 136/600 | Train Loss=5.3424 | Train Acc=95.46% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  23%|██▎       | 137/600 [11:30<28:39,  3.71s/it]

  Epoch 137/600 | Train Loss=5.3452 | Train Acc=95.36% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  23%|██▎       | 138/600 [11:35<29:24,  3.82s/it]

  Epoch 138/600 | Train Loss=5.3421 | Train Acc=95.33% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  23%|██▎       | 139/600 [11:38<29:08,  3.79s/it]

  Epoch 139/600 | Train Loss=5.3361 | Train Acc=95.43% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  23%|██▎       | 140/600 [11:42<29:51,  3.89s/it]

  Epoch 140/600 | Train Loss=5.3390 | Train Acc=95.32% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  24%|██▎       | 141/600 [11:47<30:33,  4.00s/it]

  Epoch 141/600 | Train Loss=5.3345 | Train Acc=95.35% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  24%|██▎       | 142/600 [11:50<29:40,  3.89s/it]

  Epoch 142/600 | Train Loss=5.3360 | Train Acc=95.15% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  24%|██▍       | 143/600 [11:54<29:04,  3.82s/it]

  Epoch 143/600 | Train Loss=5.3317 | Train Acc=95.30% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  24%|██▍       | 144/600 [11:58<28:46,  3.79s/it]

  Epoch 144/600 | Train Loss=5.3283 | Train Acc=95.40% | Test Acc=97.71%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  24%|██▍       | 145/600 [12:01<28:29,  3.76s/it]

  Epoch 145/600 | Train Loss=5.3221 | Train Acc=95.52% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  24%|██▍       | 146/600 [12:05<28:53,  3.82s/it]

  Epoch 146/600 | Train Loss=5.3271 | Train Acc=95.41% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  24%|██▍       | 147/600 [12:09<28:56,  3.83s/it]

  Epoch 147/600 | Train Loss=5.3210 | Train Acc=95.38% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  25%|██▍       | 148/600 [12:13<28:30,  3.78s/it]

  Epoch 148/600 | Train Loss=5.3117 | Train Acc=95.50% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  25%|██▍       | 149/600 [12:17<28:21,  3.77s/it]

  Epoch 149/600 | Train Loss=5.3175 | Train Acc=95.41% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  25%|██▌       | 150/600 [12:20<28:02,  3.74s/it]

  Epoch 150/600 | Train Loss=5.3154 | Train Acc=95.46% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  25%|██▌       | 151/600 [12:24<27:48,  3.72s/it]

  Epoch 151/600 | Train Loss=5.3181 | Train Acc=95.23% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  25%|██▌       | 152/600 [12:28<28:53,  3.87s/it]

  Epoch 152/600 | Train Loss=5.3142 | Train Acc=95.33% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  26%|██▌       | 153/600 [12:32<29:25,  3.95s/it]

  Epoch 153/600 | Train Loss=5.3125 | Train Acc=95.34% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  26%|██▌       | 154/600 [12:37<30:25,  4.09s/it]

  Epoch 154/600 | Train Loss=5.3063 | Train Acc=95.47% | Test Acc=97.42%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  26%|██▌       | 155/600 [12:41<30:03,  4.05s/it]

  Epoch 155/600 | Train Loss=5.3052 | Train Acc=95.48% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  26%|██▌       | 156/600 [12:44<29:09,  3.94s/it]

  Epoch 156/600 | Train Loss=5.3086 | Train Acc=95.38% | Test Acc=97.71%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  26%|██▌       | 157/600 [12:48<28:38,  3.88s/it]

  Epoch 157/600 | Train Loss=5.3011 | Train Acc=95.54% | Test Acc=97.67%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  26%|██▋       | 158/600 [12:52<28:03,  3.81s/it]

  Epoch 158/600 | Train Loss=5.2983 | Train Acc=95.54% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  26%|██▋       | 159/600 [12:55<27:37,  3.76s/it]

  Epoch 159/600 | Train Loss=5.2986 | Train Acc=95.49% | Test Acc=97.74%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  27%|██▋       | 160/600 [12:59<27:28,  3.75s/it]

  Epoch 160/600 | Train Loss=5.2964 | Train Acc=95.47% | Test Acc=97.69%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  27%|██▋       | 161/600 [13:03<27:16,  3.73s/it]

  Epoch 161/600 | Train Loss=5.2949 | Train Acc=95.61% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  27%|██▋       | 162/600 [13:06<27:06,  3.71s/it]

  Epoch 162/600 | Train Loss=5.2994 | Train Acc=95.42% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  27%|██▋       | 163/600 [13:10<27:44,  3.81s/it]

  Epoch 163/600 | Train Loss=5.2909 | Train Acc=95.57% | Test Acc=97.74%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  27%|██▋       | 164/600 [13:14<27:20,  3.76s/it]

  Epoch 164/600 | Train Loss=5.2949 | Train Acc=95.49% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  28%|██▊       | 165/600 [13:18<28:08,  3.88s/it]

  Epoch 165/600 | Train Loss=5.2949 | Train Acc=95.39% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  28%|██▊       | 166/600 [13:22<28:51,  3.99s/it]

  Epoch 166/600 | Train Loss=5.2984 | Train Acc=95.39% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  28%|██▊       | 167/600 [13:27<29:29,  4.09s/it]

  Epoch 167/600 | Train Loss=5.2898 | Train Acc=95.41% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  28%|██▊       | 168/600 [13:30<28:28,  3.96s/it]

  Epoch 168/600 | Train Loss=5.2952 | Train Acc=95.33% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  28%|██▊       | 169/600 [13:34<27:44,  3.86s/it]

  Epoch 169/600 | Train Loss=5.2901 | Train Acc=95.44% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  28%|██▊       | 170/600 [13:38<27:26,  3.83s/it]

  Epoch 170/600 | Train Loss=5.2954 | Train Acc=95.33% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  28%|██▊       | 171/600 [13:42<27:41,  3.87s/it]

  Epoch 171/600 | Train Loss=5.2866 | Train Acc=95.57% | Test Acc=97.67%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  29%|██▊       | 172/600 [13:46<27:30,  3.86s/it]

  Epoch 172/600 | Train Loss=5.2852 | Train Acc=95.51% | Test Acc=97.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  29%|██▉       | 173/600 [13:49<27:06,  3.81s/it]

  Epoch 173/600 | Train Loss=5.2855 | Train Acc=95.47% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  29%|██▉       | 174/600 [13:53<26:41,  3.76s/it]

  Epoch 174/600 | Train Loss=5.2815 | Train Acc=95.56% | Test Acc=97.73%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  29%|██▉       | 175/600 [13:57<26:28,  3.74s/it]

  Epoch 175/600 | Train Loss=5.2818 | Train Acc=95.47% | Test Acc=97.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  29%|██▉       | 176/600 [14:00<26:13,  3.71s/it]

  Epoch 176/600 | Train Loss=5.2831 | Train Acc=95.43% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  30%|██▉       | 177/600 [14:04<26:00,  3.69s/it]

  Epoch 177/600 | Train Loss=5.2846 | Train Acc=95.45% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  30%|██▉       | 178/600 [14:08<26:58,  3.83s/it]

  Epoch 178/600 | Train Loss=5.2788 | Train Acc=95.51% | Test Acc=97.77%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  30%|██▉       | 179/600 [14:12<27:48,  3.96s/it]

  Epoch 179/600 | Train Loss=5.2787 | Train Acc=95.55% | Test Acc=97.75%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  30%|███       | 180/600 [14:17<29:35,  4.23s/it]

  Epoch 180/600 | Train Loss=5.2809 | Train Acc=95.35% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  30%|███       | 181/600 [14:21<28:19,  4.06s/it]

  Epoch 181/600 | Train Loss=5.2722 | Train Acc=95.56% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  30%|███       | 182/600 [14:25<27:24,  3.93s/it]

  Epoch 182/600 | Train Loss=5.2756 | Train Acc=95.46% | Test Acc=97.69%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  30%|███       | 183/600 [14:28<26:51,  3.87s/it]

  Epoch 183/600 | Train Loss=5.2756 | Train Acc=95.52% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  31%|███       | 184/600 [14:32<26:18,  3.80s/it]

  Epoch 184/600 | Train Loss=5.2755 | Train Acc=95.41% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  31%|███       | 185/600 [14:36<26:00,  3.76s/it]

  Epoch 185/600 | Train Loss=5.2679 | Train Acc=95.65% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  31%|███       | 186/600 [14:39<25:51,  3.75s/it]

  Epoch 186/600 | Train Loss=5.2726 | Train Acc=95.45% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  31%|███       | 187/600 [14:43<25:38,  3.72s/it]

  Epoch 187/600 | Train Loss=5.2715 | Train Acc=95.56% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  31%|███▏      | 188/600 [14:47<26:13,  3.82s/it]

  Epoch 188/600 | Train Loss=5.2727 | Train Acc=95.45% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  32%|███▏      | 189/600 [14:51<25:52,  3.78s/it]

  Epoch 189/600 | Train Loss=5.2708 | Train Acc=95.47% | Test Acc=97.67%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  32%|███▏      | 190/600 [14:55<26:30,  3.88s/it]

  Epoch 190/600 | Train Loss=5.2736 | Train Acc=95.42% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  32%|███▏      | 191/600 [14:59<27:02,  3.97s/it]

  Epoch 191/600 | Train Loss=5.2695 | Train Acc=95.43% | Test Acc=97.73%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  32%|███▏      | 192/600 [15:03<27:34,  4.06s/it]

  Epoch 192/600 | Train Loss=5.2629 | Train Acc=95.57% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  32%|███▏      | 193/600 [15:08<28:01,  4.13s/it]

  Epoch 193/600 | Train Loss=5.2679 | Train Acc=95.51% | Test Acc=97.73%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  32%|███▏      | 194/600 [15:11<27:00,  3.99s/it]

  Epoch 194/600 | Train Loss=5.2647 | Train Acc=95.56% | Test Acc=97.78%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  32%|███▎      | 195/600 [15:15<26:15,  3.89s/it]

  Epoch 195/600 | Train Loss=5.2696 | Train Acc=95.42% | Test Acc=97.68%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  33%|███▎      | 196/600 [15:19<26:29,  3.93s/it]

  Epoch 196/600 | Train Loss=5.2576 | Train Acc=95.66% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  33%|███▎      | 197/600 [15:23<26:03,  3.88s/it]

  Epoch 197/600 | Train Loss=5.2559 | Train Acc=95.66% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  33%|███▎      | 198/600 [15:26<25:41,  3.83s/it]

  Epoch 198/600 | Train Loss=5.2581 | Train Acc=95.62% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  33%|███▎      | 199/600 [15:30<25:19,  3.79s/it]

  Epoch 199/600 | Train Loss=5.2608 | Train Acc=95.53% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  33%|███▎      | 200/600 [15:34<24:56,  3.74s/it]

  Epoch 200/600 | Train Loss=5.2627 | Train Acc=95.44% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  34%|███▎      | 201/600 [15:37<24:54,  3.75s/it]

  Epoch 201/600 | Train Loss=5.2636 | Train Acc=95.47% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  34%|███▎      | 202/600 [15:41<24:39,  3.72s/it]

  Epoch 202/600 | Train Loss=5.2591 | Train Acc=95.37% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  34%|███▍      | 203/600 [15:45<25:32,  3.86s/it]

  Epoch 203/600 | Train Loss=5.2559 | Train Acc=95.60% | Test Acc=97.72%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  34%|███▍      | 204/600 [15:49<26:05,  3.95s/it]

  Epoch 204/600 | Train Loss=5.2593 | Train Acc=95.46% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  34%|███▍      | 205/600 [15:54<27:36,  4.19s/it]

  Epoch 205/600 | Train Loss=5.2567 | Train Acc=95.54% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  34%|███▍      | 206/600 [15:59<28:35,  4.36s/it]

  Epoch 206/600 | Train Loss=5.2527 | Train Acc=95.61% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  34%|███▍      | 207/600 [16:04<29:06,  4.45s/it]

  Epoch 207/600 | Train Loss=5.2466 | Train Acc=95.71% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  35%|███▍      | 208/600 [16:08<29:29,  4.51s/it]

  Epoch 208/600 | Train Loss=5.2543 | Train Acc=95.55% | Test Acc=97.71%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  35%|███▍      | 209/600 [16:13<29:34,  4.54s/it]

  Epoch 209/600 | Train Loss=5.2578 | Train Acc=95.40% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  35%|███▌      | 210/600 [16:18<29:50,  4.59s/it]

  Epoch 210/600 | Train Loss=5.2578 | Train Acc=95.48% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  35%|███▌      | 211/600 [16:22<29:47,  4.59s/it]

  Epoch 211/600 | Train Loss=5.2541 | Train Acc=95.46% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  35%|███▌      | 212/600 [16:27<30:40,  4.74s/it]

  Epoch 212/600 | Train Loss=5.2529 | Train Acc=95.44% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  36%|███▌      | 213/600 [16:32<30:27,  4.72s/it]

  Epoch 213/600 | Train Loss=5.2577 | Train Acc=95.44% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  36%|███▌      | 214/600 [16:37<30:15,  4.70s/it]

  Epoch 214/600 | Train Loss=5.2475 | Train Acc=95.56% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  36%|███▌      | 215/600 [16:41<30:03,  4.68s/it]

  Epoch 215/600 | Train Loss=5.2486 | Train Acc=95.51% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  36%|███▌      | 216/600 [16:46<29:58,  4.68s/it]

  Epoch 216/600 | Train Loss=5.2475 | Train Acc=95.63% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  36%|███▌      | 217/600 [16:51<29:55,  4.69s/it]

  Epoch 217/600 | Train Loss=5.2509 | Train Acc=95.47% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  36%|███▋      | 218/600 [16:55<30:07,  4.73s/it]

  Epoch 218/600 | Train Loss=5.2429 | Train Acc=95.62% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  36%|███▋      | 219/600 [17:00<30:13,  4.76s/it]

  Epoch 219/600 | Train Loss=5.2445 | Train Acc=95.48% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  37%|███▋      | 220/600 [17:05<29:54,  4.72s/it]

  Epoch 220/600 | Train Loss=5.2431 | Train Acc=95.47% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  37%|███▋      | 221/600 [17:10<29:48,  4.72s/it]

  Epoch 221/600 | Train Loss=5.2437 | Train Acc=95.54% | Test Acc=97.69%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  37%|███▋      | 222/600 [17:14<29:36,  4.70s/it]

  Epoch 222/600 | Train Loss=5.2447 | Train Acc=95.41% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  37%|███▋      | 223/600 [17:19<29:32,  4.70s/it]

  Epoch 223/600 | Train Loss=5.2413 | Train Acc=95.58% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  37%|███▋      | 224/600 [17:24<29:20,  4.68s/it]

  Epoch 224/600 | Train Loss=5.2407 | Train Acc=95.60% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  38%|███▊      | 225/600 [17:29<29:59,  4.80s/it]

  Epoch 225/600 | Train Loss=5.2376 | Train Acc=95.66% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  38%|███▊      | 226/600 [17:33<29:44,  4.77s/it]

  Epoch 226/600 | Train Loss=5.2358 | Train Acc=95.67% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  38%|███▊      | 227/600 [17:38<29:29,  4.74s/it]

  Epoch 227/600 | Train Loss=5.2429 | Train Acc=95.49% | Test Acc=97.31%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  38%|███▊      | 228/600 [17:43<29:19,  4.73s/it]

  Epoch 228/600 | Train Loss=5.2343 | Train Acc=95.59% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  38%|███▊      | 229/600 [17:47<29:08,  4.71s/it]

  Epoch 229/600 | Train Loss=5.2334 | Train Acc=95.66% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  38%|███▊      | 230/600 [17:52<28:52,  4.68s/it]

  Epoch 230/600 | Train Loss=5.2306 | Train Acc=95.77% | Test Acc=97.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  38%|███▊      | 231/600 [17:57<28:50,  4.69s/it]

  Epoch 231/600 | Train Loss=5.2357 | Train Acc=95.44% | Test Acc=97.31%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  39%|███▊      | 232/600 [18:02<29:22,  4.79s/it]

  Epoch 232/600 | Train Loss=5.2318 | Train Acc=95.56% | Test Acc=97.71%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  39%|███▉      | 233/600 [18:06<29:08,  4.76s/it]

  Epoch 233/600 | Train Loss=5.2386 | Train Acc=95.40% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  39%|███▉      | 234/600 [18:11<28:48,  4.72s/it]

  Epoch 234/600 | Train Loss=5.2320 | Train Acc=95.60% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  39%|███▉      | 235/600 [18:16<28:31,  4.69s/it]

  Epoch 235/600 | Train Loss=5.2365 | Train Acc=95.47% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  39%|███▉      | 236/600 [18:20<28:28,  4.69s/it]

  Epoch 236/600 | Train Loss=5.2304 | Train Acc=95.65% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  40%|███▉      | 237/600 [18:25<28:20,  4.68s/it]

  Epoch 237/600 | Train Loss=5.2244 | Train Acc=95.69% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  40%|███▉      | 238/600 [18:30<28:17,  4.69s/it]

  Epoch 238/600 | Train Loss=5.2289 | Train Acc=95.55% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  40%|███▉      | 239/600 [18:35<28:51,  4.80s/it]

  Epoch 239/600 | Train Loss=5.2316 | Train Acc=95.50% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  40%|████      | 240/600 [18:40<28:34,  4.76s/it]

  Epoch 240/600 | Train Loss=5.2329 | Train Acc=95.49% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  40%|████      | 241/600 [18:44<28:14,  4.72s/it]

  Epoch 241/600 | Train Loss=5.2221 | Train Acc=95.72% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  40%|████      | 242/600 [18:49<28:04,  4.71s/it]

  Epoch 242/600 | Train Loss=5.2260 | Train Acc=95.58% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  40%|████      | 243/600 [18:53<27:52,  4.69s/it]

  Epoch 243/600 | Train Loss=5.2290 | Train Acc=95.44% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  41%|████      | 244/600 [18:58<27:49,  4.69s/it]

  Epoch 244/600 | Train Loss=5.2210 | Train Acc=95.74% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  41%|████      | 245/600 [19:03<27:37,  4.67s/it]

  Epoch 245/600 | Train Loss=5.2257 | Train Acc=95.55% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  41%|████      | 246/600 [19:08<28:17,  4.80s/it]

  Epoch 246/600 | Train Loss=5.2229 | Train Acc=95.59% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  41%|████      | 247/600 [19:12<27:52,  4.74s/it]

  Epoch 247/600 | Train Loss=5.2165 | Train Acc=95.78% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  41%|████▏     | 248/600 [19:17<27:37,  4.71s/it]

  Epoch 248/600 | Train Loss=5.2157 | Train Acc=95.77% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  42%|████▏     | 249/600 [19:22<27:25,  4.69s/it]

  Epoch 249/600 | Train Loss=5.2124 | Train Acc=95.68% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  42%|████▏     | 250/600 [19:26<27:21,  4.69s/it]

  Epoch 250/600 | Train Loss=5.2161 | Train Acc=95.65% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  42%|████▏     | 251/600 [19:31<27:12,  4.68s/it]

  Epoch 251/600 | Train Loss=5.2183 | Train Acc=95.62% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  42%|████▏     | 252/600 [19:36<27:01,  4.66s/it]

  Epoch 252/600 | Train Loss=5.2079 | Train Acc=95.73% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  42%|████▏     | 253/600 [19:41<27:40,  4.78s/it]

  Epoch 253/600 | Train Loss=5.2128 | Train Acc=95.60% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  42%|████▏     | 254/600 [19:45<27:22,  4.75s/it]

  Epoch 254/600 | Train Loss=5.2196 | Train Acc=95.41% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  42%|████▎     | 255/600 [19:50<27:10,  4.73s/it]

  Epoch 255/600 | Train Loss=5.2101 | Train Acc=95.79% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  43%|████▎     | 256/600 [19:55<26:57,  4.70s/it]

  Epoch 256/600 | Train Loss=5.2133 | Train Acc=95.63% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  43%|████▎     | 257/600 [19:59<26:53,  4.70s/it]

  Epoch 257/600 | Train Loss=5.2129 | Train Acc=95.49% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  43%|████▎     | 258/600 [20:04<26:40,  4.68s/it]

  Epoch 258/600 | Train Loss=5.2050 | Train Acc=95.75% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  43%|████▎     | 259/600 [20:09<26:32,  4.67s/it]

  Epoch 259/600 | Train Loss=5.2069 | Train Acc=95.56% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  43%|████▎     | 260/600 [20:14<27:17,  4.81s/it]

  Epoch 260/600 | Train Loss=5.2064 | Train Acc=95.69% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  44%|████▎     | 261/600 [20:19<27:00,  4.78s/it]

  Epoch 261/600 | Train Loss=5.2070 | Train Acc=95.71% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  44%|████▎     | 262/600 [20:23<26:37,  4.73s/it]

  Epoch 262/600 | Train Loss=5.2052 | Train Acc=95.67% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  44%|████▍     | 263/600 [20:28<26:28,  4.71s/it]

  Epoch 263/600 | Train Loss=5.2049 | Train Acc=95.56% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  44%|████▍     | 264/600 [20:33<26:15,  4.69s/it]

  Epoch 264/600 | Train Loss=5.2011 | Train Acc=95.75% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  44%|████▍     | 265/600 [20:37<26:10,  4.69s/it]

  Epoch 265/600 | Train Loss=5.2056 | Train Acc=95.63% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  44%|████▍     | 266/600 [20:42<26:11,  4.71s/it]

  Epoch 266/600 | Train Loss=5.2020 | Train Acc=95.58% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  44%|████▍     | 267/600 [20:47<26:30,  4.78s/it]

  Epoch 267/600 | Train Loss=5.1945 | Train Acc=95.73% | Test Acc=97.37%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  45%|████▍     | 268/600 [20:52<26:46,  4.84s/it]

  Epoch 268/600 | Train Loss=5.2005 | Train Acc=95.54% | Test Acc=97.24%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  45%|████▍     | 269/600 [20:57<26:55,  4.88s/it]

  Epoch 269/600 | Train Loss=5.1972 | Train Acc=95.66% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  45%|████▌     | 270/600 [21:01<26:25,  4.81s/it]

  Epoch 270/600 | Train Loss=5.1997 | Train Acc=95.61% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  45%|████▌     | 271/600 [21:06<26:09,  4.77s/it]

  Epoch 271/600 | Train Loss=5.1962 | Train Acc=95.67% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  45%|████▌     | 272/600 [21:11<25:51,  4.73s/it]

  Epoch 272/600 | Train Loss=5.1944 | Train Acc=95.72% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  46%|████▌     | 273/600 [21:16<26:16,  4.82s/it]

  Epoch 273/600 | Train Loss=5.1943 | Train Acc=95.81% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  46%|████▌     | 274/600 [21:21<26:01,  4.79s/it]

  Epoch 274/600 | Train Loss=5.1981 | Train Acc=95.68% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  46%|████▌     | 275/600 [21:25<25:44,  4.75s/it]

  Epoch 275/600 | Train Loss=5.1880 | Train Acc=95.86% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  46%|████▌     | 276/600 [21:30<25:30,  4.72s/it]

  Epoch 276/600 | Train Loss=5.1914 | Train Acc=95.71% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  46%|████▌     | 277/600 [21:35<25:16,  4.70s/it]

  Epoch 277/600 | Train Loss=5.1940 | Train Acc=95.70% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  46%|████▋     | 278/600 [21:39<25:11,  4.69s/it]

  Epoch 278/600 | Train Loss=5.1918 | Train Acc=95.60% | Test Acc=97.28%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  46%|████▋     | 279/600 [21:44<25:04,  4.69s/it]

  Epoch 279/600 | Train Loss=5.1940 | Train Acc=95.55% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  47%|████▋     | 280/600 [21:49<26:06,  4.90s/it]

  Epoch 280/600 | Train Loss=5.1892 | Train Acc=95.66% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  47%|████▋     | 281/600 [21:54<26:11,  4.93s/it]

  Epoch 281/600 | Train Loss=5.1860 | Train Acc=95.78% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  47%|████▋     | 282/600 [21:59<26:16,  4.96s/it]

  Epoch 282/600 | Train Loss=5.1917 | Train Acc=95.53% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  47%|████▋     | 283/600 [22:04<25:41,  4.86s/it]

  Epoch 283/600 | Train Loss=5.1862 | Train Acc=95.73% | Test Acc=97.69%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  47%|████▋     | 284/600 [22:09<25:22,  4.82s/it]

  Epoch 284/600 | Train Loss=5.1844 | Train Acc=95.80% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  48%|████▊     | 285/600 [22:13<24:56,  4.75s/it]

  Epoch 285/600 | Train Loss=5.1850 | Train Acc=95.58% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  48%|████▊     | 286/600 [22:18<24:44,  4.73s/it]

  Epoch 286/600 | Train Loss=5.1810 | Train Acc=95.80% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  48%|████▊     | 287/600 [22:23<25:12,  4.83s/it]

  Epoch 287/600 | Train Loss=5.1850 | Train Acc=95.60% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  48%|████▊     | 288/600 [22:28<24:54,  4.79s/it]

  Epoch 288/600 | Train Loss=5.1836 | Train Acc=95.70% | Test Acc=97.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  48%|████▊     | 289/600 [22:32<24:36,  4.75s/it]

  Epoch 289/600 | Train Loss=5.1797 | Train Acc=95.77% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  48%|████▊     | 290/600 [22:37<24:24,  4.72s/it]

  Epoch 290/600 | Train Loss=5.1837 | Train Acc=95.62% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  48%|████▊     | 291/600 [22:42<24:13,  4.71s/it]

  Epoch 291/600 | Train Loss=5.1809 | Train Acc=95.67% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  49%|████▊     | 292/600 [22:46<24:09,  4.71s/it]

  Epoch 292/600 | Train Loss=5.1739 | Train Acc=95.88% | Test Acc=97.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  49%|████▉     | 293/600 [22:52<24:46,  4.84s/it]

  Epoch 293/600 | Train Loss=5.1766 | Train Acc=95.79% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  49%|████▉     | 294/600 [22:57<25:08,  4.93s/it]

  Epoch 294/600 | Train Loss=5.1729 | Train Acc=95.69% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  49%|████▉     | 295/600 [23:02<25:06,  4.94s/it]

  Epoch 295/600 | Train Loss=5.1776 | Train Acc=95.62% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  49%|████▉     | 296/600 [23:06<24:40,  4.87s/it]

  Epoch 296/600 | Train Loss=5.1745 | Train Acc=95.74% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  50%|████▉     | 297/600 [23:11<24:11,  4.79s/it]

  Epoch 297/600 | Train Loss=5.1808 | Train Acc=95.48% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  50%|████▉     | 298/600 [23:16<23:55,  4.75s/it]

  Epoch 298/600 | Train Loss=5.1726 | Train Acc=95.74% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  50%|████▉     | 299/600 [23:20<23:45,  4.74s/it]

  Epoch 299/600 | Train Loss=5.1673 | Train Acc=95.88% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  50%|█████     | 300/600 [23:25<24:09,  4.83s/it]

  Epoch 300/600 | Train Loss=5.1623 | Train Acc=95.94% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  50%|█████     | 301/600 [23:30<23:48,  4.78s/it]

  Epoch 301/600 | Train Loss=5.1682 | Train Acc=95.76% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  50%|█████     | 302/600 [23:35<23:31,  4.74s/it]

  Epoch 302/600 | Train Loss=5.1737 | Train Acc=95.65% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  50%|█████     | 303/600 [23:39<23:23,  4.72s/it]

  Epoch 303/600 | Train Loss=5.1672 | Train Acc=95.90% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  51%|█████     | 304/600 [23:44<23:07,  4.69s/it]

  Epoch 304/600 | Train Loss=5.1740 | Train Acc=95.61% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  51%|█████     | 305/600 [23:49<23:35,  4.80s/it]

  Epoch 305/600 | Train Loss=5.1681 | Train Acc=95.62% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  51%|█████     | 306/600 [23:54<23:50,  4.87s/it]

  Epoch 306/600 | Train Loss=5.1682 | Train Acc=95.60% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  51%|█████     | 307/600 [23:59<24:29,  5.02s/it]

  Epoch 307/600 | Train Loss=5.1635 | Train Acc=95.88% | Test Acc=97.42%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  51%|█████▏    | 308/600 [24:04<24:17,  4.99s/it]

  Epoch 308/600 | Train Loss=5.1677 | Train Acc=95.64% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  52%|█████▏    | 309/600 [24:09<23:48,  4.91s/it]

  Epoch 309/600 | Train Loss=5.1614 | Train Acc=95.80% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  52%|█████▏    | 310/600 [24:14<23:20,  4.83s/it]

  Epoch 310/600 | Train Loss=5.1609 | Train Acc=95.88% | Test Acc=97.64%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  52%|█████▏    | 311/600 [24:18<23:01,  4.78s/it]

  Epoch 311/600 | Train Loss=5.1655 | Train Acc=95.78% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  52%|█████▏    | 312/600 [24:23<22:43,  4.74s/it]

  Epoch 312/600 | Train Loss=5.1644 | Train Acc=95.71% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  52%|█████▏    | 313/600 [24:28<22:37,  4.73s/it]

  Epoch 313/600 | Train Loss=5.1624 | Train Acc=95.77% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  52%|█████▏    | 314/600 [24:33<22:50,  4.79s/it]

  Epoch 314/600 | Train Loss=5.1650 | Train Acc=95.66% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  52%|█████▎    | 315/600 [24:37<22:37,  4.76s/it]

  Epoch 315/600 | Train Loss=5.1616 | Train Acc=95.86% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  53%|█████▎    | 316/600 [24:42<22:23,  4.73s/it]

  Epoch 316/600 | Train Loss=5.1578 | Train Acc=95.80% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  53%|█████▎    | 317/600 [24:47<22:13,  4.71s/it]

  Epoch 317/600 | Train Loss=5.1593 | Train Acc=95.81% | Test Acc=97.42%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  53%|█████▎    | 318/600 [24:52<22:28,  4.78s/it]

  Epoch 318/600 | Train Loss=5.1598 | Train Acc=95.81% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  53%|█████▎    | 319/600 [24:57<22:40,  4.84s/it]

  Epoch 319/600 | Train Loss=5.1554 | Train Acc=95.91% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  53%|█████▎    | 320/600 [25:02<23:15,  4.98s/it]

  Epoch 320/600 | Train Loss=5.1565 | Train Acc=95.80% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  54%|█████▎    | 321/600 [25:07<23:05,  4.97s/it]

  Epoch 321/600 | Train Loss=5.1617 | Train Acc=95.66% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  54%|█████▎    | 322/600 [25:11<22:34,  4.87s/it]

  Epoch 322/600 | Train Loss=5.1524 | Train Acc=95.86% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  54%|█████▍    | 323/600 [25:16<22:14,  4.82s/it]

  Epoch 323/600 | Train Loss=5.1528 | Train Acc=95.89% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  54%|█████▍    | 324/600 [25:21<21:58,  4.78s/it]

  Epoch 324/600 | Train Loss=5.1556 | Train Acc=95.72% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  54%|█████▍    | 325/600 [25:25<21:40,  4.73s/it]

  Epoch 325/600 | Train Loss=5.1495 | Train Acc=95.95% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  54%|█████▍    | 326/600 [25:30<21:33,  4.72s/it]

  Epoch 326/600 | Train Loss=5.1560 | Train Acc=95.73% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  55%|█████▍    | 327/600 [25:35<21:54,  4.81s/it]

  Epoch 327/600 | Train Loss=5.1479 | Train Acc=95.91% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  55%|█████▍    | 328/600 [25:40<21:40,  4.78s/it]

  Epoch 328/600 | Train Loss=5.1561 | Train Acc=95.60% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  55%|█████▍    | 329/600 [25:45<21:22,  4.73s/it]

  Epoch 329/600 | Train Loss=5.1565 | Train Acc=95.76% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  55%|█████▌    | 330/600 [25:49<21:15,  4.72s/it]

  Epoch 330/600 | Train Loss=5.1514 | Train Acc=95.87% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  55%|█████▌    | 331/600 [25:54<21:27,  4.79s/it]

  Epoch 331/600 | Train Loss=5.1550 | Train Acc=95.62% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  55%|█████▌    | 332/600 [25:59<21:41,  4.85s/it]

  Epoch 332/600 | Train Loss=5.1542 | Train Acc=95.66% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  56%|█████▌    | 333/600 [26:04<21:50,  4.91s/it]

  Epoch 333/600 | Train Loss=5.1496 | Train Acc=95.78% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  56%|█████▌    | 334/600 [26:09<21:59,  4.96s/it]

  Epoch 334/600 | Train Loss=5.1510 | Train Acc=95.84% | Test Acc=97.27%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  56%|█████▌    | 335/600 [26:14<21:26,  4.85s/it]

  Epoch 335/600 | Train Loss=5.1589 | Train Acc=95.53% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  56%|█████▌    | 336/600 [26:19<21:11,  4.82s/it]

  Epoch 336/600 | Train Loss=5.1547 | Train Acc=95.74% | Test Acc=97.33%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  56%|█████▌    | 337/600 [26:23<20:51,  4.76s/it]

  Epoch 337/600 | Train Loss=5.1528 | Train Acc=95.73% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  56%|█████▋    | 338/600 [26:28<20:39,  4.73s/it]

  Epoch 338/600 | Train Loss=5.1465 | Train Acc=95.85% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  56%|█████▋    | 339/600 [26:33<20:27,  4.70s/it]

  Epoch 339/600 | Train Loss=5.1491 | Train Acc=95.74% | Test Acc=97.36%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  57%|█████▋    | 340/600 [26:37<20:26,  4.72s/it]

  Epoch 340/600 | Train Loss=5.1505 | Train Acc=95.71% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  57%|█████▋    | 341/600 [26:42<20:39,  4.79s/it]

  Epoch 341/600 | Train Loss=5.1448 | Train Acc=95.79% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  57%|█████▋    | 342/600 [26:47<20:24,  4.74s/it]

  Epoch 342/600 | Train Loss=5.1423 | Train Acc=95.84% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  57%|█████▋    | 343/600 [26:52<20:11,  4.71s/it]

  Epoch 343/600 | Train Loss=5.1465 | Train Acc=95.80% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  57%|█████▋    | 344/600 [26:57<20:31,  4.81s/it]

  Epoch 344/600 | Train Loss=5.1436 | Train Acc=95.79% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  57%|█████▊    | 345/600 [27:02<20:43,  4.88s/it]

  Epoch 345/600 | Train Loss=5.1443 | Train Acc=95.77% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  58%|█████▊    | 346/600 [27:07<20:44,  4.90s/it]

  Epoch 346/600 | Train Loss=5.1456 | Train Acc=95.79% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  58%|█████▊    | 347/600 [27:12<20:47,  4.93s/it]

  Epoch 347/600 | Train Loss=5.1417 | Train Acc=95.82% | Test Acc=97.28%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  58%|█████▊    | 348/600 [27:16<20:21,  4.85s/it]

  Epoch 348/600 | Train Loss=5.1501 | Train Acc=95.69% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  58%|█████▊    | 349/600 [27:21<20:01,  4.79s/it]

  Epoch 349/600 | Train Loss=5.1428 | Train Acc=95.85% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  58%|█████▊    | 350/600 [27:26<19:47,  4.75s/it]

  Epoch 350/600 | Train Loss=5.1437 | Train Acc=95.81% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  58%|█████▊    | 351/600 [27:30<19:36,  4.72s/it]

  Epoch 351/600 | Train Loss=5.1423 | Train Acc=95.90% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  59%|█████▊    | 352/600 [27:35<19:25,  4.70s/it]

  Epoch 352/600 | Train Loss=5.1399 | Train Acc=95.88% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  59%|█████▉    | 353/600 [27:40<19:22,  4.71s/it]

  Epoch 353/600 | Train Loss=5.1433 | Train Acc=95.81% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  59%|█████▉    | 354/600 [27:45<19:42,  4.81s/it]

  Epoch 354/600 | Train Loss=5.1422 | Train Acc=95.79% | Test Acc=97.32%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  59%|█████▉    | 355/600 [27:49<19:26,  4.76s/it]

  Epoch 355/600 | Train Loss=5.1373 | Train Acc=95.83% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  59%|█████▉    | 356/600 [27:54<19:08,  4.71s/it]

  Epoch 356/600 | Train Loss=5.1427 | Train Acc=95.75% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  60%|█████▉    | 357/600 [27:59<19:27,  4.80s/it]

  Epoch 357/600 | Train Loss=5.1406 | Train Acc=95.82% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  60%|█████▉    | 358/600 [28:04<19:28,  4.83s/it]

  Epoch 358/600 | Train Loss=5.1416 | Train Acc=95.65% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  60%|█████▉    | 359/600 [28:09<19:33,  4.87s/it]

  Epoch 359/600 | Train Loss=5.1384 | Train Acc=95.77% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  60%|██████    | 360/600 [28:13<19:08,  4.78s/it]

  Epoch 360/600 | Train Loss=5.1400 | Train Acc=95.83% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  60%|██████    | 361/600 [28:18<19:28,  4.89s/it]

  Epoch 361/600 | Train Loss=5.1408 | Train Acc=95.80% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  60%|██████    | 362/600 [28:23<19:04,  4.81s/it]

  Epoch 362/600 | Train Loss=5.1415 | Train Acc=95.75% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  60%|██████    | 363/600 [28:28<18:49,  4.76s/it]

  Epoch 363/600 | Train Loss=5.1398 | Train Acc=95.78% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  61%|██████    | 364/600 [28:32<18:35,  4.73s/it]

  Epoch 364/600 | Train Loss=5.1402 | Train Acc=95.82% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  61%|██████    | 365/600 [28:37<18:27,  4.71s/it]

  Epoch 365/600 | Train Loss=5.1410 | Train Acc=95.72% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  61%|██████    | 366/600 [28:42<18:16,  4.68s/it]

  Epoch 366/600 | Train Loss=5.1399 | Train Acc=95.89% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  61%|██████    | 367/600 [28:46<18:10,  4.68s/it]

  Epoch 367/600 | Train Loss=5.1409 | Train Acc=95.83% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  61%|██████▏   | 368/600 [28:51<18:33,  4.80s/it]

  Epoch 368/600 | Train Loss=5.1384 | Train Acc=95.77% | Test Acc=97.30%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  62%|██████▏   | 369/600 [28:56<18:44,  4.87s/it]

  Epoch 369/600 | Train Loss=5.1301 | Train Acc=96.01% | Test Acc=97.37%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  62%|██████▏   | 370/600 [29:01<18:45,  4.89s/it]

  Epoch 370/600 | Train Loss=5.1413 | Train Acc=95.80% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  62%|██████▏   | 371/600 [29:06<18:46,  4.92s/it]

  Epoch 371/600 | Train Loss=5.1320 | Train Acc=95.89% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  62%|██████▏   | 372/600 [29:11<18:41,  4.92s/it]

  Epoch 372/600 | Train Loss=5.1346 | Train Acc=96.00% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  62%|██████▏   | 373/600 [29:16<18:16,  4.83s/it]

  Epoch 373/600 | Train Loss=5.1369 | Train Acc=95.78% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  62%|██████▏   | 374/600 [29:21<18:30,  4.91s/it]

  Epoch 374/600 | Train Loss=5.1350 | Train Acc=95.78% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  62%|██████▎   | 375/600 [29:26<18:05,  4.83s/it]

  Epoch 375/600 | Train Loss=5.1373 | Train Acc=95.78% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  63%|██████▎   | 376/600 [29:30<17:52,  4.79s/it]

  Epoch 376/600 | Train Loss=5.1390 | Train Acc=95.73% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  63%|██████▎   | 377/600 [29:35<17:35,  4.73s/it]

  Epoch 377/600 | Train Loss=5.1372 | Train Acc=95.83% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  63%|██████▎   | 378/600 [29:40<17:27,  4.72s/it]

  Epoch 378/600 | Train Loss=5.1374 | Train Acc=95.86% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  63%|██████▎   | 379/600 [29:44<17:15,  4.69s/it]

  Epoch 379/600 | Train Loss=5.1361 | Train Acc=95.77% | Test Acc=97.28%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  63%|██████▎   | 380/600 [29:49<17:14,  4.70s/it]

  Epoch 380/600 | Train Loss=5.1353 | Train Acc=95.81% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  64%|██████▎   | 381/600 [29:54<17:32,  4.80s/it]

  Epoch 381/600 | Train Loss=5.1329 | Train Acc=95.91% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  64%|██████▎   | 382/600 [29:59<17:41,  4.87s/it]

  Epoch 382/600 | Train Loss=5.1361 | Train Acc=95.91% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  64%|██████▍   | 383/600 [30:04<17:43,  4.90s/it]

  Epoch 383/600 | Train Loss=5.1408 | Train Acc=95.60% | Test Acc=97.32%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  64%|██████▍   | 384/600 [30:09<17:45,  4.93s/it]

  Epoch 384/600 | Train Loss=5.1302 | Train Acc=95.85% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  64%|██████▍   | 385/600 [30:14<17:47,  4.96s/it]

  Epoch 385/600 | Train Loss=5.1346 | Train Acc=95.79% | Test Acc=97.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  64%|██████▍   | 386/600 [30:19<17:23,  4.88s/it]

  Epoch 386/600 | Train Loss=5.1344 | Train Acc=95.74% | Test Acc=97.36%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  64%|██████▍   | 387/600 [30:23<17:01,  4.80s/it]

  Epoch 387/600 | Train Loss=5.1281 | Train Acc=95.93% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  65%|██████▍   | 388/600 [30:28<17:17,  4.89s/it]

  Epoch 388/600 | Train Loss=5.1334 | Train Acc=95.71% | Test Acc=97.67%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  65%|██████▍   | 389/600 [30:33<16:55,  4.82s/it]

  Epoch 389/600 | Train Loss=5.1400 | Train Acc=95.71% | Test Acc=97.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  65%|██████▌   | 390/600 [30:38<16:42,  4.77s/it]

  Epoch 390/600 | Train Loss=5.1313 | Train Acc=95.91% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  65%|██████▌   | 391/600 [30:42<16:30,  4.74s/it]

  Epoch 391/600 | Train Loss=5.1316 | Train Acc=95.78% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  65%|██████▌   | 392/600 [30:47<16:22,  4.72s/it]

  Epoch 392/600 | Train Loss=5.1369 | Train Acc=95.69% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  66%|██████▌   | 393/600 [30:52<16:12,  4.70s/it]

  Epoch 393/600 | Train Loss=5.1322 | Train Acc=95.84% | Test Acc=97.32%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  66%|██████▌   | 394/600 [30:56<16:07,  4.70s/it]

  Epoch 394/600 | Train Loss=5.1363 | Train Acc=95.75% | Test Acc=97.32%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  66%|██████▌   | 395/600 [31:02<16:38,  4.87s/it]

  Epoch 395/600 | Train Loss=5.1352 | Train Acc=95.76% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  66%|██████▌   | 396/600 [31:07<16:41,  4.91s/it]

  Epoch 396/600 | Train Loss=5.1266 | Train Acc=96.11% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  66%|██████▌   | 397/600 [31:12<16:37,  4.92s/it]

  Epoch 397/600 | Train Loss=5.1338 | Train Acc=95.84% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  66%|██████▋   | 398/600 [31:16<16:18,  4.85s/it]

  Epoch 398/600 | Train Loss=5.1319 | Train Acc=95.80% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  66%|██████▋   | 399/600 [31:21<16:02,  4.79s/it]

  Epoch 399/600 | Train Loss=5.1316 | Train Acc=95.89% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  67%|██████▋   | 400/600 [31:26<15:47,  4.74s/it]

  Epoch 400/600 | Train Loss=5.1266 | Train Acc=95.98% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  67%|██████▋   | 401/600 [31:31<16:04,  4.85s/it]

  Epoch 401/600 | Train Loss=5.1302 | Train Acc=95.79% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  67%|██████▋   | 402/600 [31:35<15:46,  4.78s/it]

  Epoch 402/600 | Train Loss=5.1283 | Train Acc=95.87% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  67%|██████▋   | 403/600 [31:40<15:34,  4.74s/it]

  Epoch 403/600 | Train Loss=5.1243 | Train Acc=95.95% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  67%|██████▋   | 404/600 [31:45<15:22,  4.71s/it]

  Epoch 404/600 | Train Loss=5.1309 | Train Acc=95.78% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  68%|██████▊   | 405/600 [31:49<15:17,  4.71s/it]

  Epoch 405/600 | Train Loss=5.1277 | Train Acc=95.80% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  68%|██████▊   | 406/600 [31:54<15:08,  4.68s/it]

  Epoch 406/600 | Train Loss=5.1300 | Train Acc=95.86% | Test Acc=97.17%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  68%|██████▊   | 407/600 [31:59<15:01,  4.67s/it]

  Epoch 407/600 | Train Loss=5.1316 | Train Acc=95.83% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  68%|██████▊   | 408/600 [32:04<15:29,  4.84s/it]

  Epoch 408/600 | Train Loss=5.1303 | Train Acc=95.82% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  68%|██████▊   | 409/600 [32:09<15:34,  4.89s/it]

  Epoch 409/600 | Train Loss=5.1298 | Train Acc=95.79% | Test Acc=97.36%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  68%|██████▊   | 410/600 [32:14<15:35,  4.93s/it]

  Epoch 410/600 | Train Loss=5.1346 | Train Acc=95.68% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  68%|██████▊   | 411/600 [32:18<15:17,  4.85s/it]

  Epoch 411/600 | Train Loss=5.1295 | Train Acc=95.85% | Test Acc=97.33%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  69%|██████▊   | 412/600 [32:23<14:59,  4.79s/it]

  Epoch 412/600 | Train Loss=5.1341 | Train Acc=95.76% | Test Acc=97.33%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  69%|██████▉   | 413/600 [32:28<14:46,  4.74s/it]

  Epoch 413/600 | Train Loss=5.1323 | Train Acc=95.80% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  69%|██████▉   | 414/600 [32:32<14:35,  4.71s/it]

  Epoch 414/600 | Train Loss=5.1297 | Train Acc=95.84% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  69%|██████▉   | 415/600 [32:38<14:53,  4.83s/it]

  Epoch 415/600 | Train Loss=5.1346 | Train Acc=95.75% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  69%|██████▉   | 416/600 [32:42<14:37,  4.77s/it]

  Epoch 416/600 | Train Loss=5.1274 | Train Acc=95.91% | Test Acc=97.38%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  70%|██████▉   | 417/600 [32:47<14:27,  4.74s/it]

  Epoch 417/600 | Train Loss=5.1242 | Train Acc=95.96% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  70%|██████▉   | 418/600 [32:51<14:17,  4.71s/it]

  Epoch 418/600 | Train Loss=5.1283 | Train Acc=95.83% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  70%|██████▉   | 419/600 [32:56<14:09,  4.69s/it]

  Epoch 419/600 | Train Loss=5.1317 | Train Acc=95.77% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  70%|███████   | 420/600 [33:01<14:02,  4.68s/it]

  Epoch 420/600 | Train Loss=5.1259 | Train Acc=95.93% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  70%|███████   | 421/600 [33:06<14:16,  4.78s/it]

  Epoch 421/600 | Train Loss=5.1235 | Train Acc=95.94% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  70%|███████   | 422/600 [33:11<14:42,  4.96s/it]

  Epoch 422/600 | Train Loss=5.1244 | Train Acc=95.95% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  70%|███████   | 423/600 [33:16<14:33,  4.93s/it]

  Epoch 423/600 | Train Loss=5.1272 | Train Acc=95.83% | Test Acc=97.37%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  71%|███████   | 424/600 [33:21<14:15,  4.86s/it]

  Epoch 424/600 | Train Loss=5.1223 | Train Acc=96.02% | Test Acc=97.70%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  71%|███████   | 425/600 [33:25<13:57,  4.79s/it]

  Epoch 425/600 | Train Loss=5.1264 | Train Acc=95.82% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  71%|███████   | 426/600 [33:30<13:46,  4.75s/it]

  Epoch 426/600 | Train Loss=5.1233 | Train Acc=95.93% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  71%|███████   | 427/600 [33:35<13:34,  4.71s/it]

  Epoch 427/600 | Train Loss=5.1281 | Train Acc=95.80% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  71%|███████▏  | 428/600 [33:40<13:45,  4.80s/it]

  Epoch 428/600 | Train Loss=5.1270 | Train Acc=95.84% | Test Acc=97.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  72%|███████▏  | 429/600 [33:44<13:32,  4.75s/it]

  Epoch 429/600 | Train Loss=5.1223 | Train Acc=95.92% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  72%|███████▏  | 430/600 [33:49<13:25,  4.74s/it]

  Epoch 430/600 | Train Loss=5.1303 | Train Acc=95.90% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  72%|███████▏  | 431/600 [33:54<13:16,  4.71s/it]

  Epoch 431/600 | Train Loss=5.1274 | Train Acc=95.89% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  72%|███████▏  | 432/600 [33:58<13:08,  4.69s/it]

  Epoch 432/600 | Train Loss=5.1243 | Train Acc=95.84% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  72%|███████▏  | 433/600 [34:03<13:20,  4.79s/it]

  Epoch 433/600 | Train Loss=5.1279 | Train Acc=95.78% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  72%|███████▏  | 434/600 [34:08<13:25,  4.85s/it]

  Epoch 434/600 | Train Loss=5.1253 | Train Acc=95.90% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  72%|███████▎  | 435/600 [34:14<13:42,  4.99s/it]

  Epoch 435/600 | Train Loss=5.1234 | Train Acc=95.88% | Test Acc=97.64%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  73%|███████▎  | 436/600 [34:18<13:33,  4.96s/it]

  Epoch 436/600 | Train Loss=5.1284 | Train Acc=95.74% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  73%|███████▎  | 437/600 [34:23<13:12,  4.86s/it]

  Epoch 437/600 | Train Loss=5.1256 | Train Acc=95.83% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  73%|███████▎  | 438/600 [34:28<12:59,  4.81s/it]

  Epoch 438/600 | Train Loss=5.1290 | Train Acc=95.78% | Test Acc=97.75%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  73%|███████▎  | 439/600 [34:32<12:46,  4.76s/it]

  Epoch 439/600 | Train Loss=5.1256 | Train Acc=95.77% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  73%|███████▎  | 440/600 [34:37<12:37,  4.73s/it]

  Epoch 440/600 | Train Loss=5.1234 | Train Acc=95.95% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  74%|███████▎  | 441/600 [34:42<12:28,  4.71s/it]

  Epoch 441/600 | Train Loss=5.1282 | Train Acc=95.75% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  74%|███████▎  | 442/600 [34:47<12:45,  4.85s/it]

  Epoch 442/600 | Train Loss=5.1249 | Train Acc=95.90% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  74%|███████▍  | 443/600 [34:52<12:29,  4.77s/it]

  Epoch 443/600 | Train Loss=5.1235 | Train Acc=95.97% | Test Acc=97.42%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  74%|███████▍  | 444/600 [34:56<12:19,  4.74s/it]

  Epoch 444/600 | Train Loss=5.1228 | Train Acc=95.82% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  74%|███████▍  | 445/600 [35:01<12:08,  4.70s/it]

  Epoch 445/600 | Train Loss=5.1239 | Train Acc=95.92% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  74%|███████▍  | 446/600 [35:06<12:19,  4.80s/it]

  Epoch 446/600 | Train Loss=5.1279 | Train Acc=95.86% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  74%|███████▍  | 447/600 [35:11<12:21,  4.84s/it]

  Epoch 447/600 | Train Loss=5.1238 | Train Acc=95.91% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  75%|███████▍  | 448/600 [35:16<12:26,  4.91s/it]

  Epoch 448/600 | Train Loss=5.1271 | Train Acc=95.80% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  75%|███████▍  | 449/600 [35:21<12:33,  4.99s/it]

  Epoch 449/600 | Train Loss=5.1178 | Train Acc=95.93% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  75%|███████▌  | 450/600 [35:26<12:10,  4.87s/it]

  Epoch 450/600 | Train Loss=5.1246 | Train Acc=95.89% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  75%|███████▌  | 451/600 [35:30<11:57,  4.81s/it]

  Epoch 451/600 | Train Loss=5.1222 | Train Acc=95.78% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  75%|███████▌  | 452/600 [35:35<11:42,  4.75s/it]

  Epoch 452/600 | Train Loss=5.1267 | Train Acc=95.89% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  76%|███████▌  | 453/600 [35:40<11:35,  4.73s/it]

  Epoch 453/600 | Train Loss=5.1268 | Train Acc=95.80% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  76%|███████▌  | 454/600 [35:44<11:25,  4.70s/it]

  Epoch 454/600 | Train Loss=5.1250 | Train Acc=95.93% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  76%|███████▌  | 455/600 [35:49<11:40,  4.83s/it]

  Epoch 455/600 | Train Loss=5.1244 | Train Acc=95.88% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  76%|███████▌  | 456/600 [35:54<11:26,  4.77s/it]

  Epoch 456/600 | Train Loss=5.1240 | Train Acc=95.89% | Test Acc=97.33%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  76%|███████▌  | 457/600 [35:59<11:16,  4.73s/it]

  Epoch 457/600 | Train Loss=5.1257 | Train Acc=95.78% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  76%|███████▋  | 458/600 [36:03<11:05,  4.68s/it]

  Epoch 458/600 | Train Loss=5.1247 | Train Acc=95.89% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  76%|███████▋  | 459/600 [36:08<11:11,  4.76s/it]

  Epoch 459/600 | Train Loss=5.1248 | Train Acc=95.85% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  77%|███████▋  | 460/600 [36:13<11:14,  4.82s/it]

  Epoch 460/600 | Train Loss=5.1204 | Train Acc=95.99% | Test Acc=97.37%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  77%|███████▋  | 461/600 [36:18<11:13,  4.84s/it]

  Epoch 461/600 | Train Loss=5.1236 | Train Acc=95.67% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  77%|███████▋  | 462/600 [36:23<11:19,  4.92s/it]

  Epoch 462/600 | Train Loss=5.1235 | Train Acc=95.77% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  77%|███████▋  | 463/600 [36:28<11:03,  4.84s/it]

  Epoch 463/600 | Train Loss=5.1227 | Train Acc=95.85% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  77%|███████▋  | 464/600 [36:32<10:48,  4.77s/it]

  Epoch 464/600 | Train Loss=5.1185 | Train Acc=95.94% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  78%|███████▊  | 465/600 [36:37<10:40,  4.75s/it]

  Epoch 465/600 | Train Loss=5.1211 | Train Acc=95.86% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  78%|███████▊  | 466/600 [36:42<10:32,  4.72s/it]

  Epoch 466/600 | Train Loss=5.1206 | Train Acc=95.84% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  78%|███████▊  | 467/600 [36:46<10:26,  4.71s/it]

  Epoch 467/600 | Train Loss=5.1181 | Train Acc=95.94% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  78%|███████▊  | 468/600 [36:51<10:16,  4.67s/it]

  Epoch 468/600 | Train Loss=5.1205 | Train Acc=95.94% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  78%|███████▊  | 469/600 [36:56<10:26,  4.78s/it]

  Epoch 469/600 | Train Loss=5.1231 | Train Acc=95.77% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  78%|███████▊  | 470/600 [37:01<10:17,  4.75s/it]

  Epoch 470/600 | Train Loss=5.1170 | Train Acc=95.92% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  78%|███████▊  | 471/600 [37:05<10:06,  4.70s/it]

  Epoch 471/600 | Train Loss=5.1201 | Train Acc=95.86% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  79%|███████▊  | 472/600 [37:10<10:13,  4.80s/it]

  Epoch 472/600 | Train Loss=5.1204 | Train Acc=95.99% | Test Acc=97.42%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  79%|███████▉  | 473/600 [37:15<10:16,  4.85s/it]

  Epoch 473/600 | Train Loss=5.1195 | Train Acc=95.89% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  79%|███████▉  | 474/600 [37:20<10:18,  4.91s/it]

  Epoch 474/600 | Train Loss=5.1280 | Train Acc=95.71% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  79%|███████▉  | 475/600 [37:25<10:04,  4.83s/it]

  Epoch 475/600 | Train Loss=5.1192 | Train Acc=95.81% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  79%|███████▉  | 476/600 [37:30<10:06,  4.89s/it]

  Epoch 476/600 | Train Loss=5.1201 | Train Acc=95.89% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  80%|███████▉  | 477/600 [37:35<09:52,  4.81s/it]

  Epoch 477/600 | Train Loss=5.1171 | Train Acc=95.97% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  80%|███████▉  | 478/600 [37:39<09:42,  4.78s/it]

  Epoch 478/600 | Train Loss=5.1218 | Train Acc=95.77% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  80%|███████▉  | 479/600 [37:44<09:32,  4.73s/it]

  Epoch 479/600 | Train Loss=5.1226 | Train Acc=95.83% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  80%|████████  | 480/600 [37:49<09:25,  4.71s/it]

  Epoch 480/600 | Train Loss=5.1187 | Train Acc=95.90% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  80%|████████  | 481/600 [37:53<09:18,  4.69s/it]

  Epoch 481/600 | Train Loss=5.1150 | Train Acc=95.92% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  80%|████████  | 482/600 [37:58<09:25,  4.79s/it]

  Epoch 482/600 | Train Loss=5.1209 | Train Acc=95.80% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  80%|████████  | 483/600 [38:03<09:17,  4.76s/it]

  Epoch 483/600 | Train Loss=5.1167 | Train Acc=95.92% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  81%|████████  | 484/600 [38:08<09:08,  4.73s/it]

  Epoch 484/600 | Train Loss=5.1212 | Train Acc=95.87% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  81%|████████  | 485/600 [38:13<09:12,  4.81s/it]

  Epoch 485/600 | Train Loss=5.1175 | Train Acc=95.90% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  81%|████████  | 486/600 [38:18<09:12,  4.85s/it]

  Epoch 486/600 | Train Loss=5.1140 | Train Acc=95.98% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  81%|████████  | 487/600 [38:22<09:10,  4.87s/it]

  Epoch 487/600 | Train Loss=5.1186 | Train Acc=96.01% | Test Acc=97.35%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  81%|████████▏ | 488/600 [38:27<08:57,  4.80s/it]

  Epoch 488/600 | Train Loss=5.1191 | Train Acc=95.85% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  82%|████████▏ | 489/600 [38:32<09:01,  4.88s/it]

  Epoch 489/600 | Train Loss=5.1211 | Train Acc=95.87% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  82%|████████▏ | 490/600 [38:37<08:50,  4.82s/it]

  Epoch 490/600 | Train Loss=5.1153 | Train Acc=96.03% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  82%|████████▏ | 491/600 [38:41<08:39,  4.76s/it]

  Epoch 491/600 | Train Loss=5.1158 | Train Acc=96.00% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  82%|████████▏ | 492/600 [38:46<08:30,  4.73s/it]

  Epoch 492/600 | Train Loss=5.1187 | Train Acc=95.86% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  82%|████████▏ | 493/600 [38:51<08:23,  4.70s/it]

  Epoch 493/600 | Train Loss=5.1194 | Train Acc=95.90% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  82%|████████▏ | 494/600 [38:55<08:15,  4.68s/it]

  Epoch 494/600 | Train Loss=5.1166 | Train Acc=95.89% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  82%|████████▎ | 495/600 [39:00<08:10,  4.67s/it]

  Epoch 495/600 | Train Loss=5.1218 | Train Acc=95.87% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  83%|████████▎ | 496/600 [39:05<08:17,  4.79s/it]

  Epoch 496/600 | Train Loss=5.1167 | Train Acc=95.92% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  83%|████████▎ | 497/600 [39:10<08:21,  4.87s/it]

  Epoch 497/600 | Train Loss=5.1199 | Train Acc=95.81% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  83%|████████▎ | 498/600 [39:15<08:08,  4.79s/it]

  Epoch 498/600 | Train Loss=5.1196 | Train Acc=95.92% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  83%|████████▎ | 499/600 [39:19<08:00,  4.76s/it]

  Epoch 499/600 | Train Loss=5.1154 | Train Acc=95.92% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  83%|████████▎ | 500/600 [39:24<07:52,  4.73s/it]

  Epoch 500/600 | Train Loss=5.1207 | Train Acc=95.88% | Test Acc=97.16%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  84%|████████▎ | 501/600 [39:29<07:46,  4.71s/it]

  Epoch 501/600 | Train Loss=5.1159 | Train Acc=95.83% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  84%|████████▎ | 502/600 [39:33<07:37,  4.66s/it]

  Epoch 502/600 | Train Loss=5.1190 | Train Acc=95.88% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  84%|████████▍ | 503/600 [39:38<07:45,  4.80s/it]

  Epoch 503/600 | Train Loss=5.1209 | Train Acc=95.87% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  84%|████████▍ | 504/600 [39:43<07:36,  4.75s/it]

  Epoch 504/600 | Train Loss=5.1168 | Train Acc=95.88% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  84%|████████▍ | 505/600 [39:48<07:32,  4.76s/it]

  Epoch 505/600 | Train Loss=5.1178 | Train Acc=95.97% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  84%|████████▍ | 506/600 [39:53<07:24,  4.73s/it]

  Epoch 506/600 | Train Loss=5.1156 | Train Acc=95.98% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  84%|████████▍ | 507/600 [39:57<07:17,  4.71s/it]

  Epoch 507/600 | Train Loss=5.1228 | Train Acc=95.73% | Test Acc=97.36%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  85%|████████▍ | 508/600 [40:02<07:10,  4.68s/it]

  Epoch 508/600 | Train Loss=5.1172 | Train Acc=95.91% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  85%|████████▍ | 509/600 [40:06<07:06,  4.69s/it]

  Epoch 509/600 | Train Loss=5.1178 | Train Acc=95.99% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  85%|████████▌ | 510/600 [40:11<07:08,  4.76s/it]

  Epoch 510/600 | Train Loss=5.1158 | Train Acc=96.02% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  85%|████████▌ | 511/600 [40:16<07:11,  4.85s/it]

  Epoch 511/600 | Train Loss=5.1174 | Train Acc=95.94% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  85%|████████▌ | 512/600 [40:21<07:10,  4.89s/it]

  Epoch 512/600 | Train Loss=5.1160 | Train Acc=95.95% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  86%|████████▌ | 513/600 [40:26<07:07,  4.91s/it]

  Epoch 513/600 | Train Loss=5.1188 | Train Acc=95.93% | Test Acc=97.50%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  86%|████████▌ | 514/600 [40:31<06:55,  4.83s/it]

  Epoch 514/600 | Train Loss=5.1178 | Train Acc=95.94% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  86%|████████▌ | 515/600 [40:36<06:46,  4.78s/it]

  Epoch 515/600 | Train Loss=5.1149 | Train Acc=96.02% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  86%|████████▌ | 516/600 [40:41<06:50,  4.89s/it]

  Epoch 516/600 | Train Loss=5.1142 | Train Acc=96.08% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  86%|████████▌ | 517/600 [40:45<06:38,  4.80s/it]

  Epoch 517/600 | Train Loss=5.1167 | Train Acc=95.90% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  86%|████████▋ | 518/600 [40:50<06:30,  4.76s/it]

  Epoch 518/600 | Train Loss=5.1124 | Train Acc=96.02% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  86%|████████▋ | 519/600 [40:55<06:22,  4.73s/it]

  Epoch 519/600 | Train Loss=5.1187 | Train Acc=95.86% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  87%|████████▋ | 520/600 [40:59<06:16,  4.71s/it]

  Epoch 520/600 | Train Loss=5.1178 | Train Acc=95.94% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  87%|████████▋ | 521/600 [41:04<06:09,  4.67s/it]

  Epoch 521/600 | Train Loss=5.1174 | Train Acc=95.91% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  87%|████████▋ | 522/600 [41:09<06:04,  4.67s/it]

  Epoch 522/600 | Train Loss=5.1179 | Train Acc=95.86% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  87%|████████▋ | 523/600 [41:14<06:08,  4.78s/it]

  Epoch 523/600 | Train Loss=5.1155 | Train Acc=95.95% | Test Acc=97.39%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  87%|████████▋ | 524/600 [41:19<06:09,  4.86s/it]

  Epoch 524/600 | Train Loss=5.1192 | Train Acc=95.79% | Test Acc=97.54%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  88%|████████▊ | 525/600 [41:24<06:05,  4.88s/it]

  Epoch 525/600 | Train Loss=5.1188 | Train Acc=95.85% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  88%|████████▊ | 526/600 [41:28<05:56,  4.81s/it]

  Epoch 526/600 | Train Loss=5.1143 | Train Acc=95.97% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  88%|████████▊ | 527/600 [41:33<05:47,  4.77s/it]

  Epoch 527/600 | Train Loss=5.1173 | Train Acc=96.05% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  88%|████████▊ | 528/600 [41:38<05:41,  4.74s/it]

  Epoch 528/600 | Train Loss=5.1150 | Train Acc=96.01% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  88%|████████▊ | 529/600 [41:42<05:33,  4.70s/it]

  Epoch 529/600 | Train Loss=5.1203 | Train Acc=95.87% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  88%|████████▊ | 530/600 [41:47<05:38,  4.83s/it]

  Epoch 530/600 | Train Loss=5.1135 | Train Acc=95.95% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  88%|████████▊ | 531/600 [41:52<05:29,  4.77s/it]

  Epoch 531/600 | Train Loss=5.1195 | Train Acc=95.89% | Test Acc=97.32%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  89%|████████▊ | 532/600 [41:57<05:22,  4.74s/it]

  Epoch 532/600 | Train Loss=5.1163 | Train Acc=95.87% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  89%|████████▉ | 533/600 [42:01<05:14,  4.70s/it]

  Epoch 533/600 | Train Loss=5.1140 | Train Acc=96.01% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  89%|████████▉ | 534/600 [42:06<05:08,  4.67s/it]

  Epoch 534/600 | Train Loss=5.1203 | Train Acc=95.78% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  89%|████████▉ | 535/600 [42:11<05:02,  4.66s/it]

  Epoch 535/600 | Train Loss=5.1170 | Train Acc=96.00% | Test Acc=97.30%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  89%|████████▉ | 536/600 [42:16<05:04,  4.76s/it]

  Epoch 536/600 | Train Loss=5.1178 | Train Acc=95.89% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  90%|████████▉ | 537/600 [42:21<05:13,  4.98s/it]

  Epoch 537/600 | Train Loss=5.1156 | Train Acc=95.95% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  90%|████████▉ | 538/600 [42:26<05:08,  4.98s/it]

  Epoch 538/600 | Train Loss=5.1160 | Train Acc=95.97% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  90%|████████▉ | 539/600 [42:31<04:57,  4.88s/it]

  Epoch 539/600 | Train Loss=5.1155 | Train Acc=96.03% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  90%|█████████ | 540/600 [42:35<04:47,  4.79s/it]

  Epoch 540/600 | Train Loss=5.1191 | Train Acc=95.92% | Test Acc=97.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  90%|█████████ | 541/600 [42:40<04:41,  4.77s/it]

  Epoch 541/600 | Train Loss=5.1168 | Train Acc=95.90% | Test Acc=97.49%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  90%|█████████ | 542/600 [42:45<04:34,  4.73s/it]

  Epoch 542/600 | Train Loss=5.1148 | Train Acc=96.01% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  90%|█████████ | 543/600 [42:50<04:34,  4.82s/it]

  Epoch 543/600 | Train Loss=5.1167 | Train Acc=95.95% | Test Acc=97.60%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  91%|█████████ | 544/600 [42:54<04:27,  4.77s/it]

  Epoch 544/600 | Train Loss=5.1193 | Train Acc=95.78% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  91%|█████████ | 545/600 [42:59<04:20,  4.74s/it]

  Epoch 545/600 | Train Loss=5.1143 | Train Acc=95.95% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  91%|█████████ | 546/600 [43:04<04:13,  4.70s/it]

  Epoch 546/600 | Train Loss=5.1162 | Train Acc=95.93% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  91%|█████████ | 547/600 [43:08<04:08,  4.69s/it]

  Epoch 547/600 | Train Loss=5.1191 | Train Acc=95.88% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  91%|█████████▏| 548/600 [43:13<04:02,  4.67s/it]

  Epoch 548/600 | Train Loss=5.1202 | Train Acc=95.81% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  92%|█████████▏| 549/600 [43:18<04:03,  4.78s/it]

  Epoch 549/600 | Train Loss=5.1163 | Train Acc=96.01% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  92%|█████████▏| 550/600 [43:23<04:07,  4.95s/it]

  Epoch 550/600 | Train Loss=5.1161 | Train Acc=95.99% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  92%|█████████▏| 551/600 [43:28<04:02,  4.95s/it]

  Epoch 551/600 | Train Loss=5.1124 | Train Acc=95.97% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  92%|█████████▏| 552/600 [43:33<03:52,  4.85s/it]

  Epoch 552/600 | Train Loss=5.1191 | Train Acc=95.87% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  92%|█████████▏| 553/600 [43:37<03:45,  4.80s/it]

  Epoch 553/600 | Train Loss=5.1125 | Train Acc=96.02% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  92%|█████████▏| 554/600 [43:42<03:38,  4.74s/it]

  Epoch 554/600 | Train Loss=5.1150 | Train Acc=96.03% | Test Acc=97.63%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  92%|█████████▎| 555/600 [43:47<03:32,  4.72s/it]

  Epoch 555/600 | Train Loss=5.1126 | Train Acc=96.07% | Test Acc=97.56%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  93%|█████████▎| 556/600 [43:51<03:26,  4.69s/it]

  Epoch 556/600 | Train Loss=5.1159 | Train Acc=95.86% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  93%|█████████▎| 557/600 [43:57<03:27,  4.83s/it]

  Epoch 557/600 | Train Loss=5.1170 | Train Acc=95.93% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  93%|█████████▎| 558/600 [44:01<03:19,  4.76s/it]

  Epoch 558/600 | Train Loss=5.1143 | Train Acc=95.93% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  93%|█████████▎| 559/600 [44:06<03:12,  4.70s/it]

  Epoch 559/600 | Train Loss=5.1146 | Train Acc=95.98% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  93%|█████████▎| 560/600 [44:10<03:07,  4.68s/it]

  Epoch 560/600 | Train Loss=5.1222 | Train Acc=95.78% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  94%|█████████▎| 561/600 [44:15<03:00,  4.63s/it]

  Epoch 561/600 | Train Loss=5.1162 | Train Acc=95.97% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  94%|█████████▎| 562/600 [44:20<03:00,  4.74s/it]

  Epoch 562/600 | Train Loss=5.1162 | Train Acc=95.97% | Test Acc=97.41%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  94%|█████████▍| 563/600 [44:25<03:02,  4.93s/it]

  Epoch 563/600 | Train Loss=5.1122 | Train Acc=96.09% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  94%|█████████▍| 564/600 [44:31<03:01,  5.05s/it]

  Epoch 564/600 | Train Loss=5.1167 | Train Acc=95.85% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  94%|█████████▍| 565/600 [44:35<02:51,  4.91s/it]

  Epoch 565/600 | Train Loss=5.1148 | Train Acc=95.91% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  94%|█████████▍| 566/600 [44:40<02:44,  4.85s/it]

  Epoch 566/600 | Train Loss=5.1102 | Train Acc=96.11% | Test Acc=97.53%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  94%|█████████▍| 567/600 [44:44<02:37,  4.77s/it]

  Epoch 567/600 | Train Loss=5.1143 | Train Acc=95.93% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  95%|█████████▍| 568/600 [44:49<02:31,  4.74s/it]

  Epoch 568/600 | Train Loss=5.1205 | Train Acc=95.80% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  95%|█████████▍| 569/600 [44:54<02:26,  4.71s/it]

  Epoch 569/600 | Train Loss=5.1165 | Train Acc=95.80% | Test Acc=97.66%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  95%|█████████▌| 570/600 [44:59<02:23,  4.79s/it]

  Epoch 570/600 | Train Loss=5.1197 | Train Acc=95.86% | Test Acc=97.47%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  95%|█████████▌| 571/600 [45:03<02:18,  4.78s/it]

  Epoch 571/600 | Train Loss=5.1102 | Train Acc=96.08% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  95%|█████████▌| 572/600 [45:08<02:13,  4.76s/it]

  Epoch 572/600 | Train Loss=5.1213 | Train Acc=95.83% | Test Acc=97.46%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  96%|█████████▌| 573/600 [45:13<02:07,  4.71s/it]

  Epoch 573/600 | Train Loss=5.1133 | Train Acc=95.95% | Test Acc=97.45%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  96%|█████████▌| 574/600 [45:18<02:05,  4.81s/it]

  Epoch 574/600 | Train Loss=5.1164 | Train Acc=95.90% | Test Acc=97.65%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  96%|█████████▌| 575/600 [45:23<02:02,  4.88s/it]

  Epoch 575/600 | Train Loss=5.1172 | Train Acc=95.87% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  96%|█████████▌| 576/600 [45:28<01:57,  4.90s/it]

  Epoch 576/600 | Train Loss=5.1132 | Train Acc=95.91% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  96%|█████████▌| 577/600 [45:33<01:55,  5.04s/it]

  Epoch 577/600 | Train Loss=5.1123 | Train Acc=95.99% | Test Acc=97.27%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  96%|█████████▋| 578/600 [45:38<01:48,  4.95s/it]

  Epoch 578/600 | Train Loss=5.1094 | Train Acc=96.08% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  96%|█████████▋| 579/600 [45:43<01:41,  4.85s/it]

  Epoch 579/600 | Train Loss=5.1115 | Train Acc=96.04% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  97%|█████████▋| 580/600 [45:47<01:35,  4.79s/it]

  Epoch 580/600 | Train Loss=5.1143 | Train Acc=95.87% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  97%|█████████▋| 581/600 [45:52<01:30,  4.74s/it]

  Epoch 581/600 | Train Loss=5.1208 | Train Acc=95.67% | Test Acc=97.43%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  97%|█████████▋| 582/600 [45:57<01:25,  4.73s/it]

  Epoch 582/600 | Train Loss=5.1112 | Train Acc=96.12% | Test Acc=97.34%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  97%|█████████▋| 583/600 [46:01<01:19,  4.69s/it]

  Epoch 583/600 | Train Loss=5.1170 | Train Acc=95.88% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  97%|█████████▋| 584/600 [46:06<01:17,  4.82s/it]

  Epoch 584/600 | Train Loss=5.1134 | Train Acc=95.97% | Test Acc=97.58%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  98%|█████████▊| 585/600 [46:11<01:11,  4.75s/it]

  Epoch 585/600 | Train Loss=5.1171 | Train Acc=95.83% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  98%|█████████▊| 586/600 [46:15<01:05,  4.70s/it]

  Epoch 586/600 | Train Loss=5.1118 | Train Acc=96.05% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  98%|█████████▊| 587/600 [46:20<01:02,  4.81s/it]

  Epoch 587/600 | Train Loss=5.1148 | Train Acc=95.90% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  98%|█████████▊| 588/600 [46:25<00:58,  4.84s/it]

  Epoch 588/600 | Train Loss=5.1112 | Train Acc=96.06% | Test Acc=97.48%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  98%|█████████▊| 589/600 [46:30<00:53,  4.86s/it]

  Epoch 589/600 | Train Loss=5.1151 | Train Acc=95.98% | Test Acc=97.40%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  98%|█████████▊| 590/600 [46:35<00:48,  4.86s/it]

  Epoch 590/600 | Train Loss=5.1106 | Train Acc=96.03% | Test Acc=97.61%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  98%|█████████▊| 591/600 [46:40<00:43,  4.85s/it]

  Epoch 591/600 | Train Loss=5.1151 | Train Acc=95.87% | Test Acc=97.55%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  99%|█████████▊| 592/600 [46:45<00:38,  4.80s/it]

  Epoch 592/600 | Train Loss=5.1128 | Train Acc=95.88% | Test Acc=97.62%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  99%|█████████▉| 593/600 [46:49<00:33,  4.75s/it]

  Epoch 593/600 | Train Loss=5.1125 | Train Acc=95.94% | Test Acc=97.44%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  99%|█████████▉| 594/600 [46:54<00:28,  4.71s/it]

  Epoch 594/600 | Train Loss=5.1118 | Train Acc=95.89% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  99%|█████████▉| 595/600 [46:59<00:23,  4.68s/it]

  Epoch 595/600 | Train Loss=5.1176 | Train Acc=95.87% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03:  99%|█████████▉| 596/600 [47:03<00:18,  4.66s/it]

  Epoch 596/600 | Train Loss=5.1139 | Train Acc=95.99% | Test Acc=97.52%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03: 100%|█████████▉| 597/600 [47:08<00:14,  4.80s/it]

  Epoch 597/600 | Train Loss=5.1172 | Train Acc=95.87% | Test Acc=97.51%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03: 100%|█████████▉| 598/600 [47:13<00:09,  4.76s/it]

  Epoch 598/600 | Train Loss=5.1125 | Train Acc=96.06% | Test Acc=97.59%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03: 100%|█████████▉| 599/600 [47:18<00:04,  4.74s/it]

  Epoch 599/600 | Train Loss=5.1135 | Train Acc=95.99% | Test Acc=97.57%


BBB π=0.75 σ₁=1.4e-01 σ₂=2.5e-03: 100%|██████████| 600/600 [47:23<00:00,  4.74s/it]

  Epoch 600/600 | Train Loss=5.1152 | Train Acc=95.81% | Test Acc=97.60%
✅ BBB π=0.75, σ₁=1.4e-01, σ₂=2.5e-03 → Best Test Acc = 97.78%





### 📈 Final Evaluation Summary

This section prints the final test results for all models:

#### ✅ Bayes by Backprop
- Displays best test accuracy for each combination of prior parameters:
  - π (mixture weight)
  - σ₁ and σ₂ (standard deviations of the Gaussian mixture prior)

#### ✅ TensorFlow Baseline Models
- Shows test accuracy for:
  - **Standard SGD** (no regularization)
  - **SGD + Dropout** (Dropout probability = 0.5)

The results allow direct comparison between Bayesian and non-Bayesian training strategies on the MNIST dataset.

In [13]:
# Print results for Bayes by Backprop
print("\n📊 Bayes by Backprop Results (π, σ₁, σ₂, Best Test Accuracy):")
for pi, s1, s2, acc in results_bbb:
    print(f"  π={pi:.2f}, σ₁={s1:.4e}, σ₂={s2:.4e}  →  Best Test Acc = {acc:.2f}%")

# Convert TensorFlow baseline accuracies from [0, 1] to percentages
acc_baseline_sgd = acc_sgd * 100
acc_baseline_do  = acc_do  * 100

# Print results for baseline TensorFlow models
print("\n📊 Baseline TF Results:")
print(f"  SGD without regularization       → Best Test Acc = {acc_baseline_sgd:.2f}%")
print(f"  SGD + Dropout (p=0.5)             → Best Test Acc = {acc_baseline_do:.2f}%")


📊 Bayes by Backprop Results (π, σ₁, σ₂, Best Test Accuracy):
  π=0.75, σ₁=1.3534e-01, σ₂=2.4788e-03  →  Best Test Acc = 97.78%

📊 Baseline TF Results:
  SGD without regularization       → Best Test Acc = 97.34%
  SGD + Dropout (p=0.5)             → Best Test Acc = 97.44%
