<a href="https://colab.research.google.com/github/JunehanLee/Bayesian-Model-Misspeicification/blob/main/Experiment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#  Experimental Design: Structural Misspecification in Choice Models



## Experiment Overview

This notebook evaluates the robustness and performance of discrete choice models under different types of model misspecification. We explore two key scenarios:

---

## **Scenario 1: Structural Misspecification**

We begin with a simple baseline binary choice model (linear utility, shared coefficients) and assess the impact of adding one structural component at a time. This shows how each component affects predictive performance, calibration, and posterior uncertainty.

### Components added individually:

- **Group-level effect**  
  Models unobserved heterogeneity by allowing group-specific intercepts or coefficients.

- **Interaction term**  
  Captures multiplicative relationships between features (e.g., $x_0 \cdot x_1$).

- **Higher-order nonlinear term**  
  Introduces nonlinearity (e.g., $x_0^2$) to model more complex decision surfaces.

We compare models based on predictive accuracy, uncertainty quantification, and calibration to evaluate how structural misspecification influences inference.
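As a rough illustration of the three components, the sketch below builds each one on toy data. The arrays `x0`, `x1`, `group`, and the coefficient values are hypothetical and are not taken from the experiment.

```python
import numpy as np

# Hypothetical toy features: x0 and x1 stand in for any two model inputs.
x0 = np.array([0.0, 1.0, 2.0, 3.0])
x1 = np.array([1.0, 0.0, 1.0, 1.0])
group = np.array([0, 0, 1, 1])  # group index per observation

# Baseline design matrix: linear terms only.
X_base = np.column_stack([x0, x1])

# Interaction term: product of the two features.
X_inter = np.column_stack([X_base, x0 * x1])

# Higher-order term: square of the first feature.
X_sq = np.column_stack([X_base, x0 ** 2])

# Group-level effect: one coefficient vector per group, selected by `group`.
beta_by_group = np.array([[0.5, -1.0],   # group 0
                          [1.5, -1.0]])  # group 1
eta_group = np.sum(X_base * beta_by_group[group], axis=1)
```

Each variant only changes the design matrix or the way coefficients are indexed, so one component can be switched on at a time without touching the rest.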

---

##  **Scenario 2: Misspecified Error Distributions**

In this scenario, we fix the model structure and vary the error distribution in the data-generating process to simulate noise misspecification. The model is always fit with the same link function (probit by default in the code below, with logit available as an option), but the true data-generating errors differ.

###  Error distributions considered:

- **Normal (probit-like behavior)**  
  Well-behaved and symmetric.

- **Student's t (df=3)**  
  Heavy-tailed distribution, capturing extreme values.

- **Cauchy**  
  Extremely heavy-tailed with undefined moments — models extreme outliers.

- **Normal-contaminated**  
  Mixture of 90% standard normal and 10% noise with large variance (e.g., $\mathcal{N}(0, 25)$).

By fitting the same model to data with these different noise structures, we assess robustness of posterior inference, calibration, and classification accuracy under error term misspecification.
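To get a feel for how different these noise structures are, the sketch below compares their tail weight by simulation. It assumes a true mixture for the contaminated case (90% $\mathcal{N}(0, 1)$, 10% $\mathcal{N}(0, 25)$); the helper `contaminated_normal`, the sample size, and the cutoff of 4 are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def contaminated_normal(rng, n, frac=0.1, outlier_sd=5.0):
    """Mixture: with probability `frac` draw N(0, outlier_sd^2), else N(0, 1)."""
    is_outlier = rng.random(n) < frac
    return np.where(is_outlier, rng.normal(0, outlier_sd, n), rng.normal(0, 1, n))

samples = {
    "normal": rng.normal(0, 1, n),
    "t3": rng.standard_t(df=3, size=n),
    "cauchy": rng.standard_cauchy(n),
    "contaminated": contaminated_normal(rng, n),
}

# Fraction of draws with |z| > 4: a crude measure of tail weight.
tail_mass = {name: float(np.mean(np.abs(z) > 4)) for name, z in samples.items()}
```

Under these assumptions the normal has almost no mass beyond 4, while the Cauchy keeps a large share there, which is what makes it the hardest case for a model that assumes thin tails.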

---

Each scenario is designed to isolate specific sources of model misspecification and test the effectiveness of Bayesian and Generalized Bayesian approaches under controlled deviations from the ideal model assumptions.





##  Experimental Setup



###  True Data Generating Process (DGP)

The data is generated using a utility model that includes **all three structural components**:

$$
u_i = X_i \beta_{g[i]} + \gamma_1 (x_{i0} \cdot x_{i1}) + \gamma_2 x_{i0}^2 + z_i, \quad z_i \sim \mathcal{N}(0, 1), \qquad y_i = \mathbb{1}(u_i > 0)
$$

- $X_i$: observed features
- $\beta_{g[i]}$: group-specific coefficients
- $z_i$: latent noise
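A minimal vectorized sketch of this utility model follows. The coefficient values, the number of groups, and the standard-normal features are hypothetical; the notebook's own simulation further down uses marketing-style features instead.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_groups = 1_000, 3

# Hypothetical coefficients chosen only for illustration.
beta = np.array([[0.2, 0.5],
                 [0.4, 0.5],
                 [0.6, 0.5]])     # one row of coefficients per group
gamma_1, gamma_2 = 1.0, 0.3

g = rng.integers(0, n_groups, size=n)      # group index g[i]
X = rng.normal(size=(n, 2))                # features x_{i0}, x_{i1}
z = rng.normal(size=n)                     # latent noise z_i ~ N(0, 1)

u = (
    np.sum(X * beta[g], axis=1)            # X_i beta_{g[i]}
    + gamma_1 * X[:, 0] * X[:, 1]          # interaction term
    + gamma_2 * X[:, 0] ** 2               # squared term
    + z
)
y = (u > 0).astype(int)                    # y_i = 1(u_i > 0)
```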

---



###  Model Variants

Each model begins from the same **baseline model (M0)** and adds only one component at a time; the full model (Mf) adds all three:

| Model Name | Included Components |
|------------|---------------------|
| **M0** (Baseline) | Linear terms only, no group structure |
| **M1** | M0 + Group-level effect |
| **M2** | M0 + Interaction term $(x_0 \cdot x_1)$ |
| **M3** | M0 + Nonlinear term $(x_0^2)$ |
| **Mf** (Full) | M0 + all three components |

This structure allows us to isolate the effect of each component.
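One way to keep the variants straight is a small lookup of flags. The keys `group`, `interaction`, and `nonlinear` mirror the keyword arguments of `define_model` used later in this notebook; the dictionary name and the helper `n_components` are illustrative only.

```python
# Flags mirror the keyword arguments of `define_model` later in this notebook.
MODEL_VARIANTS = {
    "M0": dict(group=False, interaction=False, nonlinear=False),
    "M1": dict(group=True,  interaction=False, nonlinear=False),
    "M2": dict(group=False, interaction=True,  nonlinear=False),
    "M3": dict(group=False, interaction=False, nonlinear=True),
    "Mf": dict(group=True,  interaction=True,  nonlinear=True),
}

def n_components(flags):
    """Number of structural components switched on for a variant."""
    return sum(flags.values())
```

A table like this could be unpacked into the experiment loop with `**MODEL_VARIANTS["M2"]`, which avoids repeating the three flags in every call.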

---


##  Methodology



###  Dataset Description

The dataset simulates a realistic marketing scenario where customers are exposed to targeted campaigns. Each row corresponds to an individual customer and includes both behavioral features and treatment indicators.

| Variable               | Description                                                                 |
|------------------------|-----------------------------------------------------------------------------|
| `group_id` / `group_label` | Customer tier (Bronze, Silver, Gold, Platinum, VIP)                      |
| `logins_last_week`     | Number of logins in the past 7 days (indicates user activity)              |
| `previous_purchases`   | Number of purchases in the past 30 days (baseline interest)                |
| `viewed_target_category` | Whether the customer viewed items in the target category (binary: 0/1)  |
| `discount_received`    | Whether the customer received a discount (binary treatment: 0/1)           |
| `y`                    | Binary purchase response (1 = purchase, 0 = no purchase)                   |


###  Model Implementation

- Fit all models in **PyMC** using the **NUTS** sampler.
- M0: shared β coefficients.
- M1: group-level β.
- M2: add the interaction term to the features.
- M3: add the squared term to the features.
- Mf: group-level β plus both the interaction and squared terms.



###  Inference Procedure

- Sample posterior using MCMC.
- Compute posterior predictive probabilities for test set.

---



##  Evaluation Metrics

| Metric | Description |
|--------|-------------|
| **Accuracy** | Test set classification performance |
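| **Brier Score** | Mean squared difference between predicted probability and observed outcome (lower is better) |
| **Log Loss** | Negative mean log-likelihood of the observed outcomes under the predicted probabilities (lower is better) |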
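The three metrics can be computed by hand on a toy example. The labels and predicted probabilities below are hypothetical; the notebook itself uses the scikit-learn implementations.

```python
import numpy as np

y_true = np.array([1, 0, 1, 0])
p_hat = np.array([0.9, 0.2, 0.4, 0.6])   # hypothetical predicted P(y = 1)

# Accuracy: share of correct labels after thresholding at 0.5.
accuracy = np.mean((p_hat >= 0.5).astype(int) == y_true)

# Brier score: mean squared error of the predicted probabilities.
brier = np.mean((p_hat - y_true) ** 2)

# Log loss: negative mean Bernoulli log-likelihood.
log_loss = -np.mean(y_true * np.log(p_hat) + (1 - y_true) * np.log(1 - p_hat))
```

Accuracy only sees which side of 0.5 a prediction falls on, while the Brier score and log loss also penalize how far the probability is from the outcome, so they are the ones that respond to calibration.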

---



##  Interpretation Goals

- Which component provides the largest improvement over the baseline?
- Do some components lead to overconfidence or poor calibration?
- How does Quasi-Bayes inference compare in robustness under each misspecification?

---



In [None]:
pip install pyro-ppl

Collecting pyro-ppl
  Downloading pyro_ppl-1.9.1-py3-none-any.whl.metadata (7.8 kB)
Collecting pyro-api>=0.1.1 (from pyro-ppl)
  Downloading pyro_api-0.1.2-py3-none-any.whl.metadata (2.5 kB)
Downloading pyro_ppl-1.9.1-py3-none-any.whl (755 kB)
Downloading pyro_api-0.1.2-py3-none-any.whl (11 kB)
Installing collected packages: pyro-api, pyro-ppl
Successfully installed pyro-api-0.1.2 pyro-ppl-1.9.1


In [None]:
# Import all packages

import numpy as np
import pandas as pd
import pymc as pm
import arviz as az
import matplotlib.pyplot as plt
import seaborn as sns
import pytensor.tensor as pt
from scipy.special import expit
from scipy.stats import norm
from sklearn.metrics import accuracy_score, roc_auc_score, brier_score_loss, log_loss
from sklearn.preprocessing import StandardScaler

import pyro
import pyro.infer.mcmc as pmcmc
from pyro.infer.mcmc import NUTS

# Data simulation


In [None]:
# Data simulation
# Set random seed for reproducibility
np.random.seed(50)

# Configuration
group_labels = ["Bronze", "Silver", "Gold", "Platinum", "VIP"]
n_groups = len(group_labels)
n_per_group = 10000  # observations per group

# Group-level multiplier: higher-tier customers are more responsive to discounts
group_effects = np.linspace(0.5, 2.0, n_groups)

# Empty list to store simulated rows
rows = []

for j in range(n_groups):
    group_name = group_labels[j]
    group_boost = group_effects[j]

    for _ in range(n_per_group):
        # Features
        logins_last_week = np.random.poisson(lam=5)                   # user activeness
        previous_purchases = np.random.poisson(lam=2)                 # prior purchase behavior
        viewed_target_category = np.random.binomial(1, p=0.5)         # relevance exposure
        discount_received = np.random.binomial(1, p=0.5)              # marketing treatment

        # True latent utility: combine all meaningful effects
        U = (
            -4.5
            + 0.1 * previous_purchases
            + 0.6 * viewed_target_category
            + 0.9 * discount_received
            + 3 * viewed_target_category * discount_received
            + 1.1 * group_boost * discount_received
            + 0.13 * previous_purchases**2
            + np.random.normal(0, 1)  # standard normal latent noise
            )

        y = 1 if U > 0 else 0

        # Store observation
        rows.append({
            "group_id": j,
            "group_label": group_name,
            "logins_last_week": logins_last_week,
            "previous_purchases": previous_purchases,
            "viewed_target_category": viewed_target_category,
            "discount_received": discount_received,
            "y": y
        })

# Create DataFrame
df_simulated_test = pd.DataFrame(rows)

df_simulated_test.head()

print(df_simulated_test['y'].value_counts())

y
0    35046
1    14954
Name: count, dtype: int64


In [None]:
# Noise generation function
def generate_noise(noise_type, size):
    if noise_type == "normal":
        return np.random.normal(0, 1, size=size)
    elif noise_type == "t":
        return np.random.standard_t(df=3, size=size)  # df=3, as stated in the Scenario 2 description
    elif noise_type == "cauchy":
        return np.random.standard_cauchy(size=size)
    elif noise_type == "contaminated":
        base = np.random.normal(0, 1, size=size)
        outlier_idx = np.random.choice(size, size=int(0.1 * size), replace=False)
        base[outlier_idx] += np.random.normal(0, 5, size=len(outlier_idx))
        return base
    else:
        raise ValueError("Unsupported noise type")

# Dataset generation function
def simulate_dataset(noise_type, n_per_group):
    rows = []
    noise_vector = generate_noise(noise_type, size=n_groups * n_per_group)
    noise_counter = 0  # index of the next noise draw to use

    for j in range(n_groups):
        group_name = group_labels[j]
        group_boost = group_effects[j]

        for _ in range(n_per_group):
            # Features
            logins_last_week = np.random.poisson(lam=5)
            previous_purchases = np.random.poisson(lam=2)
            viewed_target_category = np.random.binomial(1, p=0.5)
            discount_received = np.random.binomial(1, p=0.5)

            # True latent variable (U)
            U = (
                -4.5
                + 0.1 * previous_purchases
                + 0.6 * viewed_target_category
                + 0.9 * discount_received
                + 3 * viewed_target_category * discount_received
                + 1.1 * group_boost * discount_received
                + 0.13 * previous_purchases**2
                + noise_vector[noise_counter]  # noise drawn from the requested distribution
                )
            noise_counter += 1

            # Binary outcome: threshold the latent utility at zero

            y = 1 if U > 0 else 0

            rows.append({
                "group_id": j,
                "group_label": group_name,
                "logins_last_week": logins_last_week,
                "previous_purchases": previous_purchases,
                "viewed_target_category": viewed_target_category,
                "discount_received": discount_received,
                "y": y
            })
    df_simulated = pd.DataFrame(rows)
    scaler = StandardScaler()
    df_simulated[["logins_last_week", "previous_purchases"]] = scaler.fit_transform(
    df_simulated[["logins_last_week", "previous_purchases"]])
    return df_simulated


feature_cols = [
    "logins_last_week",
    "previous_purchases",
    "viewed_target_category",
    "discount_received"
]

##  Why Normalize Features in Bayesian Hierarchical Models?

Feature normalization helps ensure **stable inference and fair comparisons** in Bayesian and Quasi-Bayesian models.

---

###  Benefits of Normalization

- **Faster, more stable sampling**  
  NUTS/HMC samplers perform better when all features are on a similar scale.
  
- **Better prior behavior**  
  Priors like `β ~ Normal(0, 1)` assume standardized inputs.
  
- **More interpretable posteriors**  
  Coefficients are easier to compare when features are normalized.
  
- **Consistent across models**  
  Helps maintain numerical stability in both simple and complex structures.

---

###  When to Normalize

| Feature Type             | Normalize? |
|--------------------------|------------|
| Continuous (e.g. counts) |  Yes      |
| Binary or categorical    |  No       |

---

###  How to Normalize

Use **z-score standardization**:

$$
z = \frac{x - \mu}{\sigma}
$$

Apply to:
- `logins_last_week`
- `previous_purchases`

Leave binary features like `viewed_target_category`, `discount_received` unchanged.
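A minimal sketch of z-score standardization on a count feature follows. The values are hypothetical. It reuses the training mean and standard deviation on held-out data, which is a common convention and an assumption here; the `simulate_dataset` function in this notebook instead standardizes each simulated dataset with its own scaler.

```python
import numpy as np

train_counts = np.array([2.0, 4.0, 6.0, 8.0])   # hypothetical training values
test_counts = np.array([5.0, 11.0])              # hypothetical held-out values

mu = train_counts.mean()
sigma = train_counts.std()   # population SD (ddof=0), the StandardScaler convention

z_train = (train_counts - mu) / sigma
z_test = (test_counts - mu) / sigma   # reuse training statistics on new data
```

Reusing the training statistics keeps a fitted coefficient attached to the same scale at prediction time.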

---

###  Note

> While normalization may not change **relative model rankings**,  
> it improves **convergence, interpretability, and sampling efficiency** —  
> especially in hierarchical and Quasi-Bayesian models.

In [None]:
scaler = StandardScaler()
df_simulated_test[["logins_last_week", "previous_purchases"]] = scaler.fit_transform(
    df_simulated_test[["logins_last_week", "previous_purchases"]]
)

feature_cols = [
    "logins_last_week",
    "previous_purchases",
    "viewed_target_category",
    "discount_received"
]

# Scenario 1 : Utility misspecification

In [None]:
def phi(x):
    return 0.5 * (1 + pt.erf(x / pt.sqrt(2)))


def define_model(df, feature_cols, group_idx=None,
                 group=False, interaction=False, nonlinear=False):
    """
    Build the model structure (features, beta, eta) without likelihood.
    Likelihood should be added later (Bayesian or Quasi-Bayesian).
    """
    X = df[feature_cols].copy()

    # Add interaction term if enabled
    if interaction:
        X["interaction"] = df["viewed_target_category"] * df["discount_received"]

    # Add nonlinear term if enabled
    if nonlinear:
        X["purchases_squared"] = df["previous_purchases"] ** 2  # assume preprocessed if needed

    X_np = X.values.astype("float64")
    y = df["y"].values
    N, K = X.shape

    with pm.Model() as model:
        X_shared = pm.Data("X", X_np)
        if group:
            mu = pm.Normal("mu", mu=0, sigma=1, shape=K)
            sigma = pm.Exponential("sigma", lam=1.0, shape=K)
            beta = pm.Normal("beta", mu=mu, sigma=sigma, shape=(df["group_id"].nunique(), K))
            eta = pt.sum(X_shared * beta[group_idx], axis=1)
        else:
            beta = pm.Normal("beta", mu=0, sigma=1, shape=K)
            eta = pt.dot(X_shared, beta)

    return model, eta, y

In [None]:
def run_bayesian_model(model, eta, y, link="probit", **sample_kwargs):
    with model:
      eps = 1e-6  # numerical stability margin

      if link == "probit":
          p_raw = phi(eta)
      elif link == "logit":
          p_raw = pm.math.sigmoid(eta)
      else:
          raise ValueError(f"Unknown link: {link}")

      # Clip probabilities away from 0 and 1 for numerical stability
      p = pm.Deterministic("p", pm.math.clip(p_raw, eps, 1 - eps))
      pm.Bernoulli("y_obs", p=p, observed=y)
      trace = pm.sample(**sample_kwargs)
      # trace = sample_numpyro_nuts(**sample_kwargs)
    return trace



In [None]:
def loss_fn(y_true, y_pred, kind, alpha=None):
    p = pt.clip(y_pred, 1e-6, 1 - 1e-6)
    if kind == "bce":
        return -pt.sum(y_true * pt.log(p) + (1 - y_true) *pt.log(1 - p))
    elif kind == "squared":
        return pt.sum(pt.sqr(y_true - p))
    elif kind == "huber":
        delta = 1.0
        residual = y_true - p
        return pt.sum(pt.switch(pt.abs(residual) <= delta,
                                0.5 * residual**2,
                                delta * (pt.abs(residual) - 0.5 * delta)))
    elif kind == "sph":
        # Scaled pseudo-Huber:
        # ℓ_SPH,α(t) = α * sqrt(1 + α^2) * ( sqrt(1 + (t/α)^2 ) - 1 )
        t = y_true - p
        scale = alpha * pt.sqrt(1.0 + alpha**2)
        loss_i = scale * (pt.sqrt(1.0 + (t / alpha)**2) - 1.0)
        return pt.sum(loss_i)
    else:
        raise ValueError("Unknown loss kind")
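The loss formulas above can be checked numerically outside PyTensor. The NumPy sketch below is an illustration rather than part of the experiment; the function names and the choice of α values are assumptions. It shows that for residuals between a 0/1 label and a probability, which always lie in $[-1, 1]$, the Huber loss with `delta = 1.0` never leaves its quadratic branch.

```python
import numpy as np

def sph_loss(t, alpha):
    """Scaled pseudo-Huber loss, same formula as the "sph" branch above."""
    return alpha * np.sqrt(1.0 + alpha**2) * (np.sqrt(1.0 + (t / alpha) ** 2) - 1.0)

def huber_loss(t, delta=1.0):
    """Huber loss with threshold delta."""
    a = np.abs(t)
    return np.where(a <= delta, 0.5 * t**2, delta * (a - 0.5 * delta))

t = np.linspace(-1.0, 1.0, 5)       # residuals y - p always lie in [-1, 1]
huber_vals = huber_loss(t)
half_squared = 0.5 * t**2
sph_small_alpha = sph_loss(t, alpha=0.1)    # close to linear in |t|
sph_large_alpha = sph_loss(t, alpha=10.0)   # close to 0.5 * t**2
```

On this residual range the `"huber"` option therefore behaves like half the `"squared"` loss, whereas the scaled pseudo-Huber loss moves between roughly linear and roughly quadratic behavior as α grows.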

In [None]:
def run_quasi_model(model, eta, y, loss_kind="bce", **sample_kwargs):

    with model:
        p = pm.Deterministic("p", phi(eta))

        step_kwargs = {}
        alpha_sq_rv = None
        alpha_det = None

        if loss_kind == "sph":
            # alpha^2 ~ Gamma(a, b): weakly informative starting point (adjust a, b if desired)
            alpha_sq_rv = pm.Gamma("alpha_sq", alpha=1.0, beta=1.0)
            alpha_det = pm.Deterministic("alpha", pt.sqrt(alpha_sq_rv))
            loss = loss_fn(y, p, "sph", alpha=alpha_det)
            # Sample α with a slice sampler for stability
            step_kwargs["step"] = [pm.Slice(vars=[alpha_sq_rv])]
        else:
            loss = loss_fn(y, p, loss_kind)

        pm.Potential(f"{loss_kind}_loss", -loss)

        trace = pm.sample(**{**step_kwargs, **sample_kwargs})

    return trace
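Because `pm.Potential` adds its argument to the model's log density, passing `-loss` means the sampler targets a loss-based (Gibbs-type) posterior. Written out, with $L$ denoting the summed loss returned by `loss_fn` and a learning rate fixed at 1 (the notation is introduced here for illustration):

$$
\pi_{\text{quasi}}(\beta \mid y) \;\propto\; \pi(\beta)\, \exp\{-L(\beta)\}, \qquad L(\beta) = \sum_{i} \ell\big(y_i, p_i(\beta)\big), \qquad p_i(\beta) = \Phi\big(\eta_i(\beta)\big)
$$

For the `"bce"` loss the exponential term is exactly the Bernoulli likelihood,

$$
\exp\{-L_{\text{bce}}(\beta)\} = \prod_{i} p_i(\beta)^{y_i} \big(1 - p_i(\beta)\big)^{1 - y_i},
$$

so, apart from the probability clipping, the quasi model with `loss_kind="bce"` should reproduce the standard probit posterior. The other loss kinds replace the likelihood with a different penalty on $y_i - p_i$.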

In [None]:
def run_experiment_loop(
    seeds,
    feature_cols,
    noise_type_for_train="normal",
    npergroup=200,
    group=False, interaction=False, nonlinear=False,
    use_quasi=False, loss_kind="bce",
    draws=1000, tune=1000, target_accept=0.9):
    results = []


    for seed in seeds:
        np.random.seed(seed)

        # 1) Generate training and test data
        df_train = simulate_dataset(noise_type_for_train,npergroup)
        group_idx_train = df_train["group_id"].values if group else None

        df_test = simulate_dataset(noise_type_for_train,10000)
        X_test = df_test[feature_cols].copy()
        y_test = df_test["y"].values
        group_idx_test = df_test["group_id"].values if group else None
        if interaction:
          X_test["interaction"] = X_test["viewed_target_category"] * X_test["discount_received"]
        if nonlinear:
          X_test["purchases_squared"] = X_test["previous_purchases"] ** 2
        X_test = X_test.values.astype("float64")


        # 2) Define the model
        model, eta, y = define_model(
            df_train, feature_cols,
            group_idx=group_idx_train,
            group=group, interaction=interaction, nonlinear=nonlinear
        )

        # 3) Run the model (Bayes vs. Quasi-Bayes)
        if use_quasi:
            trace = run_quasi_model(
                model, eta, y,
                loss_kind=loss_kind,
                draws=draws, tune=tune, target_accept=target_accept,
                return_inferencedata=True,     idata_kwargs={"log_likelihood": True}
            )
        else:
            trace = run_bayesian_model(
                model, eta, y,
                draws=draws, tune=tune, target_accept=target_accept,
                return_inferencedata=True,     idata_kwargs={"log_likelihood": True}
            )


        beta_da = trace.posterior["beta"].stack(sample=("chain", "draw"))

        if beta_da.ndim == 2:
          # --- Non-hierarchical: beta transposed to shape (S, K) ---
          beta = beta_da.transpose("sample", ...).values  # (S, K)

          # Linear predictor η: (N_test, S)
          eta = X_test @ beta.T

        else:
          # --- Hierarchical: beta transposed to shape (S, G, K) ---
          beta = beta_da.transpose("sample", ...).values  # (S, G, K)
          S, G, K = beta.shape

          # Match each observation to its group's coefficients → (S, N_test, K)
          beta_g = beta[:, group_idx_test, :]  # group_idx_test shape: (N_test,)

          # Linear predictor η: (N_test, S)
          eta = np.einsum("snk,nk->ns", beta_g, X_test)

        # 4) Probit transform (standard normal CDF)
        p_samples = norm.cdf(eta)  # (N_test, S)

        # 5) Posterior-mean probability, averaged over all β draws
        p_mean = p_samples.mean(axis=1)  # (N_test,)

        # 6) Binary prediction (threshold = 0.5)
        y_pred = (p_mean >= 0.5).astype(int)

        # 7) Metrics
        acc = accuracy_score(y_test, y_pred)
        logloss_val = log_loss(y_test, p_mean, labels=[0,1])
        brier = brier_score_loss(y_test, p_mean)

        results.append({
            "seed": seed,
            "acc": acc,
            "logloss": logloss_val,
            "brier": brier,
            "model_type": "quasi" if use_quasi else "bayes",
            "loss_kind": loss_kind if use_quasi else None
        })

    return pd.DataFrame(results)
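The prediction step in the loop above can be followed on toy shapes. The sketch below uses random numbers in place of real posterior draws, and its `probit` helper is a stand-in for `scipy.stats.norm.cdf`; all sizes are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def probit(eta):
    """Standard normal CDF applied elementwise (stand-in for scipy.stats.norm.cdf)."""
    return 0.5 * (1.0 + np.vectorize(erf)(eta / sqrt(2.0)))

rng = np.random.default_rng(2)
S, G, K, N = 50, 3, 2, 7                 # draws, groups, features, test rows (toy sizes)
X_test = rng.normal(size=(N, K))
group_idx = rng.integers(0, G, size=N)

# Non-hierarchical: beta draws of shape (S, K) give eta of shape (N, S).
beta_flat = rng.normal(size=(S, K))
eta_flat = X_test @ beta_flat.T

# Hierarchical: beta draws of shape (S, G, K); pick each row's group, then contract over K.
beta_group = rng.normal(size=(S, G, K))
eta_group = np.einsum("snk,nk->ns", beta_group[:, group_idx, :], X_test)

# Average the probabilities over draws, not the linear predictor.
p_mean = probit(eta_group).mean(axis=1)
y_pred = (p_mean >= 0.5).astype(int)
```

Averaging after the probit transform gives the posterior predictive probability. Note that the intermediate array `beta_group[:, group_idx, :]` has `S * N * K` entries, so with thousands of draws and tens of thousands of test rows it may be worth processing the test set in chunks.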

In [None]:
model123, eta123, y123 = define_model(
            df_simulated_test, feature_cols,
            group_idx=df_simulated_test["group_id"].values,
            group=False, interaction=False, nonlinear=False
        )

trace = run_quasi_model(
                model123, eta123, y123,
                loss_kind="bce",
                draws=1000, tune=1000, target_accept=0.9,
                return_inferencedata=True,     idata_kwargs={"log_likelihood": True}
            )


In [None]:
print(trace.posterior["beta"].stack(sample=("chain","draw")))

<xarray.DataArray 'beta' (beta_dim_0: 4, sample: 4000)> Size: 128kB
array([[ 0.00767734,  0.0103127 ,  0.01914304, ...,  0.00040624,
        -0.00441607, -0.0026601 ],
       [ 0.27923985,  0.27965409,  0.28360715, ...,  0.29371519,
         0.293261  ,  0.28819431],
       [-0.17587823, -0.17750586, -0.17076636, ..., -0.17441405,
        -0.19027545, -0.18226491],
       [ 0.25302131,  0.25404311,  0.25968831, ...,  0.25671671,
         0.24352239,  0.27265221]])
Coordinates:
  * beta_dim_0  (beta_dim_0) int64 32B 0 1 2 3
  * sample      (sample) object 32kB MultiIndex
  * chain       (sample) int64 32kB 0 0 0 0 0 0 0 0 0 0 ... 3 3 3 3 3 3 3 3 3 3
  * draw        (sample) int64 32kB 0 1 2 3 4 5 6 ... 994 995 996 997 998 999


## 1. Classical Bayesian model

In [None]:
print(df_results_bayes_m0)
print(df_results_bayes_mf)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.58476  0.663497  0.235968      bayes      None
1      1  0.56088  0.664066  0.236271      bayes      None
2      2  0.57410  0.664663  0.236490      bayes      None
3      3  0.55728  0.665086  0.236824      bayes      None
4      4  0.56292  0.664656  0.236540      bayes      None
5      5  0.57984  0.664001  0.236165      bayes      None
6      6  0.55424  0.662976  0.235756      bayes      None
7      7  0.55650  0.665039  0.236583      bayes      None
8      8  0.54748  0.665284  0.236866      bayes      None
9      9  0.56376  0.664229  0.236428      bayes      None
10    10  0.54936  0.663703  0.236115      bayes      None
11    11  0.61052  0.666296  0.237302      bayes      None
12    12  0.54174  0.663922  0.236206      bayes      None
13    13  0.55360  0.665087  0.236759      bayes      None
14    14  0.55492  0.663919  0.236058      bayes      None
15    15  0.57428  0.664229  0.236290      bayes      No

In [None]:
df_results_bayes_m0 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 1000,
    group=False,
    interaction=False,
    nonlinear=False,
    use_quasi=False
)

df_results_bayes_mf = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 1000,
    group=True,
    interaction=True,
    nonlinear=True,
    use_quasi=False
)
print(df_results_bayes_m0)
print(df_results_bayes_mf)

ERROR:pymc.stats.convergence:There were 478 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
ERROR:pymc.stats.convergence:There were 557 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 623 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 266 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 311 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 283 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 499 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 619 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 420 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 382 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 405 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 513 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 316 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 691 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 702 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 340 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 405 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 326 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 266 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 275 divergences after tuning. Increase `target_accept` or reparameterize.
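The divergence messages above suggest reparameterizing. One common option for a hierarchical normal prior is the non-centered form, where a standard-normal offset is sampled and then shifted and scaled. Whether it removes the divergences in this notebook has not been checked here; the NumPy sketch below only illustrates, with hypothetical values of `mu` and `sigma`, that both forms describe the same distribution.

```python
import numpy as np

rng = np.random.default_rng(3)
G, K, n_draws = 5, 4, 20_000
mu = np.array([0.5, -1.0, 0.0, 2.0])      # hypothetical population means
sigma = np.array([0.1, 0.5, 1.0, 2.0])    # hypothetical population SDs

# Centered form: draw beta directly from Normal(mu, sigma).
beta_centered = rng.normal(mu, sigma, size=(n_draws, G, K))

# Non-centered form: draw a standard-normal offset, then shift and scale.
z = rng.normal(0.0, 1.0, size=(n_draws, G, K))
beta_noncentered = mu + sigma * z
```

In a PyMC model the same idea would amount to declaring the offset as a standard-normal variable and defining `beta` as a deterministic function of `mu`, `sigma`, and that offset.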

In [None]:
df_results_bayes_m0 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 200,
    group=False,
    interaction=False,
    nonlinear=False,
    use_quasi=False
)

df_results_bayes_m1 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 200,
    group=True,
    interaction=False,
    nonlinear=False,
    use_quasi=False
)

df_results_bayes_m2 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 200,
    group=False,
    interaction=True,
    nonlinear=False,
    use_quasi=False
)

df_results_bayes_m3 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 200,
    group=False,
    interaction=False,
    nonlinear=True,
    use_quasi=False
)
df_results_bayes_mf = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 200,
    group=True,
    interaction=True,
    nonlinear=True,
    use_quasi=False
)




In [None]:
print(df_results_bayes_m0)
print(df_results_bayes_m1)
print(df_results_bayes_m2)
print(df_results_bayes_m3)
print(df_results_bayes_mf)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.60556  0.665323  0.236813      bayes      None
1      1  0.55854  0.664128  0.236312      bayes      None
2      2  0.58686  0.666607  0.237491      bayes      None
3      3  0.58666  0.668819  0.238587      bayes      None
4      4  0.59392  0.666346  0.237142      bayes      None
5      5  0.60800  0.666717  0.237426      bayes      None
6      6  0.54950  0.662976  0.235722      bayes      None
7      7  0.57604  0.664863  0.236562      bayes      None
8      8  0.56248  0.665550  0.236940      bayes      None
9      9  0.57224  0.664275  0.236455      bayes      None
10    10  0.55014  0.663335  0.235838      bayes      None
11    11  0.57264  0.666296  0.237110      bayes      None
12    12  0.55052  0.666142  0.237350      bayes      None
13    13  0.54908  0.670554  0.239178      bayes      None
14    14  0.57088  0.665498  0.236928      bayes      None
15    15  0.60070  0.666914  0.237470      bayes      No

## 2. Quasi Bayes model

In [None]:
df_results_quasi_m0 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup = 200,
    group=False,
    interaction=False,
    nonlinear=False,
    use_quasi=True
)

print(df_results_quasi_m0)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.60556  0.665353  0.236832      quasi       bce
1      1  0.55854  0.664136  0.236318      quasi       bce
2      2  0.58686  0.666553  0.237459      quasi       bce
3      3  0.58546  0.668620  0.238495      quasi       bce
4      4  0.59392  0.666345  0.237137      quasi       bce
5      5  0.60800  0.666702  0.237417      quasi       bce
6      6  0.54950  0.663003  0.235734      quasi       bce
7      7  0.57506  0.664898  0.236578      quasi       bce
8      8  0.56434  0.665540  0.236933      quasi       bce
9      9  0.57224  0.664300  0.236467      quasi       bce
10    10  0.55014  0.663341  0.235840      quasi       bce
11    11  0.57660  0.666377  0.237146      quasi       bce
12    12  0.55052  0.666139  0.237349      quasi       bce
13    13  0.54830  0.670463  0.239139      quasi       bce
14    14  0.57086  0.665453  0.236898      quasi       bce
15    15  0.60070  0.666975  0.237501      quasi       b
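
Per-seed tables like the ones above are easier to compare once they are aggregated across seeds. The sketch below is one possible way to do that with pandas, using only the first three rows of the Bayes table and of the quasi-Bayes (`bce`) baseline table printed above. In the notebook one would presumably concatenate the full `df_results_*` frames instead. The `"none"` label and the name `summary` are illustrative choices. Because `loss_kind` is `None` for the Bayes runs and `groupby` drops missing keys by default, the missing values are filled before grouping.

```python
import pandas as pd

# First three seeds copied from the Bayes table above.
bayes_m0 = pd.DataFrame({
    "seed": [0, 1, 2],
    "acc": [0.60556, 0.55854, 0.58686],
    "logloss": [0.665323, 0.664128, 0.666607],
    "brier": [0.236813, 0.236312, 0.237491],
    "model_type": ["bayes"] * 3,
    "loss_kind": [None] * 3,
})

# First three seeds copied from the quasi-Bayes (bce) baseline table above.
quasi_m0 = pd.DataFrame({
    "seed": [0, 1, 2],
    "acc": [0.60556, 0.55854, 0.58686],
    "logloss": [0.665353, 0.664136, 0.666553],
    "brier": [0.236832, 0.236318, 0.237459],
    "model_type": ["quasi"] * 3,
    "loss_kind": ["bce"] * 3,
})

results = pd.concat([bayes_m0, quasi_m0], ignore_index=True)

# groupby() excludes rows whose key is missing, so label the Bayes rows explicitly.
results["loss_kind"] = results["loss_kind"].fillna("none")

# Mean and standard deviation of each metric across seeds, per method.
summary = (
    results
    .groupby(["model_type", "loss_kind"])[["acc", "logloss", "brier"]]
    .agg(["mean", "std"])
)
print(summary)
```

On these rows the two methods have identical accuracy and differ in log loss only in the fifth decimal place, which suggests the comparison should focus on the standard deviation across seeds rather than on single-seed differences.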

In [None]:
df_results_quasi_m1 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=True,
    interaction=False,
    nonlinear=False,
    use_quasi=True
)

print(df_results_quasi_m1)

Output()

ERROR:pymc.stats.convergence:There were 253 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 513 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 208 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 535 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 540 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 532 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 281 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 489 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 265 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 212 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 470 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 518 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 433 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 146 divergences after tuning. Increase `target_accept` or reparameterize.


Output()

ERROR:pymc.stats.convergence:There were 245 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 210 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 182 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 554 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 174 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 173 divergences after tuning. Increase `target_accept` or reparameterize.


    seed      acc   logloss     brier model_type loss_kind
0      0  0.59848  0.663320  0.235745      quasi       bce
1      1  0.61292  0.663208  0.235927      quasi       bce
2      2  0.59398  0.663554  0.235781      quasi       bce
3      3  0.58764  0.668030  0.238203      quasi       bce
4      4  0.59770  0.663134  0.235646      quasi       bce
5      5  0.56850  0.665490  0.236916      quasi       bce
6      6  0.58706  0.666918  0.237512      quasi       bce
7      7  0.60974  0.662096  0.234857      quasi       bce
8      8  0.58930  0.663748  0.236097      quasi       bce
9      9  0.60004  0.662684  0.235646      quasi       bce
10    10  0.59692  0.660988  0.234743      quasi       bce
11    11  0.57928  0.663855  0.235953      quasi       bce
12    12  0.59282  0.663071  0.235947      quasi       bce
13    13  0.58780  0.670618  0.238799      quasi       bce
14    14  0.58910  0.663679  0.235779      quasi       bce
15    15  0.59966  0.664354  0.236315      quasi       b
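
Every one of the 20 fits in the run above reports divergences, so the metrics in this table should be read with caution. To compare sampler health across model variants, the divergence counts can be pulled out of the captured log text. The following is a small sketch using only the standard library. `divergence_counts` is a hypothetical helper, and the sample text is copied from the first three divergence messages above.

```python
import re

# Matches the PyMC convergence message and captures the divergence count.
DIVERGENCE_RE = re.compile(r"There were (\d+) divergences after tuning")


def divergence_counts(log_text):
    """Return the divergence count of each fit, in order of appearance."""
    return [int(match.group(1)) for match in DIVERGENCE_RE.finditer(log_text)]


log_text = """\
ERROR:pymc.stats.convergence:There were 253 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 513 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 208 divergences after tuning. Increase `target_accept` or reparameterize.
"""

counts = divergence_counts(log_text)
print("fits with divergences:", len(counts))
print("total divergences:", sum(counts))
print("worst fit:", max(counts))
```

If the `InferenceData` objects returned by sampling are kept, the same counts can be read directly from `idata.sample_stats["diverging"]` instead of parsing logs.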

In [None]:
df_results_quasi_m2 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=False,
    interaction=True,
    nonlinear=False,
    use_quasi=True
)

print(df_results_quasi_m2)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.83732  0.336468  0.108880      quasi       bce
1      1  0.83380  0.346044  0.111900      quasi       bce
2      2  0.83100  0.338159  0.109693      quasi       bce
3      3  0.83812  0.339186  0.109698      quasi       bce
4      4  0.82468  0.344745  0.111494      quasi       bce
5      5  0.82042  0.339211  0.109795      quasi       bce
6      6  0.82676  0.339491  0.109398      quasi       bce
7      7  0.83592  0.341914  0.110279      quasi       bce
8      8  0.82998  0.338337  0.109280      quasi       bce
9      9  0.83438  0.341719  0.110792      quasi       bce
10    10  0.82644  0.341133  0.109911      quasi       bce
11    11  0.80694  0.343722  0.110164      quasi       bce
12    12  0.82976  0.339889  0.109450      quasi       bce
13    13  0.83294  0.348723  0.111958      quasi       bce
14    14  0.82570  0.338807  0.109830      quasi       bce
15    15  0.83656  0.341824  0.110167      quasi       b

In [None]:
df_results_quasi_m3 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=False,
    interaction=False,
    nonlinear=True,
    use_quasi=True
)
print(df_results_quasi_m3)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.57046  0.648400  0.228857      quasi       bce
1      1  0.57554  0.647703  0.228883      quasi       bce
2      2  0.55244  0.649170  0.229243      quasi       bce
3      3  0.57284  0.652262  0.230543      quasi       bce
4      4  0.57128  0.648375  0.228858      quasi       bce
5      5  0.53692  0.648429  0.229039      quasi       bce
6      6  0.59980  0.646222  0.228182      quasi       bce
7      7  0.57126  0.646958  0.228478      quasi       bce
8      8  0.57582  0.648852  0.229397      quasi       bce
9      9  0.58984  0.648341  0.229475      quasi       bce
10    10  0.54152  0.645935  0.227885      quasi       bce
11    11  0.57042  0.650097  0.229480      quasi       bce
12    12  0.56684  0.648954  0.229797      quasi       bce
13    13  0.54258  0.653045  0.231448      quasi       bce
14    14  0.57822  0.648135  0.228641      quasi       bce
15    15  0.60924  0.649898  0.229580      quasi       b

In [None]:
df_results_quasi_mf = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=True,
    interaction=True,
    nonlinear=True,
    use_quasi=True
)

print(df_results_quasi_mf)


Output()

ERROR:pymc.stats.convergence:There were 250 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 921 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 529 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 355 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 407 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 265 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 448 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 572 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 455 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 224 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 535 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 327 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 749 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 517 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 188 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 450 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 348 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 193 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 456 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 584 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


    seed      acc   logloss     brier model_type loss_kind
0      0  0.84356  0.326069  0.106413      quasi       bce
1      1  0.83176  0.331958  0.108929      quasi       bce
2      2  0.82110  0.323177  0.106212      quasi       bce
3      3  0.83550  0.328467  0.106351      quasi       bce
4      4  0.82210  0.327868  0.107233      quasi       bce
5      5  0.82278  0.333417  0.105302      quasi       bce
6      6  0.81728  0.329998  0.107157      quasi       bce
7      7  0.83572  0.327438  0.107664      quasi       bce
8      8  0.82812  0.326536  0.105304      quasi       bce
9      9  0.84230  0.327526  0.107548      quasi       bce
10    10  0.82662  0.325756  0.106409      quasi       bce
11    11  0.82358  0.329884  0.106473      quasi       bce
12    12  0.82806  0.328527  0.106482      quasi       bce
13    13  0.83082  0.333474  0.109090      quasi       bce
14    14  0.81990  0.324047  0.105965      quasi       bce
15    15  0.84056  0.334008  0.106090      quasi       b

## 3. SPH Bayes model

In [None]:
df_results_sph_m0 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=False,
    interaction=False,
    nonlinear=False,
    use_quasi=True,
    loss_kind="sph"
)

print(df_results_sph_m0)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.70482  0.706312  0.242674      quasi       sph
1      1  0.65196  0.664637  0.234056      quasi       sph
2      2  0.66792  0.670888  0.234987      quasi       sph
3      3  0.67702  0.690014  0.239490      quasi       sph
4      4  0.66274  0.696664  0.241829      quasi       sph
5      5  0.66698  0.658184  0.231456      quasi       sph
6      6  0.63558  0.656369  0.231248      quasi       sph
7      7  0.67174  0.671748  0.235766      quasi       sph
8      8  0.60756  0.657976  0.232958      quasi       sph
9      9  0.63586  0.659283  0.232808      quasi       sph
10    10  0.61742  0.650879  0.229470      quasi       sph
11    11  0.65450  0.661335  0.233289      quasi       sph
12    12  0.55014  0.673636  0.240279      quasi       sph
13    13  0.66604  0.713571  0.245903      quasi       sph
14    14  0.62172  0.651371  0.229718      quasi       sph
15    15  0.67260  0.702450  0.243250      quasi       s

In [None]:
df_results_sph_m1 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=True,
    interaction=False,
    nonlinear=False,
    use_quasi=True,
    loss_kind="sph"
)

print(df_results_sph_m1)

Output()

ERROR:pymc.stats.convergence:There were 538 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 303 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 249 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 396 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 174 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 181 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 407 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 265 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 288 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 398 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 132 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 876 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 116 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 366 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 617 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 143 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 290 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 317 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 307 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 137 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


    seed      acc   logloss     brier model_type loss_kind
0      0  0.66172  0.711361  0.238120      quasi       sph
1      1  0.66836  0.691426  0.236977      quasi       sph
2      2  0.66606  0.714871  0.234633      quasi       sph
3      3  0.66288  0.704478  0.238531      quasi       sph
4      4  0.65920  0.740814  0.246865      quasi       sph
5      5  0.66522  0.670899  0.233126      quasi       sph
6      6  0.66574  0.678136  0.232492      quasi       sph
7      7  0.67106  0.852266  0.258813      quasi       sph
8      8  0.65090  0.653482  0.227734      quasi       sph
9      9  0.66766  0.744337  0.243912      quasi       sph
10    10  0.64674  0.635452  0.221561      quasi       sph
11    11  0.66866  0.662819  0.228109      quasi       sph
12    12  0.63446  0.661920  0.230165      quasi       sph
13    13  0.67180  0.934869  0.266124      quasi       sph
14    14  0.63830  0.657126  0.227157      quasi       sph
15    15  0.67322  0.744774  0.247171      quasi       s

In [None]:
df_results_sph_m2 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=False,
    interaction=True,
    nonlinear=False,
    use_quasi=True,
    loss_kind="sph"
)

print(df_results_sph_m2)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.83964  0.347226  0.109605      quasi       sph
1      1  0.83636  0.358112  0.113177      quasi       sph
2      2  0.83170  0.347474  0.110006      quasi       sph
3      3  0.83776  0.351271  0.110645      quasi       sph
4      4  0.82354  0.356411  0.112160      quasi       sph
5      5  0.82484  0.349750  0.110066      quasi       sph
6      6  0.82764  0.351597  0.110484      quasi       sph
7      7  0.83846  0.352556  0.110922      quasi       sph
8      8  0.83096  0.350140  0.109957      quasi       sph
9      9  0.83568  0.353066  0.111405      quasi       sph
10    10  0.82654  0.351685  0.110858      quasi       sph
11    11  0.82492  0.354432  0.110928      quasi       sph
12    12  0.83166  0.352463  0.110839      quasi       sph
13    13  0.81984  0.361460  0.114255      quasi       sph
14    14  0.82638  0.349431  0.110186      quasi       sph
15    15  0.83746  0.351913  0.110790      quasi       s

In [None]:
df_results_sph_m3 = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=False,
    interaction=False,
    nonlinear=True,
    use_quasi=True,
    loss_kind="sph"
)
print(df_results_sph_m3)

ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

    seed      acc   logloss     brier model_type loss_kind
0      0  0.73426  0.635808  0.215068      quasi       sph
1      1  0.67758  0.719622  0.234500      quasi       sph
2      2  0.63824  0.788090  0.240804      quasi       sph
3      3  0.67252  0.793209  0.244525      quasi       sph
4      4  0.70644  0.689804  0.228879      quasi       sph
5      5  0.67876  0.653059  0.222685      quasi       sph
6      6  0.68702  0.628495  0.217706      quasi       sph
7      7  0.69550  0.678530  0.226000      quasi       sph
8      8  0.62650  0.738651  0.237582      quasi       sph
9      9  0.65420  0.655177  0.222859      quasi       sph
10    10  0.69520  0.622026  0.214012      quasi       sph
11    11  0.66358  0.766737  0.237299      quasi       sph
12    12  0.70004  0.684105  0.228260      quasi       sph
13    13  0.68204  0.683963  0.230567      quasi       sph
14    14  0.70116  0.627752  0.213894      quasi       sph
15    15  0.65772  0.678957  0.229687      quasi       s

In [None]:
df_results_sph_mf = run_experiment_loop(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    group=True,
    interaction=True,
    nonlinear=True,
    use_quasi=True,
    loss_kind="sph",
)

print(df_results_sph_mf)

ERROR:pymc.stats.convergence:There were 879 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
ERROR:pymc.stats.convergence:There were 346 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 397 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 404 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 271 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 393 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 197 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 212 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 646 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 279 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 292 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 386 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 199 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 324 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 280 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 401 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 187 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 320 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 235 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 245 divergences after tuning. Increase `target_accept` or reparameterize.

    seed      acc   logloss     brier model_type loss_kind
0      0  0.89262  0.404814  0.095384      quasi       sph
1      1  0.88904  0.522611  0.100641      quasi       sph
2      2  0.87856  0.469729  0.096797      quasi       sph
3      3  0.89074  0.474212  0.098692      quasi       sph
4      4  0.88006  0.488261  0.098303      quasi       sph
5      5  0.88268  0.512018  0.097534      quasi       sph
6      6  0.87898  0.400368  0.094527      quasi       sph
7      7  0.89252  0.401938  0.095180      quasi       sph
8      8  0.88828  0.443634  0.096873      quasi       sph
9      9  0.88650  0.505914  0.100018      quasi       sph
10    10  0.87786  0.493298  0.099327      quasi       sph
11    11  0.87968  0.493032  0.097959      quasi       sph
12    12  0.87936  0.473336  0.096676      quasi       sph
13    13  0.88384  0.303943  0.090096      quasi       sph
14    14  0.87924  0.470309  0.095793      quasi       sph
15    15  0.89510  0.374241  0.095233      quasi       s

## 4. KSD Bayes model
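
As a reading aid, the objective that the code in this section samples from can be summarised as follows. This is read off from `logpost_ksd_bayes` and `ksd2_vstat_imq_cached`; the symbols $\sigma_0$ (for `prior_std`), $\beta_{\mathrm{lr}}$ (for `beta_lr`) and $\beta$ (for the IMQ exponent) are notation assumed here, not names used by the notebook. For fixed imputed latents $u_1, \dots, u_N$ and linear predictors $\eta_i(\theta)$:

$$
\log \pi(\theta \mid u) = -\frac{\lVert \theta \rVert^2}{2\sigma_0^2} - \beta_{\mathrm{lr}} \, N \, \widehat{\mathrm{KSD}}^2(\theta) + \text{const},
$$

$$
\widehat{\mathrm{KSD}}^2(\theta) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \Big[ s_i s_j \, k_{ij} + s_i \, \partial_{u_j} k_{ij} + s_j \, \partial_{u_i} k_{ij} + \partial_{u_i} \partial_{u_j} k_{ij} \Big],
$$

where $s_i = -(u_i - \eta_i(\theta))$ is the score of $\mathcal{N}(\eta_i, 1)$ at $u_i$ and $k_{ij} = \big(c^2 + (u_i - u_j)^2\big)^{-\beta}$ is the IMQ kernel. Only the terms involving $s_i$ depend on $\theta$, which is why the kernel quantities can be cached once per outer iteration.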

In [None]:
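import torch
import pandas as pd
import numpy as np
from typing import Tuple, Optional, Dict

# nuts_sample_theta below uses Pyro's NUTS kernel and MCMC driver;
# the alias `pmcmc` is assumed to refer to pyro.infer.mcmc.
import pyro.infer.mcmc as pmcmc
from pyro.infer.mcmc import NUTS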

# ---------------------------
# 1) Feature builder
# ---------------------------
def build_features(
    df: pd.DataFrame,
    feature_cols,
    interaction: bool,
    nonlinear: bool,
    group: bool
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, int, int]:
    X_df = df[list(feature_cols)].copy()
    X_df["bias"] = 1.0
    X = torch.tensor(X_df.values, dtype=torch.float32)

    y = torch.tensor(df["y"].values, dtype=torch.float32)
    if group:
        gids_raw = pd.Categorical(df["group_id"])
        group_ids = torch.tensor(gids_raw.codes, dtype=torch.long)
        G = len(gids_raw.categories)
    else:
        group_ids = torch.zeros(len(df), dtype=torch.long)
        G = 1

    inter = torch.tensor(
        (df["viewed_target_category"] * df["discount_received"]).values,
        dtype=torch.float32
    ).unsqueeze(1) if interaction else torch.zeros((len(df), 1), dtype=torch.float32)

    nonlin = torch.tensor(
        (df["previous_purchases"] ** 2).values,
        dtype=torch.float32
    ).unsqueeze(1) if nonlinear else torch.zeros((len(df), 1), dtype=torch.float32)

    extra = torch.cat([inter, nonlin], dim=1)
    D = X.shape[1]
    return X, y, group_ids, extra, D, G


# ---------------------------
# 2) Param unpack & eta(x)
# ---------------------------
def unpack_theta(theta: torch.Tensor, D: int, G: int, use_gamma1: bool, use_gamma2: bool):
    idx = 0
    if G > 1:
        betas = theta[idx: idx + G * D].reshape(G, D)
        idx += G * D
    else:
        betas = theta[idx: idx + D].unsqueeze(0)
        idx += D

    gamma1 = theta[idx] if use_gamma1 else torch.tensor(0.0, dtype=theta.dtype, device=theta.device)
    idx += 1 if use_gamma1 else 0

    gamma2 = theta[idx] if use_gamma2 else torch.tensor(0.0, dtype=theta.dtype, device=theta.device)
    idx += 1 if use_gamma2 else 0

    return {"betas": betas, "gamma1": gamma1, "gamma2": gamma2}


def eta_fn(theta: torch.Tensor, X: torch.Tensor, group_ids: torch.Tensor,
           extra: torch.Tensor, D: int, G: int,
           use_gamma1: bool, use_gamma2: bool) -> torch.Tensor:
    pars = unpack_theta(theta, D, G, use_gamma1, use_gamma2)
    betas = pars["betas"]
    xb = (X * betas[group_ids]).sum(dim=1)
    eta = xb
    if use_gamma1:
        eta = eta + pars["gamma1"] * extra[:, 0]
    if use_gamma2:
        eta = eta + pars["gamma2"] * extra[:, 1]
    return eta


# ---------------------------
# 3) Truncated Normal sampler
# ---------------------------
SQRT2 = 1.4142135623730951
def norm_cdf(z: torch.Tensor) -> torch.Tensor:
    return 0.5 * (1.0 + torch.erf(z / SQRT2))

def norm_ppf(u: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    u = u.clamp(eps, 1.0 - eps)
    return SQRT2 * torch.erfinv(2.0 * u - 1.0)

@torch.no_grad()
def sample_truncnorm_from_probit(eta: torch.Tensor, y: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    N = eta.shape[0]
    a = (0.0 - eta)
    Phi_a = norm_cdf(a).clamp(eps, 1.0 - eps)

    u1 = Phi_a + (1.0 - Phi_a) * torch.rand(N, device=eta.device)
    z1 = eta + norm_ppf(u1)

    u0 = torch.rand(N, device=eta.device) * Phi_a
    z0 = eta + norm_ppf(u0)

    return torch.where(y > 0.5, z1, z0)


# ---------------------------
# 4) KSD^2 with IMQ kernel (V-statistic); _poly_stats is an unused polynomial-kernel helper
# ---------------------------
def _poly_stats(r: torch.Tensor, gamma: float):
    k   = (1.0 + r).pow(gamma)
    kp  = gamma * (1.0 + r).pow(gamma - 1.0)
    kpp = gamma * (gamma - 1.0) * (1.0 + r).pow(gamma - 2.0)
    return k, kp, kpp

def imq_cache_from_u(u: torch.Tensor, c: float = None, beta: float = 0.5):
    # u: (N,1) tensor on device
    u = u.view(-1, 1)
    device = u.device
    N = u.shape[0]
    diff = u - u.t()
    r2 = diff.pow(2)

    # median heuristic for c (once per outer step)
    if c is None:
        # median of the upper-triangular pairwise distances
        tri = r2[torch.triu(torch.ones_like(r2, dtype=torch.bool), diagonal=1)]
        med = torch.sqrt(tri.median()).clamp_min(1e-8)
        c = med

    A   = c*c + r2
    k   = A.pow(-beta)
    kp  = -beta * A.pow(-beta - 1.0)
    kpp = (-beta) * (-beta - 1.0) * A.pow(-beta - 2.0)

    dA_du  =  2.0 * diff
    dA_dup = -2.0 * diff

    # cross second derivative base term (independent of eta)
    trace_base = kpp * dA_du * dA_dup + kp * (-2.0)

    ones_col = torch.ones(N, 1, device=device)
    ones_row = torch.ones(1, N, device=device)

    return {
        "k": k, "kp": kp,
        "dA_du": dA_du, "dA_dup": dA_dup,
        "trace_base": trace_base,
        "ones_col": ones_col, "ones_row": ones_row,
        "c": c, "beta": beta
    }

def ksd2_vstat_imq_cached(u: torch.Tensor, eta: torch.Tensor, cache: dict) -> torch.Tensor:
    # u: (N,1) or (N,) / eta: (N,)
    u   = u.view(-1, 1)
    eta = eta.view(-1, 1)
    N   = u.shape[0]
    k         = cache["k"]
    dA_du     = cache["dA_du"]
    dA_dup    = cache["dA_dup"]
    trace_base= cache["trace_base"]
    ones_col  = cache["ones_col"]
    ones_row  = cache["ones_row"]

    # score for N(eta,1)
    s = -(u - eta)                     # (N,1)
    s_dot = s @ s.t()                  # (N,N)

    # gradients via chain rule: grad_u_k = (dk/dA)*dA/du = kp * dA_du
    kp = cache["kp"]
    grad_u_k  = kp * dA_du
    grad_up_k = kp * dA_dup

    term1 = k * s_dot
    term2 = (s @ ones_row) * grad_up_k
    term3 = (ones_col @ s.t()) * grad_u_k
    Umat = term1 + term2 + term3 + trace_base
    return Umat.mean()
# ---------------------------
# 5) Log posterior (KSD-Bayes, no RBF)
# ---------------------------
def logpost_ksd_bayes(
    theta: torch.Tensor,
    X: torch.Tensor, y: torch.Tensor, group_ids: torch.Tensor,
    extra: torch.Tensor,
    D: int, G: int,
    use_gamma1: bool, use_gamma2: bool,
    u_imputed: torch.Tensor,
    beta_lr: float = 1.0,
    prior_std: float = 1.0,
    cache = None
) -> torch.Tensor:
    eta = eta_fn(theta, X, group_ids, extra, D, G, use_gamma1, use_gamma2)
    ksd2 = ksd2_vstat_imq_cached(u_imputed, eta, cache)
    log_prior = -0.5 * theta.pow(2).sum() / (prior_std**2)
    N = X.shape[0]
    return log_prior - beta_lr * N * ksd2


# ---------------------------
# 6) HMC update
# ---------------------------

def grad_logpost(theta, X, y, group_ids, extra, D, G,
                 use_gamma1, use_gamma2, u_imputed,
                 beta_lr=1.0, prior_std=1.0, cache=None):
    # Callers such as hmc_update_theta run under torch.no_grad(), so autograd
    # is re-enabled here; otherwise torch.autograd.grad would raise.
    with torch.enable_grad():
        theta = theta.detach().requires_grad_(True)
        lp = logpost_ksd_bayes(theta, X, y, group_ids, extra, D, G,
                               use_gamma1, use_gamma2, u_imputed,
                               beta_lr=beta_lr, prior_std=prior_std, cache=cache)
        (g,) = torch.autograd.grad(lp, theta, retain_graph=False, create_graph=False)
    return lp.detach(), g

@torch.no_grad()
def hmc_update_theta(
    theta: torch.Tensor,
    X: torch.Tensor, y: torch.Tensor, group_ids: torch.Tensor, extra: torch.Tensor,
    D: int, G: int, use_gamma1: bool, use_gamma2: bool,
    u_imputed: torch.Tensor,
    L: int = 5,                # leapfrog steps
    eps: float = 1e-3,         # step size
    mass: float = 1.0,         # scalar mass (diag M)
    beta_lr: float = 1.0,
    prior_std: float = 1.0,
    cache: Optional[Dict] = None
) -> Tuple[torch.Tensor, float]:
    device = theta.device
    # initial state
    theta0 = theta.clone()
    p0 = torch.randn_like(theta) * mass**0.5  # N(0, M)
    # current log posterior / gradient (computed in a separate call because autograd is needed)
    lp, g = grad_logpost(theta0, X, y, group_ids, extra, D, G,
                         use_gamma1, use_gamma2, u_imputed,
                         beta_lr=beta_lr, prior_std=prior_std, cache=cache)

    # Half-step for momentum
    p = p0 + 0.5 * eps * g
    theta_prop = theta0.clone()

    # Leapfrog
    for step in range(L):
        theta_prop = (theta_prop + eps * p / mass).detach()  # position step
        # recompute grad at new position
        lp_prop, g_prop = grad_logpost(theta_prop, X, y, group_ids, extra, D, G,
                                       use_gamma1, use_gamma2, u_imputed,
                                       beta_lr=beta_lr, prior_std=prior_std, cache=cache)
        # full step for momentum except last iteration
        if step != L - 1:
            p = p + eps * g_prop
        else:
            p = p + 0.5 * eps * g_prop

    # Hamiltonian
    H0 = -lp + 0.5 * (p0.pow(2).sum() / mass)
    H1 = -lp_prop + 0.5 * (p.pow(2).sum() / mass)

    log_alpha = -(H1 - H0)  # log accept prob
    u = torch.rand(1, device=device)
    if torch.log(u) < log_alpha:
        return theta_prop.detach(), 1.0
    else:
        return theta0, 0.0

def make_potential_fn(
    X, y, group_ids, extra, D, G, use_gamma1, use_gamma2,
    u_imputed, beta_lr, prior_std, cache
):
    # Pyro expects the potential to be the negative log target.
    def _potential(theta_dict):
        theta = theta_dict["theta"]
        return -logpost_ksd_bayes(
            theta, X, y, group_ids, extra, D, G,
            use_gamma1, use_gamma2, u_imputed,
            beta_lr=beta_lr, prior_std=prior_std, cache=cache
        )
    return _potential

def nuts_sample_theta(
    theta_init: torch.Tensor,
    X: torch.Tensor, y: torch.Tensor, group_ids: torch.Tensor, extra: torch.Tensor,
    D: int, G: int, use_gamma1: bool, use_gamma2: bool,
    u_imputed: torch.Tensor, cache: dict,
    beta_lr: float = 1.0, prior_std: float = 1.0,
    num_warmup: int = 300, num_samples: int = 100,
    target_accept_prob: float = 0.8, max_tree_depth: int = 10,
) -> torch.Tensor:
    device = theta_init.device
    potential_fn = make_potential_fn(
        X, y, group_ids, extra, D, G, use_gamma1, use_gamma2,
        u_imputed, beta_lr, prior_std, cache
    )

    kernel = NUTS(
        potential_fn=potential_fn,
        target_accept_prob=target_accept_prob,
        max_tree_depth=max_tree_depth,
        adapt_step_size=True,
        adapt_mass_matrix=True,
    )

    # Compatibility across Pyro versions: (A) MCMC(warmup_steps=...) -> run()
    #                                     (B) MCMC(...) -> run(num_warmup=...)
    mcmc = None
    try:
        mcmc = pmcmc.MCMC(
            kernel,
            num_samples=num_samples,
            warmup_steps=num_warmup,                 # supported only in some versions
            initial_params={"theta": theta_init.detach()},
        )
        mcmc.run()                                   # num_warmup is not passed here
    except TypeError:
        # versions without the warmup_steps argument
        mcmc = pmcmc.MCMC(
            kernel,
            num_samples=num_samples,
            initial_params={"theta": theta_init.detach()},
        )
        mcmc.run(num_warmup=num_warmup)              # these versions take it in run()

    samples = mcmc.get_samples(group_by_chain=False)["theta"].to(device)
    return samples
# ---------------------------
# 7) Training loop
# ---------------------------
def init_theta(D: int, G: int, use_gamma1: bool, use_gamma2: bool, scale: float = 0.1) -> torch.Tensor:
    k = (G * D if G > 1 else D) + (1 if use_gamma1 else 0) + (1 if use_gamma2 else 0)
    return scale * torch.randn(k)

def fit_ksd_bayes(
    df_train: pd.DataFrame,
    feature_cols,
    interaction: bool = True,
    nonlinear: bool = True,
    group: bool = True,
    n_iter: int = 400,
    inner_updates: int = 5,
    step_scale: float = 0.05,
    beta_lr: float = 1.0,
    prior_std: float = 1.0,
    device: str = "cpu"
) -> Dict[str, torch.Tensor]:
    X, y, group_ids, extra, D, G = build_features(df_train, feature_cols, interaction, nonlinear, group)
    X, y, group_ids, extra = X.to(device), y.to(device), group_ids.to(device), extra.to(device)

    use_gamma1 = interaction
    use_gamma2 = nonlinear
    theta = init_theta(D, G, use_gamma1, use_gamma2).to(device)

    accepts = 0
    samples = []

    for t in range(n_iter):
        eta = eta_fn(theta, X, group_ids, extra, D, G, use_gamma1, use_gamma2)
        u = sample_truncnorm_from_probit(eta, y)

        cache = imq_cache_from_u(u, c=None, beta=0.5)

        for _ in range(inner_updates):
            theta, acc = hmc_update_theta(
                theta, X, y, group_ids, extra, D, G,
                use_gamma1, use_gamma2, u,
                L=5, eps=1e-3, mass=1.0, beta_lr=beta_lr,
                prior_std=prior_std, cache = cache
            )
            accepts += acc
        samples.append(theta.clone().cpu())

    return {
        "theta_samples": torch.stack(samples),
        "accept_rate": accepts / (n_iter * inner_updates),
        "D": D, "G": G, "use_gamma1": use_gamma1, "use_gamma2": use_gamma2
    }

def fit_ksd_bayes_nuts(
    df_train: pd.DataFrame,
    feature_cols,
    interaction: bool = True,
    nonlinear: bool = True,
    group: bool = True,
    n_outer: int = 200,               # outer loop (number of times u is refreshed)
    nuts_warmup: int = 300,
    nuts_samples: int = 100,
    beta_lr: float = 1.0,
    prior_std: float = 1.0,
    imq_beta: float = 0.5,            # IMQ exponent (usually fixed at 0.5)
    target_accept_prob: float = 0.8,
    max_tree_depth: int = 10,
    device: str = "cuda",
) -> Dict[str, torch.Tensor]:
    X, y, group_ids, extra, D, G = build_features(df_train, feature_cols, interaction, nonlinear, group)
    X, y, group_ids, extra = X.to(device), y.to(device), group_ids.to(device), extra.to(device)

    use_gamma1 = interaction
    use_gamma2 = nonlinear
    theta = init_theta(D, G, use_gamma1, use_gamma2).to(device)

    samples = []

    for t in range(n_outer):
        # 1) compute eta at the current theta, then sample u
        eta = eta_fn(theta, X, group_ids, extra, D, G, use_gamma1, use_gamma2)
        u = sample_truncnorm_from_probit(eta, y).to(device)

        # 2) IMQ cache (c chosen by the median heuristic)
        cache = imq_cache_from_u(u, c=None, beta=imq_beta)

        # 3) draw theta samples with NUTS (u and cache held fixed)
        theta_draws = nuts_sample_theta(
            theta_init=theta,
            X=X, y=y, group_ids=group_ids, extra=extra,
            D=D, G=G, use_gamma1=use_gamma1, use_gamma2=use_gamma2,
            u_imputed=u, cache=cache,
            beta_lr=beta_lr, prior_std=prior_std,
            num_warmup=nuts_warmup, num_samples=nuts_samples,
            target_accept_prob=target_accept_prob, max_tree_depth=max_tree_depth,
        )

        # 4) choose the new state: (a) the last draw, or (b) the mean
        theta = theta_draws[-1].detach()
        samples.append(theta.cpu())

    return {
        "theta_samples": torch.stack(samples),  # (n_outer, dim_theta)
        "D": D, "G": G, "use_gamma1": use_gamma1, "use_gamma2": use_gamma2
    }
# ---------------------------
# 8) Prediction
# ---------------------------
def predict_probit(
    theta_mean: torch.Tensor,
    df_test: pd.DataFrame,
    feature_cols,
    interaction: bool, nonlinear: bool, group: bool
) -> Tuple[torch.Tensor, Dict[str, float]]:
    X, y, group_ids, extra, D, G = build_features(df_test, feature_cols, interaction, nonlinear, group)

    # move all tensors to the device of theta
    device = theta_mean.device
    X = X.to(device)
    y = y.to(device)
    group_ids = group_ids.to(device)      # keeps the long dtype
    extra = extra.to(device)
    eta = eta_fn(theta_mean, X, group_ids, extra, D, G, interaction, nonlinear)
    p = norm_cdf(eta)  # probit link
    p = p.clamp(1e-8, 1-1e-8)
    yhat = (p > 0.5).float()

    acc = (yhat == y).float().mean().item()
    brier = torch.mean((p - y) ** 2).item()
    logloss = -torch.mean(y * torch.log(p) + (1 - y) * torch.log(1 - p)).item()

    return p, {"accuracy": acc, "brier": brier, "logloss": logloss}
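
The latent-variable step in `sample_truncnorm_from_probit` is an inverse-CDF draw from a truncated normal. The following is a minimal scalar sketch of the same idea using only the standard library, so the sign constraint can be checked without PyTorch. The function name `sample_latent` and the test values of `eta` are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf() and inv_cdf()

def sample_latent(eta: float, y: int, rng: random.Random, eps: float = 1e-6) -> float:
    """Draw z ~ N(eta, 1) truncated to z > 0 when y == 1 and to z < 0 otherwise."""
    # P(z <= 0) = Phi(-eta), clamped away from 0 and 1 for numerical safety
    phi_a = min(max(STD_NORMAL.cdf(-eta), eps), 1.0 - eps)
    if y == 1:
        u = phi_a + (1.0 - phi_a) * rng.random()  # uniform on [phi_a, 1)
    else:
        u = phi_a * rng.random()                  # uniform on [0, phi_a)
    u = min(max(u, eps), 1.0 - eps)               # keep inv_cdf finite
    return eta + STD_NORMAL.inv_cdf(u)

rng = random.Random(0)
draws_pos = [sample_latent(-1.5, 1, rng) for _ in range(1000)]  # y = 1: expect z >= 0
draws_neg = [sample_latent(2.0, 0, rng) for _ in range(1000)]   # y = 0: expect z <= 0
print(min(draws_pos), max(draws_neg))
```

The draws respect the observed label even when `eta` points the other way, which is the property the outer loop relies on when it refreshes `u`.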

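The leapfrog scheme used by `hmc_update_theta` (half momentum step, alternating full steps, final half momentum step, then a Metropolis correction) can be sketched in one dimension as below. This is an illustrative toy under stated assumptions, not the notebook's sampler: the target is a standard normal, the mass is fixed to 1, and the names `leapfrog` and `hmc_step` are hypothetical.

```python
import math
import random

def leapfrog(theta, p, grad_logp, eps, n_steps):
    """Unit-mass leapfrog: half momentum step, then alternating full steps,
    finishing with a half momentum step."""
    p = p + 0.5 * eps * grad_logp(theta)
    for step in range(n_steps):
        theta = theta + eps * p
        scale = 1.0 if step < n_steps - 1 else 0.5
        p = p + scale * eps * grad_logp(theta)
    return theta, p

def hmc_step(theta, logp, grad_logp, eps, n_steps, rng):
    """One HMC transition with a Metropolis accept/reject correction."""
    p0 = rng.gauss(0.0, 1.0)
    theta_new, p_new = leapfrog(theta, p0, grad_logp, eps, n_steps)
    h0 = -logp(theta) + 0.5 * p0 * p0          # Hamiltonian at the start
    h1 = -logp(theta_new) + 0.5 * p_new * p_new  # Hamiltonian at the proposal
    accept = rng.random() < math.exp(min(0.0, h0 - h1))
    return (theta_new, True) if accept else (theta, False)

# Toy target (an assumption for illustration): standard normal.
logp = lambda t: -0.5 * t * t
grad_logp = lambda t: -t

rng = random.Random(0)
theta, n_accept = 0.0, 0
for _ in range(2000):
    theta, accepted = hmc_step(theta, logp, grad_logp, eps=0.2, n_steps=10, rng=rng)
    n_accept += accepted
accept_rate = n_accept / 2000
print(f"acceptance rate: {accept_rate:.3f}")
```

A very low acceptance rate in a sketch like this usually points to a step size that is too large for the curvature of the target; the same diagnostic applies to the `accept_rate` returned by `fit_ksd_bayes`.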

In [None]:
def fit_ksd_bayes_nuts_early(
    df_train, df_test, feature_cols,
    interaction=True, nonlinear=True, group=True,
    n_outer=120, nuts_warmup=200, nuts_samples=30,
    beta_lr=0.5, prior_std=1.0, imq_beta=0.5,
    target_accept_prob=0.9, max_tree_depth=10,
    burn_outer=20, check_window=5,
    tol_logloss=0.01,   # converged if the improvement is below 1%
    tol_brier=0.005,    # converged if the improvement is below 0.5%
    tol_acc=0.005,      # converged if the improvement is below 0.5 percentage points
    device="cpu", verbose=True
):
    X, y, group_ids, extra, D, G = build_features(df_train, feature_cols, interaction, nonlinear, group)
    X, y, group_ids, extra = X.to(device), y.to(device), group_ids.to(device), extra.to(device)

    use_gamma1, use_gamma2 = interaction, nonlinear
    theta = init_theta(D, G, use_gamma1, use_gamma2).to(device)

    # history (recorded at every outer iteration)
    history = {"logloss": [], "brier": [], "accuracy": []}
    thetas  = []  # θ path (optional)
    stopped_at = None

    for t in range(n_outer):
        # 1) sample u
        eta = eta_fn(theta, X, group_ids, extra, D, G, use_gamma1, use_gamma2)
        u = sample_truncnorm_from_probit(eta, y).to(device)

        # 2) KSD cache
        cache = imq_cache_from_u(u, c=None, beta=imq_beta)

        # 3) sample θ with NUTS (u held fixed)
        theta_draws = nuts_sample_theta(
            theta_init=theta,
            X=X, y=y, group_ids=group_ids, extra=extra,
            D=D, G=G, use_gamma1=use_gamma1, use_gamma2=use_gamma2,
            u_imputed=u, cache=cache,
            beta_lr=beta_lr, prior_std=prior_std,
            num_warmup=nuts_warmup, num_samples=nuts_samples,
            target_accept_prob=target_accept_prob, max_tree_depth=max_tree_depth
        )

        # 4) choose the new θ (the last draw)
        theta = theta_draws[-1].detach()
        thetas.append(theta.cpu())

        # -- evaluation: posterior predictive mean (recommended, reduces noise) --
        # average p over the most recent half of the draws
        k = max(1, nuts_samples // 2)
        ps = []
        for th in theta_draws[-k:]:
            p_th, _ = predict_probit(th, df_train, feature_cols, interaction, nonlinear, group)
            ps.append(p_th.to(device))
        p_mean = torch.stack(ps).mean(0)

        y_true = torch.tensor(df_train["y"].values, dtype=torch.float32, device=p_mean.device)
        eps = 1e-8
        acc = ((p_mean > 0.5).float() == y_true).float().mean().item()
        brier = torch.mean((p_mean - y_true) ** 2).item()
        logloss = (-y_true*torch.log(p_mean.clamp(eps,1-eps))
                   -(1-y_true)*torch.log((1-p_mean).clamp(eps,1-eps))).mean().item()

        history["logloss"].append(logloss)
        history["brier"].append(brier)
        history["accuracy"].append(acc)

        if verbose:
            print(f"[outer {t:03d}] logloss={logloss:.4f}  brier={brier:.4f}  acc={acc:.4f}")

        # -- early-stopping check: only after burn-in plus two windows (2*window) --
        if t >= burn_outer + 2*check_window:
            def mov_avg(vals): return float(np.mean(vals[-check_window:]))

            ll_recent, ll_prev = mov_avg(history["logloss"]), mov_avg(history["logloss"][:-check_window])
            br_recent, br_prev = mov_avg(history["brier"]),   mov_avg(history["brier"][:-check_window])
            ac_recent, ac_prev = mov_avg(history["accuracy"]),mov_avg(history["accuracy"][:-check_window])

            ll_impr = (ll_prev - ll_recent) / max(ll_prev, 1e-12)   # relative improvement
            br_impr = (br_prev - br_recent) / max(br_prev, 1e-12)   # relative improvement
            ac_impr = (ac_recent - ac_prev)                         # absolute gain (higher is better)

            stop_ll  = (0 <= ll_impr < tol_logloss)   # e.g. 0 ≤ 0.006 < 0.01
            stop_br  = (0 <= br_impr < tol_brier)
            stop_acc = (0 <= ac_impr < tol_acc)

            if stop_ll and stop_br and stop_acc:
                stopped_at = t
                if verbose:
                    print(f"[Early stop @ outer {t}] "
                          f"ll_impr={ll_impr:.3%}, br_impr={br_impr:.3%}, ac_impr={ac_impr:.3f}")
                break
    p_test, metrics_test = predict_probit(theta, df_test, feature_cols, interaction, nonlinear, group)

    return {
        "theta_path": torch.stack(thetas),
        "history_train": history,
        "stopped_at": stopped_at,
        "final_theta": thetas[-1],
        "metrics_test": metrics_test
    }
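
The stopping rule above compares the mean of the most recent window with the mean of the window before it, and stops only when the improvement is non-negative and below a tolerance. A standalone sketch of that rule for a single metric is shown here; the helper name `should_stop` and the example history are hypothetical and only meant to make the behaviour easy to test in isolation.

```python
from statistics import mean

def should_stop(history, window, tol, relative=True, higher_is_better=False):
    """Return True when the last `window` values improved on the preceding
    `window` values by a non-negative amount smaller than `tol`."""
    if len(history) < 2 * window:
        return False  # not enough points to form two windows
    recent = mean(history[-window:])
    prev = mean(history[-2 * window:-window])
    gain = (recent - prev) if higher_is_better else (prev - recent)
    if relative:
        gain /= max(prev, 1e-12)  # relative improvement, as for logloss and Brier
    return 0 <= gain < tol

# Illustrative logloss trajectory that flattens out
logloss = [0.70, 0.60, 0.55, 0.52, 0.505, 0.503, 0.502, 0.501]
print(should_stop(logloss, window=2, tol=0.01))  # small positive gain
print(should_stop(logloss, window=4, tol=0.01))  # still improving a lot
```

Note that a metric that gets worse yields a negative gain and therefore does not trigger a stop, which matches the `0 <= ... < tol` conditions in the cell above.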


In [None]:
import numpy as np
import torch
from collections import deque

def fit_ksd_bayes_nuts_ema_ensemble(
    df_train, df_test, feature_cols,
    interaction=True, nonlinear=True, group=True,
    n_outer=60,
    nuts_warmup=300, nuts_samples=30,
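    frac_pp_mean=0.5,            # fraction q of the final NUTS draws used for the posterior predictive mean
    beta_lr=0.01, prior_std=1.0, imq_beta=0.5,
    target_accept_prob=0.90, max_tree_depth=10,
    # --- stabilization hyperparameters ---
    alpha=0.25,                  # EMA coefficient (0.2-0.3 recommended when the period T is about 6-8)
    K_ens=8,                     # ensemble size: number of recent outer iterations whose p is averaged for evaluation
    K_ma=7,                      # moving-average window for early stopping (about half the period)
    tol_logloss=0.01, tol_brier=0.005, tol_acc=0.005,
    device="cuda", verbose=True
):
    """
    Monitor and early-stop on the training set only (EMA + ensemble of the last K outer iterations), then evaluate once on the test set at the end.
    Returns: theta_path (EMA θ), history_train (metrics based on the ensemble p), metrics_test (final).
    """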
    # --- build (TRAIN) ---
    X, y, group_ids, extra, D, G = build_features(df_train, feature_cols, interaction, nonlinear, group)
    X, y, group_ids, extra = X.to(device), y.to(device), group_ids.to(device), extra.to(device)
    use_gamma1, use_gamma2 = interaction, nonlinear

    # initial θ
    theta = init_theta(D, G, use_gamma1, use_gamma2).to(device)
    theta_path = []
    history = {"logloss": [], "brier": [], "accuracy": []}
    last_theta_draws = None

    # ensemble buffer (stores p from the last K outer iterations)
    bag = deque(maxlen=K_ens)

    stopped_at = None
    eps = 1e-8

    for t in range(n_outer):
        # 1) sample u
        eta = eta_fn(theta, X, group_ids, extra, D, G, use_gamma1, use_gamma2)
        u = sample_truncnorm_from_probit(eta, y).to(device)

        # 2) KSD cache (IMQ)
        cache = imq_cache_from_u(u, c=None, beta=imq_beta)

        # 3) sample θ with NUTS
        theta_draws = nuts_sample_theta(
            theta_init=theta,
            X=X, y=y, group_ids=group_ids, extra=extra,
            D=D, G=G, use_gamma1=use_gamma1, use_gamma2=use_gamma2,
            u_imputed=u, cache=cache,
            beta_lr=beta_lr, prior_std=prior_std,
            num_warmup=nuts_warmup, num_samples=nuts_samples,
            target_accept_prob=target_accept_prob, max_tree_depth=max_tree_depth,
        )
        last_theta_draws = theta_draws.detach().cpu()

        # 4) mean of the inner draws + EMA update
        theta_inner_mean = theta_draws.mean(0).detach()
        if t == 0:
            theta = theta_inner_mean
        else:
            theta = (1.0 - alpha) * theta + alpha * theta_inner_mean

        theta_path.append(theta.detach().cpu())

        # 5) posterior predictive mean (last fraction q of the draws within one outer iteration)
        k_pp = max(1, int(nuts_samples * frac_pp_mean))
        ps = []
        for th in theta_draws[-k_pp:]:
            p_th, _ = predict_probit(th, df_train, feature_cols, interaction, nonlinear, group)
            ps.append(p_th.to(device))
        p_mean_this_outer = torch.stack(ps).mean(0)

        # 6) Ensemble probability over the last K_ens outer iterations
        bag.append(p_mean_this_outer.detach())
        p_ens = torch.stack(list(bag)).mean(0)

        # 7) Record TRAIN metrics (based on the ensemble p)
        y_tr = torch.tensor(df_train["y"].values, dtype=torch.float32, device=device)
        acc = ((p_ens > 0.5).float() == y_tr).float().mean().item()
        brier = torch.mean((p_ens - y_tr) ** 2).item()
        logloss = (-y_tr * torch.log(p_ens.clamp(eps, 1-eps))
                   - (1 - y_tr) * torch.log((1 - p_ens).clamp(eps, 1-eps))).mean().item()

        history["logloss"].append(logloss)
        history["brier"].append(brier)
        history["accuracy"].append(acc)

        if verbose:
            print(f"[outer {t:03d}] TRAIN (EMA+K-ens) ll={logloss:.4f}  br={brier:.4f}  acc={acc:.4f}")

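        # 8) Early stopping: compare the moving average of the last K_ma values with that of the
        #    preceding K_ma values; stop only when the improvement is non-negative and small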
        if len(history["logloss"]) >= 2*K_ma:
            def ma_tail(vals, K): return float(np.mean(vals[-K:]))
            ll_r = ma_tail(history["logloss"], K_ma)
            ll_p = ma_tail(history["logloss"][:-K_ma], K_ma)
            br_r = ma_tail(history["brier"],   K_ma)
            br_p = ma_tail(history["brier"][:-K_ma],   K_ma)
            ac_r = ma_tail(history["accuracy"], K_ma)
            ac_p = ma_tail(history["accuracy"][:-K_ma], K_ma)

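            ll_impr = (ll_p - ll_r) / max(ll_p, 1e-12)  # relative; positive means improvement
            br_impr = (br_p - br_r) / max(br_p, 1e-12)  # relative; positive means improvement
            ac_impr = (ac_r - ac_p)                      # absolute; positive means improvement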

            stop_ll  = (0 <= ll_impr < tol_logloss)
            stop_br  = (0 <= br_impr < tol_brier)
            stop_acc = (0 <= ac_impr < tol_acc)

            if stop_ll and stop_br and stop_acc:
                stopped_at = t
                if verbose:
                    print(f"[Early stop @ outer {t}] "
                          f"Δll={ll_impr:.3%}, Δbr={br_impr:.3%}, Δacc={ac_impr:.3f}")
                break

    # --- Final TEST evaluation (run once) ---
    ps = []
    with torch.no_grad():
        for th in last_theta_draws:  # use every θ draw from the last outer iteration
            p_th, _ = predict_probit(th.to(device), df_test, feature_cols, interaction, nonlinear, group)
            ps.append(p_th.to(device))

    p_test = torch.stack(ps).mean(0)  # posterior predictive mean

    y_te = torch.tensor(df_test["y"].values, dtype=torch.float32, device=device)
    acc_te = ((p_test > 0.5).float() == y_te).float().mean().item()
    brier_te = torch.mean((p_test - y_te) ** 2).item()
    logloss_te = (-y_te * torch.log(p_test.clamp(eps, 1 - eps))
              - (1 - y_te) * torch.log((1 - p_test).clamp(eps, 1 - eps))).mean().item()


    return {
        "theta_path": torch.stack(theta_path),      # EMA θ path, shape (T, dim)
        "history_train": history,                   # per-outer-iteration metrics (EMA + K-ensemble)
        "stopped_at": stopped_at,
        "final_theta": theta.detach().cpu(),
        "metrics_test": {
            "accuracy": acc_te,
            "brier": brier_te,
            "logloss": logloss_te,
        },
    }
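
The early-stopping rule in step 8 can be looked at in isolation. The following is a minimal sketch, not the notebook's own helper: the name `should_stop` and the default `K_ma=5` are assumptions made for illustration. It compares the mean of the last `K_ma` values with the mean of the `K_ma` values before them, and signals a stop only when the improvement is non-negative and below the tolerance.

```python
import numpy as np

def should_stop(history, K_ma=5, tol=0.01, higher_is_better=False):
    """Return True when the moving-average improvement is in [0, tol).

    `should_stop` and its defaults are hypothetical; they mirror step 8 above.
    """
    if len(history) < 2 * K_ma:
        return False  # not enough points to form two windows
    recent = float(np.mean(history[-K_ma:]))
    previous = float(np.mean(history[-2 * K_ma:-K_ma]))
    if higher_is_better:
        # absolute change, as used for accuracy
        improvement = recent - previous
    else:
        # relative change, as used for log loss and Brier score
        improvement = (previous - recent) / max(previous, 1e-12)
    return 0 <= improvement < tol

# A plateau triggers a stop; a metric that is still dropping or getting worse does not.
plateau = [1.0, 0.9, 0.8, 0.7, 0.6995, 0.699, 0.6985, 0.698]
print(should_stop(plateau, K_ma=2, tol=0.01))
```

Because a negative improvement never triggers a stop, a run whose TRAIN metrics drift upward will continue until `n_outer` is exhausted. The function above requires all three metrics to satisfy this condition at the same time.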

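
Steps 4 and 6 of the function above combine two kinds of smoothing: an exponential moving average (EMA) over θ and a rolling mean over the predicted probabilities of the last `K_ens` outer iterations. The sketch below reproduces both with NumPy instead of torch so that it runs on its own; `ema_update` and `rolling_ensemble` are hypothetical helper names, not functions from the notebook.

```python
from collections import deque
import numpy as np

def ema_update(theta_prev, theta_new, alpha, first=False):
    """EMA of θ; the first outer iteration takes the new value as-is."""
    if first:
        return theta_new
    return (1.0 - alpha) * theta_prev + alpha * theta_new

def rolling_ensemble(bag, p_new):
    """Append p_new to a bounded deque and return the element-wise mean."""
    bag.append(p_new)  # deque(maxlen=K) silently drops the oldest entry
    return np.mean(np.stack(list(bag)), axis=0)

# Toy usage with made-up numbers (illustration only).
bag = deque(maxlen=2)
theta = ema_update(None, np.array([1.0, 2.0]), alpha=0.5, first=True)
theta = ema_update(theta, np.array([3.0, 4.0]), alpha=0.5)
p_ens = rolling_ensemble(bag, np.array([0.2, 0.4]))
p_ens = rolling_ensemble(bag, np.array([0.4, 0.6]))
print(theta, p_ens)
```

The bounded `deque` keeps memory constant no matter how many outer iterations run, which is presumably why the function uses `deque(maxlen=K_ens)` for its buffer.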

In [None]:
all_metrics = []
noise_type = "normal"
for seed in range(10):

    np.random.seed(seed); torch.manual_seed(seed)
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type = noise_type,
        n_per_group=10000
    )
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=False, nonlinear=False, group=False,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    all_metrics.append(res["metrics_test"])
    print(all_metrics)

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean','std','median'])
print(summary)
print(df)

Sample: 100%|██████████| 330/330 [00:12, 25.42it/s, step size=2.60e-01, acc. prob=0.947]


[outer 000] TRAIN (EMA+K-ens) ll=0.6725  br=0.2400  acc=0.5730


Sample: 100%|██████████| 330/330 [00:15, 21.65it/s, step size=2.93e-01, acc. prob=0.956]


[outer 001] TRAIN (EMA+K-ens) ll=0.6938  br=0.2496  acc=0.6970


Sample: 100%|██████████| 330/330 [00:15, 21.41it/s, step size=3.13e-01, acc. prob=0.909]


[outer 002] TRAIN (EMA+K-ens) ll=0.6914  br=0.2481  acc=0.6020


Sample: 100%|██████████| 330/330 [00:13, 24.30it/s, step size=2.99e-01, acc. prob=0.942]


[outer 003] TRAIN (EMA+K-ens) ll=0.6828  br=0.2438  acc=0.6840


Sample: 100%|██████████| 330/330 [00:14, 22.02it/s, step size=2.62e-01, acc. prob=0.961]


[outer 004] TRAIN (EMA+K-ens) ll=0.6760  br=0.2406  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.05it/s, step size=2.31e-01, acc. prob=0.956]


[outer 005] TRAIN (EMA+K-ens) ll=0.6753  br=0.2403  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 23.11it/s, step size=2.83e-01, acc. prob=0.952]


[outer 006] TRAIN (EMA+K-ens) ll=0.6748  br=0.2401  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.22it/s, step size=2.26e-01, acc. prob=0.966]


[outer 007] TRAIN (EMA+K-ens) ll=0.6628  br=0.2344  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 23.00it/s, step size=2.52e-01, acc. prob=0.964]


[outer 008] TRAIN (EMA+K-ens) ll=0.6619  br=0.2339  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.81it/s, step size=3.65e-01, acc. prob=0.944]


[outer 009] TRAIN (EMA+K-ens) ll=0.6546  br=0.2306  acc=0.6970


Sample: 100%|██████████| 330/330 [00:14, 22.07it/s, step size=2.69e-01, acc. prob=0.937]


[outer 010] TRAIN (EMA+K-ens) ll=0.6572  br=0.2318  acc=0.6920


Sample: 100%|██████████| 330/330 [00:15, 21.65it/s, step size=2.78e-01, acc. prob=0.957]


[outer 011] TRAIN (EMA+K-ens) ll=0.6517  br=0.2293  acc=0.6750


Sample: 100%|██████████| 330/330 [00:15, 21.75it/s, step size=3.41e-01, acc. prob=0.913]


[outer 012] TRAIN (EMA+K-ens) ll=0.6486  br=0.2278  acc=0.6700


Sample: 100%|██████████| 330/330 [00:14, 22.46it/s, step size=2.67e-01, acc. prob=0.939]


[outer 013] TRAIN (EMA+K-ens) ll=0.6437  br=0.2254  acc=0.6640


Sample: 100%|██████████| 330/330 [00:15, 20.94it/s, step size=2.82e-01, acc. prob=0.961]


[outer 014] TRAIN (EMA+K-ens) ll=0.6467  br=0.2266  acc=0.6690


Sample: 100%|██████████| 330/330 [00:15, 21.93it/s, step size=2.50e-01, acc. prob=0.973]


[outer 015] TRAIN (EMA+K-ens) ll=0.6525  br=0.2294  acc=0.6590


Sample: 100%|██████████| 330/330 [00:14, 23.27it/s, step size=2.68e-01, acc. prob=0.972]


[outer 016] TRAIN (EMA+K-ens) ll=0.6546  br=0.2302  acc=0.6800


Sample: 100%|██████████| 330/330 [00:15, 21.25it/s, step size=3.04e-01, acc. prob=0.938]


[outer 017] TRAIN (EMA+K-ens) ll=0.6477  br=0.2268  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.70it/s, step size=2.74e-01, acc. prob=0.953]


[outer 018] TRAIN (EMA+K-ens) ll=0.6494  br=0.2276  acc=0.6970


Sample: 100%|██████████| 330/330 [00:13, 24.11it/s, step size=3.07e-01, acc. prob=0.924]


[outer 019] TRAIN (EMA+K-ens) ll=0.6464  br=0.2264  acc=0.7070


Sample: 100%|██████████| 330/330 [00:13, 23.77it/s, step size=2.39e-01, acc. prob=0.955]


[outer 020] TRAIN (EMA+K-ens) ll=0.6523  br=0.2294  acc=0.6980


Sample: 100%|██████████| 330/330 [00:13, 24.01it/s, step size=2.67e-01, acc. prob=0.931]


[outer 021] TRAIN (EMA+K-ens) ll=0.6510  br=0.2289  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 23.64it/s, step size=3.04e-01, acc. prob=0.942]


[outer 022] TRAIN (EMA+K-ens) ll=0.6470  br=0.2270  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.96it/s, step size=2.89e-01, acc. prob=0.955]


[outer 023] TRAIN (EMA+K-ens) ll=0.6508  br=0.2289  acc=0.6890


Sample: 100%|██████████| 330/330 [00:15, 21.90it/s, step size=2.56e-01, acc. prob=0.956]


[outer 024] TRAIN (EMA+K-ens) ll=0.6569  br=0.2317  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 23.11it/s, step size=2.74e-01, acc. prob=0.966]


[outer 025] TRAIN (EMA+K-ens) ll=0.6641  br=0.2352  acc=0.6690


Sample: 100%|██████████| 330/330 [00:15, 21.70it/s, step size=2.77e-01, acc. prob=0.934]


[outer 026] TRAIN (EMA+K-ens) ll=0.6559  br=0.2315  acc=0.6670


Sample: 100%|██████████| 330/330 [00:14, 22.49it/s, step size=2.84e-01, acc. prob=0.952]


[outer 027] TRAIN (EMA+K-ens) ll=0.6585  br=0.2327  acc=0.6670


Sample: 100%|██████████| 330/330 [00:14, 22.68it/s, step size=2.55e-01, acc. prob=0.967]


[outer 028] TRAIN (EMA+K-ens) ll=0.6400  br=0.2237  acc=0.6960


Sample: 100%|██████████| 330/330 [00:14, 22.48it/s, step size=2.81e-01, acc. prob=0.936]


[outer 029] TRAIN (EMA+K-ens) ll=0.6469  br=0.2270  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.44it/s, step size=2.68e-01, acc. prob=0.951]


[outer 030] TRAIN (EMA+K-ens) ll=0.6435  br=0.2254  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.01it/s, step size=2.68e-01, acc. prob=0.972]


[outer 031] TRAIN (EMA+K-ens) ll=0.6362  br=0.2219  acc=0.6990


Sample: 100%|██████████| 330/330 [00:14, 22.88it/s, step size=2.50e-01, acc. prob=0.963]


[outer 032] TRAIN (EMA+K-ens) ll=0.6181  br=0.2132  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.98it/s, step size=2.76e-01, acc. prob=0.958]


[outer 033] TRAIN (EMA+K-ens) ll=0.6179  br=0.2131  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.17it/s, step size=2.79e-01, acc. prob=0.953]


[outer 034] TRAIN (EMA+K-ens) ll=0.6202  br=0.2142  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 23.82it/s, step size=3.13e-01, acc. prob=0.944]


[outer 035] TRAIN (EMA+K-ens) ll=0.6248  br=0.2164  acc=0.6980


Sample: 100%|██████████| 330/330 [00:12, 26.53it/s, step size=3.47e-01, acc. prob=0.950]


[outer 036] TRAIN (EMA+K-ens) ll=0.6336  br=0.2205  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 23.52it/s, step size=2.65e-01, acc. prob=0.955]


[outer 037] TRAIN (EMA+K-ens) ll=0.6228  br=0.2154  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.35it/s, step size=3.29e-01, acc. prob=0.949]


[outer 038] TRAIN (EMA+K-ens) ll=0.6312  br=0.2194  acc=0.6990


Sample: 100%|██████████| 330/330 [00:14, 22.68it/s, step size=2.97e-01, acc. prob=0.955]


[outer 039] TRAIN (EMA+K-ens) ll=0.6368  br=0.2221  acc=0.6820
[{'accuracy': 0.6989799737930298, 'brier': 0.2324819415807724, 'logloss': 0.6596900820732117}]


Sample: 100%|██████████| 330/330 [00:15, 21.58it/s, step size=2.80e-01, acc. prob=0.962]


[outer 000] TRAIN (EMA+K-ens) ll=0.7362  br=0.2702  acc=0.5880


Sample: 100%|██████████| 330/330 [00:15, 21.96it/s, step size=2.89e-01, acc. prob=0.938]


[outer 001] TRAIN (EMA+K-ens) ll=0.7077  br=0.2564  acc=0.6170


Sample: 100%|██████████| 330/330 [00:14, 22.39it/s, step size=2.79e-01, acc. prob=0.951]


[outer 002] TRAIN (EMA+K-ens) ll=0.6997  br=0.2524  acc=0.6340


Sample: 100%|██████████| 330/330 [00:14, 23.12it/s, step size=2.50e-01, acc. prob=0.952]


[outer 003] TRAIN (EMA+K-ens) ll=0.7030  br=0.2537  acc=0.6300


Sample: 100%|██████████| 330/330 [00:15, 21.15it/s, step size=2.74e-01, acc. prob=0.942]


[outer 004] TRAIN (EMA+K-ens) ll=0.7028  br=0.2534  acc=0.6270


Sample: 100%|██████████| 330/330 [00:14, 22.27it/s, step size=3.02e-01, acc. prob=0.928]


[outer 005] TRAIN (EMA+K-ens) ll=0.6970  br=0.2506  acc=0.6470


Sample: 100%|██████████| 330/330 [00:14, 22.26it/s, step size=2.75e-01, acc. prob=0.957]


[outer 006] TRAIN (EMA+K-ens) ll=0.6953  br=0.2499  acc=0.6400


Sample: 100%|██████████| 330/330 [00:15, 21.77it/s, step size=2.25e-01, acc. prob=0.968]


[outer 007] TRAIN (EMA+K-ens) ll=0.6871  br=0.2459  acc=0.6490


Sample: 100%|██████████| 330/330 [00:13, 24.77it/s, step size=2.77e-01, acc. prob=0.954]


[outer 008] TRAIN (EMA+K-ens) ll=0.6922  br=0.2480  acc=0.6450


Sample: 100%|██████████| 330/330 [00:14, 22.20it/s, step size=3.08e-01, acc. prob=0.956]


[outer 009] TRAIN (EMA+K-ens) ll=0.6907  br=0.2473  acc=0.6640


Sample: 100%|██████████| 330/330 [00:13, 23.58it/s, step size=2.58e-01, acc. prob=0.961]


[outer 010] TRAIN (EMA+K-ens) ll=0.6977  br=0.2505  acc=0.6600


Sample: 100%|██████████| 330/330 [00:15, 21.45it/s, step size=3.20e-01, acc. prob=0.945]


[outer 011] TRAIN (EMA+K-ens) ll=0.6996  br=0.2514  acc=0.6610


Sample: 100%|██████████| 330/330 [00:15, 21.52it/s, step size=2.74e-01, acc. prob=0.959]


[outer 012] TRAIN (EMA+K-ens) ll=0.6961  br=0.2500  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 20.30it/s, step size=2.32e-01, acc. prob=0.959]


[outer 013] TRAIN (EMA+K-ens) ll=0.7038  br=0.2536  acc=0.6800


Sample: 100%|██████████| 330/330 [00:14, 22.44it/s, step size=3.72e-01, acc. prob=0.936]


[outer 014] TRAIN (EMA+K-ens) ll=0.6994  br=0.2514  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 21.96it/s, step size=2.77e-01, acc. prob=0.964]


[outer 015] TRAIN (EMA+K-ens) ll=0.7049  br=0.2537  acc=0.6760


Sample: 100%|██████████| 330/330 [00:13, 24.27it/s, step size=2.46e-01, acc. prob=0.951]


[outer 016] TRAIN (EMA+K-ens) ll=0.6950  br=0.2492  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 22.34it/s, step size=2.95e-01, acc. prob=0.955]


[outer 017] TRAIN (EMA+K-ens) ll=0.6987  br=0.2506  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 20.42it/s, step size=2.40e-01, acc. prob=0.952]


[outer 018] TRAIN (EMA+K-ens) ll=0.6983  br=0.2505  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 23.20it/s, step size=2.81e-01, acc. prob=0.960]


[outer 019] TRAIN (EMA+K-ens) ll=0.6929  br=0.2479  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 23.33it/s, step size=2.90e-01, acc. prob=0.946]


[outer 020] TRAIN (EMA+K-ens) ll=0.6962  br=0.2492  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 20.58it/s, step size=2.73e-01, acc. prob=0.971]


[outer 021] TRAIN (EMA+K-ens) ll=0.6925  br=0.2478  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.76it/s, step size=3.09e-01, acc. prob=0.923]


[outer 022] TRAIN (EMA+K-ens) ll=0.6896  br=0.2464  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 23.94it/s, step size=2.51e-01, acc. prob=0.949]


[outer 023] TRAIN (EMA+K-ens) ll=0.6881  br=0.2460  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.21it/s, step size=3.11e-01, acc. prob=0.918]


[outer 024] TRAIN (EMA+K-ens) ll=0.6796  br=0.2423  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 23.71it/s, step size=2.44e-01, acc. prob=0.928]


[outer 025] TRAIN (EMA+K-ens) ll=0.6653  br=0.2355  acc=0.6920


Sample: 100%|██████████| 330/330 [00:14, 22.43it/s, step size=2.89e-01, acc. prob=0.963]


[outer 026] TRAIN (EMA+K-ens) ll=0.6514  br=0.2290  acc=0.6920


Sample: 100%|██████████| 330/330 [00:14, 23.10it/s, step size=2.96e-01, acc. prob=0.964]


[outer 027] TRAIN (EMA+K-ens) ll=0.6436  br=0.2253  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 24.99it/s, step size=2.47e-01, acc. prob=0.965]


[outer 028] TRAIN (EMA+K-ens) ll=0.6450  br=0.2259  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 22.13it/s, step size=2.84e-01, acc. prob=0.932]


[outer 029] TRAIN (EMA+K-ens) ll=0.6462  br=0.2266  acc=0.6800


Sample: 100%|██████████| 330/330 [00:15, 21.63it/s, step size=2.60e-01, acc. prob=0.920]


[outer 030] TRAIN (EMA+K-ens) ll=0.6510  br=0.2289  acc=0.6700


Sample: 100%|██████████| 330/330 [00:14, 23.40it/s, step size=3.04e-01, acc. prob=0.958]


[outer 031] TRAIN (EMA+K-ens) ll=0.6434  br=0.2252  acc=0.6970


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=2.26e-01, acc. prob=0.952]


[outer 032] TRAIN (EMA+K-ens) ll=0.6415  br=0.2243  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.85it/s, step size=2.49e-01, acc. prob=0.957]


[outer 033] TRAIN (EMA+K-ens) ll=0.6471  br=0.2269  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 23.08it/s, step size=2.84e-01, acc. prob=0.938]


[outer 034] TRAIN (EMA+K-ens) ll=0.6578  br=0.2319  acc=0.6890


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=2.46e-01, acc. prob=0.967]


[outer 035] TRAIN (EMA+K-ens) ll=0.6566  br=0.2313  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 22.28it/s, step size=2.14e-01, acc. prob=0.975]


[outer 036] TRAIN (EMA+K-ens) ll=0.6590  br=0.2324  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 24.87it/s, step size=3.17e-01, acc. prob=0.930]


[outer 037] TRAIN (EMA+K-ens) ll=0.6579  br=0.2319  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 23.94it/s, step size=2.52e-01, acc. prob=0.947]


[outer 038] TRAIN (EMA+K-ens) ll=0.6610  br=0.2334  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 19.20it/s, step size=2.57e-01, acc. prob=0.947]


[outer 039] TRAIN (EMA+K-ens) ll=0.6672  br=0.2363  acc=0.6910
[{'accuracy': 0.6989799737930298, 'brier': 0.2324819415807724, 'logloss': 0.6596900820732117}, {'accuracy': 0.6593599915504456, 'brier': 0.24097435176372528, 'logloss': 0.6775277853012085}]


Sample: 100%|██████████| 330/330 [00:16, 19.43it/s, step size=2.25e-01, acc. prob=0.977]


[outer 000] TRAIN (EMA+K-ens) ll=0.7107  br=0.2574  acc=0.5380


Sample: 100%|██████████| 330/330 [00:14, 22.61it/s, step size=3.87e-01, acc. prob=0.936]


[outer 001] TRAIN (EMA+K-ens) ll=0.6653  br=0.2361  acc=0.6350


Sample: 100%|██████████| 330/330 [00:13, 23.81it/s, step size=3.71e-01, acc. prob=0.905]


[outer 002] TRAIN (EMA+K-ens) ll=0.6906  br=0.2479  acc=0.6510


Sample: 100%|██████████| 330/330 [00:14, 23.44it/s, step size=2.85e-01, acc. prob=0.957]


[outer 003] TRAIN (EMA+K-ens) ll=0.6799  br=0.2427  acc=0.6440


Sample: 100%|██████████| 330/330 [00:13, 23.70it/s, step size=3.43e-01, acc. prob=0.916]


[outer 004] TRAIN (EMA+K-ens) ll=0.6925  br=0.2484  acc=0.6330


Sample: 100%|██████████| 330/330 [00:15, 21.72it/s, step size=2.32e-01, acc. prob=0.976]


[outer 005] TRAIN (EMA+K-ens) ll=0.6855  br=0.2450  acc=0.6730


Sample: 100%|██████████| 330/330 [00:13, 25.36it/s, step size=3.27e-01, acc. prob=0.943]


[outer 006] TRAIN (EMA+K-ens) ll=0.6647  br=0.2352  acc=0.6960


Sample: 100%|██████████| 330/330 [00:15, 21.08it/s, step size=2.67e-01, acc. prob=0.954]


[outer 007] TRAIN (EMA+K-ens) ll=0.6649  br=0.2352  acc=0.7040


Sample: 100%|██████████| 330/330 [00:14, 22.43it/s, step size=3.34e-01, acc. prob=0.872]


[outer 008] TRAIN (EMA+K-ens) ll=0.6617  br=0.2336  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 25.21it/s, step size=4.00e-01, acc. prob=0.917]


[outer 009] TRAIN (EMA+K-ens) ll=0.6785  br=0.2411  acc=0.7040


Sample: 100%|██████████| 330/330 [00:13, 23.90it/s, step size=2.74e-01, acc. prob=0.956]


[outer 010] TRAIN (EMA+K-ens) ll=0.6586  br=0.2318  acc=0.7120


Sample: 100%|██████████| 330/330 [00:14, 23.29it/s, step size=2.27e-01, acc. prob=0.952]


[outer 011] TRAIN (EMA+K-ens) ll=0.6621  br=0.2334  acc=0.7130


Sample: 100%|██████████| 330/330 [00:14, 23.29it/s, step size=2.96e-01, acc. prob=0.931]


[outer 012] TRAIN (EMA+K-ens) ll=0.6595  br=0.2323  acc=0.7130


Sample: 100%|██████████| 330/330 [00:13, 24.63it/s, step size=3.15e-01, acc. prob=0.977]


[outer 013] TRAIN (EMA+K-ens) ll=0.6605  br=0.2329  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.24it/s, step size=3.47e-01, acc. prob=0.915]


[outer 014] TRAIN (EMA+K-ens) ll=0.6722  br=0.2381  acc=0.7110


Sample: 100%|██████████| 330/330 [00:13, 24.21it/s, step size=2.52e-01, acc. prob=0.965]


[outer 015] TRAIN (EMA+K-ens) ll=0.6674  br=0.2358  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.72it/s, step size=2.60e-01, acc. prob=0.971]


[outer 016] TRAIN (EMA+K-ens) ll=0.6675  br=0.2359  acc=0.7030


Sample: 100%|██████████| 330/330 [00:15, 21.97it/s, step size=2.29e-01, acc. prob=0.970]


[outer 017] TRAIN (EMA+K-ens) ll=0.6451  br=0.2257  acc=0.7120


Sample: 100%|██████████| 330/330 [00:13, 23.90it/s, step size=3.63e-01, acc. prob=0.938]


[outer 018] TRAIN (EMA+K-ens) ll=0.6600  br=0.2327  acc=0.7120


Sample: 100%|██████████| 330/330 [00:12, 26.37it/s, step size=2.51e-01, acc. prob=0.952]


[outer 019] TRAIN (EMA+K-ens) ll=0.6641  br=0.2347  acc=0.7120
[Early stop @ outer 19] Δll=0.284%, Δbr=0.408%, Δacc=0.004
[{'accuracy': 0.6989799737930298, 'brier': 0.2324819415807724, 'logloss': 0.6596900820732117}, {'accuracy': 0.6593599915504456, 'brier': 0.24097435176372528, 'logloss': 0.6775277853012085}, {'accuracy': 0.6898399591445923, 'brier': 0.2538439631462097, 'logloss': 0.706803560256958}]


Sample: 100%|██████████| 330/330 [00:13, 24.82it/s, step size=3.05e-01, acc. prob=0.964]


[outer 000] TRAIN (EMA+K-ens) ll=0.5991  br=0.2040  acc=0.7190


Sample: 100%|██████████| 330/330 [00:13, 23.67it/s, step size=2.79e-01, acc. prob=0.953]


[outer 001] TRAIN (EMA+K-ens) ll=0.6327  br=0.2202  acc=0.7180


Sample: 100%|██████████| 330/330 [00:15, 20.86it/s, step size=2.12e-01, acc. prob=0.968]


[outer 002] TRAIN (EMA+K-ens) ll=0.6427  br=0.2250  acc=0.7060


Sample: 100%|██████████| 330/330 [00:15, 21.30it/s, step size=2.29e-01, acc. prob=0.961]


[outer 003] TRAIN (EMA+K-ens) ll=0.6536  br=0.2301  acc=0.6970


Sample: 100%|██████████| 330/330 [00:14, 22.61it/s, step size=2.34e-01, acc. prob=0.952]


[outer 004] TRAIN (EMA+K-ens) ll=0.6547  br=0.2305  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 20.56it/s, step size=2.68e-01, acc. prob=0.945]


[outer 005] TRAIN (EMA+K-ens) ll=0.6590  br=0.2327  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 23.73it/s, step size=3.00e-01, acc. prob=0.943]


[outer 006] TRAIN (EMA+K-ens) ll=0.6485  br=0.2277  acc=0.7040


Sample: 100%|██████████| 330/330 [00:15, 21.14it/s, step size=2.55e-01, acc. prob=0.958]


[outer 007] TRAIN (EMA+K-ens) ll=0.6520  br=0.2293  acc=0.7040


Sample: 100%|██████████| 330/330 [00:15, 21.34it/s, step size=2.60e-01, acc. prob=0.953]


[outer 008] TRAIN (EMA+K-ens) ll=0.6615  br=0.2337  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 20.82it/s, step size=2.71e-01, acc. prob=0.963]


[outer 009] TRAIN (EMA+K-ens) ll=0.6645  br=0.2351  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.03it/s, step size=3.03e-01, acc. prob=0.944]


[outer 010] TRAIN (EMA+K-ens) ll=0.6612  br=0.2337  acc=0.6970


Sample: 100%|██████████| 330/330 [00:13, 24.29it/s, step size=2.59e-01, acc. prob=0.962]


[outer 011] TRAIN (EMA+K-ens) ll=0.6587  br=0.2324  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.17it/s, step size=2.48e-01, acc. prob=0.971]


[outer 012] TRAIN (EMA+K-ens) ll=0.6508  br=0.2286  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 20.83it/s, step size=3.16e-01, acc. prob=0.940]


[outer 013] TRAIN (EMA+K-ens) ll=0.6490  br=0.2275  acc=0.7070


Sample: 100%|██████████| 330/330 [00:15, 21.82it/s, step size=2.79e-01, acc. prob=0.965]


[outer 014] TRAIN (EMA+K-ens) ll=0.6488  br=0.2273  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 23.78it/s, step size=2.63e-01, acc. prob=0.936]


[outer 015] TRAIN (EMA+K-ens) ll=0.6456  br=0.2259  acc=0.7110


Sample: 100%|██████████| 330/330 [00:13, 25.34it/s, step size=3.10e-01, acc. prob=0.949]


[outer 016] TRAIN (EMA+K-ens) ll=0.6454  br=0.2258  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.26it/s, step size=2.61e-01, acc. prob=0.959]


[outer 017] TRAIN (EMA+K-ens) ll=0.6413  br=0.2239  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.56it/s, step size=3.77e-01, acc. prob=0.965]


[outer 018] TRAIN (EMA+K-ens) ll=0.6500  br=0.2279  acc=0.7080


Sample: 100%|██████████| 330/330 [00:15, 21.19it/s, step size=3.09e-01, acc. prob=0.938]


[outer 019] TRAIN (EMA+K-ens) ll=0.6549  br=0.2303  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 22.18it/s, step size=3.04e-01, acc. prob=0.932]


[outer 020] TRAIN (EMA+K-ens) ll=0.6606  br=0.2329  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.07it/s, step size=2.31e-01, acc. prob=0.955]


[outer 021] TRAIN (EMA+K-ens) ll=0.6589  br=0.2323  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.95it/s, step size=2.73e-01, acc. prob=0.952]


[outer 022] TRAIN (EMA+K-ens) ll=0.6802  br=0.2423  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 22.49it/s, step size=2.57e-01, acc. prob=0.954]


[outer 023] TRAIN (EMA+K-ens) ll=0.6833  br=0.2437  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 21.72it/s, step size=2.30e-01, acc. prob=0.951]


[outer 024] TRAIN (EMA+K-ens) ll=0.6779  br=0.2411  acc=0.7030


Sample: 100%|██████████| 330/330 [00:14, 22.42it/s, step size=3.41e-01, acc. prob=0.879]


[outer 025] TRAIN (EMA+K-ens) ll=0.6776  br=0.2412  acc=0.6980


Sample: 100%|██████████| 330/330 [00:13, 23.67it/s, step size=2.14e-01, acc. prob=0.959]


[outer 026] TRAIN (EMA+K-ens) ll=0.6641  br=0.2348  acc=0.7030


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=2.19e-01, acc. prob=0.961]


[outer 027] TRAIN (EMA+K-ens) ll=0.6615  br=0.2337  acc=0.7030


Sample: 100%|██████████| 330/330 [00:14, 23.51it/s, step size=2.55e-01, acc. prob=0.965]


[outer 028] TRAIN (EMA+K-ens) ll=0.6587  br=0.2324  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 23.58it/s, step size=3.23e-01, acc. prob=0.951]


[outer 029] TRAIN (EMA+K-ens) ll=0.6586  br=0.2324  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.59it/s, step size=2.45e-01, acc. prob=0.937]


[outer 030] TRAIN (EMA+K-ens) ll=0.6555  br=0.2309  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 23.79it/s, step size=3.32e-01, acc. prob=0.949]


[outer 031] TRAIN (EMA+K-ens) ll=0.6529  br=0.2297  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.34it/s, step size=2.76e-01, acc. prob=0.959]


[outer 032] TRAIN (EMA+K-ens) ll=0.6547  br=0.2305  acc=0.7030


Sample: 100%|██████████| 330/330 [00:14, 22.22it/s, step size=2.56e-01, acc. prob=0.968]


[outer 033] TRAIN (EMA+K-ens) ll=0.6664  br=0.2360  acc=0.7030


Sample: 100%|██████████| 330/330 [00:15, 20.85it/s, step size=3.29e-01, acc. prob=0.956]


[outer 034] TRAIN (EMA+K-ens) ll=0.6745  br=0.2397  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.02it/s, step size=2.57e-01, acc. prob=0.944]


[outer 035] TRAIN (EMA+K-ens) ll=0.6704  br=0.2376  acc=0.7030


Sample: 100%|██████████| 330/330 [00:14, 22.13it/s, step size=3.04e-01, acc. prob=0.917]


[outer 036] TRAIN (EMA+K-ens) ll=0.6743  br=0.2392  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.42it/s, step size=3.02e-01, acc. prob=0.951]


[outer 037] TRAIN (EMA+K-ens) ll=0.6751  br=0.2398  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 21.73it/s, step size=2.54e-01, acc. prob=0.971]


[outer 038] TRAIN (EMA+K-ens) ll=0.6735  br=0.2390  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 22.80it/s, step size=2.53e-01, acc. prob=0.972]


[outer 039] TRAIN (EMA+K-ens) ll=0.6849  br=0.2444  acc=0.7040
[{'accuracy': 0.6989799737930298, 'brier': 0.2324819415807724, 'logloss': 0.6596900820732117}, {'accuracy': 0.6593599915504456, 'brier': 0.24097435176372528, 'logloss': 0.6775277853012085}, {'accuracy': 0.6898399591445923, 'brier': 0.2538439631462097, 'logloss': 0.706803560256958}, {'accuracy': 0.6958799958229065, 'brier': 0.2439161241054535, 'logloss': 0.6849684715270996}]


Sample: 100%|██████████| 330/330 [00:15, 20.87it/s, step size=3.35e-01, acc. prob=0.944]


[outer 000] TRAIN (EMA+K-ens) ll=0.7022  br=0.2540  acc=0.5730


Sample: 100%|██████████| 330/330 [00:14, 23.37it/s, step size=2.90e-01, acc. prob=0.958]


[outer 001] TRAIN (EMA+K-ens) ll=0.6805  br=0.2432  acc=0.6800


Sample: 100%|██████████| 330/330 [00:15, 21.67it/s, step size=3.01e-01, acc. prob=0.931]


[outer 002] TRAIN (EMA+K-ens) ll=0.6592  br=0.2329  acc=0.6720


Sample: 100%|██████████| 330/330 [00:15, 21.26it/s, step size=3.14e-01, acc. prob=0.928]


[outer 003] TRAIN (EMA+K-ens) ll=0.6514  br=0.2291  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 20.75it/s, step size=2.79e-01, acc. prob=0.953]


[outer 004] TRAIN (EMA+K-ens) ll=0.6487  br=0.2278  acc=0.6920


Sample: 100%|██████████| 330/330 [00:15, 21.99it/s, step size=2.65e-01, acc. prob=0.952]


[outer 005] TRAIN (EMA+K-ens) ll=0.6576  br=0.2321  acc=0.6930


Sample: 100%|██████████| 330/330 [00:14, 22.94it/s, step size=3.17e-01, acc. prob=0.902]


[outer 006] TRAIN (EMA+K-ens) ll=0.6484  br=0.2277  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.96it/s, step size=2.61e-01, acc. prob=0.972]


[outer 007] TRAIN (EMA+K-ens) ll=0.6506  br=0.2287  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.72it/s, step size=2.68e-01, acc. prob=0.957]


[outer 008] TRAIN (EMA+K-ens) ll=0.6502  br=0.2285  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 20.75it/s, step size=2.99e-01, acc. prob=0.949]


[outer 009] TRAIN (EMA+K-ens) ll=0.6484  br=0.2277  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.33it/s, step size=2.26e-01, acc. prob=0.972]


[outer 010] TRAIN (EMA+K-ens) ll=0.6451  br=0.2262  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.25it/s, step size=3.43e-01, acc. prob=0.928]


[outer 011] TRAIN (EMA+K-ens) ll=0.6454  br=0.2263  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.42it/s, step size=2.91e-01, acc. prob=0.946]


[outer 012] TRAIN (EMA+K-ens) ll=0.6545  br=0.2305  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.08it/s, step size=2.81e-01, acc. prob=0.953]


[outer 013] TRAIN (EMA+K-ens) ll=0.6519  br=0.2293  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.52it/s, step size=2.50e-01, acc. prob=0.945]


[outer 014] TRAIN (EMA+K-ens) ll=0.6628  br=0.2345  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.10it/s, step size=2.38e-01, acc. prob=0.970]


[outer 015] TRAIN (EMA+K-ens) ll=0.6656  br=0.2357  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.23it/s, step size=2.34e-01, acc. prob=0.963]


[outer 016] TRAIN (EMA+K-ens) ll=0.6584  br=0.2323  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.26it/s, step size=3.34e-01, acc. prob=0.939]


[outer 017] TRAIN (EMA+K-ens) ll=0.6538  br=0.2301  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.27it/s, step size=2.45e-01, acc. prob=0.951]


[outer 018] TRAIN (EMA+K-ens) ll=0.6626  br=0.2342  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.84it/s, step size=3.34e-01, acc. prob=0.922]


[outer 019] TRAIN (EMA+K-ens) ll=0.6651  br=0.2354  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=2.36e-01, acc. prob=0.945]


[outer 020] TRAIN (EMA+K-ens) ll=0.6606  br=0.2334  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.94it/s, step size=2.84e-01, acc. prob=0.931]


[outer 021] TRAIN (EMA+K-ens) ll=0.6563  br=0.2314  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.67it/s, step size=2.76e-01, acc. prob=0.938]


[outer 022] TRAIN (EMA+K-ens) ll=0.6560  br=0.2313  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 24.64it/s, step size=2.48e-01, acc. prob=0.974]


[outer 023] TRAIN (EMA+K-ens) ll=0.6567  br=0.2317  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 23.61it/s, step size=2.49e-01, acc. prob=0.954]


[outer 024] TRAIN (EMA+K-ens) ll=0.6649  br=0.2356  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.30it/s, step size=2.68e-01, acc. prob=0.952]


[outer 025] TRAIN (EMA+K-ens) ll=0.6640  br=0.2351  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.12it/s, step size=2.52e-01, acc. prob=0.949]


[outer 026] TRAIN (EMA+K-ens) ll=0.6680  br=0.2371  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.63it/s, step size=1.90e-01, acc. prob=0.967]


[outer 027] TRAIN (EMA+K-ens) ll=0.6741  br=0.2399  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 21.49it/s, step size=2.54e-01, acc. prob=0.962]


[outer 028] TRAIN (EMA+K-ens) ll=0.6718  br=0.2388  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.55it/s, step size=2.58e-01, acc. prob=0.955]


[outer 029] TRAIN (EMA+K-ens) ll=0.6778  br=0.2417  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 21.97it/s, step size=3.01e-01, acc. prob=0.948]


[outer 030] TRAIN (EMA+K-ens) ll=0.6722  br=0.2390  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.19it/s, step size=3.40e-01, acc. prob=0.920]


[outer 031] TRAIN (EMA+K-ens) ll=0.6652  br=0.2356  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.55it/s, step size=2.32e-01, acc. prob=0.952]


[outer 032] TRAIN (EMA+K-ens) ll=0.6642  br=0.2353  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.56it/s, step size=2.57e-01, acc. prob=0.939]


[outer 033] TRAIN (EMA+K-ens) ll=0.6761  br=0.2412  acc=0.6600


Sample: 100%|██████████| 330/330 [00:15, 21.96it/s, step size=3.17e-01, acc. prob=0.928]


[outer 034] TRAIN (EMA+K-ens) ll=0.6803  br=0.2432  acc=0.6130


Sample: 100%|██████████| 330/330 [00:15, 21.36it/s, step size=2.62e-01, acc. prob=0.966]


[outer 035] TRAIN (EMA+K-ens) ll=0.6759  br=0.2410  acc=0.6240


Sample: 100%|██████████| 330/330 [00:14, 23.28it/s, step size=3.28e-01, acc. prob=0.958]


[outer 036] TRAIN (EMA+K-ens) ll=0.6812  br=0.2436  acc=0.6180


Sample: 100%|██████████| 330/330 [00:15, 20.75it/s, step size=2.62e-01, acc. prob=0.948]


[outer 037] TRAIN (EMA+K-ens) ll=0.6806  br=0.2432  acc=0.6280


Sample: 100%|██████████| 330/330 [00:14, 22.36it/s, step size=3.03e-01, acc. prob=0.931]


[outer 038] TRAIN (EMA+K-ens) ll=0.6804  br=0.2431  acc=0.6420


Sample: 100%|██████████| 330/330 [00:15, 20.88it/s, step size=3.13e-01, acc. prob=0.958]


[outer 039] TRAIN (EMA+K-ens) ll=0.6848  br=0.2452  acc=0.6500


Sample: 100%|██████████| 330/330 [00:14, 22.65it/s, step size=2.88e-01, acc. prob=0.951]


[outer 000] TRAIN (EMA+K-ens) ll=0.7430  br=0.2720  acc=0.5320


Sample: 100%|██████████| 330/330 [00:14, 23.16it/s, step size=2.84e-01, acc. prob=0.960]


[outer 001] TRAIN (EMA+K-ens) ll=0.7336  br=0.2669  acc=0.5780


Sample: 100%|██████████| 330/330 [00:15, 21.62it/s, step size=3.23e-01, acc. prob=0.957]


[outer 002] TRAIN (EMA+K-ens) ll=0.7002  br=0.2522  acc=0.6410


Sample: 100%|██████████| 330/330 [00:14, 22.41it/s, step size=2.99e-01, acc. prob=0.956]


[outer 003] TRAIN (EMA+K-ens) ll=0.7097  br=0.2568  acc=0.6400


Sample: 100%|██████████| 330/330 [00:14, 23.56it/s, step size=3.32e-01, acc. prob=0.938]


[outer 004] TRAIN (EMA+K-ens) ll=0.6803  br=0.2429  acc=0.6660


Sample: 100%|██████████| 330/330 [00:15, 21.59it/s, step size=2.42e-01, acc. prob=0.965]


[outer 005] TRAIN (EMA+K-ens) ll=0.6753  br=0.2405  acc=0.6700


Sample: 100%|██████████| 330/330 [00:14, 22.61it/s, step size=3.53e-01, acc. prob=0.915]


[outer 006] TRAIN (EMA+K-ens) ll=0.6721  br=0.2390  acc=0.6840


Sample: 100%|██████████| 330/330 [00:14, 23.09it/s, step size=2.71e-01, acc. prob=0.942]


[outer 007] TRAIN (EMA+K-ens) ll=0.6771  br=0.2413  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 21.05it/s, step size=1.97e-01, acc. prob=0.968]


[outer 008] TRAIN (EMA+K-ens) ll=0.6784  br=0.2418  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 24.19it/s, step size=2.62e-01, acc. prob=0.955]


[outer 009] TRAIN (EMA+K-ens) ll=0.6793  br=0.2422  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.99it/s, step size=3.02e-01, acc. prob=0.944]


[outer 010] TRAIN (EMA+K-ens) ll=0.6900  br=0.2472  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.91it/s, step size=2.64e-01, acc. prob=0.937]


[outer 011] TRAIN (EMA+K-ens) ll=0.6789  br=0.2421  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=2.70e-01, acc. prob=0.924]


[outer 012] TRAIN (EMA+K-ens) ll=0.6945  br=0.2495  acc=0.6810


Sample: 100%|██████████| 330/330 [00:15, 20.65it/s, step size=2.62e-01, acc. prob=0.953]


[outer 013] TRAIN (EMA+K-ens) ll=0.6955  br=0.2500  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 23.38it/s, step size=2.73e-01, acc. prob=0.969]


[outer 014] TRAIN (EMA+K-ens) ll=0.6990  br=0.2517  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.20it/s, step size=2.90e-01, acc. prob=0.948]


[outer 015] TRAIN (EMA+K-ens) ll=0.6933  br=0.2491  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 20.76it/s, step size=2.22e-01, acc. prob=0.965]


[outer 016] TRAIN (EMA+K-ens) ll=0.6849  br=0.2453  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.18it/s, step size=3.16e-01, acc. prob=0.948]


[outer 017] TRAIN (EMA+K-ens) ll=0.6692  br=0.2377  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 24.10it/s, step size=2.47e-01, acc. prob=0.956]


[outer 018] TRAIN (EMA+K-ens) ll=0.6500  br=0.2285  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.75it/s, step size=2.69e-01, acc. prob=0.965]


[outer 019] TRAIN (EMA+K-ens) ll=0.6492  br=0.2281  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 24.19it/s, step size=2.65e-01, acc. prob=0.952]


[outer 020] TRAIN (EMA+K-ens) ll=0.6481  br=0.2275  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 21.34it/s, step size=2.99e-01, acc. prob=0.963]


[outer 021] TRAIN (EMA+K-ens) ll=0.6524  br=0.2294  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.52it/s, step size=3.05e-01, acc. prob=0.950]


[outer 022] TRAIN (EMA+K-ens) ll=0.6534  br=0.2298  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.53it/s, step size=3.04e-01, acc. prob=0.935]


[outer 023] TRAIN (EMA+K-ens) ll=0.6607  br=0.2334  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 21.13it/s, step size=2.37e-01, acc. prob=0.957]


[outer 024] TRAIN (EMA+K-ens) ll=0.6594  br=0.2328  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 23.64it/s, step size=2.66e-01, acc. prob=0.958]


[outer 025] TRAIN (EMA+K-ens) ll=0.6675  br=0.2366  acc=0.6800


Sample: 100%|██████████| 330/330 [00:14, 22.16it/s, step size=2.96e-01, acc. prob=0.953]


[outer 026] TRAIN (EMA+K-ens) ll=0.6790  br=0.2420  acc=0.6800


Sample: 100%|██████████| 330/330 [00:15, 21.22it/s, step size=2.21e-01, acc. prob=0.974]


[outer 027] TRAIN (EMA+K-ens) ll=0.6794  br=0.2424  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.45it/s, step size=2.50e-01, acc. prob=0.959]


[outer 028] TRAIN (EMA+K-ens) ll=0.6804  br=0.2430  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 23.28it/s, step size=2.89e-01, acc. prob=0.963]


[outer 029] TRAIN (EMA+K-ens) ll=0.6768  br=0.2413  acc=0.6840


Sample: 100%|██████████| 330/330 [00:14, 22.17it/s, step size=2.79e-01, acc. prob=0.929]


[outer 030] TRAIN (EMA+K-ens) ll=0.6694  br=0.2377  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.57it/s, step size=3.17e-01, acc. prob=0.945]


[outer 031] TRAIN (EMA+K-ens) ll=0.6559  br=0.2313  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.94it/s, step size=2.86e-01, acc. prob=0.958]


[outer 032] TRAIN (EMA+K-ens) ll=0.6579  br=0.2321  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.31it/s, step size=2.49e-01, acc. prob=0.967]


[outer 033] TRAIN (EMA+K-ens) ll=0.6538  br=0.2301  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.04it/s, step size=3.82e-01, acc. prob=0.938]


[outer 034] TRAIN (EMA+K-ens) ll=0.6590  br=0.2325  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.27it/s, step size=3.27e-01, acc. prob=0.915]


[outer 035] TRAIN (EMA+K-ens) ll=0.6556  br=0.2308  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.21it/s, step size=2.84e-01, acc. prob=0.962]


[outer 036] TRAIN (EMA+K-ens) ll=0.6551  br=0.2306  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 19.92it/s, step size=2.64e-01, acc. prob=0.966]


[outer 037] TRAIN (EMA+K-ens) ll=0.6481  br=0.2273  acc=0.6850


Sample: 100%|██████████| 330/330 [00:12, 25.51it/s, step size=3.14e-01, acc. prob=0.952]


[outer 038] TRAIN (EMA+K-ens) ll=0.6503  br=0.2284  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.59it/s, step size=3.02e-01, acc. prob=0.948]


[outer 039] TRAIN (EMA+K-ens) ll=0.6570  br=0.2315  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 19.42it/s, step size=3.27e-01, acc. prob=0.925]


[outer 000] TRAIN (EMA+K-ens) ll=0.7130  br=0.2585  acc=0.5370


Sample: 100%|██████████| 330/330 [00:15, 21.65it/s, step size=2.41e-01, acc. prob=0.950]


[outer 001] TRAIN (EMA+K-ens) ll=0.6889  br=0.2472  acc=0.5750


Sample: 100%|██████████| 330/330 [00:14, 22.44it/s, step size=2.27e-01, acc. prob=0.968]


[outer 002] TRAIN (EMA+K-ens) ll=0.6965  br=0.2509  acc=0.5820


Sample: 100%|██████████| 330/330 [00:14, 22.11it/s, step size=2.16e-01, acc. prob=0.947]


[outer 003] TRAIN (EMA+K-ens) ll=0.6640  br=0.2355  acc=0.6280


Sample: 100%|██████████| 330/330 [00:14, 22.07it/s, step size=2.39e-01, acc. prob=0.930]


[outer 004] TRAIN (EMA+K-ens) ll=0.6811  br=0.2436  acc=0.6020


Sample: 100%|██████████| 330/330 [00:17, 19.36it/s, step size=2.37e-01, acc. prob=0.976]


[outer 005] TRAIN (EMA+K-ens) ll=0.6770  br=0.2415  acc=0.6390


Sample: 100%|██████████| 330/330 [00:14, 23.31it/s, step size=2.66e-01, acc. prob=0.945]


[outer 006] TRAIN (EMA+K-ens) ll=0.6750  br=0.2405  acc=0.6730


Sample: 100%|██████████| 330/330 [00:15, 21.44it/s, step size=3.06e-01, acc. prob=0.930]


[outer 007] TRAIN (EMA+K-ens) ll=0.6766  br=0.2412  acc=0.6660


Sample: 100%|██████████| 330/330 [00:13, 23.82it/s, step size=2.92e-01, acc. prob=0.928]


[outer 008] TRAIN (EMA+K-ens) ll=0.6791  br=0.2422  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 21.56it/s, step size=2.87e-01, acc. prob=0.957]


[outer 009] TRAIN (EMA+K-ens) ll=0.6805  br=0.2429  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.90it/s, step size=2.74e-01, acc. prob=0.947]


[outer 010] TRAIN (EMA+K-ens) ll=0.6776  br=0.2415  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 23.81it/s, step size=3.00e-01, acc. prob=0.940]


[outer 011] TRAIN (EMA+K-ens) ll=0.6823  br=0.2437  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.23it/s, step size=3.53e-01, acc. prob=0.930]


[outer 012] TRAIN (EMA+K-ens) ll=0.6668  br=0.2362  acc=0.6860


Sample: 100%|██████████| 330/330 [00:12, 25.97it/s, step size=3.69e-01, acc. prob=0.940]


[outer 013] TRAIN (EMA+K-ens) ll=0.6695  br=0.2375  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.64it/s, step size=2.32e-01, acc. prob=0.969]


[outer 014] TRAIN (EMA+K-ens) ll=0.6607  br=0.2333  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.38it/s, step size=2.06e-01, acc. prob=0.964]


[outer 015] TRAIN (EMA+K-ens) ll=0.6627  br=0.2343  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.08it/s, step size=2.75e-01, acc. prob=0.936]


[outer 016] TRAIN (EMA+K-ens) ll=0.6572  br=0.2317  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 24.06it/s, step size=3.12e-01, acc. prob=0.944]


[outer 017] TRAIN (EMA+K-ens) ll=0.6491  br=0.2280  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 23.66it/s, step size=2.74e-01, acc. prob=0.931]


[outer 018] TRAIN (EMA+K-ens) ll=0.6463  br=0.2266  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.63it/s, step size=2.77e-01, acc. prob=0.953]


[outer 019] TRAIN (EMA+K-ens) ll=0.6502  br=0.2284  acc=0.6850


Sample: 100%|██████████| 330/330 [00:12, 25.62it/s, step size=3.85e-01, acc. prob=0.937]


[outer 020] TRAIN (EMA+K-ens) ll=0.6562  br=0.2313  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.52it/s, step size=3.12e-01, acc. prob=0.936]


[outer 021] TRAIN (EMA+K-ens) ll=0.6521  br=0.2294  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.13it/s, step size=2.38e-01, acc. prob=0.968]


[outer 022] TRAIN (EMA+K-ens) ll=0.6519  br=0.2294  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.61it/s, step size=2.84e-01, acc. prob=0.944]


[outer 023] TRAIN (EMA+K-ens) ll=0.6484  br=0.2277  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 23.31it/s, step size=2.94e-01, acc. prob=0.910]


[outer 024] TRAIN (EMA+K-ens) ll=0.6471  br=0.2271  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.66it/s, step size=2.60e-01, acc. prob=0.961]


[outer 025] TRAIN (EMA+K-ens) ll=0.6566  br=0.2316  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 19.41it/s, step size=2.32e-01, acc. prob=0.957]


[outer 026] TRAIN (EMA+K-ens) ll=0.6605  br=0.2335  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.27it/s, step size=2.38e-01, acc. prob=0.955]


[outer 027] TRAIN (EMA+K-ens) ll=0.6665  br=0.2363  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.94it/s, step size=2.67e-01, acc. prob=0.961]


[outer 028] TRAIN (EMA+K-ens) ll=0.6744  br=0.2400  acc=0.6750


Sample: 100%|██████████| 330/330 [00:15, 21.32it/s, step size=2.88e-01, acc. prob=0.952]


[outer 029] TRAIN (EMA+K-ens) ll=0.6811  br=0.2432  acc=0.6750


Sample: 100%|██████████| 330/330 [00:14, 22.98it/s, step size=2.86e-01, acc. prob=0.937]


[outer 030] TRAIN (EMA+K-ens) ll=0.6831  br=0.2442  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 20.85it/s, step size=3.30e-01, acc. prob=0.934]


[outer 031] TRAIN (EMA+K-ens) ll=0.6912  br=0.2479  acc=0.6840


Sample: 100%|██████████| 330/330 [00:14, 23.07it/s, step size=2.71e-01, acc. prob=0.961]


[outer 032] TRAIN (EMA+K-ens) ll=0.6867  br=0.2458  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 21.74it/s, step size=2.46e-01, acc. prob=0.956]


[outer 033] TRAIN (EMA+K-ens) ll=0.6824  br=0.2436  acc=0.6810


Sample: 100%|██████████| 330/330 [00:13, 24.19it/s, step size=2.50e-01, acc. prob=0.962]


[outer 034] TRAIN (EMA+K-ens) ll=0.6766  br=0.2408  acc=0.6810


Sample: 100%|██████████| 330/330 [00:15, 21.62it/s, step size=2.30e-01, acc. prob=0.964]


[outer 035] TRAIN (EMA+K-ens) ll=0.6745  br=0.2398  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.45it/s, step size=2.84e-01, acc. prob=0.969]


[outer 036] TRAIN (EMA+K-ens) ll=0.6696  br=0.2375  acc=0.6830


Sample: 100%|██████████| 330/330 [00:13, 25.20it/s, step size=3.04e-01, acc. prob=0.946]


[outer 037] TRAIN (EMA+K-ens) ll=0.6586  br=0.2324  acc=0.6830


Sample: 100%|██████████| 330/330 [00:13, 23.76it/s, step size=2.62e-01, acc. prob=0.936]


[outer 038] TRAIN (EMA+K-ens) ll=0.6731  br=0.2391  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 20.27it/s, step size=2.35e-01, acc. prob=0.952]


[outer 039] TRAIN (EMA+K-ens) ll=0.6592  br=0.2328  acc=0.6810


Sample: 100%|██████████| 330/330 [00:15, 20.98it/s, step size=2.92e-01, acc. prob=0.938]


[outer 000] TRAIN (EMA+K-ens) ll=0.7140  br=0.2604  acc=0.4600


Sample: 100%|██████████| 330/330 [00:15, 20.84it/s, step size=3.38e-01, acc. prob=0.934]


[outer 001] TRAIN (EMA+K-ens) ll=0.7057  br=0.2555  acc=0.6300


Sample: 100%|██████████| 330/330 [00:16, 20.39it/s, step size=3.06e-01, acc. prob=0.943]


[outer 002] TRAIN (EMA+K-ens) ll=0.6897  br=0.2475  acc=0.6470


Sample: 100%|██████████| 330/330 [00:14, 23.04it/s, step size=2.79e-01, acc. prob=0.934]


[outer 003] TRAIN (EMA+K-ens) ll=0.6876  br=0.2464  acc=0.6290


Sample: 100%|██████████| 330/330 [00:16, 20.56it/s, step size=2.46e-01, acc. prob=0.938]


[outer 004] TRAIN (EMA+K-ens) ll=0.6884  br=0.2466  acc=0.6670


Sample: 100%|██████████| 330/330 [00:14, 23.33it/s, step size=2.43e-01, acc. prob=0.962]


[outer 005] TRAIN (EMA+K-ens) ll=0.6819  br=0.2434  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.62it/s, step size=2.95e-01, acc. prob=0.938]


[outer 006] TRAIN (EMA+K-ens) ll=0.6797  br=0.2423  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 21.87it/s, step size=2.38e-01, acc. prob=0.932]


[outer 007] TRAIN (EMA+K-ens) ll=0.6828  br=0.2438  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 20.75it/s, step size=2.55e-01, acc. prob=0.949]


[outer 008] TRAIN (EMA+K-ens) ll=0.6701  br=0.2377  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 21.46it/s, step size=2.95e-01, acc. prob=0.943]


[outer 009] TRAIN (EMA+K-ens) ll=0.6734  br=0.2394  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.13it/s, step size=2.80e-01, acc. prob=0.953]


[outer 010] TRAIN (EMA+K-ens) ll=0.6640  br=0.2349  acc=0.6960


Sample: 100%|██████████| 330/330 [00:14, 23.41it/s, step size=2.87e-01, acc. prob=0.944]


[outer 011] TRAIN (EMA+K-ens) ll=0.6694  br=0.2371  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 21.38it/s, step size=2.67e-01, acc. prob=0.962]


[outer 012] TRAIN (EMA+K-ens) ll=0.6636  br=0.2344  acc=0.6960


Sample: 100%|██████████| 330/330 [00:15, 21.75it/s, step size=3.21e-01, acc. prob=0.954]


[outer 013] TRAIN (EMA+K-ens) ll=0.6711  br=0.2380  acc=0.6990


Sample: 100%|██████████| 330/330 [00:14, 22.25it/s, step size=3.39e-01, acc. prob=0.931]


[outer 014] TRAIN (EMA+K-ens) ll=0.6654  br=0.2355  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.69it/s, step size=2.55e-01, acc. prob=0.955]


[outer 015] TRAIN (EMA+K-ens) ll=0.6615  br=0.2336  acc=0.6970


Sample: 100%|██████████| 330/330 [00:15, 21.64it/s, step size=2.59e-01, acc. prob=0.964]


[outer 016] TRAIN (EMA+K-ens) ll=0.6786  br=0.2413  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 24.33it/s, step size=2.74e-01, acc. prob=0.951]


[outer 017] TRAIN (EMA+K-ens) ll=0.6771  br=0.2405  acc=0.7030


Sample: 100%|██████████| 330/330 [00:15, 20.75it/s, step size=2.98e-01, acc. prob=0.937]


[outer 018] TRAIN (EMA+K-ens) ll=0.6955  br=0.2493  acc=0.6790


Sample: 100%|██████████| 330/330 [00:14, 23.36it/s, step size=2.44e-01, acc. prob=0.975]


[outer 019] TRAIN (EMA+K-ens) ll=0.6830  br=0.2437  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.48it/s, step size=2.89e-01, acc. prob=0.939]


[outer 020] TRAIN (EMA+K-ens) ll=0.6851  br=0.2447  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 23.28it/s, step size=3.56e-01, acc. prob=0.906]


[outer 021] TRAIN (EMA+K-ens) ll=0.6788  br=0.2418  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 23.26it/s, step size=2.94e-01, acc. prob=0.950]


[outer 022] TRAIN (EMA+K-ens) ll=0.6946  br=0.2491  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.06it/s, step size=2.98e-01, acc. prob=0.928]


[outer 023] TRAIN (EMA+K-ens) ll=0.6957  br=0.2495  acc=0.6960


Sample: 100%|██████████| 330/330 [00:15, 21.06it/s, step size=2.37e-01, acc. prob=0.958]


[outer 024] TRAIN (EMA+K-ens) ll=0.6917  br=0.2474  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.24it/s, step size=2.59e-01, acc. prob=0.965]


[outer 025] TRAIN (EMA+K-ens) ll=0.6870  br=0.2451  acc=0.6990


Sample: 100%|██████████| 330/330 [00:14, 22.71it/s, step size=3.23e-01, acc. prob=0.908]


[outer 026] TRAIN (EMA+K-ens) ll=0.6752  br=0.2395  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.34it/s, step size=2.82e-01, acc. prob=0.937]


[outer 027] TRAIN (EMA+K-ens) ll=0.6800  br=0.2417  acc=0.6990


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=3.07e-01, acc. prob=0.925]


[outer 028] TRAIN (EMA+K-ens) ll=0.6854  br=0.2437  acc=0.6980


Sample: 100%|██████████| 330/330 [00:14, 22.02it/s, step size=2.62e-01, acc. prob=0.945]


[outer 029] TRAIN (EMA+K-ens) ll=0.6807  br=0.2418  acc=0.6990


Sample: 100%|██████████| 330/330 [00:12, 25.97it/s, step size=3.33e-01, acc. prob=0.944]


[outer 030] TRAIN (EMA+K-ens) ll=0.6515  br=0.2286  acc=0.6980


Sample: 100%|██████████| 330/330 [00:13, 23.77it/s, step size=2.85e-01, acc. prob=0.932]


[outer 031] TRAIN (EMA+K-ens) ll=0.6552  br=0.2305  acc=0.6980


Sample: 100%|██████████| 330/330 [00:13, 24.53it/s, step size=2.43e-01, acc. prob=0.947]


[outer 032] TRAIN (EMA+K-ens) ll=0.6430  br=0.2249  acc=0.6990


Sample: 100%|██████████| 330/330 [00:14, 23.15it/s, step size=2.75e-01, acc. prob=0.946]


[outer 033] TRAIN (EMA+K-ens) ll=0.6423  br=0.2246  acc=0.6980


Sample: 100%|██████████| 330/330 [00:14, 23.43it/s, step size=2.84e-01, acc. prob=0.948]


[outer 034] TRAIN (EMA+K-ens) ll=0.6352  br=0.2213  acc=0.6980


Sample: 100%|██████████| 330/330 [00:14, 22.29it/s, step size=2.69e-01, acc. prob=0.977]


[outer 035] TRAIN (EMA+K-ens) ll=0.6309  br=0.2193  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 21.01it/s, step size=2.65e-01, acc. prob=0.960]


[outer 036] TRAIN (EMA+K-ens) ll=0.6164  br=0.2124  acc=0.7020


Sample: 100%|██████████| 330/330 [00:16, 20.61it/s, step size=2.53e-01, acc. prob=0.947]


[outer 037] TRAIN (EMA+K-ens) ll=0.6171  br=0.2128  acc=0.6980


Sample: 100%|██████████| 330/330 [00:14, 22.23it/s, step size=2.27e-01, acc. prob=0.965]


[outer 038] TRAIN (EMA+K-ens) ll=0.6244  br=0.2162  acc=0.6980


Sample: 100%|██████████| 330/330 [00:14, 23.13it/s, step size=2.89e-01, acc. prob=0.958]


[outer 039] TRAIN (EMA+K-ens) ll=0.6224  br=0.2153  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 19.91it/s, step size=3.15e-01, acc. prob=0.961]


[outer 000] TRAIN (EMA+K-ens) ll=0.7055  br=0.2558  acc=0.5720


Sample: 100%|██████████| 330/330 [00:14, 23.37it/s, step size=3.33e-01, acc. prob=0.924]


[outer 001] TRAIN (EMA+K-ens) ll=0.6775  br=0.2417  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.39it/s, step size=2.56e-01, acc. prob=0.972]


[outer 002] TRAIN (EMA+K-ens) ll=0.6807  br=0.2431  acc=0.6590


Sample: 100%|██████████| 330/330 [00:17, 18.78it/s, step size=2.25e-01, acc. prob=0.960]


[outer 003] TRAIN (EMA+K-ens) ll=0.6885  br=0.2470  acc=0.6810


Sample: 100%|██████████| 330/330 [00:13, 24.32it/s, step size=3.33e-01, acc. prob=0.917]


[outer 004] TRAIN (EMA+K-ens) ll=0.6793  br=0.2425  acc=0.6820


Sample: 100%|██████████| 330/330 [00:13, 24.61it/s, step size=2.88e-01, acc. prob=0.945]


[outer 005] TRAIN (EMA+K-ens) ll=0.6832  br=0.2444  acc=0.6820


Sample: 100%|██████████| 330/330 [00:14, 22.86it/s, step size=2.75e-01, acc. prob=0.950]


[outer 006] TRAIN (EMA+K-ens) ll=0.6852  br=0.2454  acc=0.6840


Sample: 100%|██████████| 330/330 [00:14, 22.77it/s, step size=2.46e-01, acc. prob=0.963]


[outer 007] TRAIN (EMA+K-ens) ll=0.6833  br=0.2445  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.82it/s, step size=2.90e-01, acc. prob=0.944]


[outer 008] TRAIN (EMA+K-ens) ll=0.6762  br=0.2411  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 21.74it/s, step size=2.28e-01, acc. prob=0.956]


[outer 009] TRAIN (EMA+K-ens) ll=0.6730  br=0.2395  acc=0.6930


Sample: 100%|██████████| 330/330 [00:14, 23.07it/s, step size=3.02e-01, acc. prob=0.956]


[outer 010] TRAIN (EMA+K-ens) ll=0.6713  br=0.2386  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.60it/s, step size=2.43e-01, acc. prob=0.970]


[outer 011] TRAIN (EMA+K-ens) ll=0.6695  br=0.2377  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.79it/s, step size=3.17e-01, acc. prob=0.942]


[outer 012] TRAIN (EMA+K-ens) ll=0.6867  br=0.2460  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.42it/s, step size=2.66e-01, acc. prob=0.932]


[outer 013] TRAIN (EMA+K-ens) ll=0.6844  br=0.2447  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.29it/s, step size=3.54e-01, acc. prob=0.899]


[outer 014] TRAIN (EMA+K-ens) ll=0.6800  br=0.2426  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.93it/s, step size=2.32e-01, acc. prob=0.958]


[outer 015] TRAIN (EMA+K-ens) ll=0.6807  br=0.2429  acc=0.6940


Sample: 100%|██████████| 330/330 [00:12, 25.44it/s, step size=2.65e-01, acc. prob=0.927]


[outer 016] TRAIN (EMA+K-ens) ll=0.6922  br=0.2483  acc=0.6340


Sample: 100%|██████████| 330/330 [00:13, 25.17it/s, step size=3.23e-01, acc. prob=0.928]


[outer 017] TRAIN (EMA+K-ens) ll=0.6989  br=0.2514  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.98it/s, step size=2.50e-01, acc. prob=0.951]


[outer 018] TRAIN (EMA+K-ens) ll=0.6994  br=0.2516  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.20it/s, step size=2.93e-01, acc. prob=0.941]


[outer 019] TRAIN (EMA+K-ens) ll=0.6960  br=0.2500  acc=0.6700


Sample: 100%|██████████| 330/330 [00:14, 23.30it/s, step size=2.98e-01, acc. prob=0.939]


[outer 020] TRAIN (EMA+K-ens) ll=0.6817  br=0.2432  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.14it/s, step size=3.24e-01, acc. prob=0.919]


[outer 021] TRAIN (EMA+K-ens) ll=0.6856  br=0.2451  acc=0.6640


Sample: 100%|██████████| 330/330 [00:13, 24.44it/s, step size=3.89e-01, acc. prob=0.942]


[outer 022] TRAIN (EMA+K-ens) ll=0.6956  br=0.2496  acc=0.6340


Sample: 100%|██████████| 330/330 [00:14, 22.28it/s, step size=3.17e-01, acc. prob=0.955]


[outer 023] TRAIN (EMA+K-ens) ll=0.6932  br=0.2484  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 20.88it/s, step size=2.51e-01, acc. prob=0.946]


[outer 024] TRAIN (EMA+K-ens) ll=0.6891  br=0.2465  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.41it/s, step size=3.06e-01, acc. prob=0.954]


[outer 025] TRAIN (EMA+K-ens) ll=0.6842  br=0.2442  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.79it/s, step size=3.07e-01, acc. prob=0.934]


[outer 026] TRAIN (EMA+K-ens) ll=0.6854  br=0.2448  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.77it/s, step size=2.64e-01, acc. prob=0.952]


[outer 027] TRAIN (EMA+K-ens) ll=0.6812  br=0.2430  acc=0.6950
[Early stop @ outer 27] Δll=0.304%, Δbr=0.480%, Δacc=0.002


Sample: 100%|██████████| 330/330 [00:14, 23.04it/s, step size=2.95e-01, acc. prob=0.937]


[outer 000] TRAIN (EMA+K-ens) ll=0.7034  br=0.2549  acc=0.5170


Sample: 100%|██████████| 330/330 [00:14, 22.17it/s, step size=2.09e-01, acc. prob=0.961]


[outer 001] TRAIN (EMA+K-ens) ll=0.6787  br=0.2425  acc=0.6270


Sample: 100%|██████████| 330/330 [00:13, 23.60it/s, step size=3.25e-01, acc. prob=0.945]


[outer 002] TRAIN (EMA+K-ens) ll=0.6692  br=0.2379  acc=0.6250


Sample: 100%|██████████| 330/330 [00:15, 21.87it/s, step size=3.14e-01, acc. prob=0.944]


[outer 003] TRAIN (EMA+K-ens) ll=0.6511  br=0.2290  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.32it/s, step size=3.60e-01, acc. prob=0.950]


[outer 004] TRAIN (EMA+K-ens) ll=0.6530  br=0.2298  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 22.57it/s, step size=2.74e-01, acc. prob=0.965]


[outer 005] TRAIN (EMA+K-ens) ll=0.6545  br=0.2304  acc=0.7070


Sample: 100%|██████████| 330/330 [00:14, 22.96it/s, step size=2.80e-01, acc. prob=0.933]


[outer 006] TRAIN (EMA+K-ens) ll=0.6602  br=0.2330  acc=0.7010


Sample: 100%|██████████| 330/330 [00:16, 20.19it/s, step size=3.10e-01, acc. prob=0.928]


[outer 007] TRAIN (EMA+K-ens) ll=0.6674  br=0.2362  acc=0.7080


Sample: 100%|██████████| 330/330 [00:15, 21.37it/s, step size=2.44e-01, acc. prob=0.960]


[outer 008] TRAIN (EMA+K-ens) ll=0.6711  br=0.2377  acc=0.7090


Sample: 100%|██████████| 330/330 [00:15, 21.61it/s, step size=2.91e-01, acc. prob=0.953]


[outer 009] TRAIN (EMA+K-ens) ll=0.6654  br=0.2351  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.29it/s, step size=2.81e-01, acc. prob=0.951]


[outer 010] TRAIN (EMA+K-ens) ll=0.6719  br=0.2378  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 23.17it/s, step size=3.05e-01, acc. prob=0.959]


[outer 011] TRAIN (EMA+K-ens) ll=0.6643  br=0.2345  acc=0.7040


Sample: 100%|██████████| 330/330 [00:15, 21.75it/s, step size=2.01e-01, acc. prob=0.964]


[outer 012] TRAIN (EMA+K-ens) ll=0.6601  br=0.2328  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.42it/s, step size=2.89e-01, acc. prob=0.943]


[outer 013] TRAIN (EMA+K-ens) ll=0.6651  br=0.2347  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.73it/s, step size=3.22e-01, acc. prob=0.967]


[outer 014] TRAIN (EMA+K-ens) ll=0.6614  br=0.2328  acc=0.7110


Sample: 100%|██████████| 330/330 [00:14, 22.16it/s, step size=3.01e-01, acc. prob=0.948]


[outer 015] TRAIN (EMA+K-ens) ll=0.6543  br=0.2297  acc=0.7170


Sample: 100%|██████████| 330/330 [00:16, 20.43it/s, step size=2.78e-01, acc. prob=0.951]


[outer 016] TRAIN (EMA+K-ens) ll=0.6457  br=0.2258  acc=0.7160


Sample: 100%|██████████| 330/330 [00:14, 22.39it/s, step size=2.99e-01, acc. prob=0.955]


[outer 017] TRAIN (EMA+K-ens) ll=0.6431  br=0.2246  acc=0.7140


Sample: 100%|██████████| 330/330 [00:14, 22.65it/s, step size=3.09e-01, acc. prob=0.941]


[outer 018] TRAIN (EMA+K-ens) ll=0.6472  br=0.2266  acc=0.7090


Sample: 100%|██████████| 330/330 [00:15, 21.10it/s, step size=2.53e-01, acc. prob=0.949]


[outer 019] TRAIN (EMA+K-ens) ll=0.6725  br=0.2383  acc=0.7090


Sample: 100%|██████████| 330/330 [00:15, 21.47it/s, step size=3.14e-01, acc. prob=0.939]


[outer 020] TRAIN (EMA+K-ens) ll=0.6656  br=0.2350  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 23.38it/s, step size=2.86e-01, acc. prob=0.945]


[outer 021] TRAIN (EMA+K-ens) ll=0.6632  br=0.2341  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.84it/s, step size=3.04e-01, acc. prob=0.940]


[outer 022] TRAIN (EMA+K-ens) ll=0.6704  br=0.2375  acc=0.7080


Sample: 100%|██████████| 330/330 [00:15, 21.45it/s, step size=2.41e-01, acc. prob=0.971]


[outer 023] TRAIN (EMA+K-ens) ll=0.6757  br=0.2402  acc=0.7030


Sample: 100%|██████████| 330/330 [00:14, 22.90it/s, step size=2.78e-01, acc. prob=0.942]


[outer 024] TRAIN (EMA+K-ens) ll=0.6765  br=0.2406  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 23.68it/s, step size=2.84e-01, acc. prob=0.948]


[outer 025] TRAIN (EMA+K-ens) ll=0.6808  br=0.2426  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.58it/s, step size=3.63e-01, acc. prob=0.948]


[outer 026] TRAIN (EMA+K-ens) ll=0.6679  br=0.2365  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.39it/s, step size=2.35e-01, acc. prob=0.977]


[outer 027] TRAIN (EMA+K-ens) ll=0.6548  br=0.2304  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.68it/s, step size=2.92e-01, acc. prob=0.939]


[outer 028] TRAIN (EMA+K-ens) ll=0.6625  br=0.2340  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.77it/s, step size=3.18e-01, acc. prob=0.918]


[outer 029] TRAIN (EMA+K-ens) ll=0.6617  br=0.2336  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 23.32it/s, step size=3.74e-01, acc. prob=0.948]


[outer 030] TRAIN (EMA+K-ens) ll=0.6558  br=0.2308  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 23.50it/s, step size=2.80e-01, acc. prob=0.968]


[outer 031] TRAIN (EMA+K-ens) ll=0.6455  br=0.2258  acc=0.7090


Sample: 100%|██████████| 330/330 [00:13, 23.67it/s, step size=2.44e-01, acc. prob=0.934]


[outer 032] TRAIN (EMA+K-ens) ll=0.6482  br=0.2270  acc=0.7110


Sample: 100%|██████████| 330/330 [00:12, 25.85it/s, step size=2.88e-01, acc. prob=0.938]


[outer 033] TRAIN (EMA+K-ens) ll=0.6535  br=0.2294  acc=0.7110


Sample: 100%|██████████| 330/330 [00:14, 22.92it/s, step size=2.83e-01, acc. prob=0.939]


[outer 034] TRAIN (EMA+K-ens) ll=0.6556  br=0.2305  acc=0.6980


Sample: 100%|██████████| 330/330 [00:13, 23.80it/s, step size=3.16e-01, acc. prob=0.952]


[outer 035] TRAIN (EMA+K-ens) ll=0.6531  br=0.2292  acc=0.7090


Sample: 100%|██████████| 330/330 [00:13, 23.70it/s, step size=2.95e-01, acc. prob=0.960]


[outer 036] TRAIN (EMA+K-ens) ll=0.6478  br=0.2268  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.05it/s, step size=2.90e-01, acc. prob=0.957]


[outer 037] TRAIN (EMA+K-ens) ll=0.6442  br=0.2251  acc=0.7090


Sample: 100%|██████████| 330/330 [00:13, 23.71it/s, step size=2.27e-01, acc. prob=0.965]


[outer 038] TRAIN (EMA+K-ens) ll=0.6335  br=0.2203  acc=0.7090


Sample: 100%|██████████| 330/330 [00:13, 25.27it/s, step size=2.40e-01, acc. prob=0.965]


[outer 039] TRAIN (EMA+K-ens) ll=0.6346  br=0.2208  acc=0.7090
[{'accuracy': 0.6989799737930298, 'brier': 0.2324819415807724, 'logloss': 0.6596900820732117}, {'accuracy': 0.6593599915504456, 'brier': 0.24097435176372528, 'logloss': 0.6775277853012085}, {'accuracy': 0.6898399591445923, 'brier': 0.2538439631462097, 'logloss': 0.706803560256958}, {'accuracy': 0.6958799958229065, 'brier': 0.2439161241054535, 'logloss': 0.6849684715270996}, {'accuracy': 0.676580011844635, 'brier': 0.24694004654884338, 'logloss': 0.689387321472168}, {'accuracy': 0.6592000126838684, 'brier': 0.2397531419992447, 'logloss': 0.675529956817627}, {'accuracy': 0.6560199856758118, 'brier': 0.2281046360731125, 'logloss': 0.6481181979179382}, {'accuracy': 0.7019599676132202, 'brier': 0.24499835073947906, 'logloss': 0.686168909072876}, {'accuracy': 0.6717399954795837, 'brier': 0.22386552393436432, 'logloss': 0.6387910842895508}, {'accuracy': 0.7062399983406067, 'brier': 0.2121962010860443, 'logloss': 0.6165426969528198
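
The log lines above report `ll`, `br`, and `acc`, and each per-seed dictionary carries `logloss`, `brier`, and `accuracy`. This part of the notebook does not show how they are computed, so the following is only a minimal sketch. It assumes the standard definitions for binary outcomes: mean negative log-likelihood, mean squared error of the predicted probability, and accuracy at a 0.5 threshold. The helper name `binary_metrics` and the clipping constant `eps` are hypothetical and not taken from the notebook.

```python
import numpy as np

def binary_metrics(p, y, eps=1e-12):
    """Hypothetical helper: standard log loss, Brier score, and accuracy."""
    p = np.asarray(p, dtype=float)   # predicted P(y = 1)
    y = np.asarray(y, dtype=float)   # observed labels in {0, 1}
    p_clip = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    logloss = -np.mean(y * np.log(p_clip) + (1.0 - y) * np.log(1.0 - p_clip))
    brier = np.mean((p - y) ** 2)
    accuracy = np.mean((p >= 0.5) == (y == 1.0))
    return {
        "accuracy": float(accuracy),
        "brier": float(brier),
        "logloss": float(logloss),
    }

# Toy example with made-up probabilities and labels (not data from the experiment)
example = binary_metrics([0.9, 0.1, 0.6, 0.4], [1, 0, 0, 1])
print(example)
```

Under these definitions, a constant prediction of 0.5 gives a Brier score of 0.25 and a log loss of ln 2 ≈ 0.693, which is a useful reference point when reading the values in the logs.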

In [None]:
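all_metrics = []
noise_type = "normal"  # noise distribution passed to simulate_dataset

for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible.
    np.random.seed(seed)
    torch.manual_seed(seed)

    # Small training set and large test set, both drawn with the same noise type.
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type=noise_type,
        n_per_group=10000
    )

    # Fit with the group-level effect enabled and the interaction and nonlinear terms disabled.
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=False, nonlinear=False, group=True,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )

    all_metrics.append(res["metrics_test"])
    print(all_metrics)  # running list of test metrics after each seed

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean', 'std', 'median'])
print(summary)
print(df)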

Sample: 100%|██████████| 330/330 [00:11, 28.45it/s, step size=3.33e-01, acc. prob=0.943]


[outer 000] TRAIN (EMA+K-ens) ll=0.6654  br=0.2363  acc=0.6090


Sample: 100%|██████████| 330/330 [00:11, 28.35it/s, step size=3.88e-01, acc. prob=0.918]


[outer 001] TRAIN (EMA+K-ens) ll=0.6660  br=0.2362  acc=0.6520


Sample: 100%|██████████| 330/330 [00:11, 29.74it/s, step size=4.57e-01, acc. prob=0.909]


[outer 002] TRAIN (EMA+K-ens) ll=0.6695  br=0.2379  acc=0.6540


Sample: 100%|██████████| 330/330 [00:12, 27.42it/s, step size=3.75e-01, acc. prob=0.956]


[outer 003] TRAIN (EMA+K-ens) ll=0.6667  br=0.2366  acc=0.6800


Sample: 100%|██████████| 330/330 [00:11, 29.68it/s, step size=4.52e-01, acc. prob=0.914]


[outer 004] TRAIN (EMA+K-ens) ll=0.6666  br=0.2366  acc=0.6730


Sample: 100%|██████████| 330/330 [00:11, 27.60it/s, step size=4.10e-01, acc. prob=0.928]


[outer 005] TRAIN (EMA+K-ens) ll=0.6649  br=0.2358  acc=0.6920


Sample: 100%|██████████| 330/330 [00:11, 28.26it/s, step size=3.80e-01, acc. prob=0.938]


[outer 006] TRAIN (EMA+K-ens) ll=0.6637  br=0.2352  acc=0.6900


Sample: 100%|██████████| 330/330 [00:12, 27.08it/s, step size=3.66e-01, acc. prob=0.947]


[outer 007] TRAIN (EMA+K-ens) ll=0.6627  br=0.2346  acc=0.6920


Sample: 100%|██████████| 330/330 [00:12, 27.03it/s, step size=4.11e-01, acc. prob=0.918]


[outer 008] TRAIN (EMA+K-ens) ll=0.6639  br=0.2352  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 28.64it/s, step size=3.83e-01, acc. prob=0.929]


[outer 009] TRAIN (EMA+K-ens) ll=0.6661  br=0.2363  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.14it/s, step size=4.32e-01, acc. prob=0.905]


[outer 010] TRAIN (EMA+K-ens) ll=0.6590  br=0.2328  acc=0.6980


Sample: 100%|██████████| 330/330 [00:12, 27.26it/s, step size=4.30e-01, acc. prob=0.894]


[outer 011] TRAIN (EMA+K-ens) ll=0.6563  br=0.2315  acc=0.7000


Sample: 100%|██████████| 330/330 [00:11, 29.58it/s, step size=3.61e-01, acc. prob=0.928]


[outer 012] TRAIN (EMA+K-ens) ll=0.6562  br=0.2315  acc=0.6980


Sample: 100%|██████████| 330/330 [00:11, 29.25it/s, step size=3.71e-01, acc. prob=0.927]


[outer 013] TRAIN (EMA+K-ens) ll=0.6568  br=0.2318  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 29.00it/s, step size=3.66e-01, acc. prob=0.938]


[outer 014] TRAIN (EMA+K-ens) ll=0.6547  br=0.2308  acc=0.6990


Sample: 100%|██████████| 330/330 [00:12, 26.61it/s, step size=3.65e-01, acc. prob=0.950]


[outer 015] TRAIN (EMA+K-ens) ll=0.6551  br=0.2310  acc=0.6920


Sample: 100%|██████████| 330/330 [00:11, 29.10it/s, step size=4.07e-01, acc. prob=0.923]


[outer 016] TRAIN (EMA+K-ens) ll=0.6519  br=0.2295  acc=0.7000


Sample: 100%|██████████| 330/330 [00:11, 27.71it/s, step size=4.10e-01, acc. prob=0.903]


[outer 017] TRAIN (EMA+K-ens) ll=0.6469  br=0.2270  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 28.48it/s, step size=4.17e-01, acc. prob=0.917]


[outer 018] TRAIN (EMA+K-ens) ll=0.6522  br=0.2296  acc=0.6930


Sample: 100%|██████████| 330/330 [00:11, 27.55it/s, step size=4.19e-01, acc. prob=0.904]


[outer 019] TRAIN (EMA+K-ens) ll=0.6569  br=0.2319  acc=0.6880


Sample: 100%|██████████| 330/330 [00:11, 27.56it/s, step size=4.05e-01, acc. prob=0.931]


[outer 020] TRAIN (EMA+K-ens) ll=0.6556  br=0.2313  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 28.89it/s, step size=3.80e-01, acc. prob=0.918]


[outer 021] TRAIN (EMA+K-ens) ll=0.6552  br=0.2311  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 28.94it/s, step size=4.06e-01, acc. prob=0.930]


[outer 022] TRAIN (EMA+K-ens) ll=0.6579  br=0.2324  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 28.18it/s, step size=4.49e-01, acc. prob=0.889]


[outer 023] TRAIN (EMA+K-ens) ll=0.6567  br=0.2319  acc=0.7000


Sample: 100%|██████████| 330/330 [00:11, 28.24it/s, step size=3.65e-01, acc. prob=0.965]


[outer 024] TRAIN (EMA+K-ens) ll=0.6571  br=0.2320  acc=0.7000


Sample: 100%|██████████| 330/330 [00:12, 26.07it/s, step size=3.93e-01, acc. prob=0.921]


[outer 025] TRAIN (EMA+K-ens) ll=0.6582  br=0.2326  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 29.13it/s, step size=3.86e-01, acc. prob=0.931]


[outer 026] TRAIN (EMA+K-ens) ll=0.6597  br=0.2333  acc=0.6920


Sample: 100%|██████████| 330/330 [00:12, 26.72it/s, step size=3.85e-01, acc. prob=0.923]


[outer 027] TRAIN (EMA+K-ens) ll=0.6605  br=0.2338  acc=0.6840


Sample: 100%|██████████| 330/330 [00:10, 32.16it/s, step size=4.10e-01, acc. prob=0.922]


[outer 028] TRAIN (EMA+K-ens) ll=0.6590  br=0.2330  acc=0.6930


Sample: 100%|██████████| 330/330 [00:11, 29.35it/s, step size=3.56e-01, acc. prob=0.955]


[outer 029] TRAIN (EMA+K-ens) ll=0.6602  br=0.2336  acc=0.6890


Sample: 100%|██████████| 330/330 [00:12, 27.42it/s, step size=3.93e-01, acc. prob=0.940]


[outer 030] TRAIN (EMA+K-ens) ll=0.6598  br=0.2334  acc=0.6930


Sample: 100%|██████████| 330/330 [00:11, 29.50it/s, step size=3.81e-01, acc. prob=0.928]


[outer 031] TRAIN (EMA+K-ens) ll=0.6624  br=0.2346  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 28.59it/s, step size=3.90e-01, acc. prob=0.931]


[outer 032] TRAIN (EMA+K-ens) ll=0.6650  br=0.2359  acc=0.6920


Sample: 100%|██████████| 330/330 [00:11, 29.66it/s, step size=3.91e-01, acc. prob=0.917]


[outer 033] TRAIN (EMA+K-ens) ll=0.6674  br=0.2371  acc=0.6630


Sample: 100%|██████████| 330/330 [00:11, 28.51it/s, step size=4.35e-01, acc. prob=0.918]


[outer 034] TRAIN (EMA+K-ens) ll=0.6666  br=0.2366  acc=0.6540


Sample: 100%|██████████| 330/330 [00:12, 26.31it/s, step size=3.93e-01, acc. prob=0.932]


[outer 035] TRAIN (EMA+K-ens) ll=0.6661  br=0.2364  acc=0.6570


Sample: 100%|██████████| 330/330 [00:12, 27.15it/s, step size=4.10e-01, acc. prob=0.926]


[outer 036] TRAIN (EMA+K-ens) ll=0.6657  br=0.2362  acc=0.6920


Sample: 100%|██████████| 330/330 [00:11, 28.26it/s, step size=3.67e-01, acc. prob=0.927]


[outer 037] TRAIN (EMA+K-ens) ll=0.6645  br=0.2356  acc=0.6960


Sample: 100%|██████████| 330/330 [00:11, 27.61it/s, step size=3.66e-01, acc. prob=0.939]


[outer 038] TRAIN (EMA+K-ens) ll=0.6651  br=0.2359  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 29.16it/s, step size=4.25e-01, acc. prob=0.926]


[outer 039] TRAIN (EMA+K-ens) ll=0.6646  br=0.2356  acc=0.6960
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}]


Sample: 100%|██████████| 330/330 [00:11, 28.20it/s, step size=4.43e-01, acc. prob=0.883]


[outer 000] TRAIN (EMA+K-ens) ll=0.6707  br=0.2389  acc=0.6060


Sample: 100%|██████████| 330/330 [00:11, 28.52it/s, step size=3.81e-01, acc. prob=0.914]


[outer 001] TRAIN (EMA+K-ens) ll=0.6667  br=0.2369  acc=0.6200


Sample: 100%|██████████| 330/330 [00:11, 28.71it/s, step size=3.53e-01, acc. prob=0.914]


[outer 002] TRAIN (EMA+K-ens) ll=0.6694  br=0.2382  acc=0.6200


Sample: 100%|██████████| 330/330 [00:10, 30.16it/s, step size=4.07e-01, acc. prob=0.920]


[outer 003] TRAIN (EMA+K-ens) ll=0.6638  br=0.2354  acc=0.6400


Sample: 100%|██████████| 330/330 [00:11, 28.87it/s, step size=4.03e-01, acc. prob=0.901]


[outer 004] TRAIN (EMA+K-ens) ll=0.6645  br=0.2357  acc=0.6520


Sample: 100%|██████████| 330/330 [00:11, 28.75it/s, step size=3.47e-01, acc. prob=0.955]


[outer 005] TRAIN (EMA+K-ens) ll=0.6650  br=0.2359  acc=0.6740


Sample: 100%|██████████| 330/330 [00:12, 27.21it/s, step size=4.10e-01, acc. prob=0.930]


[outer 006] TRAIN (EMA+K-ens) ll=0.6653  br=0.2361  acc=0.6820


Sample: 100%|██████████| 330/330 [00:10, 30.19it/s, step size=3.92e-01, acc. prob=0.910]


[outer 007] TRAIN (EMA+K-ens) ll=0.6632  br=0.2351  acc=0.6770


Sample: 100%|██████████| 330/330 [00:11, 27.68it/s, step size=3.39e-01, acc. prob=0.946]


[outer 008] TRAIN (EMA+K-ens) ll=0.6678  br=0.2373  acc=0.6670


Sample: 100%|██████████| 330/330 [00:12, 26.52it/s, step size=3.66e-01, acc. prob=0.931]


[outer 009] TRAIN (EMA+K-ens) ll=0.6679  br=0.2374  acc=0.6750


Sample: 100%|██████████| 330/330 [00:11, 27.86it/s, step size=4.43e-01, acc. prob=0.901]


[outer 010] TRAIN (EMA+K-ens) ll=0.6657  br=0.2363  acc=0.6750


Sample: 100%|██████████| 330/330 [00:12, 27.09it/s, step size=4.14e-01, acc. prob=0.880]


[outer 011] TRAIN (EMA+K-ens) ll=0.6666  br=0.2367  acc=0.6700


Sample: 100%|██████████| 330/330 [00:12, 26.59it/s, step size=4.05e-01, acc. prob=0.925]


[outer 012] TRAIN (EMA+K-ens) ll=0.6673  br=0.2371  acc=0.6810


Sample: 100%|██████████| 330/330 [00:11, 28.89it/s, step size=3.73e-01, acc. prob=0.946]


[outer 013] TRAIN (EMA+K-ens) ll=0.6643  br=0.2356  acc=0.6780


Sample: 100%|██████████| 330/330 [00:11, 27.94it/s, step size=3.88e-01, acc. prob=0.922]


[outer 014] TRAIN (EMA+K-ens) ll=0.6641  br=0.2355  acc=0.6870


Sample: 100%|██████████| 330/330 [00:11, 29.18it/s, step size=4.26e-01, acc. prob=0.904]


[outer 015] TRAIN (EMA+K-ens) ll=0.6681  br=0.2375  acc=0.6660


Sample: 100%|██████████| 330/330 [00:12, 26.13it/s, step size=3.86e-01, acc. prob=0.927]


[outer 016] TRAIN (EMA+K-ens) ll=0.6702  br=0.2386  acc=0.6490


Sample: 100%|██████████| 330/330 [00:11, 28.42it/s, step size=4.16e-01, acc. prob=0.905]


[outer 017] TRAIN (EMA+K-ens) ll=0.6717  br=0.2393  acc=0.6480


Sample: 100%|██████████| 330/330 [00:12, 27.33it/s, step size=4.01e-01, acc. prob=0.905]


[outer 018] TRAIN (EMA+K-ens) ll=0.6706  br=0.2388  acc=0.6510


Sample: 100%|██████████| 330/330 [00:11, 29.85it/s, step size=4.03e-01, acc. prob=0.919]


[outer 019] TRAIN (EMA+K-ens) ll=0.6722  br=0.2395  acc=0.6550


Sample: 100%|██████████| 330/330 [00:11, 28.46it/s, step size=4.23e-01, acc. prob=0.917]


[outer 020] TRAIN (EMA+K-ens) ll=0.6737  br=0.2402  acc=0.6730


Sample: 100%|██████████| 330/330 [00:11, 28.48it/s, step size=4.24e-01, acc. prob=0.898]


[outer 021] TRAIN (EMA+K-ens) ll=0.6763  br=0.2415  acc=0.6810


Sample: 100%|██████████| 330/330 [00:11, 28.05it/s, step size=3.81e-01, acc. prob=0.947]


[outer 022] TRAIN (EMA+K-ens) ll=0.6733  br=0.2400  acc=0.6830


Sample: 100%|██████████| 330/330 [00:12, 27.09it/s, step size=3.68e-01, acc. prob=0.943]


[outer 023] TRAIN (EMA+K-ens) ll=0.6716  br=0.2392  acc=0.6740


Sample: 100%|██████████| 330/330 [00:11, 27.60it/s, step size=3.66e-01, acc. prob=0.934]


[outer 024] TRAIN (EMA+K-ens) ll=0.6656  br=0.2363  acc=0.6820


Sample: 100%|██████████| 330/330 [00:11, 29.07it/s, step size=4.09e-01, acc. prob=0.928]


[outer 025] TRAIN (EMA+K-ens) ll=0.6654  br=0.2361  acc=0.6820


Sample: 100%|██████████| 330/330 [00:11, 29.46it/s, step size=3.71e-01, acc. prob=0.918]


[outer 026] TRAIN (EMA+K-ens) ll=0.6667  br=0.2368  acc=0.6730


Sample: 100%|██████████| 330/330 [00:11, 28.67it/s, step size=3.79e-01, acc. prob=0.929]


[outer 027] TRAIN (EMA+K-ens) ll=0.6692  br=0.2380  acc=0.6790


Sample: 100%|██████████| 330/330 [00:11, 28.89it/s, step size=4.11e-01, acc. prob=0.937]


[outer 028] TRAIN (EMA+K-ens) ll=0.6625  br=0.2347  acc=0.6870


Sample: 100%|██████████| 330/330 [00:11, 28.95it/s, step size=4.29e-01, acc. prob=0.896]


[outer 029] TRAIN (EMA+K-ens) ll=0.6596  br=0.2333  acc=0.6790


Sample: 100%|██████████| 330/330 [00:10, 31.78it/s, step size=4.04e-01, acc. prob=0.901]


[outer 030] TRAIN (EMA+K-ens) ll=0.6635  br=0.2352  acc=0.6630


Sample: 100%|██████████| 330/330 [00:11, 29.52it/s, step size=3.95e-01, acc. prob=0.929]


[outer 031] TRAIN (EMA+K-ens) ll=0.6649  br=0.2359  acc=0.6590


Sample: 100%|██████████| 330/330 [00:11, 28.43it/s, step size=3.54e-01, acc. prob=0.938]


[outer 032] TRAIN (EMA+K-ens) ll=0.6656  br=0.2362  acc=0.6700


Sample: 100%|██████████| 330/330 [00:11, 28.70it/s, step size=3.36e-01, acc. prob=0.948]


[outer 033] TRAIN (EMA+K-ens) ll=0.6631  br=0.2350  acc=0.6710


Sample: 100%|██████████| 330/330 [00:11, 27.63it/s, step size=3.92e-01, acc. prob=0.922]


[outer 034] TRAIN (EMA+K-ens) ll=0.6644  br=0.2356  acc=0.6690


Sample: 100%|██████████| 330/330 [00:11, 29.25it/s, step size=4.39e-01, acc. prob=0.894]


[outer 035] TRAIN (EMA+K-ens) ll=0.6595  br=0.2332  acc=0.6760


Sample: 100%|██████████| 330/330 [00:11, 28.55it/s, step size=3.89e-01, acc. prob=0.946]


[outer 036] TRAIN (EMA+K-ens) ll=0.6624  br=0.2346  acc=0.6740


Sample: 100%|██████████| 330/330 [00:12, 26.10it/s, step size=3.71e-01, acc. prob=0.920]


[outer 037] TRAIN (EMA+K-ens) ll=0.6631  br=0.2350  acc=0.6760


Sample: 100%|██████████| 330/330 [00:12, 26.87it/s, step size=3.56e-01, acc. prob=0.940]


[outer 038] TRAIN (EMA+K-ens) ll=0.6561  br=0.2315  acc=0.6910


Sample: 100%|██████████| 330/330 [00:10, 31.27it/s, step size=4.53e-01, acc. prob=0.900]


[outer 039] TRAIN (EMA+K-ens) ll=0.6524  br=0.2297  acc=0.6920
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}]


Sample: 100%|██████████| 330/330 [00:11, 28.90it/s, step size=3.94e-01, acc. prob=0.915]


[outer 000] TRAIN (EMA+K-ens) ll=0.6639  br=0.2354  acc=0.5970


Sample: 100%|██████████| 330/330 [00:11, 28.98it/s, step size=3.49e-01, acc. prob=0.952]


[outer 001] TRAIN (EMA+K-ens) ll=0.6591  br=0.2330  acc=0.6500


Sample: 100%|██████████| 330/330 [00:11, 28.35it/s, step size=3.74e-01, acc. prob=0.941]


[outer 002] TRAIN (EMA+K-ens) ll=0.6620  br=0.2344  acc=0.6560


Sample: 100%|██████████| 330/330 [00:11, 27.84it/s, step size=3.74e-01, acc. prob=0.926]


[outer 003] TRAIN (EMA+K-ens) ll=0.6552  br=0.2311  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.52it/s, step size=4.05e-01, acc. prob=0.914]


[outer 004] TRAIN (EMA+K-ens) ll=0.6522  br=0.2296  acc=0.7030


Sample: 100%|██████████| 330/330 [00:11, 27.57it/s, step size=3.72e-01, acc. prob=0.924]


[outer 005] TRAIN (EMA+K-ens) ll=0.6541  br=0.2305  acc=0.7030


Sample: 100%|██████████| 330/330 [00:11, 29.69it/s, step size=4.41e-01, acc. prob=0.897]


[outer 006] TRAIN (EMA+K-ens) ll=0.6580  br=0.2325  acc=0.6990


Sample: 100%|██████████| 330/330 [00:12, 26.90it/s, step size=4.52e-01, acc. prob=0.881]


[outer 007] TRAIN (EMA+K-ens) ll=0.6576  br=0.2323  acc=0.7030


Sample: 100%|██████████| 330/330 [00:11, 28.54it/s, step size=4.64e-01, acc. prob=0.921]


[outer 008] TRAIN (EMA+K-ens) ll=0.6640  br=0.2353  acc=0.7040


Sample: 100%|██████████| 330/330 [00:11, 28.94it/s, step size=4.38e-01, acc. prob=0.939]


[outer 009] TRAIN (EMA+K-ens) ll=0.6650  br=0.2358  acc=0.7010


Sample: 100%|██████████| 330/330 [00:11, 27.68it/s, step size=4.02e-01, acc. prob=0.923]


[outer 010] TRAIN (EMA+K-ens) ll=0.6653  br=0.2359  acc=0.7030


Sample: 100%|██████████| 330/330 [00:11, 28.82it/s, step size=3.95e-01, acc. prob=0.920]


[outer 011] TRAIN (EMA+K-ens) ll=0.6676  br=0.2369  acc=0.7100


Sample: 100%|██████████| 330/330 [00:10, 31.47it/s, step size=3.99e-01, acc. prob=0.941]


[outer 012] TRAIN (EMA+K-ens) ll=0.6688  br=0.2375  acc=0.7110


Sample: 100%|██████████| 330/330 [00:11, 29.32it/s, step size=3.99e-01, acc. prob=0.888]


[outer 013] TRAIN (EMA+K-ens) ll=0.6668  br=0.2365  acc=0.7110


Sample: 100%|██████████| 330/330 [00:11, 28.63it/s, step size=4.19e-01, acc. prob=0.913]


[outer 014] TRAIN (EMA+K-ens) ll=0.6636  br=0.2350  acc=0.7110


Sample: 100%|██████████| 330/330 [00:11, 27.65it/s, step size=3.86e-01, acc. prob=0.919]


[outer 015] TRAIN (EMA+K-ens) ll=0.6600  br=0.2334  acc=0.7180


Sample: 100%|██████████| 330/330 [00:11, 28.02it/s, step size=3.71e-01, acc. prob=0.939]


[outer 016] TRAIN (EMA+K-ens) ll=0.6570  br=0.2318  acc=0.7150


Sample: 100%|██████████| 330/330 [00:11, 28.77it/s, step size=3.69e-01, acc. prob=0.932]


[outer 017] TRAIN (EMA+K-ens) ll=0.6525  br=0.2297  acc=0.7150


Sample: 100%|██████████| 330/330 [00:11, 29.23it/s, step size=4.03e-01, acc. prob=0.917]


[outer 018] TRAIN (EMA+K-ens) ll=0.6514  br=0.2292  acc=0.7130


Sample: 100%|██████████| 330/330 [00:11, 28.72it/s, step size=3.86e-01, acc. prob=0.946]


[outer 019] TRAIN (EMA+K-ens) ll=0.6500  br=0.2285  acc=0.7100


Sample: 100%|██████████| 330/330 [00:12, 26.84it/s, step size=4.12e-01, acc. prob=0.930]


[outer 020] TRAIN (EMA+K-ens) ll=0.6460  br=0.2266  acc=0.7100


Sample: 100%|██████████| 330/330 [00:12, 25.76it/s, step size=3.75e-01, acc. prob=0.955]


[outer 021] TRAIN (EMA+K-ens) ll=0.6485  br=0.2278  acc=0.7000


Sample: 100%|██████████| 330/330 [00:10, 30.45it/s, step size=4.59e-01, acc. prob=0.867]


[outer 022] TRAIN (EMA+K-ens) ll=0.6526  br=0.2298  acc=0.6860


Sample: 100%|██████████| 330/330 [00:11, 27.80it/s, step size=3.84e-01, acc. prob=0.902]


[outer 023] TRAIN (EMA+K-ens) ll=0.6523  br=0.2297  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.81it/s, step size=4.15e-01, acc. prob=0.932]


[outer 024] TRAIN (EMA+K-ens) ll=0.6486  br=0.2279  acc=0.6910


Sample: 100%|██████████| 330/330 [00:12, 26.78it/s, step size=3.72e-01, acc. prob=0.919]


[outer 025] TRAIN (EMA+K-ens) ll=0.6477  br=0.2275  acc=0.6700


Sample: 100%|██████████| 330/330 [00:12, 25.47it/s, step size=3.94e-01, acc. prob=0.930]


[outer 026] TRAIN (EMA+K-ens) ll=0.6452  br=0.2263  acc=0.6810


Sample: 100%|██████████| 330/330 [00:11, 29.39it/s, step size=3.55e-01, acc. prob=0.933]


[outer 027] TRAIN (EMA+K-ens) ll=0.6448  br=0.2261  acc=0.6950


Sample: 100%|██████████| 330/330 [00:12, 27.04it/s, step size=3.48e-01, acc. prob=0.928]


[outer 028] TRAIN (EMA+K-ens) ll=0.6490  br=0.2280  acc=0.6970


Sample: 100%|██████████| 330/330 [00:12, 26.58it/s, step size=4.01e-01, acc. prob=0.919]


[outer 029] TRAIN (EMA+K-ens) ll=0.6471  br=0.2271  acc=0.7080


Sample: 100%|██████████| 330/330 [00:12, 27.48it/s, step size=4.63e-01, acc. prob=0.880]


[outer 030] TRAIN (EMA+K-ens) ll=0.6429  br=0.2251  acc=0.6970


Sample: 100%|██████████| 330/330 [00:11, 29.01it/s, step size=4.20e-01, acc. prob=0.920]


[outer 031] TRAIN (EMA+K-ens) ll=0.6429  br=0.2251  acc=0.7090


Sample: 100%|██████████| 330/330 [00:11, 27.51it/s, step size=3.80e-01, acc. prob=0.935]


[outer 032] TRAIN (EMA+K-ens) ll=0.6449  br=0.2260  acc=0.7160


Sample: 100%|██████████| 330/330 [00:11, 29.15it/s, step size=3.91e-01, acc. prob=0.922]


[outer 033] TRAIN (EMA+K-ens) ll=0.6489  br=0.2280  acc=0.7120


Sample: 100%|██████████| 330/330 [00:11, 29.02it/s, step size=4.00e-01, acc. prob=0.926]


[outer 034] TRAIN (EMA+K-ens) ll=0.6489  br=0.2280  acc=0.7100


Sample: 100%|██████████| 330/330 [00:11, 29.05it/s, step size=3.98e-01, acc. prob=0.903]


[outer 035] TRAIN (EMA+K-ens) ll=0.6462  br=0.2266  acc=0.7020


Sample: 100%|██████████| 330/330 [00:11, 28.60it/s, step size=4.04e-01, acc. prob=0.908]


[outer 036] TRAIN (EMA+K-ens) ll=0.6495  br=0.2282  acc=0.7010


Sample: 100%|██████████| 330/330 [00:11, 28.60it/s, step size=3.94e-01, acc. prob=0.929]


[outer 037] TRAIN (EMA+K-ens) ll=0.6479  br=0.2274  acc=0.7050


Sample: 100%|██████████| 330/330 [00:11, 27.53it/s, step size=3.78e-01, acc. prob=0.950]


[outer 038] TRAIN (EMA+K-ens) ll=0.6501  br=0.2285  acc=0.7090


Sample: 100%|██████████| 330/330 [00:11, 29.47it/s, step size=3.86e-01, acc. prob=0.921]


[outer 039] TRAIN (EMA+K-ens) ll=0.6502  br=0.2286  acc=0.7080
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}]


Sample: 100%|██████████| 330/330 [00:11, 28.72it/s, step size=3.66e-01, acc. prob=0.931]


[outer 000] TRAIN (EMA+K-ens) ll=0.6826  br=0.2444  acc=0.5830


Sample: 100%|██████████| 330/330 [00:12, 26.18it/s, step size=4.27e-01, acc. prob=0.919]


[outer 001] TRAIN (EMA+K-ens) ll=0.6772  br=0.2420  acc=0.6400


Sample: 100%|██████████| 330/330 [00:11, 27.57it/s, step size=3.96e-01, acc. prob=0.918]


[outer 002] TRAIN (EMA+K-ens) ll=0.6678  br=0.2374  acc=0.6410


Sample: 100%|██████████| 330/330 [00:11, 28.71it/s, step size=3.87e-01, acc. prob=0.909]


[outer 003] TRAIN (EMA+K-ens) ll=0.6675  br=0.2372  acc=0.6600


Sample: 100%|██████████| 330/330 [00:12, 25.96it/s, step size=3.54e-01, acc. prob=0.952]


[outer 004] TRAIN (EMA+K-ens) ll=0.6630  br=0.2350  acc=0.6700


Sample: 100%|██████████| 330/330 [00:12, 27.26it/s, step size=4.52e-01, acc. prob=0.904]


[outer 005] TRAIN (EMA+K-ens) ll=0.6578  br=0.2324  acc=0.6840


Sample: 100%|██████████| 330/330 [00:11, 28.67it/s, step size=3.44e-01, acc. prob=0.937]


[outer 006] TRAIN (EMA+K-ens) ll=0.6580  br=0.2325  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.13it/s, step size=4.64e-01, acc. prob=0.886]


[outer 007] TRAIN (EMA+K-ens) ll=0.6568  br=0.2319  acc=0.6970


Sample: 100%|██████████| 330/330 [00:11, 27.92it/s, step size=3.88e-01, acc. prob=0.908]


[outer 008] TRAIN (EMA+K-ens) ll=0.6539  br=0.2305  acc=0.6980


Sample: 100%|██████████| 330/330 [00:12, 27.10it/s, step size=3.78e-01, acc. prob=0.948]


[outer 009] TRAIN (EMA+K-ens) ll=0.6504  br=0.2287  acc=0.7020


Sample: 100%|██████████| 330/330 [00:11, 27.79it/s, step size=3.86e-01, acc. prob=0.941]


[outer 010] TRAIN (EMA+K-ens) ll=0.6497  br=0.2284  acc=0.7030


Sample: 100%|██████████| 330/330 [00:11, 29.79it/s, step size=3.57e-01, acc. prob=0.928]


[outer 011] TRAIN (EMA+K-ens) ll=0.6480  br=0.2276  acc=0.6950


Sample: 100%|██████████| 330/330 [00:10, 30.49it/s, step size=3.76e-01, acc. prob=0.935]


[outer 012] TRAIN (EMA+K-ens) ll=0.6500  br=0.2285  acc=0.6950


Sample: 100%|██████████| 330/330 [00:12, 27.25it/s, step size=3.99e-01, acc. prob=0.918]


[outer 013] TRAIN (EMA+K-ens) ll=0.6516  br=0.2293  acc=0.7040


Sample: 100%|██████████| 330/330 [00:11, 27.53it/s, step size=3.33e-01, acc. prob=0.952]


[outer 014] TRAIN (EMA+K-ens) ll=0.6499  br=0.2285  acc=0.7000


Sample: 100%|██████████| 330/330 [00:11, 27.53it/s, step size=3.69e-01, acc. prob=0.925]


[outer 015] TRAIN (EMA+K-ens) ll=0.6511  br=0.2291  acc=0.7050


Sample: 100%|██████████| 330/330 [00:12, 26.44it/s, step size=3.94e-01, acc. prob=0.944]


[outer 016] TRAIN (EMA+K-ens) ll=0.6501  br=0.2286  acc=0.7030


Sample: 100%|██████████| 330/330 [00:12, 27.22it/s, step size=3.92e-01, acc. prob=0.920]


[outer 017] TRAIN (EMA+K-ens) ll=0.6514  br=0.2292  acc=0.7020


Sample: 100%|██████████| 330/330 [00:11, 28.00it/s, step size=3.79e-01, acc. prob=0.931]


[outer 018] TRAIN (EMA+K-ens) ll=0.6555  br=0.2312  acc=0.7010


Sample: 100%|██████████| 330/330 [00:12, 27.25it/s, step size=3.50e-01, acc. prob=0.940]


[outer 019] TRAIN (EMA+K-ens) ll=0.6544  br=0.2307  acc=0.6990
[Early stop @ outer 19] Δll=0.063%, Δbr=0.100%, Δacc=0.004
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}, {'accuracy': 0.6753599643707275, 'brier': 0.22917573153972626, 'logloss': 0.6510149836540222}]


Sample: 100%|██████████| 330/330 [00:12, 25.55it/s, step size=3.41e-01, acc. prob=0.953]


[outer 000] TRAIN (EMA+K-ens) ll=0.6639  br=0.2354  acc=0.6320


Sample: 100%|██████████| 330/330 [00:11, 28.65it/s, step size=4.27e-01, acc. prob=0.922]


[outer 001] TRAIN (EMA+K-ens) ll=0.6700  br=0.2385  acc=0.5890


Sample: 100%|██████████| 330/330 [00:11, 28.21it/s, step size=3.75e-01, acc. prob=0.948]


[outer 002] TRAIN (EMA+K-ens) ll=0.6745  br=0.2407  acc=0.6320


Sample: 100%|██████████| 330/330 [00:11, 28.40it/s, step size=4.52e-01, acc. prob=0.903]


[outer 003] TRAIN (EMA+K-ens) ll=0.6677  br=0.2373  acc=0.6590


Sample: 100%|██████████| 330/330 [00:11, 28.69it/s, step size=3.76e-01, acc. prob=0.935]


[outer 004] TRAIN (EMA+K-ens) ll=0.6645  br=0.2357  acc=0.6750


Sample: 100%|██████████| 330/330 [00:11, 29.80it/s, step size=4.21e-01, acc. prob=0.917]


[outer 005] TRAIN (EMA+K-ens) ll=0.6616  br=0.2343  acc=0.6860


Sample: 100%|██████████| 330/330 [00:10, 30.29it/s, step size=3.78e-01, acc. prob=0.931]


[outer 006] TRAIN (EMA+K-ens) ll=0.6633  br=0.2351  acc=0.6840


Sample: 100%|██████████| 330/330 [00:12, 26.73it/s, step size=4.04e-01, acc. prob=0.927]


[outer 007] TRAIN (EMA+K-ens) ll=0.6630  br=0.2349  acc=0.6750


Sample: 100%|██████████| 330/330 [00:11, 28.49it/s, step size=3.35e-01, acc. prob=0.950]


[outer 008] TRAIN (EMA+K-ens) ll=0.6660  br=0.2364  acc=0.6720


Sample: 100%|██████████| 330/330 [00:12, 25.98it/s, step size=3.47e-01, acc. prob=0.935]


[outer 009] TRAIN (EMA+K-ens) ll=0.6621  br=0.2345  acc=0.6710


Sample: 100%|██████████| 330/330 [00:11, 29.94it/s, step size=4.18e-01, acc. prob=0.914]


[outer 010] TRAIN (EMA+K-ens) ll=0.6601  br=0.2335  acc=0.6770


Sample: 100%|██████████| 330/330 [00:11, 27.83it/s, step size=3.98e-01, acc. prob=0.936]


[outer 011] TRAIN (EMA+K-ens) ll=0.6604  br=0.2337  acc=0.6820


Sample: 100%|██████████| 330/330 [00:11, 27.94it/s, step size=3.56e-01, acc. prob=0.948]


[outer 012] TRAIN (EMA+K-ens) ll=0.6599  br=0.2334  acc=0.6790


Sample: 100%|██████████| 330/330 [00:12, 27.33it/s, step size=4.12e-01, acc. prob=0.920]


[outer 013] TRAIN (EMA+K-ens) ll=0.6600  br=0.2335  acc=0.6890


Sample: 100%|██████████| 330/330 [00:11, 29.14it/s, step size=3.75e-01, acc. prob=0.932]


[outer 014] TRAIN (EMA+K-ens) ll=0.6597  br=0.2333  acc=0.6880


Sample: 100%|██████████| 330/330 [00:11, 28.65it/s, step size=3.53e-01, acc. prob=0.948]


[outer 015] TRAIN (EMA+K-ens) ll=0.6603  br=0.2336  acc=0.6880


Sample: 100%|██████████| 330/330 [00:12, 27.02it/s, step size=3.70e-01, acc. prob=0.941]


[outer 016] TRAIN (EMA+K-ens) ll=0.6591  br=0.2330  acc=0.6890


Sample: 100%|██████████| 330/330 [00:11, 29.28it/s, step size=4.11e-01, acc. prob=0.932]


[outer 017] TRAIN (EMA+K-ens) ll=0.6600  br=0.2334  acc=0.6920


Sample: 100%|██████████| 330/330 [00:11, 27.71it/s, step size=4.09e-01, acc. prob=0.924]


[outer 018] TRAIN (EMA+K-ens) ll=0.6606  br=0.2337  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 28.25it/s, step size=3.92e-01, acc. prob=0.931]


[outer 019] TRAIN (EMA+K-ens) ll=0.6645  br=0.2356  acc=0.6820


Sample: 100%|██████████| 330/330 [00:12, 26.89it/s, step size=3.82e-01, acc. prob=0.920]


[outer 020] TRAIN (EMA+K-ens) ll=0.6672  br=0.2370  acc=0.6780


Sample: 100%|██████████| 330/330 [00:10, 30.96it/s, step size=4.62e-01, acc. prob=0.867]


[outer 021] TRAIN (EMA+K-ens) ll=0.6693  br=0.2380  acc=0.6750


Sample: 100%|██████████| 330/330 [00:11, 28.93it/s, step size=3.63e-01, acc. prob=0.923]


[outer 022] TRAIN (EMA+K-ens) ll=0.6687  br=0.2377  acc=0.6760


Sample: 100%|██████████| 330/330 [00:10, 30.58it/s, step size=4.03e-01, acc. prob=0.921]


[outer 023] TRAIN (EMA+K-ens) ll=0.6708  br=0.2388  acc=0.6640


Sample: 100%|██████████| 330/330 [00:11, 28.32it/s, step size=4.44e-01, acc. prob=0.900]


[outer 024] TRAIN (EMA+K-ens) ll=0.6691  br=0.2379  acc=0.6670


Sample: 100%|██████████| 330/330 [00:11, 29.33it/s, step size=4.09e-01, acc. prob=0.885]


[outer 025] TRAIN (EMA+K-ens) ll=0.6737  br=0.2403  acc=0.6450


Sample: 100%|██████████| 330/330 [00:10, 31.40it/s, step size=4.35e-01, acc. prob=0.890]


[outer 026] TRAIN (EMA+K-ens) ll=0.6725  br=0.2396  acc=0.6760


Sample: 100%|██████████| 330/330 [00:11, 28.25it/s, step size=3.89e-01, acc. prob=0.932]


[outer 027] TRAIN (EMA+K-ens) ll=0.6744  br=0.2406  acc=0.6680


Sample: 100%|██████████| 330/330 [00:11, 28.63it/s, step size=4.78e-01, acc. prob=0.852]


[outer 028] TRAIN (EMA+K-ens) ll=0.6742  br=0.2404  acc=0.6640


Sample: 100%|██████████| 330/330 [00:12, 26.98it/s, step size=4.19e-01, acc. prob=0.915]


[outer 029] TRAIN (EMA+K-ens) ll=0.6740  br=0.2403  acc=0.6490


Sample: 100%|██████████| 330/330 [00:11, 27.72it/s, step size=4.09e-01, acc. prob=0.915]


[outer 030] TRAIN (EMA+K-ens) ll=0.6798  br=0.2432  acc=0.6560


Sample: 100%|██████████| 330/330 [00:10, 31.10it/s, step size=4.22e-01, acc. prob=0.926]


[outer 031] TRAIN (EMA+K-ens) ll=0.6748  br=0.2407  acc=0.6360


Sample: 100%|██████████| 330/330 [00:11, 27.60it/s, step size=3.92e-01, acc. prob=0.920]


[outer 032] TRAIN (EMA+K-ens) ll=0.6741  br=0.2404  acc=0.6430


Sample: 100%|██████████| 330/330 [00:11, 27.52it/s, step size=3.92e-01, acc. prob=0.937]


[outer 033] TRAIN (EMA+K-ens) ll=0.6722  br=0.2394  acc=0.6580


Sample: 100%|██████████| 330/330 [00:10, 31.30it/s, step size=4.15e-01, acc. prob=0.927]


[outer 034] TRAIN (EMA+K-ens) ll=0.6664  br=0.2365  acc=0.6780


Sample: 100%|██████████| 330/330 [00:11, 28.44it/s, step size=4.19e-01, acc. prob=0.922]


[outer 035] TRAIN (EMA+K-ens) ll=0.6610  br=0.2339  acc=0.6850


Sample: 100%|██████████| 330/330 [00:11, 29.80it/s, step size=4.20e-01, acc. prob=0.929]


[outer 036] TRAIN (EMA+K-ens) ll=0.6604  br=0.2336  acc=0.6840


Sample: 100%|██████████| 330/330 [00:11, 28.84it/s, step size=3.95e-01, acc. prob=0.917]


[outer 037] TRAIN (EMA+K-ens) ll=0.6603  br=0.2336  acc=0.6880


Sample: 100%|██████████| 330/330 [00:12, 26.49it/s, step size=3.91e-01, acc. prob=0.929]


[outer 038] TRAIN (EMA+K-ens) ll=0.6539  br=0.2305  acc=0.6880


Sample: 100%|██████████| 330/330 [00:12, 26.45it/s, step size=3.98e-01, acc. prob=0.926]


[outer 039] TRAIN (EMA+K-ens) ll=0.6572  br=0.2321  acc=0.6860
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}, {'accuracy': 0.6753599643707275, 'brier': 0.22917573153972626, 'logloss': 0.6510149836540222}, {'accuracy': 0.6357199549674988, 'brier': 0.2415611296892166, 'logloss': 0.6763410568237305}]


Sample: 100%|██████████| 330/330 [00:11, 28.29it/s, step size=4.09e-01, acc. prob=0.884]


[outer 000] TRAIN (EMA+K-ens) ll=0.7043  br=0.2549  acc=0.4970


Sample: 100%|██████████| 330/330 [00:11, 28.53it/s, step size=4.25e-01, acc. prob=0.918]


[outer 001] TRAIN (EMA+K-ens) ll=0.6947  br=0.2501  acc=0.5540


Sample: 100%|██████████| 330/330 [00:12, 26.89it/s, step size=3.75e-01, acc. prob=0.913]


[outer 002] TRAIN (EMA+K-ens) ll=0.6807  br=0.2436  acc=0.5950


Sample: 100%|██████████| 330/330 [00:11, 27.82it/s, step size=4.61e-01, acc. prob=0.881]


[outer 003] TRAIN (EMA+K-ens) ll=0.6754  br=0.2411  acc=0.6070


Sample: 100%|██████████| 330/330 [00:11, 29.21it/s, step size=4.25e-01, acc. prob=0.935]


[outer 004] TRAIN (EMA+K-ens) ll=0.6734  br=0.2401  acc=0.6130


Sample: 100%|██████████| 330/330 [00:12, 26.57it/s, step size=4.64e-01, acc. prob=0.877]


[outer 005] TRAIN (EMA+K-ens) ll=0.6706  br=0.2387  acc=0.6300


Sample: 100%|██████████| 330/330 [00:11, 29.55it/s, step size=3.59e-01, acc. prob=0.941]


[outer 006] TRAIN (EMA+K-ens) ll=0.6703  br=0.2385  acc=0.6550


Sample: 100%|██████████| 330/330 [00:11, 28.02it/s, step size=4.05e-01, acc. prob=0.925]


[outer 007] TRAIN (EMA+K-ens) ll=0.6684  br=0.2376  acc=0.6780


Sample: 100%|██████████| 330/330 [00:12, 27.12it/s, step size=4.13e-01, acc. prob=0.939]


[outer 008] TRAIN (EMA+K-ens) ll=0.6684  br=0.2376  acc=0.6720


Sample: 100%|██████████| 330/330 [00:12, 27.20it/s, step size=4.15e-01, acc. prob=0.907]


[outer 009] TRAIN (EMA+K-ens) ll=0.6660  br=0.2364  acc=0.6790


Sample: 100%|██████████| 330/330 [00:12, 27.47it/s, step size=3.41e-01, acc. prob=0.949]


[outer 010] TRAIN (EMA+K-ens) ll=0.6665  br=0.2366  acc=0.6840


Sample: 100%|██████████| 330/330 [00:11, 28.45it/s, step size=3.74e-01, acc. prob=0.938]


[outer 011] TRAIN (EMA+K-ens) ll=0.6663  br=0.2366  acc=0.6810


Sample: 100%|██████████| 330/330 [00:12, 26.95it/s, step size=4.07e-01, acc. prob=0.905]


[outer 012] TRAIN (EMA+K-ens) ll=0.6687  br=0.2378  acc=0.6760


Sample: 100%|██████████| 330/330 [00:12, 26.16it/s, step size=3.70e-01, acc. prob=0.925]


[outer 013] TRAIN (EMA+K-ens) ll=0.6685  br=0.2377  acc=0.6690


Sample: 100%|██████████| 330/330 [00:11, 28.08it/s, step size=3.40e-01, acc. prob=0.958]


[outer 014] TRAIN (EMA+K-ens) ll=0.6685  br=0.2377  acc=0.6590


Sample: 100%|██████████| 330/330 [00:12, 27.43it/s, step size=4.06e-01, acc. prob=0.931]


[outer 015] TRAIN (EMA+K-ens) ll=0.6703  br=0.2386  acc=0.6530


Sample: 100%|██████████| 330/330 [00:10, 30.94it/s, step size=4.39e-01, acc. prob=0.906]


[outer 016] TRAIN (EMA+K-ens) ll=0.6677  br=0.2373  acc=0.6690


Sample: 100%|██████████| 330/330 [00:11, 28.52it/s, step size=3.96e-01, acc. prob=0.931]


[outer 017] TRAIN (EMA+K-ens) ll=0.6681  br=0.2375  acc=0.6700


Sample: 100%|██████████| 330/330 [00:11, 27.88it/s, step size=3.65e-01, acc. prob=0.927]


[outer 018] TRAIN (EMA+K-ens) ll=0.6683  br=0.2375  acc=0.6790


Sample: 100%|██████████| 330/330 [00:11, 28.20it/s, step size=4.32e-01, acc. prob=0.910]


[outer 019] TRAIN (EMA+K-ens) ll=0.6669  br=0.2369  acc=0.6830


Sample: 100%|██████████| 330/330 [00:11, 29.66it/s, step size=4.38e-01, acc. prob=0.935]


[outer 020] TRAIN (EMA+K-ens) ll=0.6675  br=0.2371  acc=0.6800


Sample: 100%|██████████| 330/330 [00:10, 30.12it/s, step size=4.05e-01, acc. prob=0.920]


[outer 021] TRAIN (EMA+K-ens) ll=0.6694  br=0.2381  acc=0.6780


Sample: 100%|██████████| 330/330 [00:10, 30.53it/s, step size=4.24e-01, acc. prob=0.916]


[outer 022] TRAIN (EMA+K-ens) ll=0.6691  br=0.2378  acc=0.6820


Sample: 100%|██████████| 330/330 [00:12, 25.58it/s, step size=4.13e-01, acc. prob=0.886]


[outer 023] TRAIN (EMA+K-ens) ll=0.6655  br=0.2361  acc=0.6820


Sample: 100%|██████████| 330/330 [00:12, 27.12it/s, step size=3.20e-01, acc. prob=0.945]


[outer 024] TRAIN (EMA+K-ens) ll=0.6639  br=0.2354  acc=0.6840


Sample: 100%|██████████| 330/330 [00:12, 25.80it/s, step size=3.98e-01, acc. prob=0.947]


[outer 025] TRAIN (EMA+K-ens) ll=0.6648  br=0.2358  acc=0.6790


Sample: 100%|██████████| 330/330 [00:12, 27.34it/s, step size=3.53e-01, acc. prob=0.942]


[outer 026] TRAIN (EMA+K-ens) ll=0.6648  br=0.2359  acc=0.6660


Sample: 100%|██████████| 330/330 [00:11, 29.46it/s, step size=4.64e-01, acc. prob=0.911]


[outer 027] TRAIN (EMA+K-ens) ll=0.6647  br=0.2358  acc=0.6650


Sample: 100%|██████████| 330/330 [00:11, 28.43it/s, step size=4.03e-01, acc. prob=0.933]


[outer 028] TRAIN (EMA+K-ens) ll=0.6599  br=0.2334  acc=0.6770


Sample: 100%|██████████| 330/330 [00:12, 27.36it/s, step size=4.22e-01, acc. prob=0.932]


[outer 029] TRAIN (EMA+K-ens) ll=0.6617  br=0.2342  acc=0.6840


Sample: 100%|██████████| 330/330 [00:12, 25.66it/s, step size=4.03e-01, acc. prob=0.921]


[outer 030] TRAIN (EMA+K-ens) ll=0.6626  br=0.2347  acc=0.6860


Sample: 100%|██████████| 330/330 [00:10, 30.96it/s, step size=3.94e-01, acc. prob=0.941]


[outer 031] TRAIN (EMA+K-ens) ll=0.6644  br=0.2356  acc=0.6820


Sample: 100%|██████████| 330/330 [00:11, 27.88it/s, step size=4.40e-01, acc. prob=0.903]


[outer 032] TRAIN (EMA+K-ens) ll=0.6648  br=0.2358  acc=0.6780


Sample: 100%|██████████| 330/330 [00:11, 29.22it/s, step size=5.06e-01, acc. prob=0.871]


[outer 033] TRAIN (EMA+K-ens) ll=0.6630  br=0.2350  acc=0.6770


Sample: 100%|██████████| 330/330 [00:10, 30.12it/s, step size=3.92e-01, acc. prob=0.929]


[outer 034] TRAIN (EMA+K-ens) ll=0.6647  br=0.2358  acc=0.6780


Sample: 100%|██████████| 330/330 [00:11, 28.81it/s, step size=4.33e-01, acc. prob=0.907]


[outer 035] TRAIN (EMA+K-ens) ll=0.6668  br=0.2368  acc=0.6750
[Early stop @ outer 35] Δll=0.102%, Δbr=0.140%, Δacc=0.004
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}, {'accuracy': 0.6753599643707275, 'brier': 0.22917573153972626, 'logloss': 0.6510149836540222}, {'accuracy': 0.6357199549674988, 'brier': 0.2415611296892166, 'logloss': 0.6763410568237305}, {'accuracy': 0.6503399610519409, 'brier': 0.23466165363788605, 'logloss': 0.6621001362800598}]


Sample: 100%|██████████| 330/330 [00:11, 28.59it/s, step size=4.22e-01, acc. prob=0.886]


[outer 000] TRAIN (EMA+K-ens) ll=0.7164  br=0.2606  acc=0.5330


Sample: 100%|██████████| 330/330 [00:11, 27.69it/s, step size=3.76e-01, acc. prob=0.925]


[outer 001] TRAIN (EMA+K-ens) ll=0.6909  br=0.2486  acc=0.5660


Sample: 100%|██████████| 330/330 [00:11, 28.56it/s, step size=3.67e-01, acc. prob=0.930]


[outer 002] TRAIN (EMA+K-ens) ll=0.6828  br=0.2445  acc=0.6300


Sample: 100%|██████████| 330/330 [00:11, 28.43it/s, step size=4.04e-01, acc. prob=0.937]


[outer 003] TRAIN (EMA+K-ens) ll=0.6788  br=0.2426  acc=0.6230


Sample: 100%|██████████| 330/330 [00:12, 25.54it/s, step size=3.98e-01, acc. prob=0.930]


[outer 004] TRAIN (EMA+K-ens) ll=0.6751  br=0.2407  acc=0.6430


Sample: 100%|██████████| 330/330 [00:12, 27.42it/s, step size=3.81e-01, acc. prob=0.928]


[outer 005] TRAIN (EMA+K-ens) ll=0.6718  br=0.2392  acc=0.6590


Sample: 100%|██████████| 330/330 [00:11, 29.97it/s, step size=3.75e-01, acc. prob=0.917]


[outer 006] TRAIN (EMA+K-ens) ll=0.6719  br=0.2393  acc=0.6680


Sample: 100%|██████████| 330/330 [00:11, 29.87it/s, step size=3.37e-01, acc. prob=0.924]


[outer 007] TRAIN (EMA+K-ens) ll=0.6694  br=0.2381  acc=0.6820


Sample: 100%|██████████| 330/330 [00:11, 27.86it/s, step size=4.36e-01, acc. prob=0.901]


[outer 008] TRAIN (EMA+K-ens) ll=0.6666  br=0.2367  acc=0.6850


Sample: 100%|██████████| 330/330 [00:12, 27.19it/s, step size=4.44e-01, acc. prob=0.891]


[outer 009] TRAIN (EMA+K-ens) ll=0.6675  br=0.2371  acc=0.6840


Sample: 100%|██████████| 330/330 [00:12, 26.21it/s, step size=3.84e-01, acc. prob=0.918]


[outer 010] TRAIN (EMA+K-ens) ll=0.6681  br=0.2374  acc=0.6820


Sample: 100%|██████████| 330/330 [00:11, 28.33it/s, step size=3.64e-01, acc. prob=0.938]


[outer 011] TRAIN (EMA+K-ens) ll=0.6706  br=0.2387  acc=0.6810


Sample: 100%|██████████| 330/330 [00:11, 28.60it/s, step size=3.87e-01, acc. prob=0.935]


[outer 012] TRAIN (EMA+K-ens) ll=0.6737  br=0.2402  acc=0.6520


Sample: 100%|██████████| 330/330 [00:12, 27.47it/s, step size=3.80e-01, acc. prob=0.927]


[outer 013] TRAIN (EMA+K-ens) ll=0.6765  br=0.2416  acc=0.6480


Sample: 100%|██████████| 330/330 [00:12, 27.19it/s, step size=3.71e-01, acc. prob=0.954]


[outer 014] TRAIN (EMA+K-ens) ll=0.6779  br=0.2422  acc=0.6550


Sample: 100%|██████████| 330/330 [00:11, 29.28it/s, step size=4.04e-01, acc. prob=0.931]


[outer 015] TRAIN (EMA+K-ens) ll=0.6792  br=0.2428  acc=0.6390


Sample: 100%|██████████| 330/330 [00:11, 28.02it/s, step size=3.34e-01, acc. prob=0.945]


[outer 016] TRAIN (EMA+K-ens) ll=0.6765  br=0.2415  acc=0.6390


Sample: 100%|██████████| 330/330 [00:11, 27.86it/s, step size=3.84e-01, acc. prob=0.930]


[outer 017] TRAIN (EMA+K-ens) ll=0.6771  br=0.2418  acc=0.6430


Sample: 100%|██████████| 330/330 [00:11, 27.89it/s, step size=3.87e-01, acc. prob=0.933]


[outer 018] TRAIN (EMA+K-ens) ll=0.6731  br=0.2398  acc=0.6600


Sample: 100%|██████████| 330/330 [00:11, 28.03it/s, step size=3.87e-01, acc. prob=0.936]


[outer 019] TRAIN (EMA+K-ens) ll=0.6696  br=0.2382  acc=0.6680


Sample: 100%|██████████| 330/330 [00:12, 27.30it/s, step size=3.47e-01, acc. prob=0.931]


[outer 020] TRAIN (EMA+K-ens) ll=0.6670  br=0.2369  acc=0.6710


Sample: 100%|██████████| 330/330 [00:12, 26.68it/s, step size=4.00e-01, acc. prob=0.929]


[outer 021] TRAIN (EMA+K-ens) ll=0.6658  br=0.2363  acc=0.6800


Sample: 100%|██████████| 330/330 [00:11, 28.81it/s, step size=4.02e-01, acc. prob=0.934]


[outer 022] TRAIN (EMA+K-ens) ll=0.6625  br=0.2347  acc=0.6860


Sample: 100%|██████████| 330/330 [00:11, 28.64it/s, step size=3.89e-01, acc. prob=0.948]


[outer 023] TRAIN (EMA+K-ens) ll=0.6612  br=0.2341  acc=0.6810


Sample: 100%|██████████| 330/330 [00:11, 28.85it/s, step size=4.06e-01, acc. prob=0.919]


[outer 024] TRAIN (EMA+K-ens) ll=0.6626  br=0.2348  acc=0.6800


Sample: 100%|██████████| 330/330 [00:11, 29.96it/s, step size=3.99e-01, acc. prob=0.944]


[outer 025] TRAIN (EMA+K-ens) ll=0.6615  br=0.2342  acc=0.6830


Sample: 100%|██████████| 330/330 [00:11, 28.27it/s, step size=4.07e-01, acc. prob=0.896]


[outer 026] TRAIN (EMA+K-ens) ll=0.6634  br=0.2352  acc=0.6820


Sample: 100%|██████████| 330/330 [00:12, 26.53it/s, step size=3.53e-01, acc. prob=0.941]


[outer 027] TRAIN (EMA+K-ens) ll=0.6633  br=0.2351  acc=0.6840


Sample: 100%|██████████| 330/330 [00:12, 26.96it/s, step size=3.73e-01, acc. prob=0.949]


[outer 028] TRAIN (EMA+K-ens) ll=0.6658  br=0.2363  acc=0.6860


Sample: 100%|██████████| 330/330 [00:12, 26.93it/s, step size=3.52e-01, acc. prob=0.947]


[outer 029] TRAIN (EMA+K-ens) ll=0.6648  br=0.2358  acc=0.6870


Sample: 100%|██████████| 330/330 [00:11, 27.71it/s, step size=4.07e-01, acc. prob=0.916]


[outer 030] TRAIN (EMA+K-ens) ll=0.6664  br=0.2366  acc=0.6740


Sample: 100%|██████████| 330/330 [00:12, 26.51it/s, step size=2.97e-01, acc. prob=0.959]


[outer 031] TRAIN (EMA+K-ens) ll=0.6674  br=0.2371  acc=0.6700


Sample: 100%|██████████| 330/330 [00:12, 25.71it/s, step size=3.80e-01, acc. prob=0.918]


[outer 032] TRAIN (EMA+K-ens) ll=0.6694  br=0.2381  acc=0.6620


Sample: 100%|██████████| 330/330 [00:11, 28.03it/s, step size=3.44e-01, acc. prob=0.931]


[outer 033] TRAIN (EMA+K-ens) ll=0.6692  br=0.2380  acc=0.6640


Sample: 100%|██████████| 330/330 [00:12, 25.40it/s, step size=3.20e-01, acc. prob=0.931]


[outer 034] TRAIN (EMA+K-ens) ll=0.6703  br=0.2386  acc=0.6650


Sample: 100%|██████████| 330/330 [00:11, 28.69it/s, step size=4.22e-01, acc. prob=0.923]


[outer 035] TRAIN (EMA+K-ens) ll=0.6726  br=0.2397  acc=0.6700


Sample: 100%|██████████| 330/330 [00:11, 28.41it/s, step size=4.17e-01, acc. prob=0.928]


[outer 036] TRAIN (EMA+K-ens) ll=0.6696  br=0.2382  acc=0.6810


Sample: 100%|██████████| 330/330 [00:11, 29.20it/s, step size=4.30e-01, acc. prob=0.927]


[outer 037] TRAIN (EMA+K-ens) ll=0.6709  br=0.2388  acc=0.6840


Sample: 100%|██████████| 330/330 [00:11, 28.62it/s, step size=3.79e-01, acc. prob=0.935]


[outer 038] TRAIN (EMA+K-ens) ll=0.6651  br=0.2360  acc=0.6850


Sample: 100%|██████████| 330/330 [00:11, 29.27it/s, step size=4.03e-01, acc. prob=0.938]


[outer 039] TRAIN (EMA+K-ens) ll=0.6677  br=0.2372  acc=0.6830
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}, {'accuracy': 0.6753599643707275, 'brier': 0.22917573153972626, 'logloss': 0.6510149836540222}, {'accuracy': 0.6357199549674988, 'brier': 0.2415611296892166, 'logloss': 0.6763410568237305}, {'accuracy': 0.6503399610519409, 'brier': 0.23466165363788605, 'logloss': 0.6621001362800598}, {'accuracy': 0.611840009689331, 'brier': 0.24080230295658112, 'logloss': 0.6748088598251343}]


Sample: 100%|██████████| 330/330 [00:11, 27.62it/s, step size=3.70e-01, acc. prob=0.940]


[outer 000] TRAIN (EMA+K-ens) ll=0.6972  br=0.2519  acc=0.5430


Sample: 100%|██████████| 330/330 [00:10, 30.26it/s, step size=4.32e-01, acc. prob=0.923]


[outer 001] TRAIN (EMA+K-ens) ll=0.6863  br=0.2465  acc=0.5480


Sample: 100%|██████████| 330/330 [00:12, 27.04it/s, step size=4.03e-01, acc. prob=0.931]


[outer 002] TRAIN (EMA+K-ens) ll=0.6696  br=0.2382  acc=0.6390


Sample: 100%|██████████| 330/330 [00:11, 28.10it/s, step size=4.59e-01, acc. prob=0.870]


[outer 003] TRAIN (EMA+K-ens) ll=0.6675  br=0.2372  acc=0.6440


Sample: 100%|██████████| 330/330 [00:11, 28.70it/s, step size=4.42e-01, acc. prob=0.913]


[outer 004] TRAIN (EMA+K-ens) ll=0.6688  br=0.2378  acc=0.6850


Sample: 100%|██████████| 330/330 [00:12, 27.24it/s, step size=3.38e-01, acc. prob=0.950]


[outer 005] TRAIN (EMA+K-ens) ll=0.6662  br=0.2365  acc=0.6970


Sample: 100%|██████████| 330/330 [00:11, 27.80it/s, step size=4.28e-01, acc. prob=0.915]


[outer 006] TRAIN (EMA+K-ens) ll=0.6662  br=0.2365  acc=0.6980


Sample: 100%|██████████| 330/330 [00:11, 27.77it/s, step size=3.48e-01, acc. prob=0.942]


[outer 007] TRAIN (EMA+K-ens) ll=0.6678  br=0.2373  acc=0.6980


Sample: 100%|██████████| 330/330 [00:11, 29.73it/s, step size=4.49e-01, acc. prob=0.870]


[outer 008] TRAIN (EMA+K-ens) ll=0.6615  br=0.2341  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 25.19it/s, step size=3.59e-01, acc. prob=0.932]


[outer 009] TRAIN (EMA+K-ens) ll=0.6596  br=0.2332  acc=0.6970


Sample: 100%|██████████| 330/330 [00:11, 29.19it/s, step size=4.15e-01, acc. prob=0.916]


[outer 010] TRAIN (EMA+K-ens) ll=0.6661  br=0.2364  acc=0.6960


Sample: 100%|██████████| 330/330 [00:11, 29.79it/s, step size=4.44e-01, acc. prob=0.893]


[outer 011] TRAIN (EMA+K-ens) ll=0.6660  br=0.2363  acc=0.6960


Sample: 100%|██████████| 330/330 [00:12, 26.54it/s, step size=3.73e-01, acc. prob=0.917]


[outer 012] TRAIN (EMA+K-ens) ll=0.6643  br=0.2355  acc=0.6960


Sample: 100%|██████████| 330/330 [00:12, 27.28it/s, step size=3.70e-01, acc. prob=0.940]


[outer 013] TRAIN (EMA+K-ens) ll=0.6657  br=0.2361  acc=0.6980


Sample: 100%|██████████| 330/330 [00:11, 29.57it/s, step size=4.02e-01, acc. prob=0.916]


[outer 014] TRAIN (EMA+K-ens) ll=0.6640  br=0.2354  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 28.38it/s, step size=3.94e-01, acc. prob=0.934]


[outer 015] TRAIN (EMA+K-ens) ll=0.6657  br=0.2362  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 28.23it/s, step size=4.31e-01, acc. prob=0.897]


[outer 016] TRAIN (EMA+K-ens) ll=0.6683  br=0.2375  acc=0.6940


Sample: 100%|██████████| 330/330 [00:10, 30.08it/s, step size=3.47e-01, acc. prob=0.954]


[outer 017] TRAIN (EMA+K-ens) ll=0.6682  br=0.2374  acc=0.6830


Sample: 100%|██████████| 330/330 [00:11, 27.69it/s, step size=3.95e-01, acc. prob=0.931]


[outer 018] TRAIN (EMA+K-ens) ll=0.6623  br=0.2345  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 27.83it/s, step size=3.91e-01, acc. prob=0.932]


[outer 019] TRAIN (EMA+K-ens) ll=0.6566  br=0.2317  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 28.25it/s, step size=3.91e-01, acc. prob=0.914]


[outer 020] TRAIN (EMA+K-ens) ll=0.6568  br=0.2319  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 28.45it/s, step size=4.05e-01, acc. prob=0.939]


[outer 021] TRAIN (EMA+K-ens) ll=0.6575  br=0.2322  acc=0.6980


Sample: 100%|██████████| 330/330 [00:12, 25.67it/s, step size=3.82e-01, acc. prob=0.915]


[outer 022] TRAIN (EMA+K-ens) ll=0.6572  br=0.2321  acc=0.6930


Sample: 100%|██████████| 330/330 [00:11, 29.51it/s, step size=4.64e-01, acc. prob=0.881]


[outer 023] TRAIN (EMA+K-ens) ll=0.6560  br=0.2315  acc=0.6840


Sample: 100%|██████████| 330/330 [00:11, 27.53it/s, step size=3.66e-01, acc. prob=0.937]


[outer 024] TRAIN (EMA+K-ens) ll=0.6552  br=0.2311  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 29.78it/s, step size=4.96e-01, acc. prob=0.878]


[outer 025] TRAIN (EMA+K-ens) ll=0.6581  br=0.2325  acc=0.6930


Sample: 100%|██████████| 330/330 [00:11, 28.35it/s, step size=4.31e-01, acc. prob=0.924]


[outer 026] TRAIN (EMA+K-ens) ll=0.6588  br=0.2329  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 28.95it/s, step size=4.15e-01, acc. prob=0.915]


[outer 027] TRAIN (EMA+K-ens) ll=0.6594  br=0.2331  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 29.31it/s, step size=4.22e-01, acc. prob=0.928]


[outer 028] TRAIN (EMA+K-ens) ll=0.6580  br=0.2324  acc=0.6970


Sample: 100%|██████████| 330/330 [00:12, 26.14it/s, step size=3.79e-01, acc. prob=0.921]


[outer 029] TRAIN (EMA+K-ens) ll=0.6567  br=0.2318  acc=0.6980


Sample: 100%|██████████| 330/330 [00:11, 29.66it/s, step size=3.99e-01, acc. prob=0.924]


[outer 030] TRAIN (EMA+K-ens) ll=0.6555  br=0.2312  acc=0.7030
[Early stop @ outer 30] Δll=0.279%, Δbr=0.386%, Δacc=0.004
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}, {'accuracy': 0.6753599643707275, 'brier': 0.22917573153972626, 'logloss': 0.6510149836540222}, {'accuracy': 0.6357199549674988, 'brier': 0.2415611296892166, 'logloss': 0.6763410568237305}, {'accuracy': 0.6503399610519409, 'brier': 0.23466165363788605, 'logloss': 0.6621001362800598}, {'accuracy': 0.611840009689331, 'brier': 0.24080230295658112, 'logloss': 0.6748088598251343}, {'accuracy': 0.6690999865531921, 'brier': 0.23178991675376892, 'logloss': 0.6564321517944336}]


Sample: 100%|██████████| 330/330 [00:11, 28.90it/s, step size=4.06e-01, acc. prob=0.904]


[outer 000] TRAIN (EMA+K-ens) ll=0.6722  br=0.2394  acc=0.5860


Sample: 100%|██████████| 330/330 [00:11, 27.52it/s, step size=3.66e-01, acc. prob=0.949]


[outer 001] TRAIN (EMA+K-ens) ll=0.6520  br=0.2297  acc=0.6520


Sample: 100%|██████████| 330/330 [00:12, 27.07it/s, step size=3.76e-01, acc. prob=0.932]


[outer 002] TRAIN (EMA+K-ens) ll=0.6584  br=0.2328  acc=0.6610


Sample: 100%|██████████| 330/330 [00:11, 28.33it/s, step size=4.35e-01, acc. prob=0.900]


[outer 003] TRAIN (EMA+K-ens) ll=0.6636  br=0.2354  acc=0.6460


Sample: 100%|██████████| 330/330 [00:11, 29.20it/s, step size=4.30e-01, acc. prob=0.916]


[outer 004] TRAIN (EMA+K-ens) ll=0.6573  br=0.2323  acc=0.6790


Sample: 100%|██████████| 330/330 [00:10, 30.38it/s, step size=3.80e-01, acc. prob=0.915]


[outer 005] TRAIN (EMA+K-ens) ll=0.6588  br=0.2330  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.90it/s, step size=3.79e-01, acc. prob=0.943]


[outer 006] TRAIN (EMA+K-ens) ll=0.6581  br=0.2325  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 29.96it/s, step size=4.34e-01, acc. prob=0.915]


[outer 007] TRAIN (EMA+K-ens) ll=0.6580  br=0.2325  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 29.27it/s, step size=4.37e-01, acc. prob=0.913]


[outer 008] TRAIN (EMA+K-ens) ll=0.6596  br=0.2333  acc=0.6940


Sample: 100%|██████████| 330/330 [00:11, 28.58it/s, step size=3.97e-01, acc. prob=0.913]


[outer 009] TRAIN (EMA+K-ens) ll=0.6664  br=0.2366  acc=0.6930


Sample: 100%|██████████| 330/330 [00:11, 29.57it/s, step size=3.60e-01, acc. prob=0.950]


[outer 010] TRAIN (EMA+K-ens) ll=0.6650  br=0.2359  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.38it/s, step size=3.68e-01, acc. prob=0.906]


[outer 011] TRAIN (EMA+K-ens) ll=0.6618  br=0.2343  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.89it/s, step size=4.28e-01, acc. prob=0.923]


[outer 012] TRAIN (EMA+K-ens) ll=0.6674  br=0.2370  acc=0.6870


Sample: 100%|██████████| 330/330 [00:11, 28.75it/s, step size=4.03e-01, acc. prob=0.905]


[outer 013] TRAIN (EMA+K-ens) ll=0.6663  br=0.2365  acc=0.6920


Sample: 100%|██████████| 330/330 [00:11, 28.48it/s, step size=4.18e-01, acc. prob=0.932]


[outer 014] TRAIN (EMA+K-ens) ll=0.6645  br=0.2356  acc=0.6900


Sample: 100%|██████████| 330/330 [00:10, 32.30it/s, step size=4.46e-01, acc. prob=0.935]


[outer 015] TRAIN (EMA+K-ens) ll=0.6648  br=0.2358  acc=0.6890


Sample: 100%|██████████| 330/330 [00:11, 29.27it/s, step size=4.29e-01, acc. prob=0.916]


[outer 016] TRAIN (EMA+K-ens) ll=0.6624  br=0.2346  acc=0.6870


Sample: 100%|██████████| 330/330 [00:11, 28.17it/s, step size=3.85e-01, acc. prob=0.924]


[outer 017] TRAIN (EMA+K-ens) ll=0.6591  br=0.2330  acc=0.6890


Sample: 100%|██████████| 330/330 [00:10, 30.03it/s, step size=4.31e-01, acc. prob=0.932]


[outer 018] TRAIN (EMA+K-ens) ll=0.6574  br=0.2322  acc=0.6920


Sample: 100%|██████████| 330/330 [00:11, 29.08it/s, step size=4.15e-01, acc. prob=0.911]


[outer 019] TRAIN (EMA+K-ens) ll=0.6588  br=0.2330  acc=0.6820


Sample: 100%|██████████| 330/330 [00:12, 27.17it/s, step size=3.57e-01, acc. prob=0.931]


[outer 020] TRAIN (EMA+K-ens) ll=0.6588  br=0.2330  acc=0.6870


Sample: 100%|██████████| 330/330 [00:12, 27.16it/s, step size=3.98e-01, acc. prob=0.885]


[outer 021] TRAIN (EMA+K-ens) ll=0.6581  br=0.2325  acc=0.6880


Sample: 100%|██████████| 330/330 [00:12, 26.71it/s, step size=3.65e-01, acc. prob=0.946]


[outer 022] TRAIN (EMA+K-ens) ll=0.6609  br=0.2339  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 28.54it/s, step size=3.55e-01, acc. prob=0.950]


[outer 023] TRAIN (EMA+K-ens) ll=0.6614  br=0.2341  acc=0.6920


Sample: 100%|██████████| 330/330 [00:10, 30.10it/s, step size=4.23e-01, acc. prob=0.904]


[outer 024] TRAIN (EMA+K-ens) ll=0.6622  br=0.2345  acc=0.6950


Sample: 100%|██████████| 330/330 [00:12, 26.98it/s, step size=4.22e-01, acc. prob=0.910]


[outer 025] TRAIN (EMA+K-ens) ll=0.6638  br=0.2354  acc=0.6890


Sample: 100%|██████████| 330/330 [00:11, 29.86it/s, step size=4.29e-01, acc. prob=0.926]


[outer 026] TRAIN (EMA+K-ens) ll=0.6684  br=0.2376  acc=0.6800


Sample: 100%|██████████| 330/330 [00:11, 28.79it/s, step size=3.94e-01, acc. prob=0.924]


[outer 027] TRAIN (EMA+K-ens) ll=0.6723  br=0.2395  acc=0.6670


Sample: 100%|██████████| 330/330 [00:12, 26.75it/s, step size=4.05e-01, acc. prob=0.940]


[outer 028] TRAIN (EMA+K-ens) ll=0.6723  br=0.2395  acc=0.6670


Sample: 100%|██████████| 330/330 [00:11, 29.19it/s, step size=3.98e-01, acc. prob=0.924]


[outer 029] TRAIN (EMA+K-ens) ll=0.6693  br=0.2380  acc=0.6910


Sample: 100%|██████████| 330/330 [00:10, 30.17it/s, step size=4.14e-01, acc. prob=0.933]


[outer 030] TRAIN (EMA+K-ens) ll=0.6691  br=0.2379  acc=0.6890


Sample: 100%|██████████| 330/330 [00:11, 27.83it/s, step size=3.60e-01, acc. prob=0.933]


[outer 031] TRAIN (EMA+K-ens) ll=0.6678  br=0.2373  acc=0.6920


Sample: 100%|██████████| 330/330 [00:10, 30.97it/s, step size=4.23e-01, acc. prob=0.899]


[outer 032] TRAIN (EMA+K-ens) ll=0.6671  br=0.2369  acc=0.6930


Sample: 100%|██████████| 330/330 [00:12, 26.70it/s, step size=3.99e-01, acc. prob=0.907]


[outer 033] TRAIN (EMA+K-ens) ll=0.6660  br=0.2364  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 29.21it/s, step size=4.35e-01, acc. prob=0.873]


[outer 034] TRAIN (EMA+K-ens) ll=0.6644  br=0.2356  acc=0.6950


Sample: 100%|██████████| 330/330 [00:11, 27.62it/s, step size=4.05e-01, acc. prob=0.933]


[outer 035] TRAIN (EMA+K-ens) ll=0.6625  br=0.2347  acc=0.6960


Sample: 100%|██████████| 330/330 [00:12, 27.46it/s, step size=3.75e-01, acc. prob=0.930]


[outer 036] TRAIN (EMA+K-ens) ll=0.6601  br=0.2335  acc=0.6930


Sample: 100%|██████████| 330/330 [00:12, 27.19it/s, step size=3.93e-01, acc. prob=0.921]


[outer 037] TRAIN (EMA+K-ens) ll=0.6658  br=0.2363  acc=0.6950


Sample: 100%|██████████| 330/330 [00:12, 27.37it/s, step size=3.98e-01, acc. prob=0.941]


[outer 038] TRAIN (EMA+K-ens) ll=0.6667  br=0.2367  acc=0.6960


Sample: 100%|██████████| 330/330 [00:10, 30.21it/s, step size=4.11e-01, acc. prob=0.923]


[outer 039] TRAIN (EMA+K-ens) ll=0.6667  br=0.2367  acc=0.6960
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}, {'accuracy': 0.6753599643707275, 'brier': 0.22917573153972626, 'logloss': 0.6510149836540222}, {'accuracy': 0.6357199549674988, 'brier': 0.2415611296892166, 'logloss': 0.6763410568237305}, {'accuracy': 0.6503399610519409, 'brier': 0.23466165363788605, 'logloss': 0.6621001362800598}, {'accuracy': 0.611840009689331, 'brier': 0.24080230295658112, 'logloss': 0.6748088598251343}, {'accuracy': 0.6690999865531921, 'brier': 0.23178991675376892, 'logloss': 0.6564321517944336}, {'accuracy': 0.6614599823951721, 'brier': 0.23591962456703186, 'logloss': 0.6651105880737305}]


Sample: 100%|██████████| 330/330 [00:12, 27.03it/s, step size=3.56e-01, acc. prob=0.947]


[outer 000] TRAIN (EMA+K-ens) ll=0.6528  br=0.2302  acc=0.6700


Sample: 100%|██████████| 330/330 [00:11, 27.88it/s, step size=4.29e-01, acc. prob=0.913]


[outer 001] TRAIN (EMA+K-ens) ll=0.6640  br=0.2354  acc=0.6530


Sample: 100%|██████████| 330/330 [00:11, 28.30it/s, step size=4.16e-01, acc. prob=0.930]


[outer 002] TRAIN (EMA+K-ens) ll=0.6641  br=0.2355  acc=0.6500


Sample: 100%|██████████| 330/330 [00:12, 26.05it/s, step size=3.83e-01, acc. prob=0.904]


[outer 003] TRAIN (EMA+K-ens) ll=0.6557  br=0.2314  acc=0.6980


Sample: 100%|██████████| 330/330 [00:11, 28.17it/s, step size=4.35e-01, acc. prob=0.928]


[outer 004] TRAIN (EMA+K-ens) ll=0.6603  br=0.2335  acc=0.7040


Sample: 100%|██████████| 330/330 [00:11, 28.75it/s, step size=3.67e-01, acc. prob=0.938]


[outer 005] TRAIN (EMA+K-ens) ll=0.6582  br=0.2325  acc=0.7140


Sample: 100%|██████████| 330/330 [00:12, 26.23it/s, step size=3.49e-01, acc. prob=0.952]


[outer 006] TRAIN (EMA+K-ens) ll=0.6564  br=0.2317  acc=0.7080


Sample: 100%|██████████| 330/330 [00:10, 30.36it/s, step size=4.26e-01, acc. prob=0.908]


[outer 007] TRAIN (EMA+K-ens) ll=0.6557  br=0.2313  acc=0.7110


Sample: 100%|██████████| 330/330 [00:11, 29.56it/s, step size=3.87e-01, acc. prob=0.930]


[outer 008] TRAIN (EMA+K-ens) ll=0.6568  br=0.2318  acc=0.7110


Sample: 100%|██████████| 330/330 [00:11, 28.04it/s, step size=3.50e-01, acc. prob=0.948]


[outer 009] TRAIN (EMA+K-ens) ll=0.6539  br=0.2303  acc=0.7080


Sample: 100%|██████████| 330/330 [00:12, 26.89it/s, step size=3.96e-01, acc. prob=0.931]


[outer 010] TRAIN (EMA+K-ens) ll=0.6512  br=0.2290  acc=0.7020


Sample: 100%|██████████| 330/330 [00:11, 28.07it/s, step size=3.38e-01, acc. prob=0.937]


[outer 011] TRAIN (EMA+K-ens) ll=0.6527  br=0.2298  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 28.16it/s, step size=4.27e-01, acc. prob=0.908]


[outer 012] TRAIN (EMA+K-ens) ll=0.6506  br=0.2288  acc=0.6970


Sample: 100%|██████████| 330/330 [00:11, 28.10it/s, step size=3.89e-01, acc. prob=0.938]


[outer 013] TRAIN (EMA+K-ens) ll=0.6495  br=0.2282  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 29.75it/s, step size=4.26e-01, acc. prob=0.897]


[outer 014] TRAIN (EMA+K-ens) ll=0.6514  br=0.2291  acc=0.6970


Sample: 100%|██████████| 330/330 [00:11, 28.64it/s, step size=4.31e-01, acc. prob=0.898]


[outer 015] TRAIN (EMA+K-ens) ll=0.6551  br=0.2309  acc=0.6960


Sample: 100%|██████████| 330/330 [00:12, 27.38it/s, step size=3.61e-01, acc. prob=0.929]


[outer 016] TRAIN (EMA+K-ens) ll=0.6544  br=0.2307  acc=0.6900


Sample: 100%|██████████| 330/330 [00:11, 28.26it/s, step size=3.89e-01, acc. prob=0.942]


[outer 017] TRAIN (EMA+K-ens) ll=0.6561  br=0.2314  acc=0.6840


Sample: 100%|██████████| 330/330 [00:10, 30.01it/s, step size=3.87e-01, acc. prob=0.923]


[outer 018] TRAIN (EMA+K-ens) ll=0.6561  br=0.2315  acc=0.6970


Sample: 100%|██████████| 330/330 [00:11, 29.24it/s, step size=3.99e-01, acc. prob=0.935]


[outer 019] TRAIN (EMA+K-ens) ll=0.6575  br=0.2322  acc=0.7070


Sample: 100%|██████████| 330/330 [00:12, 26.91it/s, step size=3.41e-01, acc. prob=0.957]


[outer 020] TRAIN (EMA+K-ens) ll=0.6560  br=0.2314  acc=0.7070


Sample: 100%|██████████| 330/330 [00:11, 29.69it/s, step size=3.70e-01, acc. prob=0.934]


[outer 021] TRAIN (EMA+K-ens) ll=0.6567  br=0.2318  acc=0.7040


Sample: 100%|██████████| 330/330 [00:10, 30.90it/s, step size=4.50e-01, acc. prob=0.897]


[outer 022] TRAIN (EMA+K-ens) ll=0.6558  br=0.2313  acc=0.7060


Sample: 100%|██████████| 330/330 [00:11, 27.69it/s, step size=3.69e-01, acc. prob=0.942]


[outer 023] TRAIN (EMA+K-ens) ll=0.6505  br=0.2287  acc=0.7070


Sample: 100%|██████████| 330/330 [00:12, 26.91it/s, step size=3.75e-01, acc. prob=0.933]


[outer 024] TRAIN (EMA+K-ens) ll=0.6525  br=0.2297  acc=0.7060


Sample: 100%|██████████| 330/330 [00:11, 27.58it/s, step size=4.24e-01, acc. prob=0.912]


[outer 025] TRAIN (EMA+K-ens) ll=0.6535  br=0.2302  acc=0.7040


Sample: 100%|██████████| 330/330 [00:11, 28.10it/s, step size=3.81e-01, acc. prob=0.922]


[outer 026] TRAIN (EMA+K-ens) ll=0.6567  br=0.2318  acc=0.7040


Sample: 100%|██████████| 330/330 [00:13, 25.06it/s, step size=4.02e-01, acc. prob=0.930]


[outer 027] TRAIN (EMA+K-ens) ll=0.6540  br=0.2305  acc=0.7020


Sample: 100%|██████████| 330/330 [00:12, 25.70it/s, step size=3.76e-01, acc. prob=0.935]


[outer 028] TRAIN (EMA+K-ens) ll=0.6511  br=0.2291  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 29.44it/s, step size=3.68e-01, acc. prob=0.936]


[outer 029] TRAIN (EMA+K-ens) ll=0.6548  br=0.2309  acc=0.6990


Sample: 100%|██████████| 330/330 [00:11, 27.99it/s, step size=3.56e-01, acc. prob=0.939]


[outer 030] TRAIN (EMA+K-ens) ll=0.6559  br=0.2314  acc=0.7010
[Early stop @ outer 30] Δll=0.222%, Δbr=0.298%, Δacc=0.000
[{'accuracy': 0.6865999698638916, 'brier': 0.23077991604804993, 'logloss': 0.6544945240020752}, {'accuracy': 0.6186999678611755, 'brier': 0.23890694975852966, 'logloss': 0.6714202165603638}, {'accuracy': 0.6498399972915649, 'brier': 0.23585735261440277, 'logloss': 0.6648035645484924}, {'accuracy': 0.6753599643707275, 'brier': 0.22917573153972626, 'logloss': 0.6510149836540222}, {'accuracy': 0.6357199549674988, 'brier': 0.2415611296892166, 'logloss': 0.6763410568237305}, {'accuracy': 0.6503399610519409, 'brier': 0.23466165363788605, 'logloss': 0.6621001362800598}, {'accuracy': 0.611840009689331, 'brier': 0.24080230295658112, 'logloss': 0.6748088598251343}, {'accuracy': 0.6690999865531921, 'brier': 0.23178991675376892, 'logloss': 0.6564321517944336}, {'accuracy': 0.6614599823951721, 'brier': 0.23591962456703186, 'logloss': 0.6651105880737305}, {'accuracy': 0.687399983
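
The `ll`, `br`, and `acc` columns in the training log appear to correspond to mean log loss, Brier score, and accuracy, matching the `logloss`, `brier`, and `accuracy` keys in the printed test metrics. The metric code inside `fit_ksd_bayes_nuts_ema_ensemble` is not shown in this part of the notebook, so the following is only a minimal sketch of the standard definitions for binary outcomes; `binary_metrics` is a hypothetical helper, not a function from the notebook.

```python
import math

def binary_metrics(y_true, p_pred, eps=1e-12):
    """Mean log loss, Brier score, and accuracy for binary (0/1) labels.

    Hypothetical helper for illustration; the notebook's own metric
    code may differ in clipping, thresholding, or averaging.
    """
    n = len(y_true)
    ll = 0.0
    br = 0.0
    correct = 0
    for y, p in zip(y_true, p_pred):
        p_c = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        ll -= y * math.log(p_c) + (1 - y) * math.log(1.0 - p_c)
        br += (p - y) ** 2
        correct += int((p >= 0.5) == (y == 1))  # threshold at 0.5
    return {"logloss": ll / n, "brier": br / n, "accuracy": correct / n}

# Reference point: an uninformative predictor that always outputs p = 0.5
baseline = binary_metrics([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5])
print(baseline)
```

Under these definitions a constant prediction of 0.5 gives a log loss of $\ln 2 \approx 0.693$ and a Brier score of 0.25. The `outer 000` lines in the logs start close to that level (for example `ll=0.7043  br=0.2549  acc=0.4970`), so values around `ll≈0.66` and `br≈0.23` are a modest improvement over chance.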

In [None]:
all_metrics = []

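# Repeat the experiment over 10 seeds; the fitted model includes the
# interaction term only (no nonlinear term, no group-level effect).
for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible.
    np.random.seed(seed)
    torch.manual_seed(seed)

    # Small training set and large test set, both with normal noise.
    df_train = simulate_dataset(
        noise_type="normal",
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type="normal",
        n_per_group=10000
    )
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=True, nonlinear=False, group=False,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    all_metrics.append(res["metrics_test"])
    # Print the cumulative list of test metrics after every seed.
    print(all_metrics)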

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(["mean", "std", "median"])
print(summary)
print(df)
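
The aggregation step above reports the mean, standard deviation, and median of each test metric across seeds. As a minimal sketch, assuming a rough uncertainty band on each mean is also wanted, the same summary can be extended with a standard error. The helper name `summarize_metrics` is hypothetical and not part of the notebook, and it uses only the standard library. The sample rows are the first three per-seed test metrics printed in the output of this cell, rounded to four decimals.

```python
import math
import statistics


def summarize_metrics(rows):
    """Return {metric: {mean, std, median, sem}} for a list of metric dicts.

    Hypothetical helper (not from the notebook). `std` is the sample
    standard deviation (ddof=1), which matches the pandas default.
    """
    summary = {}
    for key in rows[0]:
        values = [row[key] for row in rows]
        std = statistics.stdev(values) if len(values) > 1 else float("nan")
        summary[key] = {
            "mean": statistics.fmean(values),
            "std": std,
            "median": statistics.median(values),
            # Standard error of the mean across seeds.
            "sem": std / math.sqrt(len(values)),
        }
    return summary


# First three per-seed test metrics from the run, rounded to four decimals.
example_rows = [
    {"accuracy": 0.6461, "brier": 0.2408, "logloss": 0.6758},
    {"accuracy": 0.6708, "brier": 0.2443, "logloss": 0.6851},
    {"accuracy": 0.6980, "brier": 0.2421, "logloss": 0.6803},
]

example_summary = summarize_metrics(example_rows)
for name, stats in example_summary.items():
    print(f"{name}: mean={stats['mean']:.4f} +/- {stats['sem']:.4f} (SEM)")
```

With only a handful of seeds the standard error is itself noisy, so it is best read as a rough guide to seed-to-seed variability rather than a formal interval.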

Sample: 100%|██████████| 330/330 [00:16, 20.07it/s, step size=2.66e-01, acc. prob=0.948]


[outer 000] TRAIN (EMA+K-ens) ll=0.7484  br=0.2752  acc=0.4990


Sample: 100%|██████████| 330/330 [00:16, 19.85it/s, step size=2.41e-01, acc. prob=0.954]


[outer 001] TRAIN (EMA+K-ens) ll=0.6945  br=0.2493  acc=0.6710


Sample: 100%|██████████| 330/330 [00:15, 21.40it/s, step size=3.03e-01, acc. prob=0.920]


[outer 002] TRAIN (EMA+K-ens) ll=0.6709  br=0.2384  acc=0.6820


Sample: 100%|██████████| 330/330 [00:15, 21.87it/s, step size=2.59e-01, acc. prob=0.961]


[outer 003] TRAIN (EMA+K-ens) ll=0.6796  br=0.2425  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 20.06it/s, step size=2.66e-01, acc. prob=0.928]


[outer 004] TRAIN (EMA+K-ens) ll=0.6777  br=0.2416  acc=0.6780


Sample: 100%|██████████| 330/330 [00:15, 21.96it/s, step size=2.82e-01, acc. prob=0.948]


[outer 005] TRAIN (EMA+K-ens) ll=0.6895  br=0.2470  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 20.39it/s, step size=2.69e-01, acc. prob=0.955]


[outer 006] TRAIN (EMA+K-ens) ll=0.6958  br=0.2498  acc=0.6830


Sample: 100%|██████████| 330/330 [00:15, 21.64it/s, step size=2.59e-01, acc. prob=0.941]


[outer 007] TRAIN (EMA+K-ens) ll=0.6911  br=0.2475  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 23.74it/s, step size=3.36e-01, acc. prob=0.919]


[outer 008] TRAIN (EMA+K-ens) ll=0.6710  br=0.2381  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 21.93it/s, step size=2.53e-01, acc. prob=0.945]


[outer 009] TRAIN (EMA+K-ens) ll=0.6842  br=0.2442  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.74it/s, step size=2.92e-01, acc. prob=0.929]


[outer 010] TRAIN (EMA+K-ens) ll=0.6889  br=0.2463  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 23.60it/s, step size=2.96e-01, acc. prob=0.948]


[outer 011] TRAIN (EMA+K-ens) ll=0.6846  br=0.2444  acc=0.6960


Sample: 100%|██████████| 330/330 [00:16, 20.41it/s, step size=2.08e-01, acc. prob=0.958]


[outer 012] TRAIN (EMA+K-ens) ll=0.6949  br=0.2491  acc=0.6960


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=2.51e-01, acc. prob=0.961]


[outer 013] TRAIN (EMA+K-ens) ll=0.6833  br=0.2437  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=2.11e-01, acc. prob=0.974]


[outer 014] TRAIN (EMA+K-ens) ll=0.6796  br=0.2422  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.15it/s, step size=2.75e-01, acc. prob=0.952]


[outer 015] TRAIN (EMA+K-ens) ll=0.6823  br=0.2436  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.27it/s, step size=2.55e-01, acc. prob=0.933]


[outer 016] TRAIN (EMA+K-ens) ll=0.7011  br=0.2522  acc=0.6630


Sample: 100%|██████████| 330/330 [00:14, 22.17it/s, step size=2.84e-01, acc. prob=0.928]


[outer 017] TRAIN (EMA+K-ens) ll=0.7004  br=0.2522  acc=0.6460


Sample: 100%|██████████| 330/330 [00:14, 22.55it/s, step size=2.46e-01, acc. prob=0.968]


[outer 018] TRAIN (EMA+K-ens) ll=0.7100  br=0.2567  acc=0.6200


Sample: 100%|██████████| 330/330 [00:15, 21.72it/s, step size=2.43e-01, acc. prob=0.968]


[outer 019] TRAIN (EMA+K-ens) ll=0.7203  br=0.2613  acc=0.6270


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=2.28e-01, acc. prob=0.946]


[outer 020] TRAIN (EMA+K-ens) ll=0.7146  br=0.2588  acc=0.6430


Sample: 100%|██████████| 330/330 [00:17, 19.01it/s, step size=2.47e-01, acc. prob=0.950]


[outer 021] TRAIN (EMA+K-ens) ll=0.7074  br=0.2557  acc=0.6740


Sample: 100%|██████████| 330/330 [00:14, 23.30it/s, step size=3.24e-01, acc. prob=0.953]


[outer 022] TRAIN (EMA+K-ens) ll=0.7092  br=0.2562  acc=0.6940


Sample: 100%|██████████| 330/330 [00:16, 20.29it/s, step size=2.51e-01, acc. prob=0.947]


[outer 023] TRAIN (EMA+K-ens) ll=0.7029  br=0.2533  acc=0.6970


Sample: 100%|██████████| 330/330 [00:15, 20.97it/s, step size=2.44e-01, acc. prob=0.945]


[outer 024] TRAIN (EMA+K-ens) ll=0.6969  br=0.2505  acc=0.6940


Sample: 100%|██████████| 330/330 [00:15, 21.88it/s, step size=3.24e-01, acc. prob=0.950]


[outer 025] TRAIN (EMA+K-ens) ll=0.6999  br=0.2517  acc=0.6970


Sample: 100%|██████████| 330/330 [00:16, 20.36it/s, step size=2.43e-01, acc. prob=0.953]


[outer 026] TRAIN (EMA+K-ens) ll=0.6884  br=0.2463  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.29it/s, step size=2.74e-01, acc. prob=0.963]


[outer 027] TRAIN (EMA+K-ens) ll=0.6836  br=0.2441  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 20.88it/s, step size=2.21e-01, acc. prob=0.967]


[outer 028] TRAIN (EMA+K-ens) ll=0.6829  br=0.2436  acc=0.7000


Sample: 100%|██████████| 330/330 [00:16, 20.44it/s, step size=2.03e-01, acc. prob=0.964]


[outer 029] TRAIN (EMA+K-ens) ll=0.6951  br=0.2491  acc=0.6980


Sample: 100%|██████████| 330/330 [00:17, 19.41it/s, step size=2.24e-01, acc. prob=0.960]


[outer 030] TRAIN (EMA+K-ens) ll=0.7009  br=0.2516  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.91it/s, step size=2.42e-01, acc. prob=0.949]


[outer 031] TRAIN (EMA+K-ens) ll=0.6998  br=0.2511  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.10it/s, step size=2.70e-01, acc. prob=0.968]


[outer 032] TRAIN (EMA+K-ens) ll=0.7002  br=0.2510  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.89it/s, step size=3.42e-01, acc. prob=0.883]


[outer 033] TRAIN (EMA+K-ens) ll=0.6789  br=0.2415  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 23.80it/s, step size=2.94e-01, acc. prob=0.939]


[outer 034] TRAIN (EMA+K-ens) ll=0.6831  br=0.2435  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.07it/s, step size=2.72e-01, acc. prob=0.954]


[outer 035] TRAIN (EMA+K-ens) ll=0.6706  br=0.2378  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 20.76it/s, step size=2.92e-01, acc. prob=0.960]


[outer 036] TRAIN (EMA+K-ens) ll=0.6660  br=0.2358  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 20.72it/s, step size=2.24e-01, acc. prob=0.959]


[outer 037] TRAIN (EMA+K-ens) ll=0.6623  br=0.2340  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.41it/s, step size=2.65e-01, acc. prob=0.962]


[outer 038] TRAIN (EMA+K-ens) ll=0.6536  br=0.2301  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.78it/s, step size=2.11e-01, acc. prob=0.957]


[outer 039] TRAIN (EMA+K-ens) ll=0.6572  br=0.2318  acc=0.7000
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}]


Sample: 100%|██████████| 330/330 [00:15, 20.78it/s, step size=2.43e-01, acc. prob=0.956]


[outer 000] TRAIN (EMA+K-ens) ll=0.8102  br=0.3045  acc=0.2790


Sample: 100%|██████████| 330/330 [00:16, 19.77it/s, step size=2.56e-01, acc. prob=0.955]


[outer 001] TRAIN (EMA+K-ens) ll=0.7086  br=0.2563  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.61e-01, acc. prob=0.948]


[outer 002] TRAIN (EMA+K-ens) ll=0.7160  br=0.2596  acc=0.6320


Sample: 100%|██████████| 330/330 [00:14, 22.74it/s, step size=2.94e-01, acc. prob=0.933]


[outer 003] TRAIN (EMA+K-ens) ll=0.6990  br=0.2517  acc=0.6770


Sample: 100%|██████████| 330/330 [00:15, 20.85it/s, step size=2.64e-01, acc. prob=0.951]


[outer 004] TRAIN (EMA+K-ens) ll=0.6677  br=0.2368  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 20.39it/s, step size=2.30e-01, acc. prob=0.950]


[outer 005] TRAIN (EMA+K-ens) ll=0.6723  br=0.2389  acc=0.6900


Sample: 100%|██████████| 330/330 [00:15, 20.95it/s, step size=2.50e-01, acc. prob=0.941]


[outer 006] TRAIN (EMA+K-ens) ll=0.6755  br=0.2406  acc=0.6800


Sample: 100%|██████████| 330/330 [00:14, 23.41it/s, step size=2.36e-01, acc. prob=0.930]


[outer 007] TRAIN (EMA+K-ens) ll=0.6748  br=0.2402  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 19.77it/s, step size=2.99e-01, acc. prob=0.929]


[outer 008] TRAIN (EMA+K-ens) ll=0.6495  br=0.2282  acc=0.6920


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.67e-01, acc. prob=0.944]


[outer 009] TRAIN (EMA+K-ens) ll=0.6513  br=0.2290  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 19.47it/s, step size=2.42e-01, acc. prob=0.953]


[outer 010] TRAIN (EMA+K-ens) ll=0.6448  br=0.2259  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 20.65it/s, step size=2.74e-01, acc. prob=0.958]


[outer 011] TRAIN (EMA+K-ens) ll=0.6473  br=0.2271  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 19.56it/s, step size=2.84e-01, acc. prob=0.934]


[outer 012] TRAIN (EMA+K-ens) ll=0.6702  br=0.2380  acc=0.6900


Sample: 100%|██████████| 330/330 [00:15, 21.71it/s, step size=2.69e-01, acc. prob=0.950]


[outer 013] TRAIN (EMA+K-ens) ll=0.6624  br=0.2343  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.20it/s, step size=3.53e-01, acc. prob=0.946]


[outer 014] TRAIN (EMA+K-ens) ll=0.6622  br=0.2341  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 18.34it/s, step size=2.06e-01, acc. prob=0.980]


[outer 015] TRAIN (EMA+K-ens) ll=0.6593  br=0.2329  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 20.36it/s, step size=2.90e-01, acc. prob=0.945]


[outer 016] TRAIN (EMA+K-ens) ll=0.6720  br=0.2389  acc=0.6770


Sample: 100%|██████████| 330/330 [00:15, 21.64it/s, step size=3.11e-01, acc. prob=0.950]


[outer 017] TRAIN (EMA+K-ens) ll=0.6648  br=0.2354  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.01it/s, step size=3.16e-01, acc. prob=0.940]


[outer 018] TRAIN (EMA+K-ens) ll=0.6559  br=0.2313  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.35it/s, step size=2.35e-01, acc. prob=0.943]


[outer 019] TRAIN (EMA+K-ens) ll=0.6511  br=0.2290  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.19it/s, step size=2.75e-01, acc. prob=0.935]


[outer 020] TRAIN (EMA+K-ens) ll=0.6414  br=0.2244  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 20.34it/s, step size=2.44e-01, acc. prob=0.962]


[outer 021] TRAIN (EMA+K-ens) ll=0.6470  br=0.2271  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.42it/s, step size=2.84e-01, acc. prob=0.940]


[outer 022] TRAIN (EMA+K-ens) ll=0.6431  br=0.2253  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.83it/s, step size=2.68e-01, acc. prob=0.939]


[outer 023] TRAIN (EMA+K-ens) ll=0.6503  br=0.2287  acc=0.6740


Sample: 100%|██████████| 330/330 [00:14, 22.18it/s, step size=2.86e-01, acc. prob=0.923]


[outer 024] TRAIN (EMA+K-ens) ll=0.6483  br=0.2278  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.08e-01, acc. prob=0.964]


[outer 025] TRAIN (EMA+K-ens) ll=0.6468  br=0.2271  acc=0.6700


Sample: 100%|██████████| 330/330 [00:15, 21.32it/s, step size=2.76e-01, acc. prob=0.941]


[outer 026] TRAIN (EMA+K-ens) ll=0.6439  br=0.2256  acc=0.6790


Sample: 100%|██████████| 330/330 [00:13, 23.60it/s, step size=2.58e-01, acc. prob=0.971]


[outer 027] TRAIN (EMA+K-ens) ll=0.6491  br=0.2280  acc=0.6760


Sample: 100%|██████████| 330/330 [00:15, 21.39it/s, step size=2.60e-01, acc. prob=0.967]


[outer 028] TRAIN (EMA+K-ens) ll=0.6547  br=0.2306  acc=0.6780


Sample: 100%|██████████| 330/330 [00:14, 23.39it/s, step size=2.99e-01, acc. prob=0.930]


[outer 029] TRAIN (EMA+K-ens) ll=0.6574  br=0.2318  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 20.43it/s, step size=2.62e-01, acc. prob=0.954]


[outer 030] TRAIN (EMA+K-ens) ll=0.6630  br=0.2345  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.15it/s, step size=2.75e-01, acc. prob=0.948]


[outer 031] TRAIN (EMA+K-ens) ll=0.6563  br=0.2314  acc=0.6830


Sample: 100%|██████████| 330/330 [00:15, 21.25it/s, step size=2.19e-01, acc. prob=0.962]


[outer 032] TRAIN (EMA+K-ens) ll=0.6453  br=0.2262  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 22.15it/s, step size=3.28e-01, acc. prob=0.921]


[outer 033] TRAIN (EMA+K-ens) ll=0.6569  br=0.2316  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.53it/s, step size=3.20e-01, acc. prob=0.918]


[outer 034] TRAIN (EMA+K-ens) ll=0.6640  br=0.2350  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.35it/s, step size=3.28e-01, acc. prob=0.926]


[outer 035] TRAIN (EMA+K-ens) ll=0.6574  br=0.2320  acc=0.6900


Sample: 100%|██████████| 330/330 [00:15, 21.86it/s, step size=2.32e-01, acc. prob=0.970]


[outer 036] TRAIN (EMA+K-ens) ll=0.6485  br=0.2280  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 20.21it/s, step size=2.21e-01, acc. prob=0.971]


[outer 037] TRAIN (EMA+K-ens) ll=0.6427  br=0.2250  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 23.42it/s, step size=2.72e-01, acc. prob=0.948]


[outer 038] TRAIN (EMA+K-ens) ll=0.6444  br=0.2257  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.79it/s, step size=3.32e-01, acc. prob=0.934]


[outer 039] TRAIN (EMA+K-ens) ll=0.6531  br=0.2296  acc=0.6910
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}]


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.96e-01, acc. prob=0.928]


[outer 000] TRAIN (EMA+K-ens) ll=0.7213  br=0.2622  acc=0.5460


Sample: 100%|██████████| 330/330 [00:15, 21.09it/s, step size=2.59e-01, acc. prob=0.954]


[outer 001] TRAIN (EMA+K-ens) ll=0.7249  br=0.2627  acc=0.6920


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=3.03e-01, acc. prob=0.924]


[outer 002] TRAIN (EMA+K-ens) ll=0.7101  br=0.2562  acc=0.7010


Sample: 100%|██████████| 330/330 [00:14, 22.08it/s, step size=2.66e-01, acc. prob=0.958]


[outer 003] TRAIN (EMA+K-ens) ll=0.6778  br=0.2412  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.03it/s, step size=3.00e-01, acc. prob=0.957]


[outer 004] TRAIN (EMA+K-ens) ll=0.6802  br=0.2424  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=3.05e-01, acc. prob=0.922]


[outer 005] TRAIN (EMA+K-ens) ll=0.6740  br=0.2395  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=2.41e-01, acc. prob=0.981]


[outer 006] TRAIN (EMA+K-ens) ll=0.6729  br=0.2390  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 19.65it/s, step size=2.43e-01, acc. prob=0.955]


[outer 007] TRAIN (EMA+K-ens) ll=0.6647  br=0.2351  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.72it/s, step size=2.63e-01, acc. prob=0.929]


[outer 008] TRAIN (EMA+K-ens) ll=0.6510  br=0.2286  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 19.98it/s, step size=2.31e-01, acc. prob=0.949]


[outer 009] TRAIN (EMA+K-ens) ll=0.6426  br=0.2246  acc=0.7110


Sample: 100%|██████████| 330/330 [00:15, 21.67it/s, step size=2.55e-01, acc. prob=0.936]


[outer 010] TRAIN (EMA+K-ens) ll=0.6260  br=0.2169  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.06it/s, step size=2.50e-01, acc. prob=0.948]


[outer 011] TRAIN (EMA+K-ens) ll=0.6315  br=0.2194  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 21.20it/s, step size=2.47e-01, acc. prob=0.952]


[outer 012] TRAIN (EMA+K-ens) ll=0.6272  br=0.2175  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.29it/s, step size=2.62e-01, acc. prob=0.962]


[outer 013] TRAIN (EMA+K-ens) ll=0.6259  br=0.2171  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 21.57it/s, step size=3.01e-01, acc. prob=0.932]


[outer 014] TRAIN (EMA+K-ens) ll=0.6309  br=0.2195  acc=0.6790


Sample: 100%|██████████| 330/330 [00:15, 20.89it/s, step size=2.28e-01, acc. prob=0.940]


[outer 015] TRAIN (EMA+K-ens) ll=0.6393  br=0.2234  acc=0.6630


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=2.72e-01, acc. prob=0.934]


[outer 016] TRAIN (EMA+K-ens) ll=0.6480  br=0.2274  acc=0.6650


Sample: 100%|██████████| 330/330 [00:15, 20.68it/s, step size=2.44e-01, acc. prob=0.962]


[outer 017] TRAIN (EMA+K-ens) ll=0.6459  br=0.2263  acc=0.6750


Sample: 100%|██████████| 330/330 [00:14, 23.37it/s, step size=2.71e-01, acc. prob=0.945]


[outer 018] TRAIN (EMA+K-ens) ll=0.6705  br=0.2374  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.33it/s, step size=2.38e-01, acc. prob=0.954]


[outer 019] TRAIN (EMA+K-ens) ll=0.6758  br=0.2399  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.15it/s, step size=2.70e-01, acc. prob=0.967]


[outer 020] TRAIN (EMA+K-ens) ll=0.6807  br=0.2418  acc=0.7070


Sample: 100%|██████████| 330/330 [00:16, 20.47it/s, step size=2.65e-01, acc. prob=0.963]


[outer 021] TRAIN (EMA+K-ens) ll=0.6923  br=0.2470  acc=0.6900


Sample: 100%|██████████| 330/330 [00:15, 20.65it/s, step size=2.46e-01, acc. prob=0.974]


[outer 022] TRAIN (EMA+K-ens) ll=0.6871  br=0.2448  acc=0.7120


Sample: 100%|██████████| 330/330 [00:17, 19.36it/s, step size=2.26e-01, acc. prob=0.968]


[outer 023] TRAIN (EMA+K-ens) ll=0.6940  br=0.2478  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=2.74e-01, acc. prob=0.960]


[outer 024] TRAIN (EMA+K-ens) ll=0.6963  br=0.2487  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=2.37e-01, acc. prob=0.947]


[outer 025] TRAIN (EMA+K-ens) ll=0.6970  br=0.2490  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.58it/s, step size=3.23e-01, acc. prob=0.937]


[outer 026] TRAIN (EMA+K-ens) ll=0.6764  br=0.2402  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 20.39it/s, step size=3.04e-01, acc. prob=0.928]


[outer 027] TRAIN (EMA+K-ens) ll=0.6672  br=0.2360  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.31it/s, step size=2.35e-01, acc. prob=0.946]


[outer 028] TRAIN (EMA+K-ens) ll=0.6557  br=0.2307  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.71it/s, step size=2.32e-01, acc. prob=0.962]


[outer 029] TRAIN (EMA+K-ens) ll=0.6416  br=0.2240  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.45it/s, step size=3.15e-01, acc. prob=0.952]


[outer 030] TRAIN (EMA+K-ens) ll=0.6406  br=0.2235  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.23it/s, step size=2.48e-01, acc. prob=0.965]


[outer 031] TRAIN (EMA+K-ens) ll=0.6332  br=0.2200  acc=0.7120


Sample: 100%|██████████| 330/330 [00:14, 22.01it/s, step size=2.91e-01, acc. prob=0.938]


[outer 032] TRAIN (EMA+K-ens) ll=0.6273  br=0.2172  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=1.90e-01, acc. prob=0.973]


[outer 033] TRAIN (EMA+K-ens) ll=0.6246  br=0.2159  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.37it/s, step size=2.61e-01, acc. prob=0.957]


[outer 034] TRAIN (EMA+K-ens) ll=0.6385  br=0.2222  acc=0.7140


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=2.13e-01, acc. prob=0.967]


[outer 035] TRAIN (EMA+K-ens) ll=0.6483  br=0.2267  acc=0.7150


Sample: 100%|██████████| 330/330 [00:17, 19.14it/s, step size=2.42e-01, acc. prob=0.967]


[outer 036] TRAIN (EMA+K-ens) ll=0.6527  br=0.2285  acc=0.7140


Sample: 100%|██████████| 330/330 [00:16, 20.30it/s, step size=2.58e-01, acc. prob=0.945]


[outer 037] TRAIN (EMA+K-ens) ll=0.6595  br=0.2317  acc=0.7120


Sample: 100%|██████████| 330/330 [00:14, 23.12it/s, step size=2.73e-01, acc. prob=0.953]


[outer 038] TRAIN (EMA+K-ens) ll=0.6526  br=0.2287  acc=0.7140


Sample: 100%|██████████| 330/330 [00:14, 22.61it/s, step size=3.07e-01, acc. prob=0.937]


[outer 039] TRAIN (EMA+K-ens) ll=0.6458  br=0.2257  acc=0.7160
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}]


Sample: 100%|██████████| 330/330 [00:15, 21.61it/s, step size=3.39e-01, acc. prob=0.919]


[outer 000] TRAIN (EMA+K-ens) ll=0.6929  br=0.2488  acc=0.6270


Sample: 100%|██████████| 330/330 [00:14, 22.75it/s, step size=3.29e-01, acc. prob=0.919]


[outer 001] TRAIN (EMA+K-ens) ll=0.6814  br=0.2428  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=1.97e-01, acc. prob=0.978]


[outer 002] TRAIN (EMA+K-ens) ll=0.6648  br=0.2350  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 22.80it/s, step size=2.68e-01, acc. prob=0.943]


[outer 003] TRAIN (EMA+K-ens) ll=0.6670  br=0.2362  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 20.32it/s, step size=2.95e-01, acc. prob=0.937]


[outer 004] TRAIN (EMA+K-ens) ll=0.6621  br=0.2339  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 20.93it/s, step size=2.57e-01, acc. prob=0.918]


[outer 005] TRAIN (EMA+K-ens) ll=0.6666  br=0.2363  acc=0.6970


Sample: 100%|██████████| 330/330 [00:14, 23.11it/s, step size=2.80e-01, acc. prob=0.957]


[outer 006] TRAIN (EMA+K-ens) ll=0.6642  br=0.2351  acc=0.7020


Sample: 100%|██████████| 330/330 [00:15, 21.55it/s, step size=2.93e-01, acc. prob=0.946]


[outer 007] TRAIN (EMA+K-ens) ll=0.6659  br=0.2358  acc=0.7050


Sample: 100%|██████████| 330/330 [00:17, 19.01it/s, step size=2.85e-01, acc. prob=0.921]


[outer 008] TRAIN (EMA+K-ens) ll=0.6575  br=0.2319  acc=0.7040


Sample: 100%|██████████| 330/330 [00:16, 20.46it/s, step size=2.91e-01, acc. prob=0.942]


[outer 009] TRAIN (EMA+K-ens) ll=0.6598  br=0.2328  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 20.26it/s, step size=2.92e-01, acc. prob=0.922]


[outer 010] TRAIN (EMA+K-ens) ll=0.6624  br=0.2339  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.89it/s, step size=3.05e-01, acc. prob=0.917]


[outer 011] TRAIN (EMA+K-ens) ll=0.6723  br=0.2382  acc=0.7060


Sample: 100%|██████████| 330/330 [00:16, 20.59it/s, step size=2.34e-01, acc. prob=0.952]


[outer 012] TRAIN (EMA+K-ens) ll=0.6607  br=0.2329  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 23.03it/s, step size=2.85e-01, acc. prob=0.959]


[outer 013] TRAIN (EMA+K-ens) ll=0.6508  br=0.2283  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.70e-01, acc. prob=0.950]


[outer 014] TRAIN (EMA+K-ens) ll=0.6568  br=0.2309  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 20.75it/s, step size=2.77e-01, acc. prob=0.927]


[outer 015] TRAIN (EMA+K-ens) ll=0.6626  br=0.2335  acc=0.7040


Sample: 100%|██████████| 330/330 [00:15, 21.02it/s, step size=2.80e-01, acc. prob=0.932]


[outer 016] TRAIN (EMA+K-ens) ll=0.6692  br=0.2364  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 20.14it/s, step size=1.94e-01, acc. prob=0.979]


[outer 017] TRAIN (EMA+K-ens) ll=0.6607  br=0.2328  acc=0.7050
[Early stop @ outer 17] Δll=0.116%, Δbr=0.399%, Δacc=0.003
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}, {'accuracy': 0.6688599586486816, 'brier': 0.22250163555145264, 'logloss': 0.6349937915802002}]


Sample: 100%|██████████| 330/330 [00:16, 19.75it/s, step size=2.60e-01, acc. prob=0.966]


[outer 000] TRAIN (EMA+K-ens) ll=0.6295  br=0.2190  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=2.84e-01, acc. prob=0.949]


[outer 001] TRAIN (EMA+K-ens) ll=0.6676  br=0.2375  acc=0.5770


Sample: 100%|██████████| 330/330 [00:15, 21.86it/s, step size=2.92e-01, acc. prob=0.974]


[outer 002] TRAIN (EMA+K-ens) ll=0.6298  br=0.2190  acc=0.6530


Sample: 100%|██████████| 330/330 [00:15, 21.03it/s, step size=2.51e-01, acc. prob=0.954]


[outer 003] TRAIN (EMA+K-ens) ll=0.6390  br=0.2238  acc=0.6350


Sample: 100%|██████████| 330/330 [00:15, 21.42it/s, step size=2.96e-01, acc. prob=0.914]


[outer 004] TRAIN (EMA+K-ens) ll=0.6179  br=0.2136  acc=0.6780


Sample: 100%|██████████| 330/330 [00:15, 20.67it/s, step size=2.27e-01, acc. prob=0.967]


[outer 005] TRAIN (EMA+K-ens) ll=0.6450  br=0.2264  acc=0.6440


Sample: 100%|██████████| 330/330 [00:16, 20.62it/s, step size=3.00e-01, acc. prob=0.950]


[outer 006] TRAIN (EMA+K-ens) ll=0.6573  br=0.2322  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 19.05it/s, step size=2.07e-01, acc. prob=0.974]


[outer 007] TRAIN (EMA+K-ens) ll=0.6544  br=0.2309  acc=0.6500


Sample: 100%|██████████| 330/330 [00:14, 23.14it/s, step size=2.77e-01, acc. prob=0.929]


[outer 008] TRAIN (EMA+K-ens) ll=0.6656  br=0.2359  acc=0.6680


Sample: 100%|██████████| 330/330 [00:14, 22.60it/s, step size=3.35e-01, acc. prob=0.951]


[outer 009] TRAIN (EMA+K-ens) ll=0.6598  br=0.2330  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.47it/s, step size=3.40e-01, acc. prob=0.929]


[outer 010] TRAIN (EMA+K-ens) ll=0.6808  br=0.2426  acc=0.6710


Sample: 100%|██████████| 330/330 [00:15, 21.23it/s, step size=2.37e-01, acc. prob=0.950]


[outer 011] TRAIN (EMA+K-ens) ll=0.6858  br=0.2450  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 20.54it/s, step size=2.35e-01, acc. prob=0.954]


[outer 012] TRAIN (EMA+K-ens) ll=0.7001  br=0.2519  acc=0.6780


Sample: 100%|██████████| 330/330 [00:15, 21.45it/s, step size=2.60e-01, acc. prob=0.946]


[outer 013] TRAIN (EMA+K-ens) ll=0.6904  br=0.2474  acc=0.6780


Sample: 100%|██████████| 330/330 [00:15, 20.92it/s, step size=2.98e-01, acc. prob=0.941]


[outer 014] TRAIN (EMA+K-ens) ll=0.6768  br=0.2412  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 21.65it/s, step size=2.94e-01, acc. prob=0.943]


[outer 015] TRAIN (EMA+K-ens) ll=0.6881  br=0.2466  acc=0.6720


Sample: 100%|██████████| 330/330 [00:14, 22.27it/s, step size=2.51e-01, acc. prob=0.922]


[outer 016] TRAIN (EMA+K-ens) ll=0.6799  br=0.2426  acc=0.6690


Sample: 100%|██████████| 330/330 [00:14, 22.23it/s, step size=2.88e-01, acc. prob=0.912]


[outer 017] TRAIN (EMA+K-ens) ll=0.6810  br=0.2432  acc=0.6600


Sample: 100%|██████████| 330/330 [00:15, 21.19it/s, step size=3.04e-01, acc. prob=0.951]


[outer 018] TRAIN (EMA+K-ens) ll=0.6749  br=0.2404  acc=0.6600


Sample: 100%|██████████| 330/330 [00:15, 20.75it/s, step size=2.65e-01, acc. prob=0.939]


[outer 019] TRAIN (EMA+K-ens) ll=0.6780  br=0.2418  acc=0.6570


Sample: 100%|██████████| 330/330 [00:16, 20.58it/s, step size=3.10e-01, acc. prob=0.940]


[outer 020] TRAIN (EMA+K-ens) ll=0.6829  br=0.2440  acc=0.6460


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=2.40e-01, acc. prob=0.952]


[outer 021] TRAIN (EMA+K-ens) ll=0.6803  br=0.2429  acc=0.6510


Sample: 100%|██████████| 330/330 [00:16, 20.46it/s, step size=2.96e-01, acc. prob=0.950]


[outer 022] TRAIN (EMA+K-ens) ll=0.6842  br=0.2447  acc=0.6530


Sample: 100%|██████████| 330/330 [00:16, 20.57it/s, step size=2.79e-01, acc. prob=0.927]


[outer 023] TRAIN (EMA+K-ens) ll=0.6797  br=0.2428  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 20.46it/s, step size=2.84e-01, acc. prob=0.930]


[outer 024] TRAIN (EMA+K-ens) ll=0.6838  br=0.2447  acc=0.6310


Sample: 100%|██████████| 330/330 [00:16, 20.52it/s, step size=2.92e-01, acc. prob=0.952]


[outer 025] TRAIN (EMA+K-ens) ll=0.6831  br=0.2443  acc=0.6440


Sample: 100%|██████████| 330/330 [00:16, 19.85it/s, step size=3.19e-01, acc. prob=0.946]


[outer 026] TRAIN (EMA+K-ens) ll=0.6815  br=0.2435  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=2.95e-01, acc. prob=0.924]


[outer 027] TRAIN (EMA+K-ens) ll=0.6809  br=0.2433  acc=0.6550


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=3.13e-01, acc. prob=0.935]


[outer 028] TRAIN (EMA+K-ens) ll=0.6809  br=0.2435  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 19.89it/s, step size=2.48e-01, acc. prob=0.946]


[outer 029] TRAIN (EMA+K-ens) ll=0.6826  br=0.2442  acc=0.6310


Sample: 100%|██████████| 330/330 [00:16, 20.59it/s, step size=2.17e-01, acc. prob=0.974]


[outer 030] TRAIN (EMA+K-ens) ll=0.6847  br=0.2449  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 20.15it/s, step size=2.38e-01, acc. prob=0.967]


[outer 031] TRAIN (EMA+K-ens) ll=0.6842  br=0.2445  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.60it/s, step size=2.35e-01, acc. prob=0.963]


[outer 032] TRAIN (EMA+K-ens) ll=0.6875  br=0.2458  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.77it/s, step size=2.37e-01, acc. prob=0.951]


[outer 033] TRAIN (EMA+K-ens) ll=0.6880  br=0.2461  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.20it/s, step size=2.83e-01, acc. prob=0.971]


[outer 034] TRAIN (EMA+K-ens) ll=0.6965  br=0.2499  acc=0.6880


Sample: 100%|██████████| 330/330 [00:16, 19.52it/s, step size=2.42e-01, acc. prob=0.937]


[outer 035] TRAIN (EMA+K-ens) ll=0.6884  br=0.2460  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.15it/s, step size=2.39e-01, acc. prob=0.968]


[outer 036] TRAIN (EMA+K-ens) ll=0.6864  br=0.2448  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 23.57it/s, step size=2.93e-01, acc. prob=0.958]


[outer 037] TRAIN (EMA+K-ens) ll=0.6827  br=0.2430  acc=0.6840


Sample: 100%|██████████| 330/330 [00:14, 22.52it/s, step size=2.88e-01, acc. prob=0.945]


[outer 038] TRAIN (EMA+K-ens) ll=0.6899  br=0.2463  acc=0.6790


Sample: 100%|██████████| 330/330 [00:14, 23.17it/s, step size=2.53e-01, acc. prob=0.955]


[outer 039] TRAIN (EMA+K-ens) ll=0.6872  br=0.2455  acc=0.6900
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}, {'accuracy': 0.6688599586486816, 'brier': 0.22250163555145264, 'logloss': 0.6349937915802002}, {'accuracy': 0.6383999586105347, 'brier': 0.2461216300725937, 'logloss': 0.6869973540306091}]


Sample: 100%|██████████| 330/330 [00:15, 21.34it/s, step size=3.01e-01, acc. prob=0.924]


[outer 000] TRAIN (EMA+K-ens) ll=0.7239  br=0.2640  acc=0.4830


Sample: 100%|██████████| 330/330 [00:15, 21.69it/s, step size=2.89e-01, acc. prob=0.953]


[outer 001] TRAIN (EMA+K-ens) ll=0.7257  br=0.2655  acc=0.4710


Sample: 100%|██████████| 330/330 [00:17, 18.77it/s, step size=2.51e-01, acc. prob=0.967]


[outer 002] TRAIN (EMA+K-ens) ll=0.7009  br=0.2534  acc=0.5470


Sample: 100%|██████████| 330/330 [00:16, 20.45it/s, step size=2.51e-01, acc. prob=0.954]


[outer 003] TRAIN (EMA+K-ens) ll=0.6868  br=0.2466  acc=0.6030


Sample: 100%|██████████| 330/330 [00:15, 21.48it/s, step size=2.70e-01, acc. prob=0.962]


[outer 004] TRAIN (EMA+K-ens) ll=0.6795  br=0.2430  acc=0.6510


Sample: 100%|██████████| 330/330 [00:14, 22.25it/s, step size=2.93e-01, acc. prob=0.944]


[outer 005] TRAIN (EMA+K-ens) ll=0.6844  br=0.2452  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=2.14e-01, acc. prob=0.957]


[outer 006] TRAIN (EMA+K-ens) ll=0.6799  br=0.2429  acc=0.6650


Sample: 100%|██████████| 330/330 [00:15, 21.56it/s, step size=2.95e-01, acc. prob=0.926]


[outer 007] TRAIN (EMA+K-ens) ll=0.6789  br=0.2424  acc=0.6810


Sample: 100%|██████████| 330/330 [00:14, 22.56it/s, step size=2.43e-01, acc. prob=0.956]


[outer 008] TRAIN (EMA+K-ens) ll=0.6723  br=0.2391  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.54it/s, step size=2.80e-01, acc. prob=0.948]


[outer 009] TRAIN (EMA+K-ens) ll=0.6638  br=0.2350  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.33it/s, step size=2.45e-01, acc. prob=0.962]


[outer 010] TRAIN (EMA+K-ens) ll=0.6684  br=0.2370  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 19.98it/s, step size=2.80e-01, acc. prob=0.971]


[outer 011] TRAIN (EMA+K-ens) ll=0.6696  br=0.2375  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.58it/s, step size=2.84e-01, acc. prob=0.951]


[outer 012] TRAIN (EMA+K-ens) ll=0.6750  br=0.2399  acc=0.6850


Sample: 100%|██████████| 330/330 [00:17, 19.25it/s, step size=2.13e-01, acc. prob=0.956]


[outer 013] TRAIN (EMA+K-ens) ll=0.6706  br=0.2379  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.11it/s, step size=2.94e-01, acc. prob=0.965]


[outer 014] TRAIN (EMA+K-ens) ll=0.6664  br=0.2361  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.72it/s, step size=2.74e-01, acc. prob=0.956]


[outer 015] TRAIN (EMA+K-ens) ll=0.6661  br=0.2361  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 23.78it/s, step size=3.47e-01, acc. prob=0.936]


[outer 016] TRAIN (EMA+K-ens) ll=0.6732  br=0.2395  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 20.86it/s, step size=2.93e-01, acc. prob=0.962]


[outer 017] TRAIN (EMA+K-ens) ll=0.6783  br=0.2419  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 20.73it/s, step size=2.54e-01, acc. prob=0.939]


[outer 018] TRAIN (EMA+K-ens) ll=0.6715  br=0.2388  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.50it/s, step size=3.06e-01, acc. prob=0.920]


[outer 019] TRAIN (EMA+K-ens) ll=0.6772  br=0.2417  acc=0.6850
[Early stop @ outer 19] Δll=0.097%, Δbr=0.112%, Δacc=0.003
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}, {'accuracy': 0.6688599586486816, 'brier': 0.22250163555145264, 'logloss': 0.6349937915802002}, {'accuracy': 0.6383999586105347, 'brier': 0.2461216300725937, 'logloss': 0.6869973540306091}, {'accuracy': 0.6940999627113342, 'brier': 0.22990211844444275, 'logloss': 0.65262371301651}]


Sample: 100%|██████████| 330/330 [00:15, 20.86it/s, step size=2.51e-01, acc. prob=0.949]


[outer 000] TRAIN (EMA+K-ens) ll=0.7860  br=0.2891  acc=0.5010


Sample: 100%|██████████| 330/330 [00:16, 20.14it/s, step size=2.75e-01, acc. prob=0.963]


[outer 001] TRAIN (EMA+K-ens) ll=0.7173  br=0.2591  acc=0.6290


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=2.32e-01, acc. prob=0.960]


[outer 002] TRAIN (EMA+K-ens) ll=0.7692  br=0.2810  acc=0.5660


Sample: 100%|██████████| 330/330 [00:17, 19.31it/s, step size=2.35e-01, acc. prob=0.961]


[outer 003] TRAIN (EMA+K-ens) ll=0.7627  br=0.2798  acc=0.5490


Sample: 100%|██████████| 330/330 [00:17, 18.34it/s, step size=2.77e-01, acc. prob=0.949]


[outer 004] TRAIN (EMA+K-ens) ll=0.7541  br=0.2759  acc=0.5850


Sample: 100%|██████████| 330/330 [00:15, 21.28it/s, step size=3.31e-01, acc. prob=0.882]


[outer 005] TRAIN (EMA+K-ens) ll=0.7517  br=0.2750  acc=0.6220


Sample: 100%|██████████| 330/330 [00:15, 21.19it/s, step size=2.88e-01, acc. prob=0.938]


[outer 006] TRAIN (EMA+K-ens) ll=0.7441  br=0.2718  acc=0.6200


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=2.13e-01, acc. prob=0.958]


[outer 007] TRAIN (EMA+K-ens) ll=0.7482  br=0.2733  acc=0.6200


Sample: 100%|██████████| 330/330 [00:15, 21.47it/s, step size=2.76e-01, acc. prob=0.931]


[outer 008] TRAIN (EMA+K-ens) ll=0.7441  br=0.2716  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=2.62e-01, acc. prob=0.953]


[outer 009] TRAIN (EMA+K-ens) ll=0.7598  br=0.2787  acc=0.6150


Sample: 100%|██████████| 330/330 [00:15, 21.45it/s, step size=2.82e-01, acc. prob=0.944]


[outer 010] TRAIN (EMA+K-ens) ll=0.7360  br=0.2683  acc=0.6740


Sample: 100%|██████████| 330/330 [00:15, 20.95it/s, step size=2.94e-01, acc. prob=0.932]


[outer 011] TRAIN (EMA+K-ens) ll=0.7248  br=0.2630  acc=0.6780


Sample: 100%|██████████| 330/330 [00:16, 20.50it/s, step size=2.46e-01, acc. prob=0.959]


[outer 012] TRAIN (EMA+K-ens) ll=0.7161  br=0.2589  acc=0.6770


Sample: 100%|██████████| 330/330 [00:15, 20.95it/s, step size=2.54e-01, acc. prob=0.959]


[outer 013] TRAIN (EMA+K-ens) ll=0.7192  br=0.2602  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.35e-01, acc. prob=0.963]


[outer 014] TRAIN (EMA+K-ens) ll=0.7158  br=0.2584  acc=0.6820


Sample: 100%|██████████| 330/330 [00:15, 21.70it/s, step size=2.39e-01, acc. prob=0.951]


[outer 015] TRAIN (EMA+K-ens) ll=0.6957  br=0.2496  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=2.39e-01, acc. prob=0.945]


[outer 016] TRAIN (EMA+K-ens) ll=0.6841  br=0.2444  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.12it/s, step size=2.66e-01, acc. prob=0.956]


[outer 017] TRAIN (EMA+K-ens) ll=0.6689  br=0.2373  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=2.91e-01, acc. prob=0.952]


[outer 018] TRAIN (EMA+K-ens) ll=0.6696  br=0.2377  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.12it/s, step size=2.95e-01, acc. prob=0.940]


[outer 019] TRAIN (EMA+K-ens) ll=0.6706  br=0.2382  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.34it/s, step size=2.36e-01, acc. prob=0.949]


[outer 020] TRAIN (EMA+K-ens) ll=0.6644  br=0.2353  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.21it/s, step size=2.50e-01, acc. prob=0.970]


[outer 021] TRAIN (EMA+K-ens) ll=0.6545  br=0.2306  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.35it/s, step size=2.62e-01, acc. prob=0.952]


[outer 022] TRAIN (EMA+K-ens) ll=0.6542  br=0.2305  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.31it/s, step size=2.35e-01, acc. prob=0.945]


[outer 023] TRAIN (EMA+K-ens) ll=0.6507  br=0.2287  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.26it/s, step size=2.26e-01, acc. prob=0.946]


[outer 024] TRAIN (EMA+K-ens) ll=0.6495  br=0.2280  acc=0.6850


Sample: 100%|██████████| 330/330 [00:17, 18.92it/s, step size=2.42e-01, acc. prob=0.961]


[outer 025] TRAIN (EMA+K-ens) ll=0.6545  br=0.2302  acc=0.6850


Sample: 100%|██████████| 330/330 [00:17, 18.58it/s, step size=2.19e-01, acc. prob=0.949]


[outer 026] TRAIN (EMA+K-ens) ll=0.6549  br=0.2303  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 20.96it/s, step size=3.41e-01, acc. prob=0.927]


[outer 027] TRAIN (EMA+K-ens) ll=0.6528  br=0.2293  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.35it/s, step size=2.29e-01, acc. prob=0.959]


[outer 028] TRAIN (EMA+K-ens) ll=0.6653  br=0.2349  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 23.24it/s, step size=2.95e-01, acc. prob=0.921]


[outer 029] TRAIN (EMA+K-ens) ll=0.6638  br=0.2343  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 22.65it/s, step size=2.51e-01, acc. prob=0.948]


[outer 030] TRAIN (EMA+K-ens) ll=0.6562  br=0.2308  acc=0.6850


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=2.34e-01, acc. prob=0.949]


[outer 031] TRAIN (EMA+K-ens) ll=0.6656  br=0.2354  acc=0.6850
[Early stop @ outer 31] Δll=0.009%, Δbr=0.235%, Δacc=0.000
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}, {'accuracy': 0.6688599586486816, 'brier': 0.22250163555145264, 'logloss': 0.6349937915802002}, {'accuracy': 0.6383999586105347, 'brier': 0.2461216300725937, 'logloss': 0.6869973540306091}, {'accuracy': 0.6940999627113342, 'brier': 0.22990211844444275, 'logloss': 0.65262371301651}, {'accuracy': 0.6937599778175354, 'brier': 0.24265410006046295, 'logloss': 0.6804242134094238}]


Sample: 100%|██████████| 330/330 [00:15, 20.82it/s, step size=2.52e-01, acc. prob=0.948]


[outer 000] TRAIN (EMA+K-ens) ll=0.6752  br=0.2410  acc=0.5890


Sample: 100%|██████████| 330/330 [00:15, 21.06it/s, step size=2.68e-01, acc. prob=0.942]


[outer 001] TRAIN (EMA+K-ens) ll=0.6932  br=0.2492  acc=0.6300


Sample: 100%|██████████| 330/330 [00:15, 20.66it/s, step size=2.49e-01, acc. prob=0.950]


[outer 002] TRAIN (EMA+K-ens) ll=0.6957  br=0.2504  acc=0.6330


Sample: 100%|██████████| 330/330 [00:17, 18.50it/s, step size=2.30e-01, acc. prob=0.967]


[outer 003] TRAIN (EMA+K-ens) ll=0.6888  br=0.2468  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 20.36it/s, step size=2.49e-01, acc. prob=0.955]


[outer 004] TRAIN (EMA+K-ens) ll=0.6870  br=0.2459  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.69e-01, acc. prob=0.964]


[outer 005] TRAIN (EMA+K-ens) ll=0.6878  br=0.2460  acc=0.6710


Sample: 100%|██████████| 330/330 [00:15, 20.92it/s, step size=2.26e-01, acc. prob=0.939]


[outer 006] TRAIN (EMA+K-ens) ll=0.6860  br=0.2450  acc=0.6720


Sample: 100%|██████████| 330/330 [00:14, 22.36it/s, step size=2.90e-01, acc. prob=0.944]


[outer 007] TRAIN (EMA+K-ens) ll=0.6876  br=0.2455  acc=0.6470


Sample: 100%|██████████| 330/330 [00:15, 21.85it/s, step size=3.02e-01, acc. prob=0.926]


[outer 008] TRAIN (EMA+K-ens) ll=0.6985  br=0.2503  acc=0.6330


Sample: 100%|██████████| 330/330 [00:16, 20.52it/s, step size=2.16e-01, acc. prob=0.973]


[outer 009] TRAIN (EMA+K-ens) ll=0.6910  br=0.2469  acc=0.6330


Sample: 100%|██████████| 330/330 [00:15, 21.24it/s, step size=2.27e-01, acc. prob=0.951]


[outer 010] TRAIN (EMA+K-ens) ll=0.7007  br=0.2512  acc=0.6080


Sample: 100%|██████████| 330/330 [00:14, 22.01it/s, step size=2.80e-01, acc. prob=0.940]


[outer 011] TRAIN (EMA+K-ens) ll=0.6973  br=0.2496  acc=0.6370


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=2.37e-01, acc. prob=0.960]


[outer 012] TRAIN (EMA+K-ens) ll=0.7045  br=0.2526  acc=0.6680


Sample: 100%|██████████| 330/330 [00:15, 21.41it/s, step size=3.20e-01, acc. prob=0.920]


[outer 013] TRAIN (EMA+K-ens) ll=0.7098  br=0.2551  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 21.95it/s, step size=2.35e-01, acc. prob=0.968]


[outer 014] TRAIN (EMA+K-ens) ll=0.7127  br=0.2566  acc=0.6980


Sample: 100%|██████████| 330/330 [00:16, 20.20it/s, step size=2.88e-01, acc. prob=0.955]


[outer 015] TRAIN (EMA+K-ens) ll=0.7176  br=0.2591  acc=0.6920


Sample: 100%|██████████| 330/330 [00:15, 21.07it/s, step size=2.77e-01, acc. prob=0.974]


[outer 016] TRAIN (EMA+K-ens) ll=0.7132  br=0.2570  acc=0.6520


Sample: 100%|██████████| 330/330 [00:16, 20.57it/s, step size=2.31e-01, acc. prob=0.951]


[outer 017] TRAIN (EMA+K-ens) ll=0.7108  br=0.2559  acc=0.6710


Sample: 100%|██████████| 330/330 [00:15, 21.39it/s, step size=3.12e-01, acc. prob=0.937]


[outer 018] TRAIN (EMA+K-ens) ll=0.7105  br=0.2557  acc=0.6530


Sample: 100%|██████████| 330/330 [00:14, 22.31it/s, step size=2.98e-01, acc. prob=0.936]


[outer 019] TRAIN (EMA+K-ens) ll=0.7216  br=0.2604  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 20.55it/s, step size=3.17e-01, acc. prob=0.927]


[outer 020] TRAIN (EMA+K-ens) ll=0.7179  br=0.2589  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 20.55it/s, step size=2.34e-01, acc. prob=0.962]


[outer 021] TRAIN (EMA+K-ens) ll=0.6983  br=0.2502  acc=0.6670


Sample: 100%|██████████| 330/330 [00:14, 23.09it/s, step size=2.88e-01, acc. prob=0.954]


[outer 022] TRAIN (EMA+K-ens) ll=0.6924  br=0.2478  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 21.38it/s, step size=3.07e-01, acc. prob=0.914]


[outer 023] TRAIN (EMA+K-ens) ll=0.6851  br=0.2445  acc=0.6960
[Early stop @ outer 23] Δll=0.389%, Δbr=0.439%, Δacc=0.002
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}, {'accuracy': 0.6688599586486816, 'brier': 0.22250163555145264, 'logloss': 0.6349937915802002}, {'accuracy': 0.6383999586105347, 'brier': 0.2461216300725937, 'logloss': 0.6869973540306091}, {'accuracy': 0.6940999627113342, 'brier': 0.22990211844444275, 'logloss': 0.65262371301651}, {'accuracy': 0.6937599778175354, 'brier': 0.24265410006046295, 'logloss': 0.6804242134094238}, {'accuracy': 0.6919599771499634, 'brier': 0.245417520403862, 'logloss': 0.6885477304458618}]


Sample: 100%|██████████| 330/330 [00:14, 22.50it/s, step size=3.11e-01, acc. prob=0.941]


[outer 000] TRAIN (EMA+K-ens) ll=0.7131  br=0.2560  acc=0.6160


Sample: 100%|██████████| 330/330 [00:15, 20.65it/s, step size=2.79e-01, acc. prob=0.950]


[outer 001] TRAIN (EMA+K-ens) ll=0.6977  br=0.2498  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=2.52e-01, acc. prob=0.947]


[outer 002] TRAIN (EMA+K-ens) ll=0.7178  br=0.2591  acc=0.6160


Sample: 100%|██████████| 330/330 [00:15, 20.98it/s, step size=2.40e-01, acc. prob=0.963]


[outer 003] TRAIN (EMA+K-ens) ll=0.7075  br=0.2546  acc=0.6200


Sample: 100%|██████████| 330/330 [00:15, 21.22it/s, step size=2.65e-01, acc. prob=0.950]


[outer 004] TRAIN (EMA+K-ens) ll=0.6938  br=0.2486  acc=0.6640


Sample: 100%|██████████| 330/330 [00:15, 21.16it/s, step size=2.58e-01, acc. prob=0.951]


[outer 005] TRAIN (EMA+K-ens) ll=0.6860  br=0.2449  acc=0.6600


Sample: 100%|██████████| 330/330 [00:14, 22.10it/s, step size=3.49e-01, acc. prob=0.903]


[outer 006] TRAIN (EMA+K-ens) ll=0.6829  br=0.2435  acc=0.6650


Sample: 100%|██████████| 330/330 [00:15, 21.29it/s, step size=1.93e-01, acc. prob=0.974]


[outer 007] TRAIN (EMA+K-ens) ll=0.6742  br=0.2394  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.22e-01, acc. prob=0.957]


[outer 008] TRAIN (EMA+K-ens) ll=0.6598  br=0.2329  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 20.92it/s, step size=2.24e-01, acc. prob=0.957]


[outer 009] TRAIN (EMA+K-ens) ll=0.6693  br=0.2373  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.66it/s, step size=2.61e-01, acc. prob=0.937]


[outer 010] TRAIN (EMA+K-ens) ll=0.6634  br=0.2346  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.15it/s, step size=2.62e-01, acc. prob=0.942]


[outer 011] TRAIN (EMA+K-ens) ll=0.6568  br=0.2314  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 20.96it/s, step size=2.28e-01, acc. prob=0.961]


[outer 012] TRAIN (EMA+K-ens) ll=0.6613  br=0.2335  acc=0.6950


Sample: 100%|██████████| 330/330 [00:16, 20.55it/s, step size=2.19e-01, acc. prob=0.969]


[outer 013] TRAIN (EMA+K-ens) ll=0.6679  br=0.2368  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.91it/s, step size=3.01e-01, acc. prob=0.949]


[outer 014] TRAIN (EMA+K-ens) ll=0.6702  br=0.2380  acc=0.6960


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=2.40e-01, acc. prob=0.974]


[outer 015] TRAIN (EMA+K-ens) ll=0.6718  br=0.2387  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.33it/s, step size=2.93e-01, acc. prob=0.944]


[outer 016] TRAIN (EMA+K-ens) ll=0.6767  br=0.2410  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 22.46it/s, step size=3.24e-01, acc. prob=0.922]


[outer 017] TRAIN (EMA+K-ens) ll=0.6726  br=0.2390  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.05it/s, step size=2.39e-01, acc. prob=0.962]


[outer 018] TRAIN (EMA+K-ens) ll=0.6673  br=0.2364  acc=0.6970


Sample: 100%|██████████| 330/330 [00:17, 18.97it/s, step size=2.63e-01, acc. prob=0.938]


[outer 019] TRAIN (EMA+K-ens) ll=0.6650  br=0.2353  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.30it/s, step size=3.29e-01, acc. prob=0.924]


[outer 020] TRAIN (EMA+K-ens) ll=0.6625  br=0.2341  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 20.93it/s, step size=2.79e-01, acc. prob=0.940]


[outer 021] TRAIN (EMA+K-ens) ll=0.6489  br=0.2277  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.67it/s, step size=2.88e-01, acc. prob=0.958]


[outer 022] TRAIN (EMA+K-ens) ll=0.6521  br=0.2291  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 22.03it/s, step size=2.70e-01, acc. prob=0.942]


[outer 023] TRAIN (EMA+K-ens) ll=0.6574  br=0.2315  acc=0.6920


Sample: 100%|██████████| 330/330 [00:15, 21.49it/s, step size=2.49e-01, acc. prob=0.930]


[outer 024] TRAIN (EMA+K-ens) ll=0.6696  br=0.2371  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 20.43it/s, step size=2.73e-01, acc. prob=0.936]


[outer 025] TRAIN (EMA+K-ens) ll=0.6698  br=0.2371  acc=0.6780


Sample: 100%|██████████| 330/330 [00:16, 20.41it/s, step size=2.72e-01, acc. prob=0.964]


[outer 026] TRAIN (EMA+K-ens) ll=0.6690  br=0.2367  acc=0.6840


Sample: 100%|██████████| 330/330 [00:18, 18.16it/s, step size=2.16e-01, acc. prob=0.970]


[outer 027] TRAIN (EMA+K-ens) ll=0.6829  br=0.2428  acc=0.6690


Sample: 100%|██████████| 330/330 [00:13, 23.68it/s, step size=2.93e-01, acc. prob=0.940]


[outer 028] TRAIN (EMA+K-ens) ll=0.6835  br=0.2431  acc=0.6730


Sample: 100%|██████████| 330/330 [00:14, 23.46it/s, step size=2.43e-01, acc. prob=0.929]


[outer 029] TRAIN (EMA+K-ens) ll=0.6932  br=0.2473  acc=0.6690


Sample: 100%|██████████| 330/330 [00:15, 21.72it/s, step size=2.62e-01, acc. prob=0.922]


[outer 030] TRAIN (EMA+K-ens) ll=0.6836  br=0.2432  acc=0.6610


Sample: 100%|██████████| 330/330 [00:15, 20.77it/s, step size=2.50e-01, acc. prob=0.966]


[outer 031] TRAIN (EMA+K-ens) ll=0.6835  br=0.2431  acc=0.6630


Sample: 100%|██████████| 330/330 [00:15, 20.77it/s, step size=3.64e-01, acc. prob=0.931]


[outer 032] TRAIN (EMA+K-ens) ll=0.6767  br=0.2401  acc=0.6480


Sample: 100%|██████████| 330/330 [00:15, 21.38it/s, step size=2.65e-01, acc. prob=0.942]


[outer 033] TRAIN (EMA+K-ens) ll=0.6578  br=0.2317  acc=0.6760


Sample: 100%|██████████| 330/330 [00:15, 21.79it/s, step size=2.45e-01, acc. prob=0.950]


[outer 034] TRAIN (EMA+K-ens) ll=0.6715  br=0.2381  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 23.12it/s, step size=2.90e-01, acc. prob=0.954]


[outer 035] TRAIN (EMA+K-ens) ll=0.6672  br=0.2363  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 20.86it/s, step size=2.21e-01, acc. prob=0.970]


[outer 036] TRAIN (EMA+K-ens) ll=0.6647  br=0.2351  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 21.79it/s, step size=2.36e-01, acc. prob=0.959]


[outer 037] TRAIN (EMA+K-ens) ll=0.6700  br=0.2377  acc=0.6720


Sample: 100%|██████████| 330/330 [00:16, 20.28it/s, step size=3.25e-01, acc. prob=0.945]


[outer 038] TRAIN (EMA+K-ens) ll=0.6692  br=0.2374  acc=0.6820


Sample: 100%|██████████| 330/330 [00:17, 18.56it/s, step size=2.23e-01, acc. prob=0.930]


[outer 039] TRAIN (EMA+K-ens) ll=0.6640  br=0.2349  acc=0.6940
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}, {'accuracy': 0.6688599586486816, 'brier': 0.22250163555145264, 'logloss': 0.6349937915802002}, {'accuracy': 0.6383999586105347, 'brier': 0.2461216300725937, 'logloss': 0.6869973540306091}, {'accuracy': 0.6940999627113342, 'brier': 0.22990211844444275, 'logloss': 0.65262371301651}, {'accuracy': 0.6937599778175354, 'brier': 0.24265410006046295, 'logloss': 0.6804242134094238}, {'accuracy': 0.6919599771499634, 'brier': 0.245417520403862, 'logloss': 0.6885477304458618}, {'accuracy': 0.6908199787139893, 'brier': 0.24359725415706635, 'logloss': 0.6830725073814392}]


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=3.04e-01, acc. prob=0.920]


[outer 000] TRAIN (EMA+K-ens) ll=0.6911  br=0.2479  acc=0.6730


Sample: 100%|██████████| 330/330 [00:15, 21.90it/s, step size=2.68e-01, acc. prob=0.934]


[outer 001] TRAIN (EMA+K-ens) ll=0.7149  br=0.2592  acc=0.5720


Sample: 100%|██████████| 330/330 [00:15, 21.90it/s, step size=3.31e-01, acc. prob=0.936]


[outer 002] TRAIN (EMA+K-ens) ll=0.6704  br=0.2384  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 20.22it/s, step size=2.84e-01, acc. prob=0.953]


[outer 003] TRAIN (EMA+K-ens) ll=0.7036  br=0.2542  acc=0.5470


Sample: 100%|██████████| 330/330 [00:15, 21.36it/s, step size=2.46e-01, acc. prob=0.955]


[outer 004] TRAIN (EMA+K-ens) ll=0.7083  br=0.2564  acc=0.5470


Sample: 100%|██████████| 330/330 [00:15, 21.00it/s, step size=2.93e-01, acc. prob=0.964]


[outer 005] TRAIN (EMA+K-ens) ll=0.7053  br=0.2547  acc=0.5990


Sample: 100%|██████████| 330/330 [00:15, 20.85it/s, step size=3.06e-01, acc. prob=0.918]


[outer 006] TRAIN (EMA+K-ens) ll=0.7020  br=0.2532  acc=0.6260


Sample: 100%|██████████| 330/330 [00:15, 21.17it/s, step size=2.94e-01, acc. prob=0.945]


[outer 007] TRAIN (EMA+K-ens) ll=0.6940  br=0.2493  acc=0.6690


Sample: 100%|██████████| 330/330 [00:14, 22.16it/s, step size=2.87e-01, acc. prob=0.945]


[outer 008] TRAIN (EMA+K-ens) ll=0.6918  br=0.2483  acc=0.6660


Sample: 100%|██████████| 330/330 [00:15, 21.11it/s, step size=2.57e-01, acc. prob=0.964]


[outer 009] TRAIN (EMA+K-ens) ll=0.6865  br=0.2457  acc=0.6770


Sample: 100%|██████████| 330/330 [00:15, 21.18it/s, step size=2.89e-01, acc. prob=0.941]


[outer 010] TRAIN (EMA+K-ens) ll=0.6987  br=0.2514  acc=0.6530


Sample: 100%|██████████| 330/330 [00:16, 20.19it/s, step size=2.39e-01, acc. prob=0.936]


[outer 011] TRAIN (EMA+K-ens) ll=0.6717  br=0.2387  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 20.37it/s, step size=2.57e-01, acc. prob=0.965]


[outer 012] TRAIN (EMA+K-ens) ll=0.6648  br=0.2354  acc=0.6830


Sample: 100%|██████████| 330/330 [00:15, 21.89it/s, step size=2.62e-01, acc. prob=0.964]


[outer 013] TRAIN (EMA+K-ens) ll=0.6730  br=0.2393  acc=0.6620


Sample: 100%|██████████| 330/330 [00:13, 23.79it/s, step size=2.89e-01, acc. prob=0.947]


[outer 014] TRAIN (EMA+K-ens) ll=0.6849  br=0.2448  acc=0.6660


Sample: 100%|██████████| 330/330 [00:15, 21.91it/s, step size=2.77e-01, acc. prob=0.949]


[outer 015] TRAIN (EMA+K-ens) ll=0.6710  br=0.2384  acc=0.6680


Sample: 100%|██████████| 330/330 [00:15, 21.66it/s, step size=2.32e-01, acc. prob=0.958]


[outer 016] TRAIN (EMA+K-ens) ll=0.6728  br=0.2394  acc=0.6700


Sample: 100%|██████████| 330/330 [00:16, 20.59it/s, step size=2.81e-01, acc. prob=0.945]


[outer 017] TRAIN (EMA+K-ens) ll=0.6763  br=0.2407  acc=0.7060


Sample: 100%|██████████| 330/330 [00:15, 21.26it/s, step size=2.60e-01, acc. prob=0.945]


[outer 018] TRAIN (EMA+K-ens) ll=0.6855  br=0.2446  acc=0.7090


Sample: 100%|██████████| 330/330 [00:15, 21.79it/s, step size=3.07e-01, acc. prob=0.959]


[outer 019] TRAIN (EMA+K-ens) ll=0.6884  br=0.2460  acc=0.7040


Sample: 100%|██████████| 330/330 [00:14, 22.68it/s, step size=2.63e-01, acc. prob=0.940]


[outer 020] TRAIN (EMA+K-ens) ll=0.6918  br=0.2473  acc=0.7070


Sample: 100%|██████████| 330/330 [00:14, 22.71it/s, step size=2.44e-01, acc. prob=0.975]


[outer 021] TRAIN (EMA+K-ens) ll=0.6816  br=0.2427  acc=0.7090


Sample: 100%|██████████| 330/330 [00:15, 21.88it/s, step size=3.08e-01, acc. prob=0.912]


[outer 022] TRAIN (EMA+K-ens) ll=0.6712  br=0.2380  acc=0.7010


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=2.43e-01, acc. prob=0.950]


[outer 023] TRAIN (EMA+K-ens) ll=0.6878  br=0.2454  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.60it/s, step size=2.65e-01, acc. prob=0.962]


[outer 024] TRAIN (EMA+K-ens) ll=0.6874  br=0.2449  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.68it/s, step size=2.64e-01, acc. prob=0.954]


[outer 025] TRAIN (EMA+K-ens) ll=0.6840  br=0.2435  acc=0.7010


Sample: 100%|██████████| 330/330 [00:14, 22.35it/s, step size=3.03e-01, acc. prob=0.931]


[outer 026] TRAIN (EMA+K-ens) ll=0.6850  br=0.2441  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.15it/s, step size=2.96e-01, acc. prob=0.915]


[outer 027] TRAIN (EMA+K-ens) ll=0.6848  br=0.2441  acc=0.7090


Sample: 100%|██████████| 330/330 [00:16, 20.34it/s, step size=2.79e-01, acc. prob=0.960]


[outer 028] TRAIN (EMA+K-ens) ll=0.6850  br=0.2443  acc=0.7090


Sample: 100%|██████████| 330/330 [00:15, 21.02it/s, step size=2.99e-01, acc. prob=0.957]


[outer 029] TRAIN (EMA+K-ens) ll=0.6796  br=0.2418  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 22.26it/s, step size=2.72e-01, acc. prob=0.962]


[outer 030] TRAIN (EMA+K-ens) ll=0.6797  br=0.2419  acc=0.7070


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=2.22e-01, acc. prob=0.962]


[outer 031] TRAIN (EMA+K-ens) ll=0.6764  br=0.2405  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.96it/s, step size=2.67e-01, acc. prob=0.956]


[outer 032] TRAIN (EMA+K-ens) ll=0.6737  br=0.2393  acc=0.7070


Sample: 100%|██████████| 330/330 [00:16, 20.21it/s, step size=2.27e-01, acc. prob=0.956]


[outer 033] TRAIN (EMA+K-ens) ll=0.6757  br=0.2401  acc=0.7070


Sample: 100%|██████████| 330/330 [00:15, 21.42it/s, step size=3.39e-01, acc. prob=0.895]


[outer 034] TRAIN (EMA+K-ens) ll=0.6667  br=0.2359  acc=0.7090


Sample: 100%|██████████| 330/330 [00:16, 20.17it/s, step size=2.70e-01, acc. prob=0.961]


[outer 035] TRAIN (EMA+K-ens) ll=0.6669  br=0.2360  acc=0.7090


Sample: 100%|██████████| 330/330 [00:15, 21.67it/s, step size=2.86e-01, acc. prob=0.938]


[outer 036] TRAIN (EMA+K-ens) ll=0.6650  br=0.2350  acc=0.7090


Sample: 100%|██████████| 330/330 [00:16, 20.55it/s, step size=2.71e-01, acc. prob=0.956]


[outer 037] TRAIN (EMA+K-ens) ll=0.6714  br=0.2380  acc=0.7040


Sample: 100%|██████████| 330/330 [00:15, 21.66it/s, step size=2.42e-01, acc. prob=0.953]


[outer 038] TRAIN (EMA+K-ens) ll=0.6786  br=0.2414  acc=0.7090


Sample: 100%|██████████| 330/330 [00:14, 23.43it/s, step size=2.77e-01, acc. prob=0.929]


[outer 039] TRAIN (EMA+K-ens) ll=0.6840  br=0.2438  acc=0.7080
[{'accuracy': 0.6460599899291992, 'brier': 0.24081972241401672, 'logloss': 0.675815761089325}, {'accuracy': 0.670799970626831, 'brier': 0.24431811273097992, 'logloss': 0.6851312518119812}, {'accuracy': 0.6979599595069885, 'brier': 0.24207225441932678, 'logloss': 0.680252194404602}, {'accuracy': 0.6688599586486816, 'brier': 0.22250163555145264, 'logloss': 0.6349937915802002}, {'accuracy': 0.6383999586105347, 'brier': 0.2461216300725937, 'logloss': 0.6869973540306091}, {'accuracy': 0.6940999627113342, 'brier': 0.22990211844444275, 'logloss': 0.65262371301651}, {'accuracy': 0.6937599778175354, 'brier': 0.24265410006046295, 'logloss': 0.6804242134094238}, {'accuracy': 0.6919599771499634, 'brier': 0.245417520403862, 'logloss': 0.6885477304458618}, {'accuracy': 0.6908199787139893, 'brier': 0.24359725415706635, 'logloss': 0.6830725073814392}, {'accuracy': 0.6908999681472778, 'brier': 0.2408372163772583, 'logloss': 0.67692953348159

In [None]:
all_metrics = []
noise_type = "normal"
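for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible.
    np.random.seed(seed)
    torch.manual_seed(seed)

    # Small training set and a large held-out test set, drawn with the same noise type.
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type=noise_type,
        n_per_group=10000
    )

    # Fit with only the nonlinear term enabled (no interaction, no group effect).
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=False, nonlinear=True, group=False,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    all_metrics.append(res["metrics_test"])
    print(all_metrics)  # cumulative list of test metrics collected so far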

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean','std','median'])
print(summary)
print(df)
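
The traces and metric dictionaries in this notebook report `ll`/`logloss`, `br`/`brier`, and `acc`/`accuracy`. As a reading aid, the sketch below computes these three quantities in plain Python, assuming they are the mean binary log loss, the mean Brier score, and accuracy at a 0.5 threshold. The helper name `binary_metrics` and the toy labels are hypothetical and are not taken from the notebook's own evaluation code.

```python
import math


def binary_metrics(y_true, p_pred, eps=1e-12):
    """Mean log loss, Brier score, and accuracy for binary labels.

    Assumption: this mirrors the usual definitions, not necessarily the
    exact implementation behind the notebook's `metrics_test`.
    """
    n = len(y_true)
    logloss = 0.0
    brier = 0.0
    correct = 0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so log(0) cannot occur.
        p_clipped = min(max(p, eps), 1.0 - eps)
        logloss -= y * math.log(p_clipped) + (1 - y) * math.log(1.0 - p_clipped)
        brier += (p - y) ** 2
        # Predict class 1 when the probability is at least 0.5.
        correct += int((p >= 0.5) == bool(y))
    return {"logloss": logloss / n, "brier": brier / n, "accuracy": correct / n}


# Toy labels (hypothetical), balanced between the two classes.
y = [0, 1, 1, 0]

# Uninformative predictor: always 0.5.
baseline = binary_metrics(y, [0.5] * len(y))

# A sharper predictor that ranks every example correctly.
confident = binary_metrics(y, [0.1, 0.9, 0.8, 0.2])

print(baseline)
print(confident)
```

A constant prediction of 0.5 gives a log loss of ln 2 (about 0.6931) and a Brier score of 0.25, whatever the labels are. Log loss and Brier values close to these reference points therefore suggest predicted probabilities that stay near 0.5, even when accuracy is well above 50%.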

Sample: 100%|██████████| 330/330 [00:19, 16.63it/s, step size=1.78e-01, acc. prob=0.943]


[outer 000] TRAIN (EMA+K-ens) ll=0.6753  br=0.2422  acc=0.5510


Sample: 100%|██████████| 330/330 [00:19, 17.34it/s, step size=2.28e-01, acc. prob=0.926]


[outer 001] TRAIN (EMA+K-ens) ll=0.6441  br=0.2263  acc=0.6130


Sample: 100%|██████████| 330/330 [00:19, 17.27it/s, step size=1.91e-01, acc. prob=0.939]


[outer 002] TRAIN (EMA+K-ens) ll=0.6623  br=0.2345  acc=0.6340


Sample: 100%|██████████| 330/330 [00:21, 15.54it/s, step size=1.72e-01, acc. prob=0.945]


[outer 003] TRAIN (EMA+K-ens) ll=0.6579  br=0.2324  acc=0.7000


Sample: 100%|██████████| 330/330 [00:20, 16.47it/s, step size=2.02e-01, acc. prob=0.932]


[outer 004] TRAIN (EMA+K-ens) ll=0.6523  br=0.2297  acc=0.6670


Sample: 100%|██████████| 330/330 [00:19, 16.53it/s, step size=2.37e-01, acc. prob=0.903]


[outer 005] TRAIN (EMA+K-ens) ll=0.6636  br=0.2352  acc=0.6390


Sample: 100%|██████████| 330/330 [00:20, 16.23it/s, step size=2.11e-01, acc. prob=0.968]


[outer 006] TRAIN (EMA+K-ens) ll=0.6611  br=0.2339  acc=0.6640


Sample: 100%|██████████| 330/330 [00:20, 15.83it/s, step size=2.06e-01, acc. prob=0.927]


[outer 007] TRAIN (EMA+K-ens) ll=0.6590  br=0.2330  acc=0.6520


Sample: 100%|██████████| 330/330 [00:20, 15.77it/s, step size=1.85e-01, acc. prob=0.942]


[outer 008] TRAIN (EMA+K-ens) ll=0.6613  br=0.2340  acc=0.7000


Sample: 100%|██████████| 330/330 [00:19, 16.89it/s, step size=2.08e-01, acc. prob=0.932]


[outer 009] TRAIN (EMA+K-ens) ll=0.6689  br=0.2377  acc=0.6910


Sample: 100%|██████████| 330/330 [00:21, 15.64it/s, step size=1.64e-01, acc. prob=0.956]


[outer 010] TRAIN (EMA+K-ens) ll=0.6630  br=0.2348  acc=0.6870


Sample: 100%|██████████| 330/330 [00:19, 17.17it/s, step size=2.13e-01, acc. prob=0.933]


[outer 011] TRAIN (EMA+K-ens) ll=0.6641  br=0.2354  acc=0.6710


Sample: 100%|██████████| 330/330 [00:20, 16.26it/s, step size=2.10e-01, acc. prob=0.938]


[outer 012] TRAIN (EMA+K-ens) ll=0.6712  br=0.2387  acc=0.6780


Sample: 100%|██████████| 330/330 [00:23, 13.89it/s, step size=1.51e-01, acc. prob=0.974]


[outer 013] TRAIN (EMA+K-ens) ll=0.6649  br=0.2356  acc=0.7000


Sample: 100%|██████████| 330/330 [00:19, 16.98it/s, step size=2.28e-01, acc. prob=0.940]


[outer 014] TRAIN (EMA+K-ens) ll=0.6664  br=0.2362  acc=0.7040


Sample: 100%|██████████| 330/330 [00:23, 13.98it/s, step size=1.62e-01, acc. prob=0.964]


[outer 015] TRAIN (EMA+K-ens) ll=0.6662  br=0.2362  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.63it/s, step size=1.90e-01, acc. prob=0.955]


[outer 016] TRAIN (EMA+K-ens) ll=0.6712  br=0.2386  acc=0.6970


Sample: 100%|██████████| 330/330 [00:20, 16.34it/s, step size=2.16e-01, acc. prob=0.968]


[outer 017] TRAIN (EMA+K-ens) ll=0.6681  br=0.2370  acc=0.7000


Sample: 100%|██████████| 330/330 [00:19, 16.81it/s, step size=2.16e-01, acc. prob=0.945]


[outer 018] TRAIN (EMA+K-ens) ll=0.6721  br=0.2389  acc=0.7000


Sample: 100%|██████████| 330/330 [00:22, 14.43it/s, step size=1.50e-01, acc. prob=0.955]


[outer 019] TRAIN (EMA+K-ens) ll=0.6729  br=0.2394  acc=0.7000


Sample: 100%|██████████| 330/330 [00:21, 15.67it/s, step size=1.82e-01, acc. prob=0.952]


[outer 020] TRAIN (EMA+K-ens) ll=0.6712  br=0.2386  acc=0.7000


Sample: 100%|██████████| 330/330 [00:19, 16.72it/s, step size=1.86e-01, acc. prob=0.966]


[outer 021] TRAIN (EMA+K-ens) ll=0.6681  br=0.2372  acc=0.6970


Sample: 100%|██████████| 330/330 [00:22, 14.86it/s, step size=1.69e-01, acc. prob=0.959]


[outer 022] TRAIN (EMA+K-ens) ll=0.6789  br=0.2424  acc=0.6620


Sample: 100%|██████████| 330/330 [00:21, 15.25it/s, step size=2.00e-01, acc. prob=0.961]


[outer 023] TRAIN (EMA+K-ens) ll=0.6770  br=0.2414  acc=0.6900


Sample: 100%|██████████| 330/330 [00:20, 16.40it/s, step size=1.79e-01, acc. prob=0.955]


[outer 024] TRAIN (EMA+K-ens) ll=0.6736  br=0.2397  acc=0.6870


Sample: 100%|██████████| 330/330 [00:20, 15.98it/s, step size=1.87e-01, acc. prob=0.964]


[outer 025] TRAIN (EMA+K-ens) ll=0.6770  br=0.2415  acc=0.6900


Sample: 100%|██████████| 330/330 [00:22, 14.75it/s, step size=1.63e-01, acc. prob=0.956]


[outer 026] TRAIN (EMA+K-ens) ll=0.6753  br=0.2407  acc=0.6730


Sample: 100%|██████████| 330/330 [00:20, 15.89it/s, step size=2.07e-01, acc. prob=0.948]


[outer 027] TRAIN (EMA+K-ens) ll=0.6748  br=0.2404  acc=0.6880


Sample: 100%|██████████| 330/330 [00:19, 16.68it/s, step size=2.00e-01, acc. prob=0.955]


[outer 028] TRAIN (EMA+K-ens) ll=0.6816  br=0.2437  acc=0.6900


Sample: 100%|██████████| 330/330 [00:19, 16.70it/s, step size=1.83e-01, acc. prob=0.945]


[outer 029] TRAIN (EMA+K-ens) ll=0.6896  br=0.2475  acc=0.6560


Sample: 100%|██████████| 330/330 [00:20, 16.19it/s, step size=1.67e-01, acc. prob=0.963]


[outer 030] TRAIN (EMA+K-ens) ll=0.6843  br=0.2451  acc=0.6830


Sample: 100%|██████████| 330/330 [00:21, 15.29it/s, step size=2.17e-01, acc. prob=0.938]


[outer 031] TRAIN (EMA+K-ens) ll=0.6903  br=0.2480  acc=0.6260


Sample: 100%|██████████| 330/330 [00:20, 16.38it/s, step size=1.59e-01, acc. prob=0.955]


[outer 032] TRAIN (EMA+K-ens) ll=0.6825  br=0.2443  acc=0.6560


Sample: 100%|██████████| 330/330 [00:22, 14.55it/s, step size=2.20e-01, acc. prob=0.940]


[outer 033] TRAIN (EMA+K-ens) ll=0.6858  br=0.2459  acc=0.6490


Sample: 100%|██████████| 330/330 [00:20, 16.09it/s, step size=1.84e-01, acc. prob=0.963]


[outer 034] TRAIN (EMA+K-ens) ll=0.6801  br=0.2431  acc=0.6930


Sample: 100%|██████████| 330/330 [00:20, 16.41it/s, step size=1.96e-01, acc. prob=0.979]


[outer 035] TRAIN (EMA+K-ens) ll=0.6720  br=0.2391  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 15.97it/s, step size=1.93e-01, acc. prob=0.939]


[outer 036] TRAIN (EMA+K-ens) ll=0.6704  br=0.2384  acc=0.7000


Sample: 100%|██████████| 330/330 [00:20, 15.91it/s, step size=2.07e-01, acc. prob=0.941]


[outer 037] TRAIN (EMA+K-ens) ll=0.6616  br=0.2340  acc=0.7000


Sample: 100%|██████████| 330/330 [00:20, 15.91it/s, step size=2.06e-01, acc. prob=0.947]


[outer 038] TRAIN (EMA+K-ens) ll=0.6618  br=0.2342  acc=0.7000


Sample: 100%|██████████| 330/330 [00:20, 16.49it/s, step size=1.92e-01, acc. prob=0.973]


[outer 039] TRAIN (EMA+K-ens) ll=0.6600  br=0.2334  acc=0.7000
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}]


Sample: 100%|██████████| 330/330 [00:21, 15.22it/s, step size=2.05e-01, acc. prob=0.946]


[outer 000] TRAIN (EMA+K-ens) ll=0.7328  br=0.2682  acc=0.5440


Sample: 100%|██████████| 330/330 [00:20, 16.25it/s, step size=1.78e-01, acc. prob=0.949]


[outer 001] TRAIN (EMA+K-ens) ll=0.6802  br=0.2431  acc=0.6240


Sample: 100%|██████████| 330/330 [00:22, 14.53it/s, step size=2.17e-01, acc. prob=0.926]


[outer 002] TRAIN (EMA+K-ens) ll=0.6680  br=0.2374  acc=0.6540


Sample: 100%|██████████| 330/330 [00:22, 14.55it/s, step size=1.96e-01, acc. prob=0.944]


[outer 003] TRAIN (EMA+K-ens) ll=0.6704  br=0.2385  acc=0.6470


Sample: 100%|██████████| 330/330 [00:20, 16.39it/s, step size=2.09e-01, acc. prob=0.936]


[outer 004] TRAIN (EMA+K-ens) ll=0.6722  br=0.2394  acc=0.6500


Sample: 100%|██████████| 330/330 [00:21, 15.24it/s, step size=1.93e-01, acc. prob=0.946]


[outer 005] TRAIN (EMA+K-ens) ll=0.6729  br=0.2396  acc=0.6660


Sample: 100%|██████████| 330/330 [00:21, 15.07it/s, step size=1.71e-01, acc. prob=0.979]


[outer 006] TRAIN (EMA+K-ens) ll=0.6691  br=0.2378  acc=0.6750


Sample: 100%|██████████| 330/330 [00:21, 15.38it/s, step size=1.94e-01, acc. prob=0.929]


[outer 007] TRAIN (EMA+K-ens) ll=0.6693  br=0.2379  acc=0.6770


Sample: 100%|██████████| 330/330 [00:20, 16.30it/s, step size=1.77e-01, acc. prob=0.950]


[outer 008] TRAIN (EMA+K-ens) ll=0.6647  br=0.2357  acc=0.6910


Sample: 100%|██████████| 330/330 [00:21, 15.04it/s, step size=1.97e-01, acc. prob=0.943]


[outer 009] TRAIN (EMA+K-ens) ll=0.6610  br=0.2338  acc=0.6910


Sample: 100%|██████████| 330/330 [00:19, 16.65it/s, step size=1.59e-01, acc. prob=0.943]


[outer 010] TRAIN (EMA+K-ens) ll=0.6697  br=0.2379  acc=0.6910


Sample: 100%|██████████| 330/330 [00:21, 15.37it/s, step size=1.39e-01, acc. prob=0.961]


[outer 011] TRAIN (EMA+K-ens) ll=0.6704  br=0.2383  acc=0.6870


Sample: 100%|██████████| 330/330 [00:21, 15.63it/s, step size=1.79e-01, acc. prob=0.945]


[outer 012] TRAIN (EMA+K-ens) ll=0.6690  br=0.2375  acc=0.6700


Sample: 100%|██████████| 330/330 [00:21, 15.02it/s, step size=1.56e-01, acc. prob=0.969]


[outer 013] TRAIN (EMA+K-ens) ll=0.6640  br=0.2352  acc=0.6530


Sample: 100%|██████████| 330/330 [00:20, 15.81it/s, step size=2.24e-01, acc. prob=0.948]


[outer 014] TRAIN (EMA+K-ens) ll=0.6730  br=0.2395  acc=0.6500


Sample: 100%|██████████| 330/330 [00:22, 14.88it/s, step size=1.89e-01, acc. prob=0.946]


[outer 015] TRAIN (EMA+K-ens) ll=0.6715  br=0.2389  acc=0.6320
[Early stop @ outer 15] Δll=0.170%, Δbr=0.312%, Δacc=0.002
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}]


Sample: 100%|██████████| 330/330 [00:21, 15.57it/s, step size=1.87e-01, acc. prob=0.932]


[outer 000] TRAIN (EMA+K-ens) ll=0.6848  br=0.2451  acc=0.6020


Sample: 100%|██████████| 330/330 [00:18, 17.74it/s, step size=1.93e-01, acc. prob=0.963]


[outer 001] TRAIN (EMA+K-ens) ll=0.6766  br=0.2413  acc=0.6200


Sample: 100%|██████████| 330/330 [00:21, 15.48it/s, step size=1.71e-01, acc. prob=0.963]


[outer 002] TRAIN (EMA+K-ens) ll=0.6806  br=0.2433  acc=0.6420


Sample: 100%|██████████| 330/330 [00:21, 15.02it/s, step size=1.92e-01, acc. prob=0.940]


[outer 003] TRAIN (EMA+K-ens) ll=0.6767  br=0.2413  acc=0.6660


Sample: 100%|██████████| 330/330 [00:20, 15.98it/s, step size=1.64e-01, acc. prob=0.964]


[outer 004] TRAIN (EMA+K-ens) ll=0.6957  br=0.2505  acc=0.6110


Sample: 100%|██████████| 330/330 [00:22, 14.57it/s, step size=2.01e-01, acc. prob=0.945]


[outer 005] TRAIN (EMA+K-ens) ll=0.6835  br=0.2447  acc=0.6470


Sample: 100%|██████████| 330/330 [00:20, 16.20it/s, step size=2.18e-01, acc. prob=0.925]


[outer 006] TRAIN (EMA+K-ens) ll=0.6745  br=0.2404  acc=0.6560


Sample: 100%|██████████| 330/330 [00:21, 15.67it/s, step size=1.91e-01, acc. prob=0.955]


[outer 007] TRAIN (EMA+K-ens) ll=0.6675  br=0.2368  acc=0.6870


Sample: 100%|██████████| 330/330 [00:20, 16.41it/s, step size=1.77e-01, acc. prob=0.950]


[outer 008] TRAIN (EMA+K-ens) ll=0.6699  br=0.2381  acc=0.6690


Sample: 100%|██████████| 330/330 [00:22, 14.61it/s, step size=1.74e-01, acc. prob=0.961]


[outer 009] TRAIN (EMA+K-ens) ll=0.6686  br=0.2374  acc=0.6880


Sample: 100%|██████████| 330/330 [00:18, 17.45it/s, step size=2.48e-01, acc. prob=0.910]


[outer 010] TRAIN (EMA+K-ens) ll=0.6542  br=0.2304  acc=0.7120


Sample: 100%|██████████| 330/330 [00:19, 16.67it/s, step size=2.11e-01, acc. prob=0.961]


[outer 011] TRAIN (EMA+K-ens) ll=0.6507  br=0.2288  acc=0.7110


Sample: 100%|██████████| 330/330 [00:22, 14.88it/s, step size=2.03e-01, acc. prob=0.934]


[outer 012] TRAIN (EMA+K-ens) ll=0.6382  br=0.2227  acc=0.7120


Sample: 100%|██████████| 330/330 [00:17, 18.86it/s, step size=2.06e-01, acc. prob=0.918]


[outer 013] TRAIN (EMA+K-ens) ll=0.6429  br=0.2249  acc=0.7110


Sample: 100%|██████████| 330/330 [00:21, 15.54it/s, step size=2.00e-01, acc. prob=0.941]


[outer 014] TRAIN (EMA+K-ens) ll=0.6495  br=0.2281  acc=0.7110


Sample: 100%|██████████| 330/330 [00:22, 14.67it/s, step size=1.87e-01, acc. prob=0.962]


[outer 015] TRAIN (EMA+K-ens) ll=0.6529  br=0.2298  acc=0.7110


Sample: 100%|██████████| 330/330 [00:21, 15.07it/s, step size=2.02e-01, acc. prob=0.944]


[outer 016] TRAIN (EMA+K-ens) ll=0.6484  br=0.2275  acc=0.7110


Sample: 100%|██████████| 330/330 [00:19, 16.91it/s, step size=2.10e-01, acc. prob=0.937]


[outer 017] TRAIN (EMA+K-ens) ll=0.6387  br=0.2229  acc=0.7120


Sample: 100%|██████████| 330/330 [00:21, 15.49it/s, step size=1.69e-01, acc. prob=0.947]


[outer 018] TRAIN (EMA+K-ens) ll=0.6574  br=0.2318  acc=0.7120


Sample: 100%|██████████| 330/330 [00:21, 15.46it/s, step size=1.95e-01, acc. prob=0.951]


[outer 019] TRAIN (EMA+K-ens) ll=0.6613  br=0.2337  acc=0.7120


Sample: 100%|██████████| 330/330 [00:20, 16.02it/s, step size=1.72e-01, acc. prob=0.943]


[outer 020] TRAIN (EMA+K-ens) ll=0.6638  br=0.2348  acc=0.7120


Sample: 100%|██████████| 330/330 [00:21, 15.43it/s, step size=1.66e-01, acc. prob=0.962]


[outer 021] TRAIN (EMA+K-ens) ll=0.6640  br=0.2351  acc=0.7060


Sample: 100%|██████████| 330/330 [00:19, 16.56it/s, step size=2.58e-01, acc. prob=0.939]


[outer 022] TRAIN (EMA+K-ens) ll=0.6579  br=0.2320  acc=0.7120


Sample: 100%|██████████| 330/330 [00:20, 16.00it/s, step size=1.91e-01, acc. prob=0.942]


[outer 023] TRAIN (EMA+K-ens) ll=0.6609  br=0.2336  acc=0.7040


Sample: 100%|██████████| 330/330 [00:19, 16.50it/s, step size=2.40e-01, acc. prob=0.935]


[outer 024] TRAIN (EMA+K-ens) ll=0.6614  br=0.2338  acc=0.7030


Sample: 100%|██████████| 330/330 [00:21, 15.08it/s, step size=2.18e-01, acc. prob=0.936]


[outer 025] TRAIN (EMA+K-ens) ll=0.6707  br=0.2384  acc=0.7030


Sample: 100%|██████████| 330/330 [00:20, 16.13it/s, step size=2.02e-01, acc. prob=0.948]


[outer 026] TRAIN (EMA+K-ens) ll=0.6655  br=0.2358  acc=0.7120


Sample: 100%|██████████| 330/330 [00:22, 14.97it/s, step size=1.71e-01, acc. prob=0.953]


[outer 027] TRAIN (EMA+K-ens) ll=0.6653  br=0.2357  acc=0.7120


Sample: 100%|██████████| 330/330 [00:21, 15.10it/s, step size=1.60e-01, acc. prob=0.962]


[outer 028] TRAIN (EMA+K-ens) ll=0.6614  br=0.2339  acc=0.7070


Sample: 100%|██████████| 330/330 [00:22, 14.52it/s, step size=1.71e-01, acc. prob=0.963]


[outer 029] TRAIN (EMA+K-ens) ll=0.6630  br=0.2347  acc=0.6940


Sample: 100%|██████████| 330/330 [00:22, 14.68it/s, step size=1.95e-01, acc. prob=0.941]


[outer 030] TRAIN (EMA+K-ens) ll=0.6705  br=0.2384  acc=0.6960


Sample: 100%|██████████| 330/330 [00:20, 16.00it/s, step size=2.20e-01, acc. prob=0.936]


[outer 031] TRAIN (EMA+K-ens) ll=0.6573  br=0.2320  acc=0.7030


Sample: 100%|██████████| 330/330 [00:18, 17.74it/s, step size=1.80e-01, acc. prob=0.958]


[outer 032] TRAIN (EMA+K-ens) ll=0.6559  br=0.2314  acc=0.7050


Sample: 100%|██████████| 330/330 [00:20, 16.23it/s, step size=1.80e-01, acc. prob=0.959]


[outer 033] TRAIN (EMA+K-ens) ll=0.6466  br=0.2269  acc=0.6980


Sample: 100%|██████████| 330/330 [00:22, 14.84it/s, step size=2.00e-01, acc. prob=0.944]


[outer 034] TRAIN (EMA+K-ens) ll=0.6422  br=0.2249  acc=0.7040


Sample: 100%|██████████| 330/330 [00:20, 15.91it/s, step size=1.88e-01, acc. prob=0.926]


[outer 035] TRAIN (EMA+K-ens) ll=0.6388  br=0.2233  acc=0.6880


Sample: 100%|██████████| 330/330 [00:20, 16.01it/s, step size=1.96e-01, acc. prob=0.944]


[outer 036] TRAIN (EMA+K-ens) ll=0.6344  br=0.2212  acc=0.6880


Sample: 100%|██████████| 330/330 [00:18, 17.56it/s, step size=2.04e-01, acc. prob=0.969]


[outer 037] TRAIN (EMA+K-ens) ll=0.6290  br=0.2188  acc=0.6990


Sample: 100%|██████████| 330/330 [00:20, 16.34it/s, step size=1.96e-01, acc. prob=0.940]


[outer 038] TRAIN (EMA+K-ens) ll=0.6347  br=0.2213  acc=0.7120


Sample: 100%|██████████| 330/330 [00:20, 16.39it/s, step size=1.91e-01, acc. prob=0.960]


[outer 039] TRAIN (EMA+K-ens) ll=0.6389  br=0.2233  acc=0.7120
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}]


Sample: 100%|██████████| 330/330 [00:19, 16.53it/s, step size=2.11e-01, acc. prob=0.943]


[outer 000] TRAIN (EMA+K-ens) ll=0.6715  br=0.2386  acc=0.6510


Sample: 100%|██████████| 330/330 [00:20, 16.31it/s, step size=1.51e-01, acc. prob=0.966]


[outer 001] TRAIN (EMA+K-ens) ll=0.6735  br=0.2396  acc=0.6630


Sample: 100%|██████████| 330/330 [00:19, 16.60it/s, step size=1.82e-01, acc. prob=0.949]


[outer 002] TRAIN (EMA+K-ens) ll=0.6715  br=0.2386  acc=0.6790


Sample: 100%|██████████| 330/330 [00:19, 17.29it/s, step size=1.91e-01, acc. prob=0.928]


[outer 003] TRAIN (EMA+K-ens) ll=0.6628  br=0.2346  acc=0.6830


Sample: 100%|██████████| 330/330 [00:21, 15.57it/s, step size=1.81e-01, acc. prob=0.958]


[outer 004] TRAIN (EMA+K-ens) ll=0.6599  br=0.2332  acc=0.6740


Sample: 100%|██████████| 330/330 [00:19, 16.69it/s, step size=1.91e-01, acc. prob=0.939]


[outer 005] TRAIN (EMA+K-ens) ll=0.6557  br=0.2312  acc=0.6940


Sample: 100%|██████████| 330/330 [00:22, 14.59it/s, step size=2.16e-01, acc. prob=0.945]


[outer 006] TRAIN (EMA+K-ens) ll=0.6517  br=0.2293  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.48it/s, step size=1.71e-01, acc. prob=0.953]


[outer 007] TRAIN (EMA+K-ens) ll=0.6543  br=0.2305  acc=0.6940


Sample: 100%|██████████| 330/330 [00:20, 16.29it/s, step size=1.97e-01, acc. prob=0.944]


[outer 008] TRAIN (EMA+K-ens) ll=0.6566  br=0.2317  acc=0.6990


Sample: 100%|██████████| 330/330 [00:20, 16.19it/s, step size=1.95e-01, acc. prob=0.956]


[outer 009] TRAIN (EMA+K-ens) ll=0.6475  br=0.2274  acc=0.7020


Sample: 100%|██████████| 330/330 [00:19, 16.98it/s, step size=1.68e-01, acc. prob=0.956]


[outer 010] TRAIN (EMA+K-ens) ll=0.6395  br=0.2235  acc=0.7010


Sample: 100%|██████████| 330/330 [00:20, 16.11it/s, step size=2.08e-01, acc. prob=0.957]


[outer 011] TRAIN (EMA+K-ens) ll=0.6428  br=0.2250  acc=0.7040


Sample: 100%|██████████| 330/330 [00:21, 15.03it/s, step size=2.05e-01, acc. prob=0.921]


[outer 012] TRAIN (EMA+K-ens) ll=0.6333  br=0.2204  acc=0.7050


Sample: 100%|██████████| 330/330 [00:22, 14.38it/s, step size=1.95e-01, acc. prob=0.950]


[outer 013] TRAIN (EMA+K-ens) ll=0.6357  br=0.2216  acc=0.7050


Sample: 100%|██████████| 330/330 [00:20, 16.13it/s, step size=1.85e-01, acc. prob=0.959]


[outer 014] TRAIN (EMA+K-ens) ll=0.6378  br=0.2226  acc=0.7050


Sample: 100%|██████████| 330/330 [00:20, 16.41it/s, step size=2.03e-01, acc. prob=0.957]


[outer 015] TRAIN (EMA+K-ens) ll=0.6336  br=0.2205  acc=0.7050


Sample: 100%|██████████| 330/330 [00:20, 16.19it/s, step size=1.72e-01, acc. prob=0.972]


[outer 016] TRAIN (EMA+K-ens) ll=0.6291  br=0.2185  acc=0.7050


Sample: 100%|██████████| 330/330 [00:21, 15.11it/s, step size=1.89e-01, acc. prob=0.934]


[outer 017] TRAIN (EMA+K-ens) ll=0.6328  br=0.2202  acc=0.7050


Sample: 100%|██████████| 330/330 [00:21, 15.41it/s, step size=1.62e-01, acc. prob=0.947]


[outer 018] TRAIN (EMA+K-ens) ll=0.6395  br=0.2233  acc=0.7040


Sample: 100%|██████████| 330/330 [00:21, 15.47it/s, step size=1.88e-01, acc. prob=0.943]


[outer 019] TRAIN (EMA+K-ens) ll=0.6351  br=0.2213  acc=0.7050


Sample: 100%|██████████| 330/330 [00:19, 17.33it/s, step size=2.26e-01, acc. prob=0.955]


[outer 020] TRAIN (EMA+K-ens) ll=0.6423  br=0.2247  acc=0.7050


Sample: 100%|██████████| 330/330 [00:21, 15.34it/s, step size=1.64e-01, acc. prob=0.970]


[outer 021] TRAIN (EMA+K-ens) ll=0.6406  br=0.2239  acc=0.7050


Sample: 100%|██████████| 330/330 [00:21, 15.19it/s, step size=1.70e-01, acc. prob=0.963]


[outer 022] TRAIN (EMA+K-ens) ll=0.6481  br=0.2275  acc=0.7050
[Early stop @ outer 22] Δll=0.065%, Δbr=0.099%, Δacc=0.001
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}, {'accuracy': 0.65065997838974, 'brier': 0.2482144832611084, 'logloss': 0.6926732063293457}]


Sample: 100%|██████████| 330/330 [00:22, 14.74it/s, step size=1.62e-01, acc. prob=0.957]


[outer 000] TRAIN (EMA+K-ens) ll=0.6991  br=0.2528  acc=0.5170


Sample: 100%|██████████| 330/330 [00:21, 15.39it/s, step size=1.64e-01, acc. prob=0.961]


[outer 001] TRAIN (EMA+K-ens) ll=0.7046  br=0.2547  acc=0.5780


Sample: 100%|██████████| 330/330 [00:23, 14.07it/s, step size=1.97e-01, acc. prob=0.944]


[outer 002] TRAIN (EMA+K-ens) ll=0.6925  br=0.2491  acc=0.6010


Sample: 100%|██████████| 330/330 [00:22, 14.45it/s, step size=2.26e-01, acc. prob=0.928]


[outer 003] TRAIN (EMA+K-ens) ll=0.6885  br=0.2472  acc=0.6330


Sample: 100%|██████████| 330/330 [00:20, 16.03it/s, step size=2.17e-01, acc. prob=0.937]


[outer 004] TRAIN (EMA+K-ens) ll=0.6742  br=0.2403  acc=0.6410


Sample: 100%|██████████| 330/330 [00:20, 16.32it/s, step size=1.92e-01, acc. prob=0.966]


[outer 005] TRAIN (EMA+K-ens) ll=0.6726  br=0.2395  acc=0.6740


Sample: 100%|██████████| 330/330 [00:19, 16.52it/s, step size=1.75e-01, acc. prob=0.969]


[outer 006] TRAIN (EMA+K-ens) ll=0.6673  br=0.2370  acc=0.6840


Sample: 100%|██████████| 330/330 [00:21, 15.09it/s, step size=1.56e-01, acc. prob=0.957]


[outer 007] TRAIN (EMA+K-ens) ll=0.6689  br=0.2377  acc=0.6840


Sample: 100%|██████████| 330/330 [00:22, 14.38it/s, step size=2.32e-01, acc. prob=0.921]


[outer 008] TRAIN (EMA+K-ens) ll=0.6637  br=0.2352  acc=0.6810


Sample: 100%|██████████| 330/330 [00:21, 15.21it/s, step size=2.10e-01, acc. prob=0.951]


[outer 009] TRAIN (EMA+K-ens) ll=0.6611  br=0.2340  acc=0.6820


Sample: 100%|██████████| 330/330 [00:19, 17.29it/s, step size=1.84e-01, acc. prob=0.924]


[outer 010] TRAIN (EMA+K-ens) ll=0.6644  br=0.2356  acc=0.6860


Sample: 100%|██████████| 330/330 [00:21, 15.59it/s, step size=1.82e-01, acc. prob=0.965]


[outer 011] TRAIN (EMA+K-ens) ll=0.6569  br=0.2319  acc=0.6880


Sample: 100%|██████████| 330/330 [00:22, 14.97it/s, step size=1.66e-01, acc. prob=0.964]


[outer 012] TRAIN (EMA+K-ens) ll=0.6655  br=0.2360  acc=0.6880


Sample: 100%|██████████| 330/330 [00:18, 18.11it/s, step size=2.30e-01, acc. prob=0.951]


[outer 013] TRAIN (EMA+K-ens) ll=0.6626  br=0.2346  acc=0.6880


Sample: 100%|██████████| 330/330 [00:20, 16.00it/s, step size=1.74e-01, acc. prob=0.970]


[outer 014] TRAIN (EMA+K-ens) ll=0.6645  br=0.2356  acc=0.6880


Sample: 100%|██████████| 330/330 [00:21, 15.11it/s, step size=1.65e-01, acc. prob=0.962]


[outer 015] TRAIN (EMA+K-ens) ll=0.6633  br=0.2350  acc=0.6880


Sample: 100%|██████████| 330/330 [00:18, 17.49it/s, step size=1.97e-01, acc. prob=0.959]


[outer 016] TRAIN (EMA+K-ens) ll=0.6721  br=0.2391  acc=0.6880


Sample: 100%|██████████| 330/330 [00:20, 16.19it/s, step size=2.02e-01, acc. prob=0.934]


[outer 017] TRAIN (EMA+K-ens) ll=0.6755  br=0.2408  acc=0.6880


Sample: 100%|██████████| 330/330 [00:19, 16.87it/s, step size=1.84e-01, acc. prob=0.956]


[outer 018] TRAIN (EMA+K-ens) ll=0.6750  br=0.2406  acc=0.6880


Sample: 100%|██████████| 330/330 [00:19, 16.92it/s, step size=2.18e-01, acc. prob=0.911]


[outer 019] TRAIN (EMA+K-ens) ll=0.6845  br=0.2451  acc=0.6880


Sample: 100%|██████████| 330/330 [00:21, 15.64it/s, step size=1.83e-01, acc. prob=0.967]


[outer 020] TRAIN (EMA+K-ens) ll=0.6860  br=0.2458  acc=0.6780


Sample: 100%|██████████| 330/330 [00:20, 16.15it/s, step size=1.86e-01, acc. prob=0.943]


[outer 021] TRAIN (EMA+K-ens) ll=0.6858  br=0.2457  acc=0.6780


Sample: 100%|██████████| 330/330 [00:20, 16.33it/s, step size=1.85e-01, acc. prob=0.950]


[outer 022] TRAIN (EMA+K-ens) ll=0.6871  br=0.2463  acc=0.6880


Sample: 100%|██████████| 330/330 [00:21, 15.34it/s, step size=1.97e-01, acc. prob=0.955]


[outer 023] TRAIN (EMA+K-ens) ll=0.6913  br=0.2482  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.80it/s, step size=1.56e-01, acc. prob=0.960]


[outer 024] TRAIN (EMA+K-ens) ll=0.6984  br=0.2516  acc=0.6390


Sample: 100%|██████████| 330/330 [00:22, 14.68it/s, step size=1.58e-01, acc. prob=0.941]


[outer 025] TRAIN (EMA+K-ens) ll=0.6893  br=0.2474  acc=0.6700


Sample: 100%|██████████| 330/330 [00:21, 15.18it/s, step size=1.69e-01, acc. prob=0.960]


[outer 026] TRAIN (EMA+K-ens) ll=0.6792  br=0.2425  acc=0.6800


Sample: 100%|██████████| 330/330 [00:20, 15.83it/s, step size=1.57e-01, acc. prob=0.955]


[outer 027] TRAIN (EMA+K-ens) ll=0.6702  br=0.2384  acc=0.6870


Sample: 100%|██████████| 330/330 [00:20, 16.26it/s, step size=2.20e-01, acc. prob=0.937]


[outer 028] TRAIN (EMA+K-ens) ll=0.6673  br=0.2370  acc=0.6690


Sample: 100%|██████████| 330/330 [00:20, 15.86it/s, step size=1.55e-01, acc. prob=0.946]


[outer 029] TRAIN (EMA+K-ens) ll=0.6651  br=0.2360  acc=0.6640


Sample: 100%|██████████| 330/330 [00:19, 16.61it/s, step size=2.13e-01, acc. prob=0.910]


[outer 030] TRAIN (EMA+K-ens) ll=0.6564  br=0.2318  acc=0.6680


Sample: 100%|██████████| 330/330 [00:21, 15.14it/s, step size=1.97e-01, acc. prob=0.929]


[outer 031] TRAIN (EMA+K-ens) ll=0.6550  br=0.2311  acc=0.6770


Sample: 100%|██████████| 330/330 [00:21, 15.40it/s, step size=1.53e-01, acc. prob=0.958]


[outer 032] TRAIN (EMA+K-ens) ll=0.6376  br=0.2226  acc=0.6860


Sample: 100%|██████████| 330/330 [00:18, 17.57it/s, step size=2.07e-01, acc. prob=0.949]


[outer 033] TRAIN (EMA+K-ens) ll=0.6367  br=0.2222  acc=0.6920


Sample: 100%|██████████| 330/330 [00:18, 17.43it/s, step size=2.03e-01, acc. prob=0.946]


[outer 034] TRAIN (EMA+K-ens) ll=0.6494  br=0.2284  acc=0.6800


Sample: 100%|██████████| 330/330 [00:21, 15.13it/s, step size=1.91e-01, acc. prob=0.931]


[outer 035] TRAIN (EMA+K-ens) ll=0.6478  br=0.2275  acc=0.6900


Sample: 100%|██████████| 330/330 [00:21, 15.69it/s, step size=1.71e-01, acc. prob=0.943]


[outer 036] TRAIN (EMA+K-ens) ll=0.6442  br=0.2258  acc=0.6910


Sample: 100%|██████████| 330/330 [00:20, 16.18it/s, step size=2.01e-01, acc. prob=0.951]


[outer 037] TRAIN (EMA+K-ens) ll=0.6542  br=0.2305  acc=0.6900


Sample: 100%|██████████| 330/330 [00:22, 14.55it/s, step size=1.82e-01, acc. prob=0.944]


[outer 038] TRAIN (EMA+K-ens) ll=0.6567  br=0.2317  acc=0.6890


Sample: 100%|██████████| 330/330 [00:20, 15.79it/s, step size=2.27e-01, acc. prob=0.949]


[outer 039] TRAIN (EMA+K-ens) ll=0.6551  br=0.2309  acc=0.6840
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}, {'accuracy': 0.65065997838974, 'brier': 0.2482144832611084, 'logloss': 0.6926732063293457}, {'accuracy': 0.6920199990272522, 'brier': 0.22573788464069366, 'logloss': 0.6445251107215881}]


Sample: 100%|██████████| 330/330 [00:22, 14.87it/s, step size=1.76e-01, acc. prob=0.961]


[outer 000] TRAIN (EMA+K-ens) ll=0.7110  br=0.2577  acc=0.5520


Sample: 100%|██████████| 330/330 [00:19, 16.76it/s, step size=1.89e-01, acc. prob=0.957]


[outer 001] TRAIN (EMA+K-ens) ll=0.6753  br=0.2411  acc=0.5640


Sample: 100%|██████████| 330/330 [00:20, 16.33it/s, step size=1.50e-01, acc. prob=0.976]


[outer 002] TRAIN (EMA+K-ens) ll=0.6722  br=0.2396  acc=0.6230


Sample: 100%|██████████| 330/330 [00:22, 14.95it/s, step size=1.48e-01, acc. prob=0.975]


[outer 003] TRAIN (EMA+K-ens) ll=0.6693  br=0.2381  acc=0.6640


Sample: 100%|██████████| 330/330 [00:20, 16.47it/s, step size=2.08e-01, acc. prob=0.954]


[outer 004] TRAIN (EMA+K-ens) ll=0.6615  br=0.2343  acc=0.6730


Sample: 100%|██████████| 330/330 [00:18, 17.57it/s, step size=2.35e-01, acc. prob=0.919]


[outer 005] TRAIN (EMA+K-ens) ll=0.6592  br=0.2331  acc=0.6730


Sample: 100%|██████████| 330/330 [00:21, 15.50it/s, step size=1.61e-01, acc. prob=0.969]


[outer 006] TRAIN (EMA+K-ens) ll=0.6609  br=0.2340  acc=0.6810


Sample: 100%|██████████| 330/330 [00:18, 18.21it/s, step size=2.07e-01, acc. prob=0.965]


[outer 007] TRAIN (EMA+K-ens) ll=0.6657  br=0.2363  acc=0.6810


Sample: 100%|██████████| 330/330 [00:20, 15.77it/s, step size=1.88e-01, acc. prob=0.950]


[outer 008] TRAIN (EMA+K-ens) ll=0.6666  br=0.2366  acc=0.6780


Sample: 100%|██████████| 330/330 [00:21, 15.05it/s, step size=1.58e-01, acc. prob=0.961]


[outer 009] TRAIN (EMA+K-ens) ll=0.6663  br=0.2365  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.79it/s, step size=1.52e-01, acc. prob=0.962]


[outer 010] TRAIN (EMA+K-ens) ll=0.6675  br=0.2371  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.65it/s, step size=1.54e-01, acc. prob=0.955]


[outer 011] TRAIN (EMA+K-ens) ll=0.6644  br=0.2356  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.49it/s, step size=1.82e-01, acc. prob=0.929]


[outer 012] TRAIN (EMA+K-ens) ll=0.6635  br=0.2351  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.84it/s, step size=1.84e-01, acc. prob=0.946]


[outer 013] TRAIN (EMA+K-ens) ll=0.6723  br=0.2394  acc=0.6850


Sample: 100%|██████████| 330/330 [00:19, 16.95it/s, step size=2.19e-01, acc. prob=0.949]


[outer 014] TRAIN (EMA+K-ens) ll=0.6652  br=0.2359  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.68it/s, step size=1.80e-01, acc. prob=0.965]


[outer 015] TRAIN (EMA+K-ens) ll=0.6595  br=0.2332  acc=0.6850


Sample: 100%|██████████| 330/330 [00:19, 17.15it/s, step size=2.19e-01, acc. prob=0.940]


[outer 016] TRAIN (EMA+K-ens) ll=0.6535  br=0.2303  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.03it/s, step size=1.76e-01, acc. prob=0.966]


[outer 017] TRAIN (EMA+K-ens) ll=0.6500  br=0.2287  acc=0.6800


Sample: 100%|██████████| 330/330 [00:19, 16.53it/s, step size=2.06e-01, acc. prob=0.970]


[outer 018] TRAIN (EMA+K-ens) ll=0.6428  br=0.2254  acc=0.6700


Sample: 100%|██████████| 330/330 [00:20, 15.74it/s, step size=1.86e-01, acc. prob=0.932]


[outer 019] TRAIN (EMA+K-ens) ll=0.6452  br=0.2265  acc=0.6750


Sample: 100%|██████████| 330/330 [00:22, 14.87it/s, step size=1.90e-01, acc. prob=0.973]


[outer 020] TRAIN (EMA+K-ens) ll=0.6538  br=0.2307  acc=0.6710


Sample: 100%|██████████| 330/330 [00:18, 17.83it/s, step size=2.18e-01, acc. prob=0.959]


[outer 021] TRAIN (EMA+K-ens) ll=0.6471  br=0.2273  acc=0.6830


Sample: 100%|██████████| 330/330 [00:19, 16.72it/s, step size=1.93e-01, acc. prob=0.966]


[outer 022] TRAIN (EMA+K-ens) ll=0.6549  br=0.2311  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.65it/s, step size=1.42e-01, acc. prob=0.933]


[outer 023] TRAIN (EMA+K-ens) ll=0.6569  br=0.2321  acc=0.6670


Sample: 100%|██████████| 330/330 [00:20, 15.90it/s, step size=1.81e-01, acc. prob=0.971]


[outer 024] TRAIN (EMA+K-ens) ll=0.6733  br=0.2399  acc=0.6850


Sample: 100%|██████████| 330/330 [00:19, 16.67it/s, step size=1.92e-01, acc. prob=0.952]


[outer 025] TRAIN (EMA+K-ens) ll=0.6759  br=0.2410  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.79it/s, step size=2.03e-01, acc. prob=0.947]


[outer 026] TRAIN (EMA+K-ens) ll=0.6817  br=0.2437  acc=0.6850


Sample: 100%|██████████| 330/330 [00:19, 16.91it/s, step size=2.04e-01, acc. prob=0.952]


[outer 027] TRAIN (EMA+K-ens) ll=0.6848  br=0.2452  acc=0.6780


Sample: 100%|██████████| 330/330 [00:20, 16.30it/s, step size=1.99e-01, acc. prob=0.938]


[outer 028] TRAIN (EMA+K-ens) ll=0.6810  br=0.2434  acc=0.6690


Sample: 100%|██████████| 330/330 [00:22, 14.37it/s, step size=1.98e-01, acc. prob=0.952]


[outer 029] TRAIN (EMA+K-ens) ll=0.6888  br=0.2471  acc=0.6670


Sample: 100%|██████████| 330/330 [00:21, 15.67it/s, step size=1.69e-01, acc. prob=0.966]


[outer 030] TRAIN (EMA+K-ens) ll=0.6954  br=0.2501  acc=0.6640


Sample: 100%|██████████| 330/330 [00:21, 15.41it/s, step size=2.06e-01, acc. prob=0.937]


[outer 031] TRAIN (EMA+K-ens) ll=0.6977  br=0.2511  acc=0.6720


Sample: 100%|██████████| 330/330 [00:22, 14.56it/s, step size=1.83e-01, acc. prob=0.956]


[outer 032] TRAIN (EMA+K-ens) ll=0.6848  br=0.2451  acc=0.6810


Sample: 100%|██████████| 330/330 [00:23, 13.81it/s, step size=1.58e-01, acc. prob=0.970]


[outer 033] TRAIN (EMA+K-ens) ll=0.6833  br=0.2444  acc=0.6600


Sample: 100%|██████████| 330/330 [00:22, 14.79it/s, step size=1.91e-01, acc. prob=0.956]


[outer 034] TRAIN (EMA+K-ens) ll=0.6810  br=0.2433  acc=0.6830


Sample: 100%|██████████| 330/330 [00:21, 15.51it/s, step size=2.08e-01, acc. prob=0.950]


[outer 035] TRAIN (EMA+K-ens) ll=0.6774  br=0.2417  acc=0.6850


Sample: 100%|██████████| 330/330 [00:20, 16.11it/s, step size=2.01e-01, acc. prob=0.957]


[outer 036] TRAIN (EMA+K-ens) ll=0.6812  br=0.2433  acc=0.6850


Sample: 100%|██████████| 330/330 [00:20, 16.34it/s, step size=2.31e-01, acc. prob=0.950]


[outer 037] TRAIN (EMA+K-ens) ll=0.6814  br=0.2432  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.78it/s, step size=1.85e-01, acc. prob=0.968]


[outer 038] TRAIN (EMA+K-ens) ll=0.6727  br=0.2392  acc=0.6850


Sample: 100%|██████████| 330/330 [00:20, 15.73it/s, step size=2.49e-01, acc. prob=0.915]


[outer 039] TRAIN (EMA+K-ens) ll=0.6764  br=0.2411  acc=0.6720
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}, {'accuracy': 0.65065997838974, 'brier': 0.2482144832611084, 'logloss': 0.6926732063293457}, {'accuracy': 0.6920199990272522, 'brier': 0.22573788464069366, 'logloss': 0.6445251107215881}, {'accuracy': 0.6538400053977966, 'brier': 0.2408851832151413, 'logloss': 0.6783546805381775}]


Sample: 100%|██████████| 330/330 [00:22, 14.53it/s, step size=2.09e-01, acc. prob=0.950]


[outer 000] TRAIN (EMA+K-ens) ll=0.7752  br=0.2873  acc=0.4900


Sample: 100%|██████████| 330/330 [00:23, 13.76it/s, step size=1.70e-01, acc. prob=0.950]


[outer 001] TRAIN (EMA+K-ens) ll=0.7297  br=0.2666  acc=0.5650


Sample: 100%|██████████| 330/330 [00:20, 15.76it/s, step size=1.85e-01, acc. prob=0.963]


[outer 002] TRAIN (EMA+K-ens) ll=0.7134  br=0.2592  acc=0.6220


Sample: 100%|██████████| 330/330 [00:20, 15.85it/s, step size=2.20e-01, acc. prob=0.962]


[outer 003] TRAIN (EMA+K-ens) ll=0.7122  br=0.2586  acc=0.6270


Sample: 100%|██████████| 330/330 [00:20, 15.72it/s, step size=2.24e-01, acc. prob=0.953]


[outer 004] TRAIN (EMA+K-ens) ll=0.6935  br=0.2494  acc=0.6890


Sample: 100%|██████████| 330/330 [00:21, 15.27it/s, step size=2.04e-01, acc. prob=0.958]


[outer 005] TRAIN (EMA+K-ens) ll=0.6901  br=0.2479  acc=0.6850


Sample: 100%|██████████| 330/330 [00:19, 17.22it/s, step size=2.52e-01, acc. prob=0.901]


[outer 006] TRAIN (EMA+K-ens) ll=0.6867  br=0.2463  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.28it/s, step size=1.74e-01, acc. prob=0.949]


[outer 007] TRAIN (EMA+K-ens) ll=0.6845  br=0.2452  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.30it/s, step size=1.85e-01, acc. prob=0.921]


[outer 008] TRAIN (EMA+K-ens) ll=0.6776  br=0.2418  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.48it/s, step size=1.87e-01, acc. prob=0.958]


[outer 009] TRAIN (EMA+K-ens) ll=0.6790  br=0.2425  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.46it/s, step size=1.77e-01, acc. prob=0.958]


[outer 010] TRAIN (EMA+K-ens) ll=0.6790  br=0.2424  acc=0.6850


Sample: 100%|██████████| 330/330 [00:20, 16.04it/s, step size=2.17e-01, acc. prob=0.944]


[outer 011] TRAIN (EMA+K-ens) ll=0.6723  br=0.2392  acc=0.6840


Sample: 100%|██████████| 330/330 [00:22, 14.96it/s, step size=1.84e-01, acc. prob=0.961]


[outer 012] TRAIN (EMA+K-ens) ll=0.6823  br=0.2440  acc=0.6830


Sample: 100%|██████████| 330/330 [00:22, 14.91it/s, step size=1.97e-01, acc. prob=0.950]


[outer 013] TRAIN (EMA+K-ens) ll=0.6886  br=0.2472  acc=0.6630


Sample: 100%|██████████| 330/330 [00:20, 16.39it/s, step size=2.39e-01, acc. prob=0.929]


[outer 014] TRAIN (EMA+K-ens) ll=0.6906  br=0.2481  acc=0.6610


Sample: 100%|██████████| 330/330 [00:19, 17.18it/s, step size=2.18e-01, acc. prob=0.937]


[outer 015] TRAIN (EMA+K-ens) ll=0.6917  br=0.2486  acc=0.6610


Sample: 100%|██████████| 330/330 [00:19, 16.77it/s, step size=1.80e-01, acc. prob=0.947]


[outer 016] TRAIN (EMA+K-ens) ll=0.6858  br=0.2458  acc=0.6840


Sample: 100%|██████████| 330/330 [00:21, 15.33it/s, step size=1.84e-01, acc. prob=0.966]


[outer 017] TRAIN (EMA+K-ens) ll=0.6947  br=0.2502  acc=0.6840


Sample: 100%|██████████| 330/330 [00:21, 15.13it/s, step size=1.37e-01, acc. prob=0.972]


[outer 018] TRAIN (EMA+K-ens) ll=0.6886  br=0.2473  acc=0.6850


Sample: 100%|██████████| 330/330 [00:19, 16.60it/s, step size=2.28e-01, acc. prob=0.900]


[outer 019] TRAIN (EMA+K-ens) ll=0.6959  br=0.2508  acc=0.6390


Sample: 100%|██████████| 330/330 [00:21, 15.14it/s, step size=2.13e-01, acc. prob=0.920]


[outer 020] TRAIN (EMA+K-ens) ll=0.6916  br=0.2488  acc=0.6710


Sample: 100%|██████████| 330/330 [00:20, 15.85it/s, step size=2.05e-01, acc. prob=0.953]


[outer 021] TRAIN (EMA+K-ens) ll=0.6918  br=0.2488  acc=0.6650


Sample: 100%|██████████| 330/330 [00:19, 17.01it/s, step size=2.33e-01, acc. prob=0.944]


[outer 022] TRAIN (EMA+K-ens) ll=0.7005  br=0.2528  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.33it/s, step size=1.81e-01, acc. prob=0.953]


[outer 023] TRAIN (EMA+K-ens) ll=0.7040  br=0.2544  acc=0.6810


Sample: 100%|██████████| 330/330 [00:22, 14.46it/s, step size=2.04e-01, acc. prob=0.919]


[outer 024] TRAIN (EMA+K-ens) ll=0.7031  br=0.2540  acc=0.6520


Sample: 100%|██████████| 330/330 [00:20, 16.32it/s, step size=2.29e-01, acc. prob=0.925]


[outer 025] TRAIN (EMA+K-ens) ll=0.6796  br=0.2427  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.60it/s, step size=1.79e-01, acc. prob=0.941]


[outer 026] TRAIN (EMA+K-ens) ll=0.6826  br=0.2441  acc=0.6830


Sample: 100%|██████████| 330/330 [00:21, 15.23it/s, step size=2.18e-01, acc. prob=0.953]


[outer 027] TRAIN (EMA+K-ens) ll=0.6764  br=0.2412  acc=0.6800


Sample: 100%|██████████| 330/330 [00:21, 15.20it/s, step size=1.68e-01, acc. prob=0.952]


[outer 028] TRAIN (EMA+K-ens) ll=0.6708  br=0.2383  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.54it/s, step size=2.37e-01, acc. prob=0.908]


[outer 029] TRAIN (EMA+K-ens) ll=0.6601  br=0.2333  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.56it/s, step size=1.96e-01, acc. prob=0.931]


[outer 030] TRAIN (EMA+K-ens) ll=0.6487  br=0.2279  acc=0.6850


Sample: 100%|██████████| 330/330 [00:21, 15.37it/s, step size=1.83e-01, acc. prob=0.955]


[outer 031] TRAIN (EMA+K-ens) ll=0.6456  br=0.2264  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.55it/s, step size=1.95e-01, acc. prob=0.965]


[outer 032] TRAIN (EMA+K-ens) ll=0.6465  br=0.2268  acc=0.6860


Sample: 100%|██████████| 330/330 [00:21, 15.47it/s, step size=2.14e-01, acc. prob=0.926]


[outer 033] TRAIN (EMA+K-ens) ll=0.6460  br=0.2266  acc=0.6850


Sample: 100%|██████████| 330/330 [00:20, 15.76it/s, step size=2.17e-01, acc. prob=0.964]


[outer 034] TRAIN (EMA+K-ens) ll=0.6480  br=0.2275  acc=0.6870


Sample: 100%|██████████| 330/330 [00:22, 14.47it/s, step size=1.69e-01, acc. prob=0.958]


[outer 035] TRAIN (EMA+K-ens) ll=0.6504  br=0.2288  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.45it/s, step size=1.95e-01, acc. prob=0.958]


[outer 036] TRAIN (EMA+K-ens) ll=0.6499  br=0.2285  acc=0.6850


Sample: 100%|██████████| 330/330 [00:22, 14.88it/s, step size=1.63e-01, acc. prob=0.957]


[outer 037] TRAIN (EMA+K-ens) ll=0.6537  br=0.2303  acc=0.6850


Sample: 100%|██████████| 330/330 [00:19, 16.51it/s, step size=2.09e-01, acc. prob=0.913]


[outer 038] TRAIN (EMA+K-ens) ll=0.6529  br=0.2300  acc=0.6850


Sample: 100%|██████████| 330/330 [00:23, 14.29it/s, step size=1.94e-01, acc. prob=0.969]


[outer 039] TRAIN (EMA+K-ens) ll=0.6518  br=0.2295  acc=0.6850
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}, {'accuracy': 0.65065997838974, 'brier': 0.2482144832611084, 'logloss': 0.6926732063293457}, {'accuracy': 0.6920199990272522, 'brier': 0.22573788464069366, 'logloss': 0.6445251107215881}, {'accuracy': 0.6538400053977966, 'brier': 0.2408851832151413, 'logloss': 0.6783546805381775}, {'accuracy': 0.6790399551391602, 'brier': 0.2522827386856079, 'logloss': 0.6986757516860962}]


Sample: 100%|██████████| 330/330 [00:23, 14.28it/s, step size=2.36e-01, acc. prob=0.939]


[outer 000] TRAIN (EMA+K-ens) ll=0.7385  br=0.2702  acc=0.4940


Sample: 100%|██████████| 330/330 [00:21, 15.37it/s, step size=2.09e-01, acc. prob=0.967]


[outer 001] TRAIN (EMA+K-ens) ll=0.6966  br=0.2509  acc=0.5910


Sample: 100%|██████████| 330/330 [00:19, 17.32it/s, step size=1.84e-01, acc. prob=0.965]


[outer 002] TRAIN (EMA+K-ens) ll=0.6931  br=0.2497  acc=0.5930


Sample: 100%|██████████| 330/330 [00:20, 15.88it/s, step size=2.28e-01, acc. prob=0.952]


[outer 003] TRAIN (EMA+K-ens) ll=0.6958  br=0.2509  acc=0.6080


Sample: 100%|██████████| 330/330 [00:21, 15.26it/s, step size=1.84e-01, acc. prob=0.947]


[outer 004] TRAIN (EMA+K-ens) ll=0.6929  br=0.2494  acc=0.6020


Sample: 100%|██████████| 330/330 [00:23, 13.82it/s, step size=2.05e-01, acc. prob=0.935]


[outer 005] TRAIN (EMA+K-ens) ll=0.6978  br=0.2518  acc=0.5850


Sample: 100%|██████████| 330/330 [00:21, 15.19it/s, step size=1.84e-01, acc. prob=0.976]


[outer 006] TRAIN (EMA+K-ens) ll=0.6884  br=0.2472  acc=0.6190


Sample: 100%|██████████| 330/330 [00:19, 16.64it/s, step size=2.16e-01, acc. prob=0.960]


[outer 007] TRAIN (EMA+K-ens) ll=0.6841  br=0.2451  acc=0.6970


Sample: 100%|██████████| 330/330 [00:19, 17.10it/s, step size=2.26e-01, acc. prob=0.949]


[outer 008] TRAIN (EMA+K-ens) ll=0.6790  br=0.2427  acc=0.6600


Sample: 100%|██████████| 330/330 [00:19, 17.07it/s, step size=2.10e-01, acc. prob=0.909]


[outer 009] TRAIN (EMA+K-ens) ll=0.6818  br=0.2441  acc=0.6970


Sample: 100%|██████████| 330/330 [00:22, 14.62it/s, step size=1.75e-01, acc. prob=0.966]


[outer 010] TRAIN (EMA+K-ens) ll=0.6752  br=0.2408  acc=0.6980


Sample: 100%|██████████| 330/330 [00:21, 15.03it/s, step size=2.01e-01, acc. prob=0.920]


[outer 011] TRAIN (EMA+K-ens) ll=0.6703  br=0.2384  acc=0.6980


Sample: 100%|██████████| 330/330 [00:22, 14.79it/s, step size=1.44e-01, acc. prob=0.954]


[outer 012] TRAIN (EMA+K-ens) ll=0.6683  br=0.2375  acc=0.6980


Sample: 100%|██████████| 330/330 [00:18, 18.28it/s, step size=1.79e-01, acc. prob=0.957]


[outer 013] TRAIN (EMA+K-ens) ll=0.6591  br=0.2330  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 15.75it/s, step size=1.96e-01, acc. prob=0.940]


[outer 014] TRAIN (EMA+K-ens) ll=0.6534  br=0.2303  acc=0.6980


Sample: 100%|██████████| 330/330 [00:21, 15.39it/s, step size=1.49e-01, acc. prob=0.965]


[outer 015] TRAIN (EMA+K-ens) ll=0.6519  br=0.2295  acc=0.6950


Sample: 100%|██████████| 330/330 [00:21, 15.33it/s, step size=1.81e-01, acc. prob=0.961]


[outer 016] TRAIN (EMA+K-ens) ll=0.6501  br=0.2286  acc=0.6930


Sample: 100%|██████████| 330/330 [00:22, 14.82it/s, step size=1.49e-01, acc. prob=0.960]


[outer 017] TRAIN (EMA+K-ens) ll=0.6577  br=0.2323  acc=0.6590


Sample: 100%|██████████| 330/330 [00:20, 16.22it/s, step size=2.48e-01, acc. prob=0.918]


[outer 018] TRAIN (EMA+K-ens) ll=0.6578  br=0.2323  acc=0.6780


Sample: 100%|██████████| 330/330 [00:21, 15.55it/s, step size=1.97e-01, acc. prob=0.947]


[outer 019] TRAIN (EMA+K-ens) ll=0.6579  br=0.2323  acc=0.6790


Sample: 100%|██████████| 330/330 [00:22, 14.59it/s, step size=1.96e-01, acc. prob=0.951]


[outer 020] TRAIN (EMA+K-ens) ll=0.6645  br=0.2356  acc=0.6940


Sample: 100%|██████████| 330/330 [00:22, 14.71it/s, step size=1.91e-01, acc. prob=0.962]


[outer 021] TRAIN (EMA+K-ens) ll=0.6671  br=0.2367  acc=0.6980


Sample: 100%|██████████| 330/330 [00:20, 15.82it/s, step size=2.08e-01, acc. prob=0.955]


[outer 022] TRAIN (EMA+K-ens) ll=0.6746  br=0.2404  acc=0.6980


Sample: 100%|██████████| 330/330 [00:22, 14.75it/s, step size=1.85e-01, acc. prob=0.960]


[outer 023] TRAIN (EMA+K-ens) ll=0.6797  br=0.2428  acc=0.6980


Sample: 100%|██████████| 330/330 [00:23, 14.00it/s, step size=2.01e-01, acc. prob=0.963]


[outer 024] TRAIN (EMA+K-ens) ll=0.6797  br=0.2427  acc=0.6820


Sample: 100%|██████████| 330/330 [00:19, 16.88it/s, step size=2.05e-01, acc. prob=0.941]


[outer 025] TRAIN (EMA+K-ens) ll=0.6798  br=0.2424  acc=0.6980


Sample: 100%|██████████| 330/330 [00:22, 14.43it/s, step size=2.46e-01, acc. prob=0.927]


[outer 026] TRAIN (EMA+K-ens) ll=0.6817  br=0.2434  acc=0.6980


Sample: 100%|██████████| 330/330 [00:23, 13.89it/s, step size=1.84e-01, acc. prob=0.940]


[outer 027] TRAIN (EMA+K-ens) ll=0.6814  br=0.2435  acc=0.6960


Sample: 100%|██████████| 330/330 [00:23, 14.18it/s, step size=1.92e-01, acc. prob=0.945]


[outer 028] TRAIN (EMA+K-ens) ll=0.6667  br=0.2365  acc=0.6980


Sample: 100%|██████████| 330/330 [00:23, 14.25it/s, step size=1.78e-01, acc. prob=0.951]


[outer 029] TRAIN (EMA+K-ens) ll=0.6746  br=0.2403  acc=0.6960


Sample: 100%|██████████| 330/330 [00:20, 15.89it/s, step size=1.84e-01, acc. prob=0.958]


[outer 030] TRAIN (EMA+K-ens) ll=0.6779  br=0.2419  acc=0.6980


Sample: 100%|██████████| 330/330 [00:18, 17.45it/s, step size=2.37e-01, acc. prob=0.933]


[outer 031] TRAIN (EMA+K-ens) ll=0.6768  br=0.2414  acc=0.6980


Sample: 100%|██████████| 330/330 [00:20, 16.20it/s, step size=2.18e-01, acc. prob=0.947]


[outer 032] TRAIN (EMA+K-ens) ll=0.6729  br=0.2396  acc=0.6980


Sample: 100%|██████████| 330/330 [00:21, 15.19it/s, step size=1.76e-01, acc. prob=0.949]


[outer 033] TRAIN (EMA+K-ens) ll=0.6597  br=0.2332  acc=0.6980
[Early stop @ outer 33] Δll=0.364%, Δbr=0.464%, Δacc=0.002
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}, {'accuracy': 0.65065997838974, 'brier': 0.2482144832611084, 'logloss': 0.6926732063293457}, {'accuracy': 0.6920199990272522, 'brier': 0.22573788464069366, 'logloss': 0.6445251107215881}, {'accuracy': 0.6538400053977966, 'brier': 0.2408851832151413, 'logloss': 0.6783546805381775}, {'accuracy': 0.6790399551391602, 'brier': 0.2522827386856079, 'logloss': 0.6986757516860962}, {'accuracy': 0.6533399820327759, 'brier': 0.2420029491186142, 'logloss': 0.6774113774299622}]


Sample: 100%|██████████| 330/330 [00:23, 14.33it/s, step size=2.01e-01, acc. prob=0.954]


[outer 000] TRAIN (EMA+K-ens) ll=0.6594  br=0.2324  acc=0.6510


Sample: 100%|██████████| 330/330 [00:20, 16.04it/s, step size=1.89e-01, acc. prob=0.946]


[outer 001] TRAIN (EMA+K-ens) ll=0.6560  br=0.2315  acc=0.6690


Sample: 100%|██████████| 330/330 [00:21, 15.15it/s, step size=1.72e-01, acc. prob=0.962]


[outer 002] TRAIN (EMA+K-ens) ll=0.6536  br=0.2305  acc=0.6410


Sample: 100%|██████████| 330/330 [00:22, 15.00it/s, step size=2.33e-01, acc. prob=0.920]


[outer 003] TRAIN (EMA+K-ens) ll=0.6457  br=0.2265  acc=0.6940


Sample: 100%|██████████| 330/330 [00:22, 14.90it/s, step size=2.17e-01, acc. prob=0.938]


[outer 004] TRAIN (EMA+K-ens) ll=0.6504  br=0.2287  acc=0.6950


Sample: 100%|██████████| 330/330 [00:19, 17.14it/s, step size=1.88e-01, acc. prob=0.947]


[outer 005] TRAIN (EMA+K-ens) ll=0.6472  br=0.2272  acc=0.6950


Sample: 100%|██████████| 330/330 [00:21, 15.66it/s, step size=2.05e-01, acc. prob=0.967]


[outer 006] TRAIN (EMA+K-ens) ll=0.6526  br=0.2299  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 16.33it/s, step size=2.14e-01, acc. prob=0.936]


[outer 007] TRAIN (EMA+K-ens) ll=0.6505  br=0.2288  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.76it/s, step size=1.58e-01, acc. prob=0.964]


[outer 008] TRAIN (EMA+K-ens) ll=0.6506  br=0.2288  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.40it/s, step size=1.73e-01, acc. prob=0.948]


[outer 009] TRAIN (EMA+K-ens) ll=0.6600  br=0.2334  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 15.73it/s, step size=2.06e-01, acc. prob=0.954]


[outer 010] TRAIN (EMA+K-ens) ll=0.6652  br=0.2359  acc=0.6530


Sample: 100%|██████████| 330/330 [00:21, 15.69it/s, step size=1.60e-01, acc. prob=0.965]


[outer 011] TRAIN (EMA+K-ens) ll=0.6627  br=0.2347  acc=0.6720


Sample: 100%|██████████| 330/330 [00:21, 15.49it/s, step size=1.94e-01, acc. prob=0.946]


[outer 012] TRAIN (EMA+K-ens) ll=0.6568  br=0.2318  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.78it/s, step size=1.78e-01, acc. prob=0.961]


[outer 013] TRAIN (EMA+K-ens) ll=0.6627  br=0.2347  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.72it/s, step size=1.95e-01, acc. prob=0.947]


[outer 014] TRAIN (EMA+K-ens) ll=0.6584  br=0.2325  acc=0.6950


Sample: 100%|██████████| 330/330 [00:21, 15.40it/s, step size=1.96e-01, acc. prob=0.974]


[outer 015] TRAIN (EMA+K-ens) ll=0.6684  br=0.2374  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 15.95it/s, step size=2.19e-01, acc. prob=0.965]


[outer 016] TRAIN (EMA+K-ens) ll=0.6735  br=0.2398  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 15.95it/s, step size=1.77e-01, acc. prob=0.966]


[outer 017] TRAIN (EMA+K-ens) ll=0.6639  br=0.2352  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 16.07it/s, step size=2.18e-01, acc. prob=0.914]


[outer 018] TRAIN (EMA+K-ens) ll=0.6646  br=0.2354  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 16.32it/s, step size=2.55e-01, acc. prob=0.938]


[outer 019] TRAIN (EMA+K-ens) ll=0.6753  br=0.2406  acc=0.6890


Sample: 100%|██████████| 330/330 [00:21, 15.16it/s, step size=1.94e-01, acc. prob=0.958]


[outer 020] TRAIN (EMA+K-ens) ll=0.6811  br=0.2434  acc=0.6790


Sample: 100%|██████████| 330/330 [00:20, 16.39it/s, step size=1.95e-01, acc. prob=0.943]


[outer 021] TRAIN (EMA+K-ens) ll=0.6830  br=0.2445  acc=0.6760


Sample: 100%|██████████| 330/330 [00:20, 16.07it/s, step size=1.99e-01, acc. prob=0.930]


[outer 022] TRAIN (EMA+K-ens) ll=0.6736  br=0.2399  acc=0.6930


Sample: 100%|██████████| 330/330 [00:22, 14.52it/s, step size=2.13e-01, acc. prob=0.950]


[outer 023] TRAIN (EMA+K-ens) ll=0.6710  br=0.2387  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.77it/s, step size=1.84e-01, acc. prob=0.948]


[outer 024] TRAIN (EMA+K-ens) ll=0.6675  br=0.2370  acc=0.6940


Sample: 100%|██████████| 330/330 [00:22, 14.59it/s, step size=1.84e-01, acc. prob=0.964]


[outer 025] TRAIN (EMA+K-ens) ll=0.6656  br=0.2360  acc=0.6950


Sample: 100%|██████████| 330/330 [00:21, 15.40it/s, step size=2.24e-01, acc. prob=0.959]


[outer 026] TRAIN (EMA+K-ens) ll=0.6707  br=0.2385  acc=0.6940


Sample: 100%|██████████| 330/330 [00:21, 15.23it/s, step size=2.08e-01, acc. prob=0.937]


[outer 027] TRAIN (EMA+K-ens) ll=0.6735  br=0.2400  acc=0.6420


Sample: 100%|██████████| 330/330 [00:19, 17.28it/s, step size=2.07e-01, acc. prob=0.937]


[outer 028] TRAIN (EMA+K-ens) ll=0.6752  br=0.2408  acc=0.6390


Sample: 100%|██████████| 330/330 [00:22, 14.90it/s, step size=1.80e-01, acc. prob=0.963]


[outer 029] TRAIN (EMA+K-ens) ll=0.6726  br=0.2395  acc=0.6510


Sample: 100%|██████████| 330/330 [00:19, 17.30it/s, step size=2.10e-01, acc. prob=0.932]


[outer 030] TRAIN (EMA+K-ens) ll=0.6754  br=0.2408  acc=0.6650


Sample: 100%|██████████| 330/330 [00:21, 15.33it/s, step size=1.87e-01, acc. prob=0.945]


[outer 031] TRAIN (EMA+K-ens) ll=0.6722  br=0.2392  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.47it/s, step size=1.90e-01, acc. prob=0.953]


[outer 032] TRAIN (EMA+K-ens) ll=0.6760  br=0.2410  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.40it/s, step size=1.72e-01, acc. prob=0.937]


[outer 033] TRAIN (EMA+K-ens) ll=0.6840  br=0.2447  acc=0.6900


Sample: 100%|██████████| 330/330 [00:21, 15.68it/s, step size=1.87e-01, acc. prob=0.952]


[outer 034] TRAIN (EMA+K-ens) ll=0.6734  br=0.2397  acc=0.6950


Sample: 100%|██████████| 330/330 [00:22, 14.42it/s, step size=1.85e-01, acc. prob=0.936]


[outer 035] TRAIN (EMA+K-ens) ll=0.6747  br=0.2405  acc=0.6840


Sample: 100%|██████████| 330/330 [00:19, 16.72it/s, step size=2.18e-01, acc. prob=0.954]


[outer 036] TRAIN (EMA+K-ens) ll=0.6674  br=0.2370  acc=0.6910


Sample: 100%|██████████| 330/330 [00:22, 14.59it/s, step size=1.86e-01, acc. prob=0.970]


[outer 037] TRAIN (EMA+K-ens) ll=0.6615  br=0.2342  acc=0.6950


Sample: 100%|██████████| 330/330 [00:23, 14.31it/s, step size=1.86e-01, acc. prob=0.945]


[outer 038] TRAIN (EMA+K-ens) ll=0.6671  br=0.2369  acc=0.6950


Sample: 100%|██████████| 330/330 [00:20, 16.09it/s, step size=2.16e-01, acc. prob=0.963]


[outer 039] TRAIN (EMA+K-ens) ll=0.6642  br=0.2355  acc=0.6950
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}, {'accuracy': 0.65065997838974, 'brier': 0.2482144832611084, 'logloss': 0.6926732063293457}, {'accuracy': 0.6920199990272522, 'brier': 0.22573788464069366, 'logloss': 0.6445251107215881}, {'accuracy': 0.6538400053977966, 'brier': 0.2408851832151413, 'logloss': 0.6783546805381775}, {'accuracy': 0.6790399551391602, 'brier': 0.2522827386856079, 'logloss': 0.6986757516860962}, {'accuracy': 0.6533399820327759, 'brier': 0.2420029491186142, 'logloss': 0.6774113774299622}, {'accuracy': 0.6782799959182739, 'brier': 0.2289452999830246, 'logloss': 0.6507743000984192}]


Sample: 100%|██████████| 330/330 [00:22, 14.42it/s, step size=1.76e-01, acc. prob=0.953]


[outer 000] TRAIN (EMA+K-ens) ll=0.6736  br=0.2402  acc=0.6170


Sample: 100%|██████████| 330/330 [00:21, 15.42it/s, step size=1.84e-01, acc. prob=0.952]


[outer 001] TRAIN (EMA+K-ens) ll=0.6432  br=0.2253  acc=0.7050


Sample: 100%|██████████| 330/330 [00:20, 15.90it/s, step size=1.80e-01, acc. prob=0.962]


[outer 002] TRAIN (EMA+K-ens) ll=0.6518  br=0.2293  acc=0.7100


Sample: 100%|██████████| 330/330 [00:20, 16.17it/s, step size=1.92e-01, acc. prob=0.938]


[outer 003] TRAIN (EMA+K-ens) ll=0.6466  br=0.2267  acc=0.7090


Sample: 100%|██████████| 330/330 [00:21, 15.45it/s, step size=1.71e-01, acc. prob=0.953]


[outer 004] TRAIN (EMA+K-ens) ll=0.6601  br=0.2332  acc=0.7090


Sample: 100%|██████████| 330/330 [00:23, 14.16it/s, step size=2.01e-01, acc. prob=0.915]


[outer 005] TRAIN (EMA+K-ens) ll=0.6587  br=0.2325  acc=0.7090


Sample: 100%|██████████| 330/330 [00:22, 14.75it/s, step size=1.61e-01, acc. prob=0.973]


[outer 006] TRAIN (EMA+K-ens) ll=0.6541  br=0.2304  acc=0.7090


Sample: 100%|██████████| 330/330 [00:21, 15.70it/s, step size=1.87e-01, acc. prob=0.951]


[outer 007] TRAIN (EMA+K-ens) ll=0.6517  br=0.2292  acc=0.7090


Sample: 100%|██████████| 330/330 [00:20, 15.75it/s, step size=1.91e-01, acc. prob=0.927]


[outer 008] TRAIN (EMA+K-ens) ll=0.6567  br=0.2316  acc=0.7080


Sample: 100%|██████████| 330/330 [00:22, 14.61it/s, step size=1.90e-01, acc. prob=0.963]


[outer 009] TRAIN (EMA+K-ens) ll=0.6617  br=0.2338  acc=0.7070


Sample: 100%|██████████| 330/330 [00:21, 15.14it/s, step size=2.20e-01, acc. prob=0.953]


[outer 010] TRAIN (EMA+K-ens) ll=0.6617  br=0.2337  acc=0.7140


Sample: 100%|██████████| 330/330 [00:21, 15.32it/s, step size=1.50e-01, acc. prob=0.961]


[outer 011] TRAIN (EMA+K-ens) ll=0.6547  br=0.2304  acc=0.7090


Sample: 100%|██████████| 330/330 [00:21, 15.68it/s, step size=2.06e-01, acc. prob=0.960]


[outer 012] TRAIN (EMA+K-ens) ll=0.6450  br=0.2259  acc=0.7090


Sample: 100%|██████████| 330/330 [00:21, 15.47it/s, step size=2.13e-01, acc. prob=0.954]


[outer 013] TRAIN (EMA+K-ens) ll=0.6459  br=0.2263  acc=0.7090


Sample: 100%|██████████| 330/330 [00:21, 15.09it/s, step size=1.93e-01, acc. prob=0.953]


[outer 014] TRAIN (EMA+K-ens) ll=0.6492  br=0.2278  acc=0.7170


Sample: 100%|██████████| 330/330 [00:22, 14.85it/s, step size=1.77e-01, acc. prob=0.956]


[outer 015] TRAIN (EMA+K-ens) ll=0.6543  br=0.2303  acc=0.7110
[Early stop @ outer 15] Δll=0.159%, Δbr=0.303%, Δacc=0.002
[{'accuracy': 0.6675599813461304, 'brier': 0.24638380110263824, 'logloss': 0.6881276369094849}, {'accuracy': 0.6920799612998962, 'brier': 0.2348405420780182, 'logloss': 0.6632438898086548}, {'accuracy': 0.6967799663543701, 'brier': 0.2275390326976776, 'logloss': 0.6475812196731567}, {'accuracy': 0.65065997838974, 'brier': 0.2482144832611084, 'logloss': 0.6926732063293457}, {'accuracy': 0.6920199990272522, 'brier': 0.22573788464069366, 'logloss': 0.6445251107215881}, {'accuracy': 0.6538400053977966, 'brier': 0.2408851832151413, 'logloss': 0.6783546805381775}, {'accuracy': 0.6790399551391602, 'brier': 0.2522827386856079, 'logloss': 0.6986757516860962}, {'accuracy': 0.6533399820327759, 'brier': 0.2420029491186142, 'logloss': 0.6774113774299622}, {'accuracy': 0.6782799959182739, 'brier': 0.2289452999830246, 'logloss': 0.6507743000984192}, {'accuracy': 0.6693599820137024
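
Each entry in the list printed above is one seed's test-set result, with `accuracy`, `brier`, and `logloss` keys. The following is a minimal, dependency-free sketch of how such metrics can be computed from predicted probabilities and then summarized across seeds. The helper names `binary_metrics` and `summarize` are hypothetical and do not appear in the notebook; how `fit_ksd_bayes_nuts_ema_ensemble` computes its metrics internally is not shown here, so this is an assumption about the standard definitions rather than a copy of that code. The three example rows are rounded values of the first three entries printed above.

```python
import math
import statistics


def binary_metrics(y_true, p_pred, eps=1e-12):
    """Accuracy, Brier score, and log loss for 0/1 labels and predicted P(y=1).

    Hypothetical helper: standard textbook definitions, not the notebook's code.
    """
    n = len(y_true)
    pairs = list(zip(y_true, p_pred))
    # Classify as 1 when the predicted probability is at least 0.5.
    acc = sum((p >= 0.5) == bool(y) for y, p in pairs) / n
    # Brier score: mean squared error between probability and label.
    brier = sum((p - y) ** 2 for y, p in pairs) / n
    # Log loss: mean negative Bernoulli log-likelihood, clipped for stability.
    logloss = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for y, p in pairs
    ) / n
    return {"accuracy": acc, "brier": brier, "logloss": logloss}


def summarize(all_metrics):
    """Per-metric mean, sample std (ddof=1), and median over a list of dicts."""
    summary = {}
    for key in all_metrics[0]:
        values = [m[key] for m in all_metrics]
        summary[key] = {
            "mean": statistics.mean(values),
            "std": statistics.stdev(values) if len(values) > 1 else float("nan"),
            "median": statistics.median(values),
        }
    return summary


# Uninformative reference: always predicting 0.5 on balanced labels.
baseline = binary_metrics([0, 1, 1, 0], [0.5, 0.5, 0.5, 0.5])

# Rounded copies of the first three per-seed results printed above.
example_metrics = [
    {"accuracy": 0.6676, "brier": 0.2464, "logloss": 0.6881},
    {"accuracy": 0.6921, "brier": 0.2348, "logloss": 0.6632},
    {"accuracy": 0.6968, "brier": 0.2275, "logloss": 0.6476},
]
summary = summarize(example_metrics)

for name, stats in summary.items():
    print(name, {k: round(v, 4) for k, v in stats.items()})
print("baseline", baseline)
```

A constant prediction of 0.5 gives a Brier score of 0.25 and a log loss of ln 2 ≈ 0.693, which is a useful yardstick when reading the `ll` and `br` values in the training log. The sample standard deviation (`ddof=1`) is used because that is also the default of pandas `std`, which the aggregation cell relies on.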

In [None]:
all_metrics = []
noise_type = "normal"
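for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible.
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Small training set and a large held-out test set, both with the same noise type.
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type=noise_type,
        n_per_group=10000
    )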
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=True, nonlinear=True, group=True,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    all_metrics.append(res["metrics_test"])
    print(all_metrics)

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean','std','median'])
print(summary)
print(df)

Sample: 100%|██████████| 330/330 [00:16, 19.45it/s, step size=2.80e-01, acc. prob=0.943]


[outer 000] TRAIN (EMA+K-ens) ll=0.6735  br=0.2400  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 19.27it/s, step size=3.01e-01, acc. prob=0.939]


[outer 001] TRAIN (EMA+K-ens) ll=0.6628  br=0.2345  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=2.97e-01, acc. prob=0.947]


[outer 002] TRAIN (EMA+K-ens) ll=0.6683  br=0.2371  acc=0.6250


Sample: 100%|██████████| 330/330 [00:16, 20.07it/s, step size=3.24e-01, acc. prob=0.937]


[outer 003] TRAIN (EMA+K-ens) ll=0.6705  br=0.2382  acc=0.6360


Sample: 100%|██████████| 330/330 [00:19, 17.15it/s, step size=2.94e-01, acc. prob=0.935]


[outer 004] TRAIN (EMA+K-ens) ll=0.6667  br=0.2362  acc=0.6660


Sample: 100%|██████████| 330/330 [00:17, 18.46it/s, step size=2.65e-01, acc. prob=0.947]


[outer 005] TRAIN (EMA+K-ens) ll=0.6561  br=0.2313  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=2.54e-01, acc. prob=0.949]


[outer 006] TRAIN (EMA+K-ens) ll=0.6626  br=0.2345  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.35it/s, step size=2.93e-01, acc. prob=0.951]


[outer 007] TRAIN (EMA+K-ens) ll=0.6668  br=0.2365  acc=0.6640


Sample: 100%|██████████| 330/330 [00:18, 17.47it/s, step size=2.45e-01, acc. prob=0.955]


[outer 008] TRAIN (EMA+K-ens) ll=0.6724  br=0.2391  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.35it/s, step size=2.62e-01, acc. prob=0.951]


[outer 009] TRAIN (EMA+K-ens) ll=0.6761  br=0.2410  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=3.22e-01, acc. prob=0.918]


[outer 010] TRAIN (EMA+K-ens) ll=0.6792  br=0.2426  acc=0.6290


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=2.74e-01, acc. prob=0.960]


[outer 011] TRAIN (EMA+K-ens) ll=0.6767  br=0.2413  acc=0.6500


Sample: 100%|██████████| 330/330 [00:16, 19.74it/s, step size=3.58e-01, acc. prob=0.918]


[outer 012] TRAIN (EMA+K-ens) ll=0.6810  br=0.2435  acc=0.6540


Sample: 100%|██████████| 330/330 [00:16, 20.17it/s, step size=2.48e-01, acc. prob=0.960]


[outer 013] TRAIN (EMA+K-ens) ll=0.6905  br=0.2478  acc=0.6240


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=2.75e-01, acc. prob=0.951]


[outer 014] TRAIN (EMA+K-ens) ll=0.6890  br=0.2470  acc=0.6460


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.93e-01, acc. prob=0.941]


[outer 015] TRAIN (EMA+K-ens) ll=0.6865  br=0.2458  acc=0.6520


Sample: 100%|██████████| 330/330 [00:16, 20.30it/s, step size=2.86e-01, acc. prob=0.949]


[outer 016] TRAIN (EMA+K-ens) ll=0.6874  br=0.2463  acc=0.6290


Sample: 100%|██████████| 330/330 [00:17, 19.35it/s, step size=2.94e-01, acc. prob=0.936]


[outer 017] TRAIN (EMA+K-ens) ll=0.6865  br=0.2458  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 18.41it/s, step size=2.64e-01, acc. prob=0.935]


[outer 018] TRAIN (EMA+K-ens) ll=0.6838  br=0.2445  acc=0.6460


Sample: 100%|██████████| 330/330 [00:16, 19.97it/s, step size=2.65e-01, acc. prob=0.947]


[outer 019] TRAIN (EMA+K-ens) ll=0.6871  br=0.2461  acc=0.6380


Sample: 100%|██████████| 330/330 [00:16, 19.87it/s, step size=2.65e-01, acc. prob=0.943]


[outer 020] TRAIN (EMA+K-ens) ll=0.6831  br=0.2444  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.80e-01, acc. prob=0.905]


[outer 021] TRAIN (EMA+K-ens) ll=0.6816  br=0.2438  acc=0.6720


Sample: 100%|██████████| 330/330 [00:17, 18.88it/s, step size=2.65e-01, acc. prob=0.949]


[outer 022] TRAIN (EMA+K-ens) ll=0.6736  br=0.2399  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=3.26e-01, acc. prob=0.926]


[outer 023] TRAIN (EMA+K-ens) ll=0.6751  br=0.2406  acc=0.6700


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=3.00e-01, acc. prob=0.946]


[outer 024] TRAIN (EMA+K-ens) ll=0.6680  br=0.2371  acc=0.6570


Sample: 100%|██████████| 330/330 [00:18, 18.29it/s, step size=2.59e-01, acc. prob=0.955]


[outer 025] TRAIN (EMA+K-ens) ll=0.6635  br=0.2348  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 20.69it/s, step size=3.19e-01, acc. prob=0.946]


[outer 026] TRAIN (EMA+K-ens) ll=0.6631  br=0.2344  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.47it/s, step size=2.88e-01, acc. prob=0.960]


[outer 027] TRAIN (EMA+K-ens) ll=0.6649  br=0.2351  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 20.41it/s, step size=3.11e-01, acc. prob=0.932]


[outer 028] TRAIN (EMA+K-ens) ll=0.6644  br=0.2348  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=2.88e-01, acc. prob=0.913]


[outer 029] TRAIN (EMA+K-ens) ll=0.6643  br=0.2346  acc=0.6860


Sample: 100%|██████████| 330/330 [00:18, 18.17it/s, step size=2.86e-01, acc. prob=0.941]


[outer 030] TRAIN (EMA+K-ens) ll=0.6698  br=0.2375  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.54e-01, acc. prob=0.958]


[outer 031] TRAIN (EMA+K-ens) ll=0.6687  br=0.2369  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=3.11e-01, acc. prob=0.927]


[outer 032] TRAIN (EMA+K-ens) ll=0.6730  br=0.2391  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.55it/s, step size=2.48e-01, acc. prob=0.958]


[outer 033] TRAIN (EMA+K-ens) ll=0.6764  br=0.2409  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=2.90e-01, acc. prob=0.929]


[outer 034] TRAIN (EMA+K-ens) ll=0.6811  br=0.2432  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 20.23it/s, step size=2.63e-01, acc. prob=0.943]


[outer 035] TRAIN (EMA+K-ens) ll=0.6756  br=0.2406  acc=0.6490


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=2.76e-01, acc. prob=0.952]


[outer 036] TRAIN (EMA+K-ens) ll=0.6760  br=0.2410  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 19.04it/s, step size=2.80e-01, acc. prob=0.941]


[outer 037] TRAIN (EMA+K-ens) ll=0.6788  br=0.2422  acc=0.6360


Sample: 100%|██████████| 330/330 [00:15, 20.78it/s, step size=2.94e-01, acc. prob=0.949]


[outer 038] TRAIN (EMA+K-ens) ll=0.6791  br=0.2423  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 20.46it/s, step size=3.40e-01, acc. prob=0.926]


[outer 039] TRAIN (EMA+K-ens) ll=0.6784  br=0.2421  acc=0.6620
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}]


Sample: 100%|██████████| 330/330 [00:18, 17.84it/s, step size=2.63e-01, acc. prob=0.946]


[outer 000] TRAIN (EMA+K-ens) ll=0.6836  br=0.2451  acc=0.5810


Sample: 100%|██████████| 330/330 [00:16, 20.20it/s, step size=3.26e-01, acc. prob=0.904]


[outer 001] TRAIN (EMA+K-ens) ll=0.6772  br=0.2419  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 18.72it/s, step size=2.59e-01, acc. prob=0.947]


[outer 002] TRAIN (EMA+K-ens) ll=0.6785  br=0.2427  acc=0.5810


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.77e-01, acc. prob=0.933]


[outer 003] TRAIN (EMA+K-ens) ll=0.6742  br=0.2405  acc=0.6190


Sample: 100%|██████████| 330/330 [00:17, 18.87it/s, step size=2.95e-01, acc. prob=0.919]


[outer 004] TRAIN (EMA+K-ens) ll=0.6715  br=0.2392  acc=0.6390


Sample: 100%|██████████| 330/330 [00:15, 21.65it/s, step size=3.50e-01, acc. prob=0.872]


[outer 005] TRAIN (EMA+K-ens) ll=0.6774  br=0.2419  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 20.37it/s, step size=3.11e-01, acc. prob=0.925]


[outer 006] TRAIN (EMA+K-ens) ll=0.6776  br=0.2420  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 18.70it/s, step size=2.77e-01, acc. prob=0.943]


[outer 007] TRAIN (EMA+K-ens) ll=0.6826  br=0.2443  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=3.04e-01, acc. prob=0.944]


[outer 008] TRAIN (EMA+K-ens) ll=0.6879  br=0.2468  acc=0.6610


Sample: 100%|██████████| 330/330 [00:18, 17.96it/s, step size=3.08e-01, acc. prob=0.921]


[outer 009] TRAIN (EMA+K-ens) ll=0.6916  br=0.2485  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.01it/s, step size=2.88e-01, acc. prob=0.944]


[outer 010] TRAIN (EMA+K-ens) ll=0.6909  br=0.2479  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 19.08it/s, step size=3.09e-01, acc. prob=0.944]


[outer 011] TRAIN (EMA+K-ens) ll=0.6913  br=0.2480  acc=0.6210


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=2.88e-01, acc. prob=0.955]


[outer 012] TRAIN (EMA+K-ens) ll=0.6898  br=0.2472  acc=0.6090


Sample: 100%|██████████| 330/330 [00:17, 18.63it/s, step size=3.03e-01, acc. prob=0.934]


[outer 013] TRAIN (EMA+K-ens) ll=0.6852  br=0.2450  acc=0.6230


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=3.01e-01, acc. prob=0.950]


[outer 014] TRAIN (EMA+K-ens) ll=0.6853  br=0.2449  acc=0.6540


Sample: 100%|██████████| 330/330 [00:18, 18.20it/s, step size=3.09e-01, acc. prob=0.921]


[outer 015] TRAIN (EMA+K-ens) ll=0.6821  br=0.2436  acc=0.6290


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.94e-01, acc. prob=0.953]


[outer 016] TRAIN (EMA+K-ens) ll=0.6769  br=0.2412  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=2.93e-01, acc. prob=0.935]


[outer 017] TRAIN (EMA+K-ens) ll=0.6740  br=0.2398  acc=0.6820


Sample: 100%|██████████| 330/330 [00:19, 17.29it/s, step size=2.68e-01, acc. prob=0.955]


[outer 018] TRAIN (EMA+K-ens) ll=0.6697  br=0.2376  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 20.12it/s, step size=2.57e-01, acc. prob=0.935]


[outer 019] TRAIN (EMA+K-ens) ll=0.6709  br=0.2383  acc=0.6840


Sample: 100%|██████████| 330/330 [00:17, 18.67it/s, step size=3.04e-01, acc. prob=0.937]


[outer 020] TRAIN (EMA+K-ens) ll=0.6748  br=0.2402  acc=0.6890


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=3.41e-01, acc. prob=0.910]


[outer 021] TRAIN (EMA+K-ens) ll=0.6732  br=0.2395  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.54it/s, step size=3.06e-01, acc. prob=0.938]


[outer 022] TRAIN (EMA+K-ens) ll=0.6736  br=0.2395  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=3.04e-01, acc. prob=0.913]


[outer 023] TRAIN (EMA+K-ens) ll=0.6690  br=0.2375  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 18.82it/s, step size=3.13e-01, acc. prob=0.912]


[outer 024] TRAIN (EMA+K-ens) ll=0.6659  br=0.2359  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=2.99e-01, acc. prob=0.933]


[outer 025] TRAIN (EMA+K-ens) ll=0.6618  br=0.2339  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 20.35it/s, step size=3.12e-01, acc. prob=0.900]


[outer 026] TRAIN (EMA+K-ens) ll=0.6648  br=0.2355  acc=0.6760


Sample: 100%|██████████| 330/330 [00:18, 18.05it/s, step size=3.10e-01, acc. prob=0.902]


[outer 027] TRAIN (EMA+K-ens) ll=0.6643  br=0.2353  acc=0.6840


Sample: 100%|██████████| 330/330 [00:17, 18.50it/s, step size=2.70e-01, acc. prob=0.949]


[outer 028] TRAIN (EMA+K-ens) ll=0.6630  br=0.2346  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 20.66it/s, step size=3.21e-01, acc. prob=0.928]


[outer 029] TRAIN (EMA+K-ens) ll=0.6677  br=0.2369  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.81it/s, step size=3.18e-01, acc. prob=0.957]


[outer 030] TRAIN (EMA+K-ens) ll=0.6689  br=0.2375  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 19.66it/s, step size=3.17e-01, acc. prob=0.904]


[outer 031] TRAIN (EMA+K-ens) ll=0.6697  br=0.2376  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 20.95it/s, step size=3.03e-01, acc. prob=0.931]


[outer 032] TRAIN (EMA+K-ens) ll=0.6758  br=0.2408  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 20.07it/s, step size=3.39e-01, acc. prob=0.903]


[outer 033] TRAIN (EMA+K-ens) ll=0.6774  br=0.2417  acc=0.6610


Sample: 100%|██████████| 330/330 [00:15, 21.35it/s, step size=2.59e-01, acc. prob=0.945]


[outer 034] TRAIN (EMA+K-ens) ll=0.6768  br=0.2414  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=3.09e-01, acc. prob=0.912]


[outer 035] TRAIN (EMA+K-ens) ll=0.6812  br=0.2434  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=3.02e-01, acc. prob=0.939]


[outer 036] TRAIN (EMA+K-ens) ll=0.6821  br=0.2439  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=2.77e-01, acc. prob=0.948]


[outer 037] TRAIN (EMA+K-ens) ll=0.6803  br=0.2432  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=3.53e-01, acc. prob=0.928]


[outer 038] TRAIN (EMA+K-ens) ll=0.6754  br=0.2409  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 19.89it/s, step size=2.52e-01, acc. prob=0.958]


[outer 039] TRAIN (EMA+K-ens) ll=0.6779  br=0.2421  acc=0.6450
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}]


Sample: 100%|██████████| 330/330 [00:16, 19.91it/s, step size=3.00e-01, acc. prob=0.932]


[outer 000] TRAIN (EMA+K-ens) ll=0.7130  br=0.2586  acc=0.5780


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=3.23e-01, acc. prob=0.925]


[outer 001] TRAIN (EMA+K-ens) ll=0.6775  br=0.2419  acc=0.6320


Sample: 100%|██████████| 330/330 [00:15, 21.26it/s, step size=3.43e-01, acc. prob=0.916]


[outer 002] TRAIN (EMA+K-ens) ll=0.6618  br=0.2342  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 19.52it/s, step size=2.59e-01, acc. prob=0.930]


[outer 003] TRAIN (EMA+K-ens) ll=0.6716  br=0.2388  acc=0.6780


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=3.15e-01, acc. prob=0.929]


[outer 004] TRAIN (EMA+K-ens) ll=0.6679  br=0.2370  acc=0.6880


Sample: 100%|██████████| 330/330 [00:18, 18.07it/s, step size=2.45e-01, acc. prob=0.958]


[outer 005] TRAIN (EMA+K-ens) ll=0.6662  br=0.2361  acc=0.7000


Sample: 100%|██████████| 330/330 [00:17, 19.03it/s, step size=2.83e-01, acc. prob=0.947]


[outer 006] TRAIN (EMA+K-ens) ll=0.6706  br=0.2380  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 20.03it/s, step size=3.29e-01, acc. prob=0.918]


[outer 007] TRAIN (EMA+K-ens) ll=0.6702  br=0.2378  acc=0.6960


Sample: 100%|██████████| 330/330 [00:18, 18.23it/s, step size=3.22e-01, acc. prob=0.930]


[outer 008] TRAIN (EMA+K-ens) ll=0.6677  br=0.2364  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=2.70e-01, acc. prob=0.944]


[outer 009] TRAIN (EMA+K-ens) ll=0.6735  br=0.2386  acc=0.6490


Sample: 100%|██████████| 330/330 [00:17, 18.84it/s, step size=2.98e-01, acc. prob=0.938]


[outer 010] TRAIN (EMA+K-ens) ll=0.6789  br=0.2414  acc=0.6320


Sample: 100%|██████████| 330/330 [00:16, 19.52it/s, step size=2.47e-01, acc. prob=0.946]


[outer 011] TRAIN (EMA+K-ens) ll=0.6730  br=0.2388  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.19it/s, step size=3.64e-01, acc. prob=0.865]


[outer 012] TRAIN (EMA+K-ens) ll=0.6781  br=0.2412  acc=0.6660


Sample: 100%|██████████| 330/330 [00:18, 18.20it/s, step size=2.67e-01, acc. prob=0.952]


[outer 013] TRAIN (EMA+K-ens) ll=0.6747  br=0.2399  acc=0.6520


Sample: 100%|██████████| 330/330 [00:16, 20.35it/s, step size=2.54e-01, acc. prob=0.967]


[outer 014] TRAIN (EMA+K-ens) ll=0.6747  br=0.2402  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 21.13it/s, step size=2.78e-01, acc. prob=0.942]


[outer 015] TRAIN (EMA+K-ens) ll=0.6763  br=0.2409  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 19.80it/s, step size=3.71e-01, acc. prob=0.918]


[outer 016] TRAIN (EMA+K-ens) ll=0.6761  br=0.2407  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 19.82it/s, step size=2.59e-01, acc. prob=0.941]


[outer 017] TRAIN (EMA+K-ens) ll=0.6758  br=0.2407  acc=0.6780


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=2.52e-01, acc. prob=0.945]


[outer 018] TRAIN (EMA+K-ens) ll=0.6756  br=0.2406  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.42it/s, step size=2.73e-01, acc. prob=0.930]


[outer 019] TRAIN (EMA+K-ens) ll=0.6740  br=0.2397  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 18.71it/s, step size=3.22e-01, acc. prob=0.929]


[outer 020] TRAIN (EMA+K-ens) ll=0.6729  br=0.2391  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=2.40e-01, acc. prob=0.956]


[outer 021] TRAIN (EMA+K-ens) ll=0.6785  br=0.2415  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=3.19e-01, acc. prob=0.937]


[outer 022] TRAIN (EMA+K-ens) ll=0.6789  br=0.2417  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=2.60e-01, acc. prob=0.942]


[outer 023] TRAIN (EMA+K-ens) ll=0.6788  br=0.2418  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.56it/s, step size=2.94e-01, acc. prob=0.925]


[outer 024] TRAIN (EMA+K-ens) ll=0.6802  br=0.2426  acc=0.6520


Sample: 100%|██████████| 330/330 [00:18, 17.58it/s, step size=3.44e-01, acc. prob=0.881]


[outer 025] TRAIN (EMA+K-ens) ll=0.6813  br=0.2430  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.51it/s, step size=3.64e-01, acc. prob=0.911]


[outer 026] TRAIN (EMA+K-ens) ll=0.6795  br=0.2423  acc=0.6810


Sample: 100%|██████████| 330/330 [00:18, 17.85it/s, step size=2.85e-01, acc. prob=0.940]


[outer 027] TRAIN (EMA+K-ens) ll=0.6787  br=0.2420  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 20.55it/s, step size=3.11e-01, acc. prob=0.941]


[outer 028] TRAIN (EMA+K-ens) ll=0.6769  br=0.2411  acc=0.6820


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=3.58e-01, acc. prob=0.919]


[outer 029] TRAIN (EMA+K-ens) ll=0.6767  br=0.2411  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=3.19e-01, acc. prob=0.956]


[outer 030] TRAIN (EMA+K-ens) ll=0.6729  br=0.2390  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.35it/s, step size=3.29e-01, acc. prob=0.932]


[outer 031] TRAIN (EMA+K-ens) ll=0.6740  br=0.2394  acc=0.7080


Sample: 100%|██████████| 330/330 [00:17, 19.00it/s, step size=3.09e-01, acc. prob=0.932]


[outer 032] TRAIN (EMA+K-ens) ll=0.6738  br=0.2395  acc=0.6940


Sample: 100%|██████████| 330/330 [00:16, 19.91it/s, step size=2.98e-01, acc. prob=0.939]


[outer 033] TRAIN (EMA+K-ens) ll=0.6739  br=0.2397  acc=0.7010


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=2.69e-01, acc. prob=0.917]


[outer 034] TRAIN (EMA+K-ens) ll=0.6752  br=0.2401  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 19.02it/s, step size=2.94e-01, acc. prob=0.925]


[outer 035] TRAIN (EMA+K-ens) ll=0.6783  br=0.2419  acc=0.6690


Sample: 100%|██████████| 330/330 [00:15, 21.31it/s, step size=3.09e-01, acc. prob=0.938]


[outer 036] TRAIN (EMA+K-ens) ll=0.6765  br=0.2411  acc=0.6730


Sample: 100%|██████████| 330/330 [00:17, 18.72it/s, step size=3.17e-01, acc. prob=0.934]


[outer 037] TRAIN (EMA+K-ens) ll=0.6766  br=0.2412  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=2.76e-01, acc. prob=0.936]


[outer 038] TRAIN (EMA+K-ens) ll=0.6802  br=0.2431  acc=0.6880
[Early stop @ outer 38] Δll=0.115%, Δbr=0.078%, Δacc=0.002
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}]


Sample: 100%|██████████| 330/330 [00:17, 18.90it/s, step size=3.44e-01, acc. prob=0.938]


[outer 000] TRAIN (EMA+K-ens) ll=0.6614  br=0.2338  acc=0.6720


Sample: 100%|██████████| 330/330 [00:18, 18.03it/s, step size=2.53e-01, acc. prob=0.945]


[outer 001] TRAIN (EMA+K-ens) ll=0.6738  br=0.2403  acc=0.5990


Sample: 100%|██████████| 330/330 [00:17, 18.85it/s, step size=2.64e-01, acc. prob=0.950]


[outer 002] TRAIN (EMA+K-ens) ll=0.6673  br=0.2369  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=3.05e-01, acc. prob=0.932]


[outer 003] TRAIN (EMA+K-ens) ll=0.6676  br=0.2369  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 18.39it/s, step size=3.10e-01, acc. prob=0.920]


[outer 004] TRAIN (EMA+K-ens) ll=0.6675  br=0.2367  acc=0.6930


Sample: 100%|██████████| 330/330 [00:18, 18.14it/s, step size=3.32e-01, acc. prob=0.902]


[outer 005] TRAIN (EMA+K-ens) ll=0.6605  br=0.2334  acc=0.6960


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=3.14e-01, acc. prob=0.936]


[outer 006] TRAIN (EMA+K-ens) ll=0.6630  br=0.2346  acc=0.7030


Sample: 100%|██████████| 330/330 [00:15, 20.89it/s, step size=2.60e-01, acc. prob=0.942]


[outer 007] TRAIN (EMA+K-ens) ll=0.6624  br=0.2344  acc=0.7030


Sample: 100%|██████████| 330/330 [00:16, 19.52it/s, step size=3.04e-01, acc. prob=0.948]


[outer 008] TRAIN (EMA+K-ens) ll=0.6656  br=0.2360  acc=0.6990


Sample: 100%|██████████| 330/330 [00:17, 18.97it/s, step size=2.84e-01, acc. prob=0.936]


[outer 009] TRAIN (EMA+K-ens) ll=0.6657  br=0.2361  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 18.50it/s, step size=2.49e-01, acc. prob=0.956]


[outer 010] TRAIN (EMA+K-ens) ll=0.6621  br=0.2343  acc=0.7000


Sample: 100%|██████████| 330/330 [00:17, 19.09it/s, step size=3.02e-01, acc. prob=0.900]


[outer 011] TRAIN (EMA+K-ens) ll=0.6665  br=0.2364  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=2.95e-01, acc. prob=0.927]


[outer 012] TRAIN (EMA+K-ens) ll=0.6651  br=0.2357  acc=0.7000


Sample: 100%|██████████| 330/330 [00:16, 19.97it/s, step size=3.29e-01, acc. prob=0.934]


[outer 013] TRAIN (EMA+K-ens) ll=0.6710  br=0.2387  acc=0.6860


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.86e-01, acc. prob=0.941]


[outer 014] TRAIN (EMA+K-ens) ll=0.6701  br=0.2382  acc=0.6810


Sample: 100%|██████████| 330/330 [00:18, 17.44it/s, step size=2.83e-01, acc. prob=0.944]


[outer 015] TRAIN (EMA+K-ens) ll=0.6678  br=0.2371  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 19.65it/s, step size=2.53e-01, acc. prob=0.947]


[outer 016] TRAIN (EMA+K-ens) ll=0.6669  br=0.2366  acc=0.6930


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.83e-01, acc. prob=0.928]


[outer 017] TRAIN (EMA+K-ens) ll=0.6651  br=0.2356  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.03it/s, step size=2.97e-01, acc. prob=0.937]


[outer 018] TRAIN (EMA+K-ens) ll=0.6701  br=0.2381  acc=0.6780


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=3.01e-01, acc. prob=0.943]


[outer 019] TRAIN (EMA+K-ens) ll=0.6654  br=0.2359  acc=0.6880


Sample: 100%|██████████| 330/330 [00:16, 20.18it/s, step size=2.90e-01, acc. prob=0.933]


[outer 020] TRAIN (EMA+K-ens) ll=0.6645  br=0.2355  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 20.31it/s, step size=3.06e-01, acc. prob=0.916]


[outer 021] TRAIN (EMA+K-ens) ll=0.6705  br=0.2384  acc=0.6760


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=3.05e-01, acc. prob=0.919]


[outer 022] TRAIN (EMA+K-ens) ll=0.6711  br=0.2388  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 19.36it/s, step size=3.11e-01, acc. prob=0.941]


[outer 023] TRAIN (EMA+K-ens) ll=0.6717  br=0.2390  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=2.90e-01, acc. prob=0.930]


[outer 024] TRAIN (EMA+K-ens) ll=0.6747  br=0.2404  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 20.03it/s, step size=2.70e-01, acc. prob=0.936]


[outer 025] TRAIN (EMA+K-ens) ll=0.6769  br=0.2416  acc=0.6720


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=3.02e-01, acc. prob=0.946]


[outer 026] TRAIN (EMA+K-ens) ll=0.6751  br=0.2406  acc=0.6840


Sample: 100%|██████████| 330/330 [00:18, 18.21it/s, step size=2.36e-01, acc. prob=0.960]


[outer 027] TRAIN (EMA+K-ens) ll=0.6756  br=0.2408  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 18.91it/s, step size=3.21e-01, acc. prob=0.921]


[outer 028] TRAIN (EMA+K-ens) ll=0.6741  br=0.2400  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=2.79e-01, acc. prob=0.947]


[outer 029] TRAIN (EMA+K-ens) ll=0.6656  br=0.2358  acc=0.6810


Sample: 100%|██████████| 330/330 [00:18, 17.87it/s, step size=2.58e-01, acc. prob=0.949]


[outer 030] TRAIN (EMA+K-ens) ll=0.6637  br=0.2348  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 18.77it/s, step size=2.53e-01, acc. prob=0.933]


[outer 031] TRAIN (EMA+K-ens) ll=0.6653  br=0.2356  acc=0.6740


Sample: 100%|██████████| 330/330 [00:15, 21.20it/s, step size=2.69e-01, acc. prob=0.949]


[outer 032] TRAIN (EMA+K-ens) ll=0.6624  br=0.2341  acc=0.7010
[Early stop @ outer 32] Δll=0.279%, Δbr=0.460%, Δacc=0.003
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}, {'accuracy': 0.6252399682998657, 'brier': 0.25350481271743774, 'logloss': 0.7039424777030945}]


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=2.77e-01, acc. prob=0.926]


[outer 000] TRAIN (EMA+K-ens) ll=0.7415  br=0.2719  acc=0.4550


Sample: 100%|██████████| 330/330 [00:16, 20.13it/s, step size=3.07e-01, acc. prob=0.924]


[outer 001] TRAIN (EMA+K-ens) ll=0.7107  br=0.2575  acc=0.5410


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=3.51e-01, acc. prob=0.917]


[outer 002] TRAIN (EMA+K-ens) ll=0.6990  br=0.2522  acc=0.6330


Sample: 100%|██████████| 330/330 [00:17, 18.88it/s, step size=2.34e-01, acc. prob=0.968]


[outer 003] TRAIN (EMA+K-ens) ll=0.6914  br=0.2487  acc=0.6540


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=2.66e-01, acc. prob=0.956]


[outer 004] TRAIN (EMA+K-ens) ll=0.6931  br=0.2496  acc=0.6430


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=2.99e-01, acc. prob=0.921]


[outer 005] TRAIN (EMA+K-ens) ll=0.6949  br=0.2505  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 18.35it/s, step size=2.89e-01, acc. prob=0.960]


[outer 006] TRAIN (EMA+K-ens) ll=0.6950  br=0.2505  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=2.89e-01, acc. prob=0.932]


[outer 007] TRAIN (EMA+K-ens) ll=0.6994  br=0.2527  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 20.17it/s, step size=2.47e-01, acc. prob=0.965]


[outer 008] TRAIN (EMA+K-ens) ll=0.6971  br=0.2515  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 18.57it/s, step size=2.93e-01, acc. prob=0.914]


[outer 009] TRAIN (EMA+K-ens) ll=0.6996  br=0.2526  acc=0.6290


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.51e-01, acc. prob=0.941]


[outer 010] TRAIN (EMA+K-ens) ll=0.6996  br=0.2526  acc=0.6460


Sample: 100%|██████████| 330/330 [00:15, 21.14it/s, step size=2.93e-01, acc. prob=0.963]


[outer 011] TRAIN (EMA+K-ens) ll=0.7038  br=0.2546  acc=0.6090


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=2.74e-01, acc. prob=0.934]


[outer 012] TRAIN (EMA+K-ens) ll=0.7026  br=0.2539  acc=0.6210


Sample: 100%|██████████| 330/330 [00:16, 19.75it/s, step size=2.94e-01, acc. prob=0.928]


[outer 013] TRAIN (EMA+K-ens) ll=0.7019  br=0.2536  acc=0.6070


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=2.68e-01, acc. prob=0.943]


[outer 014] TRAIN (EMA+K-ens) ll=0.6960  br=0.2507  acc=0.6210


Sample: 100%|██████████| 330/330 [00:18, 17.88it/s, step size=2.61e-01, acc. prob=0.955]


[outer 015] TRAIN (EMA+K-ens) ll=0.6854  br=0.2456  acc=0.6400


Sample: 100%|██████████| 330/330 [00:18, 17.58it/s, step size=2.68e-01, acc. prob=0.959]


[outer 016] TRAIN (EMA+K-ens) ll=0.6768  br=0.2414  acc=0.6720


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.35e-01, acc. prob=0.960]


[outer 017] TRAIN (EMA+K-ens) ll=0.6764  br=0.2412  acc=0.6490


Sample: 100%|██████████| 330/330 [00:17, 18.40it/s, step size=2.96e-01, acc. prob=0.940]


[outer 018] TRAIN (EMA+K-ens) ll=0.6764  br=0.2413  acc=0.6220


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=2.91e-01, acc. prob=0.936]


[outer 019] TRAIN (EMA+K-ens) ll=0.6701  br=0.2383  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=2.83e-01, acc. prob=0.947]


[outer 020] TRAIN (EMA+K-ens) ll=0.6728  br=0.2396  acc=0.6730


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=3.14e-01, acc. prob=0.920]


[outer 021] TRAIN (EMA+K-ens) ll=0.6682  br=0.2373  acc=0.6690


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=3.20e-01, acc. prob=0.931]


[outer 022] TRAIN (EMA+K-ens) ll=0.6737  br=0.2400  acc=0.6480


Sample: 100%|██████████| 330/330 [00:18, 17.94it/s, step size=2.52e-01, acc. prob=0.965]


[outer 023] TRAIN (EMA+K-ens) ll=0.6729  br=0.2395  acc=0.6510


Sample: 100%|██████████| 330/330 [00:17, 19.30it/s, step size=2.22e-01, acc. prob=0.967]


[outer 024] TRAIN (EMA+K-ens) ll=0.6790  br=0.2426  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.20e-01, acc. prob=0.964]


[outer 025] TRAIN (EMA+K-ens) ll=0.6825  br=0.2443  acc=0.6270


Sample: 100%|██████████| 330/330 [00:16, 19.98it/s, step size=3.00e-01, acc. prob=0.955]


[outer 026] TRAIN (EMA+K-ens) ll=0.6806  br=0.2434  acc=0.6250


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=3.20e-01, acc. prob=0.935]


[outer 027] TRAIN (EMA+K-ens) ll=0.6860  br=0.2458  acc=0.6190


Sample: 100%|██████████| 330/330 [00:17, 18.96it/s, step size=3.03e-01, acc. prob=0.922]


[outer 028] TRAIN (EMA+K-ens) ll=0.6865  br=0.2461  acc=0.6120


Sample: 100%|██████████| 330/330 [00:18, 17.44it/s, step size=2.78e-01, acc. prob=0.966]


[outer 029] TRAIN (EMA+K-ens) ll=0.6852  br=0.2454  acc=0.6300


Sample: 100%|██████████| 330/330 [00:15, 21.79it/s, step size=3.25e-01, acc. prob=0.908]


[outer 030] TRAIN (EMA+K-ens) ll=0.6862  br=0.2459  acc=0.6220


Sample: 100%|██████████| 330/330 [00:16, 20.55it/s, step size=2.84e-01, acc. prob=0.958]


[outer 031] TRAIN (EMA+K-ens) ll=0.6922  br=0.2488  acc=0.6430


Sample: 100%|██████████| 330/330 [00:15, 20.89it/s, step size=3.30e-01, acc. prob=0.962]


[outer 032] TRAIN (EMA+K-ens) ll=0.6901  br=0.2477  acc=0.6280


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=3.15e-01, acc. prob=0.935]


[outer 033] TRAIN (EMA+K-ens) ll=0.6848  br=0.2452  acc=0.6310


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.07e-01, acc. prob=0.937]


[outer 034] TRAIN (EMA+K-ens) ll=0.6839  br=0.2449  acc=0.6340


Sample: 100%|██████████| 330/330 [00:16, 20.38it/s, step size=3.08e-01, acc. prob=0.945]


[outer 035] TRAIN (EMA+K-ens) ll=0.6801  br=0.2431  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=3.07e-01, acc. prob=0.893]


[outer 036] TRAIN (EMA+K-ens) ll=0.6774  br=0.2418  acc=0.6340


Sample: 100%|██████████| 330/330 [00:18, 18.32it/s, step size=2.75e-01, acc. prob=0.938]


[outer 037] TRAIN (EMA+K-ens) ll=0.6861  br=0.2461  acc=0.6330


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.04e-01, acc. prob=0.951]


[outer 038] TRAIN (EMA+K-ens) ll=0.6862  br=0.2461  acc=0.6430


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.79e-01, acc. prob=0.917]


[outer 039] TRAIN (EMA+K-ens) ll=0.6866  br=0.2464  acc=0.6410
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}, {'accuracy': 0.6252399682998657, 'brier': 0.25350481271743774, 'logloss': 0.7039424777030945}, {'accuracy': 0.6339199542999268, 'brier': 0.25241318345069885, 'logloss': 0.6991651654243469}]


Sample: 100%|██████████| 330/330 [00:18, 18.24it/s, step size=3.04e-01, acc. prob=0.941]


[outer 000] TRAIN (EMA+K-ens) ll=0.6714  br=0.2391  acc=0.6250


Sample: 100%|██████████| 330/330 [00:17, 18.45it/s, step size=2.86e-01, acc. prob=0.950]


[outer 001] TRAIN (EMA+K-ens) ll=0.6747  br=0.2408  acc=0.5910


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=3.00e-01, acc. prob=0.950]


[outer 002] TRAIN (EMA+K-ens) ll=0.6761  br=0.2414  acc=0.5940


Sample: 100%|██████████| 330/330 [00:15, 20.72it/s, step size=3.12e-01, acc. prob=0.902]


[outer 003] TRAIN (EMA+K-ens) ll=0.6802  br=0.2432  acc=0.6040


Sample: 100%|██████████| 330/330 [00:17, 19.22it/s, step size=3.08e-01, acc. prob=0.908]


[outer 004] TRAIN (EMA+K-ens) ll=0.6828  br=0.2444  acc=0.5870


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=2.17e-01, acc. prob=0.971]


[outer 005] TRAIN (EMA+K-ens) ll=0.6813  br=0.2437  acc=0.6190


Sample: 100%|██████████| 330/330 [00:18, 17.85it/s, step size=2.86e-01, acc. prob=0.946]


[outer 006] TRAIN (EMA+K-ens) ll=0.6868  br=0.2464  acc=0.5880


Sample: 100%|██████████| 330/330 [00:15, 20.69it/s, step size=2.95e-01, acc. prob=0.934]


[outer 007] TRAIN (EMA+K-ens) ll=0.6830  br=0.2446  acc=0.6220


Sample: 100%|██████████| 330/330 [00:15, 20.74it/s, step size=3.29e-01, acc. prob=0.915]


[outer 008] TRAIN (EMA+K-ens) ll=0.6841  br=0.2451  acc=0.6050


Sample: 100%|██████████| 330/330 [00:18, 17.98it/s, step size=3.12e-01, acc. prob=0.939]


[outer 009] TRAIN (EMA+K-ens) ll=0.6848  br=0.2455  acc=0.6150


Sample: 100%|██████████| 330/330 [00:15, 20.66it/s, step size=3.69e-01, acc. prob=0.917]


[outer 010] TRAIN (EMA+K-ens) ll=0.6834  br=0.2448  acc=0.6140


Sample: 100%|██████████| 330/330 [00:17, 18.47it/s, step size=2.99e-01, acc. prob=0.932]


[outer 011] TRAIN (EMA+K-ens) ll=0.6796  br=0.2431  acc=0.6200


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.22e-01, acc. prob=0.934]


[outer 012] TRAIN (EMA+K-ens) ll=0.6791  br=0.2428  acc=0.6410


Sample: 100%|██████████| 330/330 [00:15, 20.86it/s, step size=3.10e-01, acc. prob=0.929]


[outer 013] TRAIN (EMA+K-ens) ll=0.6840  br=0.2452  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.97e-01, acc. prob=0.938]


[outer 014] TRAIN (EMA+K-ens) ll=0.6751  br=0.2408  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=3.06e-01, acc. prob=0.931]


[outer 015] TRAIN (EMA+K-ens) ll=0.6802  br=0.2432  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.50it/s, step size=3.25e-01, acc. prob=0.911]


[outer 016] TRAIN (EMA+K-ens) ll=0.6821  br=0.2442  acc=0.6760


Sample: 100%|██████████| 330/330 [00:16, 19.96it/s, step size=2.97e-01, acc. prob=0.933]


[outer 017] TRAIN (EMA+K-ens) ll=0.6830  br=0.2447  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.16it/s, step size=2.79e-01, acc. prob=0.943]


[outer 018] TRAIN (EMA+K-ens) ll=0.6826  br=0.2444  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 19.92it/s, step size=3.32e-01, acc. prob=0.949]


[outer 019] TRAIN (EMA+K-ens) ll=0.6860  br=0.2461  acc=0.6370


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=2.85e-01, acc. prob=0.954]


[outer 020] TRAIN (EMA+K-ens) ll=0.6914  br=0.2487  acc=0.6070


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=2.90e-01, acc. prob=0.949]


[outer 021] TRAIN (EMA+K-ens) ll=0.6891  br=0.2475  acc=0.6150


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=3.56e-01, acc. prob=0.915]


[outer 022] TRAIN (EMA+K-ens) ll=0.6928  br=0.2492  acc=0.6130


Sample: 100%|██████████| 330/330 [00:15, 21.13it/s, step size=3.05e-01, acc. prob=0.933]


[outer 023] TRAIN (EMA+K-ens) ll=0.6919  br=0.2487  acc=0.6060


Sample: 100%|██████████| 330/330 [00:17, 18.72it/s, step size=3.18e-01, acc. prob=0.918]


[outer 024] TRAIN (EMA+K-ens) ll=0.6896  br=0.2476  acc=0.6100


Sample: 100%|██████████| 330/330 [00:18, 18.25it/s, step size=2.98e-01, acc. prob=0.932]


[outer 025] TRAIN (EMA+K-ens) ll=0.6925  br=0.2489  acc=0.6180


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=2.95e-01, acc. prob=0.946]


[outer 026] TRAIN (EMA+K-ens) ll=0.6937  br=0.2497  acc=0.6240


Sample: 100%|██████████| 330/330 [00:16, 19.88it/s, step size=3.53e-01, acc. prob=0.896]


[outer 027] TRAIN (EMA+K-ens) ll=0.6934  br=0.2494  acc=0.5960


Sample: 100%|██████████| 330/330 [00:17, 19.15it/s, step size=2.71e-01, acc. prob=0.942]


[outer 028] TRAIN (EMA+K-ens) ll=0.6871  br=0.2465  acc=0.6240


Sample: 100%|██████████| 330/330 [00:15, 20.80it/s, step size=3.10e-01, acc. prob=0.903]


[outer 029] TRAIN (EMA+K-ens) ll=0.6875  br=0.2466  acc=0.6270


Sample: 100%|██████████| 330/330 [00:15, 20.68it/s, step size=2.91e-01, acc. prob=0.951]


[outer 030] TRAIN (EMA+K-ens) ll=0.6829  br=0.2444  acc=0.6270


Sample: 100%|██████████| 330/330 [00:16, 20.58it/s, step size=3.42e-01, acc. prob=0.910]


[outer 031] TRAIN (EMA+K-ens) ll=0.6823  br=0.2443  acc=0.6380
[Early stop @ outer 31] Δll=0.081%, Δbr=0.124%, Δacc=0.000
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}, {'accuracy': 0.6252399682998657, 'brier': 0.25350481271743774, 'logloss': 0.7039424777030945}, {'accuracy': 0.6339199542999268, 'brier': 0.25241318345069885, 'logloss': 0.6991651654243469}, {'accuracy': 0.6839999556541443, 'brier': 0.23712489008903503, 'logloss': 0.6677666902542114}]


Sample: 100%|██████████| 330/330 [00:17, 19.14it/s, step size=3.16e-01, acc. prob=0.899]


[outer 000] TRAIN (EMA+K-ens) ll=0.6760  br=0.2404  acc=0.5780


Sample: 100%|██████████| 330/330 [00:15, 20.74it/s, step size=3.72e-01, acc. prob=0.911]


[outer 001] TRAIN (EMA+K-ens) ll=0.6632  br=0.2351  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 18.57it/s, step size=2.68e-01, acc. prob=0.946]


[outer 002] TRAIN (EMA+K-ens) ll=0.6752  br=0.2409  acc=0.6660


Sample: 100%|██████████| 330/330 [00:18, 17.86it/s, step size=3.15e-01, acc. prob=0.923]


[outer 003] TRAIN (EMA+K-ens) ll=0.6760  br=0.2413  acc=0.6340


Sample: 100%|██████████| 330/330 [00:17, 19.07it/s, step size=2.98e-01, acc. prob=0.875]


[outer 004] TRAIN (EMA+K-ens) ll=0.6737  br=0.2401  acc=0.6300


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=2.56e-01, acc. prob=0.958]


[outer 005] TRAIN (EMA+K-ens) ll=0.6735  br=0.2401  acc=0.6310


Sample: 100%|██████████| 330/330 [00:15, 20.84it/s, step size=3.14e-01, acc. prob=0.942]


[outer 006] TRAIN (EMA+K-ens) ll=0.6762  br=0.2414  acc=0.6500


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.47e-01, acc. prob=0.887]


[outer 007] TRAIN (EMA+K-ens) ll=0.6733  br=0.2399  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 19.17it/s, step size=2.51e-01, acc. prob=0.963]


[outer 008] TRAIN (EMA+K-ens) ll=0.6759  br=0.2411  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 18.36it/s, step size=2.83e-01, acc. prob=0.930]


[outer 009] TRAIN (EMA+K-ens) ll=0.6850  br=0.2455  acc=0.6410


Sample: 100%|██████████| 330/330 [00:15, 20.76it/s, step size=3.10e-01, acc. prob=0.946]


[outer 010] TRAIN (EMA+K-ens) ll=0.6804  br=0.2433  acc=0.6380


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=2.80e-01, acc. prob=0.907]


[outer 011] TRAIN (EMA+K-ens) ll=0.6805  br=0.2432  acc=0.6430


Sample: 100%|██████████| 330/330 [00:17, 18.72it/s, step size=2.62e-01, acc. prob=0.929]


[outer 012] TRAIN (EMA+K-ens) ll=0.6871  br=0.2465  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=2.74e-01, acc. prob=0.941]


[outer 013] TRAIN (EMA+K-ens) ll=0.6871  br=0.2463  acc=0.6410


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=3.16e-01, acc. prob=0.924]


[outer 014] TRAIN (EMA+K-ens) ll=0.6849  br=0.2453  acc=0.6470


Sample: 100%|██████████| 330/330 [00:18, 17.87it/s, step size=2.72e-01, acc. prob=0.961]


[outer 015] TRAIN (EMA+K-ens) ll=0.6865  br=0.2461  acc=0.6330


Sample: 100%|██████████| 330/330 [00:19, 16.64it/s, step size=2.92e-01, acc. prob=0.941]


[outer 016] TRAIN (EMA+K-ens) ll=0.6837  br=0.2448  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 18.95it/s, step size=2.60e-01, acc. prob=0.931]


[outer 017] TRAIN (EMA+K-ens) ll=0.6817  br=0.2437  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 18.85it/s, step size=3.14e-01, acc. prob=0.901]


[outer 018] TRAIN (EMA+K-ens) ll=0.6850  br=0.2452  acc=0.6260


Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=2.54e-01, acc. prob=0.944]


[outer 019] TRAIN (EMA+K-ens) ll=0.6878  br=0.2467  acc=0.6040


Sample: 100%|██████████| 330/330 [00:17, 18.40it/s, step size=2.50e-01, acc. prob=0.948]


[outer 020] TRAIN (EMA+K-ens) ll=0.6812  br=0.2436  acc=0.6200


Sample: 100%|██████████| 330/330 [00:18, 18.29it/s, step size=2.48e-01, acc. prob=0.964]


[outer 021] TRAIN (EMA+K-ens) ll=0.6817  br=0.2440  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 18.62it/s, step size=2.82e-01, acc. prob=0.931]


[outer 022] TRAIN (EMA+K-ens) ll=0.6871  br=0.2466  acc=0.6290


Sample: 100%|██████████| 330/330 [00:17, 18.93it/s, step size=2.67e-01, acc. prob=0.945]


[outer 023] TRAIN (EMA+K-ens) ll=0.6908  br=0.2484  acc=0.6460


Sample: 100%|██████████| 330/330 [00:18, 17.92it/s, step size=2.69e-01, acc. prob=0.951]


[outer 024] TRAIN (EMA+K-ens) ll=0.6895  br=0.2478  acc=0.6440


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.97e-01, acc. prob=0.925]


[outer 025] TRAIN (EMA+K-ens) ll=0.6880  br=0.2471  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.82e-01, acc. prob=0.930]


[outer 026] TRAIN (EMA+K-ens) ll=0.6894  br=0.2478  acc=0.6350


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=3.04e-01, acc. prob=0.926]


[outer 027] TRAIN (EMA+K-ens) ll=0.6898  br=0.2480  acc=0.6390


Sample: 100%|██████████| 330/330 [00:17, 19.31it/s, step size=3.01e-01, acc. prob=0.949]


[outer 028] TRAIN (EMA+K-ens) ll=0.6933  br=0.2497  acc=0.6510


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.10e-01, acc. prob=0.934]


[outer 029] TRAIN (EMA+K-ens) ll=0.6930  br=0.2495  acc=0.6400


Sample: 100%|██████████| 330/330 [00:15, 20.83it/s, step size=2.75e-01, acc. prob=0.949]


[outer 030] TRAIN (EMA+K-ens) ll=0.6915  br=0.2488  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=2.84e-01, acc. prob=0.942]


[outer 031] TRAIN (EMA+K-ens) ll=0.6922  br=0.2491  acc=0.6540


Sample: 100%|██████████| 330/330 [00:18, 17.75it/s, step size=2.48e-01, acc. prob=0.930]


[outer 032] TRAIN (EMA+K-ens) ll=0.6954  br=0.2507  acc=0.6450


Sample: 100%|██████████| 330/330 [00:19, 17.20it/s, step size=2.90e-01, acc. prob=0.942]


[outer 033] TRAIN (EMA+K-ens) ll=0.6964  br=0.2511  acc=0.6430


Sample: 100%|██████████| 330/330 [00:18, 18.12it/s, step size=2.57e-01, acc. prob=0.947]


[outer 034] TRAIN (EMA+K-ens) ll=0.6964  br=0.2510  acc=0.6510


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=3.25e-01, acc. prob=0.921]


[outer 035] TRAIN (EMA+K-ens) ll=0.6941  br=0.2501  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=2.98e-01, acc. prob=0.951]


[outer 036] TRAIN (EMA+K-ens) ll=0.6936  br=0.2497  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=3.00e-01, acc. prob=0.947]


[outer 037] TRAIN (EMA+K-ens) ll=0.6915  br=0.2487  acc=0.6570


Sample: 100%|██████████| 330/330 [00:17, 18.93it/s, step size=2.82e-01, acc. prob=0.940]


[outer 038] TRAIN (EMA+K-ens) ll=0.6889  br=0.2475  acc=0.6490


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.60e-01, acc. prob=0.949]


[outer 039] TRAIN (EMA+K-ens) ll=0.6865  br=0.2463  acc=0.6240
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}, {'accuracy': 0.6252399682998657, 'brier': 0.25350481271743774, 'logloss': 0.7039424777030945}, {'accuracy': 0.6339199542999268, 'brier': 0.25241318345069885, 'logloss': 0.6991651654243469}, {'accuracy': 0.6839999556541443, 'brier': 0.23712489008903503, 'logloss': 0.6677666902542114}, {'accuracy': 0.6479200124740601, 'brier': 0.2339715212583542, 'logloss': 0.6620026230812073}]


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=3.07e-01, acc. prob=0.932]


[outer 000] TRAIN (EMA+K-ens) ll=0.6822  br=0.2445  acc=0.6270


Sample: 100%|██████████| 330/330 [00:18, 17.78it/s, step size=2.40e-01, acc. prob=0.952]


[outer 001] TRAIN (EMA+K-ens) ll=0.6790  br=0.2427  acc=0.5810


Sample: 100%|██████████| 330/330 [00:17, 19.25it/s, step size=2.87e-01, acc. prob=0.952]


[outer 002] TRAIN (EMA+K-ens) ll=0.6760  br=0.2412  acc=0.6360


Sample: 100%|██████████| 330/330 [00:17, 18.35it/s, step size=2.43e-01, acc. prob=0.944]


[outer 003] TRAIN (EMA+K-ens) ll=0.6735  br=0.2398  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 18.31it/s, step size=3.34e-01, acc. prob=0.922]


[outer 004] TRAIN (EMA+K-ens) ll=0.6726  br=0.2394  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 19.51it/s, step size=2.78e-01, acc. prob=0.932]


[outer 005] TRAIN (EMA+K-ens) ll=0.6768  br=0.2415  acc=0.6580


Sample: 100%|██████████| 330/330 [00:18, 18.15it/s, step size=2.72e-01, acc. prob=0.949]


[outer 006] TRAIN (EMA+K-ens) ll=0.6755  br=0.2409  acc=0.6560


Sample: 100%|██████████| 330/330 [00:18, 18.10it/s, step size=2.64e-01, acc. prob=0.937]


[outer 007] TRAIN (EMA+K-ens) ll=0.6769  br=0.2415  acc=0.6550


Sample: 100%|██████████| 330/330 [00:16, 19.99it/s, step size=3.09e-01, acc. prob=0.919]


[outer 008] TRAIN (EMA+K-ens) ll=0.6816  br=0.2436  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 18.51it/s, step size=3.23e-01, acc. prob=0.934]


[outer 009] TRAIN (EMA+K-ens) ll=0.6787  br=0.2422  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=3.16e-01, acc. prob=0.942]


[outer 010] TRAIN (EMA+K-ens) ll=0.6780  br=0.2418  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 19.96it/s, step size=2.95e-01, acc. prob=0.945]


[outer 011] TRAIN (EMA+K-ens) ll=0.6738  br=0.2399  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=3.13e-01, acc. prob=0.932]


[outer 012] TRAIN (EMA+K-ens) ll=0.6681  br=0.2369  acc=0.6820


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.97e-01, acc. prob=0.932]


[outer 013] TRAIN (EMA+K-ens) ll=0.6642  br=0.2350  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 20.07it/s, step size=3.26e-01, acc. prob=0.929]


[outer 014] TRAIN (EMA+K-ens) ll=0.6675  br=0.2366  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 18.35it/s, step size=3.04e-01, acc. prob=0.933]


[outer 015] TRAIN (EMA+K-ens) ll=0.6653  br=0.2356  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 20.60it/s, step size=3.05e-01, acc. prob=0.941]


[outer 016] TRAIN (EMA+K-ens) ll=0.6605  br=0.2333  acc=0.6810


Sample: 100%|██████████| 330/330 [00:17, 18.77it/s, step size=3.04e-01, acc. prob=0.943]


[outer 017] TRAIN (EMA+K-ens) ll=0.6641  br=0.2350  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.93e-01, acc. prob=0.938]


[outer 018] TRAIN (EMA+K-ens) ll=0.6661  br=0.2360  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=2.63e-01, acc. prob=0.962]


[outer 019] TRAIN (EMA+K-ens) ll=0.6804  br=0.2429  acc=0.6430


Sample: 100%|██████████| 330/330 [00:18, 18.25it/s, step size=2.70e-01, acc. prob=0.961]


[outer 020] TRAIN (EMA+K-ens) ll=0.6826  br=0.2442  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=2.83e-01, acc. prob=0.956]


[outer 021] TRAIN (EMA+K-ens) ll=0.6858  br=0.2457  acc=0.6350


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=2.55e-01, acc. prob=0.962]


[outer 022] TRAIN (EMA+K-ens) ll=0.6758  br=0.2407  acc=0.6620


Sample: 100%|██████████| 330/330 [00:17, 19.27it/s, step size=3.18e-01, acc. prob=0.940]


[outer 023] TRAIN (EMA+K-ens) ll=0.6799  br=0.2426  acc=0.6480


Sample: 100%|██████████| 330/330 [00:18, 18.30it/s, step size=2.72e-01, acc. prob=0.952]


[outer 024] TRAIN (EMA+K-ens) ll=0.6831  br=0.2442  acc=0.6220


Sample: 100%|██████████| 330/330 [00:15, 20.73it/s, step size=3.79e-01, acc. prob=0.902]


[outer 025] TRAIN (EMA+K-ens) ll=0.6819  br=0.2437  acc=0.6400


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=2.75e-01, acc. prob=0.948]


[outer 026] TRAIN (EMA+K-ens) ll=0.6808  br=0.2434  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 19.96it/s, step size=2.86e-01, acc. prob=0.918]


[outer 027] TRAIN (EMA+K-ens) ll=0.6755  br=0.2409  acc=0.6510


Sample: 100%|██████████| 330/330 [00:18, 17.78it/s, step size=2.72e-01, acc. prob=0.936]


[outer 028] TRAIN (EMA+K-ens) ll=0.6819  br=0.2440  acc=0.6460


Sample: 100%|██████████| 330/330 [00:15, 21.48it/s, step size=3.27e-01, acc. prob=0.915]


[outer 029] TRAIN (EMA+K-ens) ll=0.6795  br=0.2429  acc=0.6610


Sample: 100%|██████████| 330/330 [00:17, 19.19it/s, step size=2.91e-01, acc. prob=0.953]


[outer 030] TRAIN (EMA+K-ens) ll=0.6877  br=0.2469  acc=0.6390


Sample: 100%|██████████| 330/330 [00:17, 18.56it/s, step size=2.75e-01, acc. prob=0.951]


[outer 031] TRAIN (EMA+K-ens) ll=0.6898  br=0.2480  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=2.91e-01, acc. prob=0.924]


[outer 032] TRAIN (EMA+K-ens) ll=0.6925  br=0.2493  acc=0.6120


Sample: 100%|██████████| 330/330 [00:16, 20.54it/s, step size=3.05e-01, acc. prob=0.947]


[outer 033] TRAIN (EMA+K-ens) ll=0.6925  br=0.2493  acc=0.6290


Sample: 100%|██████████| 330/330 [00:17, 19.07it/s, step size=2.91e-01, acc. prob=0.956]


[outer 034] TRAIN (EMA+K-ens) ll=0.6937  br=0.2496  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 18.46it/s, step size=3.02e-01, acc. prob=0.946]


[outer 035] TRAIN (EMA+K-ens) ll=0.6877  br=0.2467  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 18.83it/s, step size=2.99e-01, acc. prob=0.952]


[outer 036] TRAIN (EMA+K-ens) ll=0.6899  br=0.2477  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.53it/s, step size=2.65e-01, acc. prob=0.939]


[outer 037] TRAIN (EMA+K-ens) ll=0.6947  br=0.2499  acc=0.6270


Sample: 100%|██████████| 330/330 [00:18, 17.92it/s, step size=3.17e-01, acc. prob=0.915]


[outer 038] TRAIN (EMA+K-ens) ll=0.6931  br=0.2491  acc=0.6340


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=2.34e-01, acc. prob=0.946]


[outer 039] TRAIN (EMA+K-ens) ll=0.6901  br=0.2477  acc=0.6390
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}, {'accuracy': 0.6252399682998657, 'brier': 0.25350481271743774, 'logloss': 0.7039424777030945}, {'accuracy': 0.6339199542999268, 'brier': 0.25241318345069885, 'logloss': 0.6991651654243469}, {'accuracy': 0.6839999556541443, 'brier': 0.23712489008903503, 'logloss': 0.6677666902542114}, {'accuracy': 0.6479200124740601, 'brier': 0.2339715212583542, 'logloss': 0.6620026230812073}, {'accuracy': 0.536579966545105, 'brier': 0.2654944360256195, 'logloss': 0.7292959094047546}]


Sample: 100%|██████████| 330/330 [00:17, 18.89it/s, step size=2.99e-01, acc. prob=0.918]


[outer 000] TRAIN (EMA+K-ens) ll=0.6778  br=0.2420  acc=0.5900


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=3.19e-01, acc. prob=0.944]


[outer 001] TRAIN (EMA+K-ens) ll=0.6946  br=0.2500  acc=0.5710


Sample: 100%|██████████| 330/330 [00:17, 18.69it/s, step size=2.97e-01, acc. prob=0.910]


[outer 002] TRAIN (EMA+K-ens) ll=0.6731  br=0.2395  acc=0.6590


Sample: 100%|██████████| 330/330 [00:17, 19.03it/s, step size=3.23e-01, acc. prob=0.938]


[outer 003] TRAIN (EMA+K-ens) ll=0.6682  br=0.2370  acc=0.6640


Sample: 100%|██████████| 330/330 [00:17, 18.88it/s, step size=3.82e-01, acc. prob=0.889]


[outer 004] TRAIN (EMA+K-ens) ll=0.6703  br=0.2382  acc=0.6610


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=3.08e-01, acc. prob=0.942]


[outer 005] TRAIN (EMA+K-ens) ll=0.6783  br=0.2420  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=2.91e-01, acc. prob=0.964]


[outer 006] TRAIN (EMA+K-ens) ll=0.6773  br=0.2416  acc=0.6540


Sample: 100%|██████████| 330/330 [00:19, 17.29it/s, step size=2.68e-01, acc. prob=0.946]


[outer 007] TRAIN (EMA+K-ens) ll=0.6849  br=0.2453  acc=0.6350


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=3.15e-01, acc. prob=0.953]


[outer 008] TRAIN (EMA+K-ens) ll=0.6862  br=0.2459  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=2.93e-01, acc. prob=0.929]


[outer 009] TRAIN (EMA+K-ens) ll=0.6890  br=0.2473  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.36it/s, step size=3.10e-01, acc. prob=0.921]


[outer 010] TRAIN (EMA+K-ens) ll=0.6946  br=0.2501  acc=0.6360


Sample: 100%|██████████| 330/330 [00:17, 18.59it/s, step size=3.04e-01, acc. prob=0.951]


[outer 011] TRAIN (EMA+K-ens) ll=0.6991  br=0.2524  acc=0.6110


Sample: 100%|██████████| 330/330 [00:17, 18.92it/s, step size=2.88e-01, acc. prob=0.944]


[outer 012] TRAIN (EMA+K-ens) ll=0.7023  br=0.2539  acc=0.6090


Sample: 100%|██████████| 330/330 [00:16, 20.34it/s, step size=3.15e-01, acc. prob=0.918]


[outer 013] TRAIN (EMA+K-ens) ll=0.6977  br=0.2517  acc=0.6070


Sample: 100%|██████████| 330/330 [00:16, 20.01it/s, step size=2.91e-01, acc. prob=0.929]


[outer 014] TRAIN (EMA+K-ens) ll=0.7004  br=0.2529  acc=0.5780


Sample: 100%|██████████| 330/330 [00:17, 19.03it/s, step size=2.72e-01, acc. prob=0.952]


[outer 015] TRAIN (EMA+K-ens) ll=0.6861  br=0.2458  acc=0.5920


Sample: 100%|██████████| 330/330 [00:16, 19.64it/s, step size=2.60e-01, acc. prob=0.957]


[outer 016] TRAIN (EMA+K-ens) ll=0.6809  br=0.2433  acc=0.6230


Sample: 100%|██████████| 330/330 [00:15, 21.33it/s, step size=3.24e-01, acc. prob=0.930]


[outer 017] TRAIN (EMA+K-ens) ll=0.6727  br=0.2393  acc=0.6560


Sample: 100%|██████████| 330/330 [00:15, 20.99it/s, step size=2.69e-01, acc. prob=0.942]


[outer 018] TRAIN (EMA+K-ens) ll=0.6740  br=0.2399  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 19.50it/s, step size=2.81e-01, acc. prob=0.953]


[outer 019] TRAIN (EMA+K-ens) ll=0.6739  br=0.2398  acc=0.6380


Sample: 100%|██████████| 330/330 [00:16, 19.82it/s, step size=3.10e-01, acc. prob=0.938]


[outer 020] TRAIN (EMA+K-ens) ll=0.6693  br=0.2376  acc=0.6470


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=3.29e-01, acc. prob=0.930]


[outer 021] TRAIN (EMA+K-ens) ll=0.6676  br=0.2368  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=2.56e-01, acc. prob=0.953]


[outer 022] TRAIN (EMA+K-ens) ll=0.6645  br=0.2354  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 19.90it/s, step size=3.51e-01, acc. prob=0.927]


[outer 023] TRAIN (EMA+K-ens) ll=0.6678  br=0.2370  acc=0.6510


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=2.86e-01, acc. prob=0.951]


[outer 024] TRAIN (EMA+K-ens) ll=0.6742  br=0.2400  acc=0.6520


Sample: 100%|██████████| 330/330 [00:16, 20.56it/s, step size=3.39e-01, acc. prob=0.938]


[outer 025] TRAIN (EMA+K-ens) ll=0.6749  br=0.2405  acc=0.6420


Sample: 100%|██████████| 330/330 [00:18, 18.27it/s, step size=2.61e-01, acc. prob=0.949]


[outer 026] TRAIN (EMA+K-ens) ll=0.6723  br=0.2393  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 18.49it/s, step size=3.14e-01, acc. prob=0.911]


[outer 027] TRAIN (EMA+K-ens) ll=0.6690  br=0.2375  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=2.92e-01, acc. prob=0.952]


[outer 028] TRAIN (EMA+K-ens) ll=0.6667  br=0.2364  acc=0.6770


Sample: 100%|██████████| 330/330 [00:20, 16.35it/s, step size=3.44e-01, acc. prob=0.881]


[outer 029] TRAIN (EMA+K-ens) ll=0.6676  br=0.2369  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 19.00it/s, step size=2.74e-01, acc. prob=0.960]


[outer 030] TRAIN (EMA+K-ens) ll=0.6687  br=0.2375  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 18.63it/s, step size=2.58e-01, acc. prob=0.955]


[outer 031] TRAIN (EMA+K-ens) ll=0.6744  br=0.2403  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=2.83e-01, acc. prob=0.952]


[outer 032] TRAIN (EMA+K-ens) ll=0.6693  br=0.2379  acc=0.6740


Sample: 100%|██████████| 330/330 [00:18, 17.75it/s, step size=2.83e-01, acc. prob=0.953]


[outer 033] TRAIN (EMA+K-ens) ll=0.6721  br=0.2391  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 19.88it/s, step size=2.89e-01, acc. prob=0.952]


[outer 034] TRAIN (EMA+K-ens) ll=0.6729  br=0.2396  acc=0.6720


Sample: 100%|██████████| 330/330 [00:16, 20.07it/s, step size=2.93e-01, acc. prob=0.945]


[outer 035] TRAIN (EMA+K-ens) ll=0.6789  br=0.2425  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 20.16it/s, step size=3.68e-01, acc. prob=0.908]


[outer 036] TRAIN (EMA+K-ens) ll=0.6813  br=0.2437  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 20.33it/s, step size=3.22e-01, acc. prob=0.959]


[outer 037] TRAIN (EMA+K-ens) ll=0.6831  br=0.2445  acc=0.6410


Sample: 100%|██████████| 330/330 [00:18, 18.07it/s, step size=2.71e-01, acc. prob=0.946]


[outer 038] TRAIN (EMA+K-ens) ll=0.6863  br=0.2460  acc=0.6340


Sample: 100%|██████████| 330/330 [00:16, 19.74it/s, step size=2.90e-01, acc. prob=0.936]


[outer 039] TRAIN (EMA+K-ens) ll=0.6860  br=0.2457  acc=0.6500
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}, {'accuracy': 0.6252399682998657, 'brier': 0.25350481271743774, 'logloss': 0.7039424777030945}, {'accuracy': 0.6339199542999268, 'brier': 0.25241318345069885, 'logloss': 0.6991651654243469}, {'accuracy': 0.6839999556541443, 'brier': 0.23712489008903503, 'logloss': 0.6677666902542114}, {'accuracy': 0.6479200124740601, 'brier': 0.2339715212583542, 'logloss': 0.6620026230812073}, {'accuracy': 0.536579966545105, 'brier': 0.2654944360256195, 'logloss': 0.7292959094047546}, {'accuracy': 0.6070799827575684, 'brier': 0.24404005706310272, 'logloss': 0.6826115250587463}]


Sample: 100%|██████████| 330/330 [00:16, 20.21it/s, step size=2.92e-01, acc. prob=0.957]


[outer 000] TRAIN (EMA+K-ens) ll=0.6765  br=0.2419  acc=0.5050


Sample: 100%|██████████| 330/330 [00:17, 19.04it/s, step size=2.80e-01, acc. prob=0.941]


[outer 001] TRAIN (EMA+K-ens) ll=0.6724  br=0.2396  acc=0.6330


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.89e-01, acc. prob=0.921]


[outer 002] TRAIN (EMA+K-ens) ll=0.6717  br=0.2392  acc=0.6330


Sample: 100%|██████████| 330/330 [00:18, 17.87it/s, step size=2.60e-01, acc. prob=0.955]


[outer 003] TRAIN (EMA+K-ens) ll=0.6669  br=0.2368  acc=0.6940


Sample: 100%|██████████| 330/330 [00:16, 20.54it/s, step size=3.61e-01, acc. prob=0.921]


[outer 004] TRAIN (EMA+K-ens) ll=0.6687  br=0.2376  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=2.71e-01, acc. prob=0.954]


[outer 005] TRAIN (EMA+K-ens) ll=0.6683  br=0.2373  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=2.79e-01, acc. prob=0.937]


[outer 006] TRAIN (EMA+K-ens) ll=0.6672  br=0.2368  acc=0.6940


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=3.37e-01, acc. prob=0.921]


[outer 007] TRAIN (EMA+K-ens) ll=0.6662  br=0.2362  acc=0.6980


Sample: 100%|██████████| 330/330 [00:17, 19.29it/s, step size=2.77e-01, acc. prob=0.954]


[outer 008] TRAIN (EMA+K-ens) ll=0.6757  br=0.2406  acc=0.6730


Sample: 100%|██████████| 330/330 [00:18, 17.61it/s, step size=2.87e-01, acc. prob=0.959]


[outer 009] TRAIN (EMA+K-ens) ll=0.6785  br=0.2416  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 18.49it/s, step size=3.11e-01, acc. prob=0.917]


[outer 010] TRAIN (EMA+K-ens) ll=0.6778  br=0.2414  acc=0.6670


Sample: 100%|██████████| 330/330 [00:18, 17.61it/s, step size=2.65e-01, acc. prob=0.964]


[outer 011] TRAIN (EMA+K-ens) ll=0.6784  br=0.2419  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 19.41it/s, step size=3.11e-01, acc. prob=0.951]


[outer 012] TRAIN (EMA+K-ens) ll=0.6790  br=0.2424  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.96it/s, step size=2.96e-01, acc. prob=0.923]


[outer 013] TRAIN (EMA+K-ens) ll=0.6794  br=0.2426  acc=0.6730


Sample: 100%|██████████| 330/330 [00:17, 18.39it/s, step size=3.38e-01, acc. prob=0.906]


[outer 014] TRAIN (EMA+K-ens) ll=0.6811  br=0.2435  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 20.02it/s, step size=3.78e-01, acc. prob=0.911]


[outer 015] TRAIN (EMA+K-ens) ll=0.6821  br=0.2441  acc=0.6490


Sample: 100%|██████████| 330/330 [00:16, 20.36it/s, step size=2.68e-01, acc. prob=0.957]


[outer 016] TRAIN (EMA+K-ens) ll=0.6725  br=0.2394  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=2.56e-01, acc. prob=0.949]


[outer 017] TRAIN (EMA+K-ens) ll=0.6709  br=0.2386  acc=0.6860


Sample: 100%|██████████| 330/330 [00:17, 19.20it/s, step size=2.85e-01, acc. prob=0.920]


[outer 018] TRAIN (EMA+K-ens) ll=0.6730  br=0.2395  acc=0.6920


Sample: 100%|██████████| 330/330 [00:16, 19.73it/s, step size=3.18e-01, acc. prob=0.934]


[outer 019] TRAIN (EMA+K-ens) ll=0.6713  br=0.2388  acc=0.7040


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=2.43e-01, acc. prob=0.983]


[outer 020] TRAIN (EMA+K-ens) ll=0.6683  br=0.2373  acc=0.6950


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=2.89e-01, acc. prob=0.937]


[outer 021] TRAIN (EMA+K-ens) ll=0.6667  br=0.2366  acc=0.6880


Sample: 100%|██████████| 330/330 [00:18, 17.46it/s, step size=3.26e-01, acc. prob=0.921]


[outer 022] TRAIN (EMA+K-ens) ll=0.6707  br=0.2385  acc=0.6630


Sample: 100%|██████████| 330/330 [00:18, 18.33it/s, step size=2.87e-01, acc. prob=0.938]


[outer 023] TRAIN (EMA+K-ens) ll=0.6704  br=0.2382  acc=0.6880


Sample: 100%|██████████| 330/330 [00:16, 19.89it/s, step size=2.81e-01, acc. prob=0.964]


[outer 024] TRAIN (EMA+K-ens) ll=0.6686  br=0.2373  acc=0.7010


Sample: 100%|██████████| 330/330 [00:17, 19.15it/s, step size=2.68e-01, acc. prob=0.958]


[outer 025] TRAIN (EMA+K-ens) ll=0.6654  br=0.2359  acc=0.7030


Sample: 100%|██████████| 330/330 [00:16, 19.88it/s, step size=3.11e-01, acc. prob=0.892]


[outer 026] TRAIN (EMA+K-ens) ll=0.6694  br=0.2378  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.60it/s, step size=2.48e-01, acc. prob=0.966]


[outer 027] TRAIN (EMA+K-ens) ll=0.6701  br=0.2379  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=2.96e-01, acc. prob=0.943]


[outer 028] TRAIN (EMA+K-ens) ll=0.6707  br=0.2383  acc=0.6710


Sample: 100%|██████████| 330/330 [00:15, 21.75it/s, step size=3.64e-01, acc. prob=0.940]


[outer 029] TRAIN (EMA+K-ens) ll=0.6761  br=0.2408  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=3.14e-01, acc. prob=0.924]


[outer 030] TRAIN (EMA+K-ens) ll=0.6734  br=0.2395  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.80it/s, step size=3.47e-01, acc. prob=0.921]


[outer 031] TRAIN (EMA+K-ens) ll=0.6749  br=0.2404  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=2.83e-01, acc. prob=0.936]


[outer 032] TRAIN (EMA+K-ens) ll=0.6793  br=0.2424  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 19.29it/s, step size=2.92e-01, acc. prob=0.942]


[outer 033] TRAIN (EMA+K-ens) ll=0.6834  br=0.2444  acc=0.6660


Sample: 100%|██████████| 330/330 [00:18, 17.85it/s, step size=3.10e-01, acc. prob=0.944]


[outer 034] TRAIN (EMA+K-ens) ll=0.6776  br=0.2416  acc=0.6710


Sample: 100%|██████████| 330/330 [00:15, 20.88it/s, step size=3.03e-01, acc. prob=0.930]


[outer 035] TRAIN (EMA+K-ens) ll=0.6802  br=0.2429  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.71it/s, step size=2.51e-01, acc. prob=0.956]


[outer 036] TRAIN (EMA+K-ens) ll=0.6802  br=0.2427  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=2.76e-01, acc. prob=0.961]


[outer 037] TRAIN (EMA+K-ens) ll=0.6710  br=0.2383  acc=0.6640


Sample: 100%|██████████| 330/330 [00:15, 20.86it/s, step size=3.32e-01, acc. prob=0.945]


[outer 038] TRAIN (EMA+K-ens) ll=0.6727  br=0.2391  acc=0.6670


Sample: 100%|██████████| 330/330 [00:18, 18.25it/s, step size=2.57e-01, acc. prob=0.929]


[outer 039] TRAIN (EMA+K-ens) ll=0.6695  br=0.2374  acc=0.6670
[{'accuracy': 0.5864599943161011, 'brier': 0.2420940101146698, 'logloss': 0.6789154410362244}, {'accuracy': 0.5573199987411499, 'brier': 0.254721075296402, 'logloss': 0.711504340171814}, {'accuracy': 0.6012600064277649, 'brier': 0.2446962594985962, 'logloss': 0.6832140684127808}, {'accuracy': 0.6252399682998657, 'brier': 0.25350481271743774, 'logloss': 0.7039424777030945}, {'accuracy': 0.6339199542999268, 'brier': 0.25241318345069885, 'logloss': 0.6991651654243469}, {'accuracy': 0.6839999556541443, 'brier': 0.23712489008903503, 'logloss': 0.6677666902542114}, {'accuracy': 0.6479200124740601, 'brier': 0.2339715212583542, 'logloss': 0.6620026230812073}, {'accuracy': 0.536579966545105, 'brier': 0.2654944360256195, 'logloss': 0.7292959094047546}, {'accuracy': 0.6070799827575684, 'brier': 0.24404005706310272, 'logloss': 0.6826115250587463}, {'accuracy': 0.6310200095176697, 'brier': 0.24115285277366638, 'logloss': 0.6791824102401
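
As a reading aid for the log above, and assuming `ll` and `br` denote log loss and Brier score (matching the `logloss` and `brier` keys in the printed result dictionaries), the sketch below computes the scores of a predictor that always outputs probability 0.5. Values near these references indicate probabilistic predictions that are close to uninformative. The variable names are illustrative only.

```python
import math

# A constant prediction of p = 0.5 gets the same score whether the label is 0 or 1.
p = 0.5
chance_logloss = -math.log(p)   # -log(0.5) = ln 2
chance_brier = (p - 1.0) ** 2   # (0.5 - 1)^2 = (0.5 - 0)^2

print(f"chance-level log loss = {chance_logloss:.4f}")
print(f"chance-level Brier    = {chance_brier:.4f}")
```

Most `ll` values in the log sit within about 0.03 of ln 2, and most `br` values within about 0.02 of 0.25.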

# Scenario 2: Misspecified Error Distribution

In [None]:
def generate_noise(noise_type, size):
    if noise_type == "normal":
        # Kept as-is: standard normal
        return np.random.normal(0, 1, size=size)

    elif noise_type == "t":
        # df raised slightly (to 2.2) for stability; scaled by 3 to emphasize the tails
        return np.random.standard_t(df=2.2, size=size) * 3.0

    elif noise_type == "cauchy":
        # The standard Cauchy is too peaked, so scale by 4 to make the tails extreme
        return np.random.standard_cauchy(size=size) * 4.0

    elif noise_type == "contaminated":
        # Outlier fraction raised from 10% to 20%
        # Outlier spread is larger too: standard deviation 10 (variance 100)
        base = np.random.normal(0, 1, size=size)
        outlier_idx = np.random.choice(size, size=int(0.2 * size), replace=False)
        base[outlier_idx] = np.random.normal(0, 10, size=len(outlier_idx))
        return base

    else:
        raise ValueError("Unsupported noise type")

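# Data generation function
# (uses the globals n_groups, group_labels, and group_effects, which must already be defined)
def simulate_dataset_noise(noise_type, n_per_group):
    rows = []
    noise_vector = generate_noise(noise_type, size=n_groups * n_per_group)
    noise_counter = 0  # index of the next unused noise draw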

    for j in range(n_groups):
        group_name = group_labels[j]
        group_boost = group_effects[j]

        for _ in range(n_per_group):
            # Features
            logins_last_week = np.random.poisson(lam=5)
            previous_purchases = np.random.poisson(lam=2)
            viewed_target_category = np.random.binomial(1, p=0.5)
            discount_received = np.random.binomial(1, p=0.5)

            # True latent variable (U)
            U = (
                -4.5
                + 0.1 * previous_purchases
                + 0.6 * viewed_target_category
                + 0.9 * discount_received
                + 3 * viewed_target_category * discount_received
                + 1.1 * group_boost * discount_received
                + 0.13 * previous_purchases**2
                + noise_vector[noise_counter]
            )
            noise_counter += 1

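            # Outcome: deterministic threshold on the latent utility (y = 1 if U > 0).
            # Disabled alternative: sample y from a probit probability.
            # p = norm.cdf(U)
            # y = np.random.binomial(1, p)
            y = 1 if U > 0 else 0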

            rows.append({
                "group_id": j,
                "group_label": group_name,
                "logins_last_week": logins_last_week,
                "previous_purchases": previous_purchases,
                "viewed_target_category": viewed_target_category,
                "discount_received": discount_received,
                "y": y
            })
    df_simulated = pd.DataFrame(rows)
    # Standardize the count-valued features
    scaler = StandardScaler()
    df_simulated[["logins_last_week", "previous_purchases"]] = scaler.fit_transform(
        df_simulated[["logins_last_week", "previous_purchases"]]
    )
    return df_simulated


# Generate the test datasets
df_simulated_cauchy_test = simulate_dataset_noise("cauchy", 10000)
df_simulated_t_test = simulate_dataset_noise("t", 10000)
df_simulated_contaminated_test = simulate_dataset_noise("contaminated", 10000)
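
To see how differently the four noise settings behave in the tails, the sketch below draws from each and reports the fraction of draws whose magnitude exceeds a threshold. It re-implements the branches of `generate_noise` with a seeded `numpy.random.Generator` so that it runs on its own. The names `draw_noise` and `tail_fraction`, the threshold of 5, and the sample size are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def draw_noise(noise_type, size, rng):
    """Standalone re-implementation of the four noise settings (sketch)."""
    if noise_type == "normal":
        return rng.normal(0, 1, size=size)
    if noise_type == "t":
        return rng.standard_t(df=2.2, size=size) * 3.0
    if noise_type == "cauchy":
        return rng.standard_cauchy(size=size) * 4.0
    if noise_type == "contaminated":
        base = rng.normal(0, 1, size=size)
        idx = rng.choice(size, size=int(0.2 * size), replace=False)
        base[idx] = rng.normal(0, 10, size=len(idx))
        return base
    raise ValueError("Unsupported noise type")

def tail_fraction(x, threshold=5.0):
    """Empirical P(|x| > threshold)."""
    return float(np.mean(np.abs(x) > threshold))

rng = np.random.default_rng(0)
fractions = {
    name: tail_fraction(draw_noise(name, 200_000, rng))
    for name in ["normal", "contaminated", "t", "cauchy"]
}
for name, frac in fractions.items():
    print(f"{name:>13s}: P(|noise| > 5) ~ {frac:.4f}")
```

Because the latent utility has an intercept of -4.5, noise draws of this magnitude are the ones most likely to flip the outcome `y`, so the tail fraction gives a rough sense of how much label noise each setting injects.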



In [None]:
df_simulated_contaminated_test["y"].value_counts()



| y | count |
|---|-------|
| 0 | 33525 |
| 1 | 16475 |
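
The class counts above fix two reference points for interpreting accuracy on the contaminated test set: the positive rate and the accuracy of always predicting the majority class. The short calculation below is a sketch; the variable names are illustrative.

```python
# Class counts copied from the value_counts() output above.
counts = {0: 33525, 1: 16475}

total = sum(counts.values())
positive_rate = counts[1] / total
majority_baseline_acc = max(counts.values()) / total

print(f"total rows            = {total}")
print(f"positive rate         = {positive_rate:.4f}")
print(f"majority-class acc.   = {majority_baseline_acc:.4f}")
```

A model evaluated on this dataset needs to exceed the majority-class accuracy before its accuracy says anything about learned structure.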


In [None]:
def run_experiment_loop2(
    seeds,
    feature_cols,
    noise_type_for_train="normal",
    npergroup=n_per_group,
    group=False, interaction=False, nonlinear=False,
    use_quasi=False, loss_kind="bce",
    draws=1000, tune=1000, target_accept=0.9
):
    results = []


    for seed in seeds:
        np.random.seed(seed)

        # 1) Generate training and test data from the same noise distribution
        df_train = simulate_dataset_noise(noise_type_for_train, npergroup)
        group_idx_train = df_train["group_id"].values if group else None
        df_test = simulate_dataset_noise(noise_type_for_train, 10000)
        X_test = df_test[feature_cols].copy()
        y_test = df_test["y"].values
        group_idx_test = df_test["group_id"].values if group else None
        if interaction:
            X_test["interaction"] = X_test["viewed_target_category"] * X_test["discount_received"]
        if nonlinear:
            X_test["purchases_squared"] = X_test["previous_purchases"] ** 2
        X_test = X_test.values.astype("float64")

        # 2) Define the model
        model, eta, y = define_model(
            df_train, feature_cols,
            group_idx=group_idx_train,
            group=group, interaction=interaction, nonlinear=nonlinear
        )

        # 3) Run the model (standard Bayes or quasi-posterior)
        if use_quasi:
            trace = run_quasi_model(
                model, eta, y,
                loss_kind=loss_kind,
                draws=draws, tune=tune, target_accept=target_accept,
                return_inferencedata=True, idata_kwargs={"log_likelihood": True}
            )
        else:
            trace = run_bayesian_model(
                model, eta, y,
                draws=draws, tune=tune, target_accept=target_accept,
                return_inferencedata=True, idata_kwargs={"log_likelihood": True}
            )


        # 4) Posterior draws of beta, stacked over chains and draws
        beta_da = trace.posterior["beta"].stack(sample=("chain", "draw"))

        if beta_da.ndim == 2:
            # --- Non-hierarchical: stacked beta is (K, S); transpose to (S, K) ---
            beta = beta_da.transpose("sample", ...).values  # (S, K)

            # Linear predictor eta: (N_test, S)
            eta = X_test @ beta.T

        else:
            # --- Hierarchical: stacked beta is (G, K, S); transpose to (S, G, K) ---
            beta = beta_da.transpose("sample", ...).values  # (S, G, K)
            S, G, K = beta.shape

            # Pick each observation's group coefficients -> (S, N_test, K)
            beta_g = beta[:, group_idx_test, :]  # group_idx_test shape: (N_test,)

            # Linear predictor eta: (N_test, S)
            eta = np.einsum("snk,nk->ns", beta_g, X_test)

        # 5) Probit transform (standard normal CDF)
        p_samples = norm.cdf(eta)  # (N_test, S)

        # 6) Mean predicted probability over all beta draws
        p_mean = p_samples.mean(axis=1)  # (N_test,)

        # 7) Binary prediction (threshold = 0.5)
        y_pred = (p_mean >= 0.5).astype(int)

        # 8) Metrics
        acc = accuracy_score(y_test, y_pred)
        logloss_val = log_loss(y_test, p_mean, labels=[0,1])
        brier = brier_score_loss(y_test, p_mean)

        results.append({
            "seed": seed,
            "acc": acc,
            "logloss": logloss_val,
            "brier": brier,
            "model_type": "quasi" if use_quasi else "bayes",
            "loss_kind": loss_kind if use_quasi else None
        })

    return pd.DataFrame(results)
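
The hierarchical branch of `run_experiment_loop2` relies on fancy indexing plus `np.einsum` to build the linear predictor for every test row and posterior draw. The sketch below checks that contraction against explicit loops on small random arrays. All arrays here are random stand-ins (assumptions for illustration), not posterior draws from the notebook's models.

```python
import numpy as np

rng = np.random.default_rng(0)
S, G, K, N = 4, 3, 2, 5             # posterior draws, groups, features, test rows
beta = rng.normal(size=(S, G, K))   # stand-in for stacked posterior draws of beta
X = rng.normal(size=(N, K))         # stand-in for X_test
group_idx = rng.integers(0, G, size=N)  # stand-in for group_idx_test

# Select each row's group-specific coefficients: (S, N, K)
beta_g = beta[:, group_idx, :]

# Contract over the feature axis k: eta[n, s] = sum_k beta_g[s, n, k] * X[n, k]
eta = np.einsum("snk,nk->ns", beta_g, X)

# Reference implementation with explicit loops
eta_loop = np.empty((N, S))
for n in range(N):
    for s in range(S):
        eta_loop[n, s] = beta[s, group_idx[n]] @ X[n]

print(eta.shape)                    # (5, 4)
print(np.allclose(eta, eta_loop))   # True
```

The resulting `(N_test, S)` layout is what lets the later `mean(axis=1)` average predicted probabilities over posterior draws for each test row.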

## 1. Classical Bayesian model

In [None]:
df_results_bayes_normal = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=1000,
    interaction=False,
    nonlinear=False,
    group=False,
    use_quasi=False
)

print(df_results_bayes_normal)



    seed      acc   logloss     brier model_type loss_kind
0      0  0.56046  0.661548  0.234989      bayes      None
1      1  0.56760  0.661832  0.234966      bayes      None
2      2  0.56820  0.661975  0.235176      bayes      None
3      3  0.56518  0.660890  0.234624      bayes      None
4      4  0.55976  0.661102  0.234689      bayes      None
5      5  0.56518  0.661489  0.234843      bayes      None
6      6  0.56552  0.660923  0.234661      bayes      None
7      7  0.57328  0.661197  0.234741      bayes      None
8      8  0.62076  0.661634  0.234974      bayes      None
9      9  0.56030  0.660806  0.234561      bayes      None
10    10  0.56116  0.661159  0.234669      bayes      None
11    11  0.62076  0.661562  0.234975      bayes      None
12    12  0.56736  0.660955  0.234666      bayes      None
13    13  0.56282  0.660921  0.234675      bayes      None
14    14  0.56408  0.661970  0.235200      bayes      None
15    15  0.56518  0.662006  0.235012      bayes      No
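
Per-seed rows like those above are usually easier to compare once they are reduced to a mean and standard deviation per metric. The sketch below does this with pandas on the first four rows of the printed table (copied by hand); in the notebook the same aggregation could be applied directly to `df_results_bayes_normal`. The name `summary` is illustrative.

```python
import pandas as pd

# First four rows of the printed results table, copied by hand.
rows = [
    {"seed": 0, "acc": 0.56046, "logloss": 0.661548, "brier": 0.234989},
    {"seed": 1, "acc": 0.56760, "logloss": 0.661832, "brier": 0.234966},
    {"seed": 2, "acc": 0.56820, "logloss": 0.661975, "brier": 0.235176},
    {"seed": 3, "acc": 0.56518, "logloss": 0.660890, "brier": 0.234624},
]
df = pd.DataFrame(rows)

# One row per metric, with mean and sample standard deviation across seeds.
summary = df[["acc", "logloss", "brier"]].agg(["mean", "std"]).T
print(summary.round(5))
```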

In [None]:
df_results_bayes_cauchy = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=False,
    noise_type_for_train="cauchy"
)

print(df_results_bayes_cauchy)

Output()

ERROR:pymc.stats.convergence:There were 222 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 260 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


Output()

ERROR:pymc.stats.convergence:There were 335 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


[The same two messages repeat for each of the next 17 sampling runs, with these divergence counts in run order: 237, 357, 313, 280, 1229, 223, 628, 229, 291, 230, 371, 721, 216, 569, 200, 633, 428.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.63058  0.646294  0.227156      bayes      None
1      1  0.61868  0.654055  0.230847      bayes      None
2      2  0.62172  0.658592  0.232329      bayes      None
3      3  0.62544  0.654522  0.230173      bayes      None
4      4  0.62424  0.652796  0.230253      bayes      None
5      5  0.62170  0.654497  0.230941      bayes      None
6      6  0.61798  0.655234  0.231260      bayes      None
7      7  0.62776  0.650753  0.229042      bayes      None
8      8  0.62864  0.650901  0.229157      bayes      None
9      9  0.63368  0.650265  0.229067      bayes      None
10    10  0.63080  0.649812  0.228607      bayes      None
11    11  0.62040  0.653537  0.230618      bayes      None
12    12  0.62102  0.656369  0.231451      bayes      None
13    13  0.62632  0.653934  0.230273      bayes      None
14    14  0.62708  0.651920  0.229774      bayes      None
15    15  0.62348  0.653027  0.230343      bayes      No
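
Per-seed tables like the one above are easier to compare across noise settings once each metric is reduced to a mean and a spread. The following is a minimal sketch using only the standard library; the `rows` list is copied from the first five rows of the table above, and the helper name `summarize` is hypothetical (it is not part of the notebook).

```python
from statistics import mean, stdev

# First five rows of the table above: (seed, acc, logloss, brier).
rows = [
    (0, 0.63058, 0.646294, 0.227156),
    (1, 0.61868, 0.654055, 0.230847),
    (2, 0.62172, 0.658592, 0.232329),
    (3, 0.62544, 0.654522, 0.230173),
    (4, 0.62424, 0.652796, 0.230253),
]


def summarize(rows):
    """Return {metric: (mean, sample standard deviation)} across seeds."""
    names = ("acc", "logloss", "brier")
    out = {}
    for j, name in enumerate(names, start=1):
        values = [r[j] for r in rows]
        out[name] = (mean(values), stdev(values))
    return out


summary = summarize(rows)
for name, (m, s) in summary.items():
    print(f"{name}: {m:.4f} +/- {s:.4f}")
```

With the full result frames, the same summary should be available through pandas, for example `df_results_bayes_t[["acc", "logloss", "brier"]].agg(["mean", "std"])`.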

In [None]:
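# Standard Bayesian fit (use_quasi=False) with all three structural terms enabled; training noise type "t".
df_results_bayes_t = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=False,
    noise_type_for_train="t",
)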

print(df_results_bayes_t)

ERROR:pymc.stats.convergence:There were 395 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

[The same two messages repeat for each of the remaining 19 sampling runs, with these divergence counts in run order: 266, 481, 450, 384, 201, 1109, 306, 1137, 231, 261, 584, 229, 605, 429, 231, 449, 220, 323, 422.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.67726  0.603086  0.206617      bayes      None
1      1  0.68652  0.597382  0.204551      bayes      None
2      2  0.67462  0.599341  0.205616      bayes      None
3      3  0.68102  0.597177  0.204516      bayes      None
4      4  0.68852  0.608448  0.209017      bayes      None
5      5  0.68170  0.598103  0.205116      bayes      None
6      6  0.67918  0.597091  0.204498      bayes      None
7      7  0.68332  0.600027  0.205706      bayes      None
8      8  0.69012  0.596229  0.204485      bayes      None
9      9  0.67840  0.598685  0.205527      bayes      None
10    10  0.68844  0.600795  0.205755      bayes      None
11    11  0.67648  0.607285  0.208483      bayes      None
12    12  0.67194  0.598712  0.205305      bayes      None
13    13  0.67198  0.598913  0.205336      bayes      None
14    14  0.68566  0.599378  0.205461      bayes      None
15    15  0.68156  0.602131  0.206849      bayes      No
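
The convergence messages above carry one divergence count per sampling run, which is worth tracking next to the metrics because heavily divergent runs are the least trustworthy. A minimal sketch for pulling those counts out of captured log text follows; it assumes the log has been saved to a string (here two messages copied from the output above), and the names `DIVERGENCE_RE` and `divergence_counts` are hypothetical.

```python
import re

# Two messages copied from the output above; in practice `log_text`
# would hold the full captured stderr of the cell.
log_text = """\
ERROR:pymc.stats.convergence:There were 395 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 266 divergences after tuning. Increase `target_accept` or reparameterize.
"""

# Matches the count in the PyMC divergence message shown in the log.
DIVERGENCE_RE = re.compile(r"There were (\d+) divergences after tuning")


def divergence_counts(text):
    """Return the divergence counts in the order they appear in the log."""
    return [int(n) for n in DIVERGENCE_RE.findall(text)]


counts = divergence_counts(log_text)
print(counts, max(counts))
```

Assuming one message per seed, the resulting list can be attached to the results table as an extra column and used to flag runs whose metrics should be read with caution.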

In [None]:
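# Standard Bayesian fit (use_quasi=False) with all three structural terms enabled; training noise type "contaminated".
df_results_bayes_conta = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=False,
    noise_type_for_train="contaminated",
)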
print(df_results_bayes_conta)

ERROR:pymc.stats.convergence:There were 391 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

[The same two messages repeat for each of the remaining 19 sampling runs, with these divergence counts in run order: 1382, 189, 249, 773, 414, 216, 263, 555, 303, 260, 334, 282, 1054, 261, 445, 1013, 915, 236, 310.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.77718  0.464056  0.148911      bayes      None
1      1  0.76736  0.460290  0.148029      bayes      None
2      2  0.77640  0.465848  0.149451      bayes      None
3      3  0.78390  0.468940  0.150590      bayes      None
4      4  0.77498  0.471006  0.152024      bayes      None
5      5  0.76980  0.460939  0.147945      bayes      None
6      6  0.77708  0.466281  0.149703      bayes      None
7      7  0.77524  0.463612  0.148648      bayes      None
8      8  0.78534  0.470013  0.149847      bayes      None
9      9  0.78028  0.461996  0.148477      bayes      None
10    10  0.78616  0.461674  0.148119      bayes      None
11    11  0.76544  0.461575  0.147766      bayes      None
12    12  0.78210  0.462429  0.147969      bayes      None
13    13  0.77260  0.466117  0.148716      bayes      None
14    14  0.77098  0.468600  0.151351      bayes      None
15    15  0.78286  0.465270  0.148787      bayes      No

## 2. Quasi-Bayes model with BCE loss function
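
The result tables in this notebook report `acc`, `logloss`, and `brier`, and the quasi-Bayes runs are labelled with `loss_kind = bce`. As a reference for reading them, here is a minimal sketch of the standard definitions, assuming `bce` stands for binary cross-entropy and that the notebook's metrics follow the usual formulas. The function names are hypothetical and the notebook's own implementation may differ in details such as clipping.

```python
import math


def bce(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy (log loss); probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def brier(y_true, p_pred):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def accuracy(y_true, p_pred, threshold=0.5):
    """Share of observations whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


# An uninformative predictor that always outputs 0.5.
y = [1, 0, 1, 0]
p = [0.5, 0.5, 0.5, 0.5]
print(bce(y, p), brier(y, p))
```

A constant prediction of 0.5 yields a log loss of ln 2 (about 0.693) and a Brier score of 0.25, which gives a baseline against which the values in the tables can be judged.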

In [None]:
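# Quasi-Bayes fit (use_quasi=True) of the baseline model: no interaction, nonlinear, or group terms.
# noise_type_for_train is not passed, so the function's default is used.
df_results_quasi_normal = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=False,
    nonlinear=False,
    group=False,
    use_quasi=True,
)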
print(df_results_quasi_normal)



    seed      acc   logloss     brier model_type loss_kind
0      0  0.56154  0.661538  0.234983      quasi       bce
1      1  0.56760  0.661836  0.234968      quasi       bce
2      2  0.56818  0.661996  0.235188      quasi       bce
3      3  0.56518  0.660887  0.234625      quasi       bce
4      4  0.55976  0.661102  0.234689      quasi       bce
5      5  0.56518  0.661473  0.234836      quasi       bce
6      6  0.56552  0.660920  0.234659      quasi       bce
7      7  0.57328  0.661208  0.234747      quasi       bce
8      8  0.62076  0.661628  0.234972      quasi       bce
9      9  0.56030  0.660815  0.234564      quasi       bce
10    10  0.56116  0.661169  0.234673      quasi       bce
11    11  0.62090  0.661558  0.234973      quasi       bce
12    12  0.56736  0.660960  0.234669      quasi       bce
13    13  0.56116  0.660909  0.234670      quasi       bce
14    14  0.56408  0.661968  0.235199      quasi       bce
15    15  0.56518  0.662043  0.235026      quasi       b

In [None]:
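# Quasi-Bayes fit (use_quasi=True) with all three structural terms enabled; training noise type "cauchy".
df_results_quasi_cauchy = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=True,
    noise_type_for_train="cauchy",
)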

print(df_results_quasi_cauchy)


ERROR:pymc.stats.convergence:There were 848 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

[The same two messages repeat for each of the remaining 19 sampling runs, with these divergence counts in run order: 365, 434, 1310, 290, 254, 674, 558, 601, 264, 356, 198, 352, 409, 1669, 733, 354, 215, 366, 217.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.63140  0.645820  0.226940      quasi       bce
1      1  0.61884  0.653843  0.230760      quasi       bce
2      2  0.62158  0.658330  0.232211      quasi       bce
3      3  0.62468  0.654961  0.230324      quasi       bce
4      4  0.62434  0.652877  0.230293      quasi       bce
5      5  0.62046  0.654523  0.230948      quasi       bce
6      6  0.61708  0.654842  0.231092      quasi       bce
7      7  0.62776  0.650817  0.229122      quasi       bce
8      8  0.62730  0.650920  0.229162      quasi       bce
9      9  0.63374  0.650154  0.229015      quasi       bce
10    10  0.63226  0.650309  0.228798      quasi       bce
11    11  0.62194  0.653263  0.230489      quasi       bce
12    12  0.62084  0.656457  0.231482      quasi       bce
13    13  0.62558  0.653641  0.230209      quasi       bce
14    14  0.62638  0.652003  0.229795      quasi       bce
15    15  0.62568  0.652142  0.229939      quasi       b
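
The sampler messages for these runs suggest either raising `target_accept` or reparameterizing. For models with a group-level effect, a common reparameterization is the non-centered form, where a standard-normal variable is sampled and then shifted and scaled. The sketch below only illustrates that the two forms describe the same distribution, using the standard library. Whether the notebook's model currently uses the centered form is an assumption, and the function names are hypothetical.

```python
import random
from statistics import mean, stdev


def centered(mu, sigma, n, rng):
    """Centered form: draw the group effect directly, alpha ~ Normal(mu, sigma)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


def non_centered(mu, sigma, n, rng):
    """Non-centered form: draw z ~ Normal(0, 1), then set alpha = mu + sigma * z."""
    return [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(n)]


rng = random.Random(0)
a_c = centered(0.5, 2.0, 50_000, rng)
a_nc = non_centered(0.5, 2.0, 50_000, rng)

# Both samples should have mean near 0.5 and standard deviation near 2.0.
print(round(mean(a_c), 2), round(stdev(a_c), 2))
print(round(mean(a_nc), 2), round(stdev(a_nc), 2))
```

In a PyMC model the same idea amounts to sampling a standard-normal `z` and defining the group effect as a deterministic `mu + sigma * z`, which often removes the funnel geometry that causes divergences. The other remedy named in the message is to pass a higher `target_accept` (for example 0.95) to `pm.sample`.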

In [None]:
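# Quasi-Bayes fit (use_quasi=True) with all three structural terms enabled; training noise type "t".
df_results_quasi__t = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=True,
    noise_type_for_train="t",
)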
print(df_results_quasi__t)


ERROR:pymc.stats.convergence:There were 549 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

[Both errors repeat for each of the remaining 19 fits. Divergences after tuning, in output order: 979, 348, 310, 286, 315, 336, 182, 519, 1054, 273, 1192, 318, 182, 278, 435, 697, 238, 399, 211.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.67754  0.602747  0.206446      quasi       bce
1      1  0.68608  0.599413  0.205283      quasi       bce
2      2  0.67436  0.599545  0.205701      quasi       bce
3      3  0.68076  0.597617  0.204677      quasi       bce
4      4  0.68960  0.609318  0.209319      quasi       bce
5      5  0.68142  0.598324  0.205205      quasi       bce
6      6  0.67830  0.597847  0.204717      quasi       bce
7      7  0.68378  0.599965  0.205688      quasi       bce
8      8  0.68914  0.596187  0.204488      quasi       bce
9      9  0.67988  0.598200  0.205309      quasi       bce
10    10  0.68874  0.600556  0.205671      quasi       bce
11    11  0.67576  0.607476  0.208523      quasi       bce
12    12  0.67214  0.598647  0.205273      quasi       bce
13    13  0.67398  0.598766  0.205263      quasi       bce
14    14  0.68608  0.599465  0.205487      quasi       bce
15    15  0.68156  0.602133  0.206873      quasi       b
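
The divergence counts in the convergence errors above vary widely from fit to fit, so it can help to pull them out of the captured log text and look at them as numbers. A minimal sketch, assuming the log has been saved as a string; `divergence_counts` is a hypothetical helper, and the sample lines are copied from the output above.

```python
import re
from statistics import median

# Matches the count in PyMC's "There were N divergences after tuning" message.
DIVERGENCE_RE = re.compile(r"There were (\d+) divergences after tuning")

def divergence_counts(log_text):
    """Extract the divergence count from every matching convergence line."""
    return [int(n) for n in DIVERGENCE_RE.findall(log_text)]

log_text = """\
ERROR:pymc.stats.convergence:There were 549 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.
ERROR:pymc.stats.convergence:There were 979 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 348 divergences after tuning. Increase `target_accept` or reparameterize.
"""

counts = divergence_counts(log_text)
print("per fit:", counts)
print("max:", max(counts), "median:", median(counts))
```

Fits with unusually high counts are natural candidates for a rerun with a higher `target_accept` or a reparameterized model, as the error message itself suggests.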

In [None]:
# Quasi-Bayes fit with the default BCE loss; training noise is contaminated normal.
df_results_quasi_conta = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=True,
    noise_type_for_train="contaminated"
)
print(df_results_quasi_conta)

ERROR:pymc.stats.convergence:There were 224 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

[Both errors repeat for each of the remaining 19 fits. Divergences after tuning, in output order: 775, 342, 450, 757, 292, 581, 617, 501, 280, 308, 290, 287, 197, 302, 246, 546, 1186, 461, 483.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.77722  0.464124  0.148933      quasi       bce
1      1  0.76678  0.459192  0.147534      quasi       bce
2      2  0.77630  0.465867  0.149510      quasi       bce
3      3  0.78386  0.468367  0.150458      quasi       bce
4      4  0.77502  0.471468  0.152082      quasi       bce
5      5  0.76982  0.460956  0.147957      quasi       bce
6      6  0.77730  0.465747  0.149494      quasi       bce
7      7  0.77560  0.464025  0.148743      quasi       bce
8      8  0.78506  0.469856  0.149818      quasi       bce
9      9  0.78034  0.461847  0.148399      quasi       bce
10    10  0.78648  0.461731  0.148120      quasi       bce
11    11  0.76452  0.461479  0.147772      quasi       bce
12    12  0.78176  0.462376  0.147945      quasi       bce
13    13  0.77286  0.465901  0.148672      quasi       bce
14    14  0.77088  0.468804  0.151403      quasi       bce
15    15  0.78278  0.465795  0.148868      quasi       b
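
Per-seed rows are easier to compare across settings once they are reduced to a mean and a spread. The sketch below does this for the 16 rows printed above (the output is cut off after seed 15, so seeds 16 to 19 are not included); `summarize` is a hypothetical helper, not part of the notebook.

```python
from statistics import mean, stdev

# acc and logloss for seeds 0-15, copied from the printed table.
acc = [0.77722, 0.76678, 0.77630, 0.78386, 0.77502, 0.76982, 0.77730, 0.77560,
       0.78506, 0.78034, 0.78648, 0.76452, 0.78176, 0.77286, 0.77088, 0.78278]
logloss = [0.464124, 0.459192, 0.465867, 0.468367, 0.471468, 0.460956, 0.465747, 0.464025,
           0.469856, 0.461847, 0.461731, 0.461479, 0.462376, 0.465901, 0.468804, 0.465795]

def summarize(values):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    m, s = mean(values), stdev(values)
    return m, s, s / len(values) ** 0.5

summary = {"acc": summarize(acc), "logloss": summarize(logloss)}
for name, (m, s, se) in summary.items():
    print(f"{name}: mean={m:.4f} sd={s:.4f} se={se:.4f}")
```

With the full results frame in memory, the same reduction could be applied to every metric column at once; the hand-copied lists here only keep the example self-contained.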

## 3. SPH Bayes model
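
The cells in this part pass `loss_kind="sph"`. The loss itself is not defined in this section; if `sph` stands for the spherical scoring rule (an assumption), a binary version could look like the sketch below. Both function names are hypothetical, and the `1 - score` loss convention is also assumed.

```python
import math

def spherical_score(p, y):
    """Spherical score of forecast p = P(y=1) for outcome y in {0, 1}; higher is better."""
    p_observed = p if y == 1 else 1.0 - p
    # Normalise by the Euclidean norm of the forecast vector (p, 1 - p).
    return p_observed / math.sqrt(p ** 2 + (1.0 - p) ** 2)

def spherical_loss(p, y):
    """Loss form (assumed convention): 1 - score, which always lies in [0, 1]."""
    return 1.0 - spherical_score(p, y)

# A confident wrong forecast costs at most 1 here, whereas the log loss grows without bound.
print(spherical_loss(1e-9, 1), -math.log(1e-9))
```

The bounded penalty is the usual motivation for using such a score under heavy-tailed or contaminated noise: a few extreme observations cannot dominate the total loss.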

In [None]:
# SPH loss with the default training noise.
# run_experiment_loop2 takes no `df_simulated_test` argument (see the other calls), so it is not passed here.
df_results_sph_normal = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=1000,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=True,
    loss_kind="sph"
)

print(df_results_sph_normal)

In [None]:
# SPH loss; training noise is Cauchy.
df_results_sph_cauchy = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=True,
    noise_type_for_train="cauchy",
    loss_kind="sph"
)

print(df_results_sph_cauchy)

ERROR:pymc.stats.convergence:There were 429 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

[Both errors repeat for each of the remaining 19 fits. Divergences after tuning, in output order: 628, 526, 207, 196, 282, 277, 423, 2089, 312, 355, 853, 257, 237, 1133, 305, 232, 558, 618, 684.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.62984  0.655821  0.230118      quasi       sph
1      1  0.62046  0.662196  0.233033      quasi       sph
2      2  0.63394  0.681422  0.239537      quasi       sph
3      3  0.63412  0.694517  0.241195      quasi       sph
4      4  0.62372  0.663291  0.233588      quasi       sph
5      5  0.62688  0.667297  0.234428      quasi       sph
6      6  0.61744  0.662621  0.233263      quasi       sph
7      7  0.62514  0.667661  0.234084      quasi       sph
8      8  0.62904  0.685829  0.239397      quasi       sph
9      9  0.63132  0.667420  0.234624      quasi       sph
10    10  0.63206  0.659543  0.231525      quasi       sph
11    11  0.62176  0.656596  0.231318      quasi       sph
12    12  0.62060  0.676462  0.237462      quasi       sph
13    13  0.62710  0.676176  0.237180      quasi       sph
14    14  0.62670  0.669799  0.235252      quasi       sph
15    15  0.63198  0.679775  0.239408      quasi       s
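
The repeated advice to "reparameterize" usually refers, for hierarchical models, to a non-centered parameterization of the group-level effects. The model definition is not shown in this section, so the following is only a sketch under the assumption that the group intercepts $\alpha_g$ have a normal prior with mean $\mu$ and scale $\sigma$; all three symbols are assumptions.

$$
\text{centered: } \alpha_g \sim \mathcal{N}(\mu, \sigma^2)
\qquad\Longleftrightarrow\qquad
\text{non-centered: } z_g \sim \mathcal{N}(0, 1), \quad \alpha_g = \mu + \sigma z_g
$$

Both forms imply the same prior on $\alpha_g$, but in the non-centered form the sampler explores $(\mu, \sigma, z_g)$, where $z_g$ is a priori independent of $\sigma$. This often removes the funnel-shaped geometry that produces divergences; whether it helps here would need to be checked on the actual model.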

In [None]:
# SPH loss; training noise is Student's t.
df_results_sph__t = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=True,
    noise_type_for_train="t",
    loss_kind="sph"
)

print(df_results_sph__t)

ERROR:pymc.stats.convergence:There were 480 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details

[Both errors repeat for each of the remaining 19 fits. Divergences after tuning, in output order: 853, 227, 186, 723, 177, 355, 335, 253, 230, 198, 295, 618, 827, 1071, 172, 614, 185, 174, 385.]


    seed      acc   logloss     brier model_type loss_kind
0      0  0.67588  0.611535  0.207617      quasi       sph
1      1  0.68568  0.616520  0.208134      quasi       sph
2      2  0.69744  0.662121  0.217589      quasi       sph
3      3  0.69388  0.623601  0.209475      quasi       sph
4      4  0.70218  0.625620  0.210228      quasi       sph
5      5  0.68184  0.606379  0.206682      quasi       sph
6      6  0.67790  0.606361  0.205931      quasi       sph
7      7  0.71036  0.661855  0.217537      quasi       sph
8      8  0.71064  0.667913  0.218173      quasi       sph
9      9  0.67654  0.613118  0.208639      quasi       sph
10    10  0.69206  0.617210  0.208102      quasi       sph
11    11  0.67334  0.625972  0.212451      quasi       sph
12    12  0.66860  0.609297  0.206743      quasi       sph
13    13  0.66622  0.619364  0.210075      quasi       sph
14    14  0.69530  0.631713  0.212155      quasi       sph
15    15  0.69878  0.633862  0.213449      quasi       s

In [None]:
print(df_results_sph__t)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.56864  0.693202  0.248528      quasi       sph
1      1  0.57512  0.688250  0.246328      quasi       sph
2      2  0.57286  0.677246  0.242066      quasi       sph
3      3  0.57132  0.678168  0.242581      quasi       sph
4      4  0.61560  0.683988  0.244593      quasi       sph
5      5  0.57638  0.679524  0.242821      quasi       sph
6      6  0.57044  0.684517  0.244677      quasi       sph
7      7  0.57634  0.676922  0.242003      quasi       sph
8      8  0.57270  0.674237  0.240745      quasi       sph
9      9  0.57218  0.687909  0.246586      quasi       sph
10    10  0.57572  0.678578  0.242676      quasi       sph
11    11  0.56312  0.708843  0.254075      quasi       sph
12    12  0.56868  0.687081  0.246072      quasi       sph
13    13  0.56420  0.684818  0.245253      quasi       sph
14    14  0.56672  0.675942  0.241453      quasi       sph
15    15  0.58216  0.674624  0.240934      quasi       s
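
The `acc`, `logloss`, and `brier` columns are easier to read against an uninformative baseline. The sketch below assumes the standard definitions (accuracy at a 0.5 threshold, mean binary cross-entropy, mean squared error on probabilities); the helper name `binary_metrics` is hypothetical and the notebook's own metric code may differ in details such as clipping. A model that always predicts 0.5 scores a log loss of $\ln 2 \approx 0.693$ and a Brier score of 0.25, so the rows in the table above (log loss roughly 0.67 to 0.71, Brier roughly 0.24 to 0.25) sit close to chance level.

```python
import math


def binary_metrics(y_true, p_pred, eps=1e-12):
    """Accuracy, log loss, and Brier score for binary labels and predicted P(y=1).

    Assumed standard definitions; not taken from the notebook's own code.
    """
    n = len(y_true)
    # Clip probabilities away from 0 and 1 so the logarithm stays finite.
    clipped = [min(max(p, eps), 1 - eps) for p in p_pred]
    acc = sum((p >= 0.5) == bool(y) for y, p in zip(y_true, p_pred)) / n
    logloss = -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(y_true, clipped)
    ) / n
    brier = sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / n
    return {"acc": acc, "logloss": logloss, "brier": brier}


# Uninformative baseline: always predict 0.5 on a balanced toy sample.
y = [0, 1, 1, 0]
baseline = binary_metrics(y, [0.5] * len(y))
print(baseline)
```

Comparing each seed's row with this baseline separates runs that learned a usable signal from runs that are barely better than a constant prediction.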

In [None]:
df_results_sph_conta = run_experiment_loop2(
    seeds=range(20),
    feature_cols=feature_cols,
    npergroup=200,
    interaction=True,
    nonlinear=True,
    group=True,
    use_quasi=True,
    noise_type_for_train="contaminated",
    loss_kind="sph",
)

print(df_results_sph_conta)

ERROR:pymc.stats.convergence:There were 1124 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 250 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 160 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 233 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 697 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 197 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 345 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 252 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 284 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 223 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 1103 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 369 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 509 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 378 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 153 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 242 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 296 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 491 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 202 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


ERROR:pymc.stats.convergence:There were 318 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:The effective sample size per chain is smaller than 100 for some parameters.  A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details


    seed      acc   logloss     brier model_type loss_kind
0      0  0.79444  0.466703  0.142950      quasi       sph
1      1  0.82032  0.577386  0.148960      quasi       sph
2      2  0.83032  0.522166  0.145506      quasi       sph
3      3  0.82932  0.560687  0.146225      quasi       sph
4      4  0.82764  0.651980  0.152184      quasi       sph
5      5  0.82116  0.583781  0.147143      quasi       sph
6      6  0.80146  0.491246  0.142696      quasi       sph
7      7  0.83022  0.585550  0.150276      quasi       sph
8      8  0.82226  0.643687  0.150069      quasi       sph
9      9  0.83514  0.625965  0.149147      quasi       sph
10    10  0.83060  0.611690  0.149944      quasi       sph
11    11  0.81860  0.631209  0.149666      quasi       sph
12    12  0.79130  0.472962  0.143580      quasi       sph
13    13  0.82310  0.640382  0.148220      quasi       sph
14    14  0.81890  0.567581  0.147790      quasi       sph
15    15  0.82820  0.573284  0.148970      quasi       s
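
Divergence counts vary widely across runs in the log above (from a few hundred to more than a thousand), so it can help to tally them before trusting any single row of the table. The following is a minimal sketch that pulls the counts out of captured log text with a regular expression; the helper name `tally_divergences` is hypothetical, and mapping a position in the list back to a seed assumes each run emits exactly one such warning, in seed order.

```python
import re

# Matches the PyMC convergence warning text seen in the log output.
DIVERGENCE_RE = re.compile(r"There were (\d+) divergences after tuning")


def tally_divergences(log_text):
    """Return the divergence count from each matching warning, in log order."""
    return [int(m.group(1)) for m in DIVERGENCE_RE.finditer(log_text)]


# First three warnings from the contaminated-noise run, copied from the log.
sample_log = """\
ERROR:pymc.stats.convergence:There were 1124 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 250 divergences after tuning. Increase `target_accept` or reparameterize.
ERROR:pymc.stats.convergence:There were 160 divergences after tuning. Increase `target_accept` or reparameterize.
"""

counts = tally_divergences(sample_log)
# Index of the run with the most divergences (assumed to equal the seed).
worst = max(range(len(counts)), key=counts.__getitem__)
print(counts, "worst run index:", worst)
```

Runs flagged this way are candidates for a higher `target_accept` or a reparameterized model, as the warning itself suggests.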

In [None]:
print(df_results_sph_conta)

    seed      acc   logloss     brier model_type loss_kind
0      0  0.57264  0.681557  0.243656      quasi       sph
1      1  0.57082  0.671819  0.239269      quasi       sph
2      2  0.56318  0.675788  0.241431      quasi       sph
3      3  0.58098  0.674097  0.240932      quasi       sph
4      4  0.64462  0.706825  0.249561      quasi       sph
5      5  0.60440  0.677398  0.240746      quasi       sph
6      6  0.58284  0.695847  0.249484      quasi       sph
7      7  0.58602  0.687959  0.246669      quasi       sph
8      8  0.62044  0.671252  0.238479      quasi       sph
9      9  0.63988  0.660855  0.234345      quasi       sph
10    10  0.58860  0.678198  0.242757      quasi       sph
11    11  0.62746  0.674014  0.239225      quasi       sph
12    12  0.60020  0.688996  0.245330      quasi       sph
13    13  0.59830  0.692139  0.246136      quasi       sph
14    14  0.58830  0.670330  0.238827      quasi       sph
15    15  0.61044  0.665096  0.236265      quasi       s

## 4. KSD Bayes model

In [None]:
all_metrics = []
noise_type = "cauchy"
for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible.
    np.random.seed(seed)
    torch.manual_seed(seed)
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type=noise_type,
        n_per_group=10000
    )
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=True, nonlinear=True, group=True,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    all_metrics.append(res["metrics_test"])
    print(all_metrics)  # running list of test metrics after each seed

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean', 'std', 'median'])
print(summary)
print(df)

Sample: 100%|██████████| 330/330 [00:16, 19.98it/s, step size=3.12e-01, acc. prob=0.937]


[outer 000] TRAIN (EMA+K-ens) ll=0.6784  br=0.2427  acc=0.5580


Sample: 100%|██████████| 330/330 [00:16, 19.99it/s, step size=2.46e-01, acc. prob=0.964]


[outer 001] TRAIN (EMA+K-ens) ll=0.6746  br=0.2407  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=3.02e-01, acc. prob=0.936]


[outer 002] TRAIN (EMA+K-ens) ll=0.6776  br=0.2420  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 17.90it/s, step size=2.69e-01, acc. prob=0.947]


[outer 003] TRAIN (EMA+K-ens) ll=0.6711  br=0.2386  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 18.63it/s, step size=2.97e-01, acc. prob=0.923]


[outer 004] TRAIN (EMA+K-ens) ll=0.6699  br=0.2380  acc=0.6840


Sample: 100%|██████████| 330/330 [00:19, 16.62it/s, step size=2.49e-01, acc. prob=0.956]


[outer 005] TRAIN (EMA+K-ens) ll=0.6687  br=0.2375  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=3.42e-01, acc. prob=0.896]


[outer 006] TRAIN (EMA+K-ens) ll=0.6689  br=0.2375  acc=0.6950


Sample: 100%|██████████| 330/330 [00:17, 18.54it/s, step size=2.47e-01, acc. prob=0.952]


[outer 007] TRAIN (EMA+K-ens) ll=0.6672  br=0.2366  acc=0.6940


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=2.95e-01, acc. prob=0.936]


[outer 008] TRAIN (EMA+K-ens) ll=0.6691  br=0.2374  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 20.14it/s, step size=3.44e-01, acc. prob=0.874]


[outer 009] TRAIN (EMA+K-ens) ll=0.6677  br=0.2367  acc=0.6840


Sample: 100%|██████████| 330/330 [00:18, 17.59it/s, step size=3.05e-01, acc. prob=0.896]


[outer 010] TRAIN (EMA+K-ens) ll=0.6672  br=0.2365  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 20.70it/s, step size=2.91e-01, acc. prob=0.950]


[outer 011] TRAIN (EMA+K-ens) ll=0.6671  br=0.2366  acc=0.6900


Sample: 100%|██████████| 330/330 [00:18, 17.82it/s, step size=2.52e-01, acc. prob=0.966]


[outer 012] TRAIN (EMA+K-ens) ll=0.6756  br=0.2405  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.49it/s, step size=3.03e-01, acc. prob=0.934]


[outer 013] TRAIN (EMA+K-ens) ll=0.6752  br=0.2403  acc=0.6790


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=2.81e-01, acc. prob=0.929]


[outer 014] TRAIN (EMA+K-ens) ll=0.6752  br=0.2403  acc=0.6770
[Early stop @ outer 14] Δll=0.021%, Δbr=0.150%, Δacc=0.004
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}]


Sample: 100%|██████████| 330/330 [00:17, 19.09it/s, step size=2.99e-01, acc. prob=0.922]


[outer 000] TRAIN (EMA+K-ens) ll=0.7086  br=0.2569  acc=0.5660


Sample: 100%|██████████| 330/330 [00:17, 18.92it/s, step size=2.77e-01, acc. prob=0.936]


[outer 001] TRAIN (EMA+K-ens) ll=0.7175  br=0.2613  acc=0.4990


Sample: 100%|██████████| 330/330 [00:18, 18.05it/s, step size=3.08e-01, acc. prob=0.937]


[outer 002] TRAIN (EMA+K-ens) ll=0.7140  br=0.2596  acc=0.5520


Sample: 100%|██████████| 330/330 [00:18, 18.04it/s, step size=3.45e-01, acc. prob=0.931]


[outer 003] TRAIN (EMA+K-ens) ll=0.7013  br=0.2534  acc=0.6100


Sample: 100%|██████████| 330/330 [00:17, 18.46it/s, step size=2.94e-01, acc. prob=0.911]


[outer 004] TRAIN (EMA+K-ens) ll=0.7040  br=0.2547  acc=0.6000


Sample: 100%|██████████| 330/330 [00:17, 18.59it/s, step size=2.64e-01, acc. prob=0.940]


[outer 005] TRAIN (EMA+K-ens) ll=0.7013  br=0.2535  acc=0.6210


Sample: 100%|██████████| 330/330 [00:18, 18.16it/s, step size=3.13e-01, acc. prob=0.936]


[outer 006] TRAIN (EMA+K-ens) ll=0.6960  br=0.2510  acc=0.6320


Sample: 100%|██████████| 330/330 [00:18, 18.23it/s, step size=2.76e-01, acc. prob=0.953]


[outer 007] TRAIN (EMA+K-ens) ll=0.6927  br=0.2493  acc=0.6400


Sample: 100%|██████████| 330/330 [00:17, 18.95it/s, step size=3.39e-01, acc. prob=0.906]


[outer 008] TRAIN (EMA+K-ens) ll=0.6922  br=0.2489  acc=0.6440


Sample: 100%|██████████| 330/330 [00:18, 17.69it/s, step size=2.60e-01, acc. prob=0.949]


[outer 009] TRAIN (EMA+K-ens) ll=0.6862  br=0.2461  acc=0.6570


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=3.04e-01, acc. prob=0.903]


[outer 010] TRAIN (EMA+K-ens) ll=0.6821  br=0.2442  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=3.14e-01, acc. prob=0.913]


[outer 011] TRAIN (EMA+K-ens) ll=0.6863  br=0.2463  acc=0.6570


Sample: 100%|██████████| 330/330 [00:18, 18.23it/s, step size=2.68e-01, acc. prob=0.954]


[outer 012] TRAIN (EMA+K-ens) ll=0.6821  br=0.2442  acc=0.6550


Sample: 100%|██████████| 330/330 [00:16, 19.96it/s, step size=2.74e-01, acc. prob=0.943]


[outer 013] TRAIN (EMA+K-ens) ll=0.6804  br=0.2434  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.89it/s, step size=2.82e-01, acc. prob=0.949]


[outer 014] TRAIN (EMA+K-ens) ll=0.6821  br=0.2441  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 19.19it/s, step size=2.74e-01, acc. prob=0.932]


[outer 015] TRAIN (EMA+K-ens) ll=0.6834  br=0.2448  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=2.54e-01, acc. prob=0.946]


[outer 016] TRAIN (EMA+K-ens) ll=0.6837  br=0.2450  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 19.65it/s, step size=2.84e-01, acc. prob=0.929]


[outer 017] TRAIN (EMA+K-ens) ll=0.6836  br=0.2449  acc=0.6530


Sample: 100%|██████████| 330/330 [00:17, 18.77it/s, step size=2.86e-01, acc. prob=0.944]


[outer 018] TRAIN (EMA+K-ens) ll=0.6798  br=0.2430  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.40it/s, step size=2.71e-01, acc. prob=0.941]


[outer 019] TRAIN (EMA+K-ens) ll=0.6736  br=0.2400  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 18.99it/s, step size=3.29e-01, acc. prob=0.924]


[outer 020] TRAIN (EMA+K-ens) ll=0.6748  br=0.2406  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.56it/s, step size=3.05e-01, acc. prob=0.949]


[outer 021] TRAIN (EMA+K-ens) ll=0.6759  br=0.2411  acc=0.6530


Sample: 100%|██████████| 330/330 [00:18, 18.24it/s, step size=2.82e-01, acc. prob=0.961]


[outer 022] TRAIN (EMA+K-ens) ll=0.6762  br=0.2412  acc=0.6720


Sample: 100%|██████████| 330/330 [00:15, 21.12it/s, step size=3.33e-01, acc. prob=0.942]


[outer 023] TRAIN (EMA+K-ens) ll=0.6790  br=0.2426  acc=0.6810


Sample: 100%|██████████| 330/330 [00:16, 19.88it/s, step size=2.67e-01, acc. prob=0.940]


[outer 024] TRAIN (EMA+K-ens) ll=0.6780  br=0.2421  acc=0.6530


Sample: 100%|██████████| 330/330 [00:18, 17.97it/s, step size=2.77e-01, acc. prob=0.944]


[outer 025] TRAIN (EMA+K-ens) ll=0.6839  br=0.2449  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=3.03e-01, acc. prob=0.945]


[outer 026] TRAIN (EMA+K-ens) ll=0.6912  br=0.2485  acc=0.6560


Sample: 100%|██████████| 330/330 [00:15, 20.74it/s, step size=2.98e-01, acc. prob=0.921]


[outer 027] TRAIN (EMA+K-ens) ll=0.6943  br=0.2499  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=2.81e-01, acc. prob=0.942]


[outer 028] TRAIN (EMA+K-ens) ll=0.6940  br=0.2499  acc=0.6350


Sample: 100%|██████████| 330/330 [00:19, 17.06it/s, step size=3.01e-01, acc. prob=0.928]


[outer 029] TRAIN (EMA+K-ens) ll=0.6974  br=0.2516  acc=0.6230


Sample: 100%|██████████| 330/330 [00:17, 18.54it/s, step size=2.88e-01, acc. prob=0.948]


[outer 030] TRAIN (EMA+K-ens) ll=0.6952  br=0.2505  acc=0.6300


Sample: 100%|██████████| 330/330 [00:18, 17.78it/s, step size=3.12e-01, acc. prob=0.923]


[outer 031] TRAIN (EMA+K-ens) ll=0.6928  br=0.2494  acc=0.6780


Sample: 100%|██████████| 330/330 [00:17, 18.53it/s, step size=2.96e-01, acc. prob=0.919]


[outer 032] TRAIN (EMA+K-ens) ll=0.6945  br=0.2502  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 18.83it/s, step size=2.86e-01, acc. prob=0.928]


[outer 033] TRAIN (EMA+K-ens) ll=0.6875  br=0.2468  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.92it/s, step size=2.80e-01, acc. prob=0.935]


[outer 034] TRAIN (EMA+K-ens) ll=0.6856  br=0.2459  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=3.42e-01, acc. prob=0.915]


[outer 035] TRAIN (EMA+K-ens) ll=0.6838  br=0.2450  acc=0.6770


Sample: 100%|██████████| 330/330 [00:18, 18.13it/s, step size=2.95e-01, acc. prob=0.948]


[outer 036] TRAIN (EMA+K-ens) ll=0.6860  br=0.2461  acc=0.6730


Sample: 100%|██████████| 330/330 [00:18, 18.05it/s, step size=3.01e-01, acc. prob=0.922]


[outer 037] TRAIN (EMA+K-ens) ll=0.6838  br=0.2450  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 17.52it/s, step size=2.66e-01, acc. prob=0.941]


[outer 038] TRAIN (EMA+K-ens) ll=0.6872  br=0.2466  acc=0.6620


Sample: 100%|██████████| 330/330 [00:17, 19.28it/s, step size=2.70e-01, acc. prob=0.939]


[outer 039] TRAIN (EMA+K-ens) ll=0.6885  br=0.2472  acc=0.6510
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}]


Sample: 100%|██████████| 330/330 [00:17, 18.36it/s, step size=2.85e-01, acc. prob=0.945]


[outer 000] TRAIN (EMA+K-ens) ll=0.6961  br=0.2508  acc=0.5860


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=2.61e-01, acc. prob=0.960]


[outer 001] TRAIN (EMA+K-ens) ll=0.6781  br=0.2422  acc=0.6510


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=2.71e-01, acc. prob=0.958]


[outer 002] TRAIN (EMA+K-ens) ll=0.6766  br=0.2415  acc=0.6570


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=2.97e-01, acc. prob=0.939]


[outer 003] TRAIN (EMA+K-ens) ll=0.6846  br=0.2453  acc=0.6330


Sample: 100%|██████████| 330/330 [00:16, 20.04it/s, step size=2.92e-01, acc. prob=0.927]


[outer 004] TRAIN (EMA+K-ens) ll=0.6739  br=0.2402  acc=0.6700


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=3.08e-01, acc. prob=0.945]


[outer 005] TRAIN (EMA+K-ens) ll=0.6757  br=0.2410  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 19.63it/s, step size=2.89e-01, acc. prob=0.947]


[outer 006] TRAIN (EMA+K-ens) ll=0.6812  br=0.2436  acc=0.6780


Sample: 100%|██████████| 330/330 [00:16, 19.75it/s, step size=3.38e-01, acc. prob=0.926]


[outer 007] TRAIN (EMA+K-ens) ll=0.6788  br=0.2424  acc=0.6930


Sample: 100%|██████████| 330/330 [00:18, 18.20it/s, step size=3.07e-01, acc. prob=0.941]


[outer 008] TRAIN (EMA+K-ens) ll=0.6798  br=0.2429  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=2.70e-01, acc. prob=0.949]


[outer 009] TRAIN (EMA+K-ens) ll=0.6824  br=0.2441  acc=0.6260


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=2.88e-01, acc. prob=0.925]


[outer 010] TRAIN (EMA+K-ens) ll=0.6860  br=0.2458  acc=0.6270


Sample: 100%|██████████| 330/330 [00:17, 18.91it/s, step size=3.17e-01, acc. prob=0.918]


[outer 011] TRAIN (EMA+K-ens) ll=0.6822  br=0.2439  acc=0.6230


Sample: 100%|██████████| 330/330 [00:16, 20.02it/s, step size=2.48e-01, acc. prob=0.964]


[outer 012] TRAIN (EMA+K-ens) ll=0.6901  br=0.2477  acc=0.6240


Sample: 100%|██████████| 330/330 [00:17, 19.08it/s, step size=2.80e-01, acc. prob=0.973]


[outer 013] TRAIN (EMA+K-ens) ll=0.6896  br=0.2476  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 18.91it/s, step size=3.47e-01, acc. prob=0.901]


[outer 014] TRAIN (EMA+K-ens) ll=0.6837  br=0.2449  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 18.69it/s, step size=2.98e-01, acc. prob=0.940]


[outer 015] TRAIN (EMA+K-ens) ll=0.6836  br=0.2448  acc=0.6460


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=3.09e-01, acc. prob=0.932]


[outer 016] TRAIN (EMA+K-ens) ll=0.6821  br=0.2440  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 19.23it/s, step size=2.98e-01, acc. prob=0.956]


[outer 017] TRAIN (EMA+K-ens) ll=0.6853  br=0.2456  acc=0.6610


Sample: 100%|██████████| 330/330 [00:17, 18.35it/s, step size=2.87e-01, acc. prob=0.945]


[outer 018] TRAIN (EMA+K-ens) ll=0.6785  br=0.2424  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.30e-01, acc. prob=0.948]


[outer 019] TRAIN (EMA+K-ens) ll=0.6764  br=0.2414  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=2.52e-01, acc. prob=0.952]


[outer 020] TRAIN (EMA+K-ens) ll=0.6722  br=0.2394  acc=0.6860


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=3.18e-01, acc. prob=0.923]


[outer 021] TRAIN (EMA+K-ens) ll=0.6634  br=0.2350  acc=0.6890


Sample: 100%|██████████| 330/330 [00:16, 19.87it/s, step size=2.69e-01, acc. prob=0.967]


[outer 022] TRAIN (EMA+K-ens) ll=0.6643  br=0.2355  acc=0.6830


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=3.56e-01, acc. prob=0.913]


[outer 023] TRAIN (EMA+K-ens) ll=0.6662  br=0.2364  acc=0.6810


Sample: 100%|██████████| 330/330 [00:17, 18.46it/s, step size=3.42e-01, acc. prob=0.925]


[outer 024] TRAIN (EMA+K-ens) ll=0.6645  br=0.2355  acc=0.6830


Sample: 100%|██████████| 330/330 [00:18, 17.54it/s, step size=3.22e-01, acc. prob=0.948]


[outer 025] TRAIN (EMA+K-ens) ll=0.6631  br=0.2349  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 21.69it/s, step size=3.27e-01, acc. prob=0.923]


[outer 026] TRAIN (EMA+K-ens) ll=0.6659  br=0.2361  acc=0.6860


Sample: 100%|██████████| 330/330 [00:18, 17.53it/s, step size=2.63e-01, acc. prob=0.964]


[outer 027] TRAIN (EMA+K-ens) ll=0.6688  br=0.2375  acc=0.6640


Sample: 100%|██████████| 330/330 [00:18, 18.08it/s, step size=2.87e-01, acc. prob=0.956]


[outer 028] TRAIN (EMA+K-ens) ll=0.6718  br=0.2389  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 20.57it/s, step size=3.06e-01, acc. prob=0.950]


[outer 029] TRAIN (EMA+K-ens) ll=0.6746  br=0.2402  acc=0.6760


Sample: 100%|██████████| 330/330 [00:16, 20.18it/s, step size=3.00e-01, acc. prob=0.940]


[outer 030] TRAIN (EMA+K-ens) ll=0.6775  br=0.2415  acc=0.6890


Sample: 100%|██████████| 330/330 [00:18, 17.56it/s, step size=2.58e-01, acc. prob=0.949]


[outer 031] TRAIN (EMA+K-ens) ll=0.6803  br=0.2427  acc=0.6850


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=2.88e-01, acc. prob=0.947]


[outer 032] TRAIN (EMA+K-ens) ll=0.6814  br=0.2434  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=3.03e-01, acc. prob=0.950]


[outer 033] TRAIN (EMA+K-ens) ll=0.6794  br=0.2423  acc=0.6810


Sample: 100%|██████████| 330/330 [00:15, 20.95it/s, step size=3.37e-01, acc. prob=0.912]


[outer 034] TRAIN (EMA+K-ens) ll=0.6804  br=0.2429  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 19.56it/s, step size=2.29e-01, acc. prob=0.954]


[outer 035] TRAIN (EMA+K-ens) ll=0.6793  br=0.2423  acc=0.6640


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=3.02e-01, acc. prob=0.946]


[outer 036] TRAIN (EMA+K-ens) ll=0.6768  br=0.2410  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 20.61it/s, step size=2.25e-01, acc. prob=0.970]


[outer 037] TRAIN (EMA+K-ens) ll=0.6850  br=0.2450  acc=0.6510


Sample: 100%|██████████| 330/330 [00:18, 17.66it/s, step size=2.74e-01, acc. prob=0.939]


[outer 038] TRAIN (EMA+K-ens) ll=0.6775  br=0.2416  acc=0.6530


Sample: 100%|██████████| 330/330 [00:18, 17.38it/s, step size=2.79e-01, acc. prob=0.946]


[outer 039] TRAIN (EMA+K-ens) ll=0.6729  br=0.2393  acc=0.6600
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}]


Sample: 100%|██████████| 330/330 [00:19, 17.24it/s, step size=2.68e-01, acc. prob=0.945]


[outer 000] TRAIN (EMA+K-ens) ll=0.6621  br=0.2332  acc=0.6040


Sample: 100%|██████████| 330/330 [00:15, 20.84it/s, step size=3.01e-01, acc. prob=0.945]


[outer 001] TRAIN (EMA+K-ens) ll=0.6433  br=0.2253  acc=0.6300


Sample: 100%|██████████| 330/330 [00:17, 19.30it/s, step size=3.06e-01, acc. prob=0.930]


[outer 002] TRAIN (EMA+K-ens) ll=0.6410  br=0.2241  acc=0.6720


Sample: 100%|██████████| 330/330 [00:15, 20.76it/s, step size=3.12e-01, acc. prob=0.930]


[outer 003] TRAIN (EMA+K-ens) ll=0.6485  br=0.2278  acc=0.6430


Sample: 100%|██████████| 330/330 [00:16, 19.76it/s, step size=3.18e-01, acc. prob=0.929]


[outer 004] TRAIN (EMA+K-ens) ll=0.6524  br=0.2295  acc=0.6840


Sample: 100%|██████████| 330/330 [00:17, 18.89it/s, step size=2.93e-01, acc. prob=0.948]


[outer 005] TRAIN (EMA+K-ens) ll=0.6560  br=0.2313  acc=0.6570


Sample: 100%|██████████| 330/330 [00:17, 19.31it/s, step size=2.87e-01, acc. prob=0.937]


[outer 006] TRAIN (EMA+K-ens) ll=0.6586  br=0.2324  acc=0.6530


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=2.82e-01, acc. prob=0.956]


[outer 007] TRAIN (EMA+K-ens) ll=0.6613  br=0.2336  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 20.16it/s, step size=2.67e-01, acc. prob=0.948]


[outer 008] TRAIN (EMA+K-ens) ll=0.6682  br=0.2371  acc=0.6860


Sample: 100%|██████████| 330/330 [00:17, 18.70it/s, step size=3.15e-01, acc. prob=0.931]


[outer 009] TRAIN (EMA+K-ens) ll=0.6768  br=0.2413  acc=0.6900


Sample: 100%|██████████| 330/330 [00:17, 19.36it/s, step size=3.11e-01, acc. prob=0.896]


[outer 010] TRAIN (EMA+K-ens) ll=0.6786  br=0.2423  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.03e-01, acc. prob=0.949]


[outer 011] TRAIN (EMA+K-ens) ll=0.6781  br=0.2420  acc=0.6890


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=2.84e-01, acc. prob=0.924]


[outer 012] TRAIN (EMA+K-ens) ll=0.6743  br=0.2402  acc=0.6990


Sample: 100%|██████████| 330/330 [00:18, 18.27it/s, step size=2.53e-01, acc. prob=0.962]


[outer 013] TRAIN (EMA+K-ens) ll=0.6753  br=0.2407  acc=0.7000


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=3.14e-01, acc. prob=0.936]


[outer 014] TRAIN (EMA+K-ens) ll=0.6754  br=0.2407  acc=0.7030


Sample: 100%|██████████| 330/330 [00:17, 19.09it/s, step size=3.28e-01, acc. prob=0.912]


[outer 015] TRAIN (EMA+K-ens) ll=0.6762  br=0.2412  acc=0.7030


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=2.83e-01, acc. prob=0.921]


[outer 016] TRAIN (EMA+K-ens) ll=0.6777  br=0.2420  acc=0.6980


Sample: 100%|██████████| 330/330 [00:18, 18.17it/s, step size=2.86e-01, acc. prob=0.931]


[outer 017] TRAIN (EMA+K-ens) ll=0.6767  br=0.2415  acc=0.6930


Sample: 100%|██████████| 330/330 [00:17, 19.18it/s, step size=2.79e-01, acc. prob=0.949]


[outer 018] TRAIN (EMA+K-ens) ll=0.6784  br=0.2423  acc=0.6710


Sample: 100%|██████████| 330/330 [00:17, 18.60it/s, step size=2.50e-01, acc. prob=0.936]


[outer 019] TRAIN (EMA+K-ens) ll=0.6799  br=0.2429  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 18.32it/s, step size=3.15e-01, acc. prob=0.913]


[outer 020] TRAIN (EMA+K-ens) ll=0.6812  br=0.2436  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 20.05it/s, step size=2.78e-01, acc. prob=0.956]


[outer 021] TRAIN (EMA+K-ens) ll=0.6793  br=0.2427  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 18.33it/s, step size=2.87e-01, acc. prob=0.943]


[outer 022] TRAIN (EMA+K-ens) ll=0.6798  br=0.2429  acc=0.6610


Sample: 100%|██████████| 330/330 [00:15, 20.92it/s, step size=3.15e-01, acc. prob=0.951]


[outer 023] TRAIN (EMA+K-ens) ll=0.6826  br=0.2442  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 18.72it/s, step size=3.35e-01, acc. prob=0.901]


[outer 024] TRAIN (EMA+K-ens) ll=0.6781  br=0.2421  acc=0.6750


Sample: 100%|██████████| 330/330 [00:15, 20.78it/s, step size=2.78e-01, acc. prob=0.935]


[outer 025] TRAIN (EMA+K-ens) ll=0.6677  br=0.2371  acc=0.6900


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=3.11e-01, acc. prob=0.950]


[outer 026] TRAIN (EMA+K-ens) ll=0.6644  br=0.2355  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=2.97e-01, acc. prob=0.949]


[outer 027] TRAIN (EMA+K-ens) ll=0.6652  br=0.2359  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 18.96it/s, step size=3.05e-01, acc. prob=0.926]


[outer 028] TRAIN (EMA+K-ens) ll=0.6648  br=0.2356  acc=0.6940


Sample: 100%|██████████| 330/330 [00:16, 19.80it/s, step size=2.65e-01, acc. prob=0.959]


[outer 029] TRAIN (EMA+K-ens) ll=0.6710  br=0.2386  acc=0.6950


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=3.12e-01, acc. prob=0.897]


[outer 030] TRAIN (EMA+K-ens) ll=0.6691  br=0.2376  acc=0.6960


Sample: 100%|██████████| 330/330 [00:17, 18.78it/s, step size=3.08e-01, acc. prob=0.941]


[outer 031] TRAIN (EMA+K-ens) ll=0.6661  br=0.2362  acc=0.6990


Sample: 100%|██████████| 330/330 [00:17, 19.22it/s, step size=3.16e-01, acc. prob=0.947]


[outer 032] TRAIN (EMA+K-ens) ll=0.6692  br=0.2376  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=3.22e-01, acc. prob=0.922]


[outer 033] TRAIN (EMA+K-ens) ll=0.6723  br=0.2392  acc=0.6990


Sample: 100%|██████████| 330/330 [00:16, 19.42it/s, step size=2.69e-01, acc. prob=0.957]


[outer 034] TRAIN (EMA+K-ens) ll=0.6829  br=0.2444  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 18.83it/s, step size=2.96e-01, acc. prob=0.929]


[outer 035] TRAIN (EMA+K-ens) ll=0.6831  br=0.2445  acc=0.6890


Sample: 100%|██████████| 330/330 [00:18, 18.17it/s, step size=3.03e-01, acc. prob=0.951]


[outer 036] TRAIN (EMA+K-ens) ll=0.6844  br=0.2451  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 20.83it/s, step size=3.03e-01, acc. prob=0.925]


[outer 037] TRAIN (EMA+K-ens) ll=0.6814  br=0.2437  acc=0.6700


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=3.13e-01, acc. prob=0.930]


[outer 038] TRAIN (EMA+K-ens) ll=0.6821  br=0.2440  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 18.81it/s, step size=3.46e-01, acc. prob=0.907]


[outer 039] TRAIN (EMA+K-ens) ll=0.6797  br=0.2429  acc=0.6420
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}, {'accuracy': 0.6173799633979797, 'brier': 0.2418910264968872, 'logloss': 0.6784679293632507}]


Sample: 100%|██████████| 330/330 [00:16, 19.81it/s, step size=2.48e-01, acc. prob=0.961]


[outer 000] TRAIN (EMA+K-ens) ll=0.7692  br=0.2853  acc=0.3930


Sample: 100%|██████████| 330/330 [00:18, 18.19it/s, step size=2.54e-01, acc. prob=0.969]


[outer 001] TRAIN (EMA+K-ens) ll=0.7191  br=0.2623  acc=0.4730


Sample: 100%|██████████| 330/330 [00:18, 17.56it/s, step size=3.43e-01, acc. prob=0.894]


[outer 002] TRAIN (EMA+K-ens) ll=0.7100  br=0.2579  acc=0.4790


Sample: 100%|██████████| 330/330 [00:17, 19.28it/s, step size=3.36e-01, acc. prob=0.926]


[outer 003] TRAIN (EMA+K-ens) ll=0.7016  br=0.2538  acc=0.5240


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=3.32e-01, acc. prob=0.899]


[outer 004] TRAIN (EMA+K-ens) ll=0.6960  br=0.2510  acc=0.5630


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=3.22e-01, acc. prob=0.915]


[outer 005] TRAIN (EMA+K-ens) ll=0.6921  br=0.2492  acc=0.5640


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.72e-01, acc. prob=0.949]


[outer 006] TRAIN (EMA+K-ens) ll=0.6882  br=0.2473  acc=0.6080


Sample: 100%|██████████| 330/330 [00:19, 17.18it/s, step size=3.07e-01, acc. prob=0.922]


[outer 007] TRAIN (EMA+K-ens) ll=0.6858  br=0.2461  acc=0.6270


Sample: 100%|██████████| 330/330 [00:16, 20.23it/s, step size=2.70e-01, acc. prob=0.954]


[outer 008] TRAIN (EMA+K-ens) ll=0.6774  br=0.2420  acc=0.6520


Sample: 100%|██████████| 330/330 [00:18, 18.27it/s, step size=3.52e-01, acc. prob=0.913]


[outer 009] TRAIN (EMA+K-ens) ll=0.6737  br=0.2402  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.86it/s, step size=2.83e-01, acc. prob=0.957]


[outer 010] TRAIN (EMA+K-ens) ll=0.6756  br=0.2411  acc=0.6600


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=2.92e-01, acc. prob=0.928]


[outer 011] TRAIN (EMA+K-ens) ll=0.6733  br=0.2400  acc=0.6510


Sample: 100%|██████████| 330/330 [00:19, 16.61it/s, step size=2.34e-01, acc. prob=0.962]


[outer 012] TRAIN (EMA+K-ens) ll=0.6752  br=0.2409  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.27e-01, acc. prob=0.933]


[outer 013] TRAIN (EMA+K-ens) ll=0.6763  br=0.2415  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 19.14it/s, step size=2.88e-01, acc. prob=0.930]


[outer 014] TRAIN (EMA+K-ens) ll=0.6743  br=0.2405  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.80it/s, step size=2.74e-01, acc. prob=0.927]


[outer 015] TRAIN (EMA+K-ens) ll=0.6766  br=0.2415  acc=0.6570


Sample: 100%|██████████| 330/330 [00:17, 18.64it/s, step size=2.94e-01, acc. prob=0.925]


[outer 016] TRAIN (EMA+K-ens) ll=0.6784  br=0.2424  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.86e-01, acc. prob=0.960]


[outer 017] TRAIN (EMA+K-ens) ll=0.6812  br=0.2438  acc=0.6590


Sample: 100%|██████████| 330/330 [00:18, 18.21it/s, step size=2.70e-01, acc. prob=0.935]


[outer 018] TRAIN (EMA+K-ens) ll=0.6766  br=0.2416  acc=0.6570


Sample: 100%|██████████| 330/330 [00:15, 20.92it/s, step size=3.37e-01, acc. prob=0.919]


[outer 019] TRAIN (EMA+K-ens) ll=0.6818  br=0.2441  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 19.66it/s, step size=3.31e-01, acc. prob=0.887]


[outer 020] TRAIN (EMA+K-ens) ll=0.6785  br=0.2424  acc=0.6150


Sample: 100%|██████████| 330/330 [00:17, 18.70it/s, step size=2.44e-01, acc. prob=0.946]


[outer 021] TRAIN (EMA+K-ens) ll=0.6802  br=0.2431  acc=0.6080


Sample: 100%|██████████| 330/330 [00:17, 18.47it/s, step size=2.47e-01, acc. prob=0.958]


[outer 022] TRAIN (EMA+K-ens) ll=0.6795  br=0.2428  acc=0.6000


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=3.55e-01, acc. prob=0.933]


[outer 023] TRAIN (EMA+K-ens) ll=0.6782  br=0.2423  acc=0.6030


Sample: 100%|██████████| 330/330 [00:18, 17.81it/s, step size=2.60e-01, acc. prob=0.947]


[outer 024] TRAIN (EMA+K-ens) ll=0.6708  br=0.2386  acc=0.6240


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=2.50e-01, acc. prob=0.948]


[outer 025] TRAIN (EMA+K-ens) ll=0.6696  br=0.2379  acc=0.6290


Sample: 100%|██████████| 330/330 [00:17, 19.05it/s, step size=2.46e-01, acc. prob=0.955]


[outer 026] TRAIN (EMA+K-ens) ll=0.6716  br=0.2389  acc=0.6330


Sample: 100%|██████████| 330/330 [00:18, 18.33it/s, step size=3.31e-01, acc. prob=0.920]


[outer 027] TRAIN (EMA+K-ens) ll=0.6689  br=0.2376  acc=0.6360


Sample: 100%|██████████| 330/330 [00:18, 17.63it/s, step size=2.86e-01, acc. prob=0.935]


[outer 028] TRAIN (EMA+K-ens) ll=0.6729  br=0.2395  acc=0.6340


Sample: 100%|██████████| 330/330 [00:16, 20.49it/s, step size=3.37e-01, acc. prob=0.892]


[outer 029] TRAIN (EMA+K-ens) ll=0.6684  br=0.2374  acc=0.6390


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=2.87e-01, acc. prob=0.924]


[outer 030] TRAIN (EMA+K-ens) ll=0.6683  br=0.2374  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=3.38e-01, acc. prob=0.913]


[outer 031] TRAIN (EMA+K-ens) ll=0.6699  br=0.2381  acc=0.6630


Sample: 100%|██████████| 330/330 [00:16, 20.56it/s, step size=3.00e-01, acc. prob=0.921]


[outer 032] TRAIN (EMA+K-ens) ll=0.6780  br=0.2421  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 20.28it/s, step size=3.17e-01, acc. prob=0.942]


[outer 033] TRAIN (EMA+K-ens) ll=0.6799  br=0.2431  acc=0.6340


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=2.72e-01, acc. prob=0.931]


[outer 034] TRAIN (EMA+K-ens) ll=0.6768  br=0.2416  acc=0.6470


Sample: 100%|██████████| 330/330 [00:19, 16.62it/s, step size=2.95e-01, acc. prob=0.917]


[outer 035] TRAIN (EMA+K-ens) ll=0.6782  br=0.2424  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 18.95it/s, step size=3.17e-01, acc. prob=0.935]


[outer 036] TRAIN (EMA+K-ens) ll=0.6759  br=0.2413  acc=0.6260


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.26e-01, acc. prob=0.930]


[outer 037] TRAIN (EMA+K-ens) ll=0.6802  br=0.2433  acc=0.6180


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=2.74e-01, acc. prob=0.916]


[outer 038] TRAIN (EMA+K-ens) ll=0.6845  br=0.2452  acc=0.6160


Sample: 100%|██████████| 330/330 [00:16, 20.21it/s, step size=3.28e-01, acc. prob=0.937]


[outer 039] TRAIN (EMA+K-ens) ll=0.6850  br=0.2455  acc=0.6110
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}, {'accuracy': 0.6173799633979797, 'brier': 0.2418910264968872, 'logloss': 0.6784679293632507}, {'accuracy': 0.6060400009155273, 'brier': 0.24561749398708344, 'logloss': 0.6852532625198364}]


Sample: 100%|██████████| 330/330 [00:18, 18.06it/s, step size=2.76e-01, acc. prob=0.959]


[outer 000] TRAIN (EMA+K-ens) ll=0.7181  br=0.2620  acc=0.4960


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=3.06e-01, acc. prob=0.933]


[outer 001] TRAIN (EMA+K-ens) ll=0.7006  br=0.2533  acc=0.5310


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.59e-01, acc. prob=0.919]


[outer 002] TRAIN (EMA+K-ens) ll=0.6883  br=0.2474  acc=0.5830


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=2.76e-01, acc. prob=0.960]


[outer 003] TRAIN (EMA+K-ens) ll=0.6958  br=0.2511  acc=0.5970


Sample: 100%|██████████| 330/330 [00:19, 17.30it/s, step size=2.69e-01, acc. prob=0.946]


[outer 004] TRAIN (EMA+K-ens) ll=0.6942  br=0.2503  acc=0.6020


Sample: 100%|██████████| 330/330 [00:18, 18.12it/s, step size=2.82e-01, acc. prob=0.943]


[outer 005] TRAIN (EMA+K-ens) ll=0.6991  br=0.2527  acc=0.5800


Sample: 100%|██████████| 330/330 [00:17, 18.96it/s, step size=2.86e-01, acc. prob=0.931]


[outer 006] TRAIN (EMA+K-ens) ll=0.6979  br=0.2521  acc=0.5870


Sample: 100%|██████████| 330/330 [00:16, 19.64it/s, step size=3.01e-01, acc. prob=0.922]


[outer 007] TRAIN (EMA+K-ens) ll=0.6947  br=0.2505  acc=0.5970


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=2.61e-01, acc. prob=0.944]


[outer 008] TRAIN (EMA+K-ens) ll=0.6934  br=0.2498  acc=0.6100


Sample: 100%|██████████| 330/330 [00:18, 18.08it/s, step size=2.72e-01, acc. prob=0.950]


[outer 009] TRAIN (EMA+K-ens) ll=0.6936  br=0.2499  acc=0.6130


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=2.48e-01, acc. prob=0.949]


[outer 010] TRAIN (EMA+K-ens) ll=0.6994  br=0.2528  acc=0.5870


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.78e-01, acc. prob=0.939]


[outer 011] TRAIN (EMA+K-ens) ll=0.6940  br=0.2501  acc=0.6080


Sample: 100%|██████████| 330/330 [00:18, 17.79it/s, step size=2.52e-01, acc. prob=0.932]


[outer 012] TRAIN (EMA+K-ens) ll=0.6968  br=0.2515  acc=0.6020


Sample: 100%|██████████| 330/330 [00:17, 18.52it/s, step size=2.78e-01, acc. prob=0.941]


[outer 013] TRAIN (EMA+K-ens) ll=0.6924  br=0.2492  acc=0.6230


Sample: 100%|██████████| 330/330 [00:17, 18.45it/s, step size=3.43e-01, acc. prob=0.934]


[outer 014] TRAIN (EMA+K-ens) ll=0.6938  br=0.2500  acc=0.6100


Sample: 100%|██████████| 330/330 [00:18, 17.80it/s, step size=2.61e-01, acc. prob=0.922]


[outer 015] TRAIN (EMA+K-ens) ll=0.6904  br=0.2484  acc=0.6210


Sample: 100%|██████████| 330/330 [00:18, 17.54it/s, step size=3.17e-01, acc. prob=0.925]


[outer 016] TRAIN (EMA+K-ens) ll=0.6855  br=0.2460  acc=0.6470


Sample: 100%|██████████| 330/330 [00:18, 17.50it/s, step size=2.93e-01, acc. prob=0.930]


[outer 017] TRAIN (EMA+K-ens) ll=0.6830  br=0.2447  acc=0.6240


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=2.90e-01, acc. prob=0.913]


[outer 018] TRAIN (EMA+K-ens) ll=0.6763  br=0.2414  acc=0.6240


Sample: 100%|██████████| 330/330 [00:18, 18.25it/s, step size=2.81e-01, acc. prob=0.946]


[outer 019] TRAIN (EMA+K-ens) ll=0.6792  br=0.2428  acc=0.6300


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=2.79e-01, acc. prob=0.946]


[outer 020] TRAIN (EMA+K-ens) ll=0.6760  br=0.2412  acc=0.6370


Sample: 100%|██████████| 330/330 [00:16, 19.85it/s, step size=2.80e-01, acc. prob=0.945]


[outer 021] TRAIN (EMA+K-ens) ll=0.6755  br=0.2411  acc=0.6230


Sample: 100%|██████████| 330/330 [00:17, 19.08it/s, step size=3.24e-01, acc. prob=0.949]


[outer 022] TRAIN (EMA+K-ens) ll=0.6704  br=0.2385  acc=0.6210


Sample: 100%|██████████| 330/330 [00:17, 18.79it/s, step size=2.81e-01, acc. prob=0.947]


[outer 023] TRAIN (EMA+K-ens) ll=0.6788  br=0.2426  acc=0.6110


Sample: 100%|██████████| 330/330 [00:17, 19.19it/s, step size=2.53e-01, acc. prob=0.949]


[outer 024] TRAIN (EMA+K-ens) ll=0.6803  br=0.2434  acc=0.6110


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=2.75e-01, acc. prob=0.914]


[outer 025] TRAIN (EMA+K-ens) ll=0.6806  br=0.2435  acc=0.6120


Sample: 100%|██████████| 330/330 [00:16, 20.22it/s, step size=2.91e-01, acc. prob=0.938]


[outer 026] TRAIN (EMA+K-ens) ll=0.6853  br=0.2459  acc=0.6170


Sample: 100%|██████████| 330/330 [00:18, 17.51it/s, step size=3.12e-01, acc. prob=0.935]


[outer 027] TRAIN (EMA+K-ens) ll=0.6817  br=0.2441  acc=0.6360


Sample: 100%|██████████| 330/330 [00:17, 18.96it/s, step size=2.62e-01, acc. prob=0.944]


[outer 028] TRAIN (EMA+K-ens) ll=0.6827  br=0.2447  acc=0.6210


Sample: 100%|██████████| 330/330 [00:18, 18.22it/s, step size=3.63e-01, acc. prob=0.914]


[outer 029] TRAIN (EMA+K-ens) ll=0.6819  br=0.2443  acc=0.6060


Sample: 100%|██████████| 330/330 [00:16, 19.52it/s, step size=3.57e-01, acc. prob=0.901]


[outer 030] TRAIN (EMA+K-ens) ll=0.6835  br=0.2451  acc=0.6220


Sample: 100%|██████████| 330/330 [00:18, 17.58it/s, step size=2.40e-01, acc. prob=0.957]


[outer 031] TRAIN (EMA+K-ens) ll=0.6809  br=0.2438  acc=0.6010


Sample: 100%|██████████| 330/330 [00:17, 18.39it/s, step size=2.66e-01, acc. prob=0.950]


[outer 032] TRAIN (EMA+K-ens) ll=0.6830  br=0.2448  acc=0.5910


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=3.20e-01, acc. prob=0.921]


[outer 033] TRAIN (EMA+K-ens) ll=0.6816  br=0.2441  acc=0.6070


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=3.15e-01, acc. prob=0.913]


[outer 034] TRAIN (EMA+K-ens) ll=0.6788  br=0.2427  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.08e-01, acc. prob=0.933]


[outer 035] TRAIN (EMA+K-ens) ll=0.6826  br=0.2445  acc=0.6240


Sample: 100%|██████████| 330/330 [00:18, 18.17it/s, step size=3.36e-01, acc. prob=0.929]


[outer 036] TRAIN (EMA+K-ens) ll=0.6882  br=0.2471  acc=0.6080


Sample: 100%|██████████| 330/330 [00:18, 17.77it/s, step size=2.55e-01, acc. prob=0.956]


[outer 037] TRAIN (EMA+K-ens) ll=0.6883  br=0.2471  acc=0.6040


Sample: 100%|██████████| 330/330 [00:17, 18.80it/s, step size=2.61e-01, acc. prob=0.938]


[outer 038] TRAIN (EMA+K-ens) ll=0.6898  br=0.2477  acc=0.6120


Sample: 100%|██████████| 330/330 [00:17, 19.31it/s, step size=3.05e-01, acc. prob=0.898]


[outer 039] TRAIN (EMA+K-ens) ll=0.6868  br=0.2463  acc=0.6060
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}, {'accuracy': 0.6173799633979797, 'brier': 0.2418910264968872, 'logloss': 0.6784679293632507}, {'accuracy': 0.6060400009155273, 'brier': 0.24561749398708344, 'logloss': 0.6852532625198364}, {'accuracy': 0.6311799883842468, 'brier': 0.23762410879135132, 'logloss': 0.6682534217834473}]


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=3.14e-01, acc. prob=0.931]


[outer 000] TRAIN (EMA+K-ens) ll=0.7175  br=0.2614  acc=0.4750


Sample: 100%|██████████| 330/330 [00:18, 18.08it/s, step size=3.01e-01, acc. prob=0.940]


[outer 001] TRAIN (EMA+K-ens) ll=0.6952  br=0.2507  acc=0.6000


Sample: 100%|██████████| 330/330 [00:17, 18.56it/s, step size=3.14e-01, acc. prob=0.952]


[outer 002] TRAIN (EMA+K-ens) ll=0.6874  br=0.2468  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 20.32it/s, step size=3.18e-01, acc. prob=0.921]


[outer 003] TRAIN (EMA+K-ens) ll=0.6866  br=0.2464  acc=0.6270


Sample: 100%|██████████| 330/330 [00:15, 21.76it/s, step size=3.08e-01, acc. prob=0.929]


[outer 004] TRAIN (EMA+K-ens) ll=0.6805  br=0.2434  acc=0.6520


Sample: 100%|██████████| 330/330 [00:19, 17.31it/s, step size=3.10e-01, acc. prob=0.921]


[outer 005] TRAIN (EMA+K-ens) ll=0.6781  br=0.2423  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=3.12e-01, acc. prob=0.924]


[outer 006] TRAIN (EMA+K-ens) ll=0.6833  br=0.2449  acc=0.6360


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.77e-01, acc. prob=0.952]


[outer 007] TRAIN (EMA+K-ens) ll=0.6833  br=0.2449  acc=0.6500


Sample: 100%|██████████| 330/330 [00:16, 19.59it/s, step size=2.67e-01, acc. prob=0.938]


[outer 008] TRAIN (EMA+K-ens) ll=0.6794  br=0.2430  acc=0.6480


Sample: 100%|██████████| 330/330 [00:18, 17.61it/s, step size=2.73e-01, acc. prob=0.956]


[outer 009] TRAIN (EMA+K-ens) ll=0.6810  br=0.2436  acc=0.6400


Sample: 100%|██████████| 330/330 [00:18, 17.68it/s, step size=2.97e-01, acc. prob=0.933]


[outer 010] TRAIN (EMA+K-ens) ll=0.6824  br=0.2443  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=3.14e-01, acc. prob=0.917]


[outer 011] TRAIN (EMA+K-ens) ll=0.6833  br=0.2447  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 18.47it/s, step size=2.72e-01, acc. prob=0.970]


[outer 012] TRAIN (EMA+K-ens) ll=0.6899  br=0.2480  acc=0.6010


Sample: 100%|██████████| 330/330 [00:16, 20.36it/s, step size=3.18e-01, acc. prob=0.941]


[outer 013] TRAIN (EMA+K-ens) ll=0.6907  br=0.2481  acc=0.6320


Sample: 100%|██████████| 330/330 [00:18, 18.03it/s, step size=3.23e-01, acc. prob=0.948]


[outer 014] TRAIN (EMA+K-ens) ll=0.6847  br=0.2452  acc=0.6170


Sample: 100%|██████████| 330/330 [00:16, 19.85it/s, step size=2.76e-01, acc. prob=0.925]


[outer 015] TRAIN (EMA+K-ens) ll=0.6880  br=0.2467  acc=0.6150


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=3.03e-01, acc. prob=0.927]


[outer 016] TRAIN (EMA+K-ens) ll=0.6885  br=0.2469  acc=0.6380


Sample: 100%|██████████| 330/330 [00:19, 16.65it/s, step size=2.68e-01, acc. prob=0.953]


[outer 017] TRAIN (EMA+K-ens) ll=0.6878  br=0.2466  acc=0.6500


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=2.77e-01, acc. prob=0.946]


[outer 018] TRAIN (EMA+K-ens) ll=0.6880  br=0.2467  acc=0.6270


Sample: 100%|██████████| 330/330 [00:17, 19.23it/s, step size=2.88e-01, acc. prob=0.955]


[outer 019] TRAIN (EMA+K-ens) ll=0.6882  br=0.2469  acc=0.6260


Sample: 100%|██████████| 330/330 [00:16, 20.59it/s, step size=3.07e-01, acc. prob=0.938]


[outer 020] TRAIN (EMA+K-ens) ll=0.6854  br=0.2455  acc=0.6400


Sample: 100%|██████████| 330/330 [00:18, 17.88it/s, step size=2.38e-01, acc. prob=0.947]


[outer 021] TRAIN (EMA+K-ens) ll=0.6870  br=0.2463  acc=0.6720


Sample: 100%|██████████| 330/330 [00:18, 18.00it/s, step size=2.47e-01, acc. prob=0.957]


[outer 022] TRAIN (EMA+K-ens) ll=0.6867  br=0.2463  acc=0.6690


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.02e-01, acc. prob=0.936]


[outer 023] TRAIN (EMA+K-ens) ll=0.6818  br=0.2440  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 19.25it/s, step size=2.87e-01, acc. prob=0.934]


[outer 024] TRAIN (EMA+K-ens) ll=0.6747  br=0.2404  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 19.18it/s, step size=2.84e-01, acc. prob=0.941]


[outer 025] TRAIN (EMA+K-ens) ll=0.6768  br=0.2415  acc=0.6570


Sample: 100%|██████████| 330/330 [00:19, 17.12it/s, step size=2.68e-01, acc. prob=0.945]


[outer 026] TRAIN (EMA+K-ens) ll=0.6799  br=0.2429  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 19.08it/s, step size=2.91e-01, acc. prob=0.908]


[outer 027] TRAIN (EMA+K-ens) ll=0.6797  br=0.2428  acc=0.6430


Sample: 100%|██████████| 330/330 [00:18, 17.38it/s, step size=2.81e-01, acc. prob=0.921]


[outer 028] TRAIN (EMA+K-ens) ll=0.6791  br=0.2425  acc=0.6490


Sample: 100%|██████████| 330/330 [00:16, 20.23it/s, step size=2.62e-01, acc. prob=0.950]


[outer 029] TRAIN (EMA+K-ens) ll=0.6800  br=0.2430  acc=0.6290


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=3.06e-01, acc. prob=0.937]


[outer 030] TRAIN (EMA+K-ens) ll=0.6825  br=0.2440  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=2.65e-01, acc. prob=0.941]


[outer 031] TRAIN (EMA+K-ens) ll=0.6846  br=0.2450  acc=0.6560


Sample: 100%|██████████| 330/330 [00:18, 18.02it/s, step size=3.32e-01, acc. prob=0.922]


[outer 032] TRAIN (EMA+K-ens) ll=0.6943  br=0.2499  acc=0.6330


Sample: 100%|██████████| 330/330 [00:19, 16.96it/s, step size=2.68e-01, acc. prob=0.948]


[outer 033] TRAIN (EMA+K-ens) ll=0.6918  br=0.2486  acc=0.6490


Sample: 100%|██████████| 330/330 [00:17, 19.15it/s, step size=3.46e-01, acc. prob=0.913]


[outer 034] TRAIN (EMA+K-ens) ll=0.6842  br=0.2449  acc=0.6660


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=2.72e-01, acc. prob=0.938]


[outer 035] TRAIN (EMA+K-ens) ll=0.6813  br=0.2436  acc=0.6690


Sample: 100%|██████████| 330/330 [00:19, 17.17it/s, step size=2.38e-01, acc. prob=0.952]


[outer 036] TRAIN (EMA+K-ens) ll=0.6815  br=0.2437  acc=0.6690


Sample: 100%|██████████| 330/330 [00:18, 17.44it/s, step size=2.95e-01, acc. prob=0.943]


[outer 037] TRAIN (EMA+K-ens) ll=0.6810  br=0.2433  acc=0.6690


Sample: 100%|██████████| 330/330 [00:18, 17.42it/s, step size=2.68e-01, acc. prob=0.962]


[outer 038] TRAIN (EMA+K-ens) ll=0.6792  br=0.2424  acc=0.6790


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=3.05e-01, acc. prob=0.953]


[outer 039] TRAIN (EMA+K-ens) ll=0.6751  br=0.2405  acc=0.6890
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}, {'accuracy': 0.6173799633979797, 'brier': 0.2418910264968872, 'logloss': 0.6784679293632507}, {'accuracy': 0.6060400009155273, 'brier': 0.24561749398708344, 'logloss': 0.6852532625198364}, {'accuracy': 0.6311799883842468, 'brier': 0.23762410879135132, 'logloss': 0.6682534217834473}, {'accuracy': 0.6198399662971497, 'brier': 0.24079276621341705, 'logloss': 0.6747647523880005}]


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=2.71e-01, acc. prob=0.948]


[outer 000] TRAIN (EMA+K-ens) ll=0.6804  br=0.2436  acc=0.5610


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=3.29e-01, acc. prob=0.912]


[outer 001] TRAIN (EMA+K-ens) ll=0.6833  br=0.2449  acc=0.6030


Sample: 100%|██████████| 330/330 [00:17, 18.81it/s, step size=2.35e-01, acc. prob=0.962]


[outer 002] TRAIN (EMA+K-ens) ll=0.6867  br=0.2465  acc=0.6540


Sample: 100%|██████████| 330/330 [00:16, 19.43it/s, step size=3.70e-01, acc. prob=0.886]


[outer 003] TRAIN (EMA+K-ens) ll=0.6881  br=0.2472  acc=0.6130


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=2.82e-01, acc. prob=0.911]


[outer 004] TRAIN (EMA+K-ens) ll=0.6915  br=0.2489  acc=0.6240


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=2.65e-01, acc. prob=0.959]


[outer 005] TRAIN (EMA+K-ens) ll=0.6893  br=0.2478  acc=0.6310


Sample: 100%|██████████| 330/330 [00:16, 19.66it/s, step size=2.39e-01, acc. prob=0.963]


[outer 006] TRAIN (EMA+K-ens) ll=0.6964  br=0.2512  acc=0.6270


Sample: 100%|██████████| 330/330 [00:17, 18.39it/s, step size=3.59e-01, acc. prob=0.913]


[outer 007] TRAIN (EMA+K-ens) ll=0.6988  br=0.2524  acc=0.6170


Sample: 100%|██████████| 330/330 [00:18, 18.21it/s, step size=2.35e-01, acc. prob=0.934]


[outer 008] TRAIN (EMA+K-ens) ll=0.6998  br=0.2528  acc=0.6250


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.52e-01, acc. prob=0.954]


[outer 009] TRAIN (EMA+K-ens) ll=0.7022  br=0.2540  acc=0.6000


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=3.10e-01, acc. prob=0.918]


[outer 010] TRAIN (EMA+K-ens) ll=0.6959  br=0.2508  acc=0.5970


Sample: 100%|██████████| 330/330 [00:17, 19.19it/s, step size=3.18e-01, acc. prob=0.948]


[outer 011] TRAIN (EMA+K-ens) ll=0.6943  br=0.2501  acc=0.6150


Sample: 100%|██████████| 330/330 [00:17, 18.57it/s, step size=2.93e-01, acc. prob=0.958]


[outer 012] TRAIN (EMA+K-ens) ll=0.6934  br=0.2496  acc=0.6300


Sample: 100%|██████████| 330/330 [00:17, 18.89it/s, step size=2.61e-01, acc. prob=0.960]


[outer 013] TRAIN (EMA+K-ens) ll=0.6965  br=0.2512  acc=0.5890


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=2.81e-01, acc. prob=0.938]


[outer 014] TRAIN (EMA+K-ens) ll=0.6894  br=0.2478  acc=0.6320


Sample: 100%|██████████| 330/330 [00:16, 20.36it/s, step size=2.94e-01, acc. prob=0.951]


[outer 015] TRAIN (EMA+K-ens) ll=0.6808  br=0.2436  acc=0.6460


Sample: 100%|██████████| 330/330 [00:18, 18.09it/s, step size=3.04e-01, acc. prob=0.933]


[outer 016] TRAIN (EMA+K-ens) ll=0.6809  br=0.2437  acc=0.6420


Sample: 100%|██████████| 330/330 [00:18, 18.09it/s, step size=2.67e-01, acc. prob=0.945]


[outer 017] TRAIN (EMA+K-ens) ll=0.6756  br=0.2411  acc=0.6700


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.77e-01, acc. prob=0.932]


[outer 018] TRAIN (EMA+K-ens) ll=0.6829  br=0.2447  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.56it/s, step size=2.93e-01, acc. prob=0.942]


[outer 019] TRAIN (EMA+K-ens) ll=0.6818  br=0.2441  acc=0.6620


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=2.74e-01, acc. prob=0.955]


[outer 020] TRAIN (EMA+K-ens) ll=0.6751  br=0.2409  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 19.19it/s, step size=2.71e-01, acc. prob=0.939]


[outer 021] TRAIN (EMA+K-ens) ll=0.6741  br=0.2403  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.26e-01, acc. prob=0.927]


[outer 022] TRAIN (EMA+K-ens) ll=0.6744  br=0.2405  acc=0.6720


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=2.81e-01, acc. prob=0.955]


[outer 023] TRAIN (EMA+K-ens) ll=0.6788  br=0.2426  acc=0.6600


Sample: 100%|██████████| 330/330 [00:16, 19.92it/s, step size=2.97e-01, acc. prob=0.909]


[outer 024] TRAIN (EMA+K-ens) ll=0.6808  br=0.2436  acc=0.6510


Sample: 100%|██████████| 330/330 [00:16, 19.45it/s, step size=3.09e-01, acc. prob=0.948]


[outer 025] TRAIN (EMA+K-ens) ll=0.6851  br=0.2456  acc=0.6400


Sample: 100%|██████████| 330/330 [00:17, 18.86it/s, step size=3.32e-01, acc. prob=0.927]


[outer 026] TRAIN (EMA+K-ens) ll=0.6852  br=0.2457  acc=0.6310


Sample: 100%|██████████| 330/330 [00:16, 20.31it/s, step size=3.32e-01, acc. prob=0.917]


[outer 027] TRAIN (EMA+K-ens) ll=0.6869  br=0.2465  acc=0.6200


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=2.96e-01, acc. prob=0.926]


[outer 028] TRAIN (EMA+K-ens) ll=0.6865  br=0.2463  acc=0.6470


Sample: 100%|██████████| 330/330 [00:16, 20.01it/s, step size=2.99e-01, acc. prob=0.929]


[outer 029] TRAIN (EMA+K-ens) ll=0.6830  br=0.2446  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.17e-01, acc. prob=0.938]


[outer 030] TRAIN (EMA+K-ens) ll=0.6825  br=0.2444  acc=0.6430


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=2.95e-01, acc. prob=0.940]


[outer 031] TRAIN (EMA+K-ens) ll=0.6849  br=0.2455  acc=0.6380


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=3.39e-01, acc. prob=0.922]


[outer 032] TRAIN (EMA+K-ens) ll=0.6860  br=0.2461  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 20.18it/s, step size=2.67e-01, acc. prob=0.947]


[outer 033] TRAIN (EMA+K-ens) ll=0.6871  br=0.2466  acc=0.6340


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=2.97e-01, acc. prob=0.889]


[outer 034] TRAIN (EMA+K-ens) ll=0.6837  br=0.2448  acc=0.6570


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.20e-01, acc. prob=0.935]


[outer 035] TRAIN (EMA+K-ens) ll=0.6852  br=0.2456  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 20.19it/s, step size=2.92e-01, acc. prob=0.939]


[outer 036] TRAIN (EMA+K-ens) ll=0.6949  br=0.2503  acc=0.6380


Sample: 100%|██████████| 330/330 [00:18, 18.13it/s, step size=2.78e-01, acc. prob=0.921]


[outer 037] TRAIN (EMA+K-ens) ll=0.6980  br=0.2518  acc=0.6310


Sample: 100%|██████████| 330/330 [00:15, 20.95it/s, step size=2.88e-01, acc. prob=0.934]


[outer 038] TRAIN (EMA+K-ens) ll=0.7022  br=0.2539  acc=0.6100


Sample: 100%|██████████| 330/330 [00:17, 19.41it/s, step size=2.97e-01, acc. prob=0.932]


[outer 039] TRAIN (EMA+K-ens) ll=0.7027  br=0.2541  acc=0.6160
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}, {'accuracy': 0.6173799633979797, 'brier': 0.2418910264968872, 'logloss': 0.6784679293632507}, {'accuracy': 0.6060400009155273, 'brier': 0.24561749398708344, 'logloss': 0.6852532625198364}, {'accuracy': 0.6311799883842468, 'brier': 0.23762410879135132, 'logloss': 0.6682534217834473}, {'accuracy': 0.6198399662971497, 'brier': 0.24079276621341705, 'logloss': 0.6747647523880005}, {'accuracy': 0.5145400166511536, 'brier': 0.25846388936042786, 'logloss': 0.7117296457290649}]


Sample: 100%|██████████| 330/330 [00:19, 17.34it/s, step size=2.49e-01, acc. prob=0.951]


[outer 000] TRAIN (EMA+K-ens) ll=0.6774  br=0.2418  acc=0.5830


Sample: 100%|██████████| 330/330 [00:16, 20.37it/s, step size=2.88e-01, acc. prob=0.937]


[outer 001] TRAIN (EMA+K-ens) ll=0.6777  br=0.2421  acc=0.5790


Sample: 100%|██████████| 330/330 [00:16, 20.61it/s, step size=3.30e-01, acc. prob=0.932]


[outer 002] TRAIN (EMA+K-ens) ll=0.6730  br=0.2395  acc=0.6350


Sample: 100%|██████████| 330/330 [00:15, 20.89it/s, step size=2.93e-01, acc. prob=0.955]


[outer 003] TRAIN (EMA+K-ens) ll=0.6605  br=0.2332  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 20.21it/s, step size=3.22e-01, acc. prob=0.920]


[outer 004] TRAIN (EMA+K-ens) ll=0.6698  br=0.2375  acc=0.6590


Sample: 100%|██████████| 330/330 [00:17, 18.36it/s, step size=2.57e-01, acc. prob=0.955]


[outer 005] TRAIN (EMA+K-ens) ll=0.6671  br=0.2365  acc=0.6470


Sample: 100%|██████████| 330/330 [00:16, 20.50it/s, step size=3.39e-01, acc. prob=0.915]


[outer 006] TRAIN (EMA+K-ens) ll=0.6723  br=0.2390  acc=0.6530


Sample: 100%|██████████| 330/330 [00:16, 19.80it/s, step size=3.15e-01, acc. prob=0.912]


[outer 007] TRAIN (EMA+K-ens) ll=0.6723  br=0.2390  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 18.34it/s, step size=3.39e-01, acc. prob=0.909]


[outer 008] TRAIN (EMA+K-ens) ll=0.6778  br=0.2417  acc=0.6740


Sample: 100%|██████████| 330/330 [00:18, 18.19it/s, step size=2.62e-01, acc. prob=0.944]


[outer 009] TRAIN (EMA+K-ens) ll=0.6778  br=0.2417  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 20.39it/s, step size=3.01e-01, acc. prob=0.925]


[outer 010] TRAIN (EMA+K-ens) ll=0.6819  br=0.2437  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=2.99e-01, acc. prob=0.941]


[outer 011] TRAIN (EMA+K-ens) ll=0.6890  br=0.2471  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=3.19e-01, acc. prob=0.932]


[outer 012] TRAIN (EMA+K-ens) ll=0.6873  br=0.2465  acc=0.6360


Sample: 100%|██████████| 330/330 [00:18, 18.33it/s, step size=2.88e-01, acc. prob=0.928]


[outer 013] TRAIN (EMA+K-ens) ll=0.6872  br=0.2464  acc=0.6270


Sample: 100%|██████████| 330/330 [00:16, 20.30it/s, step size=3.18e-01, acc. prob=0.919]


[outer 014] TRAIN (EMA+K-ens) ll=0.6802  br=0.2430  acc=0.6260


Sample: 100%|██████████| 330/330 [00:16, 19.94it/s, step size=3.50e-01, acc. prob=0.902]


[outer 015] TRAIN (EMA+K-ens) ll=0.6806  br=0.2432  acc=0.6370


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=3.00e-01, acc. prob=0.942]


[outer 016] TRAIN (EMA+K-ens) ll=0.6789  br=0.2423  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 18.42it/s, step size=2.87e-01, acc. prob=0.958]


[outer 017] TRAIN (EMA+K-ens) ll=0.6749  br=0.2404  acc=0.6410


Sample: 100%|██████████| 330/330 [00:15, 20.96it/s, step size=3.44e-01, acc. prob=0.916]


[outer 018] TRAIN (EMA+K-ens) ll=0.6726  br=0.2394  acc=0.6660


Sample: 100%|██████████| 330/330 [00:17, 18.84it/s, step size=2.42e-01, acc. prob=0.944]


[outer 019] TRAIN (EMA+K-ens) ll=0.6732  br=0.2397  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=3.33e-01, acc. prob=0.930]


[outer 020] TRAIN (EMA+K-ens) ll=0.6735  br=0.2398  acc=0.6960


Sample: 100%|██████████| 330/330 [00:16, 19.44it/s, step size=2.43e-01, acc. prob=0.962]


[outer 021] TRAIN (EMA+K-ens) ll=0.6729  br=0.2395  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 19.44it/s, step size=2.63e-01, acc. prob=0.941]


[outer 022] TRAIN (EMA+K-ens) ll=0.6760  br=0.2410  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 20.16it/s, step size=2.97e-01, acc. prob=0.930]


[outer 023] TRAIN (EMA+K-ens) ll=0.6761  br=0.2411  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.96e-01, acc. prob=0.904]


[outer 024] TRAIN (EMA+K-ens) ll=0.6682  br=0.2372  acc=0.6880


Sample: 100%|██████████| 330/330 [00:16, 20.02it/s, step size=2.69e-01, acc. prob=0.951]


[outer 025] TRAIN (EMA+K-ens) ll=0.6726  br=0.2393  acc=0.6690


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=2.80e-01, acc. prob=0.938]


[outer 026] TRAIN (EMA+K-ens) ll=0.6738  br=0.2398  acc=0.6510


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=2.85e-01, acc. prob=0.935]


[outer 027] TRAIN (EMA+K-ens) ll=0.6780  br=0.2415  acc=0.6360


Sample: 100%|██████████| 330/330 [00:17, 19.30it/s, step size=2.96e-01, acc. prob=0.930]


[outer 028] TRAIN (EMA+K-ens) ll=0.6793  br=0.2422  acc=0.6510
[Early stop @ outer 28] Δll=0.053%, Δbr=0.132%, Δacc=0.003
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}, {'accuracy': 0.6173799633979797, 'brier': 0.2418910264968872, 'logloss': 0.6784679293632507}, {'accuracy': 0.6060400009155273, 'brier': 0.24561749398708344, 'logloss': 0.6852532625198364}, {'accuracy': 0.6311799883842468, 'brier': 0.23762410879135132, 'logloss': 0.6682534217834473}, {'accuracy': 0.6198399662971497, 'brier': 0.24079276621341705, 'logloss': 0.6747647523880005}, {'accuracy': 0.5145400166511536, 'brier': 0.25846388936042786, 'logloss': 0.7117296457290649}, {'accuracy': 0.6212199926376343, 'brier': 0.24489834904670715, 'logloss': 0.6846963763237}]


Sample: 100%|██████████| 330/330 [00:18, 18.17it/s, step size=2.75e-01, acc. prob=0.953]


[outer 000] TRAIN (EMA+K-ens) ll=0.6971  br=0.2506  acc=0.5880


Sample: 100%|██████████| 330/330 [00:16, 20.12it/s, step size=3.53e-01, acc. prob=0.915]


[outer 001] TRAIN (EMA+K-ens) ll=0.6669  br=0.2364  acc=0.6620


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=2.78e-01, acc. prob=0.940]


[outer 002] TRAIN (EMA+K-ens) ll=0.6639  br=0.2350  acc=0.7220


Sample: 100%|██████████| 330/330 [00:16, 19.51it/s, step size=3.15e-01, acc. prob=0.943]


[outer 003] TRAIN (EMA+K-ens) ll=0.6635  br=0.2348  acc=0.7040


Sample: 100%|██████████| 330/330 [00:16, 20.37it/s, step size=2.57e-01, acc. prob=0.969]


[outer 004] TRAIN (EMA+K-ens) ll=0.6630  br=0.2348  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 18.86it/s, step size=2.56e-01, acc. prob=0.956]


[outer 005] TRAIN (EMA+K-ens) ll=0.6657  br=0.2360  acc=0.6990


Sample: 100%|██████████| 330/330 [00:15, 21.46it/s, step size=3.21e-01, acc. prob=0.932]


[outer 006] TRAIN (EMA+K-ens) ll=0.6651  br=0.2358  acc=0.7020


Sample: 100%|██████████| 330/330 [00:18, 18.22it/s, step size=3.52e-01, acc. prob=0.933]


[outer 007] TRAIN (EMA+K-ens) ll=0.6644  br=0.2354  acc=0.7010


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=2.66e-01, acc. prob=0.933]


[outer 008] TRAIN (EMA+K-ens) ll=0.6655  br=0.2359  acc=0.6950


Sample: 100%|██████████| 330/330 [00:19, 17.30it/s, step size=2.66e-01, acc. prob=0.954]


[outer 009] TRAIN (EMA+K-ens) ll=0.6669  br=0.2366  acc=0.6950


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=3.03e-01, acc. prob=0.924]


[outer 010] TRAIN (EMA+K-ens) ll=0.6681  br=0.2372  acc=0.6960


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=2.83e-01, acc. prob=0.925]


[outer 011] TRAIN (EMA+K-ens) ll=0.6723  br=0.2392  acc=0.7000


Sample: 100%|██████████| 330/330 [00:16, 19.56it/s, step size=3.43e-01, acc. prob=0.913]


[outer 012] TRAIN (EMA+K-ens) ll=0.6728  br=0.2394  acc=0.6860


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=2.70e-01, acc. prob=0.945]


[outer 013] TRAIN (EMA+K-ens) ll=0.6718  br=0.2390  acc=0.6770


Sample: 100%|██████████| 330/330 [00:18, 18.08it/s, step size=2.81e-01, acc. prob=0.928]


[outer 014] TRAIN (EMA+K-ens) ll=0.6712  br=0.2385  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 20.32it/s, step size=3.53e-01, acc. prob=0.905]


[outer 015] TRAIN (EMA+K-ens) ll=0.6708  br=0.2384  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=2.87e-01, acc. prob=0.944]


[outer 016] TRAIN (EMA+K-ens) ll=0.6687  br=0.2372  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 20.54it/s, step size=3.64e-01, acc. prob=0.897]


[outer 017] TRAIN (EMA+K-ens) ll=0.6691  br=0.2374  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.47it/s, step size=3.01e-01, acc. prob=0.935]


[outer 018] TRAIN (EMA+K-ens) ll=0.6694  br=0.2375  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 20.71it/s, step size=3.38e-01, acc. prob=0.923]


[outer 019] TRAIN (EMA+K-ens) ll=0.6666  br=0.2360  acc=0.6840


Sample: 100%|██████████| 330/330 [00:18, 17.95it/s, step size=3.04e-01, acc. prob=0.943]


[outer 020] TRAIN (EMA+K-ens) ll=0.6671  br=0.2361  acc=0.7000


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=3.19e-01, acc. prob=0.906]


[outer 021] TRAIN (EMA+K-ens) ll=0.6679  br=0.2366  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.39it/s, step size=3.07e-01, acc. prob=0.944]


[outer 022] TRAIN (EMA+K-ens) ll=0.6686  br=0.2371  acc=0.6980


Sample: 100%|██████████| 330/330 [00:16, 19.77it/s, step size=2.31e-01, acc. prob=0.968]


[outer 023] TRAIN (EMA+K-ens) ll=0.6643  br=0.2351  acc=0.7000


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=2.64e-01, acc. prob=0.958]


[outer 024] TRAIN (EMA+K-ens) ll=0.6617  br=0.2341  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=3.41e-01, acc. prob=0.914]


[outer 025] TRAIN (EMA+K-ens) ll=0.6637  br=0.2349  acc=0.6920


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.90e-01, acc. prob=0.945]


[outer 026] TRAIN (EMA+K-ens) ll=0.6657  br=0.2359  acc=0.7020


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=3.47e-01, acc. prob=0.901]


[outer 027] TRAIN (EMA+K-ens) ll=0.6674  br=0.2368  acc=0.6980


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.92e-01, acc. prob=0.935]


[outer 028] TRAIN (EMA+K-ens) ll=0.6719  br=0.2390  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.67it/s, step size=2.91e-01, acc. prob=0.939]


[outer 029] TRAIN (EMA+K-ens) ll=0.6714  br=0.2386  acc=0.6900


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=3.14e-01, acc. prob=0.940]


[outer 030] TRAIN (EMA+K-ens) ll=0.6722  br=0.2389  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 19.53it/s, step size=3.01e-01, acc. prob=0.931]


[outer 031] TRAIN (EMA+K-ens) ll=0.6747  br=0.2402  acc=0.6980


Sample: 100%|██████████| 330/330 [00:17, 19.05it/s, step size=3.13e-01, acc. prob=0.925]


[outer 032] TRAIN (EMA+K-ens) ll=0.6795  br=0.2422  acc=0.6390


Sample: 100%|██████████| 330/330 [00:16, 19.81it/s, step size=2.34e-01, acc. prob=0.966]


[outer 033] TRAIN (EMA+K-ens) ll=0.6734  br=0.2396  acc=0.6340


Sample: 100%|██████████| 330/330 [00:18, 18.30it/s, step size=3.14e-01, acc. prob=0.937]


[outer 034] TRAIN (EMA+K-ens) ll=0.6701  br=0.2379  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.20it/s, step size=3.15e-01, acc. prob=0.930]


[outer 035] TRAIN (EMA+K-ens) ll=0.6700  br=0.2378  acc=0.6780


Sample: 100%|██████████| 330/330 [00:16, 20.03it/s, step size=3.13e-01, acc. prob=0.918]


[outer 036] TRAIN (EMA+K-ens) ll=0.6685  br=0.2371  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 20.51it/s, step size=3.13e-01, acc. prob=0.917]


[outer 037] TRAIN (EMA+K-ens) ll=0.6631  br=0.2345  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.84it/s, step size=2.27e-01, acc. prob=0.965]


[outer 038] TRAIN (EMA+K-ens) ll=0.6633  br=0.2346  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 20.05it/s, step size=3.24e-01, acc. prob=0.903]


[outer 039] TRAIN (EMA+K-ens) ll=0.6687  br=0.2372  acc=0.6710
[{'accuracy': 0.6579599976539612, 'brier': 0.23307976126670837, 'logloss': 0.6597328186035156}, {'accuracy': 0.6448000073432922, 'brier': 0.23589971661567688, 'logloss': 0.665366530418396}, {'accuracy': 0.5943999886512756, 'brier': 0.24953223764896393, 'logloss': 0.6961998343467712}, {'accuracy': 0.6173799633979797, 'brier': 0.2418910264968872, 'logloss': 0.6784679293632507}, {'accuracy': 0.6060400009155273, 'brier': 0.24561749398708344, 'logloss': 0.6852532625198364}, {'accuracy': 0.6311799883842468, 'brier': 0.23762410879135132, 'logloss': 0.6682534217834473}, {'accuracy': 0.6198399662971497, 'brier': 0.24079276621341705, 'logloss': 0.6747647523880005}, {'accuracy': 0.5145400166511536, 'brier': 0.25846388936042786, 'logloss': 0.7117296457290649}, {'accuracy': 0.6212199926376343, 'brier': 0.24489834904670715, 'logloss': 0.6846963763237}, {'accuracy': 0.5591999888420105, 'brier': 0.26002082228660583, 'logloss': 0.7170931100
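
The dictionaries printed above report `accuracy`, `brier`, and `logloss` on the test set. The sketch below shows how these three quantities are conventionally computed from predicted probabilities. It assumes the notebook uses the standard binary-classification definitions; `binary_metrics` is a hypothetical helper written for illustration, not the function used inside `fit_ksd_bayes_nuts_ema_ensemble`.

```python
import numpy as np

def binary_metrics(y_true, p_pred, eps=1e-12):
    """Hypothetical helper: accuracy, Brier score, and log loss for 0/1 labels."""
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities so the logarithms below stay finite
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    return {
        # Share of observations where the 0.5-thresholded forecast matches the label
        "accuracy": float(np.mean((p >= 0.5) == (y == 1.0))),
        # Mean squared difference between forecast probability and outcome
        "brier": float(np.mean((p - y) ** 2)),
        # Average negative Bernoulli log-likelihood
        "logloss": float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))),
    }

# Reference point: an uninformative forecast of 0.5 for every observation
y = np.array([0, 1, 1, 0])
baseline = binary_metrics(y, np.full(4, 0.5))
print(baseline)
```

Under these definitions a constant forecast of 0.5 yields a Brier score of 0.25 and a log loss of ln 2 ≈ 0.693, which gives a useful yardstick when reading the logged values.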

In [None]:
all_metrics = []
noise_type = "t"  # heavy-tailed Student's t noise scenario
for seed in range(3):
    # Seed NumPy and PyTorch so each replicate is reproducible
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Small training set and large test set drawn from the same noise process
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type=noise_type,
        n_per_group=10000
    )
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=True, nonlinear=True, group=True,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    all_metrics.append(res["metrics_test"])
    print(all_metrics)  # running list: one test-metric dict per completed seed

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean', 'std', 'median'])
print(summary)
print(df)

Sample: 100%|██████████| 330/330 [00:16, 19.63it/s, step size=3.60e-01, acc. prob=0.877]


[outer 000] TRAIN (EMA+K-ens) ll=0.7151  br=0.2603  acc=0.5530


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=2.88e-01, acc. prob=0.952]


[outer 001] TRAIN (EMA+K-ens) ll=0.6867  br=0.2464  acc=0.6150


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=3.07e-01, acc. prob=0.928]


[outer 002] TRAIN (EMA+K-ens) ll=0.6843  br=0.2454  acc=0.6090


Sample: 100%|██████████| 330/330 [00:17, 19.27it/s, step size=3.82e-01, acc. prob=0.893]


[outer 003] TRAIN (EMA+K-ens) ll=0.6808  br=0.2435  acc=0.6340


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.98e-01, acc. prob=0.947]


[outer 004] TRAIN (EMA+K-ens) ll=0.6825  br=0.2443  acc=0.6210


Sample: 100%|██████████| 330/330 [00:19, 17.20it/s, step size=2.54e-01, acc. prob=0.956]


[outer 005] TRAIN (EMA+K-ens) ll=0.6746  br=0.2405  acc=0.6580


Sample: 100%|██████████| 330/330 [00:15, 21.35it/s, step size=3.14e-01, acc. prob=0.917]


[outer 006] TRAIN (EMA+K-ens) ll=0.6792  br=0.2427  acc=0.6780


Sample: 100%|██████████| 330/330 [00:18, 17.71it/s, step size=2.94e-01, acc. prob=0.927]


[outer 007] TRAIN (EMA+K-ens) ll=0.6805  br=0.2433  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=3.15e-01, acc. prob=0.946]


[outer 008] TRAIN (EMA+K-ens) ll=0.6758  br=0.2410  acc=0.6830


Sample: 100%|██████████| 330/330 [00:17, 19.13it/s, step size=2.78e-01, acc. prob=0.931]


[outer 009] TRAIN (EMA+K-ens) ll=0.6761  br=0.2411  acc=0.6980


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=2.89e-01, acc. prob=0.931]


[outer 010] TRAIN (EMA+K-ens) ll=0.6724  br=0.2392  acc=0.7060


Sample: 100%|██████████| 330/330 [00:17, 18.51it/s, step size=2.70e-01, acc. prob=0.941]


[outer 011] TRAIN (EMA+K-ens) ll=0.6741  br=0.2400  acc=0.6970


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=3.23e-01, acc. prob=0.943]


[outer 012] TRAIN (EMA+K-ens) ll=0.6753  br=0.2406  acc=0.6990


Sample: 100%|██████████| 330/330 [00:17, 19.13it/s, step size=2.55e-01, acc. prob=0.959]


[outer 013] TRAIN (EMA+K-ens) ll=0.6804  br=0.2430  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.17it/s, step size=2.79e-01, acc. prob=0.961]


[outer 014] TRAIN (EMA+K-ens) ll=0.6737  br=0.2397  acc=0.7070


Sample: 100%|██████████| 330/330 [00:18, 17.90it/s, step size=3.24e-01, acc. prob=0.952]


[outer 015] TRAIN (EMA+K-ens) ll=0.6722  br=0.2389  acc=0.6930


Sample: 100%|██████████| 330/330 [00:17, 18.87it/s, step size=2.81e-01, acc. prob=0.952]


[outer 016] TRAIN (EMA+K-ens) ll=0.6717  br=0.2386  acc=0.7030


Sample: 100%|██████████| 330/330 [00:16, 20.35it/s, step size=2.87e-01, acc. prob=0.925]


[outer 017] TRAIN (EMA+K-ens) ll=0.6740  br=0.2398  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=3.01e-01, acc. prob=0.950]


[outer 018] TRAIN (EMA+K-ens) ll=0.6757  br=0.2406  acc=0.6660
[Early stop @ outer 18] Δll=0.206%, Δbr=0.389%, Δacc=0.003
[{'accuracy': 0.6928600072860718, 'brier': 0.2329118251800537, 'logloss': 0.6630752682685852}]


Sample: 100%|██████████| 330/330 [00:16, 19.87it/s, step size=3.25e-01, acc. prob=0.917]


[outer 000] TRAIN (EMA+K-ens) ll=0.7193  br=0.2616  acc=0.5850


Sample: 100%|██████████| 330/330 [00:17, 19.03it/s, step size=2.66e-01, acc. prob=0.947]


[outer 001] TRAIN (EMA+K-ens) ll=0.6851  br=0.2455  acc=0.6020


Sample: 100%|██████████| 330/330 [00:15, 21.10it/s, step size=3.24e-01, acc. prob=0.935]


[outer 002] TRAIN (EMA+K-ens) ll=0.6836  br=0.2450  acc=0.6500


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=3.31e-01, acc. prob=0.943]


[outer 003] TRAIN (EMA+K-ens) ll=0.6803  br=0.2433  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 19.27it/s, step size=2.89e-01, acc. prob=0.934]


[outer 004] TRAIN (EMA+K-ens) ll=0.6846  br=0.2452  acc=0.6220


Sample: 100%|██████████| 330/330 [00:17, 19.31it/s, step size=3.26e-01, acc. prob=0.937]


[outer 005] TRAIN (EMA+K-ens) ll=0.6816  br=0.2440  acc=0.6230


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=2.71e-01, acc. prob=0.957]


[outer 006] TRAIN (EMA+K-ens) ll=0.6772  br=0.2418  acc=0.6390


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=2.94e-01, acc. prob=0.937]


[outer 007] TRAIN (EMA+K-ens) ll=0.6780  br=0.2422  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=2.66e-01, acc. prob=0.937]


[outer 008] TRAIN (EMA+K-ens) ll=0.6742  br=0.2401  acc=0.6780


Sample: 100%|██████████| 330/330 [00:18, 18.23it/s, step size=2.67e-01, acc. prob=0.939]


[outer 009] TRAIN (EMA+K-ens) ll=0.6787  br=0.2423  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 19.07it/s, step size=3.24e-01, acc. prob=0.943]


[outer 010] TRAIN (EMA+K-ens) ll=0.6780  br=0.2418  acc=0.6550


Sample: 100%|██████████| 330/330 [00:16, 20.34it/s, step size=3.24e-01, acc. prob=0.888]


[outer 011] TRAIN (EMA+K-ens) ll=0.6812  br=0.2435  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 19.43it/s, step size=3.41e-01, acc. prob=0.937]


[outer 012] TRAIN (EMA+K-ens) ll=0.6815  br=0.2436  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 18.46it/s, step size=2.71e-01, acc. prob=0.919]


[outer 013] TRAIN (EMA+K-ens) ll=0.6853  br=0.2453  acc=0.6260


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=2.87e-01, acc. prob=0.943]


[outer 014] TRAIN (EMA+K-ens) ll=0.6915  br=0.2482  acc=0.6090


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=2.60e-01, acc. prob=0.945]


[outer 015] TRAIN (EMA+K-ens) ll=0.6979  br=0.2512  acc=0.5940


Sample: 100%|██████████| 330/330 [00:17, 18.80it/s, step size=2.81e-01, acc. prob=0.934]


[outer 016] TRAIN (EMA+K-ens) ll=0.6937  br=0.2492  acc=0.6160


Sample: 100%|██████████| 330/330 [00:16, 20.53it/s, step size=2.52e-01, acc. prob=0.932]


[outer 017] TRAIN (EMA+K-ens) ll=0.6913  br=0.2479  acc=0.6070


Sample: 100%|██████████| 330/330 [00:15, 21.33it/s, step size=2.57e-01, acc. prob=0.963]


[outer 018] TRAIN (EMA+K-ens) ll=0.6930  br=0.2492  acc=0.5890


Sample: 100%|██████████| 330/330 [00:14, 22.01it/s, step size=3.01e-01, acc. prob=0.917]


[outer 019] TRAIN (EMA+K-ens) ll=0.6895  br=0.2476  acc=0.5950


Sample: 100%|██████████| 330/330 [00:16, 19.51it/s, step size=2.90e-01, acc. prob=0.942]


[outer 020] TRAIN (EMA+K-ens) ll=0.6856  br=0.2456  acc=0.6260


Sample: 100%|██████████| 330/330 [00:18, 18.17it/s, step size=3.41e-01, acc. prob=0.929]


[outer 021] TRAIN (EMA+K-ens) ll=0.6835  br=0.2447  acc=0.6330


Sample: 100%|██████████| 330/330 [00:17, 18.40it/s, step size=3.20e-01, acc. prob=0.932]


[outer 022] TRAIN (EMA+K-ens) ll=0.6812  br=0.2436  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=2.47e-01, acc. prob=0.945]


[outer 023] TRAIN (EMA+K-ens) ll=0.6751  br=0.2407  acc=0.6630


Sample: 100%|██████████| 330/330 [00:16, 20.04it/s, step size=3.49e-01, acc. prob=0.903]


[outer 024] TRAIN (EMA+K-ens) ll=0.6791  br=0.2425  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.17e-01, acc. prob=0.921]


[outer 025] TRAIN (EMA+K-ens) ll=0.6823  br=0.2441  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.66it/s, step size=3.00e-01, acc. prob=0.930]


[outer 026] TRAIN (EMA+K-ens) ll=0.6825  br=0.2439  acc=0.6530


Sample: 100%|██████████| 330/330 [00:17, 18.92it/s, step size=2.99e-01, acc. prob=0.904]


[outer 027] TRAIN (EMA+K-ens) ll=0.6826  br=0.2439  acc=0.6630


Sample: 100%|██████████| 330/330 [00:16, 19.42it/s, step size=2.98e-01, acc. prob=0.933]


[outer 028] TRAIN (EMA+K-ens) ll=0.6834  br=0.2444  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=3.29e-01, acc. prob=0.911]


[outer 029] TRAIN (EMA+K-ens) ll=0.6793  br=0.2425  acc=0.6650


Sample: 100%|██████████| 330/330 [00:15, 21.45it/s, step size=3.28e-01, acc. prob=0.920]


[outer 030] TRAIN (EMA+K-ens) ll=0.6775  br=0.2416  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=2.68e-01, acc. prob=0.950]


[outer 031] TRAIN (EMA+K-ens) ll=0.6724  br=0.2392  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=3.54e-01, acc. prob=0.927]


[outer 032] TRAIN (EMA+K-ens) ll=0.6775  br=0.2417  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.77it/s, step size=3.22e-01, acc. prob=0.945]


[outer 033] TRAIN (EMA+K-ens) ll=0.6774  br=0.2417  acc=0.6740


Sample: 100%|██████████| 330/330 [00:15, 21.49it/s, step size=3.44e-01, acc. prob=0.914]


[outer 034] TRAIN (EMA+K-ens) ll=0.6712  br=0.2387  acc=0.6640


Sample: 100%|██████████| 330/330 [00:15, 21.17it/s, step size=2.98e-01, acc. prob=0.916]


[outer 035] TRAIN (EMA+K-ens) ll=0.6696  br=0.2379  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.28it/s, step size=3.21e-01, acc. prob=0.932]


[outer 036] TRAIN (EMA+K-ens) ll=0.6698  br=0.2381  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 20.02it/s, step size=3.17e-01, acc. prob=0.899]


[outer 037] TRAIN (EMA+K-ens) ll=0.6701  br=0.2383  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.31e-01, acc. prob=0.887]


[outer 038] TRAIN (EMA+K-ens) ll=0.6695  br=0.2380  acc=0.6720


Sample: 100%|██████████| 330/330 [00:17, 18.58it/s, step size=2.57e-01, acc. prob=0.954]


[outer 039] TRAIN (EMA+K-ens) ll=0.6784  br=0.2422  acc=0.6330
[{'accuracy': 0.6928600072860718, 'brier': 0.2329118251800537, 'logloss': 0.6630752682685852}, {'accuracy': 0.5746200084686279, 'brier': 0.2517673969268799, 'logloss': 0.6996152997016907}]


Sample: 100%|██████████| 330/330 [00:16, 19.63it/s, step size=2.57e-01, acc. prob=0.953]


[outer 000] TRAIN (EMA+K-ens) ll=0.7096  br=0.2538  acc=0.5570


Sample: 100%|██████████| 330/330 [00:17, 19.22it/s, step size=3.15e-01, acc. prob=0.945]


[outer 001] TRAIN (EMA+K-ens) ll=0.6920  br=0.2472  acc=0.5970


Sample: 100%|██████████| 330/330 [00:15, 21.08it/s, step size=3.54e-01, acc. prob=0.924]


[outer 002] TRAIN (EMA+K-ens) ll=0.6945  br=0.2483  acc=0.5870


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=3.11e-01, acc. prob=0.942]


[outer 003] TRAIN (EMA+K-ens) ll=0.6948  br=0.2492  acc=0.5690


Sample: 100%|██████████| 330/330 [00:16, 19.45it/s, step size=3.23e-01, acc. prob=0.922]


[outer 004] TRAIN (EMA+K-ens) ll=0.6887  br=0.2468  acc=0.5790


Sample: 100%|██████████| 330/330 [00:16, 20.14it/s, step size=3.31e-01, acc. prob=0.931]


[outer 005] TRAIN (EMA+K-ens) ll=0.6890  br=0.2471  acc=0.6090


Sample: 100%|██████████| 330/330 [00:17, 19.02it/s, step size=3.06e-01, acc. prob=0.930]


[outer 006] TRAIN (EMA+K-ens) ll=0.6865  br=0.2459  acc=0.6170


Sample: 100%|██████████| 330/330 [00:16, 19.92it/s, step size=2.87e-01, acc. prob=0.945]


[outer 007] TRAIN (EMA+K-ens) ll=0.6815  br=0.2436  acc=0.6240


Sample: 100%|██████████| 330/330 [00:19, 17.22it/s, step size=2.71e-01, acc. prob=0.942]


[outer 008] TRAIN (EMA+K-ens) ll=0.6753  br=0.2407  acc=0.6240


Sample: 100%|██████████| 330/330 [00:18, 17.96it/s, step size=3.17e-01, acc. prob=0.899]


[outer 009] TRAIN (EMA+K-ens) ll=0.6714  br=0.2389  acc=0.6300


Sample: 100%|██████████| 330/330 [00:16, 19.43it/s, step size=3.03e-01, acc. prob=0.942]


[outer 010] TRAIN (EMA+K-ens) ll=0.6637  br=0.2352  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=2.62e-01, acc. prob=0.945]


[outer 011] TRAIN (EMA+K-ens) ll=0.6652  br=0.2359  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 20.12it/s, step size=2.86e-01, acc. prob=0.954]


[outer 012] TRAIN (EMA+K-ens) ll=0.6604  br=0.2336  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.48it/s, step size=2.97e-01, acc. prob=0.953]


[outer 013] TRAIN (EMA+K-ens) ll=0.6566  br=0.2317  acc=0.6820


Sample: 100%|██████████| 330/330 [00:15, 21.56it/s, step size=2.87e-01, acc. prob=0.958]


[outer 014] TRAIN (EMA+K-ens) ll=0.6557  br=0.2313  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.49it/s, step size=2.96e-01, acc. prob=0.953]


[outer 015] TRAIN (EMA+K-ens) ll=0.6590  br=0.2328  acc=0.6790


Sample: 100%|██████████| 330/330 [00:15, 20.96it/s, step size=3.19e-01, acc. prob=0.937]


[outer 016] TRAIN (EMA+K-ens) ll=0.6636  br=0.2350  acc=0.6640


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=2.91e-01, acc. prob=0.951]


[outer 017] TRAIN (EMA+K-ens) ll=0.6710  br=0.2387  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 20.53it/s, step size=3.31e-01, acc. prob=0.933]


[outer 018] TRAIN (EMA+K-ens) ll=0.6777  br=0.2419  acc=0.6720


Sample: 100%|██████████| 330/330 [00:18, 17.78it/s, step size=2.18e-01, acc. prob=0.966]


[outer 019] TRAIN (EMA+K-ens) ll=0.6663  br=0.2363  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=3.31e-01, acc. prob=0.949]


[outer 020] TRAIN (EMA+K-ens) ll=0.6692  br=0.2378  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.04e-01, acc. prob=0.930]


[outer 021] TRAIN (EMA+K-ens) ll=0.6701  br=0.2382  acc=0.6860


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=3.26e-01, acc. prob=0.934]


[outer 022] TRAIN (EMA+K-ens) ll=0.6713  br=0.2387  acc=0.6830


Sample: 100%|██████████| 330/330 [00:17, 19.23it/s, step size=2.53e-01, acc. prob=0.958]


[outer 023] TRAIN (EMA+K-ens) ll=0.6670  br=0.2366  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 20.03it/s, step size=2.99e-01, acc. prob=0.939]


[outer 024] TRAIN (EMA+K-ens) ll=0.6706  br=0.2382  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 19.44it/s, step size=3.10e-01, acc. prob=0.915]


[outer 025] TRAIN (EMA+K-ens) ll=0.6666  br=0.2363  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.16it/s, step size=2.87e-01, acc. prob=0.939]


[outer 026] TRAIN (EMA+K-ens) ll=0.6673  br=0.2367  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.50e-01, acc. prob=0.939]


[outer 027] TRAIN (EMA+K-ens) ll=0.6755  br=0.2407  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.31it/s, step size=3.13e-01, acc. prob=0.945]


[outer 028] TRAIN (EMA+K-ens) ll=0.6789  br=0.2422  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 19.74it/s, step size=3.04e-01, acc. prob=0.939]


[outer 029] TRAIN (EMA+K-ens) ll=0.6812  br=0.2433  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=3.35e-01, acc. prob=0.940]


[outer 030] TRAIN (EMA+K-ens) ll=0.6808  br=0.2434  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=3.32e-01, acc. prob=0.958]


[outer 031] TRAIN (EMA+K-ens) ll=0.6817  br=0.2439  acc=0.6570


Sample: 100%|██████████| 330/330 [00:17, 18.71it/s, step size=3.19e-01, acc. prob=0.931]


[outer 032] TRAIN (EMA+K-ens) ll=0.6799  br=0.2430  acc=0.6700


Sample: 100%|██████████| 330/330 [00:17, 19.20it/s, step size=2.50e-01, acc. prob=0.950]


[outer 033] TRAIN (EMA+K-ens) ll=0.6825  br=0.2443  acc=0.6570


Sample: 100%|██████████| 330/330 [00:16, 19.84it/s, step size=3.57e-01, acc. prob=0.920]


[outer 034] TRAIN (EMA+K-ens) ll=0.6789  br=0.2426  acc=0.6690


Sample: 100%|██████████| 330/330 [00:15, 21.70it/s, step size=3.08e-01, acc. prob=0.941]


[outer 035] TRAIN (EMA+K-ens) ll=0.6741  br=0.2402  acc=0.6840


Sample: 100%|██████████| 330/330 [00:17, 19.22it/s, step size=2.98e-01, acc. prob=0.931]


[outer 036] TRAIN (EMA+K-ens) ll=0.6740  br=0.2402  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 18.67it/s, step size=2.64e-01, acc. prob=0.941]


[outer 037] TRAIN (EMA+K-ens) ll=0.6699  br=0.2383  acc=0.6760


Sample: 100%|██████████| 330/330 [00:18, 18.09it/s, step size=2.85e-01, acc. prob=0.962]


[outer 038] TRAIN (EMA+K-ens) ll=0.6722  br=0.2393  acc=0.6740


Sample: 100%|██████████| 330/330 [00:15, 20.63it/s, step size=3.15e-01, acc. prob=0.939]


[outer 039] TRAIN (EMA+K-ens) ll=0.6757  br=0.2410  acc=0.6520
[{'accuracy': 0.6928600072860718, 'brier': 0.2329118251800537, 'logloss': 0.6630752682685852}, {'accuracy': 0.5746200084686279, 'brier': 0.2517673969268799, 'logloss': 0.6996152997016907}, {'accuracy': 0.6154999732971191, 'brier': 0.24270714819431305, 'logloss': 0.6799689531326294}]
        accuracy     brier   logloss
mean    0.627660  0.242462  0.680887
std     0.060051  0.009430  0.018287
median  0.615500  0.242707  0.679969
   accuracy     brier   logloss
0   0.69286  0.232912  0.663075
1   0.57462  0.251767  0.699615
2   0.61550  0.242707  0.679969
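
With only three seeds, the `std` row above describes the spread between runs rather than the uncertainty of the mean. A minimal sketch of the standard error of the mean follows, using the per-seed values printed in the table above; `mean_and_sem` is a hypothetical helper name, and the sample standard deviation uses `ddof=1` to match the default of pandas' `std`.

```python
import numpy as np

def mean_and_sem(values):
    """Hypothetical helper: sample mean and standard error of the mean."""
    x = np.asarray(values, dtype=float)
    sd = x.std(ddof=1)  # sample standard deviation, as in pandas' default
    return float(x.mean()), float(sd / np.sqrt(x.size))

# Per-seed test metrics copied from the table above (noise_type="t", 3 seeds)
per_seed = {
    "accuracy": [0.69286, 0.57462, 0.61550],
    "brier": [0.232912, 0.251767, 0.242707],
    "logloss": [0.663075, 0.699615, 0.679969],
}
report = {name: mean_and_sem(vals) for name, vals in per_seed.items()}
for name, (m, se) in report.items():
    print(f"{name}: mean={m:.4f}  sem={se:.4f}")
```

Because the standard error shrinks only with the square root of the number of seeds, estimates from three replicates remain fairly loose.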


In [None]:
all_metrics = []
noise_type = "contaminated"  # normal-contaminated noise scenario
for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Small training set and large test set drawn from the same noise process
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=200
    )
    df_test = simulate_dataset(
        noise_type=noise_type,
        n_per_group=10000
    )
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_test, feature_cols,
        interaction=True, nonlinear=True, group=True,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    all_metrics.append(res["metrics_test"])
    print(all_metrics)  # running list: one test-metric dict per completed seed

# Aggregate test metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean', 'std', 'median'])
print(summary)
print(df)

Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=2.98e-01, acc. prob=0.937]


[outer 000] TRAIN (EMA+K-ens) ll=0.6771  br=0.2418  acc=0.5970


Sample: 100%|██████████| 330/330 [00:15, 21.14it/s, step size=2.89e-01, acc. prob=0.932]


[outer 001] TRAIN (EMA+K-ens) ll=0.6903  br=0.2480  acc=0.5780


Sample: 100%|██████████| 330/330 [00:17, 19.03it/s, step size=2.71e-01, acc. prob=0.956]


[outer 002] TRAIN (EMA+K-ens) ll=0.6873  br=0.2467  acc=0.5740


Sample: 100%|██████████| 330/330 [00:17, 19.13it/s, step size=2.72e-01, acc. prob=0.929]


[outer 003] TRAIN (EMA+K-ens) ll=0.6762  br=0.2413  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=2.89e-01, acc. prob=0.919]


[outer 004] TRAIN (EMA+K-ens) ll=0.6763  br=0.2414  acc=0.6310


Sample: 100%|██████████| 330/330 [00:16, 19.49it/s, step size=2.89e-01, acc. prob=0.956]


[outer 005] TRAIN (EMA+K-ens) ll=0.6674  br=0.2371  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 18.86it/s, step size=2.99e-01, acc. prob=0.948]


[outer 006] TRAIN (EMA+K-ens) ll=0.6668  br=0.2367  acc=0.6620


Sample: 100%|██████████| 330/330 [00:16, 20.57it/s, step size=2.63e-01, acc. prob=0.968]


[outer 007] TRAIN (EMA+K-ens) ll=0.6658  br=0.2362  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.78it/s, step size=2.81e-01, acc. prob=0.960]


[outer 008] TRAIN (EMA+K-ens) ll=0.6660  br=0.2362  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=3.22e-01, acc. prob=0.906]


[outer 009] TRAIN (EMA+K-ens) ll=0.6630  br=0.2347  acc=0.6730


Sample: 100%|██████████| 330/330 [00:15, 20.70it/s, step size=3.24e-01, acc. prob=0.939]


[outer 010] TRAIN (EMA+K-ens) ll=0.6628  br=0.2346  acc=0.6740


Sample: 100%|██████████| 330/330 [00:15, 20.83it/s, step size=3.02e-01, acc. prob=0.936]


[outer 011] TRAIN (EMA+K-ens) ll=0.6646  br=0.2354  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 19.99it/s, step size=3.01e-01, acc. prob=0.944]


[outer 012] TRAIN (EMA+K-ens) ll=0.6670  br=0.2366  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.99e-01, acc. prob=0.924]


[outer 013] TRAIN (EMA+K-ens) ll=0.6672  br=0.2365  acc=0.6610


Sample: 100%|██████████| 330/330 [00:17, 18.93it/s, step size=2.47e-01, acc. prob=0.947]


[outer 014] TRAIN (EMA+K-ens) ll=0.6714  br=0.2385  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=2.76e-01, acc. prob=0.936]


[outer 015] TRAIN (EMA+K-ens) ll=0.6739  br=0.2398  acc=0.6520


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=3.43e-01, acc. prob=0.929]


[outer 016] TRAIN (EMA+K-ens) ll=0.6769  br=0.2414  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=3.35e-01, acc. prob=0.945]


[outer 017] TRAIN (EMA+K-ens) ll=0.6762  br=0.2411  acc=0.6590


Sample: 100%|██████████| 330/330 [00:17, 18.78it/s, step size=3.14e-01, acc. prob=0.930]


[outer 018] TRAIN (EMA+K-ens) ll=0.6767  br=0.2413  acc=0.6550


Sample: 100%|██████████| 330/330 [00:16, 20.54it/s, step size=3.05e-01, acc. prob=0.958]


[outer 019] TRAIN (EMA+K-ens) ll=0.6774  br=0.2418  acc=0.6450


Sample: 100%|██████████| 330/330 [00:15, 20.87it/s, step size=3.11e-01, acc. prob=0.920]


[outer 020] TRAIN (EMA+K-ens) ll=0.6722  br=0.2392  acc=0.6660


Sample: 100%|██████████| 330/330 [00:17, 18.84it/s, step size=3.50e-01, acc. prob=0.937]


[outer 021] TRAIN (EMA+K-ens) ll=0.6773  br=0.2418  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.80it/s, step size=2.89e-01, acc. prob=0.941]


[outer 022] TRAIN (EMA+K-ens) ll=0.6756  br=0.2409  acc=0.6510


Sample: 100%|██████████| 330/330 [00:17, 18.51it/s, step size=2.53e-01, acc. prob=0.955]


[outer 023] TRAIN (EMA+K-ens) ll=0.6724  br=0.2394  acc=0.6610


Sample: 100%|██████████| 330/330 [00:18, 18.14it/s, step size=3.48e-01, acc. prob=0.903]


[outer 024] TRAIN (EMA+K-ens) ll=0.6717  br=0.2390  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 19.89it/s, step size=3.07e-01, acc. prob=0.935]


[outer 025] TRAIN (EMA+K-ens) ll=0.6693  br=0.2379  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 20.48it/s, step size=3.11e-01, acc. prob=0.936]


[outer 026] TRAIN (EMA+K-ens) ll=0.6669  br=0.2367  acc=0.6780


Sample: 100%|██████████| 330/330 [00:17, 19.03it/s, step size=3.07e-01, acc. prob=0.941]


[outer 027] TRAIN (EMA+K-ens) ll=0.6678  br=0.2371  acc=0.6820


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=2.71e-01, acc. prob=0.944]


[outer 028] TRAIN (EMA+K-ens) ll=0.6689  br=0.2376  acc=0.6720


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=2.80e-01, acc. prob=0.957]


[outer 029] TRAIN (EMA+K-ens) ll=0.6691  br=0.2378  acc=0.6830


Sample: 100%|██████████| 330/330 [00:18, 18.17it/s, step size=2.88e-01, acc. prob=0.921]


[outer 030] TRAIN (EMA+K-ens) ll=0.6704  br=0.2384  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=2.90e-01, acc. prob=0.934]


[outer 031] TRAIN (EMA+K-ens) ll=0.6671  br=0.2368  acc=0.6820


Sample: 100%|██████████| 330/330 [00:15, 21.00it/s, step size=3.39e-01, acc. prob=0.913]


[outer 032] TRAIN (EMA+K-ens) ll=0.6678  br=0.2371  acc=0.6760


Sample: 100%|██████████| 330/330 [00:16, 20.16it/s, step size=3.12e-01, acc. prob=0.930]


[outer 033] TRAIN (EMA+K-ens) ll=0.6703  br=0.2383  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=3.52e-01, acc. prob=0.919]


[outer 034] TRAIN (EMA+K-ens) ll=0.6698  br=0.2382  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 20.02it/s, step size=3.11e-01, acc. prob=0.935]


[outer 035] TRAIN (EMA+K-ens) ll=0.6743  br=0.2403  acc=0.6710
[Early stop @ outer 35] Δll=0.083%, Δbr=0.101%, Δacc=0.005
[{'accuracy': 0.5907599925994873, 'brier': 0.2555178701877594, 'logloss': 0.7077370285987854}]


Sample: 100%|██████████| 330/330 [00:17, 18.35it/s, step size=2.57e-01, acc. prob=0.952]


[outer 000] TRAIN (EMA+K-ens) ll=0.7095  br=0.2573  acc=0.5490


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=3.19e-01, acc. prob=0.911]


[outer 001] TRAIN (EMA+K-ens) ll=0.6897  br=0.2478  acc=0.6000


Sample: 100%|██████████| 330/330 [00:18, 18.16it/s, step size=3.16e-01, acc. prob=0.931]


[outer 002] TRAIN (EMA+K-ens) ll=0.6923  br=0.2493  acc=0.5950


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=3.52e-01, acc. prob=0.937]


[outer 003] TRAIN (EMA+K-ens) ll=0.6901  br=0.2481  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=2.55e-01, acc. prob=0.953]


[outer 004] TRAIN (EMA+K-ens) ll=0.6863  br=0.2463  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=2.87e-01, acc. prob=0.920]


[outer 005] TRAIN (EMA+K-ens) ll=0.6914  br=0.2488  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 18.69it/s, step size=2.88e-01, acc. prob=0.934]


[outer 006] TRAIN (EMA+K-ens) ll=0.6920  br=0.2490  acc=0.6640


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=2.57e-01, acc. prob=0.944]


[outer 007] TRAIN (EMA+K-ens) ll=0.6900  br=0.2481  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 19.02it/s, step size=3.36e-01, acc. prob=0.904]


[outer 008] TRAIN (EMA+K-ens) ll=0.6888  br=0.2476  acc=0.6600


Sample: 100%|██████████| 330/330 [00:16, 19.47it/s, step size=3.22e-01, acc. prob=0.915]


[outer 009] TRAIN (EMA+K-ens) ll=0.6939  br=0.2500  acc=0.6640


Sample: 100%|██████████| 330/330 [00:15, 21.59it/s, step size=3.31e-01, acc. prob=0.937]


[outer 010] TRAIN (EMA+K-ens) ll=0.6954  br=0.2507  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=3.37e-01, acc. prob=0.945]


[outer 011] TRAIN (EMA+K-ens) ll=0.6992  br=0.2525  acc=0.6200


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.99e-01, acc. prob=0.944]


[outer 012] TRAIN (EMA+K-ens) ll=0.7013  br=0.2535  acc=0.5820


Sample: 100%|██████████| 330/330 [00:16, 19.53it/s, step size=3.01e-01, acc. prob=0.942]


[outer 013] TRAIN (EMA+K-ens) ll=0.7009  br=0.2534  acc=0.5820


Sample: 100%|██████████| 330/330 [00:18, 18.19it/s, step size=3.42e-01, acc. prob=0.893]


[outer 014] TRAIN (EMA+K-ens) ll=0.7000  br=0.2529  acc=0.5480


Sample: 100%|██████████| 330/330 [00:17, 19.29it/s, step size=2.53e-01, acc. prob=0.952]


[outer 015] TRAIN (EMA+K-ens) ll=0.7033  br=0.2542  acc=0.5670


Sample: 100%|██████████| 330/330 [00:18, 17.88it/s, step size=3.26e-01, acc. prob=0.926]


[outer 016] TRAIN (EMA+K-ens) ll=0.6998  br=0.2525  acc=0.5710


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=2.93e-01, acc. prob=0.949]


[outer 017] TRAIN (EMA+K-ens) ll=0.6971  br=0.2512  acc=0.5850


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=3.23e-01, acc. prob=0.942]


[outer 018] TRAIN (EMA+K-ens) ll=0.6942  br=0.2500  acc=0.5940


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=2.78e-01, acc. prob=0.932]


[outer 019] TRAIN (EMA+K-ens) ll=0.6917  br=0.2488  acc=0.6240


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=3.22e-01, acc. prob=0.924]


[outer 020] TRAIN (EMA+K-ens) ll=0.6927  br=0.2493  acc=0.6080


Sample: 100%|██████████| 330/330 [00:17, 18.87it/s, step size=3.15e-01, acc. prob=0.919]


[outer 021] TRAIN (EMA+K-ens) ll=0.6893  br=0.2476  acc=0.6060


Sample: 100%|██████████| 330/330 [00:17, 19.14it/s, step size=2.76e-01, acc. prob=0.938]


[outer 022] TRAIN (EMA+K-ens) ll=0.6879  br=0.2470  acc=0.6490


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=2.96e-01, acc. prob=0.943]


[outer 023] TRAIN (EMA+K-ens) ll=0.6840  br=0.2451  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 20.24it/s, step size=2.72e-01, acc. prob=0.957]


[outer 024] TRAIN (EMA+K-ens) ll=0.6871  br=0.2466  acc=0.6340


Sample: 100%|██████████| 330/330 [00:18, 18.18it/s, step size=2.91e-01, acc. prob=0.921]


[outer 025] TRAIN (EMA+K-ens) ll=0.6905  br=0.2482  acc=0.6160


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.71e-01, acc. prob=0.949]


[outer 026] TRAIN (EMA+K-ens) ll=0.6892  br=0.2476  acc=0.6120


Sample: 100%|██████████| 330/330 [00:18, 17.55it/s, step size=2.77e-01, acc. prob=0.949]


[outer 027] TRAIN (EMA+K-ens) ll=0.6878  br=0.2469  acc=0.6100


Sample: 100%|██████████| 330/330 [00:17, 19.28it/s, step size=2.60e-01, acc. prob=0.963]


[outer 028] TRAIN (EMA+K-ens) ll=0.6851  br=0.2456  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 19.01it/s, step size=2.47e-01, acc. prob=0.959]


[outer 029] TRAIN (EMA+K-ens) ll=0.6808  br=0.2435  acc=0.6200


Sample: 100%|██████████| 330/330 [00:16, 19.56it/s, step size=3.17e-01, acc. prob=0.939]


[outer 030] TRAIN (EMA+K-ens) ll=0.6787  br=0.2426  acc=0.6160


Sample: 100%|██████████| 330/330 [00:17, 19.04it/s, step size=2.76e-01, acc. prob=0.935]


[outer 031] TRAIN (EMA+K-ens) ll=0.6779  br=0.2423  acc=0.6150


Sample: 100%|██████████| 330/330 [00:17, 19.28it/s, step size=2.86e-01, acc. prob=0.942]


[outer 032] TRAIN (EMA+K-ens) ll=0.6742  br=0.2405  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=2.61e-01, acc. prob=0.947]


[outer 033] TRAIN (EMA+K-ens) ll=0.6710  br=0.2389  acc=0.6500


Sample: 100%|██████████| 330/330 [00:18, 17.93it/s, step size=2.62e-01, acc. prob=0.949]


[outer 034] TRAIN (EMA+K-ens) ll=0.6734  br=0.2401  acc=0.6620


Sample: 100%|██████████| 330/330 [00:19, 17.20it/s, step size=3.34e-01, acc. prob=0.937]


[outer 035] TRAIN (EMA+K-ens) ll=0.6722  br=0.2395  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 20.21it/s, step size=2.93e-01, acc. prob=0.941]


[outer 036] TRAIN (EMA+K-ens) ll=0.6705  br=0.2387  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.71it/s, step size=2.90e-01, acc. prob=0.938]


[outer 037] TRAIN (EMA+K-ens) ll=0.6727  br=0.2398  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=2.86e-01, acc. prob=0.952]


[outer 038] TRAIN (EMA+K-ens) ll=0.6734  br=0.2401  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.45it/s, step size=3.18e-01, acc. prob=0.909]


[outer 039] TRAIN (EMA+K-ens) ll=0.6715  br=0.2391  acc=0.6720
[{'accuracy': 0.5907599925994873, 'brier': 0.2555178701877594, 'logloss': 0.7077370285987854}, {'accuracy': 0.6608799695968628, 'brier': 0.23256878554821014, 'logloss': 0.6581940650939941}]


Sample: 100%|██████████| 330/330 [00:17, 19.35it/s, step size=2.64e-01, acc. prob=0.957]


[outer 000] TRAIN (EMA+K-ens) ll=0.7348  br=0.2673  acc=0.5460


Sample: 100%|██████████| 330/330 [00:17, 19.29it/s, step size=3.14e-01, acc. prob=0.907]


[outer 001] TRAIN (EMA+K-ens) ll=0.6824  br=0.2444  acc=0.5800


Sample: 100%|██████████| 330/330 [00:17, 19.02it/s, step size=2.93e-01, acc. prob=0.939]


[outer 002] TRAIN (EMA+K-ens) ll=0.6971  br=0.2511  acc=0.5770


Sample: 100%|██████████| 330/330 [00:16, 20.32it/s, step size=3.12e-01, acc. prob=0.916]


[outer 003] TRAIN (EMA+K-ens) ll=0.6914  br=0.2484  acc=0.6020


Sample: 100%|██████████| 330/330 [00:16, 19.82it/s, step size=2.36e-01, acc. prob=0.967]


[outer 004] TRAIN (EMA+K-ens) ll=0.6825  br=0.2442  acc=0.6010


Sample: 100%|██████████| 330/330 [00:16, 19.90it/s, step size=3.25e-01, acc. prob=0.936]


[outer 005] TRAIN (EMA+K-ens) ll=0.6862  br=0.2459  acc=0.6250


Sample: 100%|██████████| 330/330 [00:19, 17.21it/s, step size=2.74e-01, acc. prob=0.946]


[outer 006] TRAIN (EMA+K-ens) ll=0.6885  br=0.2471  acc=0.6390


Sample: 100%|██████████| 330/330 [00:17, 19.05it/s, step size=2.81e-01, acc. prob=0.936]


[outer 007] TRAIN (EMA+K-ens) ll=0.6847  br=0.2453  acc=0.6600


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=3.33e-01, acc. prob=0.941]


[outer 008] TRAIN (EMA+K-ens) ll=0.6815  br=0.2436  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=2.87e-01, acc. prob=0.943]


[outer 009] TRAIN (EMA+K-ens) ll=0.6885  br=0.2470  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 20.47it/s, step size=3.07e-01, acc. prob=0.935]


[outer 010] TRAIN (EMA+K-ens) ll=0.6819  br=0.2439  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.84it/s, step size=2.99e-01, acc. prob=0.941]


[outer 011] TRAIN (EMA+K-ens) ll=0.6826  br=0.2442  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=3.32e-01, acc. prob=0.934]


[outer 012] TRAIN (EMA+K-ens) ll=0.6885  br=0.2470  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 20.81it/s, step size=3.40e-01, acc. prob=0.926]


[outer 013] TRAIN (EMA+K-ens) ll=0.6830  br=0.2443  acc=0.6760


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.77e-01, acc. prob=0.946]


[outer 014] TRAIN (EMA+K-ens) ll=0.6822  br=0.2438  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=3.59e-01, acc. prob=0.905]


[outer 015] TRAIN (EMA+K-ens) ll=0.6824  br=0.2439  acc=0.6990


Sample: 100%|██████████| 330/330 [00:17, 18.85it/s, step size=3.33e-01, acc. prob=0.919]


[outer 016] TRAIN (EMA+K-ens) ll=0.6882  br=0.2466  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 19.52it/s, step size=3.18e-01, acc. prob=0.908]


[outer 017] TRAIN (EMA+K-ens) ll=0.6847  br=0.2451  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 18.49it/s, step size=3.18e-01, acc. prob=0.949]


[outer 018] TRAIN (EMA+K-ens) ll=0.6844  br=0.2449  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 20.16it/s, step size=3.32e-01, acc. prob=0.898]


[outer 019] TRAIN (EMA+K-ens) ll=0.6791  br=0.2423  acc=0.6620


Sample: 100%|██████████| 330/330 [00:18, 17.95it/s, step size=3.24e-01, acc. prob=0.929]


[outer 020] TRAIN (EMA+K-ens) ll=0.6736  br=0.2398  acc=0.6550


Sample: 100%|██████████| 330/330 [00:15, 20.63it/s, step size=2.74e-01, acc. prob=0.938]


[outer 021] TRAIN (EMA+K-ens) ll=0.6741  br=0.2399  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 19.18it/s, step size=2.53e-01, acc. prob=0.942]


[outer 022] TRAIN (EMA+K-ens) ll=0.6684  br=0.2371  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.90it/s, step size=3.13e-01, acc. prob=0.942]


[outer 023] TRAIN (EMA+K-ens) ll=0.6663  br=0.2362  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 20.18it/s, step size=3.88e-01, acc. prob=0.910]


[outer 024] TRAIN (EMA+K-ens) ll=0.6623  br=0.2343  acc=0.7080


Sample: 100%|██████████| 330/330 [00:16, 20.41it/s, step size=3.32e-01, acc. prob=0.940]


[outer 025] TRAIN (EMA+K-ens) ll=0.6620  br=0.2340  acc=0.7070


Sample: 100%|██████████| 330/330 [00:17, 19.19it/s, step size=3.00e-01, acc. prob=0.944]


[outer 026] TRAIN (EMA+K-ens) ll=0.6613  br=0.2337  acc=0.6920


Sample: 100%|██████████| 330/330 [00:16, 20.23it/s, step size=2.91e-01, acc. prob=0.931]


[outer 027] TRAIN (EMA+K-ens) ll=0.6626  br=0.2343  acc=0.6950


Sample: 100%|██████████| 330/330 [00:17, 19.01it/s, step size=2.74e-01, acc. prob=0.952]


[outer 028] TRAIN (EMA+K-ens) ll=0.6675  br=0.2366  acc=0.6740


Sample: 100%|██████████| 330/330 [00:15, 21.14it/s, step size=2.93e-01, acc. prob=0.935]


[outer 029] TRAIN (EMA+K-ens) ll=0.6685  br=0.2370  acc=0.7020


Sample: 100%|██████████| 330/330 [00:17, 19.23it/s, step size=2.97e-01, acc. prob=0.912]


[outer 030] TRAIN (EMA+K-ens) ll=0.6729  br=0.2391  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 19.48it/s, step size=2.84e-01, acc. prob=0.952]


[outer 031] TRAIN (EMA+K-ens) ll=0.6765  br=0.2408  acc=0.6590


Sample: 100%|██████████| 330/330 [00:17, 19.20it/s, step size=3.15e-01, acc. prob=0.933]


[outer 032] TRAIN (EMA+K-ens) ll=0.6770  br=0.2410  acc=0.6470


Sample: 100%|██████████| 330/330 [00:16, 20.59it/s, step size=3.07e-01, acc. prob=0.933]


[outer 033] TRAIN (EMA+K-ens) ll=0.6812  br=0.2429  acc=0.6330


Sample: 100%|██████████| 330/330 [00:18, 17.74it/s, step size=2.81e-01, acc. prob=0.941]


[outer 034] TRAIN (EMA+K-ens) ll=0.6781  br=0.2415  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=2.92e-01, acc. prob=0.952]


[outer 035] TRAIN (EMA+K-ens) ll=0.6787  br=0.2421  acc=0.6390


Sample: 100%|██████████| 330/330 [00:16, 19.97it/s, step size=3.16e-01, acc. prob=0.925]


[outer 036] TRAIN (EMA+K-ens) ll=0.6772  br=0.2414  acc=0.6530


Sample: 100%|██████████| 330/330 [00:17, 19.35it/s, step size=3.04e-01, acc. prob=0.950]


[outer 037] TRAIN (EMA+K-ens) ll=0.6751  br=0.2404  acc=0.6360


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=3.51e-01, acc. prob=0.936]


[outer 038] TRAIN (EMA+K-ens) ll=0.6755  br=0.2405  acc=0.6540


Sample: 100%|██████████| 330/330 [00:15, 21.24it/s, step size=2.87e-01, acc. prob=0.956]


[outer 039] TRAIN (EMA+K-ens) ll=0.6717  br=0.2385  acc=0.6650
[{'accuracy': 0.5907599925994873, 'brier': 0.2555178701877594, 'logloss': 0.7077370285987854}, {'accuracy': 0.6608799695968628, 'brier': 0.23256878554821014, 'logloss': 0.6581940650939941}, {'accuracy': 0.6812999844551086, 'brier': 0.23211956024169922, 'logloss': 0.6584541201591492}]


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=2.76e-01, acc. prob=0.965]


[outer 000] TRAIN (EMA+K-ens) ll=0.7068  br=0.2532  acc=0.5280


Sample: 100%|██████████| 330/330 [00:16, 20.06it/s, step size=2.95e-01, acc. prob=0.911]


[outer 001] TRAIN (EMA+K-ens) ll=0.6854  br=0.2453  acc=0.6070


Sample: 100%|██████████| 330/330 [00:16, 20.42it/s, step size=2.53e-01, acc. prob=0.959]


[outer 002] TRAIN (EMA+K-ens) ll=0.6844  br=0.2446  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 18.58it/s, step size=2.33e-01, acc. prob=0.967]


[outer 003] TRAIN (EMA+K-ens) ll=0.6841  br=0.2445  acc=0.6300


Sample: 100%|██████████| 330/330 [00:16, 19.44it/s, step size=2.65e-01, acc. prob=0.967]


[outer 004] TRAIN (EMA+K-ens) ll=0.6809  br=0.2432  acc=0.6720


Sample: 100%|██████████| 330/330 [00:18, 17.92it/s, step size=2.61e-01, acc. prob=0.951]


[outer 005] TRAIN (EMA+K-ens) ll=0.6774  br=0.2415  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 19.04it/s, step size=3.06e-01, acc. prob=0.950]


[outer 006] TRAIN (EMA+K-ens) ll=0.6746  br=0.2403  acc=0.6670


Sample: 100%|██████████| 330/330 [00:15, 21.59it/s, step size=2.91e-01, acc. prob=0.924]


[outer 007] TRAIN (EMA+K-ens) ll=0.6746  br=0.2403  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 18.78it/s, step size=3.11e-01, acc. prob=0.924]


[outer 008] TRAIN (EMA+K-ens) ll=0.6707  br=0.2385  acc=0.6670


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.29e-01, acc. prob=0.908]


[outer 009] TRAIN (EMA+K-ens) ll=0.6641  br=0.2352  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 20.25it/s, step size=2.88e-01, acc. prob=0.936]


[outer 010] TRAIN (EMA+K-ens) ll=0.6651  br=0.2357  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 20.06it/s, step size=2.76e-01, acc. prob=0.906]


[outer 011] TRAIN (EMA+K-ens) ll=0.6593  br=0.2329  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 20.19it/s, step size=2.97e-01, acc. prob=0.930]


[outer 012] TRAIN (EMA+K-ens) ll=0.6610  br=0.2336  acc=0.6840


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=2.95e-01, acc. prob=0.950]


[outer 013] TRAIN (EMA+K-ens) ll=0.6664  br=0.2362  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 18.82it/s, step size=2.90e-01, acc. prob=0.920]


[outer 014] TRAIN (EMA+K-ens) ll=0.6599  br=0.2330  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 18.19it/s, step size=3.13e-01, acc. prob=0.919]


[outer 015] TRAIN (EMA+K-ens) ll=0.6657  br=0.2357  acc=0.6690


Sample: 100%|██████████| 330/330 [00:18, 17.89it/s, step size=2.79e-01, acc. prob=0.942]


[outer 016] TRAIN (EMA+K-ens) ll=0.6659  br=0.2358  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=2.93e-01, acc. prob=0.930]


[outer 017] TRAIN (EMA+K-ens) ll=0.6794  br=0.2424  acc=0.6470


Sample: 100%|██████████| 330/330 [00:15, 20.70it/s, step size=3.03e-01, acc. prob=0.911]


[outer 018] TRAIN (EMA+K-ens) ll=0.6754  br=0.2403  acc=0.6570


Sample: 100%|██████████| 330/330 [00:16, 20.42it/s, step size=3.17e-01, acc. prob=0.939]


[outer 019] TRAIN (EMA+K-ens) ll=0.6829  br=0.2440  acc=0.6610


Sample: 100%|██████████| 330/330 [00:15, 20.94it/s, step size=2.34e-01, acc. prob=0.951]


[outer 020] TRAIN (EMA+K-ens) ll=0.6799  br=0.2427  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.73it/s, step size=3.47e-01, acc. prob=0.919]


[outer 021] TRAIN (EMA+K-ens) ll=0.6783  br=0.2419  acc=0.6620


Sample: 100%|██████████| 330/330 [00:16, 19.74it/s, step size=3.21e-01, acc. prob=0.939]


[outer 022] TRAIN (EMA+K-ens) ll=0.6832  br=0.2442  acc=0.6450


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=3.80e-01, acc. prob=0.894]


[outer 023] TRAIN (EMA+K-ens) ll=0.6766  br=0.2412  acc=0.6260


Sample: 100%|██████████| 330/330 [00:17, 18.58it/s, step size=2.99e-01, acc. prob=0.952]


[outer 024] TRAIN (EMA+K-ens) ll=0.6778  br=0.2417  acc=0.6330


Sample: 100%|██████████| 330/330 [00:16, 19.49it/s, step size=2.91e-01, acc. prob=0.958]


[outer 025] TRAIN (EMA+K-ens) ll=0.6658  br=0.2358  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=2.98e-01, acc. prob=0.953]


[outer 026] TRAIN (EMA+K-ens) ll=0.6656  br=0.2359  acc=0.6700


Sample: 100%|██████████| 330/330 [00:17, 19.08it/s, step size=3.19e-01, acc. prob=0.929]


[outer 027] TRAIN (EMA+K-ens) ll=0.6602  br=0.2333  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 19.75it/s, step size=3.37e-01, acc. prob=0.921]


[outer 028] TRAIN (EMA+K-ens) ll=0.6578  br=0.2322  acc=0.6900


Sample: 100%|██████████| 330/330 [00:15, 20.72it/s, step size=3.10e-01, acc. prob=0.918]


[outer 029] TRAIN (EMA+K-ens) ll=0.6528  br=0.2298  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.88it/s, step size=3.17e-01, acc. prob=0.931]


[outer 030] TRAIN (EMA+K-ens) ll=0.6523  br=0.2296  acc=0.6980


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=2.96e-01, acc. prob=0.926]


[outer 031] TRAIN (EMA+K-ens) ll=0.6545  br=0.2307  acc=0.6990


Sample: 100%|██████████| 330/330 [00:16, 20.03it/s, step size=3.38e-01, acc. prob=0.909]


[outer 032] TRAIN (EMA+K-ens) ll=0.6534  br=0.2302  acc=0.6960


Sample: 100%|██████████| 330/330 [00:16, 19.81it/s, step size=2.97e-01, acc. prob=0.940]


[outer 033] TRAIN (EMA+K-ens) ll=0.6541  br=0.2305  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.80e-01, acc. prob=0.934]


[outer 034] TRAIN (EMA+K-ens) ll=0.6574  br=0.2321  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=3.10e-01, acc. prob=0.922]


[outer 035] TRAIN (EMA+K-ens) ll=0.6593  br=0.2330  acc=0.6820


Sample: 100%|██████████| 330/330 [00:16, 19.80it/s, step size=3.31e-01, acc. prob=0.936]


[outer 036] TRAIN (EMA+K-ens) ll=0.6645  br=0.2355  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 20.32it/s, step size=3.05e-01, acc. prob=0.934]


[outer 037] TRAIN (EMA+K-ens) ll=0.6700  br=0.2381  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 20.27it/s, step size=3.38e-01, acc. prob=0.901]


[outer 038] TRAIN (EMA+K-ens) ll=0.6736  br=0.2397  acc=0.6790


Sample: 100%|██████████| 330/330 [00:15, 21.00it/s, step size=2.92e-01, acc. prob=0.917]


[outer 039] TRAIN (EMA+K-ens) ll=0.6704  br=0.2381  acc=0.6890
[{'accuracy': 0.5907599925994873, 'brier': 0.2555178701877594, 'logloss': 0.7077370285987854}, {'accuracy': 0.6608799695968628, 'brier': 0.23256878554821014, 'logloss': 0.6581940650939941}, {'accuracy': 0.6812999844551086, 'brier': 0.23211956024169922, 'logloss': 0.6584541201591492}, {'accuracy': 0.6370599865913391, 'brier': 0.24393577873706818, 'logloss': 0.6829428672790527}]


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=2.90e-01, acc. prob=0.919]


[outer 000] TRAIN (EMA+K-ens) ll=0.7063  br=0.2559  acc=0.4890


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=2.94e-01, acc. prob=0.938]


[outer 001] TRAIN (EMA+K-ens) ll=0.7162  br=0.2611  acc=0.5070


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.63e-01, acc. prob=0.939]


[outer 002] TRAIN (EMA+K-ens) ll=0.7032  br=0.2546  acc=0.5830


Sample: 100%|██████████| 330/330 [00:17, 18.63it/s, step size=2.89e-01, acc. prob=0.942]


[outer 003] TRAIN (EMA+K-ens) ll=0.6941  br=0.2502  acc=0.6080


Sample: 100%|██████████| 330/330 [00:18, 17.62it/s, step size=2.29e-01, acc. prob=0.956]


[outer 004] TRAIN (EMA+K-ens) ll=0.6965  br=0.2514  acc=0.6110


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=2.96e-01, acc. prob=0.915]


[outer 005] TRAIN (EMA+K-ens) ll=0.6975  br=0.2518  acc=0.5850


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=3.59e-01, acc. prob=0.905]


[outer 006] TRAIN (EMA+K-ens) ll=0.6946  br=0.2504  acc=0.6020


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=3.11e-01, acc. prob=0.931]


[outer 007] TRAIN (EMA+K-ens) ll=0.6949  br=0.2505  acc=0.6110


Sample: 100%|██████████| 330/330 [00:15, 20.99it/s, step size=2.95e-01, acc. prob=0.910]


[outer 008] TRAIN (EMA+K-ens) ll=0.6964  br=0.2511  acc=0.6240


Sample: 100%|██████████| 330/330 [00:18, 18.00it/s, step size=2.67e-01, acc. prob=0.948]


[outer 009] TRAIN (EMA+K-ens) ll=0.6908  br=0.2484  acc=0.6430


Sample: 100%|██████████| 330/330 [00:16, 19.94it/s, step size=2.65e-01, acc. prob=0.962]


[outer 010] TRAIN (EMA+K-ens) ll=0.6897  br=0.2478  acc=0.6360


Sample: 100%|██████████| 330/330 [00:15, 20.88it/s, step size=3.14e-01, acc. prob=0.923]


[outer 011] TRAIN (EMA+K-ens) ll=0.6935  br=0.2496  acc=0.6380


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=3.20e-01, acc. prob=0.920]


[outer 012] TRAIN (EMA+K-ens) ll=0.6886  br=0.2473  acc=0.6420


Sample: 100%|██████████| 330/330 [00:15, 20.84it/s, step size=3.36e-01, acc. prob=0.929]


[outer 013] TRAIN (EMA+K-ens) ll=0.6846  br=0.2455  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.48it/s, step size=3.53e-01, acc. prob=0.921]


[outer 014] TRAIN (EMA+K-ens) ll=0.6825  br=0.2445  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 19.85it/s, step size=2.96e-01, acc. prob=0.926]


[outer 015] TRAIN (EMA+K-ens) ll=0.6810  br=0.2437  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=2.87e-01, acc. prob=0.954]


[outer 016] TRAIN (EMA+K-ens) ll=0.6777  br=0.2422  acc=0.6360


Sample: 100%|██████████| 330/330 [00:15, 20.74it/s, step size=2.87e-01, acc. prob=0.925]


[outer 017] TRAIN (EMA+K-ens) ll=0.6772  br=0.2419  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 19.41it/s, step size=2.83e-01, acc. prob=0.930]


[outer 018] TRAIN (EMA+K-ens) ll=0.6776  br=0.2421  acc=0.6400


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=2.68e-01, acc. prob=0.941]


[outer 019] TRAIN (EMA+K-ens) ll=0.6774  br=0.2420  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=3.37e-01, acc. prob=0.928]


[outer 020] TRAIN (EMA+K-ens) ll=0.6783  br=0.2423  acc=0.6340


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=3.10e-01, acc. prob=0.940]


[outer 021] TRAIN (EMA+K-ens) ll=0.6822  br=0.2442  acc=0.6170


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=3.29e-01, acc. prob=0.934]


[outer 022] TRAIN (EMA+K-ens) ll=0.6851  br=0.2456  acc=0.6070


Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=2.84e-01, acc. prob=0.954]


[outer 023] TRAIN (EMA+K-ens) ll=0.6804  br=0.2433  acc=0.6190


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=3.25e-01, acc. prob=0.925]


[outer 024] TRAIN (EMA+K-ens) ll=0.6861  br=0.2460  acc=0.5840


Sample: 100%|██████████| 330/330 [00:18, 17.50it/s, step size=2.81e-01, acc. prob=0.932]


[outer 025] TRAIN (EMA+K-ens) ll=0.6903  br=0.2480  acc=0.5790


Sample: 100%|██████████| 330/330 [00:16, 20.15it/s, step size=2.64e-01, acc. prob=0.963]


[outer 026] TRAIN (EMA+K-ens) ll=0.6841  br=0.2450  acc=0.5990


Sample: 100%|██████████| 330/330 [00:16, 19.50it/s, step size=2.95e-01, acc. prob=0.935]


[outer 027] TRAIN (EMA+K-ens) ll=0.6846  br=0.2452  acc=0.6090


Sample: 100%|██████████| 330/330 [00:17, 19.30it/s, step size=2.47e-01, acc. prob=0.949]


[outer 028] TRAIN (EMA+K-ens) ll=0.6872  br=0.2464  acc=0.6220


Sample: 100%|██████████| 330/330 [00:16, 20.41it/s, step size=3.20e-01, acc. prob=0.908]


[outer 029] TRAIN (EMA+K-ens) ll=0.6864  br=0.2459  acc=0.6270


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=2.91e-01, acc. prob=0.934]


[outer 030] TRAIN (EMA+K-ens) ll=0.6915  br=0.2482  acc=0.6300


Sample: 100%|██████████| 330/330 [00:17, 18.87it/s, step size=3.11e-01, acc. prob=0.935]


[outer 031] TRAIN (EMA+K-ens) ll=0.6944  br=0.2497  acc=0.6470


Sample: 100%|██████████| 330/330 [00:18, 18.23it/s, step size=2.72e-01, acc. prob=0.942]


[outer 032] TRAIN (EMA+K-ens) ll=0.6893  br=0.2475  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=3.25e-01, acc. prob=0.925]


[outer 033] TRAIN (EMA+K-ens) ll=0.6866  br=0.2462  acc=0.6630


Sample: 100%|██████████| 330/330 [00:18, 17.43it/s, step size=2.45e-01, acc. prob=0.907]


[outer 034] TRAIN (EMA+K-ens) ll=0.6939  br=0.2497  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 18.83it/s, step size=2.57e-01, acc. prob=0.956]


[outer 035] TRAIN (EMA+K-ens) ll=0.6894  br=0.2476  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 18.63it/s, step size=3.03e-01, acc. prob=0.928]


[outer 036] TRAIN (EMA+K-ens) ll=0.6878  br=0.2468  acc=0.6600


Sample: 100%|██████████| 330/330 [00:16, 20.44it/s, step size=3.05e-01, acc. prob=0.909]


[outer 037] TRAIN (EMA+K-ens) ll=0.6787  br=0.2426  acc=0.6730


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=2.86e-01, acc. prob=0.944]


[outer 038] TRAIN (EMA+K-ens) ll=0.6750  br=0.2408  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 19.87it/s, step size=3.40e-01, acc. prob=0.917]


[outer 039] TRAIN (EMA+K-ens) ll=0.6781  br=0.2423  acc=0.6720
[{'accuracy': 0.5907599925994873, 'brier': 0.2555178701877594, 'logloss': 0.7077370285987854}, {'accuracy': 0.6608799695968628, 'brier': 0.23256878554821014, 'logloss': 0.6581940650939941}, {'accuracy': 0.6812999844551086, 'brier': 0.23211956024169922, 'logloss': 0.6584541201591492}, {'accuracy': 0.6370599865913391, 'brier': 0.24393577873706818, 'logloss': 0.6829428672790527}, {'accuracy': 0.6067599654197693, 'brier': 0.24485008418560028, 'logloss': 0.6843493580818176}]


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.89e-01, acc. prob=0.943]


[outer 000] TRAIN (EMA+K-ens) ll=0.6917  br=0.2479  acc=0.5580


Sample: 100%|██████████| 330/330 [00:17, 18.90it/s, step size=3.04e-01, acc. prob=0.926]


[outer 001] TRAIN (EMA+K-ens) ll=0.6843  br=0.2452  acc=0.5770


Sample: 100%|██████████| 330/330 [00:16, 19.45it/s, step size=2.93e-01, acc. prob=0.925]


[outer 002] TRAIN (EMA+K-ens) ll=0.6680  br=0.2373  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 19.94it/s, step size=3.72e-01, acc. prob=0.918]


[outer 003] TRAIN (EMA+K-ens) ll=0.6738  br=0.2403  acc=0.6130


Sample: 100%|██████████| 330/330 [00:16, 19.76it/s, step size=3.68e-01, acc. prob=0.915]


[outer 004] TRAIN (EMA+K-ens) ll=0.6715  br=0.2392  acc=0.6300


Sample: 100%|██████████| 330/330 [00:16, 19.53it/s, step size=3.49e-01, acc. prob=0.914]


[outer 005] TRAIN (EMA+K-ens) ll=0.6730  br=0.2399  acc=0.6110


Sample: 100%|██████████| 330/330 [00:19, 16.54it/s, step size=2.42e-01, acc. prob=0.943]


[outer 006] TRAIN (EMA+K-ens) ll=0.6754  br=0.2410  acc=0.6270


Sample: 100%|██████████| 330/330 [00:16, 20.25it/s, step size=2.94e-01, acc. prob=0.933]


[outer 007] TRAIN (EMA+K-ens) ll=0.6771  br=0.2418  acc=0.6260


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=2.47e-01, acc. prob=0.941]


[outer 008] TRAIN (EMA+K-ens) ll=0.6769  br=0.2417  acc=0.6440


Sample: 100%|██████████| 330/330 [00:17, 18.77it/s, step size=2.61e-01, acc. prob=0.956]


[outer 009] TRAIN (EMA+K-ens) ll=0.6765  br=0.2415  acc=0.6250


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=2.76e-01, acc. prob=0.943]


[outer 010] TRAIN (EMA+K-ens) ll=0.6768  br=0.2416  acc=0.6410


Sample: 100%|██████████| 330/330 [00:15, 20.98it/s, step size=2.93e-01, acc. prob=0.943]


[outer 011] TRAIN (EMA+K-ens) ll=0.6731  br=0.2397  acc=0.6510


Sample: 100%|██████████| 330/330 [00:17, 19.01it/s, step size=2.66e-01, acc. prob=0.956]


[outer 012] TRAIN (EMA+K-ens) ll=0.6762  br=0.2411  acc=0.6400


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=2.64e-01, acc. prob=0.946]


[outer 013] TRAIN (EMA+K-ens) ll=0.6734  br=0.2397  acc=0.6670


Sample: 100%|██████████| 330/330 [00:19, 17.04it/s, step size=2.68e-01, acc. prob=0.930]


[outer 014] TRAIN (EMA+K-ens) ll=0.6659  br=0.2361  acc=0.6450


Sample: 100%|██████████| 330/330 [00:15, 21.40it/s, step size=3.28e-01, acc. prob=0.948]


[outer 015] TRAIN (EMA+K-ens) ll=0.6624  br=0.2345  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=2.99e-01, acc. prob=0.907]


[outer 016] TRAIN (EMA+K-ens) ll=0.6616  br=0.2340  acc=0.6830


Sample: 100%|██████████| 330/330 [00:17, 18.54it/s, step size=3.34e-01, acc. prob=0.933]


[outer 017] TRAIN (EMA+K-ens) ll=0.6620  br=0.2340  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.09e-01, acc. prob=0.932]


[outer 018] TRAIN (EMA+K-ens) ll=0.6643  br=0.2353  acc=0.6880


Sample: 100%|██████████| 330/330 [00:16, 19.47it/s, step size=3.48e-01, acc. prob=0.932]


[outer 019] TRAIN (EMA+K-ens) ll=0.6636  br=0.2350  acc=0.6690


Sample: 100%|██████████| 330/330 [00:15, 21.71it/s, step size=3.90e-01, acc. prob=0.883]


[outer 020] TRAIN (EMA+K-ens) ll=0.6636  br=0.2349  acc=0.6900


Sample: 100%|██████████| 330/330 [00:17, 18.89it/s, step size=3.35e-01, acc. prob=0.908]


[outer 021] TRAIN (EMA+K-ens) ll=0.6650  br=0.2356  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=2.83e-01, acc. prob=0.959]


[outer 022] TRAIN (EMA+K-ens) ll=0.6674  br=0.2369  acc=0.6620


Sample: 100%|██████████| 330/330 [00:18, 18.04it/s, step size=2.88e-01, acc. prob=0.946]


[outer 023] TRAIN (EMA+K-ens) ll=0.6716  br=0.2390  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 20.26it/s, step size=2.74e-01, acc. prob=0.959]


[outer 024] TRAIN (EMA+K-ens) ll=0.6715  br=0.2389  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 19.07it/s, step size=2.80e-01, acc. prob=0.908]


[outer 025] TRAIN (EMA+K-ens) ll=0.6702  br=0.2382  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 18.99it/s, step size=2.29e-01, acc. prob=0.963]


[outer 026] TRAIN (EMA+K-ens) ll=0.6672  br=0.2367  acc=0.6730


Sample: 100%|██████████| 330/330 [00:19, 17.36it/s, step size=2.61e-01, acc. prob=0.938]


[outer 027] TRAIN (EMA+K-ens) ll=0.6709  br=0.2386  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=2.78e-01, acc. prob=0.935]


[outer 028] TRAIN (EMA+K-ens) ll=0.6708  br=0.2386  acc=0.6820


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=2.84e-01, acc. prob=0.925]


[outer 029] TRAIN (EMA+K-ens) ll=0.6722  br=0.2392  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.80it/s, step size=3.12e-01, acc. prob=0.956]


[outer 030] TRAIN (EMA+K-ens) ll=0.6716  br=0.2389  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=2.96e-01, acc. prob=0.940]


[outer 031] TRAIN (EMA+K-ens) ll=0.6674  br=0.2369  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=2.89e-01, acc. prob=0.936]


[outer 032] TRAIN (EMA+K-ens) ll=0.6714  br=0.2389  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 19.02it/s, step size=2.80e-01, acc. prob=0.928]


[outer 033] TRAIN (EMA+K-ens) ll=0.6684  br=0.2375  acc=0.6790


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=3.03e-01, acc. prob=0.942]


[outer 034] TRAIN (EMA+K-ens) ll=0.6759  br=0.2411  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 18.91it/s, step size=3.57e-01, acc. prob=0.899]


[outer 035] TRAIN (EMA+K-ens) ll=0.6751  br=0.2405  acc=0.6630


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=3.28e-01, acc. prob=0.924]


[outer 036] TRAIN (EMA+K-ens) ll=0.6724  br=0.2391  acc=0.6700


Sample: 100%|██████████| 330/330 [00:16, 19.64it/s, step size=3.11e-01, acc. prob=0.950]


[outer 037] TRAIN (EMA+K-ens) ll=0.6674  br=0.2367  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 19.29it/s, step size=2.98e-01, acc. prob=0.937]


[outer 038] TRAIN (EMA+K-ens) ll=0.6682  br=0.2371  acc=0.6950


Sample: 100%|██████████| 330/330 [00:16, 19.84it/s, step size=2.68e-01, acc. prob=0.941]


[outer 039] TRAIN (EMA+K-ens) ll=0.6762  br=0.2408  acc=0.6540


Sample: 100%|██████████| 330/330 [00:18, 18.30it/s, step size=2.82e-01, acc. prob=0.955]


[outer 000] TRAIN (EMA+K-ens) ll=0.6822  br=0.2442  acc=0.6130


Sample: 100%|██████████| 330/330 [00:16, 19.65it/s, step size=2.99e-01, acc. prob=0.939]


[outer 001] TRAIN (EMA+K-ens) ll=0.6852  br=0.2458  acc=0.6070


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=3.03e-01, acc. prob=0.918]


[outer 002] TRAIN (EMA+K-ens) ll=0.6853  br=0.2459  acc=0.6160


Sample: 100%|██████████| 330/330 [00:17, 19.03it/s, step size=3.03e-01, acc. prob=0.937]


[outer 003] TRAIN (EMA+K-ens) ll=0.6891  br=0.2476  acc=0.5950


Sample: 100%|██████████| 330/330 [00:15, 20.78it/s, step size=2.86e-01, acc. prob=0.921]


[outer 004] TRAIN (EMA+K-ens) ll=0.6873  br=0.2467  acc=0.5960


Sample: 100%|██████████| 330/330 [00:17, 19.20it/s, step size=3.20e-01, acc. prob=0.925]


[outer 005] TRAIN (EMA+K-ens) ll=0.6877  br=0.2468  acc=0.6040


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=2.62e-01, acc. prob=0.968]


[outer 006] TRAIN (EMA+K-ens) ll=0.6874  br=0.2467  acc=0.5890


Sample: 100%|██████████| 330/330 [00:18, 17.76it/s, step size=2.74e-01, acc. prob=0.948]


[outer 007] TRAIN (EMA+K-ens) ll=0.6870  br=0.2465  acc=0.6240


Sample: 100%|██████████| 330/330 [00:16, 19.44it/s, step size=2.82e-01, acc. prob=0.950]


[outer 008] TRAIN (EMA+K-ens) ll=0.6846  br=0.2453  acc=0.6310


Sample: 100%|██████████| 330/330 [00:15, 20.76it/s, step size=2.84e-01, acc. prob=0.954]


[outer 009] TRAIN (EMA+K-ens) ll=0.6836  br=0.2448  acc=0.6230


Sample: 100%|██████████| 330/330 [00:16, 20.01it/s, step size=3.22e-01, acc. prob=0.898]


[outer 010] TRAIN (EMA+K-ens) ll=0.6831  br=0.2444  acc=0.6220


Sample: 100%|██████████| 330/330 [00:15, 20.76it/s, step size=3.39e-01, acc. prob=0.926]


[outer 011] TRAIN (EMA+K-ens) ll=0.6803  br=0.2431  acc=0.6280


Sample: 100%|██████████| 330/330 [00:17, 18.97it/s, step size=2.56e-01, acc. prob=0.954]


[outer 012] TRAIN (EMA+K-ens) ll=0.6841  br=0.2451  acc=0.6360


Sample: 100%|██████████| 330/330 [00:17, 18.53it/s, step size=3.00e-01, acc. prob=0.942]


[outer 013] TRAIN (EMA+K-ens) ll=0.6796  br=0.2430  acc=0.6390


Sample: 100%|██████████| 330/330 [00:17, 18.54it/s, step size=2.25e-01, acc. prob=0.969]


[outer 014] TRAIN (EMA+K-ens) ll=0.6772  br=0.2418  acc=0.6410


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=2.11e-01, acc. prob=0.953]


[outer 015] TRAIN (EMA+K-ens) ll=0.6803  br=0.2431  acc=0.6240


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=3.18e-01, acc. prob=0.942]


[outer 016] TRAIN (EMA+K-ens) ll=0.6862  br=0.2460  acc=0.6270


Sample: 100%|██████████| 330/330 [00:18, 17.70it/s, step size=2.94e-01, acc. prob=0.948]


[outer 017] TRAIN (EMA+K-ens) ll=0.6891  br=0.2474  acc=0.6120


Sample: 100%|██████████| 330/330 [00:17, 19.28it/s, step size=2.58e-01, acc. prob=0.961]


[outer 018] TRAIN (EMA+K-ens) ll=0.6904  br=0.2481  acc=0.6220


Sample: 100%|██████████| 330/330 [00:18, 17.98it/s, step size=2.49e-01, acc. prob=0.973]


[outer 019] TRAIN (EMA+K-ens) ll=0.6941  br=0.2498  acc=0.6180


Sample: 100%|██████████| 330/330 [00:17, 19.36it/s, step size=2.79e-01, acc. prob=0.940]


[outer 020] TRAIN (EMA+K-ens) ll=0.6906  br=0.2481  acc=0.6240


Sample: 100%|██████████| 330/330 [00:18, 17.83it/s, step size=2.63e-01, acc. prob=0.946]


[outer 021] TRAIN (EMA+K-ens) ll=0.6990  br=0.2519  acc=0.6190


Sample: 100%|██████████| 330/330 [00:17, 18.79it/s, step size=3.08e-01, acc. prob=0.930]


[outer 022] TRAIN (EMA+K-ens) ll=0.7014  br=0.2533  acc=0.5890


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=2.99e-01, acc. prob=0.930]


[outer 023] TRAIN (EMA+K-ens) ll=0.6995  br=0.2524  acc=0.5950


Sample: 100%|██████████| 330/330 [00:17, 18.82it/s, step size=2.66e-01, acc. prob=0.941]


[outer 024] TRAIN (EMA+K-ens) ll=0.7017  br=0.2533  acc=0.5870


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.97e-01, acc. prob=0.954]


[outer 025] TRAIN (EMA+K-ens) ll=0.7028  br=0.2536  acc=0.5970


Sample: 100%|██████████| 330/330 [00:17, 18.52it/s, step size=3.10e-01, acc. prob=0.925]


[outer 026] TRAIN (EMA+K-ens) ll=0.7038  br=0.2543  acc=0.6200


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=3.03e-01, acc. prob=0.914]


[outer 027] TRAIN (EMA+K-ens) ll=0.7003  br=0.2527  acc=0.6300


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.30e-01, acc. prob=0.937]


[outer 028] TRAIN (EMA+K-ens) ll=0.6997  br=0.2524  acc=0.6300


Sample: 100%|██████████| 330/330 [00:16, 20.26it/s, step size=2.72e-01, acc. prob=0.950]


[outer 029] TRAIN (EMA+K-ens) ll=0.6950  br=0.2504  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=2.68e-01, acc. prob=0.939]


[outer 030] TRAIN (EMA+K-ens) ll=0.6954  br=0.2505  acc=0.6480


Sample: 100%|██████████| 330/330 [00:18, 18.05it/s, step size=3.49e-01, acc. prob=0.877]


[outer 031] TRAIN (EMA+K-ens) ll=0.6941  br=0.2500  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 19.11it/s, step size=2.51e-01, acc. prob=0.945]


[outer 032] TRAIN (EMA+K-ens) ll=0.6908  br=0.2483  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.92e-01, acc. prob=0.941]


[outer 033] TRAIN (EMA+K-ens) ll=0.6858  br=0.2459  acc=0.6680


Sample: 100%|██████████| 330/330 [00:18, 17.74it/s, step size=2.84e-01, acc. prob=0.938]


[outer 034] TRAIN (EMA+K-ens) ll=0.6816  br=0.2439  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=2.77e-01, acc. prob=0.957]


[outer 035] TRAIN (EMA+K-ens) ll=0.6829  br=0.2445  acc=0.6500


Sample: 100%|██████████| 330/330 [00:18, 17.70it/s, step size=2.90e-01, acc. prob=0.942]


[outer 036] TRAIN (EMA+K-ens) ll=0.6823  br=0.2442  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 20.44it/s, step size=3.45e-01, acc. prob=0.924]


[outer 037] TRAIN (EMA+K-ens) ll=0.6877  br=0.2467  acc=0.6520


Sample: 100%|██████████| 330/330 [00:19, 17.12it/s, step size=2.93e-01, acc. prob=0.940]


[outer 038] TRAIN (EMA+K-ens) ll=0.6890  br=0.2473  acc=0.6560


Sample: 100%|██████████| 330/330 [00:19, 17.30it/s, step size=2.85e-01, acc. prob=0.944]


[outer 039] TRAIN (EMA+K-ens) ll=0.6892  br=0.2472  acc=0.6490


Sample: 100%|██████████| 330/330 [00:18, 18.11it/s, step size=2.46e-01, acc. prob=0.958]


[outer 000] TRAIN (EMA+K-ens) ll=0.6756  br=0.2413  acc=0.5900


Sample: 100%|██████████| 330/330 [00:16, 19.77it/s, step size=3.04e-01, acc. prob=0.913]


[outer 001] TRAIN (EMA+K-ens) ll=0.6882  br=0.2470  acc=0.5970


Sample: 100%|██████████| 330/330 [00:15, 21.03it/s, step size=3.30e-01, acc. prob=0.901]


[outer 002] TRAIN (EMA+K-ens) ll=0.6831  br=0.2445  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=2.65e-01, acc. prob=0.945]


[outer 003] TRAIN (EMA+K-ens) ll=0.6805  br=0.2432  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 18.95it/s, step size=3.24e-01, acc. prob=0.907]


[outer 004] TRAIN (EMA+K-ens) ll=0.6848  br=0.2453  acc=0.6510


Sample: 100%|██████████| 330/330 [00:18, 18.10it/s, step size=2.69e-01, acc. prob=0.964]


[outer 005] TRAIN (EMA+K-ens) ll=0.6791  br=0.2426  acc=0.6630


Sample: 100%|██████████| 330/330 [00:18, 18.28it/s, step size=2.79e-01, acc. prob=0.951]


[outer 006] TRAIN (EMA+K-ens) ll=0.6773  br=0.2417  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 19.47it/s, step size=3.38e-01, acc. prob=0.921]


[outer 007] TRAIN (EMA+K-ens) ll=0.6763  br=0.2412  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 19.97it/s, step size=2.56e-01, acc. prob=0.945]


[outer 008] TRAIN (EMA+K-ens) ll=0.6787  br=0.2424  acc=0.6570


Sample: 100%|██████████| 330/330 [00:16, 20.49it/s, step size=3.23e-01, acc. prob=0.944]


[outer 009] TRAIN (EMA+K-ens) ll=0.6739  br=0.2400  acc=0.6700


Sample: 100%|██████████| 330/330 [00:17, 18.63it/s, step size=3.36e-01, acc. prob=0.926]


[outer 010] TRAIN (EMA+K-ens) ll=0.6719  br=0.2390  acc=0.6600


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=2.89e-01, acc. prob=0.964]


[outer 011] TRAIN (EMA+K-ens) ll=0.6775  br=0.2418  acc=0.6720


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=2.62e-01, acc. prob=0.942]


[outer 012] TRAIN (EMA+K-ens) ll=0.6733  br=0.2397  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 20.53it/s, step size=3.11e-01, acc. prob=0.943]


[outer 013] TRAIN (EMA+K-ens) ll=0.6751  br=0.2404  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 18.05it/s, step size=2.77e-01, acc. prob=0.941]


[outer 014] TRAIN (EMA+K-ens) ll=0.6777  br=0.2417  acc=0.6780


Sample: 100%|██████████| 330/330 [00:16, 20.17it/s, step size=2.93e-01, acc. prob=0.946]


[outer 015] TRAIN (EMA+K-ens) ll=0.6737  br=0.2398  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=2.91e-01, acc. prob=0.930]


[outer 016] TRAIN (EMA+K-ens) ll=0.6762  br=0.2407  acc=0.6820


Sample: 100%|██████████| 330/330 [00:17, 18.81it/s, step size=2.58e-01, acc. prob=0.950]


[outer 017] TRAIN (EMA+K-ens) ll=0.6778  br=0.2412  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 20.60it/s, step size=2.71e-01, acc. prob=0.959]


[outer 018] TRAIN (EMA+K-ens) ll=0.6813  br=0.2430  acc=0.6340


Sample: 100%|██████████| 330/330 [00:19, 17.28it/s, step size=3.16e-01, acc. prob=0.925]


[outer 019] TRAIN (EMA+K-ens) ll=0.6774  br=0.2406  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=3.14e-01, acc. prob=0.948]


[outer 020] TRAIN (EMA+K-ens) ll=0.6752  br=0.2399  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 20.22it/s, step size=3.00e-01, acc. prob=0.916]


[outer 021] TRAIN (EMA+K-ens) ll=0.6762  br=0.2402  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 20.20it/s, step size=3.20e-01, acc. prob=0.959]


[outer 022] TRAIN (EMA+K-ens) ll=0.6726  br=0.2388  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 19.56it/s, step size=3.27e-01, acc. prob=0.933]


[outer 023] TRAIN (EMA+K-ens) ll=0.6791  br=0.2418  acc=0.6770


Sample: 100%|██████████| 330/330 [00:15, 20.93it/s, step size=2.98e-01, acc. prob=0.943]


[outer 024] TRAIN (EMA+K-ens) ll=0.6752  br=0.2400  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.17it/s, step size=3.20e-01, acc. prob=0.920]


[outer 025] TRAIN (EMA+K-ens) ll=0.6739  br=0.2397  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 19.77it/s, step size=2.95e-01, acc. prob=0.936]


[outer 026] TRAIN (EMA+K-ens) ll=0.6716  br=0.2385  acc=0.6900


Sample: 100%|██████████| 330/330 [00:17, 19.20it/s, step size=2.82e-01, acc. prob=0.956]


[outer 027] TRAIN (EMA+K-ens) ll=0.6736  br=0.2397  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 19.64it/s, step size=2.78e-01, acc. prob=0.955]


[outer 028] TRAIN (EMA+K-ens) ll=0.6752  br=0.2404  acc=0.6800


Sample: 100%|██████████| 330/330 [00:18, 17.97it/s, step size=2.75e-01, acc. prob=0.947]


[outer 029] TRAIN (EMA+K-ens) ll=0.6752  br=0.2406  acc=0.6960


Sample: 100%|██████████| 330/330 [00:17, 18.82it/s, step size=2.64e-01, acc. prob=0.936]


[outer 030] TRAIN (EMA+K-ens) ll=0.6753  br=0.2406  acc=0.6830


Sample: 100%|██████████| 330/330 [00:17, 18.71it/s, step size=2.97e-01, acc. prob=0.940]


[outer 031] TRAIN (EMA+K-ens) ll=0.6703  br=0.2382  acc=0.6810


Sample: 100%|██████████| 330/330 [00:14, 22.09it/s, step size=3.11e-01, acc. prob=0.921]


[outer 032] TRAIN (EMA+K-ens) ll=0.6689  br=0.2375  acc=0.6770


Sample: 100%|██████████| 330/330 [00:16, 20.12it/s, step size=2.86e-01, acc. prob=0.942]


[outer 033] TRAIN (EMA+K-ens) ll=0.6662  br=0.2363  acc=0.6760


Sample: 100%|██████████| 330/330 [00:18, 18.19it/s, step size=2.93e-01, acc. prob=0.955]


[outer 034] TRAIN (EMA+K-ens) ll=0.6653  br=0.2357  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 18.54it/s, step size=2.56e-01, acc. prob=0.966]


[outer 035] TRAIN (EMA+K-ens) ll=0.6644  br=0.2352  acc=0.7010


Sample: 100%|██████████| 330/330 [00:16, 20.02it/s, step size=2.99e-01, acc. prob=0.935]


[outer 036] TRAIN (EMA+K-ens) ll=0.6632  br=0.2348  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.78it/s, step size=3.65e-01, acc. prob=0.937]


[outer 037] TRAIN (EMA+K-ens) ll=0.6616  br=0.2339  acc=0.6950


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=3.09e-01, acc. prob=0.914]


[outer 038] TRAIN (EMA+K-ens) ll=0.6646  br=0.2354  acc=0.6910


Sample: 100%|██████████| 330/330 [00:16, 19.81it/s, step size=2.88e-01, acc. prob=0.942]


[outer 039] TRAIN (EMA+K-ens) ll=0.6684  br=0.2372  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=2.87e-01, acc. prob=0.913]


[outer 000] TRAIN (EMA+K-ens) ll=0.7079  br=0.2564  acc=0.5900


Sample: 100%|██████████| 330/330 [00:19, 16.98it/s, step size=2.69e-01, acc. prob=0.943]


[outer 001] TRAIN (EMA+K-ens) ll=0.6976  br=0.2515  acc=0.5890


Sample: 100%|██████████| 330/330 [00:17, 18.40it/s, step size=2.81e-01, acc. prob=0.943]


[outer 002] TRAIN (EMA+K-ens) ll=0.6732  br=0.2398  acc=0.6090


Sample: 100%|██████████| 330/330 [00:16, 20.38it/s, step size=2.82e-01, acc. prob=0.942]


[outer 003] TRAIN (EMA+K-ens) ll=0.6794  br=0.2427  acc=0.6150


Sample: 100%|██████████| 330/330 [00:18, 18.13it/s, step size=2.82e-01, acc. prob=0.947]


[outer 004] TRAIN (EMA+K-ens) ll=0.6772  br=0.2415  acc=0.6220


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=2.97e-01, acc. prob=0.949]


[outer 005] TRAIN (EMA+K-ens) ll=0.6824  br=0.2440  acc=0.6150


Sample: 100%|██████████| 330/330 [00:18, 18.19it/s, step size=3.81e-01, acc. prob=0.879]


[outer 006] TRAIN (EMA+K-ens) ll=0.6824  br=0.2440  acc=0.6210


Sample: 100%|██████████| 330/330 [00:17, 18.65it/s, step size=2.80e-01, acc. prob=0.952]


[outer 007] TRAIN (EMA+K-ens) ll=0.6838  br=0.2447  acc=0.6120


Sample: 100%|██████████| 330/330 [00:16, 20.18it/s, step size=3.05e-01, acc. prob=0.941]


[outer 008] TRAIN (EMA+K-ens) ll=0.6840  br=0.2447  acc=0.6140


Sample: 100%|██████████| 330/330 [00:16, 20.42it/s, step size=2.45e-01, acc. prob=0.972]


[outer 009] TRAIN (EMA+K-ens) ll=0.6808  br=0.2431  acc=0.6250


Sample: 100%|██████████| 330/330 [00:16, 19.41it/s, step size=2.81e-01, acc. prob=0.924]


[outer 010] TRAIN (EMA+K-ens) ll=0.6851  br=0.2450  acc=0.6350


Sample: 100%|██████████| 330/330 [00:15, 20.78it/s, step size=4.16e-01, acc. prob=0.883]


[outer 011] TRAIN (EMA+K-ens) ll=0.6836  br=0.2444  acc=0.6430


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=3.32e-01, acc. prob=0.934]


[outer 012] TRAIN (EMA+K-ens) ll=0.6854  br=0.2451  acc=0.6300


Sample: 100%|██████████| 330/330 [00:17, 18.91it/s, step size=3.09e-01, acc. prob=0.939]


[outer 013] TRAIN (EMA+K-ens) ll=0.6796  br=0.2423  acc=0.6520


Sample: 100%|██████████| 330/330 [00:19, 16.55it/s, step size=2.56e-01, acc. prob=0.954]


[outer 014] TRAIN (EMA+K-ens) ll=0.6776  br=0.2415  acc=0.6620


Sample: 100%|██████████| 330/330 [00:18, 18.20it/s, step size=3.30e-01, acc. prob=0.909]


[outer 015] TRAIN (EMA+K-ens) ll=0.6743  br=0.2399  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 18.34it/s, step size=2.84e-01, acc. prob=0.928]


[outer 016] TRAIN (EMA+K-ens) ll=0.6711  br=0.2383  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 19.97it/s, step size=2.73e-01, acc. prob=0.933]


[outer 017] TRAIN (EMA+K-ens) ll=0.6690  br=0.2374  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 18.70it/s, step size=3.63e-01, acc. prob=0.899]


[outer 018] TRAIN (EMA+K-ens) ll=0.6764  br=0.2411  acc=0.6530


Sample: 100%|██████████| 330/330 [00:17, 18.47it/s, step size=2.97e-01, acc. prob=0.949]


[outer 019] TRAIN (EMA+K-ens) ll=0.6794  br=0.2424  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 18.80it/s, step size=2.99e-01, acc. prob=0.953]


[outer 020] TRAIN (EMA+K-ens) ll=0.6821  br=0.2438  acc=0.6450


Sample: 100%|██████████| 330/330 [00:16, 19.81it/s, step size=2.87e-01, acc. prob=0.952]


[outer 021] TRAIN (EMA+K-ens) ll=0.6848  br=0.2450  acc=0.6610


Sample: 100%|██████████| 330/330 [00:15, 20.64it/s, step size=2.99e-01, acc. prob=0.955]


[outer 022] TRAIN (EMA+K-ens) ll=0.6862  br=0.2454  acc=0.6510


Sample: 100%|██████████| 330/330 [00:18, 17.72it/s, step size=2.69e-01, acc. prob=0.949]


[outer 023] TRAIN (EMA+K-ens) ll=0.6839  br=0.2441  acc=0.6840


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=3.25e-01, acc. prob=0.932]


[outer 024] TRAIN (EMA+K-ens) ll=0.6827  br=0.2438  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=3.16e-01, acc. prob=0.940]


[outer 025] TRAIN (EMA+K-ens) ll=0.6853  br=0.2449  acc=0.6600


Sample: 100%|██████████| 330/330 [00:16, 19.51it/s, step size=3.16e-01, acc. prob=0.926]


[outer 026] TRAIN (EMA+K-ens) ll=0.6754  br=0.2401  acc=0.6750


Sample: 100%|██████████| 330/330 [00:15, 20.68it/s, step size=3.21e-01, acc. prob=0.924]


[outer 027] TRAIN (EMA+K-ens) ll=0.6697  br=0.2376  acc=0.6800


Sample: 100%|██████████| 330/330 [00:15, 20.96it/s, step size=3.18e-01, acc. prob=0.915]


[outer 028] TRAIN (EMA+K-ens) ll=0.6701  br=0.2379  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.42it/s, step size=3.67e-01, acc. prob=0.902]


[outer 029] TRAIN (EMA+K-ens) ll=0.6705  br=0.2381  acc=0.6850


Sample: 100%|██████████| 330/330 [00:17, 18.86it/s, step size=3.60e-01, acc. prob=0.904]


[outer 030] TRAIN (EMA+K-ens) ll=0.6686  br=0.2374  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 19.22it/s, step size=3.16e-01, acc. prob=0.936]


[outer 031] TRAIN (EMA+K-ens) ll=0.6740  br=0.2399  acc=0.6830


Sample: 100%|██████████| 330/330 [00:17, 18.49it/s, step size=2.97e-01, acc. prob=0.921]


[outer 032] TRAIN (EMA+K-ens) ll=0.6760  br=0.2410  acc=0.6940


Sample: 100%|██████████| 330/330 [00:16, 20.34it/s, step size=2.92e-01, acc. prob=0.947]


[outer 033] TRAIN (EMA+K-ens) ll=0.6753  br=0.2407  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.71e-01, acc. prob=0.955]


[outer 034] TRAIN (EMA+K-ens) ll=0.6790  br=0.2425  acc=0.6850


Sample: 100%|██████████| 330/330 [00:17, 18.70it/s, step size=2.96e-01, acc. prob=0.949]


[outer 035] TRAIN (EMA+K-ens) ll=0.6788  br=0.2423  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 19.48it/s, step size=3.36e-01, acc. prob=0.911]


[outer 036] TRAIN (EMA+K-ens) ll=0.6731  br=0.2396  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=2.90e-01, acc. prob=0.952]


[outer 037] TRAIN (EMA+K-ens) ll=0.6720  br=0.2391  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=2.79e-01, acc. prob=0.956]


[outer 038] TRAIN (EMA+K-ens) ll=0.6697  br=0.2380  acc=0.6810


Sample: 100%|██████████| 330/330 [00:16, 19.84it/s, step size=3.22e-01, acc. prob=0.945]


[outer 039] TRAIN (EMA+K-ens) ll=0.6693  br=0.2378  acc=0.6700
[{'accuracy': 0.5907599925994873, 'brier': 0.2555178701877594, 'logloss': 0.7077370285987854}, {'accuracy': 0.6608799695968628, 'brier': 0.23256878554821014, 'logloss': 0.6581940650939941}, {'accuracy': 0.6812999844551086, 'brier': 0.23211956024169922, 'logloss': 0.6584541201591492}, {'accuracy': 0.6370599865913391, 'brier': 0.24393577873706818, 'logloss': 0.6829428672790527}, {'accuracy': 0.6067599654197693, 'brier': 0.24485008418560028, 'logloss': 0.6843493580818176}, {'accuracy': 0.605239987373352, 'brier': 0.24954001605510712, 'logloss': 0.694911539554596}, {'accuracy': 0.649459958076477, 'brier': 0.24179817736148834, 'logloss': 0.6774382591247559}, {'accuracy': 0.5776599645614624, 'brier': 0.25739505887031555, 'logloss': 0.7132560610771179}, {'accuracy': 0.5586400032043457, 'brier': 0.25128015875816345, 'logloss': 0.699569046497345}]


Sample: 100%|██████████| 330/330 [00:19, 17.29it/s, step size=2.21e-01, acc. prob=0.966]


[outer 000] TRAIN (EMA+K-ens) ll=0.7021  br=0.2528  acc=0.5630


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=3.01e-01, acc. prob=0.942]


[outer 001] TRAIN (EMA+K-ens) ll=0.6669  br=0.2359  acc=0.6240


Sample: 100%|██████████| 330/330 [00:17, 18.36it/s, step size=3.15e-01, acc. prob=0.910]


[outer 002] TRAIN (EMA+K-ens) ll=0.6701  br=0.2383  acc=0.6520


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=2.97e-01, acc. prob=0.925]


[outer 003] TRAIN (EMA+K-ens) ll=0.6650  br=0.2355  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=2.81e-01, acc. prob=0.953]


[outer 004] TRAIN (EMA+K-ens) ll=0.6677  br=0.2369  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=2.99e-01, acc. prob=0.941]


[outer 005] TRAIN (EMA+K-ens) ll=0.6669  br=0.2365  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 19.53it/s, step size=3.29e-01, acc. prob=0.898]


[outer 006] TRAIN (EMA+K-ens) ll=0.6692  br=0.2376  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=3.35e-01, acc. prob=0.946]


[outer 007] TRAIN (EMA+K-ens) ll=0.6692  br=0.2376  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 20.45it/s, step size=3.14e-01, acc. prob=0.933]


[outer 008] TRAIN (EMA+K-ens) ll=0.6706  br=0.2382  acc=0.6960


Sample: 100%|██████████| 330/330 [00:17, 19.20it/s, step size=2.51e-01, acc. prob=0.951]


[outer 009] TRAIN (EMA+K-ens) ll=0.6702  br=0.2380  acc=0.6990


Sample: 100%|██████████| 330/330 [00:16, 20.42it/s, step size=3.33e-01, acc. prob=0.924]


[outer 010] TRAIN (EMA+K-ens) ll=0.6727  br=0.2389  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 19.43it/s, step size=3.31e-01, acc. prob=0.920]


[outer 011] TRAIN (EMA+K-ens) ll=0.6723  br=0.2389  acc=0.6940


Sample: 100%|██████████| 330/330 [00:16, 19.50it/s, step size=2.74e-01, acc. prob=0.937]


[outer 012] TRAIN (EMA+K-ens) ll=0.6694  br=0.2372  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 18.83it/s, step size=3.20e-01, acc. prob=0.909]


[outer 013] TRAIN (EMA+K-ens) ll=0.6699  br=0.2374  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 19.17it/s, step size=2.66e-01, acc. prob=0.952]


[outer 014] TRAIN (EMA+K-ens) ll=0.6676  br=0.2363  acc=0.6960


Sample: 100%|██████████| 330/330 [00:15, 20.91it/s, step size=4.01e-01, acc. prob=0.898]


[outer 015] TRAIN (EMA+K-ens) ll=0.6671  br=0.2360  acc=0.6850


Sample: 100%|██████████| 330/330 [00:18, 17.43it/s, step size=3.07e-01, acc. prob=0.950]


[outer 016] TRAIN (EMA+K-ens) ll=0.6635  br=0.2343  acc=0.7030


Sample: 100%|██████████| 330/330 [00:16, 20.04it/s, step size=3.17e-01, acc. prob=0.908]


[outer 017] TRAIN (EMA+K-ens) ll=0.6705  br=0.2378  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 19.79it/s, step size=2.59e-01, acc. prob=0.950]


[outer 018] TRAIN (EMA+K-ens) ll=0.6641  br=0.2349  acc=0.6920


Sample: 100%|██████████| 330/330 [00:15, 20.89it/s, step size=3.39e-01, acc. prob=0.919]


[outer 019] TRAIN (EMA+K-ens) ll=0.6631  br=0.2342  acc=0.6920


Sample: 100%|██████████| 330/330 [00:15, 20.82it/s, step size=3.65e-01, acc. prob=0.929]


[outer 020] TRAIN (EMA+K-ens) ll=0.6649  br=0.2353  acc=0.7030


Sample: 100%|██████████| 330/330 [00:15, 20.96it/s, step size=3.42e-01, acc. prob=0.948]


[outer 021] TRAIN (EMA+K-ens) ll=0.6681  br=0.2369  acc=0.6960


Sample: 100%|██████████| 330/330 [00:16, 20.44it/s, step size=2.65e-01, acc. prob=0.964]


[outer 022] TRAIN (EMA+K-ens) ll=0.6615  br=0.2337  acc=0.7020


Sample: 100%|██████████| 330/330 [00:17, 18.46it/s, step size=2.48e-01, acc. prob=0.959]


[outer 023] TRAIN (EMA+K-ens) ll=0.6605  br=0.2334  acc=0.7080


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=3.08e-01, acc. prob=0.926]


[outer 024] TRAIN (EMA+K-ens) ll=0.6600  br=0.2331  acc=0.7080


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=3.43e-01, acc. prob=0.922]


[outer 025] TRAIN (EMA+K-ens) ll=0.6552  br=0.2308  acc=0.7000


Sample: 100%|██████████| 330/330 [00:16, 19.63it/s, step size=2.99e-01, acc. prob=0.939]


[outer 026] TRAIN (EMA+K-ens) ll=0.6588  br=0.2325  acc=0.6960


Sample: 100%|██████████| 330/330 [00:15, 21.44it/s, step size=2.80e-01, acc. prob=0.935]


[outer 027] TRAIN (EMA+K-ens) ll=0.6626  br=0.2344  acc=0.7120


Sample: 100%|██████████| 330/330 [00:16, 19.85it/s, step size=3.50e-01, acc. prob=0.910]


[outer 028] TRAIN (EMA+K-ens) ll=0.6694  br=0.2375  acc=0.7070


Sample: 100%|██████████| 330/330 [00:16, 19.44it/s, step size=2.59e-01, acc. prob=0.953]


[outer 029] TRAIN (EMA+K-ens) ll=0.6679  br=0.2367  acc=0.7120


Sample: 100%|██████████| 330/330 [00:17, 19.04it/s, step size=2.75e-01, acc. prob=0.956]


[outer 030] TRAIN (EMA+K-ens) ll=0.6750  br=0.2401  acc=0.7080


Sample: 100%|██████████| 330/330 [00:16, 20.44it/s, step size=3.00e-01, acc. prob=0.940]


[outer 031] TRAIN (EMA+K-ens) ll=0.6799  br=0.2423  acc=0.6990


Sample: 100%|██████████| 330/330 [00:17, 18.45it/s, step size=3.48e-01, acc. prob=0.901]


[outer 032] TRAIN (EMA+K-ens) ll=0.6813  br=0.2429  acc=0.7000


Sample: 100%|██████████| 330/330 [00:17, 19.31it/s, step size=2.98e-01, acc. prob=0.925]


[outer 033] TRAIN (EMA+K-ens) ll=0.6843  br=0.2444  acc=0.6980


Sample: 100%|██████████| 330/330 [00:18, 17.79it/s, step size=2.46e-01, acc. prob=0.959]


[outer 034] TRAIN (EMA+K-ens) ll=0.6825  br=0.2437  acc=0.7060


Sample: 100%|██████████| 330/330 [00:17, 19.04it/s, step size=2.80e-01, acc. prob=0.952]


[outer 035] TRAIN (EMA+K-ens) ll=0.6815  br=0.2434  acc=0.6970


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=2.53e-01, acc. prob=0.954]


[outer 036] TRAIN (EMA+K-ens) ll=0.6779  br=0.2417  acc=0.6940


Sample: 100%|██████████| 330/330 [00:17, 18.85it/s, step size=2.74e-01, acc. prob=0.923]


[outer 037] TRAIN (EMA+K-ens) ll=0.6837  br=0.2445  acc=0.6880


Sample: 100%|██████████| 330/330 [00:16, 20.61it/s, step size=3.13e-01, acc. prob=0.923]


[outer 038] TRAIN (EMA+K-ens) ll=0.6846  br=0.2449  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 19.75it/s, step size=2.64e-01, acc. prob=0.946]


[outer 039] TRAIN (EMA+K-ens) ll=0.6840  br=0.2446  acc=0.6810
[{'accuracy': 0.5907599925994873, 'brier': 0.2555178701877594, 'logloss': 0.7077370285987854}, {'accuracy': 0.6608799695968628, 'brier': 0.23256878554821014, 'logloss': 0.6581940650939941}, {'accuracy': 0.6812999844551086, 'brier': 0.23211956024169922, 'logloss': 0.6584541201591492}, {'accuracy': 0.6370599865913391, 'brier': 0.24393577873706818, 'logloss': 0.6829428672790527}, {'accuracy': 0.6067599654197693, 'brier': 0.24485008418560028, 'logloss': 0.6843493580818176}, {'accuracy': 0.605239987373352, 'brier': 0.24954001605510712, 'logloss': 0.694911539554596}, {'accuracy': 0.649459958076477, 'brier': 0.24179817736148834, 'logloss': 0.6774382591247559}, {'accuracy': 0.5776599645614624, 'brier': 0.25739505887031555, 'logloss': 0.7132560610771179}, {'accuracy': 0.5586400032043457, 'brier': 0.25128015875816345, 'logloss': 0.699569046497345}, {'accuracy': 0.5994399785995483, 'brier': 0.24625147879123688, 'logloss': 0.6891066431
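
Each dictionary in the list above holds three test metrics: `accuracy`, `brier`, and `logloss`. As a reading aid, here is a minimal sketch of how such metrics are commonly computed from predicted probabilities for binary labels. The helper name `binary_metrics` and the 0.5 decision threshold are assumptions for illustration; this is not necessarily how the notebook's own prediction function computes them.

```python
import numpy as np


def binary_metrics(y_true, p_pred, eps=1e-12):
    """Accuracy, Brier score and log loss for binary labels (hypothetical helper)."""
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    accuracy = float(np.mean((p >= 0.5) == (y == 1.0)))
    brier = float(np.mean((p - y) ** 2))
    logloss = float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
    return {"accuracy": accuracy, "brier": brier, "logloss": logloss}


# An uninformative forecast p = 0.5 gives brier = 0.25 and logloss = ln 2.
baseline = binary_metrics([0, 1, 1, 0], [0.5, 0.5, 0.5, 0.5])
```

The constant forecast is a useful reference point: its Brier score is 0.25 and its log loss is ln 2 ≈ 0.693. The values logged in this section (Brier roughly 0.23 to 0.26, log loss roughly 0.66 to 0.71) sit close to that baseline.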

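
The next cell sets `noise_type = "cauchy"`, so the training data are drawn with Cauchy-distributed errors. The sketch below illustrates what that means for a latent-utility binary choice model. It is not the notebook's `simulate_dataset`; the function name `simulate_cauchy_choices`, the Gaussian features, and the coefficient values are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_cauchy_choices(n, beta, seed=0):
    """Binary choices y = 1[x @ beta + eps > 0] with standard Cauchy eps (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, len(beta)))          # assumed standard normal features
    utility = x @ np.asarray(beta, dtype=float)  # deterministic part of the utility
    eps = rng.standard_cauchy(size=n)            # heavy-tailed error term
    y = (utility + eps > 0).astype(int)
    # True choice probability under Cauchy errors: F(u) = 1/2 + arctan(u) / pi.
    p_true = 0.5 + np.arctan(utility) / np.pi
    return x, y, p_true


x, y, p_true = simulate_cauchy_choices(n=200_000, beta=[1.0, -0.5])
# The empirical choice share should match the average true probability.
gap = abs(y.mean() - p_true.mean())
```

The implied link function is the Cauchy CDF, which approaches 0 and 1 far more slowly than the logistic function. A model fit with a logistic likelihood is therefore misspecified in the tails.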
In [None]:

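import numpy as np
import pandas as pd
import torch

all_metrics = []
noise_type = "cauchy"
n_per_group_train = 200
use_long = False  # not referenced in this cell


for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible.
    np.random.seed(seed)
    torch.manual_seed(seed)

    # Draw a fresh training set with Cauchy-distributed errors.
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=n_per_group_train,
    )
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_simulated_cauchy_test, feature_cols,
        interaction=True, nonlinear=True, group=True,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    # Predict on the test set with the final θ
    # (alternatively, use only p_mean from the last 50% of the NUTS draws).
    # The three positional flags presumably mirror interaction, nonlinear, group above.
    p_test, m = predict_probit(res["final_theta"], df_simulated_cauchy_test, feature_cols, True, True, True)
    all_metrics.append(m)


def q25(s):
    """First quartile of a column."""
    return s.quantile(0.25)


def q75(s):
    """Third quartile of a column."""
    return s.quantile(0.75)


# Aggregate the per-seed test metrics; named functions give readable row labels.
df = pd.DataFrame(all_metrics)
summary = df.agg(["mean", "std", "median", q25, q75])
print(summary)
print(df)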

Sample: 100%|██████████| 330/330 [00:17, 19.08it/s, step size=2.46e-01, acc. prob=0.940]


[outer 000] TRAIN (EMA+K-ens) ll=0.7558  br=0.2792  acc=0.4950


Sample: 100%|██████████| 330/330 [00:16, 19.59it/s, step size=2.96e-01, acc. prob=0.945]


[outer 001] TRAIN (EMA+K-ens) ll=0.7025  br=0.2541  acc=0.5530


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=2.96e-01, acc. prob=0.927]


[outer 002] TRAIN (EMA+K-ens) ll=0.6836  br=0.2450  acc=0.5970


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=2.82e-01, acc. prob=0.940]


[outer 003] TRAIN (EMA+K-ens) ll=0.6751  br=0.2409  acc=0.6230


Sample: 100%|██████████| 330/330 [00:16, 19.59it/s, step size=2.80e-01, acc. prob=0.949]


[outer 004] TRAIN (EMA+K-ens) ll=0.6822  br=0.2443  acc=0.6240


Sample: 100%|██████████| 330/330 [00:15, 21.08it/s, step size=3.15e-01, acc. prob=0.931]


[outer 005] TRAIN (EMA+K-ens) ll=0.6774  br=0.2419  acc=0.6240


Sample: 100%|██████████| 330/330 [00:17, 18.78it/s, step size=2.83e-01, acc. prob=0.947]


[outer 006] TRAIN (EMA+K-ens) ll=0.6757  br=0.2410  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=2.74e-01, acc. prob=0.933]


[outer 007] TRAIN (EMA+K-ens) ll=0.6775  br=0.2418  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 18.13it/s, step size=2.94e-01, acc. prob=0.932]


[outer 008] TRAIN (EMA+K-ens) ll=0.6687  br=0.2374  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 19.63it/s, step size=3.15e-01, acc. prob=0.938]


[outer 009] TRAIN (EMA+K-ens) ll=0.6711  br=0.2387  acc=0.6830


Sample: 100%|██████████| 330/330 [00:18, 18.27it/s, step size=3.10e-01, acc. prob=0.927]


[outer 010] TRAIN (EMA+K-ens) ll=0.6777  br=0.2417  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 19.65it/s, step size=2.55e-01, acc. prob=0.958]


[outer 011] TRAIN (EMA+K-ens) ll=0.6806  br=0.2431  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=2.81e-01, acc. prob=0.939]


[outer 012] TRAIN (EMA+K-ens) ll=0.6782  br=0.2419  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 20.06it/s, step size=3.05e-01, acc. prob=0.949]


[outer 013] TRAIN (EMA+K-ens) ll=0.6826  br=0.2441  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=2.81e-01, acc. prob=0.934]


[outer 014] TRAIN (EMA+K-ens) ll=0.6856  br=0.2455  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 21.54it/s, step size=2.67e-01, acc. prob=0.933]


[outer 015] TRAIN (EMA+K-ens) ll=0.6849  br=0.2452  acc=0.6780


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=2.72e-01, acc. prob=0.931]


[outer 016] TRAIN (EMA+K-ens) ll=0.6873  br=0.2463  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 19.04it/s, step size=3.21e-01, acc. prob=0.936]


[outer 017] TRAIN (EMA+K-ens) ll=0.6911  br=0.2481  acc=0.6570


Sample: 100%|██████████| 330/330 [00:17, 18.95it/s, step size=2.72e-01, acc. prob=0.958]


[outer 018] TRAIN (EMA+K-ens) ll=0.6884  br=0.2469  acc=0.6530


Sample: 100%|██████████| 330/330 [00:17, 19.02it/s, step size=3.16e-01, acc. prob=0.935]


[outer 019] TRAIN (EMA+K-ens) ll=0.6908  br=0.2480  acc=0.6190


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=2.72e-01, acc. prob=0.970]


[outer 020] TRAIN (EMA+K-ens) ll=0.6877  br=0.2466  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 19.24it/s, step size=2.75e-01, acc. prob=0.960]


[outer 021] TRAIN (EMA+K-ens) ll=0.6878  br=0.2467  acc=0.6730


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=2.35e-01, acc. prob=0.968]


[outer 022] TRAIN (EMA+K-ens) ll=0.6824  br=0.2440  acc=0.6440


Sample: 100%|██████████| 330/330 [00:15, 20.89it/s, step size=3.30e-01, acc. prob=0.927]


[outer 023] TRAIN (EMA+K-ens) ll=0.6765  br=0.2411  acc=0.6410


Sample: 100%|██████████| 330/330 [00:15, 20.63it/s, step size=3.22e-01, acc. prob=0.932]


[outer 024] TRAIN (EMA+K-ens) ll=0.6746  br=0.2401  acc=0.6570


Sample: 100%|██████████| 330/330 [00:16, 19.94it/s, step size=2.74e-01, acc. prob=0.941]


[outer 025] TRAIN (EMA+K-ens) ll=0.6696  br=0.2375  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.84it/s, step size=3.06e-01, acc. prob=0.941]


[outer 026] TRAIN (EMA+K-ens) ll=0.6719  br=0.2385  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=2.70e-01, acc. prob=0.944]


[outer 027] TRAIN (EMA+K-ens) ll=0.6644  br=0.2350  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 18.64it/s, step size=2.60e-01, acc. prob=0.941]


[outer 028] TRAIN (EMA+K-ens) ll=0.6594  br=0.2326  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.47it/s, step size=3.03e-01, acc. prob=0.944]


[outer 029] TRAIN (EMA+K-ens) ll=0.6588  br=0.2322  acc=0.6870


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=2.84e-01, acc. prob=0.947]


[outer 030] TRAIN (EMA+K-ens) ll=0.6604  br=0.2332  acc=0.6840


Sample: 100%|██████████| 330/330 [00:18, 18.28it/s, step size=2.84e-01, acc. prob=0.949]


[outer 031] TRAIN (EMA+K-ens) ll=0.6668  br=0.2363  acc=0.6720


Sample: 100%|██████████| 330/330 [00:15, 21.08it/s, step size=3.38e-01, acc. prob=0.926]


[outer 032] TRAIN (EMA+K-ens) ll=0.6687  br=0.2372  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.40it/s, step size=3.34e-01, acc. prob=0.907]


[outer 033] TRAIN (EMA+K-ens) ll=0.6721  br=0.2389  acc=0.6570


Sample: 100%|██████████| 330/330 [00:16, 20.52it/s, step size=3.00e-01, acc. prob=0.947]


[outer 034] TRAIN (EMA+K-ens) ll=0.6673  br=0.2366  acc=0.6790


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=2.71e-01, acc. prob=0.950]


[outer 035] TRAIN (EMA+K-ens) ll=0.6727  br=0.2393  acc=0.6980


Sample: 100%|██████████| 330/330 [00:16, 20.12it/s, step size=3.09e-01, acc. prob=0.921]


[outer 036] TRAIN (EMA+K-ens) ll=0.6782  br=0.2418  acc=0.6770


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=2.75e-01, acc. prob=0.952]


[outer 037] TRAIN (EMA+K-ens) ll=0.6763  br=0.2408  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 18.93it/s, step size=3.01e-01, acc. prob=0.928]


[outer 038] TRAIN (EMA+K-ens) ll=0.6788  br=0.2420  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=3.55e-01, acc. prob=0.879]


[outer 039] TRAIN (EMA+K-ens) ll=0.6778  br=0.2415  acc=0.6950


Sample: 100%|██████████| 330/330 [00:16, 20.45it/s, step size=3.25e-01, acc. prob=0.936]


[outer 000] TRAIN (EMA+K-ens) ll=0.6896  br=0.2479  acc=0.6140


Sample: 100%|██████████| 330/330 [00:18, 18.20it/s, step size=3.01e-01, acc. prob=0.914]


[outer 001] TRAIN (EMA+K-ens) ll=0.6684  br=0.2376  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 18.40it/s, step size=3.00e-01, acc. prob=0.952]


[outer 002] TRAIN (EMA+K-ens) ll=0.6755  br=0.2410  acc=0.6250


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.39e-01, acc. prob=0.969]


[outer 003] TRAIN (EMA+K-ens) ll=0.6662  br=0.2365  acc=0.6750


Sample: 100%|██████████| 330/330 [00:14, 22.92it/s, step size=2.91e-01, acc. prob=0.932]


[outer 004] TRAIN (EMA+K-ens) ll=0.6700  br=0.2384  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 20.57it/s, step size=3.20e-01, acc. prob=0.894]


[outer 005] TRAIN (EMA+K-ens) ll=0.6719  br=0.2393  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=3.29e-01, acc. prob=0.942]


[outer 006] TRAIN (EMA+K-ens) ll=0.6726  br=0.2396  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.76it/s, step size=3.14e-01, acc. prob=0.911]


[outer 007] TRAIN (EMA+K-ens) ll=0.6716  br=0.2392  acc=0.6760


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=3.00e-01, acc. prob=0.945]


[outer 008] TRAIN (EMA+K-ens) ll=0.6709  br=0.2388  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 18.65it/s, step size=3.66e-01, acc. prob=0.917]


[outer 009] TRAIN (EMA+K-ens) ll=0.6741  br=0.2403  acc=0.6610


Sample: 100%|██████████| 330/330 [00:17, 19.34it/s, step size=2.92e-01, acc. prob=0.949]


[outer 010] TRAIN (EMA+K-ens) ll=0.6705  br=0.2386  acc=0.6400


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=3.12e-01, acc. prob=0.921]


[outer 011] TRAIN (EMA+K-ens) ll=0.6775  br=0.2420  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=3.33e-01, acc. prob=0.904]


[outer 012] TRAIN (EMA+K-ens) ll=0.6720  br=0.2393  acc=0.6660


Sample: 100%|██████████| 330/330 [00:17, 18.54it/s, step size=3.03e-01, acc. prob=0.937]


[outer 013] TRAIN (EMA+K-ens) ll=0.6726  br=0.2396  acc=0.6820


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=2.98e-01, acc. prob=0.937]


[outer 014] TRAIN (EMA+K-ens) ll=0.6708  br=0.2387  acc=0.6780


Sample: 100%|██████████| 330/330 [00:19, 17.17it/s, step size=2.83e-01, acc. prob=0.949]


[outer 015] TRAIN (EMA+K-ens) ll=0.6700  br=0.2383  acc=0.6680


Sample: 100%|██████████| 330/330 [00:17, 19.17it/s, step size=3.04e-01, acc. prob=0.938]


[outer 016] TRAIN (EMA+K-ens) ll=0.6717  br=0.2391  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 18.95it/s, step size=3.05e-01, acc. prob=0.942]


[outer 017] TRAIN (EMA+K-ens) ll=0.6752  br=0.2408  acc=0.6660


Sample: 100%|██████████| 330/330 [00:17, 19.30it/s, step size=2.84e-01, acc. prob=0.943]


[outer 018] TRAIN (EMA+K-ens) ll=0.6791  br=0.2427  acc=0.6540


Sample: 100%|██████████| 330/330 [00:18, 18.03it/s, step size=2.70e-01, acc. prob=0.947]


[outer 019] TRAIN (EMA+K-ens) ll=0.6764  br=0.2414  acc=0.6590


Sample: 100%|██████████| 330/330 [00:17, 19.29it/s, step size=3.06e-01, acc. prob=0.926]


[outer 020] TRAIN (EMA+K-ens) ll=0.6789  br=0.2426  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.90e-01, acc. prob=0.941]


[outer 021] TRAIN (EMA+K-ens) ll=0.6777  br=0.2419  acc=0.6550


Sample: 100%|██████████| 330/330 [00:16, 19.81it/s, step size=3.25e-01, acc. prob=0.917]


[outer 022] TRAIN (EMA+K-ens) ll=0.6789  br=0.2425  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.36it/s, step size=2.20e-01, acc. prob=0.962]


[outer 023] TRAIN (EMA+K-ens) ll=0.6804  br=0.2433  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 19.38it/s, step size=3.08e-01, acc. prob=0.926]


[outer 024] TRAIN (EMA+K-ens) ll=0.6796  br=0.2430  acc=0.6520


Sample: 100%|██████████| 330/330 [00:18, 18.01it/s, step size=3.17e-01, acc. prob=0.913]


[outer 025] TRAIN (EMA+K-ens) ll=0.6792  br=0.2427  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 19.60it/s, step size=3.46e-01, acc. prob=0.906]


[outer 026] TRAIN (EMA+K-ens) ll=0.6791  br=0.2428  acc=0.6340


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=2.88e-01, acc. prob=0.927]


[outer 027] TRAIN (EMA+K-ens) ll=0.6753  br=0.2408  acc=0.6380


Sample: 100%|██████████| 330/330 [00:18, 18.27it/s, step size=2.73e-01, acc. prob=0.948]


[outer 028] TRAIN (EMA+K-ens) ll=0.6756  br=0.2409  acc=0.6560


Sample: 100%|██████████| 330/330 [00:18, 18.33it/s, step size=2.44e-01, acc. prob=0.954]


[outer 029] TRAIN (EMA+K-ens) ll=0.6761  br=0.2412  acc=0.6770


Sample: 100%|██████████| 330/330 [00:18, 18.25it/s, step size=2.68e-01, acc. prob=0.947]


[outer 030] TRAIN (EMA+K-ens) ll=0.6765  br=0.2414  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.55it/s, step size=2.98e-01, acc. prob=0.945]


[outer 031] TRAIN (EMA+K-ens) ll=0.6770  br=0.2415  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 19.14it/s, step size=2.67e-01, acc. prob=0.946]


[outer 032] TRAIN (EMA+K-ens) ll=0.6791  br=0.2425  acc=0.6560
[Early stop @ outer 32] Δll=0.261%, Δbr=0.367%, Δacc=0.002


Sample: 100%|██████████| 330/330 [00:17, 19.18it/s, step size=2.85e-01, acc. prob=0.928]


[outer 000] TRAIN (EMA+K-ens) ll=0.6891  br=0.2480  acc=0.5170


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=3.29e-01, acc. prob=0.906]


[outer 001] TRAIN (EMA+K-ens) ll=0.6844  br=0.2453  acc=0.6280


Sample: 100%|██████████| 330/330 [00:17, 18.81it/s, step size=3.25e-01, acc. prob=0.913]


[outer 002] TRAIN (EMA+K-ens) ll=0.6828  br=0.2446  acc=0.6380


Sample: 100%|██████████| 330/330 [00:17, 18.67it/s, step size=2.47e-01, acc. prob=0.946]


[outer 003] TRAIN (EMA+K-ens) ll=0.6921  br=0.2491  acc=0.6230


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.41e-01, acc. prob=0.953]


[outer 004] TRAIN (EMA+K-ens) ll=0.6823  br=0.2443  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 18.51it/s, step size=2.95e-01, acc. prob=0.945]


[outer 005] TRAIN (EMA+K-ens) ll=0.6816  br=0.2440  acc=0.6510


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=2.79e-01, acc. prob=0.941]


[outer 006] TRAIN (EMA+K-ens) ll=0.6792  br=0.2428  acc=0.6350


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=2.44e-01, acc. prob=0.964]


[outer 007] TRAIN (EMA+K-ens) ll=0.6790  br=0.2427  acc=0.6370


Sample: 100%|██████████| 330/330 [00:16, 20.42it/s, step size=3.00e-01, acc. prob=0.910]


[outer 008] TRAIN (EMA+K-ens) ll=0.6837  br=0.2450  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.69e-01, acc. prob=0.949]


[outer 009] TRAIN (EMA+K-ens) ll=0.6801  br=0.2433  acc=0.6620


Sample: 100%|██████████| 330/330 [00:17, 18.96it/s, step size=2.64e-01, acc. prob=0.947]


[outer 010] TRAIN (EMA+K-ens) ll=0.6830  br=0.2447  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 20.31it/s, step size=2.96e-01, acc. prob=0.948]


[outer 011] TRAIN (EMA+K-ens) ll=0.6796  br=0.2430  acc=0.6710


Sample: 100%|██████████| 330/330 [00:15, 21.31it/s, step size=3.40e-01, acc. prob=0.917]


[outer 012] TRAIN (EMA+K-ens) ll=0.6849  br=0.2456  acc=0.6350


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=2.87e-01, acc. prob=0.936]


[outer 013] TRAIN (EMA+K-ens) ll=0.6806  br=0.2434  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=3.04e-01, acc. prob=0.943]


[outer 014] TRAIN (EMA+K-ens) ll=0.6852  br=0.2457  acc=0.6250


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=2.55e-01, acc. prob=0.949]


[outer 015] TRAIN (EMA+K-ens) ll=0.6801  br=0.2432  acc=0.6370


Sample: 100%|██████████| 330/330 [00:16, 19.59it/s, step size=2.94e-01, acc. prob=0.925]


[outer 016] TRAIN (EMA+K-ens) ll=0.6749  br=0.2407  acc=0.6390


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.66e-01, acc. prob=0.941]


[outer 017] TRAIN (EMA+K-ens) ll=0.6784  br=0.2424  acc=0.6470


Sample: 100%|██████████| 330/330 [00:18, 17.85it/s, step size=2.59e-01, acc. prob=0.947]


[outer 018] TRAIN (EMA+K-ens) ll=0.6729  br=0.2397  acc=0.6620


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=3.06e-01, acc. prob=0.931]


[outer 019] TRAIN (EMA+K-ens) ll=0.6659  br=0.2363  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 20.25it/s, step size=3.22e-01, acc. prob=0.949]


[outer 020] TRAIN (EMA+K-ens) ll=0.6685  br=0.2376  acc=0.6720


Sample: 100%|██████████| 330/330 [00:18, 18.25it/s, step size=2.92e-01, acc. prob=0.944]


[outer 021] TRAIN (EMA+K-ens) ll=0.6750  br=0.2407  acc=0.6500


Sample: 100%|██████████| 330/330 [00:16, 19.84it/s, step size=2.85e-01, acc. prob=0.919]


[outer 022] TRAIN (EMA+K-ens) ll=0.6749  br=0.2406  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 18.50it/s, step size=2.45e-01, acc. prob=0.960]


[outer 023] TRAIN (EMA+K-ens) ll=0.6767  br=0.2415  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.43e-01, acc. prob=0.958]


[outer 024] TRAIN (EMA+K-ens) ll=0.6717  br=0.2390  acc=0.6530


Sample: 100%|██████████| 330/330 [00:16, 19.44it/s, step size=3.35e-01, acc. prob=0.939]


[outer 025] TRAIN (EMA+K-ens) ll=0.6712  br=0.2388  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.69it/s, step size=2.87e-01, acc. prob=0.936]


[outer 026] TRAIN (EMA+K-ens) ll=0.6711  br=0.2388  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 19.77it/s, step size=2.74e-01, acc. prob=0.949]


[outer 027] TRAIN (EMA+K-ens) ll=0.6771  br=0.2415  acc=0.6420
[Early stop @ outer 27] Δll=0.176%, Δbr=0.273%, Δacc=0.002


Sample: 100%|██████████| 330/330 [00:18, 17.90it/s, step size=2.66e-01, acc. prob=0.947]


[outer 000] TRAIN (EMA+K-ens) ll=0.6599  br=0.2332  acc=0.6130


Sample: 100%|██████████| 330/330 [00:15, 20.85it/s, step size=2.56e-01, acc. prob=0.947]


[outer 001] TRAIN (EMA+K-ens) ll=0.6610  br=0.2334  acc=0.6150


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=3.08e-01, acc. prob=0.948]


[outer 002] TRAIN (EMA+K-ens) ll=0.6583  br=0.2321  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 18.93it/s, step size=3.02e-01, acc. prob=0.915]


[outer 003] TRAIN (EMA+K-ens) ll=0.6618  br=0.2342  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=2.80e-01, acc. prob=0.952]


[outer 004] TRAIN (EMA+K-ens) ll=0.6654  br=0.2359  acc=0.6540


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=3.34e-01, acc. prob=0.933]


[outer 005] TRAIN (EMA+K-ens) ll=0.6630  br=0.2346  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 18.65it/s, step size=2.96e-01, acc. prob=0.945]


[outer 006] TRAIN (EMA+K-ens) ll=0.6658  br=0.2360  acc=0.6830


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=3.39e-01, acc. prob=0.933]


[outer 007] TRAIN (EMA+K-ens) ll=0.6648  br=0.2354  acc=0.6930


Sample: 100%|██████████| 330/330 [00:17, 18.88it/s, step size=2.22e-01, acc. prob=0.959]


[outer 008] TRAIN (EMA+K-ens) ll=0.6659  br=0.2360  acc=0.6970


Sample: 100%|██████████| 330/330 [00:17, 18.52it/s, step size=2.38e-01, acc. prob=0.948]


[outer 009] TRAIN (EMA+K-ens) ll=0.6645  br=0.2354  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=3.39e-01, acc. prob=0.882]


[outer 010] TRAIN (EMA+K-ens) ll=0.6662  br=0.2362  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 19.30it/s, step size=2.44e-01, acc. prob=0.953]


[outer 011] TRAIN (EMA+K-ens) ll=0.6638  br=0.2349  acc=0.6940


Sample: 100%|██████████| 330/330 [00:17, 18.50it/s, step size=3.75e-01, acc. prob=0.916]


[outer 012] TRAIN (EMA+K-ens) ll=0.6603  br=0.2332  acc=0.6980


Sample: 100%|██████████| 330/330 [00:17, 19.29it/s, step size=2.70e-01, acc. prob=0.941]


[outer 013] TRAIN (EMA+K-ens) ll=0.6603  br=0.2331  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.83it/s, step size=3.03e-01, acc. prob=0.933]


[outer 014] TRAIN (EMA+K-ens) ll=0.6525  br=0.2294  acc=0.7030


Sample: 100%|██████████| 330/330 [00:16, 19.86it/s, step size=2.82e-01, acc. prob=0.958]


[outer 015] TRAIN (EMA+K-ens) ll=0.6517  br=0.2292  acc=0.6980


Sample: 100%|██████████| 330/330 [00:18, 18.21it/s, step size=2.99e-01, acc. prob=0.917]


[outer 016] TRAIN (EMA+K-ens) ll=0.6508  br=0.2287  acc=0.7040


Sample: 100%|██████████| 330/330 [00:16, 19.45it/s, step size=3.12e-01, acc. prob=0.914]


[outer 017] TRAIN (EMA+K-ens) ll=0.6592  br=0.2327  acc=0.7050


Sample: 100%|██████████| 330/330 [00:17, 18.46it/s, step size=2.66e-01, acc. prob=0.939]


[outer 018] TRAIN (EMA+K-ens) ll=0.6596  br=0.2329  acc=0.7020


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=2.72e-01, acc. prob=0.956]


[outer 019] TRAIN (EMA+K-ens) ll=0.6601  br=0.2331  acc=0.7050


Sample: 100%|██████████| 330/330 [00:16, 20.14it/s, step size=3.04e-01, acc. prob=0.917]


[outer 020] TRAIN (EMA+K-ens) ll=0.6599  br=0.2331  acc=0.7000


Sample: 100%|██████████| 330/330 [00:17, 18.57it/s, step size=2.66e-01, acc. prob=0.966]


[outer 021] TRAIN (EMA+K-ens) ll=0.6678  br=0.2370  acc=0.6650


Sample: 100%|██████████| 330/330 [00:15, 20.71it/s, step size=3.04e-01, acc. prob=0.947]


[outer 022] TRAIN (EMA+K-ens) ll=0.6753  br=0.2405  acc=0.6430


Sample: 100%|██████████| 330/330 [00:17, 18.66it/s, step size=2.47e-01, acc. prob=0.963]


[outer 023] TRAIN (EMA+K-ens) ll=0.6774  br=0.2415  acc=0.6390


Sample: 100%|██████████| 330/330 [00:16, 20.59it/s, step size=3.39e-01, acc. prob=0.923]


[outer 024] TRAIN (EMA+K-ens) ll=0.6805  br=0.2431  acc=0.6440


Sample: 100%|██████████| 330/330 [00:16, 20.46it/s, step size=2.97e-01, acc. prob=0.937]


[outer 025] TRAIN (EMA+K-ens) ll=0.6759  br=0.2407  acc=0.6690


Sample: 100%|██████████| 330/330 [00:18, 18.14it/s, step size=2.72e-01, acc. prob=0.958]


[outer 026] TRAIN (EMA+K-ens) ll=0.6751  br=0.2402  acc=0.6700


Sample: 100%|██████████| 330/330 [00:14, 22.84it/s, step size=3.47e-01, acc. prob=0.929]


[outer 027] TRAIN (EMA+K-ens) ll=0.6799  br=0.2426  acc=0.6720


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=2.89e-01, acc. prob=0.928]


[outer 028] TRAIN (EMA+K-ens) ll=0.6821  br=0.2436  acc=0.6730


Sample: 100%|██████████| 330/330 [00:15, 20.81it/s, step size=2.73e-01, acc. prob=0.941]


[outer 029] TRAIN (EMA+K-ens) ll=0.6784  br=0.2420  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.27it/s, step size=3.71e-01, acc. prob=0.929]


[outer 030] TRAIN (EMA+K-ens) ll=0.6769  br=0.2413  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.14it/s, step size=3.18e-01, acc. prob=0.949]


[outer 031] TRAIN (EMA+K-ens) ll=0.6774  br=0.2417  acc=0.6760


Sample: 100%|██████████| 330/330 [00:17, 18.92it/s, step size=2.86e-01, acc. prob=0.956]


[outer 032] TRAIN (EMA+K-ens) ll=0.6754  br=0.2407  acc=0.6590


Sample: 100%|██████████| 330/330 [00:18, 17.75it/s, step size=2.71e-01, acc. prob=0.935]


[outer 033] TRAIN (EMA+K-ens) ll=0.6734  br=0.2397  acc=0.6500


Sample: 100%|██████████| 330/330 [00:18, 18.16it/s, step size=2.62e-01, acc. prob=0.947]


[outer 034] TRAIN (EMA+K-ens) ll=0.6710  br=0.2386  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 20.42it/s, step size=3.35e-01, acc. prob=0.935]


[outer 035] TRAIN (EMA+K-ens) ll=0.6678  br=0.2370  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=2.91e-01, acc. prob=0.937]


[outer 036] TRAIN (EMA+K-ens) ll=0.6681  br=0.2371  acc=0.6610


Sample: 100%|██████████| 330/330 [00:15, 21.54it/s, step size=3.21e-01, acc. prob=0.917]


[outer 037] TRAIN (EMA+K-ens) ll=0.6596  br=0.2331  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 18.70it/s, step size=2.58e-01, acc. prob=0.957]


[outer 038] TRAIN (EMA+K-ens) ll=0.6598  br=0.2331  acc=0.6700


Sample: 100%|██████████| 330/330 [00:16, 19.84it/s, step size=2.95e-01, acc. prob=0.943]


[outer 039] TRAIN (EMA+K-ens) ll=0.6550  br=0.2309  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.50it/s, step size=3.04e-01, acc. prob=0.958]


[outer 000] TRAIN (EMA+K-ens) ll=0.7163  br=0.2602  acc=0.5590


Sample: 100%|██████████| 330/330 [00:16, 19.92it/s, step size=2.86e-01, acc. prob=0.948]


[outer 001] TRAIN (EMA+K-ens) ll=0.6987  br=0.2520  acc=0.6010


Sample: 100%|██████████| 330/330 [00:18, 17.86it/s, step size=2.85e-01, acc. prob=0.953]


[outer 002] TRAIN (EMA+K-ens) ll=0.6951  br=0.2503  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 19.64it/s, step size=3.28e-01, acc. prob=0.918]


[outer 003] TRAIN (EMA+K-ens) ll=0.6845  br=0.2452  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 19.76it/s, step size=3.53e-01, acc. prob=0.930]


[outer 004] TRAIN (EMA+K-ens) ll=0.6776  br=0.2418  acc=0.6990


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=3.36e-01, acc. prob=0.937]


[outer 005] TRAIN (EMA+K-ens) ll=0.6764  br=0.2413  acc=0.6970


Sample: 100%|██████████| 330/330 [00:16, 20.14it/s, step size=3.21e-01, acc. prob=0.933]


[outer 006] TRAIN (EMA+K-ens) ll=0.6740  br=0.2401  acc=0.7020


Sample: 100%|██████████| 330/330 [00:18, 18.21it/s, step size=3.61e-01, acc. prob=0.924]


[outer 007] TRAIN (EMA+K-ens) ll=0.6763  br=0.2412  acc=0.7110


Sample: 100%|██████████| 330/330 [00:16, 19.91it/s, step size=3.41e-01, acc. prob=0.942]


[outer 008] TRAIN (EMA+K-ens) ll=0.6724  br=0.2393  acc=0.7030


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=2.51e-01, acc. prob=0.959]


[outer 009] TRAIN (EMA+K-ens) ll=0.6723  br=0.2392  acc=0.6930


Sample: 100%|██████████| 330/330 [00:16, 19.67it/s, step size=3.39e-01, acc. prob=0.925]


[outer 010] TRAIN (EMA+K-ens) ll=0.6727  br=0.2394  acc=0.6910


Sample: 100%|██████████| 330/330 [00:17, 18.39it/s, step size=3.10e-01, acc. prob=0.934]


[outer 011] TRAIN (EMA+K-ens) ll=0.6768  br=0.2414  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 19.59it/s, step size=2.89e-01, acc. prob=0.946]


[outer 012] TRAIN (EMA+K-ens) ll=0.6790  br=0.2423  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 18.82it/s, step size=2.83e-01, acc. prob=0.916]


[outer 013] TRAIN (EMA+K-ens) ll=0.6796  br=0.2426  acc=0.6640


Sample: 100%|██████████| 330/330 [00:17, 19.27it/s, step size=3.02e-01, acc. prob=0.942]


[outer 014] TRAIN (EMA+K-ens) ll=0.6805  br=0.2432  acc=0.6720


Sample: 100%|██████████| 330/330 [00:16, 19.76it/s, step size=3.30e-01, acc. prob=0.923]


[outer 015] TRAIN (EMA+K-ens) ll=0.6792  br=0.2425  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 19.55it/s, step size=3.12e-01, acc. prob=0.948]


[outer 016] TRAIN (EMA+K-ens) ll=0.6814  br=0.2434  acc=0.6510


Sample: 100%|██████████| 330/330 [00:16, 19.63it/s, step size=3.25e-01, acc. prob=0.918]


[outer 017] TRAIN (EMA+K-ens) ll=0.6822  br=0.2437  acc=0.6740


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.71e-01, acc. prob=0.948]


[outer 018] TRAIN (EMA+K-ens) ll=0.6775  br=0.2414  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.91it/s, step size=3.20e-01, acc. prob=0.928]


[outer 019] TRAIN (EMA+K-ens) ll=0.6761  br=0.2408  acc=0.6860


Sample: 100%|██████████| 330/330 [00:16, 20.12it/s, step size=2.68e-01, acc. prob=0.960]


[outer 020] TRAIN (EMA+K-ens) ll=0.6735  br=0.2396  acc=0.6840


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=2.94e-01, acc. prob=0.954]


[outer 021] TRAIN (EMA+K-ens) ll=0.6728  br=0.2393  acc=0.6960


Sample: 100%|██████████| 330/330 [00:15, 21.31it/s, step size=3.10e-01, acc. prob=0.940]


[outer 022] TRAIN (EMA+K-ens) ll=0.6698  br=0.2378  acc=0.7090


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=3.02e-01, acc. prob=0.928]


[outer 023] TRAIN (EMA+K-ens) ll=0.6654  br=0.2357  acc=0.7160


Sample: 100%|██████████| 330/330 [00:16, 19.58it/s, step size=2.83e-01, acc. prob=0.954]


[outer 024] TRAIN (EMA+K-ens) ll=0.6673  br=0.2366  acc=0.7090


Sample: 100%|██████████| 330/330 [00:17, 19.21it/s, step size=2.94e-01, acc. prob=0.933]


[outer 025] TRAIN (EMA+K-ens) ll=0.6660  br=0.2361  acc=0.7080


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=2.46e-01, acc. prob=0.954]


[outer 026] TRAIN (EMA+K-ens) ll=0.6671  br=0.2366  acc=0.7100


Sample: 100%|██████████| 330/330 [00:15, 21.90it/s, step size=2.63e-01, acc. prob=0.950]


[outer 027] TRAIN (EMA+K-ens) ll=0.6646  br=0.2353  acc=0.6980


Sample: 100%|██████████| 330/330 [00:17, 18.38it/s, step size=3.05e-01, acc. prob=0.949]


[outer 028] TRAIN (EMA+K-ens) ll=0.6667  br=0.2360  acc=0.6880


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=2.68e-01, acc. prob=0.963]


[outer 029] TRAIN (EMA+K-ens) ll=0.6723  br=0.2388  acc=0.6710


Sample: 100%|██████████| 330/330 [00:17, 18.84it/s, step size=3.12e-01, acc. prob=0.923]


[outer 030] TRAIN (EMA+K-ens) ll=0.6754  br=0.2403  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.57it/s, step size=3.30e-01, acc. prob=0.930]


[outer 031] TRAIN (EMA+K-ens) ll=0.6754  br=0.2404  acc=0.6620


Sample: 100%|██████████| 330/330 [00:16, 19.52it/s, step size=3.61e-01, acc. prob=0.911]


[outer 032] TRAIN (EMA+K-ens) ll=0.6745  br=0.2398  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.21e-01, acc. prob=0.964]


[outer 033] TRAIN (EMA+K-ens) ll=0.6737  br=0.2393  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.65it/s, step size=3.26e-01, acc. prob=0.895]


[outer 034] TRAIN (EMA+K-ens) ll=0.6747  br=0.2399  acc=0.6560


Sample: 100%|██████████| 330/330 [00:16, 19.75it/s, step size=2.56e-01, acc. prob=0.972]


[outer 035] TRAIN (EMA+K-ens) ll=0.6751  br=0.2402  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 20.25it/s, step size=2.52e-01, acc. prob=0.959]


[outer 036] TRAIN (EMA+K-ens) ll=0.6788  br=0.2420  acc=0.6420


Sample: 100%|██████████| 330/330 [00:17, 18.89it/s, step size=3.17e-01, acc. prob=0.918]


[outer 037] TRAIN (EMA+K-ens) ll=0.6770  br=0.2409  acc=0.6430


Sample: 100%|██████████| 330/330 [00:15, 20.71it/s, step size=3.68e-01, acc. prob=0.902]


[outer 038] TRAIN (EMA+K-ens) ll=0.6782  br=0.2414  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=3.11e-01, acc. prob=0.923]


[outer 039] TRAIN (EMA+K-ens) ll=0.6821  br=0.2433  acc=0.6480


Sample: 100%|██████████| 330/330 [00:16, 19.93it/s, step size=3.11e-01, acc. prob=0.916]


[outer 000] TRAIN (EMA+K-ens) ll=0.7118  br=0.2589  acc=0.5080


Sample: 100%|██████████| 330/330 [00:18, 17.44it/s, step size=2.58e-01, acc. prob=0.964]


[outer 001] TRAIN (EMA+K-ens) ll=0.7264  br=0.2657  acc=0.5140


Sample: 100%|██████████| 330/330 [00:18, 17.95it/s, step size=2.24e-01, acc. prob=0.959]


[outer 002] TRAIN (EMA+K-ens) ll=0.7324  br=0.2688  acc=0.4890


Sample: 100%|██████████| 330/330 [00:16, 19.50it/s, step size=3.03e-01, acc. prob=0.939]


[outer 003] TRAIN (EMA+K-ens) ll=0.7249  br=0.2653  acc=0.4730


Sample: 100%|██████████| 330/330 [00:18, 18.26it/s, step size=3.07e-01, acc. prob=0.929]


[outer 004] TRAIN (EMA+K-ens) ll=0.7224  br=0.2642  acc=0.4890


Sample: 100%|██████████| 330/330 [00:16, 20.39it/s, step size=3.07e-01, acc. prob=0.925]


[outer 005] TRAIN (EMA+K-ens) ll=0.7171  br=0.2615  acc=0.4930


Sample: 100%|██████████| 330/330 [00:18, 18.15it/s, step size=2.51e-01, acc. prob=0.940]


[outer 006] TRAIN (EMA+K-ens) ll=0.7134  br=0.2597  acc=0.5360


Sample: 100%|██████████| 330/330 [00:16, 19.98it/s, step size=2.74e-01, acc. prob=0.953]


[outer 007] TRAIN (EMA+K-ens) ll=0.7085  br=0.2573  acc=0.5390


Sample: 100%|██████████| 330/330 [00:15, 21.00it/s, step size=3.29e-01, acc. prob=0.926]


[outer 008] TRAIN (EMA+K-ens) ll=0.7049  br=0.2555  acc=0.5530


Sample: 100%|██████████| 330/330 [00:17, 19.39it/s, step size=3.34e-01, acc. prob=0.926]


[outer 009] TRAIN (EMA+K-ens) ll=0.6973  br=0.2518  acc=0.5680


Sample: 100%|██████████| 330/330 [00:16, 20.45it/s, step size=2.88e-01, acc. prob=0.941]


[outer 010] TRAIN (EMA+K-ens) ll=0.6900  br=0.2482  acc=0.5860


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.92e-01, acc. prob=0.936]


[outer 011] TRAIN (EMA+K-ens) ll=0.6852  br=0.2459  acc=0.6020


Sample: 100%|██████████| 330/330 [00:17, 19.05it/s, step size=2.80e-01, acc. prob=0.952]


[outer 012] TRAIN (EMA+K-ens) ll=0.6832  br=0.2448  acc=0.6150


Sample: 100%|██████████| 330/330 [00:17, 18.71it/s, step size=2.74e-01, acc. prob=0.949]


[outer 013] TRAIN (EMA+K-ens) ll=0.6791  br=0.2428  acc=0.6050


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=3.12e-01, acc. prob=0.905]


[outer 014] TRAIN (EMA+K-ens) ll=0.6811  br=0.2438  acc=0.6070


Sample: 100%|██████████| 330/330 [00:16, 19.98it/s, step size=2.87e-01, acc. prob=0.943]


[outer 015] TRAIN (EMA+K-ens) ll=0.6815  br=0.2439  acc=0.6310


Sample: 100%|██████████| 330/330 [00:18, 18.19it/s, step size=2.73e-01, acc. prob=0.958]


[outer 016] TRAIN (EMA+K-ens) ll=0.6827  br=0.2445  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 18.48it/s, step size=2.84e-01, acc. prob=0.935]


[outer 017] TRAIN (EMA+K-ens) ll=0.6880  br=0.2471  acc=0.6270


Sample: 100%|██████████| 330/330 [00:17, 18.89it/s, step size=2.59e-01, acc. prob=0.952]


[outer 018] TRAIN (EMA+K-ens) ll=0.6936  br=0.2497  acc=0.5920


Sample: 100%|██████████| 330/330 [00:16, 20.23it/s, step size=3.47e-01, acc. prob=0.920]


[outer 019] TRAIN (EMA+K-ens) ll=0.6936  br=0.2498  acc=0.5840


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=3.13e-01, acc. prob=0.942]


[outer 020] TRAIN (EMA+K-ens) ll=0.6926  br=0.2494  acc=0.6100


Sample: 100%|██████████| 330/330 [00:18, 18.14it/s, step size=3.25e-01, acc. prob=0.885]


[outer 021] TRAIN (EMA+K-ens) ll=0.6973  br=0.2517  acc=0.6030


Sample: 100%|██████████| 330/330 [00:19, 17.09it/s, step size=2.98e-01, acc. prob=0.936]


[outer 022] TRAIN (EMA+K-ens) ll=0.6956  br=0.2509  acc=0.6080


Sample: 100%|██████████| 330/330 [00:18, 17.66it/s, step size=3.22e-01, acc. prob=0.916]


[outer 023] TRAIN (EMA+K-ens) ll=0.6961  br=0.2512  acc=0.6200


Sample: 100%|██████████| 330/330 [00:18, 17.80it/s, step size=2.54e-01, acc. prob=0.955]


[outer 024] TRAIN (EMA+K-ens) ll=0.6971  br=0.2517  acc=0.6270


Sample: 100%|██████████| 330/330 [00:18, 17.38it/s, step size=2.61e-01, acc. prob=0.956]


[outer 025] TRAIN (EMA+K-ens) ll=0.6950  br=0.2506  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 18.52it/s, step size=3.17e-01, acc. prob=0.915]


[outer 026] TRAIN (EMA+K-ens) ll=0.6909  br=0.2486  acc=0.6660


Sample: 100%|██████████| 330/330 [00:16, 19.83it/s, step size=2.98e-01, acc. prob=0.911]


[outer 027] TRAIN (EMA+K-ens) ll=0.6911  br=0.2487  acc=0.6610


Sample: 100%|██████████| 330/330 [00:18, 18.14it/s, step size=2.42e-01, acc. prob=0.951]


[outer 028] TRAIN (EMA+K-ens) ll=0.6910  br=0.2486  acc=0.6590


Sample: 100%|██████████| 330/330 [00:16, 19.63it/s, step size=3.33e-01, acc. prob=0.900]


[outer 029] TRAIN (EMA+K-ens) ll=0.6891  br=0.2477  acc=0.6690


Sample: 100%|██████████| 330/330 [00:18, 17.92it/s, step size=3.33e-01, acc. prob=0.888]


[outer 030] TRAIN (EMA+K-ens) ll=0.6882  br=0.2472  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 18.72it/s, step size=2.39e-01, acc. prob=0.971]


[outer 031] TRAIN (EMA+K-ens) ll=0.6911  br=0.2486  acc=0.6360


Sample: 100%|██████████| 330/330 [00:18, 17.82it/s, step size=2.60e-01, acc. prob=0.957]


[outer 032] TRAIN (EMA+K-ens) ll=0.6869  br=0.2466  acc=0.6530


Sample: 100%|██████████| 330/330 [00:18, 17.89it/s, step size=2.81e-01, acc. prob=0.903]


[outer 033] TRAIN (EMA+K-ens) ll=0.6826  br=0.2446  acc=0.6580


Sample: 100%|██████████| 330/330 [00:17, 18.65it/s, step size=3.01e-01, acc. prob=0.929]


[outer 034] TRAIN (EMA+K-ens) ll=0.6842  br=0.2453  acc=0.6520


Sample: 100%|██████████| 330/330 [00:17, 18.97it/s, step size=2.22e-01, acc. prob=0.969]


[outer 035] TRAIN (EMA+K-ens) ll=0.6912  br=0.2485  acc=0.6330


Sample: 100%|██████████| 330/330 [00:16, 19.73it/s, step size=2.88e-01, acc. prob=0.919]


[outer 036] TRAIN (EMA+K-ens) ll=0.6866  br=0.2464  acc=0.6390


Sample: 100%|██████████| 330/330 [00:18, 17.84it/s, step size=2.53e-01, acc. prob=0.957]


[outer 037] TRAIN (EMA+K-ens) ll=0.6926  br=0.2492  acc=0.6220


Sample: 100%|██████████| 330/330 [00:17, 19.32it/s, step size=3.83e-01, acc. prob=0.909]


[outer 038] TRAIN (EMA+K-ens) ll=0.6958  br=0.2509  acc=0.6180


Sample: 100%|██████████| 330/330 [00:17, 19.08it/s, step size=3.32e-01, acc. prob=0.926]


[outer 039] TRAIN (EMA+K-ens) ll=0.6972  br=0.2516  acc=0.6140


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=2.75e-01, acc. prob=0.939]


[outer 000] TRAIN (EMA+K-ens) ll=0.7192  br=0.2608  acc=0.5490


Sample: 100%|██████████| 330/330 [00:18, 17.99it/s, step size=2.26e-01, acc. prob=0.966]


[outer 001] TRAIN (EMA+K-ens) ll=0.7103  br=0.2576  acc=0.5820


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=2.67e-01, acc. prob=0.952]


[outer 002] TRAIN (EMA+K-ens) ll=0.7036  br=0.2546  acc=0.5910


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=2.88e-01, acc. prob=0.938]


[outer 003] TRAIN (EMA+K-ens) ll=0.7015  br=0.2536  acc=0.6290


Sample: 100%|██████████| 330/330 [00:16, 19.51it/s, step size=3.30e-01, acc. prob=0.922]


[outer 004] TRAIN (EMA+K-ens) ll=0.6929  br=0.2493  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 20.46it/s, step size=2.81e-01, acc. prob=0.947]


[outer 005] TRAIN (EMA+K-ens) ll=0.6902  br=0.2480  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 18.87it/s, step size=3.28e-01, acc. prob=0.946]


[outer 006] TRAIN (EMA+K-ens) ll=0.6860  br=0.2459  acc=0.6670


Sample: 100%|██████████| 330/330 [00:17, 19.37it/s, step size=2.85e-01, acc. prob=0.944]


[outer 007] TRAIN (EMA+K-ens) ll=0.6885  br=0.2471  acc=0.6670


Sample: 100%|██████████| 330/330 [00:15, 20.84it/s, step size=3.18e-01, acc. prob=0.936]


[outer 008] TRAIN (EMA+K-ens) ll=0.6887  br=0.2472  acc=0.6630


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=3.27e-01, acc. prob=0.923]


[outer 009] TRAIN (EMA+K-ens) ll=0.6857  br=0.2458  acc=0.6320


Sample: 100%|██████████| 330/330 [00:17, 18.88it/s, step size=2.48e-01, acc. prob=0.965]


[outer 010] TRAIN (EMA+K-ens) ll=0.6869  br=0.2464  acc=0.6170


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=3.08e-01, acc. prob=0.922]


[outer 011] TRAIN (EMA+K-ens) ll=0.6856  br=0.2458  acc=0.6190


Sample: 100%|██████████| 330/330 [00:17, 18.68it/s, step size=2.93e-01, acc. prob=0.937]


[outer 012] TRAIN (EMA+K-ens) ll=0.6924  br=0.2490  acc=0.6110


Sample: 100%|██████████| 330/330 [00:16, 19.66it/s, step size=3.06e-01, acc. prob=0.946]


[outer 013] TRAIN (EMA+K-ens) ll=0.6854  br=0.2458  acc=0.6420


Sample: 100%|██████████| 330/330 [00:18, 18.18it/s, step size=3.02e-01, acc. prob=0.936]


[outer 014] TRAIN (EMA+K-ens) ll=0.6858  br=0.2460  acc=0.6380


Sample: 100%|██████████| 330/330 [00:17, 19.12it/s, step size=3.78e-01, acc. prob=0.922]


[outer 015] TRAIN (EMA+K-ens) ll=0.6829  br=0.2447  acc=0.6540


Sample: 100%|██████████| 330/330 [00:18, 18.16it/s, step size=3.01e-01, acc. prob=0.935]


[outer 016] TRAIN (EMA+K-ens) ll=0.6789  br=0.2426  acc=0.6410


Sample: 100%|██████████| 330/330 [00:18, 18.31it/s, step size=2.80e-01, acc. prob=0.946]


[outer 017] TRAIN (EMA+K-ens) ll=0.6781  br=0.2422  acc=0.6450


Sample: 100%|██████████| 330/330 [00:17, 19.28it/s, step size=2.82e-01, acc. prob=0.948]


[outer 018] TRAIN (EMA+K-ens) ll=0.6748  br=0.2406  acc=0.6750


Sample: 100%|██████████| 330/330 [00:18, 18.23it/s, step size=2.53e-01, acc. prob=0.946]


[outer 019] TRAIN (EMA+K-ens) ll=0.6780  br=0.2421  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=3.17e-01, acc. prob=0.941]


[outer 020] TRAIN (EMA+K-ens) ll=0.6759  br=0.2410  acc=0.6710


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=2.54e-01, acc. prob=0.956]


[outer 021] TRAIN (EMA+K-ens) ll=0.6840  br=0.2449  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.90it/s, step size=2.46e-01, acc. prob=0.948]


[outer 022] TRAIN (EMA+K-ens) ll=0.6918  br=0.2486  acc=0.6600


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=2.76e-01, acc. prob=0.958]


[outer 023] TRAIN (EMA+K-ens) ll=0.6879  br=0.2466  acc=0.6630


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=3.73e-01, acc. prob=0.878]


[outer 024] TRAIN (EMA+K-ens) ll=0.6905  br=0.2479  acc=0.6390


Sample: 100%|██████████| 330/330 [00:16, 19.72it/s, step size=2.72e-01, acc. prob=0.943]


[outer 025] TRAIN (EMA+K-ens) ll=0.6923  br=0.2487  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 19.56it/s, step size=2.93e-01, acc. prob=0.947]


[outer 026] TRAIN (EMA+K-ens) ll=0.6901  br=0.2477  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.22it/s, step size=3.33e-01, acc. prob=0.941]


[outer 027] TRAIN (EMA+K-ens) ll=0.6877  br=0.2465  acc=0.6520


Sample: 100%|██████████| 330/330 [00:18, 18.13it/s, step size=2.85e-01, acc. prob=0.962]


[outer 028] TRAIN (EMA+K-ens) ll=0.6907  br=0.2478  acc=0.6580


Sample: 100%|██████████| 330/330 [00:16, 20.50it/s, step size=3.28e-01, acc. prob=0.917]


[outer 029] TRAIN (EMA+K-ens) ll=0.6868  br=0.2461  acc=0.6490


Sample: 100%|██████████| 330/330 [00:17, 18.75it/s, step size=2.89e-01, acc. prob=0.938]


[outer 030] TRAIN (EMA+K-ens) ll=0.6823  br=0.2439  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 18.86it/s, step size=3.22e-01, acc. prob=0.950]


[outer 031] TRAIN (EMA+K-ens) ll=0.6910  br=0.2482  acc=0.6370


Sample: 100%|██████████| 330/330 [00:18, 17.79it/s, step size=2.80e-01, acc. prob=0.926]


[outer 032] TRAIN (EMA+K-ens) ll=0.6881  br=0.2469  acc=0.6340


Sample: 100%|██████████| 330/330 [00:17, 18.54it/s, step size=2.81e-01, acc. prob=0.932]


[outer 033] TRAIN (EMA+K-ens) ll=0.6851  br=0.2455  acc=0.6430


Sample: 100%|██████████| 330/330 [00:16, 19.90it/s, step size=2.99e-01, acc. prob=0.919]


[outer 034] TRAIN (EMA+K-ens) ll=0.6910  br=0.2481  acc=0.6000


Sample: 100%|██████████| 330/330 [00:16, 20.02it/s, step size=2.89e-01, acc. prob=0.920]


[outer 035] TRAIN (EMA+K-ens) ll=0.6876  br=0.2467  acc=0.6130


Sample: 100%|██████████| 330/330 [00:17, 18.61it/s, step size=2.67e-01, acc. prob=0.938]


[outer 036] TRAIN (EMA+K-ens) ll=0.6827  br=0.2444  acc=0.6150


Sample: 100%|██████████| 330/330 [00:18, 17.95it/s, step size=3.07e-01, acc. prob=0.936]


[outer 037] TRAIN (EMA+K-ens) ll=0.6819  br=0.2441  acc=0.6070


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=3.06e-01, acc. prob=0.941]


[outer 038] TRAIN (EMA+K-ens) ll=0.6762  br=0.2414  acc=0.6380


Sample: 100%|██████████| 330/330 [00:16, 20.06it/s, step size=2.60e-01, acc. prob=0.932]


[outer 039] TRAIN (EMA+K-ens) ll=0.6726  br=0.2396  acc=0.6420


Sample: 100%|██████████| 330/330 [00:15, 21.57it/s, step size=3.56e-01, acc. prob=0.923]


[outer 000] TRAIN (EMA+K-ens) ll=0.6994  br=0.2526  acc=0.5640


Sample: 100%|██████████| 330/330 [00:17, 18.60it/s, step size=2.94e-01, acc. prob=0.931]


[outer 001] TRAIN (EMA+K-ens) ll=0.6744  br=0.2405  acc=0.6260


Sample: 100%|██████████| 330/330 [00:16, 20.04it/s, step size=2.73e-01, acc. prob=0.958]


[outer 002] TRAIN (EMA+K-ens) ll=0.6706  br=0.2388  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 18.95it/s, step size=2.71e-01, acc. prob=0.949]


[outer 003] TRAIN (EMA+K-ens) ll=0.6676  br=0.2372  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 19.98it/s, step size=3.04e-01, acc. prob=0.941]


[outer 004] TRAIN (EMA+K-ens) ll=0.6622  br=0.2346  acc=0.6690


Sample: 100%|██████████| 330/330 [00:17, 19.33it/s, step size=2.63e-01, acc. prob=0.945]


[outer 005] TRAIN (EMA+K-ens) ll=0.6621  br=0.2346  acc=0.6730


Sample: 100%|██████████| 330/330 [00:17, 18.88it/s, step size=3.07e-01, acc. prob=0.957]


[outer 006] TRAIN (EMA+K-ens) ll=0.6639  br=0.2354  acc=0.6660


Sample: 100%|██████████| 330/330 [00:17, 18.73it/s, step size=2.48e-01, acc. prob=0.946]


[outer 007] TRAIN (EMA+K-ens) ll=0.6644  br=0.2356  acc=0.6620


Sample: 100%|██████████| 330/330 [00:15, 20.80it/s, step size=3.12e-01, acc. prob=0.926]


[outer 008] TRAIN (EMA+K-ens) ll=0.6631  br=0.2348  acc=0.6970


Sample: 100%|██████████| 330/330 [00:16, 20.28it/s, step size=2.82e-01, acc. prob=0.950]


[outer 009] TRAIN (EMA+K-ens) ll=0.6660  br=0.2361  acc=0.6890


Sample: 100%|██████████| 330/330 [00:17, 18.94it/s, step size=2.77e-01, acc. prob=0.938]


[outer 010] TRAIN (EMA+K-ens) ll=0.6708  br=0.2383  acc=0.6550


Sample: 100%|██████████| 330/330 [00:18, 17.69it/s, step size=2.82e-01, acc. prob=0.942]


[outer 011] TRAIN (EMA+K-ens) ll=0.6741  br=0.2397  acc=0.6520


Sample: 100%|██████████| 330/330 [00:18, 18.31it/s, step size=3.27e-01, acc. prob=0.938]


[outer 012] TRAIN (EMA+K-ens) ll=0.6798  br=0.2424  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 19.18it/s, step size=2.69e-01, acc. prob=0.913]


[outer 013] TRAIN (EMA+K-ens) ll=0.6879  br=0.2460  acc=0.6300


Sample: 100%|██████████| 330/330 [00:16, 20.39it/s, step size=3.33e-01, acc. prob=0.942]


[outer 014] TRAIN (EMA+K-ens) ll=0.6894  br=0.2469  acc=0.6290


Sample: 100%|██████████| 330/330 [00:17, 18.78it/s, step size=2.85e-01, acc. prob=0.952]


[outer 015] TRAIN (EMA+K-ens) ll=0.6893  br=0.2471  acc=0.6060


Sample: 100%|██████████| 330/330 [00:17, 18.71it/s, step size=2.64e-01, acc. prob=0.938]


[outer 016] TRAIN (EMA+K-ens) ll=0.6933  br=0.2490  acc=0.6230


Sample: 100%|██████████| 330/330 [00:17, 18.49it/s, step size=2.57e-01, acc. prob=0.926]


[outer 017] TRAIN (EMA+K-ens) ll=0.6937  br=0.2493  acc=0.6230


Sample: 100%|██████████| 330/330 [00:16, 19.56it/s, step size=2.40e-01, acc. prob=0.959]


[outer 018] TRAIN (EMA+K-ens) ll=0.6933  br=0.2491  acc=0.6280


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=2.98e-01, acc. prob=0.948]


[outer 019] TRAIN (EMA+K-ens) ll=0.6901  br=0.2476  acc=0.6570


Sample: 100%|██████████| 330/330 [00:18, 18.31it/s, step size=2.85e-01, acc. prob=0.950]


[outer 020] TRAIN (EMA+K-ens) ll=0.6907  br=0.2479  acc=0.6440


Sample: 100%|██████████| 330/330 [00:16, 19.51it/s, step size=3.35e-01, acc. prob=0.898]


[outer 021] TRAIN (EMA+K-ens) ll=0.6837  br=0.2446  acc=0.6560


Sample: 100%|██████████| 330/330 [00:17, 19.27it/s, step size=3.02e-01, acc. prob=0.934]


[outer 022] TRAIN (EMA+K-ens) ll=0.6820  br=0.2438  acc=0.6670


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=2.74e-01, acc. prob=0.956]


[outer 023] TRAIN (EMA+K-ens) ll=0.6826  br=0.2438  acc=0.6530


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=2.64e-01, acc. prob=0.950]


[outer 024] TRAIN (EMA+K-ens) ll=0.6787  br=0.2419  acc=0.6540


Sample: 100%|██████████| 330/330 [00:17, 18.43it/s, step size=3.01e-01, acc. prob=0.953]


[outer 025] TRAIN (EMA+K-ens) ll=0.6761  br=0.2407  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 20.13it/s, step size=3.24e-01, acc. prob=0.930]


[outer 026] TRAIN (EMA+K-ens) ll=0.6760  br=0.2406  acc=0.6800


Sample: 100%|██████████| 330/330 [00:17, 18.51it/s, step size=2.95e-01, acc. prob=0.971]


[outer 027] TRAIN (EMA+K-ens) ll=0.6762  br=0.2409  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.66it/s, step size=3.44e-01, acc. prob=0.920]


[outer 028] TRAIN (EMA+K-ens) ll=0.6784  br=0.2418  acc=0.6750


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=2.81e-01, acc. prob=0.938]


[outer 029] TRAIN (EMA+K-ens) ll=0.6763  br=0.2408  acc=0.6880


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=3.23e-01, acc. prob=0.922]


[outer 030] TRAIN (EMA+K-ens) ll=0.6738  br=0.2395  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 20.43it/s, step size=3.16e-01, acc. prob=0.918]


[outer 031] TRAIN (EMA+K-ens) ll=0.6769  br=0.2410  acc=0.6850


Sample: 100%|██████████| 330/330 [00:16, 19.64it/s, step size=3.23e-01, acc. prob=0.913]


[outer 032] TRAIN (EMA+K-ens) ll=0.6760  br=0.2406  acc=0.6710


Sample: 100%|██████████| 330/330 [00:17, 18.52it/s, step size=3.20e-01, acc. prob=0.881]


[outer 033] TRAIN (EMA+K-ens) ll=0.6818  br=0.2433  acc=0.6620


Sample: 100%|██████████| 330/330 [00:17, 18.90it/s, step size=2.98e-01, acc. prob=0.928]


[outer 034] TRAIN (EMA+K-ens) ll=0.6773  br=0.2412  acc=0.6640


Sample: 100%|██████████| 330/330 [00:16, 20.09it/s, step size=2.91e-01, acc. prob=0.956]


[outer 035] TRAIN (EMA+K-ens) ll=0.6775  br=0.2413  acc=0.6410
[Early stop @ outer 35] Δll=0.219%, Δbr=0.344%, Δacc=0.004


Sample: 100%|██████████| 330/330 [00:18, 18.26it/s, step size=2.44e-01, acc. prob=0.959]


[outer 000] TRAIN (EMA+K-ens) ll=0.7377  br=0.2707  acc=0.4330


Sample: 100%|██████████| 330/330 [00:17, 18.44it/s, step size=3.24e-01, acc. prob=0.943]


[outer 001] TRAIN (EMA+K-ens) ll=0.7225  br=0.2642  acc=0.5340


Sample: 100%|██████████| 330/330 [00:17, 18.77it/s, step size=3.05e-01, acc. prob=0.938]


[outer 002] TRAIN (EMA+K-ens) ll=0.7158  br=0.2609  acc=0.5380


Sample: 100%|██████████| 330/330 [00:17, 19.40it/s, step size=2.13e-01, acc. prob=0.957]


[outer 003] TRAIN (EMA+K-ens) ll=0.7021  br=0.2542  acc=0.5760


Sample: 100%|██████████| 330/330 [00:18, 18.26it/s, step size=3.09e-01, acc. prob=0.904]


[outer 004] TRAIN (EMA+K-ens) ll=0.6992  br=0.2528  acc=0.5970


Sample: 100%|██████████| 330/330 [00:17, 19.15it/s, step size=2.87e-01, acc. prob=0.922]


[outer 005] TRAIN (EMA+K-ens) ll=0.6995  br=0.2529  acc=0.6150


Sample: 100%|██████████| 330/330 [00:16, 20.00it/s, step size=2.81e-01, acc. prob=0.931]


[outer 006] TRAIN (EMA+K-ens) ll=0.7002  br=0.2533  acc=0.6400


Sample: 100%|██████████| 330/330 [00:17, 18.99it/s, step size=2.68e-01, acc. prob=0.946]


[outer 007] TRAIN (EMA+K-ens) ll=0.6961  br=0.2512  acc=0.6410


Sample: 100%|██████████| 330/330 [00:16, 20.38it/s, step size=2.70e-01, acc. prob=0.954]


[outer 008] TRAIN (EMA+K-ens) ll=0.6938  br=0.2500  acc=0.6090


Sample: 100%|██████████| 330/330 [00:17, 19.35it/s, step size=2.80e-01, acc. prob=0.943]


[outer 009] TRAIN (EMA+K-ens) ll=0.6908  br=0.2486  acc=0.5920


Sample: 100%|██████████| 330/330 [00:16, 19.50it/s, step size=2.84e-01, acc. prob=0.945]


[outer 010] TRAIN (EMA+K-ens) ll=0.6879  br=0.2472  acc=0.5970


Sample: 100%|██████████| 330/330 [00:17, 19.10it/s, step size=3.02e-01, acc. prob=0.957]


[outer 011] TRAIN (EMA+K-ens) ll=0.6899  br=0.2482  acc=0.6190


Sample: 100%|██████████| 330/330 [00:18, 18.22it/s, step size=3.16e-01, acc. prob=0.948]


[outer 012] TRAIN (EMA+K-ens) ll=0.6917  br=0.2490  acc=0.6210


Sample: 100%|██████████| 330/330 [00:15, 20.80it/s, step size=3.10e-01, acc. prob=0.939]


[outer 013] TRAIN (EMA+K-ens) ll=0.6935  br=0.2498  acc=0.6060


Sample: 100%|██████████| 330/330 [00:18, 18.10it/s, step size=3.03e-01, acc. prob=0.962]


[outer 014] TRAIN (EMA+K-ens) ll=0.6904  br=0.2483  acc=0.6120


Sample: 100%|██████████| 330/330 [00:17, 19.25it/s, step size=3.17e-01, acc. prob=0.962]


[outer 015] TRAIN (EMA+K-ens) ll=0.6897  br=0.2480  acc=0.6310


Sample: 100%|██████████| 330/330 [00:17, 19.16it/s, step size=3.61e-01, acc. prob=0.906]


[outer 016] TRAIN (EMA+K-ens) ll=0.6864  br=0.2464  acc=0.6280


Sample: 100%|██████████| 330/330 [00:17, 18.65it/s, step size=2.38e-01, acc. prob=0.963]


[outer 017] TRAIN (EMA+K-ens) ll=0.6819  br=0.2441  acc=0.6310


Sample: 100%|██████████| 330/330 [00:18, 18.27it/s, step size=3.34e-01, acc. prob=0.923]


[outer 018] TRAIN (EMA+K-ens) ll=0.6867  br=0.2463  acc=0.6510


Sample: 100%|██████████| 330/330 [00:16, 20.47it/s, step size=2.93e-01, acc. prob=0.920]


[outer 019] TRAIN (EMA+K-ens) ll=0.6893  br=0.2476  acc=0.6320


Sample: 100%|██████████| 330/330 [00:16, 20.03it/s, step size=3.30e-01, acc. prob=0.907]


[outer 020] TRAIN (EMA+K-ens) ll=0.6906  br=0.2482  acc=0.6330


Sample: 100%|██████████| 330/330 [00:16, 19.73it/s, step size=3.29e-01, acc. prob=0.909]


[outer 021] TRAIN (EMA+K-ens) ll=0.6816  br=0.2437  acc=0.6490


Sample: 100%|██████████| 330/330 [00:16, 20.27it/s, step size=3.03e-01, acc. prob=0.923]


[outer 022] TRAIN (EMA+K-ens) ll=0.6800  br=0.2429  acc=0.6520


Sample: 100%|██████████| 330/330 [00:18, 18.29it/s, step size=2.56e-01, acc. prob=0.947]


[outer 023] TRAIN (EMA+K-ens) ll=0.6804  br=0.2430  acc=0.6490


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=3.01e-01, acc. prob=0.958]


[outer 024] TRAIN (EMA+K-ens) ll=0.6814  br=0.2434  acc=0.6480


Sample: 100%|██████████| 330/330 [00:18, 18.24it/s, step size=3.57e-01, acc. prob=0.911]


[outer 025] TRAIN (EMA+K-ens) ll=0.6822  br=0.2441  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 18.74it/s, step size=2.85e-01, acc. prob=0.955]


[outer 026] TRAIN (EMA+K-ens) ll=0.6786  br=0.2424  acc=0.6430


Sample: 100%|██████████| 330/330 [00:16, 19.97it/s, step size=2.61e-01, acc. prob=0.960]


[outer 027] TRAIN (EMA+K-ens) ll=0.6777  br=0.2419  acc=0.6470


Sample: 100%|██████████| 330/330 [00:16, 19.88it/s, step size=3.24e-01, acc. prob=0.941]


[outer 028] TRAIN (EMA+K-ens) ll=0.6748  br=0.2406  acc=0.6670


Sample: 100%|██████████| 330/330 [00:17, 19.19it/s, step size=2.84e-01, acc. prob=0.942]


[outer 029] TRAIN (EMA+K-ens) ll=0.6791  br=0.2425  acc=0.6740


Sample: 100%|██████████| 330/330 [00:16, 19.71it/s, step size=3.25e-01, acc. prob=0.941]


[outer 030] TRAIN (EMA+K-ens) ll=0.6757  br=0.2410  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 19.95it/s, step size=2.74e-01, acc. prob=0.937]


[outer 031] TRAIN (EMA+K-ens) ll=0.6796  br=0.2429  acc=0.6460


Sample: 100%|██████████| 330/330 [00:17, 18.50it/s, step size=2.84e-01, acc. prob=0.937]


[outer 032] TRAIN (EMA+K-ens) ll=0.6815  br=0.2439  acc=0.6400


Sample: 100%|██████████| 330/330 [00:16, 19.61it/s, step size=2.96e-01, acc. prob=0.915]


[outer 033] TRAIN (EMA+K-ens) ll=0.6851  br=0.2456  acc=0.6410


Sample: 100%|██████████| 330/330 [00:20, 16.33it/s, step size=2.57e-01, acc. prob=0.942]


[outer 034] TRAIN (EMA+K-ens) ll=0.6848  br=0.2454  acc=0.6300


Sample: 100%|██████████| 330/330 [00:18, 17.83it/s, step size=3.42e-01, acc. prob=0.925]


[outer 035] TRAIN (EMA+K-ens) ll=0.6799  br=0.2430  acc=0.6320


Sample: 100%|██████████| 330/330 [00:18, 18.27it/s, step size=2.82e-01, acc. prob=0.944]


[outer 036] TRAIN (EMA+K-ens) ll=0.6799  br=0.2431  acc=0.6250


Sample: 100%|██████████| 330/330 [00:15, 21.08it/s, step size=3.06e-01, acc. prob=0.951]


[outer 037] TRAIN (EMA+K-ens) ll=0.6824  br=0.2443  acc=0.6030


Sample: 100%|██████████| 330/330 [00:16, 19.43it/s, step size=3.61e-01, acc. prob=0.905]


[outer 038] TRAIN (EMA+K-ens) ll=0.6903  br=0.2482  acc=0.5990


Sample: 100%|██████████| 330/330 [00:17, 18.80it/s, step size=3.18e-01, acc. prob=0.933]


[outer 039] TRAIN (EMA+K-ens) ll=0.6895  br=0.2477  acc=0.5950


Sample: 100%|██████████| 330/330 [00:17, 18.83it/s, step size=2.83e-01, acc. prob=0.939]


[outer 000] TRAIN (EMA+K-ens) ll=0.6791  br=0.2425  acc=0.5740


Sample: 100%|██████████| 330/330 [00:16, 19.96it/s, step size=3.66e-01, acc. prob=0.881]


[outer 001] TRAIN (EMA+K-ens) ll=0.6714  br=0.2388  acc=0.6510


Sample: 100%|██████████| 330/330 [00:15, 21.64it/s, step size=4.01e-01, acc. prob=0.851]


[outer 002] TRAIN (EMA+K-ens) ll=0.6764  br=0.2412  acc=0.6390


Sample: 100%|██████████| 330/330 [00:18, 17.84it/s, step size=2.88e-01, acc. prob=0.946]


[outer 003] TRAIN (EMA+K-ens) ll=0.6796  br=0.2428  acc=0.6520


Sample: 100%|██████████| 330/330 [00:18, 18.05it/s, step size=2.17e-01, acc. prob=0.960]


[outer 004] TRAIN (EMA+K-ens) ll=0.6779  br=0.2420  acc=0.6660


Sample: 100%|██████████| 330/330 [00:15, 20.65it/s, step size=3.01e-01, acc. prob=0.913]


[outer 005] TRAIN (EMA+K-ens) ll=0.6781  br=0.2422  acc=0.6650


Sample: 100%|██████████| 330/330 [00:19, 16.95it/s, step size=2.78e-01, acc. prob=0.929]


[outer 006] TRAIN (EMA+K-ens) ll=0.6782  br=0.2422  acc=0.6610


Sample: 100%|██████████| 330/330 [00:16, 19.82it/s, step size=3.10e-01, acc. prob=0.890]


[outer 007] TRAIN (EMA+K-ens) ll=0.6746  br=0.2405  acc=0.6800


Sample: 100%|██████████| 330/330 [00:16, 19.69it/s, step size=2.92e-01, acc. prob=0.943]


[outer 008] TRAIN (EMA+K-ens) ll=0.6771  br=0.2416  acc=0.6870


Sample: 100%|██████████| 330/330 [00:18, 17.61it/s, step size=2.75e-01, acc. prob=0.948]


[outer 009] TRAIN (EMA+K-ens) ll=0.6803  br=0.2431  acc=0.6900


Sample: 100%|██████████| 330/330 [00:16, 19.45it/s, step size=2.99e-01, acc. prob=0.900]


[outer 010] TRAIN (EMA+K-ens) ll=0.6782  br=0.2421  acc=0.6960


Sample: 100%|██████████| 330/330 [00:19, 17.11it/s, step size=3.00e-01, acc. prob=0.942]


[outer 011] TRAIN (EMA+K-ens) ll=0.6756  br=0.2408  acc=0.6960


Sample: 100%|██████████| 330/330 [00:18, 18.10it/s, step size=2.44e-01, acc. prob=0.962]


[outer 012] TRAIN (EMA+K-ens) ll=0.6746  br=0.2404  acc=0.6680


Sample: 100%|██████████| 330/330 [00:15, 20.77it/s, step size=3.02e-01, acc. prob=0.958]


[outer 013] TRAIN (EMA+K-ens) ll=0.6745  br=0.2402  acc=0.6730


Sample: 100%|██████████| 330/330 [00:16, 19.68it/s, step size=3.39e-01, acc. prob=0.933]


[outer 014] TRAIN (EMA+K-ens) ll=0.6766  br=0.2412  acc=0.6650


Sample: 100%|██████████| 330/330 [00:17, 18.82it/s, step size=2.83e-01, acc. prob=0.948]


[outer 015] TRAIN (EMA+K-ens) ll=0.6793  br=0.2425  acc=0.6550


Sample: 100%|██████████| 330/330 [00:17, 19.06it/s, step size=2.90e-01, acc. prob=0.924]


[outer 016] TRAIN (EMA+K-ens) ll=0.6817  br=0.2436  acc=0.6420


Sample: 100%|██████████| 330/330 [00:16, 19.54it/s, step size=2.89e-01, acc. prob=0.939]


[outer 017] TRAIN (EMA+K-ens) ll=0.6765  br=0.2412  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.26it/s, step size=3.18e-01, acc. prob=0.898]


[outer 018] TRAIN (EMA+K-ens) ll=0.6739  br=0.2400  acc=0.6450


Sample: 100%|██████████| 330/330 [00:15, 21.34it/s, step size=2.99e-01, acc. prob=0.934]


[outer 019] TRAIN (EMA+K-ens) ll=0.6765  br=0.2412  acc=0.6460


Sample: 100%|██████████| 330/330 [00:15, 20.94it/s, step size=2.90e-01, acc. prob=0.945]


[outer 020] TRAIN (EMA+K-ens) ll=0.6761  br=0.2412  acc=0.6550


Sample: 100%|██████████| 330/330 [00:16, 19.45it/s, step size=2.69e-01, acc. prob=0.953]


[outer 021] TRAIN (EMA+K-ens) ll=0.6720  br=0.2391  acc=0.6790


Sample: 100%|██████████| 330/330 [00:17, 18.98it/s, step size=3.05e-01, acc. prob=0.928]


[outer 022] TRAIN (EMA+K-ens) ll=0.6708  br=0.2385  acc=0.6830


Sample: 100%|██████████| 330/330 [00:16, 19.88it/s, step size=2.82e-01, acc. prob=0.939]


[outer 023] TRAIN (EMA+K-ens) ll=0.6690  br=0.2377  acc=0.6690


Sample: 100%|██████████| 330/330 [00:16, 19.46it/s, step size=3.16e-01, acc. prob=0.935]


[outer 024] TRAIN (EMA+K-ens) ll=0.6642  br=0.2354  acc=0.6530


Sample: 100%|██████████| 330/330 [00:17, 18.67it/s, step size=3.10e-01, acc. prob=0.937]


[outer 025] TRAIN (EMA+K-ens) ll=0.6674  br=0.2371  acc=0.6370


Sample: 100%|██████████| 330/330 [00:17, 18.59it/s, step size=2.67e-01, acc. prob=0.949]


[outer 026] TRAIN (EMA+K-ens) ll=0.6667  br=0.2366  acc=0.6670


Sample: 100%|██████████| 330/330 [00:18, 18.12it/s, step size=3.04e-01, acc. prob=0.952]


[outer 027] TRAIN (EMA+K-ens) ll=0.6676  br=0.2372  acc=0.6540


Sample: 100%|██████████| 330/330 [00:16, 19.94it/s, step size=3.04e-01, acc. prob=0.949]


[outer 028] TRAIN (EMA+K-ens) ll=0.6664  br=0.2365  acc=0.6480


Sample: 100%|██████████| 330/330 [00:17, 18.37it/s, step size=3.02e-01, acc. prob=0.940]


[outer 029] TRAIN (EMA+K-ens) ll=0.6686  br=0.2376  acc=0.6500


Sample: 100%|██████████| 330/330 [00:19, 17.20it/s, step size=2.83e-01, acc. prob=0.943]


[outer 030] TRAIN (EMA+K-ens) ll=0.6697  br=0.2382  acc=0.6510


Sample: 100%|██████████| 330/330 [00:17, 18.82it/s, step size=3.41e-01, acc. prob=0.920]


[outer 031] TRAIN (EMA+K-ens) ll=0.6719  br=0.2392  acc=0.6750


Sample: 100%|██████████| 330/330 [00:16, 19.48it/s, step size=3.35e-01, acc. prob=0.927]


[outer 032] TRAIN (EMA+K-ens) ll=0.6714  br=0.2390  acc=0.6630


Sample: 100%|██████████| 330/330 [00:16, 20.07it/s, step size=2.99e-01, acc. prob=0.938]


[outer 033] TRAIN (EMA+K-ens) ll=0.6724  br=0.2395  acc=0.6470


Sample: 100%|██████████| 330/330 [00:17, 18.49it/s, step size=2.53e-01, acc. prob=0.968]


[outer 034] TRAIN (EMA+K-ens) ll=0.6803  br=0.2433  acc=0.6300


Sample: 100%|██████████| 330/330 [00:17, 19.31it/s, step size=2.86e-01, acc. prob=0.926]


[outer 035] TRAIN (EMA+K-ens) ll=0.6816  br=0.2439  acc=0.6340


Sample: 100%|██████████| 330/330 [00:17, 18.40it/s, step size=3.11e-01, acc. prob=0.950]


[outer 036] TRAIN (EMA+K-ens) ll=0.6840  br=0.2452  acc=0.6200


Sample: 100%|██████████| 330/330 [00:17, 19.15it/s, step size=2.85e-01, acc. prob=0.962]


[outer 037] TRAIN (EMA+K-ens) ll=0.6863  br=0.2463  acc=0.5840


Sample: 100%|██████████| 330/330 [00:17, 18.87it/s, step size=2.34e-01, acc. prob=0.959]


[outer 038] TRAIN (EMA+K-ens) ll=0.6859  br=0.2461  acc=0.5840


Sample: 100%|██████████| 330/330 [00:14, 22.69it/s, step size=3.36e-01, acc. prob=0.949]

[outer 039] TRAIN (EMA+K-ens) ll=0.6855  br=0.2458  acc=0.5840
          accuracy     brier   logloss
mean      0.578846  0.277436  0.856851
std       0.004835  0.004939  0.030516
median    0.579780  0.277597  0.860290
<lambda>  0.575095  0.274042  0.837054
<lambda>  0.582435  0.280247  0.880823
   accuracy     brier   logloss
0   0.58252  0.284985  0.867139
1   0.57718  0.278085  0.853441
2   0.58110  0.270300  0.841767
3   0.58540  0.273499  0.874698
4   0.58310  0.283482  0.882864
5   0.57050  0.275668  0.806867
6   0.58218  0.277108  0.835483
7   0.57440  0.279853  0.893974
8   0.57846  0.271002  0.819351
9   0.57362  0.280378  0.892929
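
A chance-level baseline helps when reading the `ll`, `br`, and `acc` columns in the logs and the table above. The sketch below assumes the standard definitions of binary log loss, Brier score, and accuracy at a 0.5 threshold. The notebook's own metric code is not shown in this section, so `binary_metrics` is a hypothetical helper for illustration, not the author's function.

```python
import numpy as np

def binary_metrics(y, p, eps=1e-12):
    """Hypothetical helper: standard binary metrics from predicted probabilities."""
    y = np.asarray(y, dtype=float)
    # Clip probabilities so log() never sees exactly 0 or 1
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    logloss = -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    brier = np.mean((p - y) ** 2)
    accuracy = np.mean((p >= 0.5) == (y == 1.0))
    return {"accuracy": accuracy, "brier": brier, "logloss": logloss}

# Toy labels (illustrative only) scored against a constant prediction of 0.5
y_toy = np.array([0, 1, 1, 0])
baseline = binary_metrics(y_toy, np.full(y_toy.shape, 0.5))
print(baseline)
```

Under these definitions a constant prediction of 0.5 gives a log loss of ln 2 ≈ 0.693 and a Brier score of 0.25, whatever the labels are. If the notebook uses the same definitions, the mean test log loss (0.857) and Brier score (0.277) in the summary above sit above those reference points.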





In [None]:

noise_type = "t"
n_per_group_train = 200
use_long = False

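all_metrics = []  # reset so the summary below covers only this run's 10 seeds

for seed in range(10):
    # Seed NumPy and PyTorch so each replicate is reproducible
    np.random.seed(seed); torch.manual_seed(seed)
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=n_per_group_train
    )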
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_simulated_t_test, feature_cols,
        interaction=False, nonlinear=False, group=False,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    # Predict on the test set with the final θ (or use only p_mean from the last 50% of the NUTS draws)
    p_test, m = predict_probit(res["final_theta"], df_simulated_t_test, feature_cols, False, False, False)
    all_metrics.append(m)

# Aggregate the per-seed metrics
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean','std','median',lambda s: s.quantile(0.25),lambda s: s.quantile(0.75)])
print(summary)
print(df)
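
The aggregation above passes two anonymous functions to `agg`, so pandas labels both quantile rows `<lambda>` in the printed summary. In row order they are the 25th and 75th percentiles. A minimal sketch of one way to get readable labels is to use named functions. The names `q25` and `q75` are assumptions introduced here, and the values are the per-seed accuracy and Brier scores copied from the table printed before this cell.

```python
import pandas as pd

def q25(s):
    """25th percentile of a Series (hypothetical helper name)."""
    return s.quantile(0.25)

def q75(s):
    """75th percentile of a Series (hypothetical helper name)."""
    return s.quantile(0.75)

# Per-seed test metrics copied from the table printed earlier
df_demo = pd.DataFrame({
    "accuracy": [0.58252, 0.57718, 0.58110, 0.58540, 0.58310,
                 0.57050, 0.58218, 0.57440, 0.57846, 0.57362],
    "brier": [0.284985, 0.278085, 0.270300, 0.273499, 0.283482,
              0.275668, 0.277108, 0.279853, 0.271002, 0.280378],
})

# Named functions give the summary rows the labels "q25" and "q75"
summary_demo = df_demo.agg(["mean", "std", "median", q25, q75])
print(summary_demo)
```

The numbers are unchanged. Only the row labels differ, which makes the interquartile range easier to pick out when several summaries are compared.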

Sample: 100%|██████████| 330/330 [00:17, 19.27it/s, step size=2.71e-01, acc. prob=0.936]


[outer 000] TRAIN (EMA+K-ens) ll=0.6498  br=0.2284  acc=0.6640


Sample: 100%|██████████| 330/330 [00:14, 22.40it/s, step size=2.71e-01, acc. prob=0.968]


[outer 001] TRAIN (EMA+K-ens) ll=0.6625  br=0.2342  acc=0.6610


Sample: 100%|██████████| 330/330 [00:14, 23.35it/s, step size=2.55e-01, acc. prob=0.965]


[outer 002] TRAIN (EMA+K-ens) ll=0.6499  br=0.2284  acc=0.6640


Sample: 100%|██████████| 330/330 [00:14, 22.88it/s, step size=2.62e-01, acc. prob=0.969]


[outer 003] TRAIN (EMA+K-ens) ll=0.6713  br=0.2380  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 24.13it/s, step size=2.61e-01, acc. prob=0.959]


[outer 004] TRAIN (EMA+K-ens) ll=0.6583  br=0.2320  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.10it/s, step size=2.85e-01, acc. prob=0.938]


[outer 005] TRAIN (EMA+K-ens) ll=0.6678  br=0.2365  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 24.26it/s, step size=2.54e-01, acc. prob=0.966]


[outer 006] TRAIN (EMA+K-ens) ll=0.6594  br=0.2327  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 24.38it/s, step size=2.43e-01, acc. prob=0.966]


[outer 007] TRAIN (EMA+K-ens) ll=0.6665  br=0.2358  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 23.52it/s, step size=2.68e-01, acc. prob=0.937]


[outer 008] TRAIN (EMA+K-ens) ll=0.6718  br=0.2382  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 24.78it/s, step size=2.69e-01, acc. prob=0.946]


[outer 009] TRAIN (EMA+K-ens) ll=0.6611  br=0.2331  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.18it/s, step size=2.37e-01, acc. prob=0.979]


[outer 010] TRAIN (EMA+K-ens) ll=0.6634  br=0.2341  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.47it/s, step size=2.88e-01, acc. prob=0.965]


[outer 011] TRAIN (EMA+K-ens) ll=0.6465  br=0.2264  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 23.46it/s, step size=2.41e-01, acc. prob=0.974]


[outer 012] TRAIN (EMA+K-ens) ll=0.6441  br=0.2253  acc=0.7050


Sample: 100%|██████████| 330/330 [00:12, 26.61it/s, step size=3.86e-01, acc. prob=0.912]


[outer 013] TRAIN (EMA+K-ens) ll=0.6366  br=0.2218  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.88it/s, step size=2.73e-01, acc. prob=0.927]


[outer 014] TRAIN (EMA+K-ens) ll=0.6480  br=0.2272  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 24.53it/s, step size=2.39e-01, acc. prob=0.957]


[outer 015] TRAIN (EMA+K-ens) ll=0.6364  br=0.2218  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 24.07it/s, step size=2.92e-01, acc. prob=0.906]


[outer 016] TRAIN (EMA+K-ens) ll=0.6229  br=0.2154  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 22.24it/s, step size=2.54e-01, acc. prob=0.965]


[outer 017] TRAIN (EMA+K-ens) ll=0.6270  br=0.2174  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 23.87it/s, step size=2.47e-01, acc. prob=0.958]


[outer 018] TRAIN (EMA+K-ens) ll=0.6248  br=0.2163  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 24.96it/s, step size=2.62e-01, acc. prob=0.970]


[outer 019] TRAIN (EMA+K-ens) ll=0.6308  br=0.2192  acc=0.7050


Sample: 100%|██████████| 330/330 [00:12, 25.45it/s, step size=3.49e-01, acc. prob=0.927]


[outer 020] TRAIN (EMA+K-ens) ll=0.6355  br=0.2213  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 23.88it/s, step size=2.93e-01, acc. prob=0.944]


[outer 021] TRAIN (EMA+K-ens) ll=0.6343  br=0.2208  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.92it/s, step size=2.45e-01, acc. prob=0.968]


[outer 022] TRAIN (EMA+K-ens) ll=0.6276  br=0.2177  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 24.57it/s, step size=2.56e-01, acc. prob=0.962]


[outer 023] TRAIN (EMA+K-ens) ll=0.6290  br=0.2183  acc=0.7050


Sample: 100%|██████████| 330/330 [00:14, 22.41it/s, step size=2.47e-01, acc. prob=0.961]


[outer 024] TRAIN (EMA+K-ens) ll=0.6319  br=0.2197  acc=0.7050


Sample: 100%|██████████| 330/330 [00:12, 26.48it/s, step size=3.53e-01, acc. prob=0.918]


[outer 025] TRAIN (EMA+K-ens) ll=0.6340  br=0.2207  acc=0.7050
[Early stop @ outer 25] Δll=0.380%, Δbr=0.493%, Δacc=0.000


Sample: 100%|██████████| 330/330 [00:15, 21.59it/s, step size=2.67e-01, acc. prob=0.934]


[outer 000] TRAIN (EMA+K-ens) ll=0.7376  br=0.2712  acc=0.4500


Sample: 100%|██████████| 330/330 [00:15, 21.10it/s, step size=2.44e-01, acc. prob=0.951]


[outer 001] TRAIN (EMA+K-ens) ll=0.7362  br=0.2700  acc=0.4720


Sample: 100%|██████████| 330/330 [00:16, 20.26it/s, step size=2.57e-01, acc. prob=0.951]


[outer 002] TRAIN (EMA+K-ens) ll=0.7454  br=0.2745  acc=0.4810


Sample: 100%|██████████| 330/330 [00:14, 22.45it/s, step size=3.21e-01, acc. prob=0.930]


[outer 003] TRAIN (EMA+K-ens) ll=0.7226  br=0.2636  acc=0.5710


Sample: 100%|██████████| 330/330 [00:15, 21.66it/s, step size=2.77e-01, acc. prob=0.966]


[outer 004] TRAIN (EMA+K-ens) ll=0.7125  br=0.2587  acc=0.5970


Sample: 100%|██████████| 330/330 [00:14, 22.09it/s, step size=2.27e-01, acc. prob=0.950]


[outer 005] TRAIN (EMA+K-ens) ll=0.7168  br=0.2607  acc=0.6000


Sample: 100%|██████████| 330/330 [00:13, 25.12it/s, step size=3.04e-01, acc. prob=0.934]


[outer 006] TRAIN (EMA+K-ens) ll=0.7079  br=0.2564  acc=0.6220


Sample: 100%|██████████| 330/330 [00:15, 21.39it/s, step size=2.15e-01, acc. prob=0.958]


[outer 007] TRAIN (EMA+K-ens) ll=0.7189  br=0.2618  acc=0.5740


Sample: 100%|██████████| 330/330 [00:15, 21.58it/s, step size=3.01e-01, acc. prob=0.957]


[outer 008] TRAIN (EMA+K-ens) ll=0.7157  br=0.2601  acc=0.5880


Sample: 100%|██████████| 330/330 [00:14, 22.19it/s, step size=2.64e-01, acc. prob=0.950]


[outer 009] TRAIN (EMA+K-ens) ll=0.7039  br=0.2545  acc=0.6250


Sample: 100%|██████████| 330/330 [00:13, 25.24it/s, step size=2.34e-01, acc. prob=0.970]


[outer 010] TRAIN (EMA+K-ens) ll=0.6976  br=0.2514  acc=0.6480


Sample: 100%|██████████| 330/330 [00:15, 21.81it/s, step size=2.07e-01, acc. prob=0.961]


[outer 011] TRAIN (EMA+K-ens) ll=0.6990  br=0.2520  acc=0.6500


Sample: 100%|██████████| 330/330 [00:17, 19.17it/s, step size=1.97e-01, acc. prob=0.969]


[outer 012] TRAIN (EMA+K-ens) ll=0.6953  br=0.2502  acc=0.6730


Sample: 100%|██████████| 330/330 [00:15, 21.37it/s, step size=2.79e-01, acc. prob=0.942]


[outer 013] TRAIN (EMA+K-ens) ll=0.6913  br=0.2484  acc=0.6700


Sample: 100%|██████████| 330/330 [00:14, 22.22it/s, step size=2.62e-01, acc. prob=0.952]


[outer 014] TRAIN (EMA+K-ens) ll=0.6925  br=0.2489  acc=0.6740


Sample: 100%|██████████| 330/330 [00:14, 22.79it/s, step size=2.25e-01, acc. prob=0.958]


[outer 015] TRAIN (EMA+K-ens) ll=0.6795  br=0.2425  acc=0.6830


Sample: 100%|██████████| 330/330 [00:13, 24.23it/s, step size=2.60e-01, acc. prob=0.957]


[outer 016] TRAIN (EMA+K-ens) ll=0.6722  br=0.2390  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 23.12it/s, step size=2.99e-01, acc. prob=0.922]


[outer 017] TRAIN (EMA+K-ens) ll=0.6714  br=0.2385  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.15it/s, step size=2.57e-01, acc. prob=0.950]


[outer 018] TRAIN (EMA+K-ens) ll=0.6685  br=0.2373  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.13it/s, step size=2.88e-01, acc. prob=0.972]


[outer 019] TRAIN (EMA+K-ens) ll=0.6702  br=0.2381  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.89it/s, step size=2.52e-01, acc. prob=0.943]


[outer 020] TRAIN (EMA+K-ens) ll=0.6673  br=0.2367  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 23.21it/s, step size=2.25e-01, acc. prob=0.969]


[outer 021] TRAIN (EMA+K-ens) ll=0.6739  br=0.2398  acc=0.6790


Sample: 100%|██████████| 330/330 [00:14, 22.74it/s, step size=2.79e-01, acc. prob=0.934]


[outer 022] TRAIN (EMA+K-ens) ll=0.6648  br=0.2355  acc=0.6890


Sample: 100%|██████████| 330/330 [00:15, 21.15it/s, step size=2.62e-01, acc. prob=0.948]


[outer 023] TRAIN (EMA+K-ens) ll=0.6577  br=0.2321  acc=0.6860


Sample: 100%|██████████| 330/330 [00:13, 24.91it/s, step size=2.93e-01, acc. prob=0.941]


[outer 024] TRAIN (EMA+K-ens) ll=0.6601  br=0.2332  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.46it/s, step size=2.61e-01, acc. prob=0.953]


[outer 025] TRAIN (EMA+K-ens) ll=0.6666  br=0.2364  acc=0.6840


Sample: 100%|██████████| 330/330 [00:15, 21.14it/s, step size=2.17e-01, acc. prob=0.962]


[outer 026] TRAIN (EMA+K-ens) ll=0.6751  br=0.2404  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.42it/s, step size=3.00e-01, acc. prob=0.955]


[outer 027] TRAIN (EMA+K-ens) ll=0.6676  br=0.2369  acc=0.6740


Sample: 100%|██████████| 330/330 [00:14, 22.11it/s, step size=2.01e-01, acc. prob=0.935]


[outer 028] TRAIN (EMA+K-ens) ll=0.6726  br=0.2392  acc=0.6570


Sample: 100%|██████████| 330/330 [00:15, 21.27it/s, step size=2.67e-01, acc. prob=0.930]


[outer 029] TRAIN (EMA+K-ens) ll=0.6630  br=0.2347  acc=0.6730


Sample: 100%|██████████| 330/330 [00:14, 22.91it/s, step size=2.09e-01, acc. prob=0.969]


[outer 030] TRAIN (EMA+K-ens) ll=0.6630  br=0.2348  acc=0.6550


Sample: 100%|██████████| 330/330 [00:14, 22.57it/s, step size=2.45e-01, acc. prob=0.947]


[outer 031] TRAIN (EMA+K-ens) ll=0.6705  br=0.2384  acc=0.6360


Sample: 100%|██████████| 330/330 [00:13, 23.61it/s, step size=2.64e-01, acc. prob=0.966]


[outer 032] TRAIN (EMA+K-ens) ll=0.6683  br=0.2373  acc=0.6510


Sample: 100%|██████████| 330/330 [00:13, 24.81it/s, step size=2.73e-01, acc. prob=0.928]


[outer 033] TRAIN (EMA+K-ens) ll=0.6643  br=0.2352  acc=0.6740


Sample: 100%|██████████| 330/330 [00:14, 22.67it/s, step size=2.76e-01, acc. prob=0.913]


[outer 034] TRAIN (EMA+K-ens) ll=0.6519  br=0.2292  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 23.25it/s, step size=2.73e-01, acc. prob=0.933]


[outer 035] TRAIN (EMA+K-ens) ll=0.6485  br=0.2275  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 24.05it/s, step size=2.77e-01, acc. prob=0.930]


[outer 036] TRAIN (EMA+K-ens) ll=0.6507  br=0.2286  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.83it/s, step size=2.69e-01, acc. prob=0.959]


[outer 037] TRAIN (EMA+K-ens) ll=0.6508  br=0.2287  acc=0.6860


Sample: 100%|██████████| 330/330 [00:13, 24.77it/s, step size=2.82e-01, acc. prob=0.913]


[outer 038] TRAIN (EMA+K-ens) ll=0.6579  br=0.2319  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 23.31it/s, step size=2.67e-01, acc. prob=0.950]


[outer 039] TRAIN (EMA+K-ens) ll=0.6488  br=0.2275  acc=0.6890


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=2.63e-01, acc. prob=0.945]


[outer 000] TRAIN (EMA+K-ens) ll=0.7062  br=0.2549  acc=0.6170


Sample: 100%|██████████| 330/330 [00:15, 21.66it/s, step size=2.77e-01, acc. prob=0.936]


[outer 001] TRAIN (EMA+K-ens) ll=0.7113  br=0.2572  acc=0.6440


Sample: 100%|██████████| 330/330 [00:14, 22.44it/s, step size=2.96e-01, acc. prob=0.946]


[outer 002] TRAIN (EMA+K-ens) ll=0.6788  br=0.2423  acc=0.6860


Sample: 100%|██████████| 330/330 [00:13, 24.48it/s, step size=2.42e-01, acc. prob=0.972]


[outer 003] TRAIN (EMA+K-ens) ll=0.6873  br=0.2460  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.46it/s, step size=3.52e-01, acc. prob=0.938]


[outer 004] TRAIN (EMA+K-ens) ll=0.6897  br=0.2474  acc=0.6690


Sample: 100%|██████████| 330/330 [00:14, 22.70it/s, step size=2.99e-01, acc. prob=0.970]


[outer 005] TRAIN (EMA+K-ens) ll=0.6836  br=0.2444  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.62it/s, step size=3.17e-01, acc. prob=0.924]


[outer 006] TRAIN (EMA+K-ens) ll=0.6861  br=0.2455  acc=0.6860


Sample: 100%|██████████| 330/330 [00:13, 24.29it/s, step size=3.50e-01, acc. prob=0.910]


[outer 007] TRAIN (EMA+K-ens) ll=0.6879  br=0.2464  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 21.50it/s, step size=2.15e-01, acc. prob=0.969]


[outer 008] TRAIN (EMA+K-ens) ll=0.6836  br=0.2444  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.26it/s, step size=2.38e-01, acc. prob=0.944]


[outer 009] TRAIN (EMA+K-ens) ll=0.6767  br=0.2412  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 23.19it/s, step size=2.35e-01, acc. prob=0.967]


[outer 010] TRAIN (EMA+K-ens) ll=0.6791  br=0.2424  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.04it/s, step size=3.00e-01, acc. prob=0.943]


[outer 011] TRAIN (EMA+K-ens) ll=0.6799  br=0.2430  acc=0.6830


Sample: 100%|██████████| 330/330 [00:12, 25.41it/s, step size=2.50e-01, acc. prob=0.961]


[outer 012] TRAIN (EMA+K-ens) ll=0.6689  br=0.2376  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.73it/s, step size=3.09e-01, acc. prob=0.960]


[outer 013] TRAIN (EMA+K-ens) ll=0.6690  br=0.2377  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.18it/s, step size=3.07e-01, acc. prob=0.929]


[outer 014] TRAIN (EMA+K-ens) ll=0.6699  br=0.2381  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 21.22it/s, step size=2.13e-01, acc. prob=0.976]


[outer 015] TRAIN (EMA+K-ens) ll=0.6750  br=0.2405  acc=0.6810


Sample: 100%|██████████| 330/330 [00:14, 22.73it/s, step size=3.16e-01, acc. prob=0.957]


[outer 016] TRAIN (EMA+K-ens) ll=0.6806  br=0.2431  acc=0.6810


Sample: 100%|██████████| 330/330 [00:13, 23.84it/s, step size=2.62e-01, acc. prob=0.953]


[outer 017] TRAIN (EMA+K-ens) ll=0.6831  br=0.2441  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 23.34it/s, step size=3.12e-01, acc. prob=0.923]


[outer 018] TRAIN (EMA+K-ens) ll=0.6901  br=0.2473  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.37it/s, step size=2.86e-01, acc. prob=0.953]


[outer 019] TRAIN (EMA+K-ens) ll=0.6825  br=0.2437  acc=0.6810


Sample: 100%|██████████| 330/330 [00:14, 23.14it/s, step size=2.48e-01, acc. prob=0.950]


[outer 020] TRAIN (EMA+K-ens) ll=0.6872  br=0.2460  acc=0.6680


Sample: 100%|██████████| 330/330 [00:13, 24.18it/s, step size=2.58e-01, acc. prob=0.949]


[outer 021] TRAIN (EMA+K-ens) ll=0.6936  br=0.2491  acc=0.6670


Sample: 100%|██████████| 330/330 [00:13, 25.29it/s, step size=2.76e-01, acc. prob=0.963]


[outer 022] TRAIN (EMA+K-ens) ll=0.6889  br=0.2470  acc=0.6830


Sample: 100%|██████████| 330/330 [00:12, 25.80it/s, step size=2.96e-01, acc. prob=0.953]


[outer 023] TRAIN (EMA+K-ens) ll=0.6847  br=0.2448  acc=0.6840


Sample: 100%|██████████| 330/330 [00:14, 22.82it/s, step size=2.81e-01, acc. prob=0.921]


[outer 024] TRAIN (EMA+K-ens) ll=0.6849  br=0.2448  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.03it/s, step size=2.96e-01, acc. prob=0.947]


[outer 025] TRAIN (EMA+K-ens) ll=0.6855  br=0.2451  acc=0.6690


Sample: 100%|██████████| 330/330 [00:13, 24.11it/s, step size=2.79e-01, acc. prob=0.958]


[outer 026] TRAIN (EMA+K-ens) ll=0.6830  br=0.2439  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.51it/s, step size=3.14e-01, acc. prob=0.954]


[outer 027] TRAIN (EMA+K-ens) ll=0.6816  br=0.2433  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.05it/s, step size=2.65e-01, acc. prob=0.959]


[outer 028] TRAIN (EMA+K-ens) ll=0.6866  br=0.2455  acc=0.6890


Sample: 100%|██████████| 330/330 [00:12, 26.60it/s, step size=2.93e-01, acc. prob=0.957]


[outer 029] TRAIN (EMA+K-ens) ll=0.6827  br=0.2437  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.10it/s, step size=3.14e-01, acc. prob=0.952]


[outer 030] TRAIN (EMA+K-ens) ll=0.6742  br=0.2399  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.07it/s, step size=3.02e-01, acc. prob=0.959]


[outer 031] TRAIN (EMA+K-ens) ll=0.6623  br=0.2344  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 23.17it/s, step size=2.67e-01, acc. prob=0.934]


[outer 032] TRAIN (EMA+K-ens) ll=0.6562  br=0.2316  acc=0.6820


Sample: 100%|██████████| 330/330 [00:13, 25.06it/s, step size=2.62e-01, acc. prob=0.969]


[outer 033] TRAIN (EMA+K-ens) ll=0.6417  br=0.2247  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.28it/s, step size=2.47e-01, acc. prob=0.934]


[outer 034] TRAIN (EMA+K-ens) ll=0.6374  br=0.2226  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 23.71it/s, step size=2.53e-01, acc. prob=0.968]


[outer 035] TRAIN (EMA+K-ens) ll=0.6558  br=0.2313  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.41it/s, step size=2.33e-01, acc. prob=0.948]


[outer 036] TRAIN (EMA+K-ens) ll=0.6523  br=0.2297  acc=0.6810


Sample: 100%|██████████| 330/330 [00:13, 24.78it/s, step size=2.87e-01, acc. prob=0.964]


[outer 037] TRAIN (EMA+K-ens) ll=0.6499  br=0.2285  acc=0.6850


Sample: 100%|██████████| 330/330 [00:14, 23.28it/s, step size=3.16e-01, acc. prob=0.948]


[outer 038] TRAIN (EMA+K-ens) ll=0.6578  br=0.2323  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.97it/s, step size=2.95e-01, acc. prob=0.954]


[outer 039] TRAIN (EMA+K-ens) ll=0.6722  br=0.2391  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.27it/s, step size=2.39e-01, acc. prob=0.965]


[outer 000] TRAIN (EMA+K-ens) ll=0.6822  br=0.2442  acc=0.5910


Sample: 100%|██████████| 330/330 [00:13, 23.98it/s, step size=2.43e-01, acc. prob=0.973]


[outer 001] TRAIN (EMA+K-ens) ll=0.6769  br=0.2414  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 23.67it/s, step size=2.38e-01, acc. prob=0.964]


[outer 002] TRAIN (EMA+K-ens) ll=0.6706  br=0.2382  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.09it/s, step size=2.26e-01, acc. prob=0.979]


[outer 003] TRAIN (EMA+K-ens) ll=0.6640  br=0.2351  acc=0.7030


Sample: 100%|██████████| 330/330 [00:12, 25.71it/s, step size=2.46e-01, acc. prob=0.958]


[outer 004] TRAIN (EMA+K-ens) ll=0.6720  br=0.2386  acc=0.7090


Sample: 100%|██████████| 330/330 [00:13, 24.24it/s, step size=2.81e-01, acc. prob=0.943]


[outer 005] TRAIN (EMA+K-ens) ll=0.6736  br=0.2394  acc=0.7070


Sample: 100%|██████████| 330/330 [00:14, 22.05it/s, step size=2.87e-01, acc. prob=0.942]


[outer 006] TRAIN (EMA+K-ens) ll=0.6667  br=0.2361  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.60it/s, step size=2.62e-01, acc. prob=0.950]


[outer 007] TRAIN (EMA+K-ens) ll=0.6690  br=0.2371  acc=0.7070


Sample: 100%|██████████| 330/330 [00:13, 25.33it/s, step size=2.89e-01, acc. prob=0.905]


[outer 008] TRAIN (EMA+K-ens) ll=0.6669  br=0.2360  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.56it/s, step size=3.19e-01, acc. prob=0.950]


[outer 009] TRAIN (EMA+K-ens) ll=0.6659  br=0.2354  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.26it/s, step size=2.26e-01, acc. prob=0.967]


[outer 010] TRAIN (EMA+K-ens) ll=0.6657  br=0.2353  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 24.49it/s, step size=3.26e-01, acc. prob=0.948]


[outer 011] TRAIN (EMA+K-ens) ll=0.6655  br=0.2353  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 23.17it/s, step size=2.54e-01, acc. prob=0.964]


[outer 012] TRAIN (EMA+K-ens) ll=0.6522  br=0.2291  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.80it/s, step size=2.30e-01, acc. prob=0.950]


[outer 013] TRAIN (EMA+K-ens) ll=0.6487  br=0.2276  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 23.71it/s, step size=2.56e-01, acc. prob=0.953]


[outer 014] TRAIN (EMA+K-ens) ll=0.6415  br=0.2241  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 23.76it/s, step size=3.18e-01, acc. prob=0.951]


[outer 015] TRAIN (EMA+K-ens) ll=0.6464  br=0.2266  acc=0.6980


Sample: 100%|██████████| 330/330 [00:15, 21.87it/s, step size=2.55e-01, acc. prob=0.959]


[outer 016] TRAIN (EMA+K-ens) ll=0.6422  br=0.2246  acc=0.7070


Sample: 100%|██████████| 330/330 [00:14, 23.45it/s, step size=2.55e-01, acc. prob=0.962]


[outer 017] TRAIN (EMA+K-ens) ll=0.6353  br=0.2214  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 23.97it/s, step size=2.54e-01, acc. prob=0.944]


[outer 018] TRAIN (EMA+K-ens) ll=0.6433  br=0.2249  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 23.85it/s, step size=3.05e-01, acc. prob=0.954]


[outer 019] TRAIN (EMA+K-ens) ll=0.6428  br=0.2246  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 24.00it/s, step size=2.91e-01, acc. prob=0.960]


[outer 020] TRAIN (EMA+K-ens) ll=0.6522  br=0.2292  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 23.80it/s, step size=3.24e-01, acc. prob=0.940]


[outer 021] TRAIN (EMA+K-ens) ll=0.6617  br=0.2335  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.97it/s, step size=2.94e-01, acc. prob=0.930]


[outer 022] TRAIN (EMA+K-ens) ll=0.6875  br=0.2457  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 23.89it/s, step size=2.63e-01, acc. prob=0.957]


[outer 023] TRAIN (EMA+K-ens) ll=0.6793  br=0.2416  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.91it/s, step size=3.14e-01, acc. prob=0.943]


[outer 024] TRAIN (EMA+K-ens) ll=0.6848  br=0.2443  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.03it/s, step size=2.69e-01, acc. prob=0.942]


[outer 025] TRAIN (EMA+K-ens) ll=0.6879  br=0.2457  acc=0.7090


Sample: 100%|██████████| 330/330 [00:13, 25.07it/s, step size=2.49e-01, acc. prob=0.956]


[outer 026] TRAIN (EMA+K-ens) ll=0.6755  br=0.2400  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 25.19it/s, step size=3.02e-01, acc. prob=0.938]


[outer 027] TRAIN (EMA+K-ens) ll=0.6739  br=0.2392  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 24.51it/s, step size=2.40e-01, acc. prob=0.960]


[outer 028] TRAIN (EMA+K-ens) ll=0.6699  br=0.2372  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.33it/s, step size=3.19e-01, acc. prob=0.946]


[outer 029] TRAIN (EMA+K-ens) ll=0.6603  br=0.2327  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 23.65it/s, step size=2.88e-01, acc. prob=0.950]


[outer 030] TRAIN (EMA+K-ens) ll=0.6449  br=0.2254  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.45it/s, step size=2.92e-01, acc. prob=0.938]


[outer 031] TRAIN (EMA+K-ens) ll=0.6422  br=0.2242  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.64it/s, step size=2.70e-01, acc. prob=0.949]


[outer 032] TRAIN (EMA+K-ens) ll=0.6378  br=0.2221  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 24.20it/s, step size=3.95e-01, acc. prob=0.936]


[outer 033] TRAIN (EMA+K-ens) ll=0.6446  br=0.2253  acc=0.7100


Sample: 100%|██████████| 330/330 [00:15, 20.99it/s, step size=2.09e-01, acc. prob=0.979]


[outer 034] TRAIN (EMA+K-ens) ll=0.6526  br=0.2290  acc=0.7100


Sample: 100%|██████████| 330/330 [00:12, 25.54it/s, step size=2.74e-01, acc. prob=0.938]


[outer 035] TRAIN (EMA+K-ens) ll=0.6604  br=0.2328  acc=0.7100


Sample: 100%|██████████| 330/330 [00:15, 21.28it/s, step size=2.67e-01, acc. prob=0.938]


[outer 036] TRAIN (EMA+K-ens) ll=0.6625  br=0.2338  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.49it/s, step size=2.45e-01, acc. prob=0.958]


[outer 037] TRAIN (EMA+K-ens) ll=0.6569  br=0.2314  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.88it/s, step size=2.94e-01, acc. prob=0.938]


[outer 038] TRAIN (EMA+K-ens) ll=0.6538  br=0.2299  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 25.13it/s, step size=3.83e-01, acc. prob=0.954]


[outer 039] TRAIN (EMA+K-ens) ll=0.6548  br=0.2303  acc=0.7100


Sample: 100%|██████████| 330/330 [00:13, 23.96it/s, step size=2.73e-01, acc. prob=0.943]


[outer 000] TRAIN (EMA+K-ens) ll=0.7048  br=0.2546  acc=0.5820


Sample: 100%|██████████| 330/330 [00:13, 23.71it/s, step size=2.28e-01, acc. prob=0.974]


[outer 001] TRAIN (EMA+K-ens) ll=0.6525  br=0.2298  acc=0.6790


Sample: 100%|██████████| 330/330 [00:12, 27.24it/s, step size=2.81e-01, acc. prob=0.964]


[outer 002] TRAIN (EMA+K-ens) ll=0.6589  br=0.2328  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.51it/s, step size=2.41e-01, acc. prob=0.961]


[outer 003] TRAIN (EMA+K-ens) ll=0.6588  br=0.2324  acc=0.7010


Sample: 100%|██████████| 330/330 [00:13, 24.75it/s, step size=2.52e-01, acc. prob=0.968]


[outer 004] TRAIN (EMA+K-ens) ll=0.6641  br=0.2348  acc=0.7140


Sample: 100%|██████████| 330/330 [00:13, 24.32it/s, step size=2.18e-01, acc. prob=0.971]


[outer 005] TRAIN (EMA+K-ens) ll=0.6592  br=0.2325  acc=0.7220


Sample: 100%|██████████| 330/330 [00:12, 25.84it/s, step size=3.12e-01, acc. prob=0.934]


[outer 006] TRAIN (EMA+K-ens) ll=0.6592  br=0.2325  acc=0.7270


Sample: 100%|██████████| 330/330 [00:14, 23.04it/s, step size=2.03e-01, acc. prob=0.957]


[outer 007] TRAIN (EMA+K-ens) ll=0.6511  br=0.2287  acc=0.7270


Sample: 100%|██████████| 330/330 [00:14, 22.94it/s, step size=2.31e-01, acc. prob=0.948]


[outer 008] TRAIN (EMA+K-ens) ll=0.6430  br=0.2249  acc=0.7250


Sample: 100%|██████████| 330/330 [00:14, 23.21it/s, step size=2.76e-01, acc. prob=0.955]


[outer 009] TRAIN (EMA+K-ens) ll=0.6491  br=0.2276  acc=0.7270


Sample: 100%|██████████| 330/330 [00:14, 23.36it/s, step size=3.16e-01, acc. prob=0.958]


[outer 010] TRAIN (EMA+K-ens) ll=0.6526  br=0.2291  acc=0.7280


Sample: 100%|██████████| 330/330 [00:12, 25.51it/s, step size=3.60e-01, acc. prob=0.922]


[outer 011] TRAIN (EMA+K-ens) ll=0.6475  br=0.2266  acc=0.7210


Sample: 100%|██████████| 330/330 [00:15, 21.28it/s, step size=3.27e-01, acc. prob=0.923]


[outer 012] TRAIN (EMA+K-ens) ll=0.6382  br=0.2222  acc=0.7290


Sample: 100%|██████████| 330/330 [00:13, 23.84it/s, step size=2.51e-01, acc. prob=0.955]


[outer 013] TRAIN (EMA+K-ens) ll=0.6399  br=0.2229  acc=0.7260


Sample: 100%|██████████| 330/330 [00:13, 23.75it/s, step size=2.62e-01, acc. prob=0.965]


[outer 014] TRAIN (EMA+K-ens) ll=0.6441  br=0.2249  acc=0.7090


Sample: 100%|██████████| 330/330 [00:13, 24.31it/s, step size=3.12e-01, acc. prob=0.928]


[outer 015] TRAIN (EMA+K-ens) ll=0.6499  br=0.2274  acc=0.7250


Sample: 100%|██████████| 330/330 [00:13, 23.71it/s, step size=2.96e-01, acc. prob=0.952]


[outer 016] TRAIN (EMA+K-ens) ll=0.6539  br=0.2292  acc=0.7230


Sample: 100%|██████████| 330/330 [00:13, 23.59it/s, step size=3.55e-01, acc. prob=0.930]


[outer 017] TRAIN (EMA+K-ens) ll=0.6544  br=0.2295  acc=0.6770


Sample: 100%|██████████| 330/330 [00:15, 21.56it/s, step size=2.06e-01, acc. prob=0.973]


[outer 018] TRAIN (EMA+K-ens) ll=0.6449  br=0.2252  acc=0.7010


Sample: 100%|██████████| 330/330 [00:15, 21.59it/s, step size=1.98e-01, acc. prob=0.945]


[outer 019] TRAIN (EMA+K-ens) ll=0.6486  br=0.2267  acc=0.7150


Sample: 100%|██████████| 330/330 [00:12, 25.57it/s, step size=2.62e-01, acc. prob=0.943]


[outer 020] TRAIN (EMA+K-ens) ll=0.6516  br=0.2281  acc=0.7180


Sample: 100%|██████████| 330/330 [00:13, 24.75it/s, step size=2.58e-01, acc. prob=0.945]


[outer 021] TRAIN (EMA+K-ens) ll=0.6497  br=0.2272  acc=0.7250


Sample: 100%|██████████| 330/330 [00:14, 23.46it/s, step size=2.61e-01, acc. prob=0.942]


[outer 022] TRAIN (EMA+K-ens) ll=0.6527  br=0.2282  acc=0.7290


Sample: 100%|██████████| 330/330 [00:14, 22.82it/s, step size=3.68e-01, acc. prob=0.906]


[outer 023] TRAIN (EMA+K-ens) ll=0.6463  br=0.2256  acc=0.7290


Sample: 100%|██████████| 330/330 [00:13, 24.35it/s, step size=2.85e-01, acc. prob=0.941]


[outer 024] TRAIN (EMA+K-ens) ll=0.6421  br=0.2236  acc=0.7290


Sample: 100%|██████████| 330/330 [00:12, 25.57it/s, step size=2.97e-01, acc. prob=0.962]


[outer 025] TRAIN (EMA+K-ens) ll=0.6373  br=0.2214  acc=0.7310


Sample: 100%|██████████| 330/330 [00:15, 21.30it/s, step size=3.20e-01, acc. prob=0.926]


[outer 026] TRAIN (EMA+K-ens) ll=0.6392  br=0.2223  acc=0.7270


Sample: 100%|██████████| 330/330 [00:12, 25.89it/s, step size=3.30e-01, acc. prob=0.929]


[outer 027] TRAIN (EMA+K-ens) ll=0.6391  br=0.2224  acc=0.7270


Sample: 100%|██████████| 330/330 [00:13, 23.70it/s, step size=2.92e-01, acc. prob=0.927]


[outer 028] TRAIN (EMA+K-ens) ll=0.6404  br=0.2230  acc=0.7280


Sample: 100%|██████████| 330/330 [00:13, 25.18it/s, step size=3.22e-01, acc. prob=0.939]


[outer 029] TRAIN (EMA+K-ens) ll=0.6434  br=0.2248  acc=0.7210


Sample: 100%|██████████| 330/330 [00:13, 24.38it/s, step size=3.03e-01, acc. prob=0.944]


[outer 030] TRAIN (EMA+K-ens) ll=0.6345  br=0.2207  acc=0.7260


Sample: 100%|██████████| 330/330 [00:13, 24.84it/s, step size=4.08e-01, acc. prob=0.911]


[outer 031] TRAIN (EMA+K-ens) ll=0.6350  br=0.2207  acc=0.7250


Sample: 100%|██████████| 330/330 [00:13, 23.97it/s, step size=3.63e-01, acc. prob=0.934]


[outer 032] TRAIN (EMA+K-ens) ll=0.6366  br=0.2212  acc=0.7260


Sample: 100%|██████████| 330/330 [00:14, 23.42it/s, step size=3.28e-01, acc. prob=0.928]


[outer 033] TRAIN (EMA+K-ens) ll=0.6322  br=0.2191  acc=0.7250


Sample: 100%|██████████| 330/330 [00:14, 22.19it/s, step size=2.72e-01, acc. prob=0.934]


[outer 034] TRAIN (EMA+K-ens) ll=0.6318  br=0.2192  acc=0.7220


Sample: 100%|██████████| 330/330 [00:12, 26.66it/s, step size=3.31e-01, acc. prob=0.948]


[outer 035] TRAIN (EMA+K-ens) ll=0.6359  br=0.2208  acc=0.7240


Sample: 100%|██████████| 330/330 [00:12, 25.72it/s, step size=2.87e-01, acc. prob=0.921]


[outer 036] TRAIN (EMA+K-ens) ll=0.6323  br=0.2193  acc=0.7250


Sample: 100%|██████████| 330/330 [00:13, 24.28it/s, step size=2.31e-01, acc. prob=0.967]


[outer 037] TRAIN (EMA+K-ens) ll=0.6358  br=0.2206  acc=0.7270


Sample: 100%|██████████| 330/330 [00:12, 25.63it/s, step size=2.90e-01, acc. prob=0.963]


[outer 038] TRAIN (EMA+K-ens) ll=0.6415  br=0.2232  acc=0.7270


Sample: 100%|██████████| 330/330 [00:12, 26.45it/s, step size=3.36e-01, acc. prob=0.959]


[outer 039] TRAIN (EMA+K-ens) ll=0.6492  br=0.2267  acc=0.7270


Sample: 100%|██████████| 330/330 [00:12, 26.02it/s, step size=2.54e-01, acc. prob=0.960]


[outer 000] TRAIN (EMA+K-ens) ll=0.6972  br=0.2514  acc=0.5480


Sample: 100%|██████████| 330/330 [00:14, 22.60it/s, step size=2.72e-01, acc. prob=0.954]


[outer 001] TRAIN (EMA+K-ens) ll=0.6713  br=0.2391  acc=0.6560


Sample: 100%|██████████| 330/330 [00:13, 25.01it/s, step size=2.83e-01, acc. prob=0.967]


[outer 002] TRAIN (EMA+K-ens) ll=0.6725  br=0.2397  acc=0.6460


Sample: 100%|██████████| 330/330 [00:15, 21.21it/s, step size=2.43e-01, acc. prob=0.967]


[outer 003] TRAIN (EMA+K-ens) ll=0.6713  br=0.2391  acc=0.6350


Sample: 100%|██████████| 330/330 [00:15, 21.40it/s, step size=2.41e-01, acc. prob=0.946]


[outer 004] TRAIN (EMA+K-ens) ll=0.6685  br=0.2376  acc=0.6540


Sample: 100%|██████████| 330/330 [00:13, 23.94it/s, step size=3.32e-01, acc. prob=0.930]


[outer 005] TRAIN (EMA+K-ens) ll=0.6775  br=0.2420  acc=0.6550


Sample: 100%|██████████| 330/330 [00:13, 24.13it/s, step size=2.51e-01, acc. prob=0.945]


[outer 006] TRAIN (EMA+K-ens) ll=0.6811  br=0.2437  acc=0.6480


Sample: 100%|██████████| 330/330 [00:13, 23.77it/s, step size=2.72e-01, acc. prob=0.973]


[outer 007] TRAIN (EMA+K-ens) ll=0.6779  br=0.2421  acc=0.6580


Sample: 100%|██████████| 330/330 [00:13, 25.00it/s, step size=2.51e-01, acc. prob=0.947]


[outer 008] TRAIN (EMA+K-ens) ll=0.6733  br=0.2399  acc=0.6580


Sample: 100%|██████████| 330/330 [00:14, 22.65it/s, step size=2.91e-01, acc. prob=0.935]


[outer 009] TRAIN (EMA+K-ens) ll=0.6710  br=0.2388  acc=0.6580


Sample: 100%|██████████| 330/330 [00:14, 22.29it/s, step size=2.68e-01, acc. prob=0.945]


[outer 010] TRAIN (EMA+K-ens) ll=0.6640  br=0.2354  acc=0.6580


Sample: 100%|██████████| 330/330 [00:15, 21.11it/s, step size=2.52e-01, acc. prob=0.972]


[outer 011] TRAIN (EMA+K-ens) ll=0.6598  br=0.2334  acc=0.6580


Sample: 100%|██████████| 330/330 [00:14, 22.34it/s, step size=2.88e-01, acc. prob=0.943]


[outer 012] TRAIN (EMA+K-ens) ll=0.6617  br=0.2342  acc=0.6580


Sample: 100%|██████████| 330/330 [00:13, 24.78it/s, step size=2.43e-01, acc. prob=0.967]


[outer 013] TRAIN (EMA+K-ens) ll=0.6576  br=0.2322  acc=0.6610


Sample: 100%|██████████| 330/330 [00:13, 24.08it/s, step size=2.44e-01, acc. prob=0.936]


[outer 014] TRAIN (EMA+K-ens) ll=0.6508  br=0.2289  acc=0.6600


Sample: 100%|██████████| 330/330 [00:12, 26.39it/s, step size=2.96e-01, acc. prob=0.948]


[outer 015] TRAIN (EMA+K-ens) ll=0.6612  br=0.2339  acc=0.6550


Sample: 100%|██████████| 330/330 [00:13, 24.15it/s, step size=3.20e-01, acc. prob=0.947]


[outer 016] TRAIN (EMA+K-ens) ll=0.6626  br=0.2346  acc=0.6560


Sample: 100%|██████████| 330/330 [00:13, 24.54it/s, step size=2.92e-01, acc. prob=0.931]


[outer 017] TRAIN (EMA+K-ens) ll=0.6702  br=0.2383  acc=0.6570


Sample: 100%|██████████| 330/330 [00:13, 23.95it/s, step size=3.17e-01, acc. prob=0.915]


[outer 018] TRAIN (EMA+K-ens) ll=0.6872  br=0.2465  acc=0.6420


Sample: 100%|██████████| 330/330 [00:14, 22.67it/s, step size=1.99e-01, acc. prob=0.967]


[outer 019] TRAIN (EMA+K-ens) ll=0.6925  br=0.2489  acc=0.6240


Sample: 100%|██████████| 330/330 [00:14, 22.92it/s, step size=2.79e-01, acc. prob=0.965]


[outer 020] TRAIN (EMA+K-ens) ll=0.6960  br=0.2506  acc=0.6040


Sample: 100%|██████████| 330/330 [00:14, 22.33it/s, step size=2.21e-01, acc. prob=0.967]


[outer 021] TRAIN (EMA+K-ens) ll=0.6841  br=0.2451  acc=0.6450


Sample: 100%|██████████| 330/330 [00:13, 24.66it/s, step size=3.51e-01, acc. prob=0.927]


[outer 022] TRAIN (EMA+K-ens) ll=0.6983  br=0.2519  acc=0.6280


Sample: 100%|██████████| 330/330 [00:12, 26.73it/s, step size=2.92e-01, acc. prob=0.921]


[outer 023] TRAIN (EMA+K-ens) ll=0.6885  br=0.2473  acc=0.6520


Sample: 100%|██████████| 330/330 [00:14, 22.52it/s, step size=1.99e-01, acc. prob=0.969]


[outer 024] TRAIN (EMA+K-ens) ll=0.6889  br=0.2473  acc=0.6570


Sample: 100%|██████████| 330/330 [00:15, 21.90it/s, step size=3.35e-01, acc. prob=0.906]


[outer 025] TRAIN (EMA+K-ens) ll=0.6915  br=0.2485  acc=0.6580


Sample: 100%|██████████| 330/330 [00:13, 23.74it/s, step size=2.41e-01, acc. prob=0.945]


[outer 026] TRAIN (EMA+K-ens) ll=0.6821  br=0.2439  acc=0.6620


Sample: 100%|██████████| 330/330 [00:13, 24.61it/s, step size=3.00e-01, acc. prob=0.957]


[outer 027] TRAIN (EMA+K-ens) ll=0.6847  br=0.2453  acc=0.6660


Sample: 100%|██████████| 330/330 [00:14, 22.41it/s, step size=2.92e-01, acc. prob=0.956]


[outer 028] TRAIN (EMA+K-ens) ll=0.6783  br=0.2422  acc=0.6560


Sample: 100%|██████████| 330/330 [00:13, 23.65it/s, step size=2.62e-01, acc. prob=0.959]


[outer 029] TRAIN (EMA+K-ens) ll=0.6818  br=0.2438  acc=0.6580


Sample: 100%|██████████| 330/330 [00:14, 22.51it/s, step size=2.42e-01, acc. prob=0.971]


[outer 030] TRAIN (EMA+K-ens) ll=0.6752  br=0.2406  acc=0.6670


Sample: 100%|██████████| 330/330 [00:15, 21.85it/s, step size=2.91e-01, acc. prob=0.942]


[outer 031] TRAIN (EMA+K-ens) ll=0.6752  br=0.2405  acc=0.6660


Sample: 100%|██████████| 330/330 [00:15, 21.76it/s, step size=2.81e-01, acc. prob=0.930]


[outer 032] TRAIN (EMA+K-ens) ll=0.6791  br=0.2425  acc=0.6600


Sample: 100%|██████████| 330/330 [00:13, 24.87it/s, step size=3.28e-01, acc. prob=0.947]


[outer 033] TRAIN (EMA+K-ens) ll=0.6799  br=0.2429  acc=0.6610


Sample: 100%|██████████| 330/330 [00:13, 23.75it/s, step size=2.72e-01, acc. prob=0.951]


[outer 034] TRAIN (EMA+K-ens) ll=0.6907  br=0.2481  acc=0.6600


Sample: 100%|██████████| 330/330 [00:13, 23.72it/s, step size=2.83e-01, acc. prob=0.953]


[outer 035] TRAIN (EMA+K-ens) ll=0.6966  br=0.2510  acc=0.6470


Sample: 100%|██████████| 330/330 [00:12, 25.95it/s, step size=3.41e-01, acc. prob=0.916]


[outer 036] TRAIN (EMA+K-ens) ll=0.7058  br=0.2554  acc=0.6520


Sample: 100%|██████████| 330/330 [00:16, 20.55it/s, step size=2.25e-01, acc. prob=0.964]


[outer 037] TRAIN (EMA+K-ens) ll=0.7111  br=0.2579  acc=0.6360


Sample: 100%|██████████| 330/330 [00:14, 22.31it/s, step size=2.72e-01, acc. prob=0.944]


[outer 038] TRAIN (EMA+K-ens) ll=0.7040  br=0.2546  acc=0.6510


Sample: 100%|██████████| 330/330 [00:13, 24.42it/s, step size=3.53e-01, acc. prob=0.934]


[outer 039] TRAIN (EMA+K-ens) ll=0.7014  br=0.2532  acc=0.6570


Sample: 100%|██████████| 330/330 [00:14, 23.43it/s, step size=2.90e-01, acc. prob=0.953]


[outer 000] TRAIN (EMA+K-ens) ll=0.7852  br=0.2932  acc=0.3470


Sample: 100%|██████████| 330/330 [00:14, 22.22it/s, step size=2.70e-01, acc. prob=0.951]


[outer 001] TRAIN (EMA+K-ens) ll=0.6955  br=0.2502  acc=0.6190


Sample: 100%|██████████| 330/330 [00:14, 22.71it/s, step size=2.59e-01, acc. prob=0.969]


[outer 002] TRAIN (EMA+K-ens) ll=0.6972  br=0.2508  acc=0.6850


Sample: 100%|██████████| 330/330 [00:15, 21.01it/s, step size=2.15e-01, acc. prob=0.975]


[outer 003] TRAIN (EMA+K-ens) ll=0.6581  br=0.2323  acc=0.6840


Sample: 100%|██████████| 330/330 [00:13, 24.02it/s, step size=2.74e-01, acc. prob=0.933]


[outer 004] TRAIN (EMA+K-ens) ll=0.6644  br=0.2353  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 23.79it/s, step size=2.83e-01, acc. prob=0.945]


[outer 005] TRAIN (EMA+K-ens) ll=0.6853  br=0.2453  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.82it/s, step size=3.35e-01, acc. prob=0.938]


[outer 006] TRAIN (EMA+K-ens) ll=0.6835  br=0.2443  acc=0.6930


Sample: 100%|██████████| 330/330 [00:15, 21.80it/s, step size=2.65e-01, acc. prob=0.944]


[outer 007] TRAIN (EMA+K-ens) ll=0.6860  br=0.2455  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 20.94it/s, step size=2.34e-01, acc. prob=0.965]


[outer 008] TRAIN (EMA+K-ens) ll=0.6737  br=0.2396  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 23.90it/s, step size=3.03e-01, acc. prob=0.928]


[outer 009] TRAIN (EMA+K-ens) ll=0.6785  br=0.2419  acc=0.6930


Sample: 100%|██████████| 330/330 [00:14, 22.55it/s, step size=2.69e-01, acc. prob=0.932]


[outer 010] TRAIN (EMA+K-ens) ll=0.6675  br=0.2367  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 24.05it/s, step size=2.58e-01, acc. prob=0.959]


[outer 011] TRAIN (EMA+K-ens) ll=0.6824  br=0.2438  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 23.85it/s, step size=2.71e-01, acc. prob=0.945]


[outer 012] TRAIN (EMA+K-ens) ll=0.6822  br=0.2436  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 22.11it/s, step size=2.27e-01, acc. prob=0.962]


[outer 013] TRAIN (EMA+K-ens) ll=0.6656  br=0.2357  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 23.87it/s, step size=2.31e-01, acc. prob=0.956]


[outer 014] TRAIN (EMA+K-ens) ll=0.6681  br=0.2368  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 23.36it/s, step size=2.94e-01, acc. prob=0.963]


[outer 015] TRAIN (EMA+K-ens) ll=0.6734  br=0.2393  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 23.40it/s, step size=2.78e-01, acc. prob=0.912]


[outer 016] TRAIN (EMA+K-ens) ll=0.6783  br=0.2417  acc=0.6910
[Early stop @ outer 16] Δll=0.256%, Δbr=0.392%, Δacc=0.004


Sample: 100%|██████████| 330/330 [00:14, 23.27it/s, step size=2.62e-01, acc. prob=0.964]


[outer 000] TRAIN (EMA+K-ens) ll=0.6659  br=0.2364  acc=0.5910


Sample: 100%|██████████| 330/330 [00:14, 23.30it/s, step size=2.78e-01, acc. prob=0.944]


[outer 001] TRAIN (EMA+K-ens) ll=0.6931  br=0.2492  acc=0.5950


Sample: 100%|██████████| 330/330 [00:15, 21.97it/s, step size=2.76e-01, acc. prob=0.943]


[outer 002] TRAIN (EMA+K-ens) ll=0.6690  br=0.2375  acc=0.6630


Sample: 100%|██████████| 330/330 [00:13, 24.83it/s, step size=2.20e-01, acc. prob=0.960]


[outer 003] TRAIN (EMA+K-ens) ll=0.6641  br=0.2351  acc=0.7030


Sample: 100%|██████████| 330/330 [00:13, 23.89it/s, step size=2.63e-01, acc. prob=0.968]


[outer 004] TRAIN (EMA+K-ens) ll=0.6618  br=0.2339  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.12it/s, step size=3.42e-01, acc. prob=0.915]


[outer 005] TRAIN (EMA+K-ens) ll=0.6635  br=0.2346  acc=0.7070


Sample: 100%|██████████| 330/330 [00:13, 24.79it/s, step size=2.71e-01, acc. prob=0.955]


[outer 006] TRAIN (EMA+K-ens) ll=0.6641  br=0.2349  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 23.27it/s, step size=2.79e-01, acc. prob=0.975]


[outer 007] TRAIN (EMA+K-ens) ll=0.6601  br=0.2330  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 23.98it/s, step size=3.37e-01, acc. prob=0.907]


[outer 008] TRAIN (EMA+K-ens) ll=0.6532  br=0.2296  acc=0.7080


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=2.02e-01, acc. prob=0.970]


[outer 009] TRAIN (EMA+K-ens) ll=0.6423  br=0.2245  acc=0.7080


Sample: 100%|██████████| 330/330 [00:15, 21.01it/s, step size=2.33e-01, acc. prob=0.961]


[outer 010] TRAIN (EMA+K-ens) ll=0.6509  br=0.2287  acc=0.7070


Sample: 100%|██████████| 330/330 [00:13, 23.77it/s, step size=3.06e-01, acc. prob=0.957]


[outer 011] TRAIN (EMA+K-ens) ll=0.6537  br=0.2300  acc=0.7120


Sample: 100%|██████████| 330/330 [00:12, 26.62it/s, step size=3.04e-01, acc. prob=0.964]


[outer 012] TRAIN (EMA+K-ens) ll=0.6578  br=0.2320  acc=0.7110


Sample: 100%|██████████| 330/330 [00:14, 22.31it/s, step size=2.76e-01, acc. prob=0.959]


[outer 013] TRAIN (EMA+K-ens) ll=0.6570  br=0.2316  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.17it/s, step size=1.89e-01, acc. prob=0.946]


[outer 014] TRAIN (EMA+K-ens) ll=0.6575  br=0.2318  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.43it/s, step size=2.66e-01, acc. prob=0.945]


[outer 015] TRAIN (EMA+K-ens) ll=0.6533  br=0.2298  acc=0.7080


Sample: 100%|██████████| 330/330 [00:16, 19.62it/s, step size=2.27e-01, acc. prob=0.964]


[outer 016] TRAIN (EMA+K-ens) ll=0.6591  br=0.2325  acc=0.7080


Sample: 100%|██████████| 330/330 [00:15, 21.53it/s, step size=3.21e-01, acc. prob=0.943]


[outer 017] TRAIN (EMA+K-ens) ll=0.6602  br=0.2329  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 23.66it/s, step size=2.63e-01, acc. prob=0.953]


[outer 018] TRAIN (EMA+K-ens) ll=0.6551  br=0.2306  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 23.09it/s, step size=3.15e-01, acc. prob=0.942]


[outer 019] TRAIN (EMA+K-ens) ll=0.6493  br=0.2278  acc=0.7080


Sample: 100%|██████████| 330/330 [00:12, 25.57it/s, step size=3.30e-01, acc. prob=0.926]


[outer 020] TRAIN (EMA+K-ens) ll=0.6445  br=0.2257  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.14it/s, step size=2.31e-01, acc. prob=0.960]


[outer 021] TRAIN (EMA+K-ens) ll=0.6420  br=0.2244  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.43it/s, step size=3.28e-01, acc. prob=0.931]


[outer 022] TRAIN (EMA+K-ens) ll=0.6398  br=0.2233  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 24.37it/s, step size=2.64e-01, acc. prob=0.961]


[outer 023] TRAIN (EMA+K-ens) ll=0.6451  br=0.2259  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 24.76it/s, step size=2.89e-01, acc. prob=0.925]


[outer 024] TRAIN (EMA+K-ens) ll=0.6465  br=0.2265  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 24.59it/s, step size=3.03e-01, acc. prob=0.939]


[outer 025] TRAIN (EMA+K-ens) ll=0.6471  br=0.2268  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 23.81it/s, step size=3.21e-01, acc. prob=0.948]


[outer 026] TRAIN (EMA+K-ens) ll=0.6528  br=0.2295  acc=0.7060


Sample: 100%|██████████| 330/330 [00:15, 21.81it/s, step size=3.18e-01, acc. prob=0.947]


[outer 027] TRAIN (EMA+K-ens) ll=0.6533  br=0.2298  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.21it/s, step size=3.28e-01, acc. prob=0.935]


[outer 028] TRAIN (EMA+K-ens) ll=0.6505  br=0.2285  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.80it/s, step size=2.57e-01, acc. prob=0.968]


[outer 029] TRAIN (EMA+K-ens) ll=0.6500  br=0.2284  acc=0.6830


Sample: 100%|██████████| 330/330 [00:15, 21.49it/s, step size=2.36e-01, acc. prob=0.941]


[outer 030] TRAIN (EMA+K-ens) ll=0.6520  br=0.2293  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.85it/s, step size=3.38e-01, acc. prob=0.957]


[outer 031] TRAIN (EMA+K-ens) ll=0.6585  br=0.2322  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 24.75it/s, step size=3.50e-01, acc. prob=0.906]


[outer 032] TRAIN (EMA+K-ens) ll=0.6638  br=0.2347  acc=0.7060


Sample: 100%|██████████| 330/330 [00:15, 20.66it/s, step size=2.71e-01, acc. prob=0.967]


[outer 033] TRAIN (EMA+K-ens) ll=0.6628  br=0.2342  acc=0.7030


Sample: 100%|██████████| 330/330 [00:13, 24.08it/s, step size=2.92e-01, acc. prob=0.939]


[outer 034] TRAIN (EMA+K-ens) ll=0.6517  br=0.2289  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 23.39it/s, step size=2.69e-01, acc. prob=0.930]


[outer 035] TRAIN (EMA+K-ens) ll=0.6531  br=0.2294  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 24.17it/s, step size=3.35e-01, acc. prob=0.935]


[outer 036] TRAIN (EMA+K-ens) ll=0.6544  br=0.2299  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 24.47it/s, step size=2.44e-01, acc. prob=0.969]


[outer 037] TRAIN (EMA+K-ens) ll=0.6524  br=0.2287  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 23.26it/s, step size=2.69e-01, acc. prob=0.947]


[outer 038] TRAIN (EMA+K-ens) ll=0.6500  br=0.2277  acc=0.7080


Sample: 100%|██████████| 330/330 [00:15, 21.11it/s, step size=2.08e-01, acc. prob=0.968]


[outer 039] TRAIN (EMA+K-ens) ll=0.6361  br=0.2214  acc=0.7080


Sample: 100%|██████████| 330/330 [00:14, 22.15it/s, step size=2.80e-01, acc. prob=0.929]


[outer 000] TRAIN (EMA+K-ens) ll=0.7958  br=0.2932  acc=0.4730


Sample: 100%|██████████| 330/330 [00:14, 22.65it/s, step size=2.30e-01, acc. prob=0.943]


[outer 001] TRAIN (EMA+K-ens) ll=0.7200  br=0.2624  acc=0.5550


Sample: 100%|██████████| 330/330 [00:14, 23.39it/s, step size=2.49e-01, acc. prob=0.960]


[outer 002] TRAIN (EMA+K-ens) ll=0.7296  br=0.2665  acc=0.5530


Sample: 100%|██████████| 330/330 [00:15, 20.89it/s, step size=2.45e-01, acc. prob=0.957]


[outer 003] TRAIN (EMA+K-ens) ll=0.7385  br=0.2706  acc=0.5420


Sample: 100%|██████████| 330/330 [00:13, 24.29it/s, step size=2.96e-01, acc. prob=0.962]


[outer 004] TRAIN (EMA+K-ens) ll=0.7256  br=0.2648  acc=0.5950


Sample: 100%|██████████| 330/330 [00:14, 23.49it/s, step size=2.88e-01, acc. prob=0.957]


[outer 005] TRAIN (EMA+K-ens) ll=0.7102  br=0.2574  acc=0.6320


Sample: 100%|██████████| 330/330 [00:15, 21.35it/s, step size=2.94e-01, acc. prob=0.948]


[outer 006] TRAIN (EMA+K-ens) ll=0.7056  br=0.2552  acc=0.6330


Sample: 100%|██████████| 330/330 [00:14, 23.28it/s, step size=2.26e-01, acc. prob=0.970]


[outer 007] TRAIN (EMA+K-ens) ll=0.7007  br=0.2529  acc=0.6680


Sample: 100%|██████████| 330/330 [00:14, 23.35it/s, step size=3.49e-01, acc. prob=0.918]


[outer 008] TRAIN (EMA+K-ens) ll=0.6954  br=0.2503  acc=0.6490


Sample: 100%|██████████| 330/330 [00:13, 24.74it/s, step size=2.43e-01, acc. prob=0.984]


[outer 009] TRAIN (EMA+K-ens) ll=0.6823  br=0.2440  acc=0.6860


Sample: 100%|██████████| 330/330 [00:12, 26.52it/s, step size=3.58e-01, acc. prob=0.913]


[outer 010] TRAIN (EMA+K-ens) ll=0.6694  br=0.2379  acc=0.6830


Sample: 100%|██████████| 330/330 [00:15, 21.02it/s, step size=3.02e-01, acc. prob=0.944]


[outer 011] TRAIN (EMA+K-ens) ll=0.6493  br=0.2282  acc=0.6860


Sample: 100%|██████████| 330/330 [00:13, 25.12it/s, step size=2.72e-01, acc. prob=0.964]


[outer 012] TRAIN (EMA+K-ens) ll=0.6535  br=0.2302  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 23.83it/s, step size=2.80e-01, acc. prob=0.953]


[outer 013] TRAIN (EMA+K-ens) ll=0.6576  br=0.2321  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.14it/s, step size=3.17e-01, acc. prob=0.953]


[outer 014] TRAIN (EMA+K-ens) ll=0.6576  br=0.2321  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.98it/s, step size=2.31e-01, acc. prob=0.964]


[outer 015] TRAIN (EMA+K-ens) ll=0.6523  br=0.2295  acc=0.6860


Sample: 100%|██████████| 330/330 [00:12, 26.01it/s, step size=2.46e-01, acc. prob=0.949]


[outer 016] TRAIN (EMA+K-ens) ll=0.6492  br=0.2281  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 23.97it/s, step size=2.94e-01, acc. prob=0.932]


[outer 017] TRAIN (EMA+K-ens) ll=0.6549  br=0.2308  acc=0.6880


Sample: 100%|██████████| 330/330 [00:12, 25.53it/s, step size=3.63e-01, acc. prob=0.935]


[outer 018] TRAIN (EMA+K-ens) ll=0.6620  br=0.2341  acc=0.6890


Sample: 100%|██████████| 330/330 [00:13, 24.36it/s, step size=3.43e-01, acc. prob=0.953]


[outer 019] TRAIN (EMA+K-ens) ll=0.6644  br=0.2350  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 24.43it/s, step size=2.94e-01, acc. prob=0.944]


[outer 020] TRAIN (EMA+K-ens) ll=0.6650  br=0.2352  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.28it/s, step size=2.94e-01, acc. prob=0.931]


[outer 021] TRAIN (EMA+K-ens) ll=0.6668  br=0.2363  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.46it/s, step size=3.28e-01, acc. prob=0.941]


[outer 022] TRAIN (EMA+K-ens) ll=0.6643  br=0.2351  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 24.43it/s, step size=3.03e-01, acc. prob=0.957]


[outer 023] TRAIN (EMA+K-ens) ll=0.6694  br=0.2373  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 24.12it/s, step size=2.66e-01, acc. prob=0.944]


[outer 024] TRAIN (EMA+K-ens) ll=0.6628  br=0.2341  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 24.56it/s, step size=2.93e-01, acc. prob=0.943]


[outer 025] TRAIN (EMA+K-ens) ll=0.6691  br=0.2371  acc=0.6830


Sample: 100%|██████████| 330/330 [00:13, 23.71it/s, step size=2.78e-01, acc. prob=0.929]


[outer 026] TRAIN (EMA+K-ens) ll=0.6627  br=0.2342  acc=0.6820


Sample: 100%|██████████| 330/330 [00:14, 22.38it/s, step size=2.85e-01, acc. prob=0.936]


[outer 027] TRAIN (EMA+K-ens) ll=0.6749  br=0.2402  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.57it/s, step size=2.57e-01, acc. prob=0.953]


[outer 028] TRAIN (EMA+K-ens) ll=0.6732  br=0.2393  acc=0.6810


Sample: 100%|██████████| 330/330 [00:14, 23.41it/s, step size=2.54e-01, acc. prob=0.960]


[outer 029] TRAIN (EMA+K-ens) ll=0.6713  br=0.2381  acc=0.6750


Sample: 100%|██████████| 330/330 [00:14, 23.17it/s, step size=2.80e-01, acc. prob=0.947]


[outer 030] TRAIN (EMA+K-ens) ll=0.6756  br=0.2401  acc=0.6560


Sample: 100%|██████████| 330/330 [00:13, 24.72it/s, step size=2.93e-01, acc. prob=0.951]


[outer 031] TRAIN (EMA+K-ens) ll=0.6775  br=0.2411  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 24.92it/s, step size=3.02e-01, acc. prob=0.948]


[outer 032] TRAIN (EMA+K-ens) ll=0.6836  br=0.2442  acc=0.6540


Sample: 100%|██████████| 330/330 [00:14, 22.99it/s, step size=2.45e-01, acc. prob=0.959]


[outer 033] TRAIN (EMA+K-ens) ll=0.6751  br=0.2402  acc=0.6720


Sample: 100%|██████████| 330/330 [00:13, 23.73it/s, step size=3.62e-01, acc. prob=0.920]


[outer 034] TRAIN (EMA+K-ens) ll=0.6755  br=0.2404  acc=0.6760


Sample: 100%|██████████| 330/330 [00:13, 23.87it/s, step size=2.05e-01, acc. prob=0.959]


[outer 035] TRAIN (EMA+K-ens) ll=0.6720  br=0.2386  acc=0.6680


Sample: 100%|██████████| 330/330 [00:13, 24.04it/s, step size=2.88e-01, acc. prob=0.947]


[outer 036] TRAIN (EMA+K-ens) ll=0.6609  br=0.2336  acc=0.6780


Sample: 100%|██████████| 330/330 [00:13, 23.69it/s, step size=2.11e-01, acc. prob=0.969]


[outer 037] TRAIN (EMA+K-ens) ll=0.6626  br=0.2345  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 24.26it/s, step size=3.32e-01, acc. prob=0.935]


[outer 038] TRAIN (EMA+K-ens) ll=0.6620  br=0.2343  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 25.32it/s, step size=3.03e-01, acc. prob=0.951]


[outer 039] TRAIN (EMA+K-ens) ll=0.6552  br=0.2309  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.96it/s, step size=2.75e-01, acc. prob=0.948]


[outer 000] TRAIN (EMA+K-ens) ll=0.8022  br=0.3013  acc=0.3880


Sample: 100%|██████████| 330/330 [00:13, 23.72it/s, step size=2.93e-01, acc. prob=0.932]


[outer 001] TRAIN (EMA+K-ens) ll=0.7260  br=0.2653  acc=0.5060


Sample: 100%|██████████| 330/330 [00:15, 21.50it/s, step size=2.37e-01, acc. prob=0.961]


[outer 002] TRAIN (EMA+K-ens) ll=0.7229  br=0.2638  acc=0.5090


Sample: 100%|██████████| 330/330 [00:14, 22.53it/s, step size=3.86e-01, acc. prob=0.869]


[outer 003] TRAIN (EMA+K-ens) ll=0.7147  br=0.2597  acc=0.6140


Sample: 100%|██████████| 330/330 [00:12, 26.62it/s, step size=3.56e-01, acc. prob=0.936]


[outer 004] TRAIN (EMA+K-ens) ll=0.7330  br=0.2681  acc=0.5560


Sample: 100%|██████████| 330/330 [00:13, 23.78it/s, step size=2.43e-01, acc. prob=0.961]


[outer 005] TRAIN (EMA+K-ens) ll=0.7230  br=0.2630  acc=0.6330


Sample: 100%|██████████| 330/330 [00:12, 25.49it/s, step size=3.46e-01, acc. prob=0.886]


[outer 006] TRAIN (EMA+K-ens) ll=0.7107  br=0.2571  acc=0.6590


Sample: 100%|██████████| 330/330 [00:13, 24.34it/s, step size=2.51e-01, acc. prob=0.956]


[outer 007] TRAIN (EMA+K-ens) ll=0.7092  br=0.2564  acc=0.6780


Sample: 100%|██████████| 330/330 [00:13, 24.54it/s, step size=2.95e-01, acc. prob=0.948]


[outer 008] TRAIN (EMA+K-ens) ll=0.6973  br=0.2507  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 24.47it/s, step size=2.81e-01, acc. prob=0.929]


[outer 009] TRAIN (EMA+K-ens) ll=0.6946  br=0.2494  acc=0.6920


Sample: 100%|██████████| 330/330 [00:14, 23.35it/s, step size=2.36e-01, acc. prob=0.930]


[outer 010] TRAIN (EMA+K-ens) ll=0.6847  br=0.2444  acc=0.6920


Sample: 100%|██████████| 330/330 [00:14, 23.21it/s, step size=2.49e-01, acc. prob=0.951]


[outer 011] TRAIN (EMA+K-ens) ll=0.6874  br=0.2457  acc=0.6910


Sample: 100%|██████████| 330/330 [00:15, 21.87it/s, step size=2.54e-01, acc. prob=0.965]


[outer 012] TRAIN (EMA+K-ens) ll=0.6675  br=0.2363  acc=0.6930


Sample: 100%|██████████| 330/330 [00:14, 22.58it/s, step size=3.26e-01, acc. prob=0.917]


[outer 013] TRAIN (EMA+K-ens) ll=0.6659  br=0.2358  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 22.87it/s, step size=2.30e-01, acc. prob=0.971]


[outer 014] TRAIN (EMA+K-ens) ll=0.6696  br=0.2377  acc=0.6730


Sample: 100%|██████████| 330/330 [00:13, 24.29it/s, step size=2.67e-01, acc. prob=0.954]


[outer 015] TRAIN (EMA+K-ens) ll=0.6584  br=0.2324  acc=0.6780


Sample: 100%|██████████| 330/330 [00:15, 21.96it/s, step size=2.78e-01, acc. prob=0.958]


[outer 016] TRAIN (EMA+K-ens) ll=0.6600  br=0.2331  acc=0.6880


Sample: 100%|██████████| 330/330 [00:12, 25.92it/s, step size=2.87e-01, acc. prob=0.944]


[outer 017] TRAIN (EMA+K-ens) ll=0.6569  br=0.2318  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 24.37it/s, step size=3.13e-01, acc. prob=0.943]


[outer 018] TRAIN (EMA+K-ens) ll=0.6552  br=0.2310  acc=0.6730


Sample: 100%|██████████| 330/330 [00:14, 22.07it/s, step size=2.04e-01, acc. prob=0.955]


[outer 019] TRAIN (EMA+K-ens) ll=0.6365  br=0.2222  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 24.99it/s, step size=3.00e-01, acc. prob=0.956]


[outer 020] TRAIN (EMA+K-ens) ll=0.6312  br=0.2198  acc=0.6760


Sample: 100%|██████████| 330/330 [00:13, 23.72it/s, step size=2.74e-01, acc. prob=0.949]


[outer 021] TRAIN (EMA+K-ens) ll=0.6270  br=0.2178  acc=0.6780


Sample: 100%|██████████| 330/330 [00:13, 24.22it/s, step size=3.11e-01, acc. prob=0.947]


[outer 022] TRAIN (EMA+K-ens) ll=0.6161  br=0.2125  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 25.19it/s, step size=3.73e-01, acc. prob=0.958]


[outer 023] TRAIN (EMA+K-ens) ll=0.6258  br=0.2170  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 23.70it/s, step size=3.27e-01, acc. prob=0.946]


[outer 024] TRAIN (EMA+K-ens) ll=0.6223  br=0.2154  acc=0.6840


Sample: 100%|██████████| 330/330 [00:13, 24.35it/s, step size=2.66e-01, acc. prob=0.969]


[outer 025] TRAIN (EMA+K-ens) ll=0.6276  br=0.2179  acc=0.6920


Sample: 100%|██████████| 330/330 [00:14, 22.98it/s, step size=2.66e-01, acc. prob=0.938]


[outer 026] TRAIN (EMA+K-ens) ll=0.6344  br=0.2212  acc=0.6770


Sample: 100%|██████████| 330/330 [00:14, 23.11it/s, step size=3.23e-01, acc. prob=0.938]


[outer 027] TRAIN (EMA+K-ens) ll=0.6426  br=0.2250  acc=0.6770


Sample: 100%|██████████| 330/330 [00:12, 25.77it/s, step size=2.40e-01, acc. prob=0.966]


[outer 028] TRAIN (EMA+K-ens) ll=0.6573  br=0.2320  acc=0.6620


Sample: 100%|██████████| 330/330 [00:13, 24.62it/s, step size=2.82e-01, acc. prob=0.928]


[outer 029] TRAIN (EMA+K-ens) ll=0.6530  br=0.2299  acc=0.6670


Sample: 100%|██████████| 330/330 [00:14, 22.92it/s, step size=2.95e-01, acc. prob=0.912]


[outer 030] TRAIN (EMA+K-ens) ll=0.6514  br=0.2292  acc=0.6680


Sample: 100%|██████████| 330/330 [00:15, 21.53it/s, step size=2.29e-01, acc. prob=0.959]


[outer 031] TRAIN (EMA+K-ens) ll=0.6455  br=0.2264  acc=0.6690


Sample: 100%|██████████| 330/330 [00:13, 24.50it/s, step size=2.37e-01, acc. prob=0.961]


[outer 032] TRAIN (EMA+K-ens) ll=0.6401  br=0.2237  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 24.40it/s, step size=3.19e-01, acc. prob=0.944]


[outer 033] TRAIN (EMA+K-ens) ll=0.6349  br=0.2213  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 24.24it/s, step size=2.91e-01, acc. prob=0.968]


[outer 034] TRAIN (EMA+K-ens) ll=0.6419  br=0.2246  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.50it/s, step size=3.13e-01, acc. prob=0.922]


[outer 035] TRAIN (EMA+K-ens) ll=0.6481  br=0.2275  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 24.37it/s, step size=2.60e-01, acc. prob=0.961]


[outer 036] TRAIN (EMA+K-ens) ll=0.6369  br=0.2222  acc=0.6910


Sample: 100%|██████████| 330/330 [00:13, 24.00it/s, step size=2.90e-01, acc. prob=0.953]


[outer 037] TRAIN (EMA+K-ens) ll=0.6511  br=0.2289  acc=0.6880


Sample: 100%|██████████| 330/330 [00:15, 21.43it/s, step size=2.80e-01, acc. prob=0.952]


[outer 038] TRAIN (EMA+K-ens) ll=0.6580  br=0.2322  acc=0.6880


Sample: 100%|██████████| 330/330 [00:12, 26.21it/s, step size=3.09e-01, acc. prob=0.932]

[outer 039] TRAIN (EMA+K-ens) ll=0.6563  br=0.2315  acc=0.6750
          accuracy     brier   logloss
mean      0.601351  0.265223  0.781542
std       0.023335  0.013889  0.081066
median    0.603720  0.266275  0.770433
<lambda>  0.580440  0.253691  0.707593
<lambda>  0.623880  0.277352  0.856865
    accuracy     brier   logloss
0    0.58252  0.284985  0.867139
1    0.57718  0.278085  0.853441
2    0.58110  0.270300  0.841767
3    0.58540  0.273499  0.874698
4    0.58310  0.283482  0.882864
5    0.57050  0.275668  0.806867
6    0.58218  0.277108  0.835483
7    0.57440  0.279853  0.893974
8    0.57846  0.271002  0.819351
9    0.57362  0.280378  0.892929
10   0.62388  0.243343  0.682571
11   0.62388  0.246363  0.689104
12   0.62348  0.262089  0.726670
13   0.62388  0.261240  0.725931
14   0.62406  0.262250  0.733998
15   0.62372  0.244719  0.684333
16   0.62388  0.254721  0.708690
17   0.62542  0.254283  0.710906
18   0.62432  0.249182  0.695819
19   0.62204  0.251912  0.704302
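
The three columns in the tables above (`accuracy`, `brier`, `logloss`) are standard scores for binary probabilistic predictions. As a reading aid, the sketch below shows one common way to compute them. `binary_metrics` is a hypothetical helper introduced here, not the notebook's own metric code, and it assumes labels coded 0/1 and a 0.5 decision threshold.

```python
import numpy as np

def binary_metrics(y_true, p_pred, eps=1e-12):
    """Accuracy, Brier score, and log loss for binary labels (hypothetical helper)."""
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log(0) never occurs.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    # Accuracy: predict class 1 when p >= 0.5 (assumed threshold).
    accuracy = float(np.mean((p >= 0.5) == (y == 1.0)))
    # Brier score: mean squared error between probability and label.
    brier = float(np.mean((p - y) ** 2))
    # Log loss: mean negative Bernoulli log-likelihood.
    logloss = float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
    return {"accuracy": accuracy, "brier": brier, "logloss": logloss}

# Reference point: an uninformative predictor that always outputs 0.5.
baseline = binary_metrics([0, 1, 1, 0], [0.5, 0.5, 0.5, 0.5])
print(baseline)
```

A constant prediction of 0.5 gives a Brier score of 0.25 and a log loss of ln 2 ≈ 0.693 whatever the labels are, which provides a chance-level baseline for the values reported above.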





In [None]:

noise_type = "contaminated"
n_per_group_train = 200
use_long = False

for seed in range(10):

    np.random.seed(seed); torch.manual_seed(seed)
    df_train = simulate_dataset(
        noise_type=noise_type,
        n_per_group=n_per_group_train
    )
    res = fit_ksd_bayes_nuts_ema_ensemble(
        df_train, df_simulated_contaminated_test, feature_cols,
        interaction=False, nonlinear=False, group=False,
        n_outer=40, nuts_warmup=300, nuts_samples=30,
        beta_lr=0.01, target_accept_prob=0.90,
        device="cuda", verbose=True
    )
    # Predict on the test set with the final θ (or use only p_mean from the last 50% of NUTS draws)
    p_test, m = predict_probit(res["final_theta"], df_simulated_contaminated_test, feature_cols, False, False, False)
    all_metrics.append(m)

# Aggregate metrics across seeds
df = pd.DataFrame(all_metrics)
summary = df.agg(['mean','std','median',lambda s: s.quantile(0.25),lambda s: s.quantile(0.75)])
print(summary)
print(df)
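
Because both quantiles are passed as anonymous functions, the printed summary labels both rows `<lambda>`. One possible way to get readable row labels is to pass named functions instead. The sketch below assumes `all_metrics` is a list of dicts with `accuracy`, `brier`, and `logloss` keys; `q25`, `q75`, and the `example_*` variables are names introduced for illustration, and the four rows are copied from the per-run table printed earlier in this notebook.

```python
import pandas as pd

def q25(s):
    """First quartile of a metric column."""
    return s.quantile(0.25)

def q75(s):
    """Third quartile of a metric column."""
    return s.quantile(0.75)

# Four per-run rows copied from the earlier output, for illustration only.
example_metrics = [
    {"accuracy": 0.58252, "brier": 0.284985, "logloss": 0.867139},
    {"accuracy": 0.57718, "brier": 0.278085, "logloss": 0.853441},
    {"accuracy": 0.58110, "brier": 0.270300, "logloss": 0.841767},
    {"accuracy": 0.58540, "brier": 0.273499, "logloss": 0.874698},
]
example_df = pd.DataFrame(example_metrics)

# Named functions appear in the index under their own names.
example_summary = example_df.agg(["mean", "std", "median", q25, q75])
print(example_summary)
```

The summary rows are then labeled `mean`, `std`, `median`, `q25`, and `q75`, so the two quartiles can be told apart without referring back to the code.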

Sample: 100%|██████████| 330/330 [00:15, 21.46it/s, step size=2.97e-01, acc. prob=0.959]


[outer 000] TRAIN (EMA+K-ens) ll=0.7828  br=0.2896  acc=0.4840


Sample: 100%|██████████| 330/330 [00:13, 23.66it/s, step size=2.96e-01, acc. prob=0.953]


[outer 001] TRAIN (EMA+K-ens) ll=0.7309  br=0.2661  acc=0.5760


Sample: 100%|██████████| 330/330 [00:14, 22.50it/s, step size=2.77e-01, acc. prob=0.959]


[outer 002] TRAIN (EMA+K-ens) ll=0.7443  br=0.2718  acc=0.6010


Sample: 100%|██████████| 330/330 [00:14, 23.14it/s, step size=2.61e-01, acc. prob=0.955]


[outer 003] TRAIN (EMA+K-ens) ll=0.7263  br=0.2634  acc=0.6590


Sample: 100%|██████████| 330/330 [00:14, 22.87it/s, step size=2.26e-01, acc. prob=0.951]


[outer 004] TRAIN (EMA+K-ens) ll=0.7062  br=0.2547  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 23.49it/s, step size=2.76e-01, acc. prob=0.956]


[outer 005] TRAIN (EMA+K-ens) ll=0.7033  br=0.2534  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 23.80it/s, step size=2.94e-01, acc. prob=0.916]


[outer 006] TRAIN (EMA+K-ens) ll=0.6948  br=0.2494  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.46it/s, step size=2.46e-01, acc. prob=0.964]


[outer 007] TRAIN (EMA+K-ens) ll=0.6880  br=0.2462  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.08it/s, step size=2.66e-01, acc. prob=0.924]


[outer 008] TRAIN (EMA+K-ens) ll=0.6756  br=0.2404  acc=0.6950


Sample: 100%|██████████| 330/330 [00:16, 20.43it/s, step size=2.75e-01, acc. prob=0.950]


[outer 009] TRAIN (EMA+K-ens) ll=0.6737  br=0.2394  acc=0.6960


Sample: 100%|██████████| 330/330 [00:16, 19.70it/s, step size=2.50e-01, acc. prob=0.950]


[outer 010] TRAIN (EMA+K-ens) ll=0.6518  br=0.2292  acc=0.6940


Sample: 100%|██████████| 330/330 [00:15, 20.81it/s, step size=2.15e-01, acc. prob=0.976]


[outer 011] TRAIN (EMA+K-ens) ll=0.6438  br=0.2254  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.91it/s, step size=3.12e-01, acc. prob=0.971]


[outer 012] TRAIN (EMA+K-ens) ll=0.6417  br=0.2244  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.03it/s, step size=3.12e-01, acc. prob=0.930]


[outer 013] TRAIN (EMA+K-ens) ll=0.6395  br=0.2234  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.09it/s, step size=3.09e-01, acc. prob=0.956]


[outer 014] TRAIN (EMA+K-ens) ll=0.6424  br=0.2248  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 25.28it/s, step size=2.68e-01, acc. prob=0.962]


[outer 015] TRAIN (EMA+K-ens) ll=0.6410  br=0.2241  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 23.75it/s, step size=2.90e-01, acc. prob=0.952]


[outer 016] TRAIN (EMA+K-ens) ll=0.6311  br=0.2193  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 23.95it/s, step size=2.67e-01, acc. prob=0.945]


[outer 017] TRAIN (EMA+K-ens) ll=0.6300  br=0.2189  acc=0.6990


Sample: 100%|██████████| 330/330 [00:14, 22.37it/s, step size=2.80e-01, acc. prob=0.957]


[outer 018] TRAIN (EMA+K-ens) ll=0.6329  br=0.2202  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 23.70it/s, step size=3.16e-01, acc. prob=0.942]


[outer 019] TRAIN (EMA+K-ens) ll=0.6443  br=0.2257  acc=0.6970


Sample: 100%|██████████| 330/330 [00:12, 25.71it/s, step size=2.64e-01, acc. prob=0.949]


[outer 020] TRAIN (EMA+K-ens) ll=0.6548  br=0.2304  acc=0.6810


Sample: 100%|██████████| 330/330 [00:15, 21.53it/s, step size=2.45e-01, acc. prob=0.946]


[outer 021] TRAIN (EMA+K-ens) ll=0.6508  br=0.2287  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.24it/s, step size=3.04e-01, acc. prob=0.916]


[outer 022] TRAIN (EMA+K-ens) ll=0.6477  br=0.2272  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 24.79it/s, step size=3.02e-01, acc. prob=0.926]


[outer 023] TRAIN (EMA+K-ens) ll=0.6512  br=0.2290  acc=0.6960


Sample: 100%|██████████| 330/330 [00:14, 22.52it/s, step size=3.08e-01, acc. prob=0.932]


[outer 024] TRAIN (EMA+K-ens) ll=0.6640  br=0.2350  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.73it/s, step size=3.00e-01, acc. prob=0.954]


[outer 025] TRAIN (EMA+K-ens) ll=0.6613  br=0.2336  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.12it/s, step size=2.88e-01, acc. prob=0.942]


[outer 026] TRAIN (EMA+K-ens) ll=0.6684  br=0.2370  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 23.19it/s, step size=3.39e-01, acc. prob=0.922]


[outer 027] TRAIN (EMA+K-ens) ll=0.6599  br=0.2330  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 23.99it/s, step size=2.99e-01, acc. prob=0.927]


[outer 028] TRAIN (EMA+K-ens) ll=0.6542  br=0.2303  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.16it/s, step size=2.74e-01, acc. prob=0.959]


[outer 029] TRAIN (EMA+K-ens) ll=0.6458  br=0.2264  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.21it/s, step size=2.86e-01, acc. prob=0.968]


[outer 030] TRAIN (EMA+K-ens) ll=0.6376  br=0.2224  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.46it/s, step size=2.65e-01, acc. prob=0.957]


[outer 031] TRAIN (EMA+K-ens) ll=0.6432  br=0.2251  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.32it/s, step size=2.86e-01, acc. prob=0.948]


[outer 032] TRAIN (EMA+K-ens) ll=0.6462  br=0.2266  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.18it/s, step size=3.08e-01, acc. prob=0.927]


[outer 033] TRAIN (EMA+K-ens) ll=0.6599  br=0.2330  acc=0.6810


Sample: 100%|██████████| 330/330 [00:15, 21.56it/s, step size=2.58e-01, acc. prob=0.952]


[outer 034] TRAIN (EMA+K-ens) ll=0.6593  br=0.2328  acc=0.6780


Sample: 100%|██████████| 330/330 [00:14, 23.40it/s, step size=2.63e-01, acc. prob=0.944]


[outer 035] TRAIN (EMA+K-ens) ll=0.6669  br=0.2361  acc=0.6650


Sample: 100%|██████████| 330/330 [00:13, 23.86it/s, step size=2.81e-01, acc. prob=0.937]


[outer 036] TRAIN (EMA+K-ens) ll=0.6670  br=0.2363  acc=0.6570


Sample: 100%|██████████| 330/330 [00:14, 23.53it/s, step size=2.99e-01, acc. prob=0.913]


[outer 037] TRAIN (EMA+K-ens) ll=0.6761  br=0.2405  acc=0.6510


Sample: 100%|██████████| 330/330 [00:13, 24.05it/s, step size=2.47e-01, acc. prob=0.972]


[outer 038] TRAIN (EMA+K-ens) ll=0.6848  br=0.2444  acc=0.6650


Sample: 100%|██████████| 330/330 [00:16, 20.11it/s, step size=2.65e-01, acc. prob=0.963]


[outer 039] TRAIN (EMA+K-ens) ll=0.6747  br=0.2397  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.53it/s, step size=3.03e-01, acc. prob=0.932]


[outer 000] TRAIN (EMA+K-ens) ll=0.6243  br=0.2159  acc=0.7500


Sample: 100%|██████████| 330/330 [00:12, 26.72it/s, step size=2.73e-01, acc. prob=0.955]


[outer 001] TRAIN (EMA+K-ens) ll=0.6429  br=0.2252  acc=0.7140


Sample: 100%|██████████| 330/330 [00:13, 24.03it/s, step size=2.94e-01, acc. prob=0.931]


[outer 002] TRAIN (EMA+K-ens) ll=0.6265  br=0.2173  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 21.96it/s, step size=2.35e-01, acc. prob=0.960]


[outer 003] TRAIN (EMA+K-ens) ll=0.6437  br=0.2255  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 25.29it/s, step size=2.82e-01, acc. prob=0.955]


[outer 004] TRAIN (EMA+K-ens) ll=0.6409  br=0.2241  acc=0.6800


Sample: 100%|██████████| 330/330 [00:14, 23.29it/s, step size=2.42e-01, acc. prob=0.959]


[outer 005] TRAIN (EMA+K-ens) ll=0.6448  br=0.2261  acc=0.6770


Sample: 100%|██████████| 330/330 [00:15, 21.52it/s, step size=2.52e-01, acc. prob=0.957]


[outer 006] TRAIN (EMA+K-ens) ll=0.6442  br=0.2258  acc=0.6760


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=1.99e-01, acc. prob=0.971]


[outer 007] TRAIN (EMA+K-ens) ll=0.6486  br=0.2279  acc=0.6790


Sample: 100%|██████████| 330/330 [00:13, 23.93it/s, step size=2.82e-01, acc. prob=0.940]


[outer 008] TRAIN (EMA+K-ens) ll=0.6478  br=0.2275  acc=0.6790


Sample: 100%|██████████| 330/330 [00:13, 24.64it/s, step size=2.93e-01, acc. prob=0.924]


[outer 009] TRAIN (EMA+K-ens) ll=0.6549  br=0.2309  acc=0.6780


Sample: 100%|██████████| 330/330 [00:14, 23.11it/s, step size=2.39e-01, acc. prob=0.971]


[outer 010] TRAIN (EMA+K-ens) ll=0.6610  br=0.2339  acc=0.6780


Sample: 100%|██████████| 330/330 [00:12, 27.36it/s, step size=3.25e-01, acc. prob=0.949]


[outer 011] TRAIN (EMA+K-ens) ll=0.6575  br=0.2323  acc=0.6760


Sample: 100%|██████████| 330/330 [00:16, 20.08it/s, step size=2.31e-01, acc. prob=0.965]


[outer 012] TRAIN (EMA+K-ens) ll=0.6625  br=0.2347  acc=0.6650


Sample: 100%|██████████| 330/330 [00:14, 22.78it/s, step size=2.77e-01, acc. prob=0.948]


[outer 013] TRAIN (EMA+K-ens) ll=0.6609  br=0.2339  acc=0.6730


Sample: 100%|██████████| 330/330 [00:15, 21.86it/s, step size=2.45e-01, acc. prob=0.966]


[outer 014] TRAIN (EMA+K-ens) ll=0.6564  br=0.2317  acc=0.6780


Sample: 100%|██████████| 330/330 [00:13, 24.48it/s, step size=3.24e-01, acc. prob=0.923]


[outer 015] TRAIN (EMA+K-ens) ll=0.6551  br=0.2310  acc=0.6780


Sample: 100%|██████████| 330/330 [00:15, 21.11it/s, step size=2.47e-01, acc. prob=0.957]


[outer 016] TRAIN (EMA+K-ens) ll=0.6634  br=0.2350  acc=0.6790


Sample: 100%|██████████| 330/330 [00:12, 25.49it/s, step size=2.73e-01, acc. prob=0.950]


[outer 017] TRAIN (EMA+K-ens) ll=0.6688  br=0.2376  acc=0.6780


Sample: 100%|██████████| 330/330 [00:14, 22.64it/s, step size=2.94e-01, acc. prob=0.940]


[outer 018] TRAIN (EMA+K-ens) ll=0.6788  br=0.2423  acc=0.6780


Sample: 100%|██████████| 330/330 [00:12, 25.78it/s, step size=2.27e-01, acc. prob=0.968]


[outer 019] TRAIN (EMA+K-ens) ll=0.6835  br=0.2445  acc=0.6720


Sample: 100%|██████████| 330/330 [00:14, 22.22it/s, step size=3.25e-01, acc. prob=0.943]


[outer 020] TRAIN (EMA+K-ens) ll=0.6892  br=0.2471  acc=0.6720


Sample: 100%|██████████| 330/330 [00:14, 22.68it/s, step size=3.00e-01, acc. prob=0.910]


[outer 021] TRAIN (EMA+K-ens) ll=0.6943  br=0.2495  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 25.04it/s, step size=2.71e-01, acc. prob=0.961]


[outer 022] TRAIN (EMA+K-ens) ll=0.7107  br=0.2571  acc=0.6420


Sample: 100%|██████████| 330/330 [00:14, 23.37it/s, step size=3.08e-01, acc. prob=0.942]


[outer 023] TRAIN (EMA+K-ens) ll=0.7122  br=0.2577  acc=0.6400


Sample: 100%|██████████| 330/330 [00:13, 23.62it/s, step size=2.84e-01, acc. prob=0.949]


[outer 024] TRAIN (EMA+K-ens) ll=0.7079  br=0.2556  acc=0.6440


Sample: 100%|██████████| 330/330 [00:15, 21.73it/s, step size=2.89e-01, acc. prob=0.950]


[outer 025] TRAIN (EMA+K-ens) ll=0.6933  br=0.2487  acc=0.6680


Sample: 100%|██████████| 330/330 [00:12, 25.57it/s, step size=2.67e-01, acc. prob=0.960]


[outer 026] TRAIN (EMA+K-ens) ll=0.6912  br=0.2477  acc=0.6680


Sample: 100%|██████████| 330/330 [00:13, 23.91it/s, step size=2.83e-01, acc. prob=0.955]


[outer 027] TRAIN (EMA+K-ens) ll=0.6886  br=0.2465  acc=0.6660


Sample: 100%|██████████| 330/330 [00:15, 21.63it/s, step size=2.65e-01, acc. prob=0.959]


[outer 028] TRAIN (EMA+K-ens) ll=0.6856  br=0.2450  acc=0.6580


Sample: 100%|██████████| 330/330 [00:12, 26.16it/s, step size=2.89e-01, acc. prob=0.930]


[outer 029] TRAIN (EMA+K-ens) ll=0.6863  br=0.2455  acc=0.6640


Sample: 100%|██████████| 330/330 [00:14, 22.39it/s, step size=3.09e-01, acc. prob=0.928]


[outer 030] TRAIN (EMA+K-ens) ll=0.6754  br=0.2403  acc=0.6620


Sample: 100%|██████████| 330/330 [00:13, 23.60it/s, step size=3.19e-01, acc. prob=0.937]


[outer 031] TRAIN (EMA+K-ens) ll=0.6787  br=0.2418  acc=0.6680


Sample: 100%|██████████| 330/330 [00:16, 20.54it/s, step size=2.29e-01, acc. prob=0.970]


[outer 032] TRAIN (EMA+K-ens) ll=0.6749  br=0.2401  acc=0.6780


Sample: 100%|██████████| 330/330 [00:14, 22.56it/s, step size=2.57e-01, acc. prob=0.963]


[outer 033] TRAIN (EMA+K-ens) ll=0.6768  br=0.2411  acc=0.6790


Sample: 100%|██████████| 330/330 [00:14, 23.11it/s, step size=3.48e-01, acc. prob=0.916]


[outer 034] TRAIN (EMA+K-ens) ll=0.6753  br=0.2405  acc=0.6760


Sample: 100%|██████████| 330/330 [00:14, 23.35it/s, step size=2.87e-01, acc. prob=0.945]


[outer 035] TRAIN (EMA+K-ens) ll=0.6657  br=0.2358  acc=0.6800


Sample: 100%|██████████| 330/330 [00:13, 23.59it/s, step size=2.54e-01, acc. prob=0.955]


[outer 036] TRAIN (EMA+K-ens) ll=0.6621  br=0.2341  acc=0.6820


Sample: 100%|██████████| 330/330 [00:13, 23.74it/s, step size=2.43e-01, acc. prob=0.969]


[outer 037] TRAIN (EMA+K-ens) ll=0.6639  br=0.2349  acc=0.6800


Sample: 100%|██████████| 330/330 [00:13, 24.36it/s, step size=3.08e-01, acc. prob=0.957]


[outer 038] TRAIN (EMA+K-ens) ll=0.6665  br=0.2362  acc=0.6780


Sample: 100%|██████████| 330/330 [00:14, 23.02it/s, step size=3.11e-01, acc. prob=0.948]


[outer 039] TRAIN (EMA+K-ens) ll=0.6626  br=0.2345  acc=0.6780


Sample: 100%|██████████| 330/330 [00:12, 26.45it/s, step size=3.24e-01, acc. prob=0.932]


[outer 000] TRAIN (EMA+K-ens) ll=0.7005  br=0.2524  acc=0.6230


Sample: 100%|██████████| 330/330 [00:12, 25.72it/s, step size=2.50e-01, acc. prob=0.971]


[outer 001] TRAIN (EMA+K-ens) ll=0.7033  br=0.2529  acc=0.6960


Sample: 100%|██████████| 330/330 [00:14, 22.26it/s, step size=2.40e-01, acc. prob=0.958]


[outer 002] TRAIN (EMA+K-ens) ll=0.6899  br=0.2468  acc=0.6920


Sample: 100%|██████████| 330/330 [00:12, 25.53it/s, step size=3.49e-01, acc. prob=0.935]


[outer 003] TRAIN (EMA+K-ens) ll=0.6991  br=0.2509  acc=0.6970


Sample: 100%|██████████| 330/330 [00:13, 23.61it/s, step size=2.95e-01, acc. prob=0.943]


[outer 004] TRAIN (EMA+K-ens) ll=0.6985  br=0.2507  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 23.22it/s, step size=2.80e-01, acc. prob=0.952]


[outer 005] TRAIN (EMA+K-ens) ll=0.6864  br=0.2454  acc=0.6790


Sample: 100%|██████████| 330/330 [00:13, 24.06it/s, step size=3.58e-01, acc. prob=0.916]


[outer 006] TRAIN (EMA+K-ens) ll=0.6829  br=0.2438  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.25it/s, step size=3.29e-01, acc. prob=0.949]


[outer 007] TRAIN (EMA+K-ens) ll=0.6731  br=0.2391  acc=0.7060


Sample: 100%|██████████| 330/330 [00:13, 24.62it/s, step size=2.67e-01, acc. prob=0.960]


[outer 008] TRAIN (EMA+K-ens) ll=0.6623  br=0.2340  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 23.32it/s, step size=2.47e-01, acc. prob=0.963]


[outer 009] TRAIN (EMA+K-ens) ll=0.6587  br=0.2323  acc=0.7060


Sample: 100%|██████████| 330/330 [00:13, 24.42it/s, step size=2.52e-01, acc. prob=0.963]


[outer 010] TRAIN (EMA+K-ens) ll=0.6592  br=0.2325  acc=0.7060


Sample: 100%|██████████| 330/330 [00:15, 21.24it/s, step size=2.54e-01, acc. prob=0.951]


[outer 011] TRAIN (EMA+K-ens) ll=0.6493  br=0.2279  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.64it/s, step size=2.49e-01, acc. prob=0.961]


[outer 012] TRAIN (EMA+K-ens) ll=0.6419  br=0.2244  acc=0.7040


Sample: 100%|██████████| 330/330 [00:13, 23.87it/s, step size=3.00e-01, acc. prob=0.938]


[outer 013] TRAIN (EMA+K-ens) ll=0.6450  br=0.2256  acc=0.7030


Sample: 100%|██████████| 330/330 [00:13, 25.23it/s, step size=2.88e-01, acc. prob=0.954]


[outer 014] TRAIN (EMA+K-ens) ll=0.6455  br=0.2257  acc=0.7060


Sample: 100%|██████████| 330/330 [00:15, 21.16it/s, step size=2.91e-01, acc. prob=0.937]


[outer 015] TRAIN (EMA+K-ens) ll=0.6506  br=0.2280  acc=0.7080


Sample: 100%|██████████| 330/330 [00:13, 23.80it/s, step size=3.13e-01, acc. prob=0.938]


[outer 016] TRAIN (EMA+K-ens) ll=0.6594  br=0.2320  acc=0.7060


Sample: 100%|██████████| 330/330 [00:13, 24.43it/s, step size=3.33e-01, acc. prob=0.953]


[outer 017] TRAIN (EMA+K-ens) ll=0.6523  br=0.2289  acc=0.7060


Sample: 100%|██████████| 330/330 [00:13, 25.08it/s, step size=2.72e-01, acc. prob=0.937]


[outer 018] TRAIN (EMA+K-ens) ll=0.6570  br=0.2312  acc=0.7060


Sample: 100%|██████████| 330/330 [00:11, 27.71it/s, step size=3.02e-01, acc. prob=0.951]


[outer 019] TRAIN (EMA+K-ens) ll=0.6527  br=0.2290  acc=0.7060


Sample: 100%|██████████| 330/330 [00:13, 24.72it/s, step size=2.66e-01, acc. prob=0.940]


[outer 020] TRAIN (EMA+K-ens) ll=0.6520  br=0.2289  acc=0.7060


Sample: 100%|██████████| 330/330 [00:14, 22.36it/s, step size=2.43e-01, acc. prob=0.956]


[outer 021] TRAIN (EMA+K-ens) ll=0.6410  br=0.2237  acc=0.7060


Sample: 100%|██████████| 330/330 [00:13, 24.78it/s, step size=3.04e-01, acc. prob=0.923]


[outer 022] TRAIN (EMA+K-ens) ll=0.6407  br=0.2236  acc=0.7060


Sample: 100%|██████████| 330/330 [00:13, 24.59it/s, step size=3.21e-01, acc. prob=0.889]


[outer 023] TRAIN (EMA+K-ens) ll=0.6532  br=0.2295  acc=0.7060
[Early stop @ outer 23] Δll=0.042%, Δbr=0.075%, Δacc=0.000


Sample: 100%|██████████| 330/330 [00:13, 24.22it/s, step size=3.05e-01, acc. prob=0.928]


[outer 000] TRAIN (EMA+K-ens) ll=0.6458  br=0.2266  acc=0.6630


Sample: 100%|██████████| 330/330 [00:13, 23.87it/s, step size=2.28e-01, acc. prob=0.955]


[outer 001] TRAIN (EMA+K-ens) ll=0.6366  br=0.2219  acc=0.7110


Sample: 100%|██████████| 330/330 [00:13, 24.70it/s, step size=2.68e-01, acc. prob=0.923]


[outer 002] TRAIN (EMA+K-ens) ll=0.6171  br=0.2130  acc=0.7190


Sample: 100%|██████████| 330/330 [00:13, 24.35it/s, step size=3.27e-01, acc. prob=0.923]


[outer 003] TRAIN (EMA+K-ens) ll=0.6303  br=0.2189  acc=0.7150


Sample: 100%|██████████| 330/330 [00:14, 22.19it/s, step size=2.90e-01, acc. prob=0.961]


[outer 004] TRAIN (EMA+K-ens) ll=0.6299  br=0.2187  acc=0.7150


Sample: 100%|██████████| 330/330 [00:12, 25.72it/s, step size=2.77e-01, acc. prob=0.953]


[outer 005] TRAIN (EMA+K-ens) ll=0.6273  br=0.2175  acc=0.7150


Sample: 100%|██████████| 330/330 [00:13, 24.84it/s, step size=2.82e-01, acc. prob=0.953]


[outer 006] TRAIN (EMA+K-ens) ll=0.6264  br=0.2171  acc=0.7160


Sample: 100%|██████████| 330/330 [00:14, 22.64it/s, step size=3.41e-01, acc. prob=0.930]


[outer 007] TRAIN (EMA+K-ens) ll=0.6247  br=0.2163  acc=0.7160


Sample: 100%|██████████| 330/330 [00:13, 24.04it/s, step size=3.26e-01, acc. prob=0.918]


[outer 008] TRAIN (EMA+K-ens) ll=0.6335  br=0.2205  acc=0.7000


Sample: 100%|██████████| 330/330 [00:15, 21.80it/s, step size=2.72e-01, acc. prob=0.950]


[outer 009] TRAIN (EMA+K-ens) ll=0.6322  br=0.2199  acc=0.7020


Sample: 100%|██████████| 330/330 [00:13, 24.61it/s, step size=2.82e-01, acc. prob=0.942]


[outer 010] TRAIN (EMA+K-ens) ll=0.6453  br=0.2259  acc=0.6970


Sample: 100%|██████████| 330/330 [00:15, 21.91it/s, step size=2.38e-01, acc. prob=0.932]


[outer 011] TRAIN (EMA+K-ens) ll=0.6367  br=0.2219  acc=0.7040


Sample: 100%|██████████| 330/330 [00:12, 25.76it/s, step size=2.83e-01, acc. prob=0.962]


[outer 012] TRAIN (EMA+K-ens) ll=0.6421  br=0.2244  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 24.87it/s, step size=3.16e-01, acc. prob=0.946]


[outer 013] TRAIN (EMA+K-ens) ll=0.6454  br=0.2258  acc=0.7070


Sample: 100%|██████████| 330/330 [00:14, 22.81it/s, step size=2.14e-01, acc. prob=0.970]


[outer 014] TRAIN (EMA+K-ens) ll=0.6487  br=0.2272  acc=0.7100


Sample: 100%|██████████| 330/330 [00:12, 25.88it/s, step size=2.88e-01, acc. prob=0.945]


[outer 015] TRAIN (EMA+K-ens) ll=0.6381  br=0.2222  acc=0.7150


Sample: 100%|██████████| 330/330 [00:14, 23.22it/s, step size=3.09e-01, acc. prob=0.948]


[outer 016] TRAIN (EMA+K-ens) ll=0.6421  br=0.2240  acc=0.7120


Sample: 100%|██████████| 330/330 [00:12, 25.70it/s, step size=2.64e-01, acc. prob=0.949]


[outer 017] TRAIN (EMA+K-ens) ll=0.6459  br=0.2259  acc=0.7160


Sample: 100%|██████████| 330/330 [00:13, 24.36it/s, step size=3.37e-01, acc. prob=0.902]


[outer 018] TRAIN (EMA+K-ens) ll=0.6454  br=0.2256  acc=0.7150


Sample: 100%|██████████| 330/330 [00:15, 21.81it/s, step size=3.00e-01, acc. prob=0.952]


[outer 019] TRAIN (EMA+K-ens) ll=0.6490  br=0.2272  acc=0.7150


Sample: 100%|██████████| 330/330 [00:13, 23.63it/s, step size=2.91e-01, acc. prob=0.941]


[outer 020] TRAIN (EMA+K-ens) ll=0.6407  br=0.2234  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 22.71it/s, step size=2.71e-01, acc. prob=0.957]


[outer 021] TRAIN (EMA+K-ens) ll=0.6344  br=0.2204  acc=0.7150


Sample: 100%|██████████| 330/330 [00:13, 24.82it/s, step size=2.62e-01, acc. prob=0.949]


[outer 022] TRAIN (EMA+K-ens) ll=0.6455  br=0.2255  acc=0.7140


Sample: 100%|██████████| 330/330 [00:13, 23.85it/s, step size=2.50e-01, acc. prob=0.974]


[outer 023] TRAIN (EMA+K-ens) ll=0.6631  br=0.2334  acc=0.7050


Sample: 100%|██████████| 330/330 [00:15, 21.86it/s, step size=2.33e-01, acc. prob=0.969]


[outer 024] TRAIN (EMA+K-ens) ll=0.6580  br=0.2312  acc=0.7070


Sample: 100%|██████████| 330/330 [00:13, 24.57it/s, step size=3.02e-01, acc. prob=0.906]


[outer 025] TRAIN (EMA+K-ens) ll=0.6572  br=0.2309  acc=0.7070


Sample: 100%|██████████| 330/330 [00:13, 25.24it/s, step size=2.32e-01, acc. prob=0.948]


[outer 026] TRAIN (EMA+K-ens) ll=0.6451  br=0.2254  acc=0.7120


Sample: 100%|██████████| 330/330 [00:12, 26.58it/s, step size=2.92e-01, acc. prob=0.941]


[outer 027] TRAIN (EMA+K-ens) ll=0.6396  br=0.2230  acc=0.7120


Sample: 100%|██████████| 330/330 [00:13, 24.21it/s, step size=2.77e-01, acc. prob=0.966]


[outer 028] TRAIN (EMA+K-ens) ll=0.6478  br=0.2267  acc=0.7120


Sample: 100%|██████████| 330/330 [00:13, 23.86it/s, step size=2.94e-01, acc. prob=0.965]


[outer 029] TRAIN (EMA+K-ens) ll=0.6519  br=0.2286  acc=0.7150


Sample: 100%|██████████| 330/330 [00:13, 24.80it/s, step size=2.44e-01, acc. prob=0.964]


[outer 030] TRAIN (EMA+K-ens) ll=0.6310  br=0.2189  acc=0.7150


Sample: 100%|██████████| 330/330 [00:14, 22.81it/s, step size=2.59e-01, acc. prob=0.972]


[outer 031] TRAIN (EMA+K-ens) ll=0.6395  br=0.2229  acc=0.7150


Sample: 100%|██████████| 330/330 [00:14, 22.97it/s, step size=2.58e-01, acc. prob=0.927]


[outer 032] TRAIN (EMA+K-ens) ll=0.6332  br=0.2200  acc=0.7150


Sample: 100%|██████████| 330/330 [00:14, 23.43it/s, step size=2.88e-01, acc. prob=0.958]


[outer 033] TRAIN (EMA+K-ens) ll=0.6277  br=0.2174  acc=0.7140


Sample: 100%|██████████| 330/330 [00:13, 24.02it/s, step size=2.86e-01, acc. prob=0.951]


[outer 034] TRAIN (EMA+K-ens) ll=0.6432  br=0.2245  acc=0.7150


Sample: 100%|██████████| 330/330 [00:12, 25.78it/s, step size=3.16e-01, acc. prob=0.942]


[outer 035] TRAIN (EMA+K-ens) ll=0.6483  br=0.2270  acc=0.7150


Sample: 100%|██████████| 330/330 [00:15, 21.51it/s, step size=2.42e-01, acc. prob=0.968]


[outer 036] TRAIN (EMA+K-ens) ll=0.6465  br=0.2262  acc=0.7150


Sample: 100%|██████████| 330/330 [00:14, 22.60it/s, step size=2.91e-01, acc. prob=0.948]


[outer 037] TRAIN (EMA+K-ens) ll=0.6518  br=0.2285  acc=0.7140


Sample: 100%|██████████| 330/330 [00:14, 23.36it/s, step size=2.55e-01, acc. prob=0.961]


[outer 038] TRAIN (EMA+K-ens) ll=0.6699  br=0.2367  acc=0.7120


Sample: 100%|██████████| 330/330 [00:13, 24.21it/s, step size=2.93e-01, acc. prob=0.946]


[outer 039] TRAIN (EMA+K-ens) ll=0.6597  br=0.2321  acc=0.7150


Sample: 100%|██████████| 330/330 [00:14, 22.52it/s, step size=2.34e-01, acc. prob=0.966]


[outer 000] TRAIN (EMA+K-ens) ll=0.5953  br=0.2021  acc=0.7340


Sample: 100%|██████████| 330/330 [00:13, 24.20it/s, step size=3.37e-01, acc. prob=0.930]


[outer 001] TRAIN (EMA+K-ens) ll=0.6416  br=0.2245  acc=0.7130


Sample: 100%|██████████| 330/330 [00:15, 21.36it/s, step size=2.43e-01, acc. prob=0.961]


[outer 002] TRAIN (EMA+K-ens) ll=0.6443  br=0.2257  acc=0.7130


Sample: 100%|██████████| 330/330 [00:13, 24.86it/s, step size=2.82e-01, acc. prob=0.964]


[outer 003] TRAIN (EMA+K-ens) ll=0.6469  br=0.2269  acc=0.7130


Sample: 100%|██████████| 330/330 [00:14, 22.85it/s, step size=2.74e-01, acc. prob=0.939]


[outer 004] TRAIN (EMA+K-ens) ll=0.6469  br=0.2269  acc=0.7120


Sample: 100%|██████████| 330/330 [00:15, 21.85it/s, step size=2.11e-01, acc. prob=0.961]


[outer 005] TRAIN (EMA+K-ens) ll=0.6501  br=0.2284  acc=0.7130


Sample: 100%|██████████| 330/330 [00:14, 22.34it/s, step size=2.14e-01, acc. prob=0.954]


[outer 006] TRAIN (EMA+K-ens) ll=0.6487  br=0.2278  acc=0.7130


Sample: 100%|██████████| 330/330 [00:12, 26.27it/s, step size=2.78e-01, acc. prob=0.966]


[outer 007] TRAIN (EMA+K-ens) ll=0.6446  br=0.2257  acc=0.7130


Sample: 100%|██████████| 330/330 [00:12, 25.58it/s, step size=3.00e-01, acc. prob=0.928]


[outer 008] TRAIN (EMA+K-ens) ll=0.6544  br=0.2304  acc=0.7130


Sample: 100%|██████████| 330/330 [00:14, 23.28it/s, step size=2.31e-01, acc. prob=0.947]


[outer 009] TRAIN (EMA+K-ens) ll=0.6431  br=0.2250  acc=0.7130


Sample: 100%|██████████| 330/330 [00:13, 24.24it/s, step size=2.41e-01, acc. prob=0.964]


[outer 010] TRAIN (EMA+K-ens) ll=0.6441  br=0.2254  acc=0.7130


Sample: 100%|██████████| 330/330 [00:14, 22.93it/s, step size=2.67e-01, acc. prob=0.955]


[outer 011] TRAIN (EMA+K-ens) ll=0.6424  br=0.2245  acc=0.7130


Sample: 100%|██████████| 330/330 [00:14, 22.98it/s, step size=3.14e-01, acc. prob=0.931]


[outer 012] TRAIN (EMA+K-ens) ll=0.6427  br=0.2245  acc=0.7130


Sample: 100%|██████████| 330/330 [00:15, 20.90it/s, step size=2.39e-01, acc. prob=0.961]


[outer 013] TRAIN (EMA+K-ens) ll=0.6405  br=0.2234  acc=0.7140


Sample: 100%|██████████| 330/330 [00:13, 25.10it/s, step size=2.92e-01, acc. prob=0.964]


[outer 014] TRAIN (EMA+K-ens) ll=0.6492  br=0.2273  acc=0.7150
[Early stop @ outer 14] Δll=0.151%, Δbr=0.341%, Δacc=0.001


Sample: 100%|██████████| 330/330 [00:13, 24.23it/s, step size=2.91e-01, acc. prob=0.945]


[outer 000] TRAIN (EMA+K-ens) ll=0.7674  br=0.2836  acc=0.4970


Sample: 100%|██████████| 330/330 [00:15, 21.34it/s, step size=3.22e-01, acc. prob=0.954]


[outer 001] TRAIN (EMA+K-ens) ll=0.7447  br=0.2736  acc=0.5270


Sample: 100%|██████████| 330/330 [00:12, 25.54it/s, step size=2.75e-01, acc. prob=0.949]


[outer 002] TRAIN (EMA+K-ens) ll=0.7193  br=0.2613  acc=0.6750


Sample: 100%|██████████| 330/330 [00:13, 23.66it/s, step size=3.30e-01, acc. prob=0.942]


[outer 003] TRAIN (EMA+K-ens) ll=0.7206  br=0.2623  acc=0.6420


Sample: 100%|██████████| 330/330 [00:12, 26.64it/s, step size=2.19e-01, acc. prob=0.949]


[outer 004] TRAIN (EMA+K-ens) ll=0.7150  br=0.2593  acc=0.6530


Sample: 100%|██████████| 330/330 [00:13, 23.62it/s, step size=2.77e-01, acc. prob=0.952]


[outer 005] TRAIN (EMA+K-ens) ll=0.7204  br=0.2616  acc=0.6540


Sample: 100%|██████████| 330/330 [00:13, 24.20it/s, step size=2.52e-01, acc. prob=0.947]


[outer 006] TRAIN (EMA+K-ens) ll=0.7107  br=0.2571  acc=0.6760


Sample: 100%|██████████| 330/330 [00:13, 23.80it/s, step size=3.26e-01, acc. prob=0.958]


[outer 007] TRAIN (EMA+K-ens) ll=0.7095  br=0.2565  acc=0.7010


Sample: 100%|██████████| 330/330 [00:13, 24.32it/s, step size=2.61e-01, acc. prob=0.949]


[outer 008] TRAIN (EMA+K-ens) ll=0.7029  br=0.2534  acc=0.7020


Sample: 100%|██████████| 330/330 [00:13, 24.40it/s, step size=3.13e-01, acc. prob=0.919]


[outer 009] TRAIN (EMA+K-ens) ll=0.7067  br=0.2551  acc=0.6990


Sample: 100%|██████████| 330/330 [00:13, 24.51it/s, step size=3.56e-01, acc. prob=0.938]


[outer 010] TRAIN (EMA+K-ens) ll=0.7019  br=0.2529  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 22.83it/s, step size=2.75e-01, acc. prob=0.953]


[outer 011] TRAIN (EMA+K-ens) ll=0.6977  br=0.2506  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 23.26it/s, step size=3.16e-01, acc. prob=0.929]


[outer 012] TRAIN (EMA+K-ens) ll=0.7004  br=0.2518  acc=0.6870


Sample: 100%|██████████| 330/330 [00:12, 25.89it/s, step size=3.06e-01, acc. prob=0.942]


[outer 013] TRAIN (EMA+K-ens) ll=0.6913  br=0.2478  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.49it/s, step size=2.60e-01, acc. prob=0.942]


[outer 014] TRAIN (EMA+K-ens) ll=0.6961  br=0.2499  acc=0.6860


Sample: 100%|██████████| 330/330 [00:13, 23.91it/s, step size=3.07e-01, acc. prob=0.945]


[outer 015] TRAIN (EMA+K-ens) ll=0.6861  br=0.2452  acc=0.6670


Sample: 100%|██████████| 330/330 [00:15, 21.28it/s, step size=2.71e-01, acc. prob=0.968]


[outer 016] TRAIN (EMA+K-ens) ll=0.6808  br=0.2428  acc=0.6730


Sample: 100%|██████████| 330/330 [00:14, 23.14it/s, step size=2.58e-01, acc. prob=0.961]


[outer 017] TRAIN (EMA+K-ens) ll=0.6747  br=0.2401  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 22.91it/s, step size=2.10e-01, acc. prob=0.969]


[outer 018] TRAIN (EMA+K-ens) ll=0.6733  br=0.2393  acc=0.7000


Sample: 100%|██████████| 330/330 [00:14, 23.01it/s, step size=2.64e-01, acc. prob=0.964]


[outer 019] TRAIN (EMA+K-ens) ll=0.6719  br=0.2388  acc=0.6590


Sample: 100%|██████████| 330/330 [00:14, 23.40it/s, step size=2.59e-01, acc. prob=0.957]


[outer 020] TRAIN (EMA+K-ens) ll=0.6555  br=0.2312  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 23.30it/s, step size=2.48e-01, acc. prob=0.944]


[outer 021] TRAIN (EMA+K-ens) ll=0.6533  br=0.2300  acc=0.7020


Sample: 100%|██████████| 330/330 [00:13, 24.39it/s, step size=3.05e-01, acc. prob=0.948]


[outer 022] TRAIN (EMA+K-ens) ll=0.6462  br=0.2265  acc=0.7020


Sample: 100%|██████████| 330/330 [00:12, 26.68it/s, step size=2.85e-01, acc. prob=0.962]


[outer 023] TRAIN (EMA+K-ens) ll=0.6522  br=0.2293  acc=0.7000


Sample: 100%|██████████| 330/330 [00:13, 25.36it/s, step size=2.68e-01, acc. prob=0.967]


[outer 024] TRAIN (EMA+K-ens) ll=0.6496  br=0.2280  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 22.23it/s, step size=2.55e-01, acc. prob=0.971]


[outer 025] TRAIN (EMA+K-ens) ll=0.6455  br=0.2260  acc=0.7020


Sample: 100%|██████████| 330/330 [00:12, 25.48it/s, step size=2.59e-01, acc. prob=0.934]


[outer 026] TRAIN (EMA+K-ens) ll=0.6477  br=0.2270  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 23.26it/s, step size=2.57e-01, acc. prob=0.941]


[outer 027] TRAIN (EMA+K-ens) ll=0.6521  br=0.2291  acc=0.7000


Sample: 100%|██████████| 330/330 [00:16, 20.29it/s, step size=2.65e-01, acc. prob=0.977]


[outer 028] TRAIN (EMA+K-ens) ll=0.6688  br=0.2368  acc=0.6820


Sample: 100%|██████████| 330/330 [00:14, 22.42it/s, step size=2.33e-01, acc. prob=0.965]


[outer 029] TRAIN (EMA+K-ens) ll=0.6710  br=0.2377  acc=0.6840


Sample: 100%|██████████| 330/330 [00:13, 23.79it/s, step size=2.70e-01, acc. prob=0.962]


[outer 030] TRAIN (EMA+K-ens) ll=0.6702  br=0.2374  acc=0.7010


Sample: 100%|██████████| 330/330 [00:13, 24.02it/s, step size=2.51e-01, acc. prob=0.955]


[outer 031] TRAIN (EMA+K-ens) ll=0.6827  br=0.2431  acc=0.7020


Sample: 100%|██████████| 330/330 [00:12, 25.91it/s, step size=3.07e-01, acc. prob=0.952]


[outer 032] TRAIN (EMA+K-ens) ll=0.6818  br=0.2428  acc=0.6860


Sample: 100%|██████████| 330/330 [00:13, 25.00it/s, step size=3.41e-01, acc. prob=0.941]


[outer 033] TRAIN (EMA+K-ens) ll=0.6754  br=0.2399  acc=0.6990


Sample: 100%|██████████| 330/330 [00:14, 22.34it/s, step size=2.73e-01, acc. prob=0.960]


[outer 034] TRAIN (EMA+K-ens) ll=0.6849  br=0.2441  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 23.59it/s, step size=2.93e-01, acc. prob=0.930]


[outer 035] TRAIN (EMA+K-ens) ll=0.6809  br=0.2422  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 22.61it/s, step size=3.37e-01, acc. prob=0.930]


[outer 036] TRAIN (EMA+K-ens) ll=0.6600  br=0.2329  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 23.21it/s, step size=3.06e-01, acc. prob=0.952]


[outer 037] TRAIN (EMA+K-ens) ll=0.6553  br=0.2307  acc=0.7020


Sample: 100%|██████████| 330/330 [00:13, 25.11it/s, step size=3.42e-01, acc. prob=0.930]


[outer 038] TRAIN (EMA+K-ens) ll=0.6605  br=0.2331  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 22.98it/s, step size=2.27e-01, acc. prob=0.952]


[outer 039] TRAIN (EMA+K-ens) ll=0.6525  br=0.2294  acc=0.7010


Sample: 100%|██████████| 330/330 [00:14, 22.04it/s, step size=2.90e-01, acc. prob=0.940]


[outer 000] TRAIN (EMA+K-ens) ll=0.7145  br=0.2589  acc=0.5280


Sample: 100%|██████████| 330/330 [00:13, 24.44it/s, step size=3.16e-01, acc. prob=0.952]


[outer 001] TRAIN (EMA+K-ens) ll=0.6829  br=0.2448  acc=0.5690


Sample: 100%|██████████| 330/330 [00:13, 24.03it/s, step size=2.84e-01, acc. prob=0.943]


[outer 002] TRAIN (EMA+K-ens) ll=0.6690  br=0.2377  acc=0.6420


Sample: 100%|██████████| 330/330 [00:13, 23.75it/s, step size=2.67e-01, acc. prob=0.953]


[outer 003] TRAIN (EMA+K-ens) ll=0.6634  br=0.2349  acc=0.6820


Sample: 100%|██████████| 330/330 [00:12, 26.76it/s, step size=2.85e-01, acc. prob=0.959]


[outer 004] TRAIN (EMA+K-ens) ll=0.6541  br=0.2305  acc=0.6780


Sample: 100%|██████████| 330/330 [00:14, 22.12it/s, step size=3.26e-01, acc. prob=0.948]


[outer 005] TRAIN (EMA+K-ens) ll=0.6393  br=0.2233  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 23.95it/s, step size=3.38e-01, acc. prob=0.949]


[outer 006] TRAIN (EMA+K-ens) ll=0.6300  br=0.2188  acc=0.6990


Sample: 100%|██████████| 330/330 [00:13, 25.31it/s, step size=3.15e-01, acc. prob=0.946]


[outer 007] TRAIN (EMA+K-ens) ll=0.6377  br=0.2225  acc=0.7050


Sample: 100%|██████████| 330/330 [00:13, 24.35it/s, step size=2.31e-01, acc. prob=0.951]


[outer 008] TRAIN (EMA+K-ens) ll=0.6195  br=0.2138  acc=0.7180


Sample: 100%|██████████| 330/330 [00:12, 26.36it/s, step size=2.53e-01, acc. prob=0.959]


[outer 009] TRAIN (EMA+K-ens) ll=0.6296  br=0.2186  acc=0.7100


Sample: 100%|██████████| 330/330 [00:14, 23.45it/s, step size=3.08e-01, acc. prob=0.919]


[outer 010] TRAIN (EMA+K-ens) ll=0.6316  br=0.2196  acc=0.7020


Sample: 100%|██████████| 330/330 [00:14, 22.62it/s, step size=3.01e-01, acc. prob=0.882]


[outer 011] TRAIN (EMA+K-ens) ll=0.6350  br=0.2212  acc=0.7050


Sample: 100%|██████████| 330/330 [00:12, 25.60it/s, step size=3.08e-01, acc. prob=0.931]


[outer 012] TRAIN (EMA+K-ens) ll=0.6382  br=0.2227  acc=0.7030


Sample: 100%|██████████| 330/330 [00:13, 23.78it/s, step size=3.60e-01, acc. prob=0.951]


[outer 013] TRAIN (EMA+K-ens) ll=0.6539  br=0.2302  acc=0.7010


Sample: 100%|██████████| 330/330 [00:13, 23.60it/s, step size=2.27e-01, acc. prob=0.978]


[outer 014] TRAIN (EMA+K-ens) ll=0.6702  br=0.2381  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 24.61it/s, step size=3.18e-01, acc. prob=0.964]


[outer 015] TRAIN (EMA+K-ens) ll=0.6603  br=0.2334  acc=0.6950


Sample: 100%|██████████| 330/330 [00:12, 25.51it/s, step size=3.14e-01, acc. prob=0.923]


[outer 016] TRAIN (EMA+K-ens) ll=0.6845  br=0.2449  acc=0.6830


Sample: 100%|██████████| 330/330 [00:12, 25.98it/s, step size=3.43e-01, acc. prob=0.943]


[outer 017] TRAIN (EMA+K-ens) ll=0.6749  br=0.2403  acc=0.6730


Sample: 100%|██████████| 330/330 [00:14, 22.34it/s, step size=2.48e-01, acc. prob=0.943]


[outer 018] TRAIN (EMA+K-ens) ll=0.6718  br=0.2387  acc=0.6880


Sample: 100%|██████████| 330/330 [00:13, 24.95it/s, step size=2.71e-01, acc. prob=0.954]


[outer 019] TRAIN (EMA+K-ens) ll=0.6749  br=0.2402  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.39it/s, step size=2.39e-01, acc. prob=0.948]


[outer 020] TRAIN (EMA+K-ens) ll=0.6784  br=0.2417  acc=0.6680


Sample: 100%|██████████| 330/330 [00:14, 23.02it/s, step size=2.84e-01, acc. prob=0.932]


[outer 021] TRAIN (EMA+K-ens) ll=0.6689  br=0.2372  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.89it/s, step size=2.45e-01, acc. prob=0.960]


[outer 022] TRAIN (EMA+K-ens) ll=0.6625  br=0.2343  acc=0.6960


Sample: 100%|██████████| 330/330 [00:13, 23.64it/s, step size=2.92e-01, acc. prob=0.924]


[outer 023] TRAIN (EMA+K-ens) ll=0.6617  br=0.2339  acc=0.6960


Sample: 100%|██████████| 330/330 [00:15, 21.33it/s, step size=2.82e-01, acc. prob=0.936]


[outer 024] TRAIN (EMA+K-ens) ll=0.6508  br=0.2287  acc=0.6960


Sample: 100%|██████████| 330/330 [00:14, 22.34it/s, step size=3.06e-01, acc. prob=0.934]


[outer 025] TRAIN (EMA+K-ens) ll=0.6449  br=0.2259  acc=0.6960
[Early stop @ outer 25] Δll=0.249%, Δbr=0.387%, Δacc=0.002


Sample: 100%|██████████| 330/330 [00:14, 22.90it/s, step size=2.27e-01, acc. prob=0.968]


[outer 000] TRAIN (EMA+K-ens) ll=0.6810  br=0.2437  acc=0.6420


Sample: 100%|██████████| 330/330 [00:14, 22.39it/s, step size=2.65e-01, acc. prob=0.958]


[outer 001] TRAIN (EMA+K-ens) ll=0.7298  br=0.2657  acc=0.6170


Sample: 100%|██████████| 330/330 [00:15, 21.82it/s, step size=2.74e-01, acc. prob=0.949]


[outer 002] TRAIN (EMA+K-ens) ll=0.7145  br=0.2591  acc=0.6100


Sample: 100%|██████████| 330/330 [00:14, 22.88it/s, step size=2.55e-01, acc. prob=0.960]


[outer 003] TRAIN (EMA+K-ens) ll=0.6814  br=0.2435  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 23.35it/s, step size=3.03e-01, acc. prob=0.953]


[outer 004] TRAIN (EMA+K-ens) ll=0.6818  br=0.2436  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.34it/s, step size=2.58e-01, acc. prob=0.938]


[outer 005] TRAIN (EMA+K-ens) ll=0.6782  br=0.2419  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.63it/s, step size=2.85e-01, acc. prob=0.942]


[outer 006] TRAIN (EMA+K-ens) ll=0.6678  br=0.2370  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.07it/s, step size=3.09e-01, acc. prob=0.929]


[outer 007] TRAIN (EMA+K-ens) ll=0.6667  br=0.2364  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.44it/s, step size=2.33e-01, acc. prob=0.964]


[outer 008] TRAIN (EMA+K-ens) ll=0.6790  br=0.2424  acc=0.6830


Sample: 100%|██████████| 330/330 [00:15, 21.67it/s, step size=3.23e-01, acc. prob=0.955]


[outer 009] TRAIN (EMA+K-ens) ll=0.6688  br=0.2376  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.33it/s, step size=2.78e-01, acc. prob=0.967]


[outer 010] TRAIN (EMA+K-ens) ll=0.6674  br=0.2368  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 20.31it/s, step size=2.44e-01, acc. prob=0.960]


[outer 011] TRAIN (EMA+K-ens) ll=0.6696  br=0.2378  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.40it/s, step size=2.59e-01, acc. prob=0.950]


[outer 012] TRAIN (EMA+K-ens) ll=0.6741  br=0.2398  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 22.00it/s, step size=2.42e-01, acc. prob=0.951]


[outer 013] TRAIN (EMA+K-ens) ll=0.6768  br=0.2412  acc=0.6870


Sample: 100%|██████████| 330/330 [00:16, 20.62it/s, step size=2.66e-01, acc. prob=0.959]


[outer 014] TRAIN (EMA+K-ens) ll=0.6909  br=0.2476  acc=0.6860


Sample: 100%|██████████| 330/330 [00:15, 21.32it/s, step size=1.88e-01, acc. prob=0.967]


[outer 015] TRAIN (EMA+K-ens) ll=0.6959  br=0.2501  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.48it/s, step size=2.74e-01, acc. prob=0.946]


[outer 016] TRAIN (EMA+K-ens) ll=0.6836  br=0.2442  acc=0.6810


Sample: 100%|██████████| 330/330 [00:16, 20.10it/s, step size=2.76e-01, acc. prob=0.965]


[outer 017] TRAIN (EMA+K-ens) ll=0.6837  br=0.2442  acc=0.6670


Sample: 100%|██████████| 330/330 [00:14, 23.33it/s, step size=2.87e-01, acc. prob=0.941]


[outer 018] TRAIN (EMA+K-ens) ll=0.6965  br=0.2502  acc=0.6350


Sample: 100%|██████████| 330/330 [00:16, 20.33it/s, step size=2.60e-01, acc. prob=0.962]


[outer 019] TRAIN (EMA+K-ens) ll=0.7126  br=0.2579  acc=0.6480


Sample: 100%|██████████| 330/330 [00:13, 23.89it/s, step size=2.77e-01, acc. prob=0.952]


[outer 020] TRAIN (EMA+K-ens) ll=0.7012  br=0.2527  acc=0.6770


Sample: 100%|██████████| 330/330 [00:13, 24.36it/s, step size=2.58e-01, acc. prob=0.949]


[outer 021] TRAIN (EMA+K-ens) ll=0.6790  br=0.2423  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.26it/s, step size=3.04e-01, acc. prob=0.929]


[outer 022] TRAIN (EMA+K-ens) ll=0.6698  br=0.2379  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.18it/s, step size=3.44e-01, acc. prob=0.945]


[outer 023] TRAIN (EMA+K-ens) ll=0.6617  br=0.2340  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 23.55it/s, step size=2.98e-01, acc. prob=0.953]


[outer 024] TRAIN (EMA+K-ens) ll=0.6589  br=0.2328  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.25it/s, step size=3.07e-01, acc. prob=0.934]


[outer 025] TRAIN (EMA+K-ens) ll=0.6663  br=0.2363  acc=0.6870


Sample: 100%|██████████| 330/330 [00:15, 20.86it/s, step size=2.52e-01, acc. prob=0.961]


[outer 026] TRAIN (EMA+K-ens) ll=0.6470  br=0.2271  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.06it/s, step size=2.99e-01, acc. prob=0.926]


[outer 027] TRAIN (EMA+K-ens) ll=0.6397  br=0.2236  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.42it/s, step size=3.37e-01, acc. prob=0.933]


[outer 028] TRAIN (EMA+K-ens) ll=0.6376  br=0.2226  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.69it/s, step size=2.55e-01, acc. prob=0.974]


[outer 029] TRAIN (EMA+K-ens) ll=0.6578  br=0.2323  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.02it/s, step size=2.72e-01, acc. prob=0.962]


[outer 030] TRAIN (EMA+K-ens) ll=0.6598  br=0.2332  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.93it/s, step size=2.92e-01, acc. prob=0.940]


[outer 031] TRAIN (EMA+K-ens) ll=0.6640  br=0.2352  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.63it/s, step size=2.41e-01, acc. prob=0.952]


[outer 032] TRAIN (EMA+K-ens) ll=0.6688  br=0.2374  acc=0.6870


Sample: 100%|██████████| 330/330 [00:12, 25.84it/s, step size=3.38e-01, acc. prob=0.942]


[outer 033] TRAIN (EMA+K-ens) ll=0.6638  br=0.2349  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.21it/s, step size=2.54e-01, acc. prob=0.973]


[outer 034] TRAIN (EMA+K-ens) ll=0.6702  br=0.2380  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 23.75it/s, step size=3.38e-01, acc. prob=0.945]


[outer 035] TRAIN (EMA+K-ens) ll=0.6581  br=0.2322  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.49it/s, step size=3.55e-01, acc. prob=0.937]


[outer 036] TRAIN (EMA+K-ens) ll=0.6613  br=0.2336  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 23.83it/s, step size=2.99e-01, acc. prob=0.968]


[outer 037] TRAIN (EMA+K-ens) ll=0.6508  br=0.2286  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.71it/s, step size=2.73e-01, acc. prob=0.944]


[outer 038] TRAIN (EMA+K-ens) ll=0.6529  br=0.2297  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 22.43it/s, step size=2.28e-01, acc. prob=0.965]


[outer 039] TRAIN (EMA+K-ens) ll=0.6532  br=0.2298  acc=0.6870


Sample: 100%|██████████| 330/330 [00:13, 24.93it/s, step size=2.83e-01, acc. prob=0.943]


[outer 000] TRAIN (EMA+K-ens) ll=0.6666  br=0.2366  acc=0.6310


Sample: 100%|██████████| 330/330 [00:14, 22.09it/s, step size=2.07e-01, acc. prob=0.961]


[outer 001] TRAIN (EMA+K-ens) ll=0.6703  br=0.2384  acc=0.6830


Sample: 100%|██████████| 330/330 [00:14, 22.80it/s, step size=2.70e-01, acc. prob=0.957]


[outer 002] TRAIN (EMA+K-ens) ll=0.6905  br=0.2480  acc=0.6450


Sample: 100%|██████████| 330/330 [00:14, 23.08it/s, step size=2.38e-01, acc. prob=0.957]


[outer 003] TRAIN (EMA+K-ens) ll=0.6679  br=0.2370  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 24.29it/s, step size=3.25e-01, acc. prob=0.909]


[outer 004] TRAIN (EMA+K-ens) ll=0.6625  br=0.2344  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 23.11it/s, step size=2.92e-01, acc. prob=0.943]


[outer 005] TRAIN (EMA+K-ens) ll=0.6700  br=0.2379  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 24.28it/s, step size=2.83e-01, acc. prob=0.964]


[outer 006] TRAIN (EMA+K-ens) ll=0.6666  br=0.2363  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 22.16it/s, step size=2.39e-01, acc. prob=0.975]


[outer 007] TRAIN (EMA+K-ens) ll=0.6642  br=0.2351  acc=0.6900


Sample: 100%|██████████| 330/330 [00:15, 21.67it/s, step size=2.29e-01, acc. prob=0.971]


[outer 008] TRAIN (EMA+K-ens) ll=0.6663  br=0.2360  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 22.62it/s, step size=3.66e-01, acc. prob=0.929]


[outer 009] TRAIN (EMA+K-ens) ll=0.6673  br=0.2365  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 24.40it/s, step size=2.95e-01, acc. prob=0.955]


[outer 010] TRAIN (EMA+K-ens) ll=0.6670  br=0.2364  acc=0.6870


Sample: 100%|██████████| 330/330 [00:14, 23.57it/s, step size=3.04e-01, acc. prob=0.924]


[outer 011] TRAIN (EMA+K-ens) ll=0.6753  br=0.2401  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 23.25it/s, step size=2.93e-01, acc. prob=0.926]


[outer 012] TRAIN (EMA+K-ens) ll=0.6867  br=0.2455  acc=0.6830


Sample: 100%|██████████| 330/330 [00:15, 21.04it/s, step size=3.10e-01, acc. prob=0.931]


[outer 013] TRAIN (EMA+K-ens) ll=0.6737  br=0.2394  acc=0.6870


Sample: 100%|██████████| 330/330 [00:12, 26.26it/s, step size=2.83e-01, acc. prob=0.947]


[outer 014] TRAIN (EMA+K-ens) ll=0.6732  br=0.2391  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.90it/s, step size=2.16e-01, acc. prob=0.969]


[outer 015] TRAIN (EMA+K-ens) ll=0.6760  br=0.2404  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 23.65it/s, step size=2.72e-01, acc. prob=0.962]


[outer 016] TRAIN (EMA+K-ens) ll=0.6709  br=0.2381  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 24.76it/s, step size=2.63e-01, acc. prob=0.934]


[outer 017] TRAIN (EMA+K-ens) ll=0.6770  br=0.2409  acc=0.6900


Sample: 100%|██████████| 330/330 [00:15, 20.98it/s, step size=2.78e-01, acc. prob=0.944]


[outer 018] TRAIN (EMA+K-ens) ll=0.6707  br=0.2380  acc=0.6890


Sample: 100%|██████████| 330/330 [00:14, 22.66it/s, step size=2.53e-01, acc. prob=0.963]


[outer 019] TRAIN (EMA+K-ens) ll=0.6682  br=0.2368  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 23.21it/s, step size=2.73e-01, acc. prob=0.969]


[outer 020] TRAIN (EMA+K-ens) ll=0.6571  br=0.2316  acc=0.6890
[Early stop @ outer 20] Δll=0.158%, Δbr=0.253%, Δacc=0.001
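
The early-stop line above reports relative changes in `ll` and `br` (in percent) and an absolute change in `acc`. The stopping rule itself is not shown in this output. The logged Δll of 0.158% is much smaller than the raw step from outer 019 to outer 020 (0.6682 to 0.6571, about 1.7%), which suggests the comparison is made on smoothed or windowed values. A minimal sketch of one plausible rule follows, assuming the changes are measured between the means of two consecutive windows of outer iterations. The name `should_stop`, the window size, the minimum iteration count and the tolerances are all hypothetical.

```python
from statistics import mean

def relative_change(prev, curr):
    """Absolute relative change |curr - prev| / |prev|, as a fraction."""
    return abs(curr - prev) / max(abs(prev), 1e-12)

def should_stop(history, window=5, min_outer=20,
                tol_ll=0.005, tol_br=0.005, tol_acc=0.005):
    """Decide whether training has plateaued.

    history: list of (ll, br, acc) tuples, one per outer iteration.
    Hypothetical rule: compare the mean of the last `window` iterations
    with the mean of the `window` iterations before them, and stop only
    when all three metrics have moved by less than their tolerance.
    """
    if len(history) < max(min_outer, 2 * window):
        return False
    prev = history[-2 * window:-window]
    curr = history[-window:]
    prev_ll, prev_br, prev_acc = (mean(col) for col in zip(*prev))
    curr_ll, curr_br, curr_acc = (mean(col) for col in zip(*curr))
    return (relative_change(prev_ll, curr_ll) < tol_ll
            and relative_change(prev_br, curr_br) < tol_br
            and abs(curr_acc - prev_acc) < tol_acc)

# Toy history (not taken from the log): 20 identical iterations.
flat = [(0.66, 0.236, 0.69)] * 20
print(should_stop(flat))
```

Requiring all three metrics to settle, rather than any one of them, is a conservative design choice in this sketch; the notebook's actual rule may differ.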


Sample: 100%|██████████| 330/330 [00:14, 22.11it/s, step size=3.11e-01, acc. prob=0.945]


[outer 000] TRAIN (EMA+K-ens) ll=0.6326  br=0.2206  acc=0.6430


Sample: 100%|██████████| 330/330 [00:14, 23.57it/s, step size=2.74e-01, acc. prob=0.948]


[outer 001] TRAIN (EMA+K-ens) ll=0.6155  br=0.2119  acc=0.7460


Sample: 100%|██████████| 330/330 [00:13, 23.96it/s, step size=3.79e-01, acc. prob=0.940]


[outer 002] TRAIN (EMA+K-ens) ll=0.6227  br=0.2153  acc=0.7160


Sample: 100%|██████████| 330/330 [00:13, 23.96it/s, step size=2.49e-01, acc. prob=0.927]


[outer 003] TRAIN (EMA+K-ens) ll=0.6346  br=0.2211  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.05it/s, step size=3.12e-01, acc. prob=0.941]


[outer 004] TRAIN (EMA+K-ens) ll=0.6472  br=0.2271  acc=0.6880


Sample: 100%|██████████| 330/330 [00:14, 22.38it/s, step size=2.96e-01, acc. prob=0.931]


[outer 005] TRAIN (EMA+K-ens) ll=0.6537  br=0.2303  acc=0.6920


Sample: 100%|██████████| 330/330 [00:15, 20.99it/s, step size=2.60e-01, acc. prob=0.959]


[outer 006] TRAIN (EMA+K-ens) ll=0.6565  br=0.2316  acc=0.6980


Sample: 100%|██████████| 330/330 [00:13, 23.70it/s, step size=2.86e-01, acc. prob=0.960]


[outer 007] TRAIN (EMA+K-ens) ll=0.6549  br=0.2308  acc=0.6950


Sample: 100%|██████████| 330/330 [00:15, 21.70it/s, step size=2.75e-01, acc. prob=0.959]


[outer 008] TRAIN (EMA+K-ens) ll=0.6519  br=0.2294  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.30it/s, step size=2.49e-01, acc. prob=0.930]


[outer 009] TRAIN (EMA+K-ens) ll=0.6547  br=0.2306  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 25.02it/s, step size=3.39e-01, acc. prob=0.932]


[outer 010] TRAIN (EMA+K-ens) ll=0.6689  br=0.2373  acc=0.6940


Sample: 100%|██████████| 330/330 [00:13, 24.32it/s, step size=3.71e-01, acc. prob=0.918]


[outer 011] TRAIN (EMA+K-ens) ll=0.6590  br=0.2327  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.18it/s, step size=3.12e-01, acc. prob=0.941]


[outer 012] TRAIN (EMA+K-ens) ll=0.6610  br=0.2337  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 23.63it/s, step size=2.51e-01, acc. prob=0.971]


[outer 013] TRAIN (EMA+K-ens) ll=0.6574  br=0.2322  acc=0.6950


Sample: 100%|██████████| 330/330 [00:13, 24.89it/s, step size=3.22e-01, acc. prob=0.943]


[outer 014] TRAIN (EMA+K-ens) ll=0.6600  br=0.2333  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 23.41it/s, step size=3.15e-01, acc. prob=0.937]


[outer 015] TRAIN (EMA+K-ens) ll=0.6653  br=0.2358  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 24.51it/s, step size=2.61e-01, acc. prob=0.949]


[outer 016] TRAIN (EMA+K-ens) ll=0.6794  br=0.2424  acc=0.6930


Sample: 100%|██████████| 330/330 [00:13, 23.92it/s, step size=3.40e-01, acc. prob=0.913]


[outer 017] TRAIN (EMA+K-ens) ll=0.6816  br=0.2434  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 23.85it/s, step size=2.81e-01, acc. prob=0.973]


[outer 018] TRAIN (EMA+K-ens) ll=0.6668  br=0.2364  acc=0.6950


Sample: 100%|██████████| 330/330 [00:14, 22.50it/s, step size=2.49e-01, acc. prob=0.962]


[outer 019] TRAIN (EMA+K-ens) ll=0.6722  br=0.2388  acc=0.6840


Sample: 100%|██████████| 330/330 [00:13, 23.77it/s, step size=2.21e-01, acc. prob=0.961]


[outer 020] TRAIN (EMA+K-ens) ll=0.6664  br=0.2360  acc=0.6940


Sample: 100%|██████████| 330/330 [00:14, 22.60it/s, step size=3.11e-01, acc. prob=0.944]


[outer 021] TRAIN (EMA+K-ens) ll=0.6687  br=0.2371  acc=0.6910


Sample: 100%|██████████| 330/330 [00:12, 25.59it/s, step size=3.48e-01, acc. prob=0.944]


[outer 022] TRAIN (EMA+K-ens) ll=0.6639  br=0.2349  acc=0.6920


Sample: 100%|██████████| 330/330 [00:12, 25.42it/s, step size=3.16e-01, acc. prob=0.925]


[outer 023] TRAIN (EMA+K-ens) ll=0.6623  br=0.2341  acc=0.6860


Sample: 100%|██████████| 330/330 [00:14, 22.84it/s, step size=2.74e-01, acc. prob=0.967]


[outer 024] TRAIN (EMA+K-ens) ll=0.6642  br=0.2351  acc=0.6550


Sample: 100%|██████████| 330/330 [00:14, 23.52it/s, step size=2.68e-01, acc. prob=0.980]


[outer 025] TRAIN (EMA+K-ens) ll=0.6685  br=0.2371  acc=0.6530


Sample: 100%|██████████| 330/330 [00:14, 22.63it/s, step size=2.41e-01, acc. prob=0.948]


[outer 026] TRAIN (EMA+K-ens) ll=0.6699  br=0.2376  acc=0.6350


Sample: 100%|██████████| 330/330 [00:14, 22.33it/s, step size=2.79e-01, acc. prob=0.965]


[outer 027] TRAIN (EMA+K-ens) ll=0.6785  br=0.2413  acc=0.6350


Sample: 100%|██████████| 330/330 [00:14, 23.14it/s, step size=2.33e-01, acc. prob=0.970]


[outer 028] TRAIN (EMA+K-ens) ll=0.6708  br=0.2378  acc=0.6420


Sample: 100%|██████████| 330/330 [00:12, 26.95it/s, step size=2.70e-01, acc. prob=0.953]


[outer 029] TRAIN (EMA+K-ens) ll=0.6703  br=0.2375  acc=0.6650


Sample: 100%|██████████| 330/330 [00:15, 21.75it/s, step size=2.34e-01, acc. prob=0.939]


[outer 030] TRAIN (EMA+K-ens) ll=0.6638  br=0.2344  acc=0.6740


Sample: 100%|██████████| 330/330 [00:15, 21.84it/s, step size=3.28e-01, acc. prob=0.971]


[outer 031] TRAIN (EMA+K-ens) ll=0.6736  br=0.2390  acc=0.6630


Sample: 100%|██████████| 330/330 [00:13, 24.06it/s, step size=2.95e-01, acc. prob=0.936]


[outer 032] TRAIN (EMA+K-ens) ll=0.6648  br=0.2349  acc=0.6900


Sample: 100%|██████████| 330/330 [00:14, 22.89it/s, step size=2.47e-01, acc. prob=0.952]


[outer 033] TRAIN (EMA+K-ens) ll=0.6664  br=0.2356  acc=0.6790


Sample: 100%|██████████| 330/330 [00:14, 22.19it/s, step size=2.88e-01, acc. prob=0.946]


[outer 034] TRAIN (EMA+K-ens) ll=0.6686  br=0.2367  acc=0.6930


Sample: 100%|██████████| 330/330 [00:12, 25.93it/s, step size=2.74e-01, acc. prob=0.952]


[outer 035] TRAIN (EMA+K-ens) ll=0.6592  br=0.2325  acc=0.6800


Sample: 100%|██████████| 330/330 [00:13, 23.58it/s, step size=2.55e-01, acc. prob=0.963]


[outer 036] TRAIN (EMA+K-ens) ll=0.6726  br=0.2387  acc=0.6850


Sample: 100%|██████████| 330/330 [00:13, 25.25it/s, step size=3.36e-01, acc. prob=0.896]


[outer 037] TRAIN (EMA+K-ens) ll=0.6660  br=0.2356  acc=0.6910


Sample: 100%|██████████| 330/330 [00:14, 23.44it/s, step size=2.88e-01, acc. prob=0.948]


[outer 038] TRAIN (EMA+K-ens) ll=0.6755  br=0.2400  acc=0.6900


Sample: 100%|██████████| 330/330 [00:13, 24.05it/s, step size=2.96e-01, acc. prob=0.922]

[outer 039] TRAIN (EMA+K-ens) ll=0.6647  br=0.2348  acc=0.6950
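
The summary table printed next labels two of its rows `<lambda>`. That is the label pandas assigns when anonymous functions are passed to `agg`, so the output does not say which statistics those rows hold. A minimal sketch of one way to get readable row labels follows, assuming the two rows are a lower and an upper quantile. The 0.25 and 0.75 levels, the names `q25` and `q75`, and the toy data are hypothetical and not taken from the notebook.

```python
import pandas as pd

def q25(series):
    """Lower quartile; the 0.25 level is an assumption for illustration."""
    return series.quantile(0.25)

def q75(series):
    """Upper quartile; the 0.75 level is an assumption for illustration."""
    return series.quantile(0.75)

# Toy per-run results, standing in for the notebook's real results frame.
results = pd.DataFrame({
    "accuracy": [0.5, 0.6, 0.7, 0.8],
    "brier": [0.30, 0.25, 0.20, 0.15],
    "logloss": [0.9, 0.7, 0.6, 0.5],
})

# Named functions appear in the index under their own names,
# so the summary rows are self-describing.
summary = results.agg(["mean", "std", "median", q25, q75])
print(summary)
```

With named functions, the last two rows of the summary are labelled `q25` and `q75` instead of `<lambda>`.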
          accuracy     brier   logloss
mean      0.623495  0.258142  0.750645
std       0.037033  0.015872  0.080097
median    0.623880  0.254017  0.711209
<lambda>  0.582665  0.246189  0.692860
<lambda>  0.667300  0.272875  0.831450
    accuracy     brier   logloss
0    0.58252  0.284985  0.867139
1    0.57718  0.278085  0.853441
2    0.58110  0.270300  0.841767
3    0.58540  0.273499  0.874698
4    0.58310  0.283482  0.882864
5    0.57050  0.275668  0.806867
6    0.58218  0.277108  0.835483
7    0.57440  0.279853  0.893974
8    0.57846  0.271002  0.819351
9    0.57362  0.280378  0.892929
10   0.62388  0.243343  0.682571
11   0.62388  0.246363  0.689104
12   0.62348  0.262089  0.726670
13   0.62388  0.261240  0.725931
14   0.62406  0.262250  0.733998
15   0.62372  0.244719  0.684333
16   0.62388  0.254721  0.708690
17   0.62542  0.254283  0.710906
18   0.62432  0.249182  0.695819
19   0.62204  0.251912  0.704302
20   0.668


