Problem: Strange loss and metric progression for focal loss.
When training with focal loss, the loss actually increases and the MCC metric gets worse over iterations when I watch it in TensorBoard.
With log loss everything works as expected, but with focal loss it looks off. I would expect an alpha of 0.5 and a gamma close to zero to give results similar to log loss, since focal loss with gamma = 0 reduces to an alpha-weighted log loss.
If I use the learning rate found by the log-loss model, the problem persists.
If I reduce the learning rate to a very low value and increase gamma, it works.
With a low gamma and a low learning rate, the problem is back.
Focal loss seems to need a different learning-rate range than log loss.
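As a sanity check on the gamma-close-to-zero claim, here is a small NumPy sketch of the textbook binary focal loss (Lin et al.); I am assuming CatBoost's implementation matches this formula:

import numpy as np

def focal_loss(p, y, alpha=0.5, gamma=0.0):
    # Standard binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)
    p_t = np.where(y == 1, p, 1 - p)
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return float(-(alpha_t * (1 - p_t) ** gamma * np.log(p_t)).mean())

def log_loss(p, y):
    p_t = np.where(y == 1, p, 1 - p)
    return float(-np.log(p_t).mean())

p = np.array([0.9, 0.4, 0.7, 0.2])
y = np.array([1, 0, 1, 1])
# With alpha = 0.5 and gamma = 0, focal loss is exactly 0.5 * log loss,
# so the optimum (and a sensible learning-rate range) should be the same.
print(focal_loss(p, y, alpha=0.5, gamma=0.0), 0.5 * log_loss(p, y))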
catboost version: 1.2.2
Here is some code to reproduce this finding:
from itertools import product
import numpy as np
from catboost import CatBoostClassifier, Pool, cv
from sklearn.metrics import roc_auc_score
from sklearn.datasets import make_blobs
from matplotlib import pyplot
from pandas import DataFrame
# Generate a synthetic two-class classification dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=1000)
# Split the dataset into train and test sets
train_size = int(0.8 * len(X))
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]
# Create CatBoost Pools (labels are needed on the test Pool to use it as eval_set)
train_pool = Pool(X_train, y_train)
test_pool = Pool(X_test, y_test)

# Train a model using log loss
loss = "Logloss"
model_log_loss = CatBoostClassifier(loss_function=loss, eval_metric="MCC", verbose=0)
model_log_loss.fit(train_pool, eval_set=test_pool)
# Now train a model using focal loss
# with the learning rate from the log-loss model.
# Alpha and gamma were set to make it close to log loss.
loss = "Focal:focal_alpha=0.5;focal_gamma=0.0001"
model_focal = CatBoostClassifier(learning_rate=model_log_loss.learning_rate_, loss_function=loss, eval_metric="MCC", verbose=0)
model_focal.fit(train_pool, eval_set=test_pool)
# Also train a model with a very low learning rate and a higher gamma; this seems to fix it,
# but focal loss seems to need a much lower learning rate than log loss.
loss = "Focal:focal_alpha=0.5;focal_gamma=3"
model_focal_low_learning = CatBoostClassifier(learning_rate=0.00001, loss_function=loss, eval_metric="MCC", verbose=0)
model_focal_low_learning.fit(train_pool, eval_set=test_pool)
# Also train a model with a very low learning rate and a low gamma; then the problem shows up again.
loss = "Focal:focal_alpha=0.5;focal_gamma=0.00001"
model_focal_low_lr_low_gamma = CatBoostClassifier(learning_rate=0.00001, loss_function=loss, eval_metric="MCC", verbose=0)
model_focal_low_lr_low_gamma.fit(train_pool, eval_set=test_pool)
I am also experiencing a similar problem. With an increased focal gamma (focal_gamma=3), the learning rate has to be reduced to prevent the MCC metric from degrading across iterations.
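To map out this interaction, here is a hypothetical sweep over (gamma, learning rate) pairs, reusing train_pool and test_pool from the snippet above; the grid values are illustrative, not from the original report:

# Hypothetical sweep to see where the MCC degradation starts.
for gamma, lr in product([0.0001, 1.0, 3.0], [1e-5, 1e-3, 1e-1]):
    loss = f"Focal:focal_alpha=0.5;focal_gamma={gamma}"
    model = CatBoostClassifier(loss_function=loss, eval_metric="MCC",
                               learning_rate=lr, iterations=200, verbose=0)
    model.fit(train_pool, eval_set=test_pool)
    # get_best_score() reports the best eval_metric value seen on the eval set
    print(f"gamma={gamma}, lr={lr}: best MCC =",
          model.get_best_score()["validation"]["MCC"])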