Saving best metrics based on Custom metrics failing (WARNING:tensorflow:Can save best model only with CUSTOM METRICS available, skipping) #53155

ravinderkhatri · 2021-11-22T06:23:00Z

I have defined a callback that runs at the end of each epoch and calculates custom metrics. It computes the desired metrics correctly. The class is shown below for reference:

class Metrics(tf.keras.callbacks.Callback):
    def __init__(self, train_tf_data, val_tf_data, model, CLASSES, logs={}, **kwargs):
        super().__init__(**kwargs)
        self.train_tf_data = train_tf_data
        self.val_tf_data = val_tf_data
        self.model = model
        self.CLASSES = CLASSES
        # for train data
        self.train_f1_after_epoch = 0
        self.train_prec_after_epoch = 0
        self.train_recall_after_epoch = 0
        # for val data
        self.val_f1_after_epoch = 0
        self.val_prec_after_epoch = 0
        self.val_recall_after_epoch = 0

    def on_train_begin(self, logs={}):
        self.train_reports = None
        self.val_reports = None
        self.val_f1_after_epoch = 0

    def on_epoch_end(self, epoch, logs={}):
        # for train data
        self.train_reports = test_model(model=self.model, data=self.train_tf_data, 
                                        CLASSES=self.CLASSES)
        self.train_f1_after_epoch = self.train_reports['f1_score']
        self.train_recall_after_epoch = self.train_reports['recall']
        self.train_prec_after_epoch = self.train_reports['precision']

        # for val data
        self.val_reports = test_model(model=self.model, data=self.val_tf_data, 
                                      CLASSES=self.CLASSES)
        self.val_f1_after_epoch = self.val_reports['f1_score']
        self.val_recall_after_epoch = self.val_reports['recall']
        self.val_prec_after_epoch = self.val_reports['precision']

        # store train results in the logs dict
        logs['train_f1_after_epoch'] = self.train_f1_after_epoch
        logs['train_precision_after_epoch'] = self.train_prec_after_epoch
        logs['train_recall_after_epoch'] = self.train_recall_after_epoch

        # store val results in the logs dict
        logs['val_f1_after_epoch'] = self.val_f1_after_epoch
        logs['val_precision_after_epoch'] = self.val_prec_after_epoch
        logs['val_recall_after_epoch'] = self.val_recall_after_epoch

        print('train_reports_after_epoch', self.train_reports)
        print('val_reports_after_epoch', self.val_reports)

** .....Some model code .....**

I then use it together with the other metrics and a checkpoint callback:

m1 = tf.keras.metrics.CategoricalAccuracy()
m2 = tf.keras.metrics.Recall()
m3 = tf.keras.metrics.Precision()
m4 = Metrics(train_tf_data=train_data, 
             val_tf_data=test_data, model=model, 
             CLASSES=CLASS_NAMES)
optimizers = [
    tfa.optimizers.AdamW(learning_rate=lr * 0.001, weight_decay=wd),
    tfa.optimizers.AdamW(learning_rate=lr, weight_decay=wd),
]
optimizers_and_layers = [(optimizers[0], model.layers[0]), (optimizers[1], model.layers[1:])]

optimizer = tfa.optimizers.MultiOptimizer(optimizers_and_layers)

model.compile(
    optimizer=optimizer,
    loss='categorical_crossentropy',
    metrics=[m1, m2, m3],
)

checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path, 
                                                    monitor = 'val_f1_after_epoch',
                                                    save_best_only=True,
                                                    save_weights_only=True,
                                                    mode='max',
                                                    save_freq='epoch',
                                                    verbose=1)
                                                    
checkpoint_cb._supports_tf_logs = False

The problem is that training emits the following warning and no checkpoint is saved:

WARNING:tensorflow:Can save best model only with val_f1_after_epoch available, skipping.
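For readers unfamiliar with where this message comes from: a checkpoint callback with save_best_only looks up the monitored key in the logs dict it receives at epoch end, and skips saving when the key is absent. The sketch below is a simplified pure-Python stand-in for that logic, not the Keras implementation; the class name `BestOnlySaver` and the sample values are hypothetical.

```python
import logging

logger = logging.getLogger("checkpoint_sketch")


class BestOnlySaver:
    """Simplified stand-in for save-best-only checkpointing (not the Keras class)."""

    def __init__(self, monitor, mode="max"):
        self.monitor = monitor
        self.mode = mode
        # Start from the worst possible value for the chosen mode.
        self.best = float("-inf") if mode == "max" else float("inf")
        self.saved_epochs = []

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        current = logs.get(self.monitor)
        if current is None:
            # The monitored key is not in the logs this callback received.
            logger.warning(
                "Can save best model only with %s available, skipping.", self.monitor
            )
            return
        improved = current > self.best if self.mode == "max" else current < self.best
        if improved:
            self.best = current
            self.saved_epochs.append(epoch)  # a real callback would write weights here


saver = BestOnlySaver(monitor="val_f1_after_epoch", mode="max")
saver.on_epoch_end(0, {"val_loss": 0.9})             # key missing: warns and skips
saver.on_epoch_end(1, {"val_f1_after_epoch": 0.61})  # first value seen: saved
saver.on_epoch_end(2, {"val_f1_after_epoch": 0.58})  # worse than best: not saved
saver.on_epoch_end(3, {"val_f1_after_epoch": 0.70})  # better than best: saved
print(saver.saved_epochs)
```

The point of the sketch is that the warning only says the key was missing from the dict the checkpoint callback saw at that moment; it says nothing about whether the key was added later in the same epoch.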

However, when I inspect the training history, the metric is present:

print(list(history.history.keys()))
['loss',
'categorical_accuracy',
'recall',
'precision',
'val_loss',
'val_categorical_accuracy',
'val_recall',
'val_precision',
'train_f1_after_epoch',
'train_precision_after_epoch',
'train_recall_after_epoch',
'val_f1_after_epoch',  # this is the monitored metric
'val_precision_after_epoch',
'val_recall_after_epoch']

I think there is a bug in ModelCheckpoint: it does not see the custom metric and therefore does not save the model.
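One possible explanation, which cannot be confirmed here because the model.fit call and the order of the callbacks list are not shown, is callback ordering: callbacks run in list order, so a checkpoint callback placed before the metrics callback would look for the key before it is written, while the history recorder (which, as far as I understand, Keras appends last) would still record it. The pure-Python sketch below only simulates that dispatch order. All class names are hypothetical, the metric value is a placeholder, and the separate `_supports_tf_logs` mechanism is ignored.

```python
class F1Logger:
    """Hypothetical metrics callback: writes a custom key into the logs dict."""

    def on_epoch_end(self, epoch, logs):
        logs["val_f1_after_epoch"] = 0.5 + 0.1 * epoch  # placeholder value


class KeyProbe:
    """Stand-in for a checkpoint callback: records whether the key was visible."""

    def __init__(self, monitor):
        self.monitor = monitor
        self.seen = []

    def on_epoch_end(self, epoch, logs):
        self.seen.append(self.monitor in logs)


class HistoryRecorder:
    """Stand-in for a history callback that collects every key it sees."""

    def __init__(self):
        self.history = {}

    def on_epoch_end(self, epoch, logs):
        for key, value in logs.items():
            self.history.setdefault(key, []).append(value)


def run_epochs(callbacks, n_epochs=2):
    """Call each callback in list order, with the history recorder always last."""
    history = HistoryRecorder()
    for epoch in range(n_epochs):
        logs = {"loss": 1.0 / (epoch + 1)}  # fresh logs dict per epoch
        for callback in callbacks + [history]:
            callback.on_epoch_end(epoch, logs)
    return history.history


# Checkpoint stand-in BEFORE the metrics callback: the key is never visible to it.
probe_first = KeyProbe("val_f1_after_epoch")
history_a = run_epochs([probe_first, F1Logger()])

# Checkpoint stand-in AFTER the metrics callback: the key is visible every epoch.
probe_last = KeyProbe("val_f1_after_epoch")
history_b = run_epochs([F1Logger(), probe_last])

print(probe_first.seen, "val_f1_after_epoch" in history_a)
print(probe_last.seen, "val_f1_after_epoch" in history_b)
```

In both runs the history contains the custom key, which matches the symptom described above; only the second ordering lets the checkpoint stand-in see it. If this applies, listing the metrics callback before the checkpoint callback in the callbacks argument would be the thing to try.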

I am using TensorFlow 2.7 (I also tried TensorFlow 2.5).


tilakrayal · 2021-11-22T12:43:08Z

@ravinderkhatri ,
Please take a look at links 1 and 2, which describe a similar error. They may help. Thanks.

tilakrayal · 2021-11-22T12:44:02Z

Please post this issue on the keras-team/keras repo.
For more information, see:
https://discuss.tensorflow.org/t/keras-project-moved-to-new-repository-in-https-github-com-keras-team-keras/1999

ravinderkhatri · 2021-11-22T15:35:43Z

@tilakrayal I have tried all the approaches mentioned in the links, and none of them works. I have opened a ticket in keras-team/keras, although I thought the problem was related to TensorFlow. The link is below for reference:
keras-team/keras#15684

tilakrayal · 2021-11-23T02:58:00Z

@ravinderkhatri ,
Please feel free to close this issue, as it is now being tracked in the Keras repo.

google-ml-butler · 2021-11-30T03:30:16Z

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler · 2021-12-04T14:25:28Z

Are you satisfied with the resolution of your issue?
Yes
No

ravinderkhatri added the type:bug Bug label Nov 22, 2021

google-ml-butler bot assigned tilakrayal Nov 22, 2021

tilakrayal added TF 2.7 Issues related to TF 2.7.0 comp:keras Keras related issues labels Nov 22, 2021

tilakrayal added the stat:awaiting response Status - Awaiting response from author label Nov 22, 2021

tilakrayal added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting response Status - Awaiting response from author labels Nov 23, 2021

google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Nov 30, 2021

ravinderkhatri closed this as completed Dec 4, 2021
