How does Keras calculate the metric? #11705

Closed
Suncicie opened this issue Nov 22, 2018 · 5 comments

Suncicie commented Nov 22, 2018

I'm trying to write a new f1-macro metric for a softmax output. I have tried two methods and I can't understand why their results differ.
For the first method, I defined an f1_macro function like this:

import tensorflow as tf
from keras import backend as K

def f1_macro(y_true, y_pred):
    # turn the softmax output into one-hot predictions (multi-class with 4 classes)
    y_pred_tmp = K.argmax(y_pred, axis=-1)
    y_pred_tmp = K.one_hot(y_pred_tmp, 4)

    # per-class true positives: prediction and target are both 1
    tp = y_pred_tmp * y_true
    tp = tf.reduce_sum(tp, axis=0)

    # per-class false positives: predicted 1 where the target is 0
    fp = K.greater(y_pred_tmp, y_true)
    fp = tf.reduce_sum(tf.cast(fp, tf.float32), axis=0)

    # per-class false negatives: target 1 where the prediction is 0
    fn = K.greater(y_true, y_pred_tmp)
    fn = tf.reduce_sum(tf.cast(fn, tf.float32), axis=0)

    prec_list = []
    recall_list = []
    f1_list = []
    for i in range(0, 4):
        prec = tp[i] / (tp[i] + fp[i] + K.epsilon())
        rec = tp[i] / (tp[i] + fn[i] + K.epsilon())
        f1 = 2 * prec * rec / (prec + rec + K.epsilon())
        prec_list.append(prec)
        recall_list.append(rec)
        f1_list.append(f1)
    # unweighted mean over the 4 per-class F1 scores
    return tf.reduce_mean(f1_list)

Then I use it with model.compile(..., metrics=[f1_macro]).
For the second method, I used a Callback to calculate the f1-macro with sklearn's f1_score. The f1 calculated by the f1_macro function is always less than the one from the Callback. The Callback is defined below:

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import f1_score, precision_score, recall_score

class Metrics(Callback):
    def on_train_begin(self, logs={}):
        self.val_f1s = []
        self.val_recalls = []
        self.val_precisions = []

    def on_epoch_end(self, epoch, logs={}):
        # predict on the whole validation set, then compare class indices
        pred = np.asarray(self.model.predict(self.validation_data[0]))
        val_predict = np.argmax(pred, axis=-1)
        val_targ = np.argmax(self.validation_data[1], axis=-1)
        _val_recall = recall_score(val_targ, val_predict, average='macro')
        _val_precision = precision_score(val_targ, val_predict, average='macro')
        _val_f1 = f1_score(val_targ, val_predict, average='macro')
        self.val_recalls.append(_val_recall)
        self.val_precisions.append(_val_precision)
        self.val_f1s.append(_val_f1)
        print('— val_f1: %f — val_precision: %f — val_recall %f' % (_val_f1, _val_precision, _val_recall))
        return

and I use it like this:

metrics = Metrics()
model.fit(X_train, y_train,
          epochs=10,
          batch_size=batch_size,
          validation_data=(X_valid, y_valid),
          callbacks=[early_stopping, plateau, checkpoint, metrics],
          verbose=2)
1. In both methods, is the validation data that Keras uses to calculate the metric (X_valid, y_valid)?
2. Does sklearn's f1_score follow the same calculation principle as my f1_macro?
3. How does Keras calculate the metrics: after each batch, or after each epoch?
gabrieldemarmiesse added the type:support and type:tensorFlow labels Nov 22, 2018

PhilipMay commented Dec 1, 2018

Hi,
as far as I know, the metrics are applied to each batch and then averaged. So your f1_macro is garbage. Your callback calculates on the complete validation set and should be right. That is why f1, precision and recall were removed from Keras at some point: see #5794 for more info.

Here is the quote from #5794:

Basically these are all global metrics that were approximated batch-wise, which is more misleading than helpful. This was mentioned in the docs but it's much cleaner to remove them altogether. It was a mistake to merge them in the first place.
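
To make the quote concrete, here is a small standalone sketch (my own illustration, not Keras or mltb code): the mean of per-batch macro F1 scores generally differs from the macro F1 computed once over the whole set, because F1 does not decompose into a per-sample average.

import numpy as np
from sklearn.metrics import f1_score

rng = np.random.RandomState(0)
y_true = rng.randint(0, 4, size=1000)                      # 4-class labels
flip = rng.rand(1000) < 0.3                                # corrupt ~30% of predictions
y_pred = np.where(flip, rng.randint(0, 4, size=1000), y_true)

batch_size = 32
batch_f1 = [f1_score(y_true[i:i + batch_size],
                     y_pred[i:i + batch_size],
                     average='macro')
            for i in range(0, len(y_true), batch_size)]

print('mean of per-batch macro F1:', np.mean(batch_f1))    # batch-wise approximation
print('macro F1 over the full set:', f1_score(y_true, y_pred, average='macro'))

With small batches some classes may not even appear, which drags the per-class F1 (and therefore the batch-wise average) down; this is one reason the batch-averaged value tends to sit below what the callback reports.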

Now people tend to do these callback implementations for metrics, but I think they are ugly and feel like hacks. In your example the validation data is processed twice: once by Keras itself and once by your callback. That is why I wrote a callback that does not pick the validation data from the model (self.validation_data[0]) but gets it by parameter. Please have a look here: https://github.com/PhilipMay/mltb/blob/master/mltb/keras.py and demo code here: https://github.com/PhilipMay/mltb/blob/master/demo/keras_demo.py
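
For what it's worth, here is a minimal sketch of that "pass the data by parameter" idea (my own simplified version, not the actual mltb implementation; the class name and arguments are made up):

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import f1_score

class MacroF1Callback(Callback):
    """Computes macro F1 on data handed in explicitly, instead of
    reading self.validation_data from the model."""

    def __init__(self, x_val, y_val):
        super(MacroF1Callback, self).__init__()
        self.x_val = x_val
        self.y_val = y_val                      # one-hot encoded targets

    def on_epoch_end(self, epoch, logs=None):
        pred = np.argmax(self.model.predict(self.x_val), axis=-1)
        true = np.argmax(self.y_val, axis=-1)
        score = f1_score(true, pred, average='macro')
        if logs is not None:
            logs['val_f1_macro'] = score        # lets later callbacks see it
        print('epoch %d - val_f1_macro: %.4f' % (epoch + 1, score))

# usage, with the same (hypothetical) variables as in the question:
# model.fit(X_train, y_train, validation_data=(X_valid, y_valid),
#           callbacks=[MacroF1Callback(X_valid, y_valid)])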

PhilipMay commented

@Suncicie Did this help you?

msymp commented Jan 18, 2019

@Suncicie, can you study @PhilipMay's suggestion and let us know? Thanks.

Suncicie (Author) commented

@Suncicie Did this help you?

I appreciate your help. Now I understand why f1_macro is always smaller than the Callback value, and why this gap grows as the epochs go on. Indeed, the batch-averaged result that model.compile(..., metrics=[f1_macro]) reports each epoch is misleading: in the earlier batches the model has not yet fit the data well, so their f1-macro is of course lower than the value measured after the epoch. So f1_macro is misleading and the Callback is correct.

ymodak assigned ymodak and unassigned msymp Jan 30, 2019

ymodak commented Apr 24, 2019

Closing this issue since it has been addressed. Feel free to reopen if you have any further questions. Thanks!

ymodak closed this as completed Apr 24, 2019