How does Keras calculate the metric? #11705

Closed
Suncicie opened this issue Nov 22, 2018 · 5 comments

Suncicie commented Nov 22, 2018

I'm trying to write a new f1-macro metric for a softmax output. I have tried two methods and I can't understand why their results differ.
For the first method, I defined an f1_macro function like this:

import tensorflow as tf
from keras import backend as K

def f1_macro(y_true, y_pred):
    # turn the softmax output into one-hot predictions (multi-class with 4 classes)
    y_pred_tmp = K.argmax(y_pred, axis=-1)
    y_pred_tmp = K.one_hot(y_pred_tmp, 4)

    # per-class true positives: prediction and target are both 1
    tp = y_pred_tmp * y_true
    tp = tf.reduce_sum(tp, axis=0)

    # per-class false positives: predicted 1 where the target is 0
    fp = K.greater(y_pred_tmp, y_true)
    fp = tf.reduce_sum(tf.cast(fp, tf.float32), axis=0)

    # per-class false negatives: target 1 where the prediction is 0
    fn = K.greater(y_true, y_pred_tmp)
    fn = tf.reduce_sum(tf.cast(fn, tf.float32), axis=0)

    prec_list = []
    recall_list = []
    f1_list = []
    for i in range(0, 4):
        prec = tp[i] / (tp[i] + fp[i] + K.epsilon())
        rec = tp[i] / (tp[i] + fn[i] + K.epsilon())
        f1 = 2 * prec * rec / (prec + rec + K.epsilon())
        prec_list.append(prec)
        recall_list.append(rec)
        f1_list.append(f1)
    # unweighted mean over the 4 per-class F1 scores
    return tf.reduce_mean(f1_list)

Then I use it with model.compile(..., metrics=[f1_macro]).
For the second method, I used a Callback to calculate the f1-macro with sklearn's f1_score. The f1 calculated by the f1_macro function is always less than the one from the Callback. The Callback is defined below:

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import f1_score, precision_score, recall_score

class Metrics(Callback):
    def on_train_begin(self, logs={}):
        self.val_f1s = []
        self.val_recalls = []
        self.val_precisions = []

    def on_epoch_end(self, epoch, logs={}):
        # predict on the whole validation set, then compare class indices
        pred = np.asarray(self.model.predict(self.validation_data[0]))
        val_predict = np.argmax(pred, axis=-1)
        val_targ = np.argmax(self.validation_data[1], axis=-1)
        _val_recall = recall_score(val_targ, val_predict, average='macro')
        _val_precision = precision_score(val_targ, val_predict, average='macro')
        _val_f1 = f1_score(val_targ, val_predict, average='macro')
        self.val_recalls.append(_val_recall)
        self.val_precisions.append(_val_precision)
        self.val_f1s.append(_val_f1)
        print('— val_f1: %f — val_precision: %f — val_recall %f' % (_val_f1, _val_precision, _val_recall))
        return

and I use it like this:

metrics = Metrics()
model.fit(X_train, y_train,
          epochs=10,
          batch_size=batch_size,
          validation_data=(X_valid, y_valid),
          callbacks=[early_stopping, plateau, checkpoint, metrics],
          verbose=2)
1. In both methods, is the validation data that Keras uses to calculate the metric (X_valid, y_valid)?
2. Does sklearn's f1_score follow the same calculation principle as my f1_macro?
3. How does Keras calculate the metrics: after each batch, or after each epoch?
gabrieldemarmiesse added the type:support and type:tensorFlow labels Nov 22, 2018

PhilipMay commented Dec 1, 2018

Hi,
as far as I know, the metrics are applied to each batch and then averaged. So your f1_macro is garbage. Your callback calculates on the complete validation set and should be right. That is why f1, precision and recall were removed from Keras at some point: see #5794 for more info.

Here is the quote from #5794:

Basically these are all global metrics that were approximated batch-wise, which is more misleading than helpful. This was mentioned in the docs but it's much cleaner to remove them altogether. It was a mistake to merge them in the first place.
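
To make the quote concrete, here is a small standalone sketch (my own illustration, not Keras or mltb code): the mean of per-batch macro F1 scores generally differs from the macro F1 computed once over the whole set, because F1 does not decompose into a per-sample average.

import numpy as np
from sklearn.metrics import f1_score

rng = np.random.RandomState(0)
y_true = rng.randint(0, 4, size=1000)                      # 4-class labels
flip = rng.rand(1000) < 0.3                                # corrupt ~30% of predictions
y_pred = np.where(flip, rng.randint(0, 4, size=1000), y_true)

batch_size = 32
batch_f1 = [f1_score(y_true[i:i + batch_size],
                     y_pred[i:i + batch_size],
                     average='macro')
            for i in range(0, len(y_true), batch_size)]

print('mean of per-batch macro F1:', np.mean(batch_f1))    # batch-wise approximation
print('macro F1 over the full set:', f1_score(y_true, y_pred, average='macro'))

With small batches some classes may not even appear, which drags the per-class F1 (and therefore the batch-wise average) down; this is one reason the batch-averaged value tends to sit below what the callback reports.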

Now people tend to do these callback implementations for metrics, but I think they are ugly and feel like hacks. In your example the validation data is processed twice: once by Keras itself and once by your callback. That is why I wrote a callback that does not pick the validation data from the model (self.validation_data[0]) but gets it by parameter. Please have a look here: https://github.com/PhilipMay/mltb/blob/master/mltb/keras.py and demo code here: https://github.com/PhilipMay/mltb/blob/master/demo/keras_demo.py
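
For what it's worth, here is a minimal sketch of that "pass the data by parameter" idea (my own simplified version, not the actual mltb implementation; the class name and arguments are made up):

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import f1_score

class MacroF1Callback(Callback):
    """Computes macro F1 on data handed in explicitly, instead of
    reading self.validation_data from the model."""

    def __init__(self, x_val, y_val):
        super(MacroF1Callback, self).__init__()
        self.x_val = x_val
        self.y_val = y_val                      # one-hot encoded targets

    def on_epoch_end(self, epoch, logs=None):
        pred = np.argmax(self.model.predict(self.x_val), axis=-1)
        true = np.argmax(self.y_val, axis=-1)
        score = f1_score(true, pred, average='macro')
        if logs is not None:
            logs['val_f1_macro'] = score        # lets later callbacks see it
        print('epoch %d - val_f1_macro: %.4f' % (epoch + 1, score))

# usage, with the same (hypothetical) variables as in the question:
# model.fit(X_train, y_train, validation_data=(X_valid, y_valid),
#           callbacks=[MacroF1Callback(X_valid, y_valid)])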

PhilipMay commented

@Suncicie Did this help you?

msymp commented Jan 18, 2019

@Suncicie, can you study @PhilipMay's suggestion and let us know? Thanks.

Suncicie (Author) commented

@Suncicie Did this help you?

I appreciate your help. Now I understand why f1_macro is always smaller than the Callback value, and why this gap grows as the epochs go on. Indeed, the batch-averaged result that model.compile(..., metrics=[f1_macro]) reports each epoch is misleading: in the earlier batches the model has not yet fit the data well, so their f1-macro is of course lower than the value measured after the epoch. So f1_macro is misleading and the Callback is correct.

ymodak assigned ymodak and unassigned msymp Jan 30, 2019

ymodak commented Apr 24, 2019

Closing this issue since it has been addressed. Feel free to reopen if you have any further questions. Thanks!

ymodak closed this as completed Apr 24, 2019