
How can I get both test accuracy and validation accuracy for each epoch #2548

Closed
philokey opened this issue Apr 28, 2016 · 13 comments

@philokey

Hi, everyone.

I can use model.evaluate() to calculate the test accuracy for the last epoch, but how can I get both test accuracy and validation for each epoch?

@joelthchao
Contributor

You can train the model for one epoch at a time and evaluate it after each epoch, repeating this in a for loop. However, testing in each epoch is not what we should do when training a model: the test set should only be used once, after training is finished.
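That scheme might look like the sketch below (my own illustration, not joelthchao's exact code; the tiny model and random data are placeholders, and the imports assume tf.keras):

```python
import numpy as np
from tensorflow import keras

# Placeholder data and model so the sketch runs on its own.
X_train, Y_train = np.random.rand(32, 10), np.random.rand(32, 1)
X_val, Y_val = np.random.rand(8, 10), np.random.rand(8, 1)
X_test, Y_test = np.random.rand(8, 10), np.random.rand(8, 1)

model = keras.Sequential([keras.Input(shape=(10,)), keras.layers.Dense(1)])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

for epoch in range(3):
    # One epoch of training, then an explicit evaluation on the test set.
    model.fit(X_train, Y_train, validation_data=(X_val, Y_val),
              epochs=1, verbose=0)
    test_loss, test_mae = model.evaluate(X_test, Y_test, verbose=0)
    print('epoch %d: test_loss=%.4f, test_mae=%.4f'
          % (epoch + 1, test_loss, test_mae))
```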

@joelthchao
Contributor

Well, I wrote a callback for this purpose.

from keras.callbacks import Callback

class TestCallback(Callback):
    def __init__(self, test_data):
        self.test_data = test_data

    def on_epoch_end(self, epoch, logs=None):
        x, y = self.test_data
        loss, acc = self.model.evaluate(x, y, verbose=0)
        print('\nTesting loss: {}, acc: {}\n'.format(loss, acc))

Then, you can call fit with callbacks

model.fit(X_train, Y_train, validation_data=(X_val, Y_val), 
          callbacks=[TestCallback((X_test, Y_test))])

However, since there is no on_evaluate_end in the Callback class, it actually does the testing before the built-in validation, which results in a bad output format Q_Q.

Epoch 1/2
 9984/10000 [============================>.] - ETA: 0s - loss: 0.6496 - acc: 0.7921
Testing loss: 0.2133, acc: 0.9319
10000/10000 [==============================] - 8s - loss: 0.6491 - acc: 0.7922 - val_loss: 0.2133 - val_acc: 0.9319
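A related variant (my own sketch, not part of the thread's code): instead of printing, write the test metrics into the `logs` dict in `on_epoch_end`, so they end up in the `History` object next to `loss`/`val_loss` and are picked up by logging callbacks such as `CSVLogger`. The tiny model and random data are placeholders, and the imports assume tf.keras:

```python
import numpy as np
from tensorflow import keras

class TestMetricsCallback(keras.callbacks.Callback):
    """Evaluate on held-out data each epoch and merge results into `logs`."""
    def __init__(self, test_data):
        super().__init__()
        self.test_data = test_data

    def on_epoch_end(self, epoch, logs=None):
        x, y = self.test_data
        results = self.model.evaluate(x, y, verbose=0)
        if not isinstance(results, list):  # single metric -> scalar
            results = [results]
        if logs is not None:
            for name, value in zip(self.model.metrics_names, results):
                logs['test_' + name] = value

# Placeholder data and model to show the callback in action.
X, Y = np.random.rand(32, 10), np.random.rand(32, 1)
X_test, Y_test = np.random.rand(8, 10), np.random.rand(8, 1)
model = keras.Sequential([keras.Input(shape=(10,)), keras.layers.Dense(1)])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
history = model.fit(X, Y, epochs=2, verbose=0,
                    callbacks=[TestMetricsCallback((X_test, Y_test))])
print(sorted(history.history))  # includes 'test_loss'
```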

@carlthome
Contributor

Having an on_evaluate_end would be great and would totally work for me on this one #2521

@zbyte64

zbyte64 commented Dec 2, 2016

evaluate now only returns the loss and not the accuracy, rendering the above workaround broken :(

Edit: but passing metrics=['accuracy'] to compile works fine for me

@patyork
Contributor

patyork commented Dec 2, 2016

evaluate() will return the list of metrics that the model was compiled with. So if your compile metrics list includes 'accuracy', then this should still work.

E.g.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

tmp = Sequential()
tmp.add(Dense(20, input_shape=(10,)))

tmp.compile(optimizer='adadelta', loss='mse', metrics=['mse', 'accuracy'])
tmp.evaluate(np.asarray([np.zeros((10,))]), np.asarray([np.zeros((20,))]))

yields

1/1 [==============================] - 0s
[0.0, 0.0, 1.0]

..which is [loss, mse, accuracy] as a list.
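On newer versions of tf.keras (2.2 or later, if I recall correctly), `evaluate()` also accepts `return_dict=True`, which avoids having to remember the ordering of that list. A minimal sketch, with the same zero inputs as above:

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.Input(shape=(10,)), keras.layers.Dense(20)])
model.compile(optimizer='adadelta', loss='mse', metrics=['mse'])

# Zero inputs through a zero-bias Dense layer give zero outputs,
# so the MSE against zero targets is exactly 0.0.
results = model.evaluate(np.zeros((1, 10)), np.zeros((1, 20)),
                         verbose=0, return_dict=True)
print(results)  # e.g. {'loss': 0.0, 'mse': 0.0}
```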

@piotrbazan

model.fit returns a History object whose history attribute contains the training and validation accuracy and loss:

{'acc': [0.9843952109499714],
 'loss': [0.050826362343496051],
 'val_acc': [0.98403786838658314],
 'val_loss': [0.0502210383056177]
}
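To make this concrete (my sketch; the tiny model and random data are placeholders, and the imports assume tf.keras), each key in the history dict holds one value per epoch:

```python
import numpy as np
from tensorflow import keras

X_train, Y_train = np.random.rand(32, 10), np.random.rand(32, 1)
X_val, Y_val = np.random.rand(8, 10), np.random.rand(8, 1)

model = keras.Sequential([keras.Input(shape=(10,)), keras.layers.Dense(1)])
model.compile(optimizer='adam', loss='mse')

history = model.fit(X_train, Y_train, validation_data=(X_val, Y_val),
                    epochs=3, verbose=0)
print(len(history.history['val_loss']))  # one value per epoch -> 3
```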

@Alhawawreh

Can anyone help me, please? The evaluation function in Keras gives very poor results, although the training and validation results are very good. What is the problem?

@Hisairnessag3

@manallllll This is a question better suited for Stack Overflow, but you should always post code if you want help.

@nobeldang

@joelthchao I defined my custom callback as per your code, but it shows a "too many values to unpack" error at the line
self.model.evaluate(...)

@samra-irshad

print('\nTesting loss: {}, acc: {}\n'.format(loss, acc))

Your method works; however, Keras is not logging the test results in the log file. It does that for the validation set, but not for the test set.

@KhawYewOnn

KhawYewOnn commented Sep 19, 2020

Epoch 1/2
9984/10000 [============================>.] - ETA: 0s - loss: 0.6496 - acc: 0.7921
Testing loss: 0.2133, acc: 0.9319
10000/10000 [==============================] - 8s - loss: 0.6491 - acc: 0.7922 - val_loss: 0.2133 - val_acc: 0.9319

@joelthchao is 0.9319 the testing accuracy or the validation accuracy? Notice that acc: 0.9319 is exactly the same as val_acc: 0.9319, and Testing loss: 0.2133 is the exact same value as val_loss: 0.2133. I ran the code as well, and I noticed that it always prints the same value as the validation accuracy. It seems that self.model.evaluate(x, y) is not using x and y, but the validation data instead.

francescomilano172 added a commit to ethz-asl/background_foreground_segmentation that referenced this issue Feb 6, 2021
Note: logging is still broken, but as also stated in keras-team/keras#2548 (comment), the TestCallback from keras-team/keras#2548 (comment) does not work: when the `evaluate()` method is called in an `on_epoch_end` callback, the validation dataset is always used.
@SherinBojappa

@KhawYewOnn were you able to get around the issue where self.model.evaluate(x, y) uses the validation data instead of the test data?

@NikitaShubin

NikitaShubin commented Oct 24, 2023

Well, I wrote a callback for this purpose.

class TestCallback(Callback):
    def __init__(self, test_data):
        self.test_data = test_data

    def on_epoch_end(self, epoch, logs={}):
        x, y = self.test_data
        loss, acc = self.model.evaluate(x, y, verbose=0)
        print('\nTesting loss: {}, acc: {}\n'.format(loss, acc))

Based on this, I made the following:

from keras import callbacks

class TestCallback(callbacks.Callback):
    '''
    Outputs the metrics of the test sample at the end of each epoch.
    '''
    def __init__(self, *args, **kwargs):
        super().__init__()
        self.args = args
        self.kwargs = kwargs | {'verbose': 0}  # Verbose is disabled

    def on_epoch_end(self, epoch, logs=None):
        metrics_values = self.model.evaluate(*self.args, **self.kwargs)
        print('\r\033[1mTest\033[0m', end='')  # The word "Test" is highlighted in bold
        for key, val in zip(self.model.metrics_names, metrics_values):
            print(' - test_%s: %.4f' % (key, val), end='')
        print(' ' * 1000)  # To overwrite the entire last line

Not the most beautiful solution, but more general. It looks like this:

Epoch 1/2000
Test - test_loss: 0.2984 - test_mIoU: 0.7328 - test_P: 0.9519 - test_R: 0.9516
1990/1990 [==============================] - 820s 393ms/step - loss: 0.0112 - mIoU: 0.8784 - P: 0.9953 - R: 0.9952 - val_loss: 0.1600 - val_mIoU: 0.8663 - val_P: 0.9767 - val_R: 0.9766 - lr: 0.0010
Epoch 2/2000
Test - test_loss: 0.4755 - test_mIoU: 0.6195 - test_P: 0.9208 - test_R: 0.9205
1990/1990 [==============================] - 780s 392ms/step - loss: 0.0113 - mIoU: 0.8757 - P: 0.9953 - R: 0.9951 - val_loss: 0.2160 - val_mIoU: 0.8112 - val_P: 0.9666 - val_R: 0.9665 - lr: 0.0010
Epoch 3/2000
1696/1990 [========================>.....] - ETA: 1:51 - loss: 0.0117 - mIoU: 0.8714 - P: 0.9952 - R: 0.9950

Yes, the test metrics are printed before the rest of the epoch's output, but I didn't figure out how to fix that.
