
Evaluation process and metrics #22

Closed
jjerphan opened this issue Oct 13, 2018 · 3 comments

@jjerphan (Owner)

Accuracy alone is not sufficient to evaluate models. Several other metrics can be used, mainly:

  • F1 score
  • Recall
  • Precision
  • AUC

This issue explores the evaluation process and its metrics; there are two scenarios:

  • use what Keras proposes, i.e. its own handling of metrics as well as its way of defining custom metrics (see the first sketch after this list)
    • this way, everything is delegated to model.compile and model.evaluate, which do the job
    • we may need to define more metrics
    • we will need to keep track of custom metrics somewhere, so that they can be passed via the custom_objects argument when using keras.models.load_model
  • or use a custom evaluation process based on "manual a posteriori" computation of scores from the outputs of model.predict (see the second sketch after this list)
    • this might be more complex, but it gives us more liberty and flexibility for this step
    • scikit-learn provides a bunch of ready-made metrics
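
For the first scenario, here is a minimal sketch of what the Keras-side workflow could look like, assuming a Keras 2-style binary classifier; the model architecture and the recall_metric helper are illustrative placeholders, and such a metric would still suffer from the batch-wise approximation discussed in the comment below:

```python
# Sketch of scenario 1: a custom metric handled by model.compile / model.evaluate.
# The model and recall_metric below are illustrative placeholders.
import keras.backend as K
from keras.layers import Dense
from keras.models import Sequential, load_model

def recall_metric(y_true, y_pred):
    """Batch-wise recall: true positives / (actual positives + epsilon)."""
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    return true_positives / (possible_positives + K.epsilon())

model = Sequential([Dense(1, activation="sigmoid", input_shape=(20,))])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", recall_metric])

# Custom metrics must be tracked somewhere so they can be passed back
# through custom_objects when reloading the saved model:
model.save("model.h5")
model = load_model("model.h5", custom_objects={"recall_metric": recall_metric})
```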
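And a sketch of the second scenario, scoring the outputs of model.predict a posteriori with scikit-learn; X_test, y_test, and the 0.5 decision threshold are assumptions made only to keep the snippet runnable:

```python
# Sketch of scenario 2: "manual a posteriori" scoring of model.predict outputs.
# X_test / y_test are random placeholders; reuse the model from the sketch above.
import numpy as np
from sklearn.metrics import (f1_score, precision_score, recall_score,
                             roc_auc_score)

rng = np.random.RandomState(0)
X_test = rng.rand(100, 20)
y_test = rng.randint(0, 2, size=100)

y_prob = model.predict(X_test).ravel()  # predicted probabilities
y_pred = (y_prob > 0.5).astype(int)     # hard labels for thresholded metrics

print("precision:", precision_score(y_test, y_pred))
print("recall:   ", recall_score(y_test, y_pred))
print("F1:       ", f1_score(y_test, y_pred))
print("AUC:      ", roc_auc_score(y_test, y_prob))  # AUC uses the raw scores
```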
jjerphan self-assigned this Oct 13, 2018

jjerphan commented Oct 13, 2018

Keras used to provide these metrics, but they have been removed because they were approximated batch-wise: each metric was computed on every batch and then averaged, which does not yield the true value over the whole dataset for metrics like precision and recall. For more information, see this issue.
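
A toy illustration of the problem, using plain NumPy and scikit-learn rather than Keras: the mean of per-batch recalls can differ noticeably from the recall computed over the whole dataset.

```python
# Why batch-wise metrics were removed: the mean of per-batch recall
# is not the recall of the full dataset.
import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 1])
y_pred = np.array([1, 0, 0, 0, 0, 0, 0, 1])

# Split the same data into two "batches" of four samples each.
batches = [(y_true[:4], y_pred[:4]), (y_true[4:], y_pred[4:])]
per_batch = [recall_score(t, p) for t, p in batches]

print("mean of per-batch recalls:", np.mean(per_batch))            # (0.25 + 1.0) / 2 = 0.625
print("recall on the full data:  ", recall_score(y_true, y_pred))  # 2 / 5 = 0.4
```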

Another package, keras-metrics, provides ready-to-use metrics for Keras, but there seems to be a problem with models saved via model.save(): metrics defined by this package aren't correctly serialized in the .h5 file. See this issue.

As fchollet suggests, the best option might be to use a custom workflow (hence our second scenario).

Working on it now! 🏃


jjerphan commented Oct 14, 2018

This has been done in #23.

Metrics from scikit-learn have been used for the evaluation, mainly accuracy_score, precision_score, recall_score, f1_score, and confusion_matrix. More may be added in the future.
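
For completeness, a minimal illustration of confusion_matrix with made-up binary labels (the other scorers listed above share the same (y_true, y_pred) call signature):

```python
# Minimal illustration of confusion_matrix with made-up binary labels.
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

# Rows are true classes, columns are predicted classes; for the binary
# case the layout is [[TN, FP], [FN, TP]].
print(confusion_matrix(y_true, y_pred))
# -> [[2 1]
#     [1 2]]
```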
