
Performance Metrics & Fitness Functions #33

Closed
bartleyn opened this issue Nov 30, 2015 · 4 comments

@bartleyn
Contributor

I imagine that it would be useful to use TPOT to find models that optimize alternate performance metrics like precision, recall, etc. As such, I've come up with the following brainstorming questions:

  1. Does having alternate performance metrics/fitness functions make sense for the users?
  2. Does it make sense to add alternate metrics when reporting the best model? If so, which metric do we use as the fitness function?
  3. Since there is native support for multi-class/multi-label classification, regular precision, recall, and F1 may not be that useful. Should we just take the averaged versions of scores like these when necessary?

There are plenty of other questions, but I figured this would be a decent place to start. Let me know if I'm totally misguided in proposing this -- I won't profess to be an expert in genetic programming.

@bartleyn bartleyn changed the title Performance Metrics & Fitness Functions Performance Metrics & Fitness Functions (brainstorm) Nov 30, 2015
@rhiever
Contributor

rhiever commented Dec 2, 2015

Does having alternate performance metrics/fitness functions make sense for the users?

Yes. I think we should eventually support allowing the user to pass arbitrary scoring functions, similar to how sklearn does it.
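
For reference, a rough sketch of the sklearn-style scorer interface (this is plain sklearn, not anything TPOT exposes yet; how it would hook into TPOT is still open):

```python
from sklearn.metrics import f1_score, make_scorer

# An sklearn "scorer" is a callable with the signature
# scorer(estimator, X, y) -> float, where higher is better.
# make_scorer builds one from a plain metric function.
f1_macro_scorer = make_scorer(f1_score, average='macro')

# e.g. score = f1_macro_scorer(fitted_pipeline, X_test, y_test)
```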

Does it make sense to add alternate metrics when reporting the best model? If so, which metric do we use as the fitness function?

As in, use one scoring function for optimization, then a different scoring metric for final model selection? Interesting idea. I'm currently working on a version of TPOT that allows it to optimize on multiple criteria simultaneously, so perhaps that will help in this regard.

Since there is native support for multi-class/multi-label classification, regular precision, recall, and F1 may not be that useful. Should we just take the averaged versions of scores like these when necessary?

That's what we currently do with accuracy. I think it makes sense to do the same with other measures.
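
For concreteness, sklearn's metric functions already handle that averaging, so the same pattern would carry over (sketch, not TPOT code):

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Toy multi-class labels; averaging collapses per-class scores into one number.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

# 'macro' treats every class equally; 'weighted' weights by class support.
print(precision_score(y_true, y_pred, average='macro'))
print(recall_score(y_true, y_pred, average='weighted'))
print(f1_score(y_true, y_pred, average='macro'))
```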

@bartleyn
Contributor Author

bartleyn commented Dec 2, 2015

Yes. I think we should eventually support allowing the user to pass arbitrary scoring functions, similar to how sklearn does it.

Would passing in a keyword for the specific metric work for now?

As in, use one scoring function for optimization, then a different scoring metric for final model selection? Interesting idea. I'm currently working on a version of TPOT that allows it to optimize on multiple criteria simultaneously, so perhaps that will help in this regard.

Yeah, I was thinking that and/or adding additional metrics to the report of the final model.

That's what we currently do with accuracy. I think it makes sense to do the same with other measures.

I'm happy to contribute by adding simple support for F1/Precision/Recall in the same vein as the accuracy. I think it should be relatively straightforward.

@rhiever
Contributor

rhiever commented Dec 2, 2015

Yes. I think we should eventually support allowing the user to pass arbitrary scoring functions, similar to how sklearn does it.

Would passing in a keyword for the specific metric work for now?

At first thought, I think it'd be better/easier to simply allow the user to pass an arbitrary scoring function. Otherwise, we have to choose what scoring functions to support, write a special case for each one, etc. Not very scalable from a coding point of view. Of course, we'd have to also clarify that the user should provide a scoring function that's appropriate for their data.
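
Something along these lines is what I'm picturing -- note that the scoring_function parameter in the commented-out usage is hypothetical, purely to illustrate the callable-based interface:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Per-class recall averaged over all classes (an arbitrary user-supplied metric)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    per_class_recall = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(per_class_recall))

# Hypothetical usage -- `scoring_function` is not an existing TPOT parameter here:
# tpot = TPOT(generations=5, scoring_function=balanced_accuracy)
# tpot.fit(X_train, y_train)
```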

As in, use one scoring function for optimization, then a different scoring metric for final model selection? Interesting idea. I'm currently working on a version of TPOT that allows it to optimize on multiple criteria simultaneously, so perhaps that will help in this regard.

Yeah, I was thinking that and/or adding additional metrics to the report of the final model.

Is there a specific use case for this that you can think of?

That's what we currently do with accuracy. I think it makes sense to do the same with other measures.

I'm happy to contribute by adding simple support for F1/Precision/Recall in the same vein as the accuracy. I think it should be relatively straightforward.

Would you be interested in making an attempt at the implementation that allows the passing of arbitrary scoring functions discussed above? Perhaps we could also provide some example snippets in the docs of how to expand F1/precision/recall/etc. to support multiple classes, which can then be passed as the arbitrary scoring function.

@bartleyn
Contributor Author

bartleyn commented Dec 2, 2015

I see your point about making the effort now to support the arbitrary functions. I'll make an attempt at it and provide examples for at least F/P/R.

@bartleyn bartleyn changed the title Performance Metrics & Fitness Functions (brainstorm) Performance Metrics & Fitness Functions Dec 3, 2015
@rhiever rhiever closed this as completed Feb 8, 2016