Using external validation set to optimize TPOT? #1045

Open
fcoppey opened this issue Apr 15, 2020 · 2 comments

fcoppey commented Apr 15, 2020

Is it possible to score pipelines on an external X_test, y_test dataset, instead of using cross-validation, when optimising a TPOT pipeline?

I'm thinking of something like what hypopt (https://pypi.org/project/hypopt/) does.

Would that be useful to anyone?

@weixuanfu
Contributor

This issue is related to issue #919. Please check a possible solution via a custom cv setting there.

TPOT does not provide this option by default because we think CV helps to avoid overfitting.
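For readers who land here, a minimal sketch of what a custom cv setting could look like. It assumes that TPOT's `cv` parameter accepts an iterable of (train indices, validation indices) pairs, as scikit-learn's cross-validation utilities do. `holdout_split` is a hypothetical helper name, not part of TPOT, and this is not necessarily the exact solution discussed in #919.

```python
def holdout_split(n_train, n_val):
    """Build a one-fold "cross-validation" iterable.

    Assumes the training rows are stacked first and the external
    validation rows are appended after them in a single array.
    """
    train_idx = list(range(n_train))                  # rows used for fitting
    val_idx = list(range(n_train, n_train + n_val))   # rows used for scoring
    return [(train_idx, val_idx)]                     # a single (train, validation) fold


# Hypothetical usage (requires numpy and tpot; shown as comments only):
# import numpy as np
# from tpot import TPOTClassifier
# X_all = np.vstack([X_train, X_val])
# y_all = np.concatenate([y_train, y_val])
# tpot = TPOTClassifier(generations=5, population_size=20,
#                       cv=holdout_split(len(X_train), len(X_val)))
# tpot.fit(X_all, y_all)

splits = holdout_split(3, 2)
print(splits)
```

With a single fold, every candidate pipeline is fitted on the training rows and scored only on the external validation rows. scikit-learn's `PredefinedSplit` is another way to express the same fixed split.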

Lamiane commented Jul 14, 2020

I ran into the same problem some time ago. The proposed solution of using a custom cv works for me, but I'd like to leave an argument for why this feature could be helpful.

Cross-validation is the default choice for small datasets. However:

  • with big datasets, a single train-validation-test split might be a better option,
  • some benchmark datasets define a train-validation-test split that has to be followed.

The lack of a ready-made way to use an external validation set made me decide not to use TPOT at all at first, and I spent quite some time looking for alternatives. I found nothing more suitable than TPOT and ended up writing a custom cv.

To wrap up: I believe this feature is useful and it's worth considering adding it to TPOT.
