Is it possible to disable CV? #919

jhmenke · 2019-09-23T09:40:36Z

Context of the issue

My dataset is very small: it contains at most 380 samples and about 800 features. Naturally, this means that the data is highly noisy and that the results depend on the CV splits and also on the random_state of the model. Through another issue I learned that fitted_pipeline_ is different from the best pipeline during the search process, so it cannot be used either.

My idea therefore was to write a custom scorer which does not use the training data to score but directly uses the fixed and separate validation set to generate the score. It also keeps track of the best pipeline and saves it independently in the trained state.

There are some issues, e.g., the scorer object apparently being copied when using n_jobs>1, but since mine is an edge case that's okay; I can use n_jobs=1.
However, I would really like to deactivate cross-validation for my case, since I want the estimator to be trained on the full training data and then scored by the scorer on the full validation data. Using CV with training/validation set indices would mean my validation set is no longer independent of the training, which is what I want to avoid.

Is there a workaround to make this happen?

weixuanfu · 2019-09-23T10:28:35Z

I think you could use custom validation set(s) via the cv parameter (similar to issue #767).

Please check the cv parameter in the TPOT API. You can merge the validation set with the training set for fitting in tpot_obj.fit(X, y) and then specify the train/test splits via an iterable (see the example for GridSearchCV).
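A minimal sketch of what this could look like, assuming the training and validation data are NumPy arrays. The helper names merge_with_holdout_split and fit_tpot_with_holdout are hypothetical and only for illustration, and the generations/population_size values are placeholders. The idea is to stack both sets and pass an iterable with a single (train indices, validation indices) pair as cv, so each candidate pipeline is fit once on the training rows and scored once on the validation rows.

```python
import numpy as np


def merge_with_holdout_split(X_train, y_train, X_val, y_val):
    """Stack train and validation data and build one fixed train/validation split."""
    X_all = np.vstack([X_train, X_val])
    y_all = np.concatenate([y_train, y_val])

    n_train = len(X_train)
    train_idx = np.arange(n_train)              # rows that came from the training set
    val_idx = np.arange(n_train, len(X_all))    # rows that came from the validation set

    # An iterable holding exactly one (train, test) pair of index arrays:
    # one fit on the training rows, one score on the validation rows.
    cv = [(train_idx, val_idx)]
    return X_all, y_all, cv


def fit_tpot_with_holdout(X_train, y_train, X_val, y_val):
    """Hypothetical helper: run TPOT with the single predefined split as cv."""
    from tpot import TPOTClassifier  # imported here so the helper above needs only NumPy

    X_all, y_all, cv = merge_with_holdout_split(X_train, y_train, X_val, y_val)
    tpot_obj = TPOTClassifier(
        generations=5,        # placeholder value
        population_size=20,   # placeholder value
        cv=cv,                # the single predefined split instead of k-fold CV
        n_jobs=1,
        random_state=0,
    )
    tpot_obj.fit(X_all, y_all)
    return tpot_obj
```

Note that, as far as I understand, TPOT refits the final fitted_pipeline_ on everything passed to fit, which here includes the validation rows. If the validation set must stay untouched by the final model, the exported pipeline could be refit on the training rows only.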

jhmenke · 2019-09-23T12:58:45Z

Thank you @weixuanfu, it works well!

phihongvanvcb · 2023-08-01T16:17:18Z

@weixuanfu can you please give a detailed code example of using an iterable?

weixuanfu added the question label Sep 23, 2019

jhmenke closed this as completed Sep 23, 2019

weixuanfu mentioned this issue Apr 16, 2020

Using external validation set to optimize TPOT? #1045

Open
