set fill value to max of full dataset + 1 #36

jonathanburns · 2020-03-23T00:02:39Z

Hello, I'm somewhat new to this repo 👋 .

It looks like if the global max is not in the current cross_validation split, using X[train_indices] would lead to different fill values in different CV splits. My impression is that this is undesired, but correct me if I am wrong.

Similarly, all_nan_columns should be the same across all CV splits, I think?
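
To illustrate the concern, here is a minimal sketch rather than the repository's actual code. It assumes the fill value is the NaN-ignoring maximum of the whole matrix plus one, and that `all_nan_columns` is a boolean mask over columns. The helper name `fill_value_and_all_nan_columns`, the toy matrix, and the split indices are hypothetical.

```python
import numpy as np


def fill_value_and_all_nan_columns(X):
    """Hypothetical helper: return (max of non-NaN entries + 1, mask of all-NaN columns)."""
    all_nan_columns = np.all(np.isnan(X), axis=0)
    fill_value = np.nanmax(X) + 1  # NaN-ignoring max over the whole matrix
    return fill_value, all_nan_columns


# Toy data: column 2 is NaN everywhere, column 1 is NaN only in rows 0 and 2.
X = np.array([
    [1.0, np.nan, np.nan],
    [2.0, 5.0, np.nan],
    [9.0, np.nan, np.nan],
    [3.0, 4.0, np.nan],
])

# Two made-up sets of train indices standing in for different CV splits.
splits = [np.array([0, 2]), np.array([1, 3])]

# Computing on X[train_indices] gives split-dependent results.
per_split = [fill_value_and_all_nan_columns(X[idx]) for idx in splits]
per_split_fill = [float(fill) for fill, _ in per_split]
per_split_masks = [mask.tolist() for _, mask in per_split]

# Computing once on the full dataset gives one value shared by every split.
global_fill, global_all_nan = fill_value_and_all_nan_columns(X)

print(per_split_fill)            # differs between splits
print(per_split_masks)           # column 1 looks all-NaN only in the first split
print(global_fill, global_all_nan.tolist())
```

In this toy case the first split contains the global maximum and the second does not, so the per-split fill values disagree, and column 1 would be treated as all-NaN in one split only. Computing both quantities on the full `X` removes that dependence on the split.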

LMZimmer · 2020-03-23T08:38:49Z

Hi there,
I believe the idea was that if NaNs occur for a feature in the validation set but not in the train set, we can still train on the train set with that feature (in particular for .refit). However, this might throw errors when trying to validate, so I also think this is cleaner.

set fill value to max of full dataset + 1

cbcdd79

jonathanburns force-pushed the nanmax_X branch from 907279a to cbcdd79 · March 23, 2020 01:29

LMZimmer merged commit d603692 into automl:master Mar 23, 2020
