set fill value to max of full dataset + 1 #36

Merged
merged 1 commit into from Mar 23, 2020


jonathanburns
Contributor

@jonathanburns jonathanburns commented Mar 23, 2020

Hello, I'm somewhat new to this repo 👋.

It looks like if the global max is not in the current cross-validation split, using X[train_indices] would lead to different fill values in different CV splits. My impression is that this is undesired, but correct me if I am wrong.
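To illustrate the concern, here is a minimal sketch, not the repository's actual code. It assumes the fill value is computed as the maximum of the non-NaN entries plus one (as the PR title suggests); the helper `fill_value_from`, the toy array, and the two splits are all hypothetical.

```python
import numpy as np


def fill_value_from(X):
    """Hypothetical helper: fill value = max over non-NaN entries + 1."""
    return np.nanmax(X) + 1


# Toy data: the global max (100.0) sits in the last row.
X = np.array([
    [1.0, np.nan],
    [2.0, 5.0],
    [np.nan, 7.0],
    [100.0, 3.0],
])

# Two hypothetical CV splits; only the first one contains the last row.
splits = [np.array([0, 1, 3]), np.array([0, 1, 2])]

# Computing the fill value on X[train_indices] depends on the split ...
per_split = [fill_value_from(X[train_indices]) for train_indices in splits]

# ... while computing it on the full dataset gives one value for all splits.
global_fill = fill_value_from(X)

print(per_split, global_fill)
```

In this toy case the two splits disagree on the fill value because only one of them sees the row holding the global max, whereas the full-dataset value is the same for every split.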

Similarly, all_nan_columns should be the same across all CV splits, I think?
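The same effect can be sketched for the all-NaN column mask. This is an assumption-laden illustration: the helper below is hypothetical and only shares its name with `all_nan_columns` from the discussion, and the data and `train_indices` are made up.

```python
import numpy as np


def all_nan_columns(X):
    """Hypothetical helper: boolean mask of columns that contain only NaN."""
    return np.all(np.isnan(X), axis=0)


# Toy data: column 1 is NaN in the first two rows but not in the third.
X = np.array([
    [1.0, np.nan],
    [2.0, np.nan],
    [3.0, 4.0],
])

# Hypothetical train split that happens to miss the third row.
train_indices = np.array([0, 1])

# On the train rows alone, column 1 looks entirely NaN.
mask_train = all_nan_columns(X[train_indices])

# On the full dataset, no column is entirely NaN.
mask_full = all_nan_columns(X)

print(mask_train, mask_full)
```

A mask computed per split could therefore drop a column in one split and keep it in another, while a mask computed on the full dataset stays fixed.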

@LMZimmer
Contributor

Hi there,
I believe the idea was that if NaNs occur for a feature in the validation set but not in the train set, we can still train on the train set with that feature (in particular for .refit). However, this might throw errors when trying to validate, so I also think this is cleaner.

@LMZimmer LMZimmer merged commit d603692 into automl:master Mar 23, 2020