
Only a fraction of models show up in modeltime_resamples... #18

Closed
Steviey opened this issue Jan 13, 2022 · 1 comment

Steviey commented Jan 13, 2022

Ubuntu 16.x LTS, R latest, modeltime.ensemble latest

My submodels_tbl contains 15 correctly fitted models. When I pass them to modeltime_fit_resamples(), only a fraction of them (4) show up in the result of that function. Is there an explanation for this?

resamples_tscv <- df_train %>%
    time_series_cv(
        assess      = test_len,
        initial     = train_len,
        # skip      = "2 years",
        slice_limit = dplyr::n()
    )

submodel_predictions <- submodels_tbl %>%
    modeltime_fit_resamples(
        resamples = resamples_tscv,
        control   = control_resamples(verbose = TRUE)
    )
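For anyone debugging the same symptom, a minimal sketch of how the missing models could be spotted (toy stand-in data; the column names .model_id / .resample_results follow the modeltime.resample output shape, everything else here is hypothetical):

```r
# Toy stand-in for the fitted resamples table: models whose
# .resample_results came back NULL silently disappear downstream.
fitted_tbl <- data.frame(.model_id = 1:3)
fitted_tbl$.resample_results <- list(
  data.frame(.pred = c(1.1, 0.9)),   # model 1: resamples fitted
  NULL,                              # model 2: failed, dropped
  data.frame(.pred = c(2.0, 2.1))    # model 3: resamples fitted
)

# Flag the model ids that would be missing after unnesting
dropped <- fitted_tbl$.model_id[
  vapply(fitted_tbl$.resample_results, is.null, logical(1))
]
print(dropped)  # → 2
```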

debugAnalyse <- 1
if (debugAnalyse > 0) {
    # Visualize the resample sets
    myPlot <- resamples_tscv %>%
        tk_time_series_cv_plan() %>%
        plot_time_series_cv_plan(
            date, value,
            .facet_ncol  = 2,
            .interactive = TRUE
        )
    print(myPlot)

    myPlot <- submodel_predictions %>%
        plot_modeltime_resamples(.interactive = TRUE)
    print(myPlot)

    # View(submodel_predictions)

    predictions_tbl <- modeltime.resample::unnest_modeltime_resamples(submodel_predictions)
    View(predictions_tbl)

    predictions_tbl$editDate       <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
    predictions_tbl$pslMetaLearner <- metaLearner
    fastInOut('predictions_tbl.Rds', predictions_tbl)

    predictions_by_rowid_tbl <- predictions_tbl %>%
        dplyr::select(.row_id, .model_id, .pred) %>%
        dplyr::mutate(.model_id = stringr::str_c(".model_id_", .model_id)) %>%
        tidyr::pivot_wider(names_from = .model_id, values_from = .pred)

    View(predictions_by_rowid_tbl)
}
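The pivot step at the end can be illustrated on toy data with plain base R (stats::reshape standing in for tidyr::pivot_wider; data values are made up):

```r
# Toy long-format predictions: one row per (.row_id, .model_id) pair
long <- data.frame(
  .row_id   = c(1, 1, 2, 2),
  .model_id = c(".model_id_1", ".model_id_2", ".model_id_1", ".model_id_2"),
  .pred     = c(0.5, 0.6, 1.5, 1.4)
)

# One column per model, one row per .row_id
wide <- reshape(long, idvar = ".row_id", timevar = ".model_id",
                direction = "wide")
print(wide)
```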

Side note: When I use fewer than the 15 models, the code breaks while fitting a glmnet meta-learner. It prompts:

 x Slice1: preprocessor 1/1, model 1/1: Error: For the glmnet engine, `penalty` 
 must be a single number (or a value of `tune()`).

... even though the model is correctly tagged with penalty = tune::tune(). I noticed the same effect with lasso (mixture = 1).

My guess is that the setting gets lost somewhere in modeltime.ensemble's internal code. Currently I'm testing different meta-learners. The xgboost meta-learner seems to work only without xgboost submodels. Others work fine so far.

It would be nice to have a fallback/try-catch option in modeltime.resample. Otherwise the code breaks in huge projects any time something fails at this point.
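The fallback being asked for could look roughly like this (base-R sketch with hypothetical names; modeltime.resample exposes no such option, this only illustrates the idea):

```r
# Wrap each model's resample fit so one failing model does not
# abort the whole run; failures become NULL placeholders.
fit_one_safely <- function(fit_fun, model_id) {
  tryCatch(
    fit_fun(),
    error = function(e) {
      message("model ", model_id, " failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Toy demonstration: model 2 errors, models 1 and 3 succeed
results <- lapply(1:3, function(i) {
  fit_one_safely(function() {
    if (i == 2) stop("`penalty` must be a single number") else i * 10
  }, i)
})
print(vapply(results, is.null, logical(1)))  # → FALSE TRUE FALSE
```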


Steviey commented Jan 23, 2022

Could it be related to different lengths of .resample_results per model/workflow?

I noticed that running this:

predictions_tbl <- modeltime.resample::unnest_modeltime_resamples(m750_training_resamples_fitted)
View(predictions_tbl)

... results in a clickable, ready-to-drill-down view in RStudio.

Comparing with that, my own result seems to be corrupted and is not clickable. The difference lies in the differing lengths of .resample_results.

[screenshot: tibble with differing .resample_results lengths per model]

What if we were able to truncate the predictions to the minimum length of .resample_results across models?
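As a sketch of that idea (toy data, base R; in practice the trimming would have to happen on the nested .resample_results column):

```r
# Per-model prediction vectors of unequal length
preds <- list(model_1 = c(1, 2, 3, 4),
              model_2 = c(5, 6, 7),
              model_3 = c(8, 9, 10, 11))

# Truncate every model to the shortest common length
min_len <- min(lengths(preds))
trimmed <- lapply(preds, function(p) p[seq_len(min_len)])
print(lengths(trimmed))  # → 3 3 3
```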

Update: There is more... .predictions = NULL in a glmnet model:

[screenshot: glmnet model row with .predictions = NULL]

Every failing model is lost. Since we have no influence on how models are treated inside modeltime_fit_resamples(), there is no way to fix it other than hardcoding. The underlying error is:

Error in if (is.numeric(args$mixture) && (args$mixture < 0 | args$mixture > : missing value where TRUE/FALSE needed

Steviey closed this as completed Jan 23, 2022