Step 1: Create a simple baseline model
Before tuning a model, we need a simple baseline model to compare against. Please refer to this post on how you can do that. Below, we use a simple reduced regression model, which uses linear regression to model the monthly airline dataset. A complete notebook example for tuning can be found here.
#### What are the model hyper-parameters? ----
model
The model uses a degree of 1 to detrend the dataset before fitting. By default, it also uses a seasonal_period of 1 and a window length of 10, which is used to extract lagged features (i.e. it uses up to 10 lags from the past).
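To make the role of the window length concrete, here is a minimal sketch of how lagged features can be extracted from a series so that a plain regressor can be fit on them. The function name `make_lagged_features` and the toy values are assumptions for illustration only; this is not pycaret's or sktime's internal implementation.

```python
def make_lagged_features(series, window_length):
    """Build (X, y) pairs where each X row holds the previous
    `window_length` observations and y is the value that follows them."""
    if window_length < 1:
        raise ValueError("window_length must be at least 1")
    X, y = [], []
    for t in range(window_length, len(series)):
        X.append(list(series[t - window_length:t]))  # lags t-window_length .. t-1
        y.append(series[t])                          # value to predict
    return X, y


# Toy series (made-up values, not the airline data)
series = [10, 12, 13, 15, 18, 21, 25]
X, y = make_lagged_features(series, window_length=3)
print(X[0], "->", y[0])
print(len(X), "training rows")
```

A larger window length gives the regressor more history per row but leaves fewer rows to train on, which is one reason it is worth tuning.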
Step 2: Tune the model
Random Grid Search
By default, pycaret performs a random grid search using a predefined internal grid. This default internal grid uses sensible defaults (such as seasonal_period) obtained during the setup.
#### Tune the model (Default = Random Grid Search) ----
tuned_model_random = exp.tune_model(model)
exp.plot_model(tuned_model_random)
#### What are the tuned model hyper-parameters? ----
tuned_model_random
As we can see, the tuned model performs much better than the original model (e.g. the MAE is reduced substantially). The tuned model uses a degree of 2 to detrend the data, a seasonal period of 12 (which seems reasonable for monthly airline data), and a window length (number of lags) of 23.
You might be curious about the search space that was used. This can be obtained easily using the models() call with internal=True. The column Tune Distributions provides the default random grid used for tuning. In this case, six (6) hyper-parameters were tuned in a random manner. Note that:
- The limits of the search space are chosen carefully based on the time series characteristics. For example:
  - The seasonal period sp is chosen from 12 and 24. This is based on the seasonality of 12 that is detected during the setup process. You may wonder why a seasonality of 24 is also included in the search space. This is because, in some cases, a seasonal period that is a multiple (harmonic) of the dominant seasonal period can lead to better modeling results.
  - The same rationale goes into choosing the search space for window_length (the number of lags used for modeling).
- The default number of searches is 10 hyper-parameter combinations.
#### OK, so what search space was used? ----
random_grid = exp.models(internal=True).loc["lr_cds_dt", "Tune Distributions"]
random_grid
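The idea of a random grid search can be sketched in plain Python as follows. The search space below (its names, ranges, and the detected seasonal period of 12) is a hypothetical stand-in and does not reproduce the actual contents of the Tune Distributions column; only the budget of 10 combinations and the choice between 12 and 24 for sp come from the description above.

```python
import random

detected_sp = 12  # assumption: the seasonal period detected during setup

# Hypothetical search space for illustration only
search_space = {
    "sp": [detected_sp, 2 * detected_sp],   # dominant period and its multiple
    "degree": [1, 2, 3],                    # detrending polynomial degree
    "window_length": list(range(12, 25)),   # number of lags
}


def sample_random_grid(space, n_iter, seed=0):
    """Draw `n_iter` hyper-parameter combinations independently at random."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_iter)
    ]


candidates = sample_random_grid(search_space, n_iter=10)
for params in candidates:
    print(params)
```

Each candidate would then be fitted and scored with cross-validation, and the best-scoring combination kept. The fixed seed only makes the sketch reproducible.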
Fixed Grid Search
Alternatively, instead of performing a random grid search, one may choose to tune hyper-parameters using a "fixed" grid search. This can be done in pycaret using the search_algorithm argument.
#### Tune the model (Fixed Grid Search) ----
tuned_model_grid = exp.tune_model(model, search_algorithm="grid")
exp.plot_model(tuned_model_grid)
Again, you might be interested in learning about the search space used for this tuning. This can be obtained easily using the models() call with internal=True. The column Tune Grid provides the default fixed grid used for tuning.
#### What search space was used? ----
fixed_grid = exp.models(internal=True).loc["lr_cds_dt", "Tune Grid"]
fixed_grid
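For comparison, a fixed grid search enumerates every combination of the listed values. The sketch below uses a small hypothetical grid (the names and values are assumptions, not the actual Tune Grid contents) to show how the number of fits grows multiplicatively with each hyper-parameter.

```python
from itertools import product

# Hypothetical fixed grid for illustration only
example_grid = {
    "sp": [12, 24],
    "degree": [1, 2],
    "window_length": [12, 24],
}


def expand_grid(grid):
    """Return every combination of the grid values as a list of dicts."""
    names = list(grid)
    return [
        dict(zip(names, combo))
        for combo in product(*(grid[name] for name in names))
    ]


combos = expand_grid(example_grid)
print(len(combos))  # 2 * 2 * 2 = 8 combinations
print(combos[0])
```

With six hyper-parameters instead of three, even two values each would already require 64 fits per cross-validation fold, which is why fixed grids tend to be kept small.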
Reasons for preferring Random Grid Search over a Fixed Grid Search
In this case too, six (6) hyper-parameters were tuned. The difference is that with a fixed grid, all combinations of the hyper-parameter values are tried (vs. sampling from a search space in a random grid search). Because this is exhaustive and can be time consuming, the fixed grid generally covers a more limited space than the random grid. As a result, a random grid search often (though not always) gives better results than a fixed grid search.
In addition, random grid search is preferred because it does not waste valuable time when a certain hyper-parameter does little to improve performance. This is explained in this wonderful video by Andrew Ng.
For these reasons, "random grid search" is the default option in the pycaret time series module.
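The point about not wasting time on an unhelpful hyper-parameter can be illustrated with a small sketch. Suppose only the first of two hyper-parameters matters (both are hypothetical and scaled to the range 0 to 1). With the same budget of 9 trials, a 3 x 3 grid tests only 3 distinct values of the important one, while random sampling tests a different value in every trial.

```python
import random
from itertools import product

budget = 9

# Fixed grid: 3 values for each of the two hyper-parameters (3 * 3 = 9 trials)
grid_values = [0.0, 0.5, 1.0]
grid_trials = list(product(grid_values, grid_values))

# Random search: 9 trials, each drawing both hyper-parameters independently
rng = random.Random(0)
random_trials = [(rng.random(), rng.random()) for _ in range(budget)]


def distinct_important_values(trials):
    """Count distinct values tried for the first (important) hyper-parameter."""
    return len({important for important, _ in trials})


print(distinct_important_values(grid_trials))    # 3 distinct values
print(distinct_important_values(random_trials))
```

The grid spends two thirds of its budget repeating values of the important hyper-parameter while varying only the unhelpful one.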