[BUG]: use_holdout=True causes automl function to fail (PyCaret 3.0-rc) #2924

Ali-Flt · 2022-08-29T23:50:55Z

pycaret version checks

I have checked that this issue has not already been reported here.
I have confirmed this bug exists on the latest version of pycaret.
I have confirmed this bug exists on the master branch of pycaret (pip install -U git+https://github.com/pycaret/pycaret.git@master).

Issue Description

The automl function raises an error when called with the use_holdout=True argument.

Reproducible Example

best_model = automl(optimize = 'mse', use_holdout=True)

Expected Behavior

automl should return the best model.

Actual Results

---------------------------------------------------------------------------
NotFittedError                            Traceback (most recent call last)
File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:5337, in _SupervisedExperiment.automl(self, optimize, use_holdout, turbo, return_train_score)
   5336 try:
-> 5337     self.predict_model(model, verbose=False)  # type: ignore
   5338 except:

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/regression/oop.py:2181, in RegressionExperiment.predict_model(self, estimator, data, drift_report, round, verbose)
   2129 """
   2130 This function predicts ``Label`` using a trained model. When ``data`` is
   2131 None, it predicts label on the holdout set.
   (...)
   2178 
   2179 """
-> 2181 return super().predict_model(
   2182     estimator=estimator,
   2183     data=data,
   2184     probability_threshold=None,
   2185     encoded_labels=False,
   2186     drift_report=drift_report,
   2187     round=round,
   2188     verbose=verbose,
   2189 )

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:4999, in _SupervisedExperiment.predict_model(self, estimator, data, probability_threshold, encoded_labels, raw_score, drift_report, round, verbose, ml_usecase, preprocess)
   4997     estimator = get_estimator_from_meta_estimator(estimator)
-> 4999 pred = np.nan_to_num(estimator.predict(X_test_))
   5001 try:

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/sklearn/linear_model/_base.py:386, in LinearModel.predict(self, X)
    373 """
    374 Predict using the linear model.
    375 
   (...)
    384     Returns predicted values.
    385 """
--> 386 return self._decision_function(X)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/sklearn/linear_model/_base.py:367, in LinearModel._decision_function(self, X)
    366 def _decision_function(self, X):
--> 367     check_is_fitted(self)
    369     X = self._validate_data(X, accept_sparse=["csr", "csc", "coo"], reset=False)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/sklearn/utils/validation.py:1345, in check_is_fitted(estimator, attributes, msg, all_or_any)
   1344 if not fitted:
-> 1345     raise NotFittedError(msg % {"name": type(estimator).__name__})

NotFittedError: This LassoLars instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
Input In [31], in <cell line: 1>()
----> 1 best_model = automl(optimize = 'mse', use_holdout=True)
      2 evaluate_model(best_model)
      3 predict_model(best_model)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/utils.py:922, in check_if_global_is_not_none.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    920     if globals_d[name] is None:
    921         raise ValueError(message)
--> 922 return func(*args, **kwargs)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/regression/functional.py:2225, in automl(optimize, use_holdout, turbo, return_train_score)
   2169 @check_if_global_is_not_none(globals(), _CURRENT_EXPERIMENT_DECORATOR_DICT)
   2170 def automl(
   2171     optimize: str = "R2",
   (...)
   2174     return_train_score: bool = False,
   2175 ) -> Any:
   2177     """
   2178     This function returns the best model out of all trained models in
   2179     current session based on the ``optimize`` parameter. Metrics
   (...)
   2222 
   2223     """
-> 2225     return _CURRENT_EXPERIMENT.automl(
   2226         optimize=optimize,
   2227         use_holdout=use_holdout,
   2228         turbo=turbo,
   2229         return_train_score=return_train_score,
   2230     )

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/regression/oop.py:2512, in RegressionExperiment.automl(self, optimize, use_holdout, turbo, return_train_score)
   2456 def automl(
   2457     self,
   2458     optimize: str = "Accuracy",
   (...)
   2461     return_train_score: bool = False,
   2462 ) -> Any:
   2464     """
   2465     This function returns the best model out of all trained models in
   2466     current session based on the ``optimize`` parameter. Metrics
   (...)
   2509 
   2510     """
-> 2512     return super().automl(
   2513         optimize=optimize,
   2514         use_holdout=use_holdout,
   2515         turbo=turbo,
   2516         return_train_score=return_train_score,
   2517     )

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:5351, in _SupervisedExperiment.automl(self, optimize, use_holdout, turbo, return_train_score)
   5339     self.logger.warning(
   5340         f"Model {model} is not fitted, running create_model"
   5341     )
   5342     model, _ = self._create_model(  # type: ignore
   5343         estimator=model,
   5344         system=False,
   (...)
   5349         return_train_score=return_train_score,
   5350     )
-> 5351     self.pull(pop=True)
   5352     self.predict_model(model, verbose=False)  # type: ignore
   5354 p = self.pull(pop=True)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/pycaret_experiment.py:447, in _PyCaretExperiment.pull(self, pop)
    431 def pull(self, pop=False) -> pd.DataFrame:  # added in pycaret==2.2.0
    432     """
    433     Returns the latest displayed table.
    434 
   (...)
    445 
    446     """
--> 447     return self.display_container.pop(-1) if pop else self.display_container[-1]

IndexError: pop from empty list
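
The final frame of the traceback can be reproduced in isolation. The sketch below is a hypothetical stand-in, not PyCaret code: the class name `DisplayContainerDemo` and the `safe_pull` helper are made up for illustration, and only the body of `pull` is copied from the `_PyCaretExperiment.pull` line shown in the traceback. It shows why `pull(pop=True)` fails when nothing has been added to `display_container`. The guard in `safe_pull` is an assumption about one way to avoid the crash; it is not the change that was merged upstream.

```python
class DisplayContainerDemo:
    """Hypothetical stand-in for the experiment's display table storage."""

    def __init__(self):
        # Starts empty, as it would if no table had been displayed yet.
        self.display_container = []

    def pull(self, pop=False):
        # Same expression as the `pull` body shown in the traceback.
        return self.display_container.pop(-1) if pop else self.display_container[-1]

    def safe_pull(self, pop=False):
        # Illustrative guard (an assumption, not PyCaret's fix):
        # return None instead of raising when there is nothing to pull.
        if not self.display_container:
            return None
        return self.pull(pop=pop)


demo = DisplayContainerDemo()

try:
    demo.pull(pop=True)
except IndexError as exc:
    # Popping from an empty list raises the same IndexError seen in the report.
    error_message = str(exc)

print(error_message)
```

With an empty container, `pull(pop=True)` raises the `IndexError` from the report, while the guarded variant returns `None`.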

Installed Versions

System: python: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:58:50) [GCC 10.3.0] executable: /home/ali/anaconda3/envs/pycaret_env/bin/python machine: Linux-5.4.0-122-generic-x86_64-with-glibc2.31

PyCaret required dependencies:
pip: 22.2.2
setuptools: 60.10.0
pycaret: 3.0.0.rc3
IPython: 8.4.0
ipywidgets: 8.0.1
tqdm: 4.64.0
numpy: 1.21.6
pandas: 1.4.3
jinja2: 3.1.2
scipy: 1.8.1
joblib: 1.1.0
sklearn: 1.1.2
pyod: Installed but version unavailable
imblearn: 0.9.1
category_encoders: 2.5.0
lightgbm: 3.3.2
numba: 0.55.2
requests: 2.28.1
matplotlib: 3.6.0rc2
scikitplot: 0.3.7
yellowbrick: 1.5
plotly: 5.10.0
kaleido: 0.2.1
statsmodels: 0.13.2
sktime: 0.11.4
tbats: Installed but version unavailable
pmdarima: 2.0.1
psutil: 5.9.1

ngupta23 · 2022-11-05T12:50:07Z

5th Nov 2022 meeting:

@moezali1 reproduced the error: automl works without use_holdout but raises the error with use_holdout=True.

from pycaret.datasets import get_data
data = get_data('juice')

from pycaret.classification import *
s = setup(data, target = 'Purchase', session_id = 123)

best = compare_models()

automl(use_holdout = True)

Ali-Flt · 2022-11-05T15:56:26Z

Actually, if you call get_leaderboard() before each call to automl(), use_holdout=True works too. It appears to me that the bug is related to automl() popping the best model out of the models' list each time it's called, so the next time you call it the model is no longer there and you get an error unless you've called get_leaderboard() to recreate the list.
Note that these are all assumptions; I have not looked at the code.
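
A minimal sketch of that workaround, assuming an experiment object that exposes `get_leaderboard()` and `automl(optimize=..., use_holdout=...)` the way the `RegressionExperiment` in the traceback does. The wrapper name `automl_with_holdout` is hypothetical, and whether the extra call is actually needed depends on the untested explanation above.

```python
from typing import Any


def automl_with_holdout(experiment: Any, optimize: str = "R2") -> Any:
    """Call get_leaderboard() first, then automl(use_holdout=True).

    `experiment` is assumed to provide `get_leaderboard()` and
    `automl(optimize=..., use_holdout=...)`; this wrapper is only a
    workaround sketch and not part of PyCaret.
    """
    # Per the comment above, this call is reported to make the
    # following automl() call succeed on the affected version.
    experiment.get_leaderboard()
    return experiment.automl(optimize=optimize, use_holdout=True)
```

With the functional API the same order applies: call `get_leaderboard()` and then `automl(optimize='mse', use_holdout=True)` in the same session.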

Ali-Flt added the bug label Aug 29, 2022

ngupta23 added this to the pycaret 3.0.0rc5 milestone Oct 9, 2022

ngupta23 added the automl label Oct 9, 2022

ngupta23 assigned Yard1 Nov 5, 2022

Yard1 mentioned this issue Nov 14, 2022

Fix automl with use_holdout (#2924) #3086

Merged

Yard1 closed this as completed in #3086 Nov 15, 2022

Yard1 added a commit that referenced this issue Nov 15, 2022

Merge pull request #3086 from pycaret/fix_automl

7d645c7

Fix `automl` with `use_holdout` (#2924)

github-actions bot locked as resolved and limited conversation to collaborators Dec 16, 2022
