[BUG]: use_holdout=True causes automl function to fail (PyCaret 3.0-rc) #2924

Closed
2 of 3 tasks
Ali-Flt opened this issue Aug 29, 2022 · 2 comments · Fixed by #3086
Labels: automl, bug

Comments

Ali-Flt commented Aug 29, 2022

pycaret version checks

Issue Description

The automl function raises an error when it is called with the use_holdout=True argument.

Reproducible Example

best_model = automl(optimize = 'mse', use_holdout=True)

Expected Behavior

automl should return the best model.

Actual Results

---------------------------------------------------------------------------
NotFittedError                            Traceback (most recent call last)
File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:5337, in _SupervisedExperiment.automl(self, optimize, use_holdout, turbo, return_train_score)
   5336 try:
-> 5337     self.predict_model(model, verbose=False)  # type: ignore
   5338 except:

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/regression/oop.py:2181, in RegressionExperiment.predict_model(self, estimator, data, drift_report, round, verbose)
   2129 """
   2130 This function predicts ``Label`` using a trained model. When ``data`` is
   2131 None, it predicts label on the holdout set.
   (...)
   2178 
   2179 """
-> 2181 return super().predict_model(
   2182     estimator=estimator,
   2183     data=data,
   2184     probability_threshold=None,
   2185     encoded_labels=False,
   2186     drift_report=drift_report,
   2187     round=round,
   2188     verbose=verbose,
   2189 )

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:4999, in _SupervisedExperiment.predict_model(self, estimator, data, probability_threshold, encoded_labels, raw_score, drift_report, round, verbose, ml_usecase, preprocess)
   4997     estimator = get_estimator_from_meta_estimator(estimator)
-> 4999 pred = np.nan_to_num(estimator.predict(X_test_))
   5001 try:

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/sklearn/linear_model/_base.py:386, in LinearModel.predict(self, X)
    373 """
    374 Predict using the linear model.
    375 
   (...)
    384     Returns predicted values.
    385 """
--> 386 return self._decision_function(X)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/sklearn/linear_model/_base.py:367, in LinearModel._decision_function(self, X)
    366 def _decision_function(self, X):
--> 367     check_is_fitted(self)
    369     X = self._validate_data(X, accept_sparse=["csr", "csc", "coo"], reset=False)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/sklearn/utils/validation.py:1345, in check_is_fitted(estimator, attributes, msg, all_or_any)
   1344 if not fitted:
-> 1345     raise NotFittedError(msg % {"name": type(estimator).__name__})

NotFittedError: This LassoLars instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
Input In [31], in <cell line: 1>()
----> 1 best_model = automl(optimize = 'mse', use_holdout=True)
      2 evaluate_model(best_model)
      3 predict_model(best_model)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/utils.py:922, in check_if_global_is_not_none.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    920     if globals_d[name] is None:
    921         raise ValueError(message)
--> 922 return func(*args, **kwargs)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/regression/functional.py:2225, in automl(optimize, use_holdout, turbo, return_train_score)
   2169 @check_if_global_is_not_none(globals(), _CURRENT_EXPERIMENT_DECORATOR_DICT)
   2170 def automl(
   2171     optimize: str = "R2",
   (...)
   2174     return_train_score: bool = False,
   2175 ) -> Any:
   2177     """
   2178     This function returns the best model out of all trained models in
   2179     current session based on the ``optimize`` parameter. Metrics
   (...)
   2222 
   2223     """
-> 2225     return _CURRENT_EXPERIMENT.automl(
   2226         optimize=optimize,
   2227         use_holdout=use_holdout,
   2228         turbo=turbo,
   2229         return_train_score=return_train_score,
   2230     )

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/regression/oop.py:2512, in RegressionExperiment.automl(self, optimize, use_holdout, turbo, return_train_score)
   2456 def automl(
   2457     self,
   2458     optimize: str = "Accuracy",
   (...)
   2461     return_train_score: bool = False,
   2462 ) -> Any:
   2464     """
   2465     This function returns the best model out of all trained models in
   2466     current session based on the ``optimize`` parameter. Metrics
   (...)
   2509 
   2510     """
-> 2512     return super().automl(
   2513         optimize=optimize,
   2514         use_holdout=use_holdout,
   2515         turbo=turbo,
   2516         return_train_score=return_train_score,
   2517     )

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:5351, in _SupervisedExperiment.automl(self, optimize, use_holdout, turbo, return_train_score)
   5339     self.logger.warning(
   5340         f"Model {model} is not fitted, running create_model"
   5341     )
   5342     model, _ = self._create_model(  # type: ignore
   5343         estimator=model,
   5344         system=False,
   (...)
   5349         return_train_score=return_train_score,
   5350     )
-> 5351     self.pull(pop=True)
   5352     self.predict_model(model, verbose=False)  # type: ignore
   5354 p = self.pull(pop=True)

File ~/anaconda3/envs/pycaret_env/lib/python3.9/site-packages/pycaret/internal/pycaret_experiment/pycaret_experiment.py:447, in _PyCaretExperiment.pull(self, pop)
    431 def pull(self, pop=False) -> pd.DataFrame:  # added in pycaret==2.2.0
    432     """
    433     Returns the latest displayed table.
    434 
   (...)
    445 
    446     """
--> 447     return self.display_container.pop(-1) if pop else self.display_container[-1]

IndexError: pop from empty list
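
Reading the two chained exceptions together, the sequence appears to be: predict_model raises NotFittedError, the bare except falls through to _create_model, and the following pull(pop=True) finds display_container empty. The sketch below reproduces only that last step in isolation. DisplayLog and safe_pull are hypothetical names made up for illustration, not PyCaret classes or methods, and the guard shown here is not necessarily what the fix in #3086 does.

```python
from typing import Any, List, Optional


class DisplayLog:
    """Hypothetical stand-in for an object holding a display_container list."""

    def __init__(self) -> None:
        self.display_container: List[Any] = []

    def pull(self, pop: bool = False) -> Any:
        # Same expression as the pull() line shown in the traceback.
        return self.display_container.pop(-1) if pop else self.display_container[-1]

    def safe_pull(self, pop: bool = False) -> Optional[Any]:
        # Hypothetical guard: return None when nothing has been displayed yet.
        if not self.display_container:
            return None
        return self.pull(pop=pop)


log = DisplayLog()

# Popping from an empty container raises the IndexError seen above.
try:
    log.pull(pop=True)
except IndexError as exc:
    error_message = str(exc)

# Once a table has been recorded, the guarded call returns it as usual.
log.display_container.append("metrics table")
last = log.safe_pull(pop=True)
```

The point of the sketch is only that pull(pop=True) assumes at least one table has been displayed; any code path that skips the display step before calling it will fail the same way.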

Installed Versions

System:
python: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:58:50) [GCC 10.3.0]
executable: /home/ali/anaconda3/envs/pycaret_env/bin/python
machine: Linux-5.4.0-122-generic-x86_64-with-glibc2.31

PyCaret required dependencies:
pip: 22.2.2
setuptools: 60.10.0
pycaret: 3.0.0.rc3
IPython: 8.4.0
ipywidgets: 8.0.1
tqdm: 4.64.0
numpy: 1.21.6
pandas: 1.4.3
jinja2: 3.1.2
scipy: 1.8.1
joblib: 1.1.0
sklearn: 1.1.2
pyod: Installed but version unavailable
imblearn: 0.9.1
category_encoders: 2.5.0
lightgbm: 3.3.2
numba: 0.55.2
requests: 2.28.1
matplotlib: 3.6.0rc2
scikitplot: 0.3.7
yellowbrick: 1.5
plotly: 5.10.0
kaleido: 0.2.1
statsmodels: 0.13.2
sktime: 0.11.4
tbats: Installed but version unavailable
pmdarima: 2.0.1
psutil: 5.9.1

@Ali-Flt Ali-Flt added the bug Something isn't working label Aug 29, 2022
@ngupta23 ngupta23 added this to the pycaret 3.0.0rc5 milestone Oct 9, 2022
@ngupta23 ngupta23 added the automl label Oct 9, 2022
ngupta23 (Collaborator) commented Nov 5, 2022

5th Nov 2022 meeting:

@moezali1 reproduced the error: automl works without use_holdout but fails with use_holdout=True.

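from pycaret.datasets import get_data
from pycaret.classification import setup, compare_models, automl

data = get_data('juice')
s = setup(data, target='Purchase', session_id=123)

# Train and rank the candidate models first
best = compare_models()

# Fails with use_holdout=True; works when the argument is omitted
automl(use_holdout=True)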

Ali-Flt (Author) commented Nov 5, 2022

Actually, if you call get_leaderboard() before each call to automl(), use_holdout=True works too. It appears to me that the bug is related to automl() popping the best model out of the list of models each time it's called, so the next time you call it the model is not there and you get an error, unless you've called get_leaderboard() to recreate the list.
Note that these are all assumptions; I have not looked at the code.
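
The workaround described in this comment could be wrapped in a small helper. This is a sketch under the same assumptions as the comment itself: it takes the two PyCaret functions as arguments, so it can be used with either pycaret.classification or pycaret.regression, and it refreshes the leaderboard before every automl call. The helper name automl_with_holdout is hypothetical, and whether the refresh is enough in every case has not been checked against the PyCaret source.

```python
from typing import Any, Callable


def automl_with_holdout(
    get_leaderboard: Callable[..., Any],
    automl: Callable[..., Any],
    **automl_kwargs: Any,
) -> Any:
    """Hypothetical helper: refresh the leaderboard, then run automl on the holdout set."""
    # Per the comment above, calling get_leaderboard() first avoids the error.
    get_leaderboard()
    # Forward any remaining options (for example optimize='mse') unchanged.
    return automl(use_holdout=True, **automl_kwargs)
```

In a session it would be called as automl_with_holdout(get_leaderboard, automl, optimize='mse') after importing both functions from the relevant PyCaret module and running setup and compare_models.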

Yard1 added a commit that referenced this issue Nov 15, 2022
Fix `automl` with `use_holdout` (#2924)
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 16, 2022