
Massive memory usage running the LazyClassifier #327

Open
qemtek opened this issue Feb 8, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@qemtek
qemtek commented Feb 8, 2021

Describe the bug
Using a dataset with 500k rows and 27 features, I ran into a huge memory issue on iteration 12/30. Screenshot included so you can see how much memory was being used.

[Screenshot attached (2021-02-08, 13:02): memory usage during the run]

Desktop (please complete the following information):

  • OS: macOS Catalina 10.15.5

Additional context
Other packages installed

awswrangler==2.4.0
pandas==1.2.1
numpy==1.20.0
scikit-learn==0.23.1
sqlalchemy==1.3.23
psycopg2-binary==2.8.6
lazypredict==0.2.7
tqdm==4.56.0
xgboost==1.3.3
lightgbm==3.1.1
pytest==6.2.2
imblearn
shap==0.38.1
matplotlib==3.3.4
ipython

@shankarpandala shankarpandala added the enhancement New feature or request label Feb 9, 2021
@apostolides
Hello,

I have the same issue using a training dataset with 125K rows. I'm training the models on Google Colaboratory with 12 GB of RAM available. The runtime crashes at 38%, reporting a huge amount of allocated memory. Did you find any workarounds for this issue?

Thanks in advance.

@felixvor
felixvor commented Mar 29, 2023

A workaround is to filter the high-memory model architectures out of the default regressor/classifier list and pass the resulting custom list to the LazyRegressor / LazyClassifier. For example:

```python
import lazypredict
from lazypredict.Supervised import LazyRegressor

# Model names that tend to exhaust memory on large datasets
highmem_regressors = [
    "GammaRegressor", "GaussianProcessRegressor", "KernelRidge", "QuantileRegressor"
]

# REGRESSORS is a list of (name, class) tuples, so filter on the name
regressors = [reg for reg in lazypredict.Supervised.REGRESSORS if reg[0] not in highmem_regressors]

reg = LazyRegressor(regressors=regressors, verbose=1, ignore_warnings=True, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)
```
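The filtering above relies on lazypredict's model registry being a list of `(name, estimator_class)` tuples, which is why `reg[0]` compares against the name. A minimal, self-contained sketch of the same pattern (using stand-in tuples rather than lazypredict's real registry) looks like this:

```python
# Stand-in for lazypredict.Supervised.REGRESSORS, which is assumed to be
# a list of (name, estimator_class) tuples.
REGRESSORS = [
    ("LinearRegression", object),
    ("GaussianProcessRegressor", object),  # high memory
    ("KernelRidge", object),               # high memory
]

highmem = {"GaussianProcessRegressor", "KernelRidge"}

# Keep only the tuples whose name is not on the high-memory list
filtered = [reg for reg in REGRESSORS if reg[0] not in highmem]

print([name for name, _ in filtered])  # ['LinearRegression']
```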

@dvijkalsi
This worked for me; I was using Google Colab with 8 GB of RAM:

```python
import lazypredict
from lazypredict.Supervised import LazyClassifier

highmem_classifiers = [
    "LabelSpreading", "LabelPropagation", "BernoulliNB", "KNeighborsClassifier",
    "ElasticNetClassifier", "GradientBoostingClassifier", "HistGradientBoostingClassifier"
]

# Remove the high-memory classifiers from the list
classifiers = [c for c in lazypredict.Supervised.CLASSIFIERS if c[0] not in highmem_classifiers]

clf = LazyClassifier(classifiers=classifiers, verbose=1, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)
model_dictionary = clf.provide_models(X_train, X_test, y_train, y_test)
models
```
