Taking too much time to run #317
Comments
Hi @Vinitkumar89, thank you for the report. How many features does your dataset have? Are there categorical features, or are all features numerical?
Hi @brendalf. Sorry for the late reply.
@shankarpandala, can you help me here? Do you know which model lazypredict is stuck on (step 26/43)?
We can see which model is running by setting verbose>1. I have already removed some models from the list manually, but there are still some that take a long time. My long-term plan is to group the algorithms by time complexity and let users choose which complexity tier they want.
I would like to work on this; I might have an idea on how to improve the speed. How do I contribute?
@shankarpandala - I also face the same issue. It has been stuck at 74% for more than 5 hours. My dataset is also small: only 5,900 rows and 70 features. Could 70 features be the culprit? I haven't done feature engineering/selection yet; I just passed the train and test sets as-is to see how the models do. Can you help me, please? Is there any way to fix this issue? I can sponsor the fix with 50 USD.
I've had a similar issue to the one described by @SSMK-wq. My LazyRegressor got stuck at 74% too, and I had left it running for over 2 hours. My dataset is around 8,000 rows by 150 columns, filled with binary 0/1 independent variables predicting a continuous target. I use lazypredict as an initial screen and have enjoyed its user-friendly, low-code workflow. @shankarpandala, it would be great if you could add a timeout threshold parameter to LazyRegressor() so that, when the threshold is exceeded, the algorithm skips to the next model. This would save a lot of time and avoid waiting on a model you probably wouldn't use anyway. Thanks a lot for all your work. Top stuff!
There is already a way to skip models by specifying the list of algorithms explicitly. Time-based skipping doesn't work on Windows, so I didn't implement it.
Maybe some algorithm is taking a long time to train. |
@shankarpandala - how do we specify the list of algorithms that we want to try? Is there any syntax you can share? I am not able to find anything in the documentation. Can you help, please?
Hello, @Vinitkumar89, maybe you are facing the same issue as me. In my case I am using OneHotEncoder for the categorical data, but when I fit the data with LazyRegressor it shows a warning regarding unknown categories (there are some categories in the test dataset that are not present in the training dataset). @shankarpandala, could you please help if there is any way to avoid this issue? Thanks, everyone, for this interesting subject.
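For the unknown-categories warning specifically, scikit-learn's `OneHotEncoder` has a `handle_unknown="ignore"` option: categories seen only at transform time are encoded as all-zero rows instead of raising an error. A minimal sketch with toy color data (the data itself is illustrative):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X_train = np.array([["red"], ["blue"]])
X_test = np.array([["green"]])  # "green" never appears in the training data

# handle_unknown="ignore" encodes unseen categories as all-zero rows
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(X_train)
encoded = encoder.transform(X_test).toarray()
print(encoded)  # row of zeros: the unknown category carries no signal
```

Fitting the encoder on the training split only (and then transforming both splits) also avoids the leakage that comes from fitting on test data.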
Hello all, I hope you're doing well :) Thanks for your help.
@SSMK-wq It seems that you can specify it either with a string ("all") or with a list of classifier classes (probably estimator classes from scikit-learn).
Here's some code to include only the regressors that are in a "chosen_regressors" list.
I had this issue with GaussianProcessRegressor (see lazypredict/lazypredict/Supervised.py, line 77 at commit aad245d). Gaussian process regression scales cubically with the number of training rows, so it can dominate the total runtime even on mid-sized datasets.
@dchecks I have the same issue, and using your method, I specified all the models except
@Lramos505 According to the codebase, LGBMRegressor is already included. Also, you can probably adapt this Stack Overflow answer if you still have issues: https://stackoverflow.com/a/76557962/6712832
Just for completeness, here is the code for the classification algorithms. Also, from my experience, SVC takes too long on problems with real data, so it's better to drop it from the classifiers that LazyClassifier tries: `from sklearn.utils import all_estimators removed_classifiers = [`
Description
I tried to run the lazypredict regressor on a Black Friday sales training dataset and it gets stuck at 60-63%. The dataset has 55,000 rows.
What I Did