saving OptimalBinning & some other issues #77
Comments
I searched for this issue with joblib, and it seems to be related to multiprocessing. Are you trying to run several OptimalBinning instances in parallel?
Ah, no, just a for loop; I don't apply any multiprocessing. The same problem persists even for a single variable. I am doing this on a Windows PC. By the way, is it possible to get the bin rules to apply to new datasets?
Not directly, but you can retrieve the split points for each variable and implement your own transform method.
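A minimal sketch of such a do-it-yourself transform, assuming the fitted object exposes its split points (recent OptimalBinning versions store them in the `splits` attribute; the helper below is made up for illustration, not the library's API):

```python
import numpy as np

def transform_to_labels(x, splits):
    """Map values to interval labels built from split points.

    splits: increasing interior cut points, e.g. the splits attribute of a
    fitted OptimalBinning object (attribute name may vary by version;
    treat this as a sketch, not the library's own transform).
    """
    edges = np.concatenate(([-np.inf], splits, [np.inf]))
    idx = np.digitize(x, splits)  # bin index for each value
    return [f"({edges[i]:g}, {edges[i + 1]:g}]" for i in idx]

labels = transform_to_labels(np.array([10.0, 42.0]),
                             np.array([20.0, 35.0, 50.0]))
# labels == ["(-inf, 20]", "(35, 50]"]
```

The same `np.digitize` call is enough when you only need bin indices rather than interval labels.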
I will add new functionality to create a binning_process/scorecard from a set of OptimalBinning objects. That way, it will be possible to transform and create a scorecard table for large datasets while keeping only the data x and target y in memory. Does that sound reasonable?
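Until that functionality lands, the per-variable idea can be sketched in plain numpy (`splits_by_var` and `transform_column` are hypothetical names, not part of optbinning): keep only the split points for each variable and bin one column at a time, so the full set of fitted objects never has to sit in memory next to the whole dataset.

```python
import numpy as np

# Hypothetical store: only the split points per variable survive,
# not the full fitted objects.
splits_by_var = {
    "age":    np.array([25.0, 40.0]),
    "income": np.array([30000.0, 60000.0, 90000.0]),
}

def transform_column(values, splits):
    # Bin one column at a time to keep peak memory low.
    return np.digitize(values, splits)

data = {
    "age":    np.array([22.0, 30.0, 55.0]),
    "income": np.array([10000.0, 65000.0, 95000.0]),
}
binned = {var: transform_column(vals, splits_by_var[var])
          for var, vals in data.items()}
# binned["age"] -> array([0, 1, 2]); binned["income"] -> array([0, 2, 3])
```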
That would be perfect. |
Perfect, it might take a few days. I will keep you informed. |
@guillermo-navas-palencia I am having the same issue when exporting the object and look forward to the new support. However, why does it run without any issue for you?
Hi @naenumtou, are you using Windows, Linux or Mac? I only tested it on Linux.
…dle large data files on disk. Issue #77
@guillermo-navas-palencia I am using it on Google Colab.
I can reproduce the error on Google Colab. However, it runs without any issue on Linux and Windows with Anaconda Python 3.7 and 3.8. I would recommend running it locally and seeing whether the error persists. I do not know why it does not work on Google Colab; I do not use it regularly.
Hi, is there a .whl for this package? I tried to install from a .whl. Edit: I think I found it, but is there an official link?
Hi, I found the source of the error. The logger in all OptBinning classes cannot be pickled, which makes pickling fail on Google Colab. It will be fixed in the next release. In addition, all optimal binning classes will expose a save method to write the object to a pickle file automatically.
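For anyone hitting this on an older release, the usual workaround for unpicklable members (locks, loggers holding handlers) is sketched below with a toy class; this is not optbinning's actual implementation:

```python
import pickle
import threading

class Binner:
    """Toy class illustrating the pickling problem and the common fix."""

    def __init__(self):
        self._lock = threading.RLock()   # an RLock cannot be pickled
        self.splits = [20.0, 35.0]

    def __getstate__(self):
        # Drop the unpicklable attribute before pickling.
        state = self.__dict__.copy()
        state.pop("_lock", None)
        return state

    def __setstate__(self, state):
        # Recreate the dropped attribute when unpickling.
        self.__dict__.update(state)
        self._lock = threading.RLock()

restored = pickle.loads(pickle.dumps(Binner()))
```

The same `__getstate__`/`__setstate__` pattern applies when a logger with open file handles is stored on the instance.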
Hi @similang @naenumtou, Release 0.9.1 is ready. It has been tested on Google Colab, installed from the wheel (https://pypi.org/project/optbinning/#files). Please update OptBinning and reopen this issue if you encounter any problem.
@guillermo-navas-palencia I have tried to install via the wheel in my Colab environment. It fails to install because the library requires Python >= 3.7, but Colab runs 3.6.9. Is there any way to fix this?
Hi @naenumtou, when I tested it on Colab I created a wheel after manually changing the >= 3.7 requirement in setup.py. Python 3.7 is required due to some dependencies of CVXPY (the SCS solver), but apparently it might work with Python 3.6 in some environments. I would recommend updating Colab's Python to 3.7 if possible (Python 3.6 is from 2016); otherwise, try to run it locally. This might help: https://stackoverflow.com/questions/63867581/install-python-3-7-via-google-colab-as-default-python
Hey! I'm still getting the same error. I'm running Python 3.8 with optbinning 0.12.0.
Hi,
I am dealing with a big dataset, so the Scorecard module can't be used on my PC. I resort to running OptimalBinning on each variable:
Code
from optbinning import OptimalBinning
import joblib

optb = OptimalBinning(name=variable, dtype="numerical")
optb.fit(x, y)
joblib.dump(optb, output + 'txt.pkl')
Error
TypeError: can't pickle _thread.RLock objects
I would love to be able to save each fitted OptimalBinning object and apply it to new data later. Any thoughts on the above are much appreciated.