Make pysurvival work with scikit-learn #15
Comments
FYI, I'm working on a solution to this issue. I expect to have something in a few days. |
I'm happy to announce that I think I have a clean solution to this issue. Please pull from master in my forked repository: https://github.com/bacalfa/pysurvival. If you installed it with setup.py, first uninstall the current version with:
Then reinstall it:
Make sure to check out the new notebook explaining the new feature. Comments and feedback are welcome! |
Omg thank you so much! hahaha |
@JCCKwong, can you give more details on the steps you're taking and what happens after you execute them? Also, did you clone my forked repository (https://github.com/bacalfa/pysurvival) instead of the one from the original author (https://github.com/square/pysurvival)? |
First change the current directory to C:\Users\Jethro\pysurvival.
Then run the python commands described above in #15 (comment). |
@DashengSong, have you tried it? I don't think I have. |
Hi @bacalfa. Thanks for creating a package that can be installed on Windows. I'm trying to use the sklearn compatibility feature you've added. Does it work with the random survival forest estimator too? |
@KaranMehta21, I think it does. But there may be a caveat: #17. |
@bacalfa OK, I'll try it out. Is the benefit of using it that it enables cross-validation and hyperparameter tuning, and will that lead to higher c-indices? Currently, the RSF model I've trained has a c-index of 0.71. I'm looking for ways to bring it closer to 0.80. Any suggestions? |
Honestly, I haven't used this package that much, so I'm not sure what to suggest. There are simpler and more complex models. It's a good habit to evaluate performance with a validation set (like in CV) and perform hyperparameter tuning. Difficult to know which algorithm will be the best a priori. So try (and tune) as many as you can, and make sure you make a fair comparison between them. |
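For readers comparing models by c-index as discussed above, here is a minimal pure-Python sketch of Harrell's concordance index for right-censored data. The function name `harrell_c_index` is hypothetical and this is not pysurvival's implementation; it assumes a higher risk score means an earlier expected event, and it simply skips pairs with tied times.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data (simple O(n^2) sketch).

    times: observed times; events: 1 if the event was observed, 0 if censored;
    risk_scores: higher score means the model expects an earlier event.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if the times differ and the earlier one is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a fair comparison between algorithms means computing this on the same held-out validation data for each of them.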
Hi all, I would really appreciate it if anyone could help me. I have downloaded the package, which is at C:\Users\User\Downloads\pysurvival-master. I have installed Anaconda at C:\Users\User. Here are the steps that I think I need to follow; please guide me so that I can carry out the installation correctly. Step 1: Create a directory: C:\Users\User\pysurvival (as Anaconda is installed in C:\Users\User) |
Hi @bacalfa, I've tried your fork with setup.py. Unfortunately, it is still not working for me because of this line: extra_compile_args = ["/O2"] The error occurs when: building 'pysurvival.utils._functions' extension Error: Thanks! |
@CoteDave, I don't have MinGW installed on my Windows machine (and it's not easy to do so). The error seems to suggest that /O2 is an option for the MS C/C++ compiler, which isn't recognized by MinGW. If you change line 61 in setup.py to the same thing as in line 63, I think it'd work. Let me know. |
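The flag mismatch described above can be sketched as a small helper that picks optimization flags by compiler type. This is only an illustration, not the contents of the fork's setup.py: the function name `optimization_flags` is hypothetical, and it assumes every non-MSVC compiler accepts GCC-style flags.

```python
def optimization_flags(compiler_type):
    """Return optimization flags for a distutils/setuptools compiler type.

    "msvc" is the type name reported for Microsoft's compiler; everything
    else (e.g. "unix", "mingw32") is assumed here to be GCC-like.
    """
    if compiler_type == "msvc":
        return ["/O2"]  # MSVC syntax, not understood by MinGW/GCC
    return ["-O3"]      # GCC-style syntax


print(optimization_flags("msvc"))     # ['/O2']
print(optimization_flags("mingw32"))  # ['-O3']
```

In a real setup.py, the compiler type is available as `self.compiler.compiler_type` inside a `build_ext` subclass, which avoids hard-coding flags per operating system.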
Hi @bacalfa, I changed line 61. No more /O2 error, but sadly, a new error occurs at the same place: building 'pysurvival.utils._functions' extension |
I would also like to make a suggestion. Therefore, if l2_reg=1, one is doing Ridge regularization; if l2_reg=0, one is doing Lasso regularization; and when 0<l2_reg<1, one is doing ElasticNet. |
@elopezfune, regarding your error, see if this helps. I'll see if I can help with the regularization request and will let you know. |
@elopezfune, I'd prefer to create a branch for this request. Let's call it For MTLR, I'd do: loss += l2_reg * torch.sum(w * w) / 2. + (1.0 - l2_reg) * torch.sum(torch.abs(w)) For consistency, I should probably apply the same change to other models. For SVM, that would require modifying Cython code (file |
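To make the proposed penalty concrete, here is a plain-Python version of the same expression. The function name `mixed_penalty` is hypothetical, and the sketch uses ordinary lists instead of torch tensors so that it runs without dependencies.

```python
def mixed_penalty(weights, l2_reg):
    """Compute l2_reg * sum(w^2) / 2 + (1 - l2_reg) * sum(|w|), 0 <= l2_reg <= 1."""
    if not 0.0 <= l2_reg <= 1.0:
        raise ValueError("l2_reg must lie in [0, 1]")
    ridge = sum(w * w for w in weights) / 2.0  # L2 term
    lasso = sum(abs(w) for w in weights)       # L1 term
    return l2_reg * ridge + (1.0 - l2_reg) * lasso


w = [1.0, -2.0]
print(mixed_penalty(w, 1.0))  # 2.5  (pure Ridge)
print(mixed_penalty(w, 0.0))  # 3.0  (pure Lasso)
print(mixed_penalty(w, 0.5))  # 2.75 (ElasticNet mix)
```

The two endpoints reproduce the Ridge and Lasso cases from the suggestion above, and any value in between blends them.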
Thanks for the quick answer. |
Yes, a line of code like this is perfect! I tried once to change it manually on the local files, but I didn't have access to the optimization code (Cython), therefore, it didn't work. |
Well, indeed, there is the need to introduce a new parameter namely |
What Python version do you have installed? It seems to be suggesting that you should have at least 3.8 to be able to install libpython3-dev. These errors you're experiencing are specific to your Ubuntu system, not really to pysurvival. Once you have all the dependencies installed, you should be able to build pysurvival. |
I have Python 3.8.6 |
You'll have to do some searching on the errors you're getting. I can't reproduce it because I currently don't have access to Ubuntu. See this. |
Thanks, I'm trying to solve this problem; it is driving me crazy. |
Adding support for l1 regularization to SVM isn't trivial. It requires modifications to Cython code (doable), but I can't find the reference for the formulation. And I don't have a lot of time to spend on this. If anyone would like to contribute or help, that'd be appreciated. SVM in this package doesn't use PyTorch (loss, gradient, and Hessian are manually implemented in Cython, so it's important to know the full formulation in order to modify it). |
I could help on that! |
They're in |
Hi, **urvival/cpp_extensions/_functions.o -std=c++11 -O3
|
@byronmamamoney, please see above the discussion with @elopezfune. You'll have to Google the errors that come up. I don't have a way to test it on Ubuntu. |
Thanks @bacalfa I've got it working after removing the various python versions (except 3.6), the default for the Ubuntu box. Did a reboot and reinstalled the libraries. Great work on the documentation here: https://square.github.io/pysurvival/index.html Regards |
What does pct_importance mean ?? |
If I understand correctly, it is the percentage of each feature importance. It could be good to include the explanation of this feature in the documentation. |
Hi, the issue comments seem to go a bit off topic. @bacalfa thanks for your work. Any chance to make a pull request to merge your version with the official repo? So that your sklearn add on will become available generally? |
@pransito I think the original author isn't maintaining this package anymore, unfortunately. That's why I forked it and fixed a few issues. But you can try to reach out to him. |
Awesome job @bacalfa, this is exactly what I was looking for!! If there are other less skilled coders like me who need a bit more clarification, I can save you some time: follow these slightly more detailed steps, based on the explanations given earlier.
The |
With your modified version, I can make MultiTaskModel work fine with scikit-learn. However, with NeuralMultiTaskModel, I wrote my code as follows. My code: NMTLModel_skl = sklearn_adapter(NeuralMultiTaskModel, time_col='time', event_col='status', predict_method="predict_survival", scoring_method=concordance_index) I get the following error and do not know how to fix it. The error: Cannot clone object SkLearnNeuralMultiTaskModel(auto_scaler=True, bins=150, structure=[{'activation': 'ReLU', 'num_units': 150}]), as the constructor either does not set or modifies parameter structure
|
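One possible cause of the "Cannot clone object" error above, not verified against the fork's code: scikit-learn's `clone` rebuilds the estimator from its parameters and then checks that each stored parameter is the very same object that was passed to the constructor. A constructor that copies or rewrites a list argument such as `structure` fails that check. The sketch below mimics the check in plain Python; both classes and the helper `passes_identity_check` are hypothetical.

```python
class CopiesParam:
    """Hypothetical estimator whose constructor stores a copy of its argument."""

    def __init__(self, structure=None):
        self.structure = list(structure or [])  # new list: identity is lost

    def get_params(self):
        return {"structure": self.structure}


class StoresParam:
    """Hypothetical estimator that stores the argument untouched."""

    def __init__(self, structure=None):
        self.structure = structure

    def get_params(self):
        return {"structure": self.structure}


def passes_identity_check(estimator_cls, **params):
    """Mimic the post-construction identity check performed by clone."""
    new_object = estimator_cls(**params)
    stored = new_object.get_params()
    return all(stored[name] is value for name, value in params.items())


structure = [{"activation": "ReLU", "num_units": 150}]
print(passes_identity_check(CopiesParam, structure=structure))  # False
print(passes_identity_check(StoresParam, structure=structure))  # True
```

If this is indeed the cause, the usual remedy is to store constructor arguments unchanged and do any copying or validation inside `fit`.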
Hey @bacalfa, thanks so much for all the answers. I recently came across the library, and after performing all the steps I am still not able to install it on my Windows computer. I get the following error:
Am I missing anything? I've installed all the suggested libraries and compilers, and I don't know what else to do. I hope you can help me install it. Thanks |
I have noticed that PySurvival does not really follow the principles of scikit-learn, starting with the fact that you input X, T, E instead of X, y. Furthermore, GridSearchCV cannot be used, both because of the aforementioned problem and because the model objects have no set_params method. (See also scikit-learn's Pipeline, which only works after extensive reworking of many classes and functions in scikit-learn.) It is very unfortunate, I think, that this great package stays outside of sklearn. Is there any plan to fix this and make PySurvival connectable to scikit-learn? Or am I missing something?
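The gap described above can be illustrated with a minimal adapter sketch. It is not the `sklearn_adapter` from the fork: `ToySurvivalModel` and `XyAdapter` are hypothetical names, `y` is assumed to be a sequence of (time, event) pairs, and a fully scikit-learn-compatible estimator would need more than this (for example, explicit constructor parameters so that `clone` works).

```python
class ToySurvivalModel:
    """Stand-in for a model with a fit(X, T, E) interface (hypothetical)."""

    def __init__(self, lr=0.01):
        self.lr = lr

    def fit(self, X, T, E):
        # Record the mean observed event time as a trivial "model".
        event_times = [t for t, e in zip(T, E) if e]
        self.mean_event_time_ = sum(event_times) / len(event_times)
        return self


class XyAdapter:
    """Expose fit(X, y) plus get_params/set_params around a fit(X, T, E) model."""

    def __init__(self, model_cls, **params):
        self.model_cls = model_cls
        self.params = dict(params)

    def get_params(self, deep=True):
        return dict(self.params)

    def set_params(self, **params):
        self.params.update(params)
        return self

    def fit(self, X, y):
        # y is a sequence of (time, event) pairs; split it into T and E.
        T = [time for time, _ in y]
        E = [event for _, event in y]
        self.model_ = self.model_cls(**self.params).fit(X, T, E)
        return self


adapter = XyAdapter(ToySurvivalModel, lr=0.01).set_params(lr=0.1)
adapter.fit([[0.0], [1.0], [2.0]], [(5.0, 1), (7.0, 0), (9.0, 1)])
print(adapter.get_params(), adapter.model_.mean_event_time_)  # {'lr': 0.1} 7.0
```

Packing time and event into a single `y` is what lets tools that only know `fit(X, y)` pass survival targets through unchanged.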