Add keyboard interrupt handling to Cython files (particularly those that run long) #9136

Open
amueller opened this issue Jun 15, 2017 · 5 comments


@amueller
Member

amueller commented Jun 15, 2017

At PyParis, @ogrisel, some others, and I thought that using cysignals to catch keyboard interrupts would be cool. Unfortunately it's LGPL-licensed. But it's not that hard to do ourselves:

https://stackoverflow.com/questions/16769870/cython-python-and-keyboardinterrupt-ignored

We "only" need to periodically check for it.
The main use case for me is that I run some line in Jupyter and then realize I did something wrong, but it takes 5h to complete. Or it runs longer than I anticipated and I want to stop it early.
Right now I think I need to kill the kernel, but even then the process might keep running in the background, eating RAM and CPU; I'm not sure.

For me this was mostly an issue with liblinear and libsvm so far. Not sure where else.

@amueller
Member Author

Maybe t-SNE is also a candidate? I haven't checked. For tree-based methods I feel like it is less of an issue, since each single tree is usually reasonably fast, and after each tree we go back into Python, IIRC.

@ogrisel
Member

ogrisel commented Jun 16, 2017

t-SNE is not a problem because, while the gradient calls are written in Cython, the main gradient descent loop is written in Python. A single gradient call takes less than 10s on MNIST-scale data IIRC, so Ctrl-C is responsive enough.

More generally, we could just acquire the GIL and call a dummy Python callback at the end of each large iteration of C / C++ / Cython estimator loops to give Python an opportunity to raise the KeyboardInterrupt exception.
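
A minimal Python sketch of that callback pattern, under stated assumptions: `fit_with_interrupt_check`, `run_epoch`, and `on_epoch_end` are hypothetical names, and `run_epoch` stands in for a compiled inner loop that would run without the GIL. Nothing here is an existing scikit-learn API.

```python
def fit_with_interrupt_check(n_epochs, run_epoch, on_epoch_end=lambda epoch: None):
    """Run `run_epoch` once per epoch, calling back into Python after each one."""
    completed = 0
    for epoch in range(n_epochs):
        run_epoch(epoch)     # stand-in for one large C / C++ / Cython iteration
        completed += 1
        on_epoch_end(epoch)  # back in Python: a pending Ctrl-C can be raised here
    return completed


# Simulate an interrupt arriving during the third epoch.
epochs_run = []

def fake_interrupt(epoch):
    if epoch == 2:
        raise KeyboardInterrupt

try:
    fit_with_interrupt_check(10, epochs_run.append, fake_interrupt)
except KeyboardInterrupt:
    pass  # the loop stopped early instead of running all 10 epochs
```

The cost of this approach is one GIL acquisition and one Python call per outer iteration, which is why it suits coarse-grained loops (one call per epoch) rather than tight inner loops.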

@amueller
Member Author

If it's only SVC, we probably don't want to call into Python there, right?

@rth
Member

rth commented Aug 15, 2018

I also ran into this with LogisticRegression(solver='saga', multi_class='multinomial') on sparse input. Maybe acquiring the GIL (as mentioned above) after each epoch could be enough...

@cod3licious

It would be great to add this for the tree-based methods as well. E.g., calling DBSCAN with a custom/non-sklearn distance function can take quite a while on a larger dataset, and it seems to be because of the BallTree...

@rth mentioned this issue Jun 4, 2020