Segfaults with Intel NEO driver #276
Comments
Thanks for the elaborate details. I think the first thing I'll do is install NEO myself and try to reproduce the issue. The crash in
Great - let me know if I can help in any way.
I managed to get NEO installed, and reproduced the issue. I suspect it is not NEO related since it also happens on Visual Studio. I tried to debug, but without luck. Then I created a relatively small stand-alone example of ~100 lines of code (still to be reduced perhaps?) which reproduces the issue. The file is on gist and can be compiled and run e.g. as
Good stuff - I can reproduce that error. While debugging I saw a reference to beignet which was worrying, and that reminded me of something I'd had while installing caffe, which makes me think it's not playing nicely with other drivers. (It does, after all, overwrite the Intel OpenCL ones I had.) I also found this. I tried to uninstall beignet and install only NEO to test this theory, but I've kind of messed up my system - so I think I'm going to try a reinstall of my system. This might take me a day or two - if you can test it more easily, great, otherwise I'll report back when done.
OK, thanks for reproducing it! I've made the example a bit simpler and nicer to test, and I've reported this as an issue on the NEO project. Hopefully it is not me doing something stupid here ;-)
Good find on the cause. How hard do you think it is to fix (i.e. is it something I could try)? Or will you wait to see what the response to the bug is?
I would just remove the clean-up of the program, i.e. remove the call to
Awesome. I'll test that out tonight - provided it works, I'm happy to leave the issue to you to close etc. as you see fit.
Works for me!
I've spoken to the Intel NEO developers, and they will address the issue. They know what to do and they will work on it. In the meantime, removing the call to
I'm hoping this doesn't need as much work to debug as #66, but I figured I'd point it out just in case. This is with the latest commit on master for CLBlast, and I built Intel NEO a few days ago. The machine is an i5-5250u running Ubuntu 16.04. I'm a relative newbie to debugging this kind of thing, so I might need some hand-holding.
Specifically, I seem to be getting segfaults in some cases, at the end of certain scripts, e.g.:
(Note: I do also get a lot of failed tests.) Also
Both seem to 'finish', and then segfault. Interestingly, tuning runs fine though.
I've messed round with the sgemm sample, and it doesn't segfault if I comment out the `Gemm` call - which seems to imply (??) that it's something specific to CLBlast that's causing the problem (though it still might be the driver's fault). This is further supported by my having run other OpenCL examples fine.
In addition, it does 'finish' before segfaulting (i.e. a print at the last line of code is still executed before the segfault), so maybe there is some clean-up happening that's causing the problem? This seems to be supported by the `freeGraphicsMemory` stuff appearing in valgrind (which I've never used before, nor know how to interpret - just copied from here). Also note the mention of `clblast::Cache`, which sounds relevant.
It should be noted that I was getting weird segfaults with other Intel drivers before, though never dug too far into them. Then again, Intel NEO is based off them, so that's not surprising.
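The "finishes, then segfaults" symptom is consistent with clean-up code running after the last statement of the program. Here is a minimal C++ sketch of that ordering, assuming a hypothetical `FakeCache` type and `run_program` function (neither is CLBlast code). It shows that a destructor runs only once the enclosing scope ends. C++ applies the same rule to objects with static storage duration, which are destroyed after `main` returns.

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to make the ordering visible.
static std::vector<std::string> events;

// Hypothetical stand-in for a library-owned cache; not CLBlast's actual class.
struct FakeCache {
    ~FakeCache() { events.push_back("cache clean-up"); }
};

// Models a program body: the cache outlives the "last line" and is
// destroyed only when its enclosing scope ends.
std::vector<std::string> run_program() {
    events.clear();
    {
        FakeCache cache;
        events.push_back("library call");
        events.push_back("last line of program");
    }  // destructor of `cache` runs here, after the last statement
    return events;
}
```

Calling `run_program()` records the clean-up as the final event. If such a destructor touched a resource that had already been torn down elsewhere, the crash would appear only after the final print; whether that is what happens here is an assumption, not something this sketch establishes.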
I'm not sure of next steps, but either:
(NB: someone is probably @CNugteren)