Unit tests and tuners segfault on Linux/Beignet with a Haswell GT2 GPU #66
Hmm, not good. So there are two types of tests: the I assume this is on the The first thing we should try is to find out whether clBLAS (the reference) or CLBlast crashes. Perhaps you can go to line 218 of
If it still crashes, then the bug is in CLBlast; otherwise it is in clBLAS (not unreasonable to think, since that library hasn't been tested on Intel/Beignet). |
I did your test and tried both clblas and clblast alone and they both segfault ...
Looks like I was a little quick to assume they all failed the same way ... |
OK, it seems there is something else causing issues. First of all I've made the invalid-buffer sizes more verbose in verbose mode. For example, the
In the last bit, it shows that it is testing swapping of two buffers with 64 elements using smaller sized buffers. Both clBLAS and CLBlast are protected against this behaviour and return appropriate error codes. One more thing I could think of now is that Beignet isn't happy with zero-sized buffers. Perhaps you can change line 66 of
into:
Let's see if that helps for the For the other errors, I would first suggest to test against a CPU BLAS library, since the reference clBLAS might crash or give incorrect results in some cases on Intel GPUs. You can do this by providing |
I just tested the dev branch and the issue looks gone:
There are 4 more tests that pass compared to the status1/status2 trick 3 days ago. CBLAS is already the default:
I tried with clblas as reference:
|
OK, I should have said this: the From your other tests with GEMV we can conclude that the issue is indeed in CLBlast and not in one of the reference libraries. The configuration afterwards is: Are you on the latest Beignet by the way? Perhaps that is influencing the results as well somehow? |
I use beignet 1.1.2. |
I'm on a git version from 2 weeks back. I had to do that because my Skylake GPU is quite new. But 1.1.2 seems to be from April this year, so that's quite recent. I'll try to think of ways to debug this properly. But for now I think you can actually use the library: it only crashes for invalid configurations, it seems. |
As some tuners fail too, maybe we can focus on that. |
The tuners crash as well? I'll also investigate the current issue further, but I don't have time until Monday. |
That is weird.
but running clblast_tuner_xgemm directly works fine ... |
Here is the issue (with complex numbers):
|
OK, thanks for running the tuner and showing the output. First thing to check for now is to see whether it is a bug in the compiler or in CLTune or the CLBlast kernels. Because I don't know how to do that properly if I can't re-produce the errors myself, I've added a 'VERBOSE' setting to CLTune. So, could you do the following for me:
Perhaps it is not verbose enough yet, but this would be the first step I guess. Thanks! |
The answer is Compilation !
|
If it helps, here is the output of valgrind (with lots of memory errors suppressed):
|
Yes, so we can indeed conclude this is an issue with Beignet or the Intel drivers. What you can do is modify
with
Then, copy-paste the faulty kernel and report it to the developers of Beignet, possibly with a small test program that does nothing other than compilation. Note that this kernel can be quite long for GEMM. In the worst case, if this kernel is not valid OpenCL, the Beignet compiler should still report the error instead of crashing with a segfault. Before you do this, I recommend building the latest version of Beignet from the git source repository and running the included unit tests first to see if they pass. That's what the developers of Beignet will ask you to do, I guess. Unfortunately, Beignet doesn't seem mature enough yet. I've seen some issues myself on Skylake GPUs, mostly with FP16 though. |
I switched to beignet git HEAD and it works:
I think we are done with this issue, thanks a lot for your help! |
OK, good to hear that a new version of Beignet helped with the Tuner issues. But the original issue was with the tests, right? So I suggest that you pull the latest version of the CLBlast |
Before pulling new dev HEAD:
After pulling new dev HEAD (Updating 61105e3..66908ef):
It doesn't seem to help with the unit tests. If those tests aren't that important, maybe we could use a signal handling or child process strategy so that a segfault in Beignet doesn't crash the whole unit test ... |
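The child-process strategy suggested above can be sketched roughly as follows. This is a minimal, POSIX-only illustration and not code from CLBlast's test suite: `RunIsolated` and `IsolatedResult` are hypothetical names, and the test case is assumed to be a callable that returns 0 on success.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>
#include <csignal>
#include <functional>

// Outcome of one test case run in its own process (hypothetical type).
struct IsolatedResult {
  bool crashed;  // true if the child was terminated by a signal
  int code;      // signal number if crashed, exit code otherwise, -1 on setup failure
};

// Runs `test_case` in a forked child so that a crash inside it (for example a
// segfault in the OpenCL driver) cannot take down the parent test runner.
IsolatedResult RunIsolated(const std::function<int()>& test_case) {
  const pid_t pid = fork();
  if (pid < 0) { return {false, -1}; }  // fork itself failed
  if (pid == 0) {
    // Child: run the test and leave immediately. _exit skips atexit handlers
    // and does not flush stdio buffers inherited from the parent.
    _exit(test_case());
  }

  // Parent: wait for the child, retrying if interrupted by a signal.
  int status = 0;
  pid_t waited;
  do {
    waited = waitpid(pid, &status, 0);
  } while (waited < 0 && errno == EINTR);
  if (waited < 0) { return {false, -1}; }

  if (WIFSIGNALED(status)) { return {true, WTERMSIG(status)}; }
  return {false, WEXITSTATUS(status)};
}
```

A runner built this way could record a `crashed` case as a failure and move on to the next test. One caveat, stated as an assumption: OpenCL state generally should not be shared across `fork`, so each child would probably need to create its own context.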
Here is the output of all failing tests in case you want to check them: |
Thanks for the data, I will look into it as soon as I have some time. Quick look tells me again the only failures are for tests which should return an error code. So although not crucial, still it would be good if the error codes were returned correctly. And I am also curious why this happens, since no actual OpenCL kernel should be compiled/executed in that case. So this is on the git version of Beignet I presume, the one you used to run the tuners successfully? |
I just checked it: indeed, it only crashes for tests that should return an error code. I am still not sure what is the cause of this issue, so I added extra printing statements (and std::flush) to the tests, hopefully they will help us locate the source of the error, whether it is in the test code or in one of the tested libraries. Could you re-run one of those failing tests after pulling in the latest changes from
|
Is this still an issue with the newest version of Beignet? |
Hi,
I'm not sure whether it's an issue or not. What do you think? |
Incorrect results during tuning are automatically filtered out, so you don't have to worry. Well, as long as not all tuning results fail of course :-) I also had some problems myself with Beignet, it seems it is not 100%. What about the tests, do they work? |
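The filtering described above can be illustrated with a small sketch. It is not CLTune's actual implementation: `TuningResult` and `BestValid` are hypothetical names, and a result is assumed to carry a flag saying whether the kernel compiled and produced correct output.

```cpp
#include <string>
#include <vector>

// One tuned configuration (hypothetical type).
struct TuningResult {
  std::string config;  // description of the parameter set
  double time_ms;      // measured kernel time
  bool correct;        // false if compilation failed or the output was wrong
};

// Returns the fastest result that passed verification, or nullptr if every
// configuration failed. The pointer refers into `results`, so it is only
// valid while that vector is alive and unmodified.
const TuningResult* BestValid(const std::vector<TuningResult>& results) {
  const TuningResult* best = nullptr;
  for (const auto& r : results) {
    if (!r.correct) { continue; }  // skip failed or incorrect configurations
    if (best == nullptr || r.time_ms < best->time_ms) { best = &r; }
  }
  return best;
}
```

A fast but incorrect configuration never wins here, and the `nullptr` case corresponds to the situation where all tuning results fail.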
I'll test on the Haswell laptop asap.
I get these results:
There are also a lot of compiler errors in the tuners. |
Thanks for the test. Perhaps these issues are in the FP16 versions of the kernels only? What happens if you for example look at the full output of |
Back to Haswell (I didn't have time to run verbose tests on the Broadwell laptop)
I did not rebuild anything; there are far fewer errors on Haswell. |
Valgrind output at the segfault: libcl again
|
Could you perhaps try again with the latest Beignet and CLBlast? CLBlast now has the tuning parameters for your devices included, perhaps that changes something. If not, please post the latest output again and I'll re-investigate what could be the cause. Thanks! |
with beignet 2c1f246 (current HEAD) and clblast b1929d8 (current dev HEAD): identical results.
A pointer with a value of |
Indeed, you are right. Your valgrind trace helped me locate the issue. It crashes indeed on
Taking both observations together: in case the CLBlast routine doesn't finish correctly (it doesn't return This is fixed in the |
Nice, it helped a lot:
I'll post the remaining errors details later. |
First clblast_test_xsyrk:
I tried with clBLAS, with slightly different results:
I think that's a tricky one. |
clblast_test_xher2k crashes in CBLAS:
|
Good to see that most tests now pass. I'll look into the few failure cases in a couple of days. Thanks for the feedback and data! |
I have just fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters. Could you perhaps try the tests again and see if those are now successful? |
There is something really wrong with CBLAS calls:
|
Details for xsyr2k:
|
Thanks again for testing. I am now trying to reproduce it myself. I am also on Beignet, but with a Skylake GPU. I am testing with the tuning parameters for your Haswell GPU, so that's as close as I can get to your set-up. Below are my results for syr2k:
And for her2k:
I also tried to run under valgrind but I didn't observe anything interesting. So in conclusion I don't know if I can help you any further. Perhaps there is a genuine bug in the CBLAS library you're using? Or perhaps there is still an issue in Beignet or in the Intel drivers for your GPU? |
I just updated my system from Beignet 1.2 to 1.2.1 and I see a lot of improvements, especially related to half-precision (fp16). I also re-ran the above commands, and I no longer see any errors. Could you perhaps also re-run the tests with the latest Beignet? |
I am closing this issue, since I think most of the bugs are now fixed. The latest version of the code contains SYRK/SYR2K/HERK/HER2K and TRMM fixes, so that should be good. And then Beignet 1.2.1 should fix any remaining issues. If this is not the case, please open a new issue with a report of which test(s) fail. |
I tried to run the tests and they all fail the same way. They perform all sub-tests without errors, report a 100% pass rate, then segfault:
My build settings:
-DTESTS=ON -DTUNERS=ON