CUDA_ERROR_ILLEGAL_ADDRESS Error in standalone_detector #257

Closed
nmtimme opened this issue Nov 3, 2020 · 7 comments · Fixed by #595

Comments

@nmtimme

nmtimme commented Nov 3, 2020

Hi,

I'm running into the following error:

Error using gpuArray/gather
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_ILLEGAL_ADDRESS

Error in gpuArray/subsindex (line 17)
idx = gather( idx ) - 1;

Error in standalone_detector (line 69)
ys = rez.yc(iC);

Error in datashift2 (line 30)
[st3, rez] = standalone_detector(rez, spkTh);

Error in stage2Core (line 418)
rez = datashift2(rez, 1); % last input is for shifting data

So far, I've found that iC becomes invalid at some point before line 69 of standalone_detector, although it is still fine right after the line iC = iC(:, igood);. By this I mean that inspecting iC just before line 69 produces a similar error, whereas inspecting it earlier in the code displays it normally. Maybe something is happening in spikedetector3, but I don't understand what could be going wrong, since iC is not an output of spikedetector3. I can spike sort the eMouse_drift data just fine, so I think the cause must be either my data or the change I made based on an issue the other day (#255). I'd appreciate any suggestions on how to resolve this problem. Thanks!

~Nick

@marius10p
Contributor

CUDA errors are opaque and it's hard to know what's going on. I will have to dig into this myself, hopefully within a few days.

@nmtimme
Author

nmtimme commented Nov 5, 2020

Thanks for replying so quickly! I definitely think the problem is spikedetector3 with my data. I saved the necessary variables that go into spikedetector3 (errorFiles.zip). On my machine, I can load the variables and they look fine, but after I run the following line:

[dat, kkmax, st, cF] = spikedetector3(Params, dataRAW, wTEMP, iC-1, dist, v2, iC2-1, dist2);

if I simply try to look at iC, I get:

Error using gpuArray/gather
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_ILLEGAL_ADDRESS

Error in parallel.internal.shared.buildDisplayHelper>iFirstNNumericDisplayHelper (line 73)
maybeTruncatedValue = gather( x );

Error in parallel.internal.shared.buildDisplayHelper>iBuildDisplayHelper (line 33)
dh = iFirstNNumericDisplayHelper( ...

Error in parallel.internal.shared.buildDisplayHelper (line 24)
dh = iBuildDisplayHelper( x, transferDenseFcn, transferSparseFcn, xClassName, xName, N );

Error in dispInternal (line 10)
dh = parallel.internal.shared.buildDisplayHelper( ...

Error in gpuArray/display (line 21)
dh = dispInternal( obj, thisClassName, objName );

I hope that helps you locate the source of the problem. Let me know if there is any other way I can help. Thanks!

@nmtimme
Author

nmtimme commented Nov 5, 2020

Hello

I looked into it a bit more and I think this issue was created by the change I made in response to issue #255. It is necessary to update NchanNearUp in standalone_detector after running getClosestChannels2 on line 40. For instance, I added the following line immediately after line 40 and it seems to have resolved the problem in spikedetector3:

NchanNearUp = size(iC2,1);

Thanks!

~Nick
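
The fix above amounts to keeping NchanNearUp in step with the number of rows in iC2. A minimal sketch of that check follows, using toy stand-ins rather than Kilosort's real values: the channel count, the stale value of 80, and the contents of iC2 are all assumptions made for illustration, and the sketch does not call getClosestChannels2 or spikedetector3.

```matlab
% Toy stand-ins (assumptions, not Kilosort's actual values).
Nchan       = 32;                     % channels on the probe
NchanNearUp = 80;                     % neighbour count chosen before the table was built

% Stand-in for the table returned by the neighbour search: here it has only
% as many rows as there are channels, so it disagrees with NchanNearUp.
iC2 = repmat((1:Nchan)', 1, Nchan);   % Nchan x Nchan, one-based indices

mismatchBefore = (NchanNearUp ~= size(iC2, 1));   % true: stale value

% The one-line fix from the comment above: read the count from the table.
NchanNearUp = size(iC2, 1);

mismatchAfter = (NchanNearUp ~= size(iC2, 1));    % false after the fix

% Guard that could sit just before the MEX call, so a mismatch fails on the
% CPU with a readable message.
assert(~mismatchAfter, 'NchanNearUp (%d) must equal size(iC2,1) (%d).', ...
    NchanNearUp, size(iC2, 1));
```

A guard like this does not explain the CUDA error by itself, but it turns a size mismatch into an ordinary MATLAB error before anything reaches the GPU.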

@AlexSonneborn

Hi Nick and Marius,

I am still getting the following problem in spikedetector3 even after making the changes you suggested.

Unrecognized function or variable 'spikedetector3'.

Error in standalone_detector (line 67)
    [dat, kkmax, st, cF] = spikedetector3(Params, dataRAW, wTEMP, iC-1, dist, v2, iC2-1, dist2);

Error in datashift2 (line 31)
[st3, rez] = standalone_detector(rez, spkTh);

Error in main_kilosort (line 42)
rez = datashift2(rez, 1); % last input is for shifting data

I also have a random geometry for my probe and only 32 channels. Any idea what I would need to change to fix this? I have been adjusting the geometry parameters, but nothing works.

@nmtimme
Author

nmtimme commented Nov 5, 2020

I think that might be an unrelated error. It seems like Matlab can't find spikedetector3, not that spikedetector3 is messing up iC, as was the case for me. Perhaps your paths aren't set properly?
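
One quick way to test the path theory is to ask MATLAB what the name resolves to. This is a minimal sketch using only the built-in exist, which, and mexext functions; it assumes nothing about where the compiled file should live.

```matlab
name = 'spikedetector3';             % function named in the error message

% With the 'file' option, exist returns 3 when the name resolves to a
% MEX-file on the path, 2 for an ordinary file, and 0 when nothing is found.
code = exist(name, 'file');
isCompiledOnPath = (code == 3);

if isCompiledOnPath
    fprintf('%s resolves to: %s\n', name, which(name));
else
    fprintf('%s is not visible as a MEX-file (exist returned %d).\n', name, code);
    fprintf('A compiled file would end in .%s on this platform.\n', mexext);
end
```

If the name is not found, either the folder holding the compiled file is missing from the path (addpath fixes that) or the .cu source was never compiled for this machine.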

@AlexSonneborn

So even after getting the CUDA path sorted out and the code compiled, I am still getting an error. Are all of the .cu files supposed to be compiled? For me, mexNvidia_quicksort.cu is not compiling. I am getting CUDA errors very similar to yours (CUDA_ERROR_ILLEGAL_ADDRESS), but adding the extra code you suggested didn't help. Sorry for the flurry of questions, but Kilosort2 worked perfectly for me, so this doesn't make much sense.

@nmtimme
Author

nmtimme commented Nov 6, 2020

Have you isolated the errors to the same part of the code? If not, this might require its own issue.

3 participants