gpuarray/subasgn error from drift outlier? #471

Closed
FelixAntonSchneider opened this issue Feb 2, 2022 · 1 comment · Fixed by #595

@FelixAntonSchneider

Hi,
I'm currently trying to sort a recording with a file size of 264 GB. In the GUI the data looks absolutely fine.
When I sort just the first 100 s, everything runs smoothly. But when I try to sort the whole recording, I get an error somewhere between batches 2601 and 2701.

This is the MATLAB console output:

Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc-dp'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
>> kilosort
Time   0s. Computing whitening matrix.. 
Getting channel whitening matrix... 
Channel-whitening matrix computed. 
Time  34s. Loading raw data and applying filters... 
Time 1833s. Finished preprocessing 5610 batches. 
vertical pitch size is 20 
horizontal pitch size is 32 
    11    27    43    59

   766

0.49 sec, 1 batches, 6056 spikes 
49.05 sec, 101 batches, 441478 spikes 
97.81 sec, 201 batches, 754254 spikes 
146.46 sec, 301 batches, 1025924 spikes 
195.11 sec, 401 batches, 1311769 spikes 
243.84 sec, 501 batches, 1557758 spikes 
292.62 sec, 601 batches, 1794257 spikes 
341.44 sec, 701 batches, 2068577 spikes 
390.41 sec, 801 batches, 2299914 spikes 
439.17 sec, 901 batches, 2563515 spikes 
488.10 sec, 1001 batches, 2894573 spikes 
537.22 sec, 1101 batches, 3178314 spikes 
586.38 sec, 1201 batches, 3656402 spikes 
635.65 sec, 1301 batches, 4039053 spikes 
684.74 sec, 1401 batches, 4347555 spikes 
734.06 sec, 1501 batches, 4624619 spikes 
783.38 sec, 1601 batches, 5011922 spikes 
832.60 sec, 1701 batches, 5345592 spikes 
881.73 sec, 1801 batches, 5686610 spikes 
930.83 sec, 1901 batches, 5975493 spikes 
980.11 sec, 2001 batches, 6336551 spikes 
1029.34 sec, 2101 batches, 6619720 spikes 
1078.56 sec, 2201 batches, 7081564 spikes 
1127.91 sec, 2301 batches, 7402328 spikes 
1177.22 sec, 2401 batches, 7697661 spikes 
1226.71 sec, 2501 batches, 8007512 spikes 
1275.80 sec, 2601 batches, 8385612 spikes 
1325.11 sec, 2701 batches, 8646642 spikes 
1374.56 sec, 2801 batches, 8919781 spikes 
1423.70 sec, 2901 batches, 9221657 spikes 
1473.13 sec, 3001 batches, 9519470 spikes 
1522.49 sec, 3101 batches, 9804545 spikes 
1571.94 sec, 3201 batches, 10307134 spikes 
1621.31 sec, 3301 batches, 10744946 spikes 
1670.82 sec, 3401 batches, 11219190 spikes 
1720.09 sec, 3501 batches, 11647340 spikes 
1769.32 sec, 3601 batches, 12149741 spikes 
1818.67 sec, 3701 batches, 12613717 spikes 
1867.99 sec, 3801 batches, 13128264 spikes 
1917.18 sec, 3901 batches, 13614520 spikes 
1966.40 sec, 4001 batches, 14161133 spikes 
2015.74 sec, 4101 batches, 14699972 spikes 
2064.89 sec, 4201 batches, 15194900 spikes 
2114.02 sec, 4301 batches, 15696358 spikes 
2163.32 sec, 4401 batches, 16204093 spikes 
2212.61 sec, 4501 batches, 16696564 spikes 
2261.89 sec, 4601 batches, 17176174 spikes 
2311.04 sec, 4701 batches, 17680183 spikes 
2360.46 sec, 4801 batches, 18194200 spikes 
2409.65 sec, 4901 batches, 18690583 spikes 
2458.84 sec, 5001 batches, 19163076 spikes 
2508.00 sec, 5101 batches, 19587916 spikes 
2557.30 sec, 5201 batches, 20089426 spikes 
2606.49 sec, 5301 batches, 20595642 spikes 
2656.00 sec, 5401 batches, 21178073 spikes 
2705.48 sec, 5501 batches, 21747154 spikes 
2755.27 sec, 5601 batches, 22318282 spikes 
2759.74 sec, 5610 batches, 22371492 spikes 
time 4677.50, Shifted up/down 5610 batches. 
0.39 sec, 1 batches, 5132 spikes 
46.37 sec, 101 batches, 385544 spikes 
92.62 sec, 201 batches, 680394 spikes 
138.86 sec, 301 batches, 939232 spikes 
185.49 sec, 401 batches, 1197974 spikes 
231.75 sec, 501 batches, 1423830 spikes 
278.01 sec, 601 batches, 1641895 spikes 
324.28 sec, 701 batches, 1897391 spikes 
370.92 sec, 801 batches, 2110760 spikes 
417.07 sec, 901 batches, 2359008 spikes 
463.28 sec, 1001 batches, 2676012 spikes 
509.42 sec, 1101 batches, 2952384 spikes 
555.83 sec, 1201 batches, 3415629 spikes 
602.35 sec, 1301 batches, 3787459 spikes 
648.96 sec, 1401 batches, 4102398 spikes 
695.54 sec, 1501 batches, 4390717 spikes 
742.02 sec, 1601 batches, 4790394 spikes 
789.70 sec, 1701 batches, 5106556 spikes 
836.31 sec, 1801 batches, 5435015 spikes 
882.85 sec, 1901 batches, 5714099 spikes 
929.66 sec, 2001 batches, 6025654 spikes 
976.06 sec, 2101 batches, 6278841 spikes 
1022.56 sec, 2201 batches, 6708789 spikes 
1069.07 sec, 2301 batches, 6993764 spikes 
1115.79 sec, 2401 batches, 7259877 spikes 
1162.34 sec, 2501 batches, 7522852 spikes 
1209.06 sec, 2601 batches, 7816762 spikes 
Error using gpuArray/subsasgn
An unexpected error occurred trying to launch a kernel. The CUDA error was:
invalid configuration argument

Error in extract_spikes (line 97)
    st(5,:) = cF;

Error in ksGUI/runSpikesort (line 795)
                [obj.rez, st3, tF]     = extract_spikes(obj.rez);

Error in ksGUI/runAll (line 634)
            obj.runSpikesort;

Error in ksGUI>@(~,~)obj.runAll() (line 321)
                'Callback', @(~,~)obj.runAll());
 
Error while evaluating UIControl Callback.

Interestingly, the drift trace plot also looks rather suspicious: it has a few huge outliers.
batch_outliers

The biggest outlier happens to be at batch 2670, right in the range of batches where the error occurs.
So I was wondering whether such a huge outlier could pose a problem for Kilosort. I'm not sure how it would be connected to CUDA, but on the other hand, such a coincidence seems rather unlikely.
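
For anyone who wants to list such outlier batches programmatically rather than read them off the plot, here is a minimal Python sketch. It assumes the per-batch drift estimates have already been exported from MATLAB into a plain list; the function name `flag_drift_outliers`, the threshold, and the sample values are all hypothetical and are not part of Kilosort. It uses the median absolute deviation (MAD) so that the outliers themselves do not inflate the spread estimate.

```python
import statistics


def flag_drift_outliers(drift, threshold=5.0):
    """Return 1-based batch indices whose drift is far from the median.

    Distance is measured in units of the scaled median absolute deviation
    (MAD), which is robust against the very outliers we are looking for.
    """
    med = statistics.median(drift)
    abs_dev = [abs(d - med) for d in drift]
    mad = statistics.median(abs_dev)
    if mad == 0:
        # Degenerate case: more than half the values equal the median,
        # so flag every batch that deviates at all.
        return [i + 1 for i, dev in enumerate(abs_dev) if dev > 0]
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [i + 1 for i, dev in enumerate(abs_dev) if dev / scale > threshold]


# Made-up drift trace (arbitrary units) with one obvious outlier at batch 6.
example_drift = [0.5, -0.3, 0.8, 0.1, -0.6, 40.0, 0.2, -0.4, 0.7, -0.1]
print(flag_drift_outliers(example_drift))
```

Comparing the flagged indices against the batch range in which the sorter stops would show whether the coincidence holds for the other outliers as well.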

I don't know exactly how KS3 works, but if batch sizes are always the same, then batch 2670 should occur around 5744 seconds into the recording. Inspecting the data at this point didn't show anything suspicious.
KSscreenshot
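
The batch-to-time conversion above can be sketched as follows, under the stated assumption that every batch has the same length, and ignoring any overlap or buffer between batches. The function name and the numbers in the example call are placeholders (60000 samples per batch at 30 kHz); they are not the settings of this recording, so the result will differ from the 5744 s quoted above.

```python
def batch_window_seconds(batch_index, samples_per_batch, sample_rate_hz):
    """Return (start, end) in seconds for a 1-based batch index.

    Assumes equal-length, non-overlapping batches.
    """
    start = (batch_index - 1) * samples_per_batch / sample_rate_hz
    end = batch_index * samples_per_batch / sample_rate_hz
    return start, end


# Placeholder settings, NOT taken from the recording in this issue.
start_s, end_s = batch_window_seconds(2670, 60000, 30000.0)
print(f"batch 2670 spans {start_s:.1f} s to {end_s:.1f} s")
```

Plugging in the actual batch length and sampling rate from the sorting configuration gives the window to inspect in the raw data viewer.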

@farazmoradi

I have faced the same problem! Did you fix it?
