CUDA error was: invalid configuration argument #274

claysmyth · 2020-11-09T23:02:34Z

I'm having trouble with the following error. I suspect it is caused by an inappropriate ops.Nfilt value, so I will set that to 1024. Any insight would be appreciated.

Error using gpuArray/nan
An unexpected error occurred trying to launch a kernel. The CUDA error was:
invalid configuration argument

Error in median (line 71)
y = nan(s,'like',x);

Error in learnTemplates (line 248)
toc, ibatch, niter, Nfilt, sum(nsp), median(mu), numel(st0), ndrop)

Error in learnAndSolve8b (line 35)
rez = learnTemplates(rez, rez.iorig);

Error in main_kilosort (line 46)
rez = learnAndSolve8b(rez, iseed);

Error in run (line 91)
evalin('caller', strcat(script, ';'));

claysmyth · 2020-11-10T00:04:20Z

I see that #231 is the same problem, but I cannot implement the same workaround of parsing out sparse portions of the recording. I will comment out the print statement for now as it does not seem crucial to the sort, but insight would still be welcomed.

claysmyth · 2020-11-10T02:12:39Z

Now I'm running into this error:

`Error using gpuArray/subsref
An unexpected error occurred trying to launch a kernel. The CUDA error was:
invalid configuration argument

Error in learnTemplates (line 228)
W(:,Nfilt + [1:size(dWU0,3)],:) = W0(:,ones(1,size(dWU0,3)),:);
% initialize temporal components of waveforms

Error in learnAndSolve8b (line 35)
rez = learnTemplates(rez, rez.iorig);

Error in main_kilosort (line 46)
rez = learnAndSolve8b(rez, iseed);

Error in run (line 91)
evalin('caller', strcat(script, ';'));

`

marius10p · 2020-11-20T23:06:01Z

Can you please confirm that the path points only to Kilosort2.5 and that you compiled the mex files successfully? Also, please copy and paste here the output of "gpuDevice(1)" in MATLAB.

RobertoDF · 2020-11-24T13:09:13Z

Hi Marius,
I run into exactly the same two errors. I compiled the mex files successfully and Kilosort 2.0 is not on the path.

This is the output of gpuDevice():
      
                      Name: 'Quadro RTX 4000'
                     Index: 1
         ComputeCapability: '7.5'
            SupportsDouble: 1
             DriverVersion: 11
            ToolkitVersion: 10.1000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 8.5899e+09
           AvailableMemory: 6.9117e+09
       MultiprocessorCount: 36
              ClockRateKHz: 1545000
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 1
          CanMapHostMemory: 1
           DeviceSupported: 1
            DeviceSelected: 1


This is the ops file:
                 rootZ: 'F:/ecephys_output/GG_M608__g0_t40,44/catgt_GG_M608__g0/GG_M608__g0_imec0/imec0_ks2'
              datafile: 'F:/ecephys_output/GG_M608__g0_t40,44/catgt_GG_M608__g0/GG_M608__g0_imec0/GG_M608__g0_tcat.imec0.ap.bin'
                ntbuff: 64
              AUCsplit: 0.9000
              nSkipCov: 25
                useRAM: 0
                fshigh: 300
                 minFR: 0
                 nskip: 25
                LTseed: 1
                trange: [0 Inf]
            nNeighbors: 32
             scaleproc: 200
                    NT: 65600
              momentum: [20 400]
               reorder: 1
               chanMap: 'chanMap.mat'
                    Th: [10 4]
                 fproc: 'F:/kilosort_datatemp/temp_wh.dat'
                 ThPre: 8
               CSBseed: 1
                   lam: 10
             sigmaMask: 30
          nfilt_factor: 4
        whiteningRange: 32
                  gain: 2.3438
    minfr_goodchannels: 0.0400
                  nPCs: 3
                   GPU: 1
                 spkTh: -6
                 Nchan: 385
              NchanTOT: 385
                    fs: 3.0000e+04
                   sig: 20
               nblocks: 5
               fbinary: 'F:\ecephys_output\GG_M608__g0_t40,44\catgt_GG_M608__g0\GG_M608__g0_imec0\GG_M608__g0_tcat.imec0.ap.bin'

RobertoDF · 2020-11-24T13:38:24Z

Reading other related issues, I think the problem comes from the way ecephys_spike_sorting concatenates trials. Trials are concatenated with zero padding, so if you concatenate two recordings separated by a 5 min break, you get 5 min with no spiking in the middle, and that seems to cause the error. What would be the best workaround?

My concatenated file has breaks of at most 20 seconds.
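
One way to check whether a concatenated file contains such silent stretches is to scan it for runs of samples that are exactly zero on every channel. The sketch below is a minimal, standalone Python illustration and not part of Kilosort or ecephys_spike_sorting. The function name `find_zero_gaps` and its parameters are hypothetical, and it assumes the padding consists of exact zeros in an int16 stream interleaved by channel.

```python
from array import array


def find_zero_gaps(samples, n_chan, fs, min_gap_s=1.0):
    """Return (start_s, end_s) spans where every channel is exactly zero.

    samples   : flat sequence of int16 values, interleaved by channel
                (all channels of time point 0, then time point 1, ...).
    n_chan    : number of channels per time point.
    fs        : sampling rate in Hz.
    min_gap_s : shortest zero run, in seconds, that is reported.
    """
    n_t = len(samples) // n_chan
    gaps = []
    run_start = None  # index of the first time point of the current zero run
    for t in range(n_t):
        frame = samples[t * n_chan:(t + 1) * n_chan]
        is_zero = not any(frame)
        if is_zero and run_start is None:
            run_start = t
        elif not is_zero and run_start is not None:
            if (t - run_start) / fs >= min_gap_s:
                gaps.append((run_start / fs, t / fs))
            run_start = None
    # Close a zero run that extends to the end of the data.
    if run_start is not None and (n_t - run_start) / fs >= min_gap_s:
        gaps.append((run_start / fs, n_t / fs))
    return gaps


# Toy data: 2 channels at 10 Hz, with a 2 s gap and a 0.5 s gap.
toy = array("h", [1, 2] * 10 + [0, 0] * 20 + [3, 0] * 5 + [0, 0] * 5)
gaps = find_zero_gaps(toy, n_chan=2, fs=10, min_gap_s=1.0)
```

For a real recording the samples would have to be read in chunks (for example with `array("h").fromfile(f, n)`), and a pure-Python loop over a 385-channel, 30 kHz file will be slow, so this is meant to show the logic rather than to be run on full sessions.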

claysmyth · 2020-12-01T23:23:42Z

@marius10p I can confirm that mexGPUall.m ran successfully, and that I am running kilosort 2.5 (as I only have 2.5 downloaded):

`Building with 'nvcc'.
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(65): warning: variable "C0" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(159): warning: variable "NchanUp" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(161): warning: variable "d2" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(230): warning: variable "nt0" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "nt0max" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "NchanMax" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(65): warning: variable "C0" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(159): warning: variable "NchanUp" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(161): warning: variable "d2" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(230): warning: variable "nt0" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "nt0max" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "NchanMax" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(65): warning: variable "C0" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(159): warning: variable "NchanUp" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(161): warning: variable "d2" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(230): warning: variable "nt0" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "nt0max" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "NchanMax" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(65): warning: variable "C0" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(159): warning: variable "NchanUp" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(161): warning: variable "d2" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(230): warning: variable "nt0" was set but never used
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "nt0max" was declared but never referenced
/home/csmyth/matlab_packages/Kilosort/CUDA/spikedetector3.cu(20): warning: variable "NchanMax" was declared but never referenced

MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc-dp'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
Building with 'nvcc'.
MEX completed successfully.
`

When I run gpuDevice(1):

`CUDADevice with properties:

                  Name: 'TITAN Xp'
                 Index: 1
     ComputeCapability: '6.1'
        SupportsDouble: 1
         DriverVersion: 10.1000
        ToolkitVersion: 10
    MaxThreadsPerBlock: 1024
      MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
           MaxGridSize: [2.1475e+09 65535 65535]
             SIMDWidth: 32
           TotalMemory: 1.2788e+10
       AvailableMemory: 1.2521e+10
   MultiprocessorCount: 30
          ClockRateKHz: 1582000
           ComputeMode: 'Exclusive process'
  GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
      CanMapHostMemory: 1
       DeviceSupported: 1
        DeviceSelected: 1

`

nmtimme · 2020-12-30T14:29:41Z

I've run into a similar error and I agree that it seems to be due to periods of the recording with no spikes. In my investigations it looks like if mexGetSpikes2 in learnTemplates doesn't find any spikes or templates in ibatch = 1, it results in Nfilt = 0. Then, something goes wrong in the later mex functions, which results in a CUDA error if you try to write to W (as shown above) or (in my case) nsp a few lines later. Basically, it seems like additional functionality needs to be added to deal with breaks in the data (e.g. #288) or recordings with few good neurons, at least in terms of reporting the issue so the user knows the problem is a lack of neurons, not some other issue. As always, thanks @marius10p for all your hard work! Our lab really appreciates this resource!
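
The reporting suggested above could look roughly like the following. This is a sketch of the logic in Python, not Kilosort code: `NoTemplatesError`, `require_templates`, and `safe_median` are hypothetical names, and an actual fix would have to be written in MATLAB inside learnTemplates, at the point where the template count is known.

```python
import math
from statistics import median


class NoTemplatesError(RuntimeError):
    """Hypothetical error type for a batch that yields no templates."""


def require_templates(n_filt, ibatch):
    """Fail early with a descriptive message when no templates exist."""
    if n_filt == 0:
        raise NoTemplatesError(
            f"batch {ibatch}: no spikes or templates found (Nfilt = 0); "
            "check for silent or zero-padded stretches in the recording"
        )


def safe_median(values):
    """Median for progress printing that returns NaN on empty input."""
    return median(values) if len(values) else math.nan
```

The first function turns the opaque CUDA failure into a message that names the likely cause; the second mirrors the workaround of protecting the progress print, which computes a median over a possibly empty array.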

ensorpalacios · 2021-08-28T14:07:03Z

Hi @marius10p, is this still an issue for anyone? I think I'm running into it too. I record a baseline of 15-20 minutes, apply a drug and wait ~15-20 minutes, then record another 15-20 minutes. I concatenate the two sessions with CatGT and then use Kilosort2.

The error in Kilosort2 is the following:
Looking for data inside /media/bunaken/Ensor/npx/NNos/EP_NNos_210729_g0
Time 0s. Determining good channels..
found 142624 threshold crossings in 108.35 seconds of data
found 94 bad channels
Time 41s. Computing whitening matrix..
Getting channel whitening matrix...
Channel-whitening filters computed.
Time 87s. Loading raw data and applying filters...
Time 1530s. Finished preprocessing 3353 batches.
Obtained 7 PC waveforms in 3.57 seconds
time 0.37, pre clustered 1 / 3353 batches
time 71.56, pre clustered 501 / 3353 batches
time 142.69, pre clustered 1001 / 3353 batches
time 177.87, pre clustered 1501 / 3353 batches
time 204.67, pre clustered 2001 / 3353 batches
time 254.38, pre clustered 2501 / 3353 batches
time 319.69, pre clustered 3001 / 3353 batches
time 0.11, compared 1 / 3353 batches
time 34.50, compared 501 / 3353 batches
time 68.45, compared 1001 / 3353 batches
time 104.24, compared 1501 / 3353 batches
time 138.71, compared 2001 / 3353 batches
time 173.01, compared 2501 / 3353 batches
time 207.65, compared 3001 / 3353 batches
time 235.45, Re-ordered 3353 batches.
Time 239s. Optimizing templates ...
239.60 sec, 1 / 6707 batches, 40 units, nspks: 21.8025, mu: 23.9409, nst0: 419, merges: 0.0000, 0.0000
273.07 sec, 101 / 6707 batches, 493 units, nspks: 5348.7165, mu: 20.4125, nst0: 6301, merges: 91.7763, 8.7845
331.39 sec, 201 / 6707 batches, 551 units, nspks: 6910.8232, mu: 17.9800, nst0: 7099, merges: 110.9458, 11.9081
375.51 sec, 301 / 6707 batches, 410 units, nspks: 1030.8911, mu: 17.2411, nst0: 0, merges: 64.2969, 4.2458
385.21 sec, 401 / 6707 batches, 88 units, nspks: 4.3247, mu: 14.5654, nst0: 0, merges: 24.2403, 0.5162
Error using gpuArray/nan
An unexpected error occurred trying to launch a kernel. The CUDA error was:
invalid configuration argument

Error in median (line 71)
y = nan(s,'like',x);

Error in learnAndSolve8b (line 264)
toc, ibatch, niter, Nfilt, sum(nsp), median(mu), numel(st0), ndrop)

Error in master_kilosortM (line 53)
rez = learnAndSolve8b(rez);

The output of gpuDevice() is:
ans =
CUDADevice with properties:
Name: 'Quadro P4000'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 11.4000
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 8.5055e+09
AvailableMemory: 7.7485e+09
MultiprocessorCount: 14
ClockRateKHz: 1480000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceAvailable: 1
DeviceSelected: 1

mexGPUall runs successfully, although I also get this:
Building with 'nvcc'.
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
MEX completed successfully.

Does anyone have any tips? Thanks
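
In the log above, the last two progress lines before the crash report `nst0: 0` while the unit count falls from 410 to 88. A saved log can be scanned for such lines. The following is a minimal sketch that assumes the progress-line format shown above. `silent_batches` is a hypothetical helper, and `nst0` is taken to be the `numel(st0)` value from the print call in the traceback.

```python
import re

# Matches the "Optimizing templates" progress lines up to the nst0 field.
PROGRESS = re.compile(
    r"(?P<sec>[\d.]+) sec, (?P<batch>\d+) / (?P<total>\d+) batches, "
    r"(?P<units>\d+) units, nspks: (?P<nspks>[\d.]+), mu: (?P<mu>[\d.]+), "
    r"nst0: (?P<nst0>\d+)"
)


def silent_batches(log_text):
    """Return (batch, units) for every progress line reporting nst0 == 0."""
    hits = []
    for line in log_text.splitlines():
        m = PROGRESS.search(line)
        if m and int(m["nst0"]) == 0:
            hits.append((int(m["batch"]), int(m["units"])))
    return hits


log = """\
331.39 sec, 201 / 6707 batches, 551 units, nspks: 6910.8232, mu: 17.9800, nst0: 7099, merges: 110.9458, 11.9081
375.51 sec, 301 / 6707 batches, 410 units, nspks: 1030.8911, mu: 17.2411, nst0: 0, merges: 64.2969, 4.2458
385.21 sec, 401 / 6707 batches, 88 units, nspks: 4.3247, mu: 14.5654, nst0: 0, merges: 24.2403, 0.5162
"""
flagged = silent_batches(log)
```

Batches flagged this way indicate roughly where in the batch sequence the spike-free stretch begins.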

RobertoDF · 2021-08-28T14:57:00Z

Hi! You should take a look at issue #275. I have a fork of Kilosort that takes care of that problem. You can also reduce the "pause" with cat_gt (https://billkarsh.github.io/SpikeGLX/help/dmx_vs_gbl/dmx_vs_gbl/); that would solve the problem too, and it's simpler if you are already using the ecephys pipeline.

ensorpalacios · 2021-08-31T11:46:13Z

Hi @RobertoDF, thanks a lot for the help. I'll try your fork of KS 2.5 then.

On a side note, I'm already using cat_gt, but I'm unsure how it could help. I'm using cat_gt with the following options:
@echo off
CatGT -dir=X:\Ensor\npx -run=EP_NNos_210805 -g=0 -t=0,1 ^
-prb_fld -t_miss_ok ^
-ap -prb=0 ^
-aphipass=300 -gbldmx -gfix=10,8 ^
-dest=X:\Ensor\npx\EP_NNos_210805_g0
echo done

Thanks
Ensor

ensorpalacios · 2021-08-31T17:03:27Z

@RobertoDF I tried your fork (downloaded and extracted the zip file), but I'm running into the following problems:

If I try to run your fork of KS on data on which KS2 works (a straight 40 min recording), I get this error:
Error using gpuArray/nan
An unexpected error occurred trying to launch a kernel. The CUDA error was:
invalid configuration argument

Error in median (line 71)
y = nan(s,'like',x);

Error in learnTemplates (line 248)
toc, ibatch, niter, Nfilt, sum(nsp), median(mu), numel(st0), ndrop)

Error in learnAndSolve8b (line 35)
rez = learnTemplates(rez, rez.iorig);

Error in main_kilosort (line 48)
rez = learnAndSolve8b(rez, iseed);

Instead, if I try your version on data with a pause in between (on which KS2 does not work, as reported above), then KS finishes in 2 minutes (for a 40 min recording) and I get odd results (attached example from phy; all units look similar).

Am I doing something obviously wrong when I set up the new KS?

By the way, I'm running this on Ubuntu 20.04; also, KS2 folder does not appear in my MATLAB path.

Thanks
Ensor

claysmyth mentioned this issue Nov 10, 2020

An unexpected error occurred trying to launch a kernel. The CUDA error was: invalid configuration argument. #275

Closed

RobertoDF mentioned this issue Nov 28, 2020

Solve_zero_padding #288

Closed

demas-retina-lab mentioned this issue Jun 30, 2022

Solve_zero_padding DemasLab/Kilosort#1

Closed

carsen-stringer mentioned this issue Feb 29, 2024

adding rtd and badges #595

Merged

carsen-stringer closed this as completed in #595 Feb 29, 2024
