Hi, I was trying to run Eesen in NVIDIA's Docker container, and it failed.
The container ships CUDA 10.2. Eesen compiles, but invoking "train-ctc-parallel" crashes with the following logs:
LOG (train-ctc-parallel:DisableCaching():cuda-device.cc:731) Disabling caching of GPU memory.
LOG (train-ctc-parallel:SetUpdateAlgorithm():net.cc:483) Selecting SGD with momentum as optimization algorithm.
LOG (train-ctc-parallel:SetTrainMode():net.cc:408) Setting TrainMode for layer 0
LOG (train-ctc-parallel:SetTrainMode():net.cc:408) Setting TrainMode for layer 1
LOG (train-ctc-parallel:SetTrainMode():net.cc:408) Setting TrainMode for layer 2
LOG (train-ctc-parallel:SetTrainMode():net.cc:408) Setting TrainMode for layer 3
LOG (train-ctc-parallel:SetTrainMode():net.cc:408) Setting TrainMode for layer 4
add-deltas ark:- ark:-
copy-feats scp:exp/train_char_l5_c320/train_local.scp ark:-
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:133) TRAINING STARTED
ERROR (train-ctc-parallel:AddVecToRows():cuda-matrix.cc:541) cudaError_t 209 : "no kernel image is available for execution on the device" returned from 'cudaGetLastError()'
WARNING (train-ctc-parallel:Close():kaldi-io.cc:446) Pipe gunzip -c exp/train_char_l5_c320/labels.tr.gz| had nonzero return status 13
WARNING (train-ctc-parallel:Close():kaldi-io.cc:446) Pipe copy-feats scp:exp/train_char_l5_c320/train_local.scp ark:- | add-deltas ark:- ark:- | had nonzero return status 36096
ERROR (train-ctc-parallel:AddVecToRows():cuda-matrix.cc:541) cudaError_t 209 : "no kernel image is available for execution on the device" returned from 'cudaGetLastError()'
[stack trace: ]
eesen::KaldiGetStackTraceabi:cxx11
eesen::KaldiErrorMessage::~KaldiErrorMessage()
eesen::CuMatrixBase::AddVecToRows(float, eesen::CuVectorBase const&, float)
eesen::BiLstmParallel::PropagateFncVanillaPassForward(eesen::CuMatrixBase const&, int, int)
eesen::BiLstmParallel::PropagateFnc(eesen::CuMatrixBase const&, eesen::CuMatrixBase)
eesen::Layer::Propagate(eesen::CuMatrixBase const&, eesen::CuMatrix)
eesen::Net::Propagate(eesen::CuMatrixBase const&, eesen::CuMatrix*)
train-ctc-parallel(main+0x148d) [0x5583f00fe692]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f385afb9b97]
train-ctc-parallel(_start+0x2a) [0x5583f00fb44a]
Is there any workaround for this? I don't know much about CUDA. I tried adding "-gencode arch=compute_{70,72,75},code={70,72,75}" to gpucompute/Makefile, but it still crashes.
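For reference, nvcc normally takes one -gencode option per architecture, written as arch=compute_XX,code=sm_XX, rather than a brace list. The sketch below only builds such flag strings for the three capabilities mentioned above. The helper name gencode_flags is hypothetical, and whether corrected flags resolve this particular crash is an assumption, since the GPU model is not stated here.

```python
def gencode_flags(capabilities, ptx_for_newest=True):
    """Build one nvcc -gencode flag per compute capability (hypothetical helper)."""
    caps = sorted(set(capabilities))
    # One machine-code (SASS) target per listed capability.
    flags = [f"-gencode arch=compute_{cc},code=sm_{cc}" for cc in caps]
    if ptx_for_newest and caps:
        # Also embed PTX for the newest capability so that later GPUs
        # can JIT-compile the kernels at load time.
        newest = caps[-1]
        flags.append(f"-gencode arch=compute_{newest},code=compute_{newest}")
    return flags


if __name__ == "__main__":
    # Capabilities taken from the flag quoted in the question.
    print(" ".join(gencode_flags([70, 72, 75])))
```

The printed flags would replace the brace-style flag in gpucompute/Makefile. Object files built with the old flags would have to be recompiled for the change to take effect.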