train error #19

Closed
Wohoholo opened this issue Jan 12, 2021 · 1 comment

Comments

@Wohoholo

Hello!
I ran into a problem with the segmentation loss when training on my own dataset. My segmentation masks were converted to mode "L". In ori_big.py, the model predicts a segmentation of size [x, 2, x, x], but training fails with an error at CrossEntropyLoss2d. Could you help? Thanks!

Error:
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [4,0,0], thread: [189,0,0] Assertion t >= 0 && t < n_classes failed.
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED (createCuDNNHandle at /pytorch/aten/src/ATen/cudnn/Handle.cpp:9)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f81564a5536 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: + 0x10a0c28 (0x7f81579a1c28 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #2: at::native::getCudnnHandle() + 0xe54 (0x7f81579a3404 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #3: + 0xf19f4c (0x7f815781af4c in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0xf1afe1 (0x7f815781bfe1 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: + 0xf1f01b (0x7f815782001b in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #6: at::native::cudnn_convolution_backward_input(c10::ArrayRef, at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool) + 0xb2 (0x7f8157820572 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #7: + 0xf86090 (0x7f8157887090 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #8: + 0xfca928 (0x7f81578cb928 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #9: at::native::cudnn_convolution_backward(at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool, std::array<bool, 2ul>) + 0x4fa (0x7f8157821c0a in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #10: + 0xf863bb (0x7f81578873bb in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #11: + 0xfca984 (0x7f81578cb984 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #12: + 0x2c80736 (0x7f8191037736 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #13: + 0x2ccff44 (0x7f8191086f44 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #14: torch::autograd::generated::CudnnConvolutionBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x378 (0x7f8190c4f908 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #15: + 0x2d89705 (0x7f8191140705 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #16: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&) + 0x16f3 (0x7f819113da03 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #17: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&, bool) + 0x3d2 (0x7f819113e7e2 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #18: torch::autograd::Engine::thread_init(int) + 0x39 (0x7f8191136e59 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #19: torch::autograd::python::PythonEngine::thread_init(int) + 0x38 (0x7f819da7e968 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #20: + 0xc819d (0x7f81ac9d019d in /home/derek/anaconda3/envs/jim/bin/../lib/libstdc++.so.6)
frame #21: + 0x76db (0x7f81ae1696db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #22: clone + 0x3f (0x7f81ade9271f in /lib/x86_64-linux-gnu/libc.so.6)

@Wohoholo
Author

Target Segment size: torch.Size([6, 512, 680])
Pred_segment size: torch.Size([6, 2, 512, 680])
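
For reference, the assertion `t >= 0 && t < n_classes` in the log means that at least one target pixel holds a value outside [0, 2), even though the shapes above are compatible. The cuDNN error that follows is probably a side effect of that failed CUDA assertion rather than a separate problem. The sketch below is one way to check and remap the labels before the loss. It assumes a binary mask whose foreground is stored as 255, which is common for "L"-mode images but is not confirmed in this thread. `check_and_remap_target` is a hypothetical helper, not part of the repository, and `F.cross_entropy` stands in for the repository's CrossEntropyLoss2d.

```python
import torch
import torch.nn.functional as F

N_CLASSES = 2  # matches the [x, 2, x, x] prediction reported above


def check_and_remap_target(target: torch.Tensor, n_classes: int = N_CLASSES) -> torch.Tensor:
    """Return a long tensor of class indices in [0, n_classes).

    Assumption: the mask is binary and foreground pixels are stored as 255;
    adjust the mapping if the masks use a different encoding.
    """
    target = target.long()
    values = torch.unique(target).tolist()
    if all(0 <= v < n_classes for v in values):
        return target  # already valid class indices
    if set(values) <= {0, 255}:
        return (target > 0).long()  # 0 stays background, 255 becomes class 1
    raise ValueError(f"unexpected label values {values} for {n_classes} classes")


# Shapes taken from the sizes reported above; the data itself is random.
pred = torch.randn(6, 2, 512, 680)
raw_target = torch.randint(0, 2, (6, 512, 680)) * 255  # simulated 0/255 mask

target = check_and_remap_target(raw_target)
loss = F.cross_entropy(pred, target)  # computed on CPU with in-range labels
print(torch.unique(raw_target).tolist(), "->", torch.unique(target).tolist())
```

Printing `torch.unique(target)` on a real batch before the loss call shows which label values are actually present. Running once on CPU, or with the environment variable `CUDA_LAUNCH_BLOCKING=1`, usually makes the failing call easier to locate than the asynchronous CUDA trace.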
