Docker image is not working! #149

Closed
Nazila-H opened this issue Sep 27, 2022 · 4 comments

@Nazila-H

Thanks a lot for the very helpful project.

Describe the error
The nvcc -V result is:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0 

and nvidia-smi result is:

[image: ped]

I built the Docker image (myimage:0.1) from the provided Dockerfile, then ran the following command:

podman run --hooks-dir /etc/containers/hooks.d/ --rm -v$(pwd):/work -w/work localhost/myimage:0.1 python tools/demo.py configs/elephant/cityperson/cascade_hrnet.py chkpoint/epoch_5.pth.stu demo/ result_demo/

I got this error:

demo/
['demo/1.png', 'demo/2.png', 'demo/3.png']
unexpected key in source state_dict: mask_head.0.conv_res.conv.weight, mask_head.0.conv_res.conv.bias, mask_head.1.conv_res.conv.weight, mask_head.1.conv_res.conv.bias, mask_head.2.conv_res.conv.weight, mask_head.2.conv_res.conv.bias

[                              ] 0/3, elapsed: 0s, ETA:/pedestron/mmdet/apis/inference.py:39: UserWarning: Class names are not saved in the checkpoint's meta data, use COCO classes by default.
  warnings.warn('Class names are not saved in the checkpoint\'s '
Traceback (most recent call last):
  File "tools/demo.py", line 69, in <module>
    run_detector_on_dataset()
  File "tools/demo.py", line 65, in run_detector_on_dataset
    detections = mock_detector(model, im, output_dir)
  File "tools/demo.py", line 37, in mock_detector
    results = inference_detector(model, image)
  File "/pedestron/mmdet/apis/inference.py", line 66, in inference_detector
    return _inference_single(model, imgs, img_transform, device)
  File "/pedestron/mmdet/apis/inference.py", line 93, in _inference_single
    result = model(return_loss=False, rescale=True, **data)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/pedestron/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/pedestron/mmdet/models/detectors/base.py", line 88, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/pedestron/mmdet/models/detectors/base.py", line 79, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/pedestron/mmdet/models/detectors/cascade_rcnn.py", line 241, in simple_test
    x = self.extract_feat(img)
  File "/pedestron/mmdet/models/detectors/cascade_rcnn.py", line 115, in extract_feat
    x = self.backbone(img)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/pedestron/mmdet/models/backbones/hrnet.py", line 446, in forward
    x = self.relu(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/activation.py", line 94, in forward
    return F.relu(input, inplace=self.inplace)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/functional.py", line 912, in relu
    result = torch.relu_(input)
RuntimeError: CUDA error: no kernel image is available for execution on the device

I would really appreciate your help in getting the demo to run correctly.

@hasanirtiza
Owner

I have personally never used a Docker image for Pedestron, so I am not 100% sure how to answer. However, from the error it seems that you do not have the correct CUDA version. Can you confirm your CUDA version and PyTorch version? Secondly, is it possible for you to run without Docker first (in a conda environment, for example)?
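
A minimal sketch for collecting both versions from inside the container, assuming PyTorch is importable there. The helper name `collect_env` is hypothetical; the `torch` attributes it reads (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name`, `torch.cuda.get_device_capability`) are standard PyTorch calls.

```python
def collect_env():
    """Return a dict describing the PyTorch/CUDA setup seen by this interpreter."""
    info = {}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment; report that and stop.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    # CUDA version the PyTorch binary was built against (None for CPU-only builds).
    info["built_with_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu_name"] = torch.cuda.get_device_name(0)
        major, minor = torch.cuda.get_device_capability(0)
        info["compute_capability"] = "{}.{}".format(major, minor)
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print("{}: {}".format(key, value))
```

Saved under a hypothetical name such as `check_env.py` in the mounted working directory, it could be run with the same `podman run` options as the demo, replacing `tools/demo.py` and its arguments. The `compute_capability` line is the one to compare against what the installed PyTorch build supports.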

@Nazila-H
Author

Nazila-H commented Sep 28, 2022

Thank you for your comment.
I am working on a university server and do not have admin access to replace CUDA v11.7 with CUDA v10.0, which is why I tried to use the Dockerfile. The Dockerfile specifies:

ARG PYTORCH="1.3"
ARG CUDA="10.1"
ARG CUDNN="7"
mmcv==0.2.10

Do you have any suggestions for compatible versions that I can use in the Dockerfile? If it works, we could then update the Dockerfile in the repository as well.
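
For context on the version question: "no kernel image is available for execution on the device" is commonly raised when a CUDA binary contains no code for the GPU's compute capability. The sketch below checks one necessary condition, namely whether a given CUDA toolkit release is new enough to target a given compute capability. The table and the helper names (`MIN_CUDA_FOR_CAPABILITY`, `toolkit_supports`) are illustrative and cover only a few architectures; the thread does not say which GPU the server has, so the `(8, 6)` value in the usage lines is a hypothetical example, not a diagnosis.

```python
# Minimum CUDA toolkit release able to compile for each compute capability.
# Illustrative subset only; not taken from the Pedestron repository.
MIN_CUDA_FOR_CAPABILITY = {
    (6, 0): (8, 0),   # Pascal
    (6, 1): (8, 0),   # Pascal
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere
    (8, 6): (11, 1),  # Ampere
}


def parse_cuda_version(text):
    """Turn a string such as '10.1' or '11' into a (major, minor) tuple."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def toolkit_supports(cuda_version, capability):
    """Return True if the CUDA toolkit release can target the given capability."""
    if capability not in MIN_CUDA_FOR_CAPABILITY:
        raise KeyError("capability {} is not in the illustrative table".format(capability))
    return parse_cuda_version(cuda_version) >= MIN_CUDA_FOR_CAPABILITY[capability]


# CUDA="10.1" comes from the Dockerfile; the capabilities are hypothetical examples.
print(toolkit_supports("10.1", (7, 5)))
print(toolkit_supports("10.1", (8, 6)))
```

Passing this check is necessary but not sufficient: the prebuilt PyTorch binary inside the image must also have been compiled with that architecture included.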

@hasanirtiza
Owner

hasanirtiza commented Sep 28, 2022

You are in a tough spot, Nazila. As for compatible versions, you can read about the CUDA, PyTorch, etc. versions that we tried here. If I were you, I would look for Docker-related issues in the original mmdetection repo, in particular older issues (from around 2020-2021).

@Nazila-H
Author

Thank you for your suggestion and for sharing the link. I will do that.
