Bug running docker #400

acamargosonosa · 2022-11-28T20:42:01Z

Describe the bug
I am following the example at https://docs.monai.io/projects/monai-deploy-app-sdk/en/0.2.1/getting_started/tutorials/02_mednist_app.html

I was able to run everything except the Docker step:
monai-deploy run mednist_app:latest input output

I am getting this error:

(monai) sonosa@sonosa-MS-7B17:~/2022/ProjectsAI/monai$ monai-deploy run mednist_app:latest input output_docker_gpu
/home/sonosa/anaconda3/envs/monai/lib/python3.7/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /home/sonosa/anaconda3/envs/monai/lib/python3.7/site-packages/torchvision/image.so: undefined symbol: _ZNK3c1010TensorImpl36is_contiguous_nondefault_policy_implENS_12MemoryFormatE
warn(f"Failed to load image Python extension: {e}")
Checking dependencies...
--> Verifying if "docker" is installed...

--> Verifying if "mednist_app:latest" is available...

Checking for MAP "mednist_app:latest" locally
"mednist_app:latest" found.

Reading MONAI App Package manifest...
--> Verifying if "nvidia-docker" is installed...

/opt/conda/lib/python3.8/site-packages/scipy/__init__.py:138: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.23.5)
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion} is required for this version of "
Going to initiate execution of operator LoadPILOperator
Executing operator LoadPILOperator (Process ID: 1, Operator ID: a37656f0-65c6-46ca-8fdf-e8477bab45d2)
Done performing execution of operator LoadPILOperator

Going to initiate execution of operator MedNISTClassifierOperator
Executing operator MedNISTClassifierOperator (Process ID: 1, Operator ID: c9f285dd-3b8c-4ef9-b700-cd6ac49186a0)
/root/.local/lib/python3.8/site-packages/monai/utils/deprecate_utils.py:107: FutureWarning: <class 'monai.transforms.utility.array.AddChannel'>: Class AddChannel has been deprecated since version 0.8. please use MetaTensor data type and monai.transforms.EnsureChannelFirst instead.
warn_deprecated(obj, msg, warning_category)
/root/.local/lib/python3.8/site-packages/monai/utils/type_conversion.py:134: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:175.)
tensor = torch.as_tensor(tensor, **kwargs)
device found : cuda
terminate called after throwing an instance of 'c10::Error'
what(): isTuple()INTERNAL ASSERT FAILED at "/opt/pytorch/pytorch/aten/src/ATen/core/ivalue_inl.h":1397, please report a bug to PyTorch. Expected Tuple but got String
Exception raised from toTuple at /opt/pytorch/pytorch/aten/src/ATen/core/ivalue_inl.h:1397 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6c (0x7f27560b224c in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xfa (0x7f275607da66 in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x53 (0x7f27560b0233 in /opt/conda/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x4224e29 (0x7f27a23b4e29 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0x42253e9 (0x7f27a23b53e9 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #5: torch::jit::SourceRange::highlight(std::ostream&) const + 0x48 (0x7f279f3d5c58 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #6: torch::jit::ErrorReport::what() const + 0x2c3 (0x7f279f3baac3 in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0x9ea44f (0x7f27a873344f in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #8: <unknown function> + 0x9fa12d (0x7f27a874312d in /opt/conda/lib/python3.8/site-packages/torch/lib/libtorch_python.so)

frame #45: __libc_start_main + 0xf3 (0x7f27e722d083 in /usr/lib/x86_64-linux-gnu/libc.so.6)

ERROR: MONAI Application "mednist_app:latest" failed.

=================================================================

Any help would be appreciated.
For what it's worth, I tested nvidia-docker on its own and it works fine.


dbericat · 2022-11-28T21:10:43Z

@zephyrie @vikashg @ericspod can you run this and confirm whether it is reproducible, or whether it is something specific to @acamargosonosa's environment?

It seems that the MedNIST example has not been updated to use the latest MONAI Core 1.x and MetaTensor.

Also, your NumPy version (1.23.5) is higher than the range SciPy expects (>=1.16.5 and <1.23.0).
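
For reference, the range check behind that SciPy warning can be reproduced with a few lines of plain Python. This is only a sketch: the bounds are copied from the warning text in the log above, and the helper names (parse_version, numpy_in_scipy_range) are made up for illustration, not part of NumPy or SciPy.

```python
import re


def parse_version(text):
    """Turn a version string such as '1.23.5' into a tuple of ints (1, 23, 5)."""
    # Drop any local build tag after '+', then keep the first three numbers.
    return tuple(int(n) for n in re.findall(r"\d+", text.split("+")[0])[:3])


def numpy_in_scipy_range(numpy_version, minimum="1.16.5", maximum="1.23.0"):
    """True when minimum <= numpy_version < maximum.

    The default bounds are the ones quoted in the warning from the log;
    other SciPy builds may expect a different range.
    """
    return parse_version(minimum) <= parse_version(numpy_version) < parse_version(maximum)


if __name__ == "__main__":
    # 1.23.5 is the version detected inside the container in the log above.
    print(numpy_in_scipy_range("1.23.5"))
    # 1.22.4 is an arbitrary example of a version inside the range.
    print(numpy_in_scipy_range("1.22.4"))
```

Running it inside the container with the output of pip show numpy would tell you whether the installed NumPy falls inside the range that the bundled SciPy expects.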

ericspod · 2022-11-30T15:48:02Z

I was able to run the example without encountering this issue on Ubuntu 20.04, CUDA 11.4, PyTorch 1.13, MONAI 1.0.1. I set up my conda environment from scratch to run this test, so it's possible that your environment has incompatible library versions in it.

However, I encountered another issue with highdicom being a hard dependency in the library, which I solved by adding @md.env(pip_packages=["monai"]) to the specification for the App class in the example script. The root cause is that SegmentDescription in dicom_seg_writer_operator.py relies on a member of highdicom for a type annotation; if the library isn't present, there is no member to use as a type. highdicom is used elsewhere in that file for critical operations, so I don't think it can be treated as an optional dependency if classes can't operate without it.

CC @CPBridge @MMelQin

acamargosonosa · 2022-12-02T17:58:50Z

Thanks a lot for the suggestions. I will test them and see if they work for my case.

george-kuanli-peng · 2023-01-09T09:13:58Z

> I was able to run the example without encountering this issue on Ubuntu 20.04, CUDA 11.4, Pytorch 1.13, MONAI 1.0.1. I setup my conda environment from scratch to run this test so it's possible your environment does have incompatible library versions in it.
>
> However I encountered another issue with highdicom being a hard dependency in the library which I solved by adding @md.env(pip_packages=["monai"]) to the specification for the App class in the example script. The root cause is that SegmentDescription in dicom_seg_writer_operator.py relies on a member of highdicom for type annotation, if the library isn't present there is no member to use as a type. highdicom is used elsewhere in that file for critical operations so I don't think it's an optional dependency if classes can't operate without it.
>
> CC @CPBridge @MMelQin

I have the same issue of missing highdicom, which prevents me from importing monai.deploy. How can I work around it?

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/monai/utils/module.py", line 199, in load_submodules
    mod = import_module(name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.8/dist-packages/monai/deploy/operators/__init__.py", line 38, in <module>
    from .dicom_seg_writer_operator import DICOMSegmentationWriterOperator
  File "/usr/local/lib/python3.8/dist-packages/monai/deploy/operators/dicom_seg_writer_operator.py", line 44, in <module>
    class SegmentDescription:
  File "/usr/local/lib/python3.8/dist-packages/monai/deploy/operators/dicom_seg_writer_operator.py", line 130, in SegmentDescription
    def to_segment_description(self, segment_number: int) -> hd.seg.SegmentDescription:
  File "/usr/local/lib/python3.8/dist-packages/monai/deploy/utils/importutil.py", line 262, in __getattr__
    raise self._exception
  File "/usr/local/lib/python3.8/dist-packages/monai/deploy/utils/importutil.py", line 221, in optional_import
    pkg = __import__(module)  # top level module
monai.deploy.utils.importutil.OptionalImportError: import highdicom (No module named 'highdicom').
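
To illustrate the root cause described above (a return annotation that touches highdicom while the class body is being executed), here is a minimal, self-contained sketch. MissingModule is a hypothetical stand-in for the lazily failing object that optional_import appears to return in the traceback; it is not the SDK's implementation. The string-annotation variant is only one possible way to defer the lookup and is not presented as the fix the SDK adopted. Installing highdicom into the same environment is likely the more direct workaround.

```python
import sys


class MissingModule:
    """Hypothetical placeholder for an optional module that failed to import."""

    def __init__(self, name):
        self._name = name

    def __getattr__(self, attr):
        # Any attribute access (e.g. hd.seg) fails, as in the traceback above.
        raise ImportError(f"import {self._name} (No module named '{self._name}').")


hd = MissingModule("highdicom")


def define_eager():
    # On Python versions before 3.14 the annotation expression is evaluated
    # when the 'def' statement runs, so hd.seg is looked up immediately.
    def to_segment_description(segment_number: int) -> hd.seg.SegmentDescription:
        ...

    return to_segment_description


def define_deferred():
    # A string annotation is stored as-is and never touches hd at definition time.
    def to_segment_description(segment_number: int) -> "hd.seg.SegmentDescription":
        ...

    return to_segment_description


def fails_at_definition(define):
    """Return True if merely defining the function raises ImportError."""
    try:
        define()
    except ImportError:
        return True
    return False


if __name__ == "__main__":
    print("Python", sys.version_info[:2])
    print("eager annotation fails at definition:", fails_at_definition(define_eager))
    print("string annotation fails at definition:", fails_at_definition(define_deferred))
```

On the Python 3.8 interpreter shown in the traceback, the eager variant fails as soon as the function is defined, which matches the error being raised during import of monai.deploy.operators rather than when the operator is actually used.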

laurencejackson · 2023-07-04T15:55:05Z

@acamargosonosa, I just ran into this same error. The key part, I think, is: what(): isTuple()INTERNAL ASSERT FAILED at "/opt/pytorch/pytorch/aten/src/ATen/core/ivalue_inl.h":1397, please report a bug to PyTorch. Expected Tuple but got String.

This suggests an issue with how PyTorch is reading the TorchScript file. If you enter your MAP container and check the PyTorch version with pip show torch, you might find, as I did, that the torch version you used to train and serialise the model is different from the one installed in the MAP. In my case, I fixed it by using a newer base image, e.g. monai-deploy package app.py -t myapp:latest -m my-model.pt -b nvcr.io/nvidia/pytorch:23.06-py3. Alternatively, I think that if you set the torch version in the application's env decorator, you will also end up with the right version installed.
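
A small script along these lines could automate that check when run inside the MAP container. It is a sketch under stated assumptions: TRAINING_TORCH is a placeholder you would replace with the version from your training environment, the helper names are made up, and comparing only major.minor is a rough heuristic, not a guarantee of TorchScript compatibility. It relies on importlib.metadata, which needs Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_release(version_a, version_b):
    """Compare major.minor only, ignoring patch level and local tags like '+cu117'."""

    def key(version):
        return tuple(version.split("+")[0].split(".")[:2])

    return key(version_a) == key(version_b)


if __name__ == "__main__":
    # Assumption: replace with the torch version used to export the model.
    TRAINING_TORCH = "1.13.1"

    found = installed_version("torch")
    if found is None:
        print("torch is not installed in this environment")
    elif same_release(found, TRAINING_TORCH):
        print(f"torch {found} matches the training release {TRAINING_TORCH}")
    else:
        print(f"torch {found} differs from the training release {TRAINING_TORCH}")
```

If the versions differ, the two options mentioned above apply: package against a newer base image, or pin torch through the application's env decorator.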

MMelQin · 2023-07-04T23:11:53Z

Thanks @laurencejackson for providing the resolutions!
PyTorch is indeed pre-installed and verified in the published base images, and images with older version tags (yy.mm) likely ship a torch version that is not compatible with the newer one used to train the model.

wyli transferred this issue from Project-MONAI/MONAI Nov 28, 2022

ericspod added the bug label Nov 30, 2022

MMelQin self-assigned this Jun 23, 2023

MMelQin closed this as completed Sep 22, 2023
