fix(api/device): further isolate the CUDA_VISIBLE_DEVICE parser in a subprocess #70

Merged 2 commits on Apr 11, 2023

Conversation

XuehaiPan
Owner

Issue Type

  • Bug fix

Description

Further isolate the CUDA_VISIBLE_DEVICES parser by running it in a subprocess. The implementation uses both the subprocess and multiprocessing modules.
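
As a rough illustration of what "isolating a parser in a subprocess" can look like, here is a minimal sketch. It is not the code from this PR: `parse_in_subprocess` and `_CHILD_CODE` are hypothetical names, and the child only splits the variable on commas as a stand-in for the real parser. The point is that a child started with `python -c` is a fresh interpreter that never re-imports the caller's main module.

```python
import json
import os
import subprocess
import sys

# Hypothetical stand-in for the real parser: it only splits the value on commas.
_CHILD_CODE = r'''
import json, os
raw = os.environ.get("CUDA_VISIBLE_DEVICES", "")
print(json.dumps([item.strip() for item in raw.split(",") if item.strip()]))
'''


def parse_in_subprocess(cuda_visible_devices):
    """Run the stand-in parser in a fresh interpreter and return its result."""
    # Pass the value through the child's environment, leaving the parent's untouched.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=cuda_visible_devices)
    result = subprocess.run(
        [sys.executable, "-c", _CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    # The child prints a single JSON list on stdout.
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(parse_in_subprocess("0, 2"))
```

Because the child communicates only through its environment and stdout, the caller can invoke it from any scope, including module level.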

Motivation and Context

Fixes #69

Testing

# test.py

from nvitop import CudaDevice

print(CudaDevice.count())

Before:

$ python3 test.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/spawn.py", line 129, in _main
    prepare(preparation_data)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/spawn.py", line 240, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/spawn.py", line 291, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/PanXuehai/Projects/nvitop/test.py", line 5, in <module>
    print(CudaDevice.count())
          ^^^^^^^^^^^^^^^^^^
  File "/home/PanXuehai/Projects/nvitop/nvitop/api/device.py", line 2132, in count
    return len(super().parse_cuda_visible_devices())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/PanXuehai/Projects/nvitop/nvitop/api/device.py", line 488, in parse_cuda_visible_devices
    return parse_cuda_visible_devices(cuda_visible_devices)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/PanXuehai/Projects/nvitop/nvitop/api/device.py", line 2357, in parse_cuda_visible_devices
    return _parse_cuda_visible_devices(cuda_visible_devices, format='index')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/PanXuehai/Projects/nvitop/nvitop/api/device.py", line 2491, in _parse_cuda_visible_devices
    raw_uuids = _parse_cuda_visible_devices_to_uuids(cuda_visible_devices, verbose=False)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/PanXuehai/Projects/nvitop/nvitop/api/device.py", line 2616, in _parse_cuda_visible_devices_to_uuids
    parser.start()
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/spawn.py", line 158, in get_preparation_data
    _check_not_importing_main()
  File "/home/linuxbrew/.linuxbrew/opt/python@3.11/lib/python3.11/multiprocessing/spawn.py", line 138, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

After:

$ python3 test.py
1
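
The RuntimeError in the "Before" traceback is raised by multiprocessing itself: with the spawn start method, the child re-runs the main script, reaches the same process-starting call at module level, and refuses to proceed. The sketch below reproduces that behavior outside nvitop, under the assumption that a trivial `work` function is an adequate stand-in for the parser. The two demo scripts, `work`, and `run_script` are all hypothetical names introduced for illustration.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical demo script: starts a spawned child at module level.
UNGUARDED = textwrap.dedent('''
    import multiprocessing as mp

    def work():
        pass

    ctx = mp.get_context("spawn")
    proc = ctx.Process(target=work)
    proc.start()      # reached again when the child re-runs this file
    proc.join()
    print(proc.exitcode)
''')

# Same script, with the process start moved under the main-module guard.
GUARDED = textwrap.dedent('''
    import multiprocessing as mp

    def work():
        pass

    if __name__ == "__main__":
        ctx = mp.get_context("spawn")
        proc = ctx.Process(target=work)
        proc.start()  # skipped in the child, where __name__ is not "__main__"
        proc.join()
        print(proc.exitcode)
''')


def run_script(source):
    """Write the source to a temporary file and run it as a main script."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo.py"
        path.write_text(source)
        return subprocess.run(
            [sys.executable, str(path)],
            capture_output=True,
            text=True,
            timeout=60,
        )


if __name__ == "__main__":
    for name, source in (("unguarded", UNGUARDED), ("guarded", GUARDED)):
        result = run_script(source)
        print(name, "RuntimeError" in result.stderr, result.stdout.strip())
```

Adding the guard is the workaround available to callers. The PR instead changes the library side so that a module-level `CudaDevice.count()` call, as in test.py, works without it.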

@XuehaiPan self-assigned this on Apr 10, 2023
@XuehaiPan added the bug, enhancement, and api labels on Apr 10, 2023
@XuehaiPan merged commit 7fca245 into main on Apr 11, 2023
@XuehaiPan deleted the spawn-subprocess branch on April 11, 2023, 13:26
Linked issue: [Bug] Execution Freeze while using CudaDevice.count() in global scope