Memory usage during data preprocessing is so high that it causes errors #61

Closed
FirokOtaku opened this issue Sep 30, 2022 · 1 comment

Comments

FirokOtaku commented Sep 30, 2022

When I run python tools/preprocess.py --config-file configs/preprocess/dota1_5_preprocess_config.py,
memory usage starts to surge a few seconds after the command starts
and keeps growing until all system memory is used up.

System environment:

  • Windows Professional edition, 21H1 19043.1706
  • Python 3.10.7
  • RAM: 32 GB
  • GPU: 3070

Beginning of the log:

(The dataset directory is named dota-1.0, but the data inside it is version 1.5.)

D:\project-jittor\jdet>python tools\preprocess.py --config-file configs/preprocess/dota1_5_preprocess_config.py
[i 0930 15:16:14.694000 84 compiler.py:955] Jittor(1.3.5.16) src: c:\users\pc\appdata\local\programs\python\python310\lib\site-packages\jittor-1.3.5.16-py3.10.egg\jittor
[i 0930 15:16:14.740000 84 compiler.py:956] cl at C:\Users\pc\.cache\jittor\msvc\VC\_\_\_\_\_\bin\cl.exe(19.29.30133)
[i 0930 15:16:14.741000 84 compiler.py:957] cache_path: C:\Users\pc\.cache\jittor\jt1.3.5\cl\py3.10.7\Windows-10-10.xb1\IntelRCoreTMi7x95\default
[i 0930 15:16:14.745000 84 install_cuda.py:88] cuda_driver_version: [11, 7, 0]
[i 0930 15:16:14.796000 84 __init__.py:411] Found C:\Users\pc\.cache\jittor\jtcuda\cuda11.2_cudnn8_win\bin\nvcc.exe(11.2.67) at C:\Users\pc\.cache\jittor\jtcuda\cuda11.2_cudnn8_win\bin\nvcc.exe.
[i 0930 15:16:14.911000 84 compiler.py:1010] cuda key:cu11.2.67
[i 0930 15:16:14.913000 84 __init__.py:227] Total mem: 31.91GB, using 10 procs for compiling.
[i 0930 15:16:15.796000 84 jit_compiler.cc:28] Load cc_path: C:\Users\pc\.cache\jittor\msvc\VC\_\_\_\_\_\bin\cl.exe
[i 0930 15:16:15.797000 84 init.cc:62] Found cuda archs: [86,]
[i 0930 15:16:15.892000 84 compile_extern.py:517] mpicc not found, distribution disabled.
[w 0930 15:16:15.941000 84 compile_extern.py:200] CUDA related path found in LD_LIBRARY_PATH or PATH(['', 'C', '\\Users\\pc\\.cache\\jittor\\jtcuda\\cuda11.2_cudnn8_win\\lib64', '', 'C', '\\Users\\pc\\.cache\\jittor\\mkl\\dnnl_win_2.2.0_cpu_vcomp\\bin', '', 'C', '\\Users\\pc\\.cache\\jittor\\mkl\\dnnl_win_2.2.0_cpu_vcomp\\lib', '', 'C', '\\Users\\pc\\.cache\\jittor\\jt1.3.5\\cl\\py3.10.7\\Windows-10-10.xb1\\IntelRCoreTMi7x95\\default', '', 'C', '\\Users\\pc\\.cache\\jittor\\jt1.3.5\\cl\\py3.10.7\\Windows-10-10.xb1\\IntelRCoreTMi7x95\\default\\cu11.2.67', '', 'C', '\\Users\\pc\\.cache\\jittor\\jtcuda\\cuda11.2_cudnn8_win\\bin', '', 'C', '\\Users\\pc\\.cache\\jittor\\jtcuda\\cuda11.2_cudnn8_win\\lib\\x64', '', 'C', '\\Users\\pc\\.cache\\jittor\\msvc\\win10_kits\\lib\\ucrt\\x64', '', 'C', '\\Users\\pc\\.cache\\jittor\\msvc\\win10_kits\\lib\\um\\x64', '', 'C', '\\Users\\pc\\.cache\\jittor\\msvc\\VC\\lib', '', 'c', '\\users\\pc\\appdata\\local\\programs\\python\\python310\\libs', 'C', '\\Users\\pc\\.cache\\jittor\\msvc\\VC\\_\\_\\_\\_\\_\\bin', 'C', '\\Users\\pc\\AppData\\Local\\Programs\\Python\\Python310\\Lib\\site-packages\\opencv_python-4.6.0.66-py3.10-win-amd64.egg\\cv2\\../../x64/vc14/bin', 'C', '\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.7\\bin', 'C', '\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.7\\libnvvp', 'D', '\\Release', 'D', '\\ffmpeg\\bin', 'C', '\\ProgramData\\Oracle\\Java\\javapath', 'C', '\\Program Files\\Java\\jdk1.8.0_131\\bin', 'C', '\\Program Files\\Java\\jdk1.8.0_131\\jre\\bin', 'C', '\\Windows\\system32', 'C', '\\Windows', 'C', '\\Windows\\System32\\Wbem', 'C', '\\Windows\\System32\\WindowsPowerShell\\v1.0\\', 'C', '\\Windows\\System32\\OpenSSH\\', 'C', '\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common', 'C', '\\Program Files\\NVIDIA Corporation\\NVIDIA NvDLISR', 'C', '\\Program Files\\NVIDIA Corporation\\Nsight Compute 2022.2.1\\', 'C', '\\Users\\pc\\AppData\\Local\\Programs\\Python\\Python310\\Scripts\\', 'C', 
'\\Users\\pc\\AppData\\Local\\Programs\\Python\\Python310\\', 'C', '\\Program Files\\MySQL\\MySQL Shell 8.0\\bin\\', 'C', '\\Users\\pc\\AppData\\Local\\Microsoft\\WindowsApps', '', 'C', '\\Users\\pc\\AppData\\Local\\Programs\\Microsoft VS Code\\bin']), This path may cause jittor found the wrong libs, please unset LD_LIBRARY_PATH and remove cuda lib path in Path.
Or you can let jittor install cuda for you: `python3.x -m jittor_utils.install_cuda`
Loading config from:  configs/preprocess/dota1_5_preprocess_config.py
{'type': 'DOTA1_5', 'source_dataset_path': 'D:\\project-jittor\\dataset\\dota-1.0', 'target_dataset_path': 'D:\\project-jittor\\dataset\\dota-1.0-processed', 'tasks': [{'label': 'trainval', 'config': {'subimage_size': 600, 'overlap_size': 150, 'multi_scale': [1.0], 'horizontal_flip': False, 'vertical_flip': False, 'rotation_angles': [0.0]}}, {'label': 'test', 'config': {'subimage_size': 600, 'overlap_size': 150, 'multi_scale': [1.0], 'horizontal_flip': False, 'vertical_flip': False, 'rotation_angles': [0.0]}}], 'name': 'dota1_5_preprocess_config', 'work_dir': 'work_dirs/dota1_5_preprocess_config'}
==============
processing trainval
fatal   fatal   : : Memory allocation failureMemory allocation failure

fatal   : Memory allocation failure
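The system cannot execute the specified program.
Not enough memory resources are available to process this command.
Not enough memory resources are available to process this command.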
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 289, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\project-jittor\jdet\tools\preprocess.py", line 5, in <module>
    from jdet.config import init_cfg, get_cfg
  File "d:\project-jittor\jdet\python\jdet\__init__.py", line 1, in <module>
    from . import models
  File "d:\project-jittor\jdet\python\jdet\models\__init__.py", line 1, in <module>
    from .networks import *
  File "d:\project-jittor\jdet\python\jdet\models\networks\__init__.py", line 1, in <module>
    from .rcnn import RCNN
  File "d:\project-jittor\jdet\python\jdet\models\networks\rcnn.py", line 2, in <module>
    import jittor as jt
  File "C:\Users\pc\AppData\Local\Programs\Python\Python310\lib\site-packages\jittor-1.3.5.16-py3.10.egg\jittor\__init__.py", line 32, in <module>
    from typing import List, Tuple
  File "C:\Users\pc\AppData\Local\Programs\Python\Python310\lib\typing.py", line 2178, in <module>
    class SupportsInt(Protocol):
  File "c:\users\pc\appdata\local\programs\python\python310\lib\abc.py", line 106, in __new__
    cls = super().__new__(mcls, name, bases, namespace, **kwargs)
  File "C:\Users\pc\AppData\Local\Programs\Python\Python310\lib\typing.py", line 1554, in __init_subclass__
    cls._is_protocol = any(b is Protocol for b in cls.__bases__)
MemoryError: Out of memory interning an attribute name
1 / 10
/ 0
1 / 0
C:\Users\pc\AppData\Local\Programs\Python\Python310\lib\site-packages\shapely-2.0a1-py3.10-win-amd64.egg\shapely\set_operations.py:132: RuntimeWarning: invalid value encountered in intersection
  return lib.intersection(a, b, **kwargs)
C:\Users\pc\AppData\Local\Programs\Python\Python310\lib\site-packages\shapely-2.0a1-py3.10-win-amd64.egg\shapely\set_operations.py:132: RuntimeWarning: invalid value encountered in intersection
  return lib.intersection(a, b, **kwargs)
C:\Users\pc\AppData\Local\Programs\Python\Python310\lib\site-packages\shapely-2.0a1-py3.10-win-amd64.egg\shapely\set_operations.py:132: RuntimeWarning: invalid value encountered in intersection
  return lib.intersection(a, b, **kwargs)
2 / 0
2 / 0
3 / 0
4 / 0

After this there are many similar "number / 0" log lines, along with log lines like the following:

C:\Users\pc\AppData\Local\Programs\Python\Python310\lib\site-packages\shapely-2.0a1-py3.10-win-amd64.egg\shapely\set_operations.py:132: RuntimeWarning: invalid value encountered in intersection
  return lib.intersection(a, b, **kwargs)

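While the command is running, a system error dialog pops up: nvcc.exe - The application was unable to start correctly (0xc0000142). Click OK to close the application.

Further on there are a large number of similar error logs: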

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 236, in prepare
Traceback (most recent call last):
      File "<string>", line 1, in <module>
_fixup_main_from_path(data['init_main_from_path'])
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 289, in run_path
  File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 116, in spawn_main
    return _run_module_code(code, init_globals, run_name,
exitcode = _main(fd, parent_sentinel)  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 96, in _run_module_code

      File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 125, in _main
_run_code(code, mod_globals, init_globals,
prepare(preparation_data)  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 86, in _run_code

      File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 236, in prepare
exec(code, run_globals)
_fixup_main_from_path(data['init_main_from_path'])  File "D:\project-jittor\jdet\tools\preprocess.py", line 5, in <module>

      File "c:\users\pc\appdata\local\programs\python\python310\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
from jdet.config import init_cfg, get_cfg
      File "d:\project-jittor\jdet\python\jdet\__init__.py", line 1, in <module>
main_content = runpy.run_path(main_path,
from . import models  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 289, in run_path

  File "d:\project-jittor\jdet\python\jdet\models\__init__.py", line 1, in <module>
    return _run_module_code(code, init_globals, run_name,
  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "c:\users\pc\appdata\local\programs\python\python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\project-jittor\jdet\tools\preprocess.py", line 5, in <module>
    from jdet.config import init_cfg, get_cfg
  File "d:\project-jittor\jdet\python\jdet\__init__.py", line 1, in <module>
    from . import models
  File "d:\project-jittor\jdet\python\jdet\models\__init__.py", line 1, in <module>
        from .networks import *from .networks import *

  File "d:\project-jittor\jdet\python\jdet\models\networks\__init__.py", line 1, in <module>
  File "d:\project-jittor\jdet\python\jdet\models\networks\__init__.py", line 1, in <module>
MemoryError: Out of memory interning an attribute name
ImportError: DLL load failed while importing cv2: The paging file is too small for this operation to complete.

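Is memory usage this high expected?
Is there a configuration option that limits how much memory is used?
Or is the only option right now to run the preprocessing command on a machine with more memory?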

@FirokOtaku
Author

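I found a solution in the discussion group:
change the value of num_process at line 329 of jdet/python/jdet/data/devkits/ImgSplit_multi_process.py,
lowering it from the default of 32.

In my environment, setting it to 8 works without problems,
and the preprocessing script runs to completion.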
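
For anyone who would rather derive the number than guess it, here is a rough sketch of how a worker count could be capped. The tracebacks above show each worker being started through multiprocessing's spawn path and re-importing jdet and jittor, so every extra worker brings its own copy of those imports. The helper name pick_num_process and all of its parameters are hypothetical and not part of JDet, and the per-worker memory figure is a placeholder I have not measured, so treat it as an assumption to tune on your own machine.

```python
import os


def pick_num_process(requested=32, total_mem_gb=None, mem_per_worker_gb=2.0,
                     reserve_gb=4.0, cpu_count=None):
    """Return a worker count capped by CPU count and a rough memory budget.

    Hypothetical helper, not part of JDet. mem_per_worker_gb is an assumed
    estimate of what one worker needs, not a measured value.
    """
    if cpu_count is None:
        # os.cpu_count() can return None, so fall back to a single worker.
        cpu_count = os.cpu_count() or 1
    caps = [requested, cpu_count]
    if total_mem_gb is not None:
        # Keep some memory back for the OS and the parent process.
        usable_gb = max(total_mem_gb - reserve_gb, 0.0)
        caps.append(int(usable_gb // mem_per_worker_gb))
    # Never go below one worker, even if the memory budget rounds down to zero.
    return max(1, min(caps))


if __name__ == "__main__":
    # Illustrative inputs only: 32 GB of RAM and an assumed 2 GB per worker.
    print(pick_num_process(requested=32, total_mem_gb=32))
```

The result would then be written into num_process by hand, exactly as described above. As far as this thread shows, the script has no option that reads such a value automatically.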
