This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Ninja build error. #4756

Closed
daje0601 opened this issue Aug 19, 2022 · 6 comments
@daje0601

In the past, there was no problem installing with python setup.py develop.
I installed it in a fresh conda environment for bb3, but this Ninja error keeps occurring.

 File "/home/sseung/ParlAI/parlai/ops/ngram_repeat_block.py", line 23, in <module>
    ngram_repeat_block_cuda = load(
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1506, in _write_ninja_file_and_build_library
    verify_ninja_availability()
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1562, in verify_ninja_availability
    raise RuntimeError("Ninja is required to load C++ extensions")

I tried two things: 1) pip uninstall Ninja followed by pip install Ninja,
and 2) granting execute permission to python3.8/site-packages/ninja-1.10.2.3-py3.8-linux-x86_64.egg/ninja/data/bin/ninja

and then

torch/cuda/__init__.py:146: UserWarning: 
NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA RTX A5000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
17:22:58 | building dictionary first...
17:22:58 | No model with opt yet at: /tmp/t5_init_model(.opt)
17:22:58 | Using CUDA
17:23:07 | Total parameters: 95,628,672 (95,628,672 trainable)
Traceback (most recent call last):
  File "/home/sseung/.conda/envs/parlai/bin/parlai", line 33, in <module>
    sys.exit(load_entry_point('parlai', 'console_scripts', 'parlai')())
  File "/home/sseung/ParlAI/parlai/__main__.py", line 14, in main
    superscript_main()
  File "/home/sseung/ParlAI/parlai/core/script.py", line 325, in superscript_main
    return SCRIPT_REGISTRY[cmd].klass._run_from_parser_and_opt(opt, parser)
  File "/home/sseung/ParlAI/parlai/core/script.py", line 108, in _run_from_parser_and_opt
    return script.run()
  File "/home/sseung/ParlAI/parlai/scripts/train_model.py", line 1054, in run
    self.train_loop = TrainLoop(self.opt)
  File "/home/sseung/ParlAI/parlai/scripts/train_model.py", line 378, in __init__
    self.agent = create_agent(opt)
  File "/home/sseung/ParlAI/parlai/core/agents.py", line 479, in create_agent
    model = model_class(opt)
  File "/home/sseung/ParlAI/parlai/core/torch_generator_agent.py", line 542, in __init__
    was_reset = self.init_optim(
  File "/home/sseung/ParlAI/parlai/core/torch_agent.py", line 1055, in init_optim
    self.optimizer = SafeFP16Optimizer(
  File "/home/sseung/ParlAI/parlai/utils/fp16.py", line 114, in __init__
    self.fp32_params = self._build_fp32_params(self.fp16_params, flatten=False)
  File "/home/sseung/ParlAI/parlai/utils/fp16.py", line 155, in _build_fp32_params
    p32 = torch.nn.Parameter(p.data.float())
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

To fix this error, I reinstalled torch to match the CUDA version, but I'm still getting it.
Please help me.
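The UserWarning above is the root cause of the later RuntimeError: the installed wheel was compiled only for sm_37 through sm_70, while the RTX A5000 is sm_86, so no kernel image matches the device. A simplified sketch of that compatibility check (hypothetical helper, not PyTorch's actual code; real dispatch can also JIT-compile from embedded PTX):

```python
def wheel_supports_gpu(arch_list, capability):
    """Simplified check: does any compiled arch cover this GPU?

    A cubin built for sm_XY runs only on GPUs of the same major
    architecture with an equal or higher minor version.
    """
    major, minor = capability
    for arch in arch_list:                      # e.g. "sm_70"
        a_major, a_minor = int(arch[3]), int(arch[4:])
        if a_major == major and a_minor <= minor:
            return True
    return False

# The situation from the warning: an sm_86 GPU vs. an older wheel.
old_wheel = ["sm_37", "sm_50", "sm_60", "sm_70"]
print(wheel_supports_gpu(old_wheel, (8, 6)))    # → False: no sm_8x kernels
```

A wheel built against CUDA 11.x (which includes sm_8x targets) would pass this check for the A5000.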

@klshuster
Contributor

I believe you need to do pip install ninja (lowercase)

Additionally, it looks like your GPU does not support the version of PyTorch you're trying to install.

@klshuster klshuster self-assigned this Aug 19, 2022
@daje0601
Author

daje0601 commented Aug 20, 2022

I tried reinstalling with three different torch versions:

pip3 install torch==1.8.2 torchvision==0.9.2 torchaudio==0.8.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
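Whichever wheel ends up installed, a quick way to see whether it was built for the GPU is to compare the build's arch list against the device's compute capability. A small diagnostic sketch (the import is guarded so it degrades gracefully when torch is absent; these are real torch APIs):

```python
def report_torch_cuda():
    """Summarize the installed torch build's CUDA support as a string."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    lines = [f"torch {torch.__version__}, built for CUDA {torch.version.cuda}"]
    if torch.cuda.is_available():
        # Architectures the wheel was compiled for, e.g. ['sm_37', ..., 'sm_86']
        lines.append(f"arch list: {torch.cuda.get_arch_list()}")
        # The GPU's capability, e.g. (8, 6) for an RTX A5000
        lines.append(f"device capability: {torch.cuda.get_device_capability(0)}")
    return "\n".join(lines)


print(report_torch_cuda())
```

For the A5000, the fix is a wheel whose arch list includes an sm_8x entry, i.e. one of the cu11x builds such as the cu113 command above.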

@daje0601
Author

daje0601 commented Aug 20, 2022

Today I removed the conda env and the ParlAI folder completely and set everything up again from scratch.

Then I hit a new error.

Installed /home/.conda/envs/parlai/lib/python3.8/site-packages/untokenize-0.1.1-py3.8.egg
error: tomli 1.2.3 is installed but tomli<3.0.0,>=2.0.0 is required by {'docformatter'}

ParlAI needs tomli<2.0.0, but docformatter needs tomli>=2.0.0,<3.0.0.

I tried upgrading Python from 3.8 to 3.9 just in case, but I'm still getting this error.

How can I fix it?
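The conflict is mechanical: no single tomli release can satisfy both ParlAI's <2.0.0 pin and docformatter's >=2.0.0,<3.0.0 requirement. A toy illustration using numeric version tuples (real resolvers use packaging's SpecifierSet, not this comparator):

```python
def satisfies(version, lower_incl, upper_excl):
    """True if lower_incl <= version < upper_excl (toy comparator)."""
    def as_tuple(s):
        return tuple(int(x) for x in s.split("."))
    return as_tuple(lower_incl) <= as_tuple(version) < as_tuple(upper_excl)


# docformatter's constraint: >=2.0.0,<3.0.0
print(satisfies("1.2.3", "2.0.0", "3.0.0"))  # → False: the installed tomli
print(satisfies("2.0.1", "2.0.0", "3.0.0"))  # → True: what a bumped pin allows
```

Bumping ParlAI's own pin to tomli>=2.0.0, as suggested below by the maintainers, makes the two constraints overlap.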

@stephenroller
Contributor

We probably just need to bump tomli to >=2.0.0 in our requirements.txt

@klshuster
Contributor

That is fixed in #4759

@daje0601
Author

> That is fixed in #4759

Thank you!
