This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Ninja build error. #4756

Closed
daje0601 opened this issue Aug 19, 2022 · 6 comments
@daje0601

In the past, there was no problem installing with python setup.py develop.
I installed it in a fresh conda environment for bb3, but this Ninja error keeps occurring.

 File "/home/sseung/ParlAI/parlai/ops/ngram_repeat_block.py", line 23, in <module>
    ngram_repeat_block_cuda = load(
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1506, in _write_ninja_file_and_build_library
    verify_ninja_availability()
  File "/home/sseung/.conda/envs/parlai/lib/python3.8/site-packages/torch-1.12.1-py3.8-linux-x86_64.egg/torch/utils/cpp_extension.py", line 1562, in verify_ninja_availability
    raise RuntimeError("Ninja is required to load C++ extensions")

I tried two things: 1) pip uninstall Ninja followed by pip install Ninja,
and 2) granting execute permission to python3.8/site-packages/ninja-1.10.2.3-py3.8-linux-x86_64.egg/ninja/data/bin/ninja

and then

torch/cuda/__init__.py:146: UserWarning: 
NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA RTX A5000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
17:22:58 | building dictionary first...
17:22:58 | No model with opt yet at: /tmp/t5_init_model(.opt)
17:22:58 | Using CUDA
17:23:07 | Total parameters: 95,628,672 (95,628,672 trainable)
Traceback (most recent call last):
  File "/home/sseung/.conda/envs/parlai/bin/parlai", line 33, in <module>
    sys.exit(load_entry_point('parlai', 'console_scripts', 'parlai')())
  File "/home/sseung/ParlAI/parlai/__main__.py", line 14, in main
    superscript_main()
  File "/home/sseung/ParlAI/parlai/core/script.py", line 325, in superscript_main
    return SCRIPT_REGISTRY[cmd].klass._run_from_parser_and_opt(opt, parser)
  File "/home/sseung/ParlAI/parlai/core/script.py", line 108, in _run_from_parser_and_opt
    return script.run()
  File "/home/sseung/ParlAI/parlai/scripts/train_model.py", line 1054, in run
    self.train_loop = TrainLoop(self.opt)
  File "/home/sseung/ParlAI/parlai/scripts/train_model.py", line 378, in __init__
    self.agent = create_agent(opt)
  File "/home/sseung/ParlAI/parlai/core/agents.py", line 479, in create_agent
    model = model_class(opt)
  File "/home/sseung/ParlAI/parlai/core/torch_generator_agent.py", line 542, in __init__
    was_reset = self.init_optim(
  File "/home/sseung/ParlAI/parlai/core/torch_agent.py", line 1055, in init_optim
    self.optimizer = SafeFP16Optimizer(
  File "/home/sseung/ParlAI/parlai/utils/fp16.py", line 114, in __init__
    self.fp32_params = self._build_fp32_params(self.fp16_params, flatten=False)
  File "/home/sseung/ParlAI/parlai/utils/fp16.py", line 155, in _build_fp32_params
    p32 = torch.nn.Parameter(p.data.float())
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

To fix this error, I reinstalled torch to match the CUDA version, but I'm still getting it.
Please help me.
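The UserWarning above is the root cause of the later RuntimeError: the installed wheel was compiled only for sm_37 through sm_70, while the RTX A5000 is sm_86, so no kernel image matches the device. A simplified sketch of that compatibility check (hypothetical helper, not PyTorch's actual code; real dispatch can also JIT-compile from embedded PTX):

```python
def wheel_supports_gpu(arch_list, capability):
    """Simplified check: does any compiled arch cover this GPU?

    A cubin built for sm_XY runs only on GPUs of the same major
    architecture with an equal or higher minor version.
    """
    major, minor = capability
    for arch in arch_list:                      # e.g. "sm_70"
        a_major, a_minor = int(arch[3]), int(arch[4:])
        if a_major == major and a_minor <= minor:
            return True
    return False

# The situation from the warning: an sm_86 GPU vs. an older wheel.
old_wheel = ["sm_37", "sm_50", "sm_60", "sm_70"]
print(wheel_supports_gpu(old_wheel, (8, 6)))    # → False: no sm_8x kernels
```

A wheel built against CUDA 11.x (which includes sm_8x targets) would pass this check for the A5000.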

@klshuster
Contributor

I believe you need to do pip install ninja (lowercase)

Additionally, it looks like your GPU does not support the version of PyTorch you're trying to install.

@klshuster klshuster self-assigned this Aug 19, 2022
@daje0601
Author

daje0601 commented Aug 20, 2022

I tried reinstalling with three different torch versions:

pip3 install torch==1.8.2 torchvision==0.9.2 torchaudio==0.8.2 --extra-index-url https://download.pytorch.org/whl/lts/1.8/cu111
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
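Whichever wheel ends up installed, a quick way to see whether it was built for the GPU is to compare the build's arch list against the device's compute capability. A small diagnostic sketch (the import is guarded so it degrades gracefully when torch is absent; these are real torch APIs):

```python
def report_torch_cuda():
    """Summarize the installed torch build's CUDA support as a string."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    lines = [f"torch {torch.__version__}, built for CUDA {torch.version.cuda}"]
    if torch.cuda.is_available():
        # Architectures the wheel was compiled for, e.g. ['sm_37', ..., 'sm_86']
        lines.append(f"arch list: {torch.cuda.get_arch_list()}")
        # The GPU's capability, e.g. (8, 6) for an RTX A5000
        lines.append(f"device capability: {torch.cuda.get_device_capability(0)}")
    return "\n".join(lines)


print(report_torch_cuda())
```

For the A5000, the fix is a wheel whose arch list includes an sm_8x entry, i.e. one of the cu11x builds such as the cu113 command above.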

@daje0601
Author

daje0601 commented Aug 20, 2022

Today I removed the conda env and the ParlAI folder completely and set everything up again from scratch.

Then I hit a new error.

Installed /home/.conda/envs/parlai/lib/python3.8/site-packages/untokenize-0.1.1-py3.8.egg
error: tomli 1.2.3 is installed but tomli<3.0.0,>=2.0.0 is required by {'docformatter'}

ParlAI needs tomli<2.0.0, but docformatter needs tomli>=2.0.0,<3.0.0.

I tried upgrading Python from 3.8 to 3.9 just in case, but I'm still getting this error.

How can I fix it?
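The conflict is mechanical: no single tomli release can satisfy both ParlAI's <2.0.0 pin and docformatter's >=2.0.0,<3.0.0 requirement. A toy illustration using numeric version tuples (real resolvers use packaging's SpecifierSet, not this comparator):

```python
def satisfies(version, lower_incl, upper_excl):
    """True if lower_incl <= version < upper_excl (toy comparator)."""
    def as_tuple(s):
        return tuple(int(x) for x in s.split("."))
    return as_tuple(lower_incl) <= as_tuple(version) < as_tuple(upper_excl)


# docformatter's constraint: >=2.0.0,<3.0.0
print(satisfies("1.2.3", "2.0.0", "3.0.0"))  # → False: the installed tomli
print(satisfies("2.0.1", "2.0.0", "3.0.0"))  # → True: what a bumped pin allows
```

Bumping ParlAI's own pin to tomli>=2.0.0, as suggested below by the maintainers, makes the two constraints overlap.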

@stephenroller
Contributor

We probably just need to bump tomli to >=2.0.0 in our requirements.txt

@klshuster
Contributor

That is fixed in #4759

@daje0601
Author

> That is fixed in #4759

Thank you!
