RuntimeError: Error building extension 'quant_cuda' #44

Closed
rodrigolagartera opened this issue May 6, 2021 · 6 comments


rodrigolagartera commented May 6, 2021

I tried to install QPyTorch, and at the beginning of the Functionality_Overview example I got this error:

`Using /home/rodrigo/.cache/torch_extensions as PyTorch extensions root...
Emitting ninja build file /home/rodrigo/.cache/torch_extensions/quant_cpu/build.ninja...
Building extension module quant_cpu...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module quant_cpu...
Using /home/rodrigo/.cache/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/rodrigo/.cache/torch_extensions/quant_cuda/build.ninja...
Building extension module quant_cuda...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)

CalledProcessError Traceback (most recent call last)
~/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _run_ninja_build(build_directory, verbose, error_prefix)
1666 stdout_fileno = 1
-> 1667 subprocess.run(
1668 command,

~/anaconda3/lib/python3.8/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
511 if check and retcode:
--> 512 raise CalledProcessError(retcode, process.args,
513 output=stdout, stderr=stderr)

CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in
1 import torch
----> 2 import qtorch

~/tfm/QPyTorch/qtorch/__init__.py in
1 from .number import *
----> 2 from .posit_activation import *
3 __all__ = ["FixedPoint", "BlockFloatingPoint", "FloatingPoint", "Posit", "PositTanhModule","PositTanhModuleEnhanced","RefTanhModule"]

~/tfm/QPyTorch/qtorch/posit_activation.py in
2 #Todo : implement sigmoid, rarely used in modern DNN
3 import torch
----> 4 from qtorch.quant import posit_sigmoid, posit_tanh, posit_tanh_enhanced
5 class PositTanhModule(torch.nn.Module):
6 def forward(self, input):

~/tfm/QPyTorch/qtorch/quant/__init__.py in
----> 1 from .quant_function import *
2 from .quant_module import *
3
4 __all__ = [
5 "fixed_point_quantize",

~/tfm/QPyTorch/qtorch/quant/quant_function.py in
20
21 if torch.cuda.is_available():
---> 22 quant_cuda = load(
23 name="quant_cuda",
24 sources=[

~/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
1077 verbose=True)
1078 '''
-> 1079 return _jit_compile(
1080 name,
1081 [sources] if isinstance(sources, str) else sources,

~/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
1290 clean_ctx=clean_ctx
1291 )
-> 1292 _write_ninja_file_and_build_library(
1293 name=name,
1294 sources=sources,

~/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _write_ninja_file_and_build_library(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_standalone)
1402 if verbose:
1403 print(f'Building extension module {name}...')
-> 1404 _run_ninja_build(
1405 build_directory,
1406 verbose,

~/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py in _run_ninja_build(build_directory, verbose, error_prefix)
1681 if hasattr(error, 'output') and error.output: # type: ignore
1682 message += f": {error.output.decode()}" # type: ignore
-> 1683 raise RuntimeError(message) from e
1684
1685

RuntimeError: Error building extension 'quant_cuda'`
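
The traceback above ends with the generic `RuntimeError` and does not include the compiler output that made `ninja -v` fail. One way to surface it, sketched here under assumptions, is to re-run ninja by hand in the build directory printed in the log. The helper name `rerun_ninja` is hypothetical, and the path is the one from the log with the home directory abbreviated to `~`.

```python
import shutil
import subprocess
from pathlib import Path


def rerun_ninja(build_dir):
    """Re-run `ninja -v` in build_dir and return (exit_code, combined_output).

    Returns None when ninja is not on PATH or the directory has no build.ninja.
    (Hypothetical helper, not part of QPyTorch or PyTorch.)
    """
    build_dir = Path(build_dir).expanduser()
    ninja = shutil.which("ninja")
    if ninja is None or not (build_dir / "build.ninja").is_file():
        return None
    result = subprocess.run(
        [ninja, "-v"],
        cwd=build_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # compiler errors usually arrive on stderr
        text=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Build directory as reported in the log above (home directory abbreviated).
    outcome = rerun_ninja("~/.cache/torch_extensions/quant_cuda")
    if outcome is None:
        print("ninja or build.ninja not found; nothing to re-run")
    else:
        code, output = outcome
        print(f"ninja exited with status {code}")
        print(output)
```

The printed output should contain the actual `nvcc` or `gcc` error, which is usually more informative than the Python traceback.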

Owner

Tiiiger commented May 6, 2021

hi @rodrigolagartera

What is your PyTorch version? Does your PyTorch build support CUDA?

Owner

Tiiiger commented May 6, 2021

Also, what is your system? Is this Linux?
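
The details requested here can be collected in one go. The following is a minimal sketch, not something from the QPyTorch repository: `describe_environment` is a hypothetical helper name, and it relies only on `platform` from the standard library plus `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import platform


def describe_environment():
    """Collect OS, Python, and PyTorch/CUDA details as a plain dict.

    Hypothetical helper; PyTorch fields stay None when torch is not installed.
    """
    info = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "torch": None,
        "torch_built_with_cuda": None,
        "cuda_available": None,
    }
    try:
        import torch
    except ImportError:
        return info  # PyTorch is not installed in this interpreter

    info["torch"] = torch.__version__
    # CUDA version the PyTorch binary was built against; None for CPU-only builds.
    info["torch_built_with_cuda"] = torch.version.cuda
    # True only if a usable GPU and driver are visible at runtime.
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

A build can report a CUDA version under `torch_built_with_cuda` and still show `cuda_available: False`, for example when the driver is missing or too old for that build.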

Author

rodrigolagartera commented May 7, 2021

Hi @Tiiiger, I use Ubuntu 20.04. My PyTorch version is 1.8.0 and it supports CUDA.
I meet all the installation requirements, with Python 3.8 and gcc 9.3.0.
When the error occurred I was using CUDA 10.1.
I decided to update to CUDA 11.0, and when I run the example the error no longer appears, but I think it isn't using CUDA, because I get this message:

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
Using /home/rodrigo/.cache/torch_extensions as PyTorch extensions root...
Emitting ninja build file /home/rodrigo/.cache/torch_extensions/quant_cpu/build.ninja...
Building extension module quant_cpu...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module quant_cpu...

I've changed the CUDA_HOME variable to the 11.0 version and still get this message. The example works, but I'm afraid it isn't using CUDA.
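
As far as I can tell, `torch.utils.cpp_extension` prints "No CUDA runtime is found" when `torch.cuda.is_available()` returns `False`. The traceback earlier in the thread shows that `quant_function.py` only builds `quant_cuda` inside `if torch.cuda.is_available():`, so in this state only `quant_cpu` is compiled, which would explain why the build error disappeared. Changing `CUDA_HOME` affects which toolkit is used for compilation, not whether PyTorch can see a GPU at runtime. To check the toolkit side, here is a small sketch using only the standard library. The helper name `inspect_cuda_toolkit` is hypothetical.

```python
import os
import shutil
import subprocess
from pathlib import Path


def inspect_cuda_toolkit(env=None):
    """Report CUDA_HOME and the nvcc it leads to (hypothetical helper).

    Looks for nvcc under $CUDA_HOME/bin first, then on PATH.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    report = {"CUDA_HOME": cuda_home, "nvcc": None, "nvcc_version": None}

    candidates = []
    if cuda_home:
        candidates.append(str(Path(cuda_home) / "bin" / "nvcc"))
    on_path = shutil.which("nvcc", path=env.get("PATH"))
    if on_path:
        candidates.append(on_path)

    for nvcc in candidates:
        if os.path.isfile(nvcc) and os.access(nvcc, os.X_OK):
            report["nvcc"] = nvcc
            try:
                out = subprocess.run(
                    [nvcc, "--version"],
                    capture_output=True,
                    text=True,
                    timeout=30,
                )
            except (OSError, subprocess.TimeoutExpired):
                break  # nvcc exists but could not be run
            report["nvcc_version"] = out.stdout.strip() or None
            break
    return report


if __name__ == "__main__":
    for key, value in inspect_cuda_toolkit().items():
        print(f"{key}: {value}")
```

If `nvcc` reports 11.0 here while `torch.cuda.is_available()` is still `False`, the more likely culprits are the NVIDIA driver or a mismatch between the installed PyTorch binary and the driver, rather than `CUDA_HOME`.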

Owner

Tiiiger commented May 7, 2021

hi @rodrigolagartera

sorry to hear that. I have tested on 1.8.0 before and it worked for me.

One thing that has caused errors for me is a stale cache for the compiled package. Can you try removing /home/rodrigo/.cache/torch_extensions/quant_* and recompiling?

I will try to reproduce this over the weekend.
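
The cache cleanup suggested above can be scripted. This is a minimal sketch with a hypothetical helper name, `clear_cached_extensions`. It assumes the extensions root is `~/.cache/torch_extensions`, as in the log, unless the `TORCH_EXTENSIONS_DIR` environment variable points elsewhere. It defaults to a dry run so nothing is deleted by accident.

```python
import os
import shutil
from pathlib import Path


def clear_cached_extensions(root=None, pattern="quant_*", dry_run=True):
    """Find (and optionally delete) cached extension build directories.

    Hypothetical helper. Returns the list of matching directories.
    """
    if root is None:
        # Assumption: default root as shown in the log, overridable via env var.
        root = os.environ.get("TORCH_EXTENSIONS_DIR", "~/.cache/torch_extensions")
    root = Path(root).expanduser()
    if not root.is_dir():
        return []

    matches = sorted(p for p in root.glob(pattern) if p.is_dir())
    if not dry_run:
        for path in matches:
            shutil.rmtree(path)  # forces a full rebuild on the next import
    return matches


if __name__ == "__main__":
    for path in clear_cached_extensions(dry_run=True):
        print(f"would remove {path}")
```

Calling it with `dry_run=False` removes the directories, after which the next `import qtorch` recompiles the extensions from scratch.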

Author

Hi @Tiiiger
I tried removing the directory you mentioned, but I got the same result. I don't know if there is a problem with my CUDA environment.
However, the main problem is solved: I can now run the examples.

Thank you for your help.

Owner

Tiiiger commented May 8, 2021

ok glad to know that it works for you now. closing.

Tiiiger closed this as completed May 8, 2021