Error building cpu_adam #2268

Closed
arijitthegame opened this issue Aug 26, 2022 · 2 comments

Comments

@arijitthegame

Hi,

I apologize if this is a duplicate issue. I installed DeepSpeed with pip alongside PyTorch 1.11, but building cpu_adam still fails.

python -c "import deepspeed; deepspeed.ops.op_builder.CPUAdamBuilder().load() "
Installed CUDA version 10.0 does not match the version torch was compiled with 10.2 but since the APIs are compatible, accepting this combination
Using /home/ubuntu/.cache/torch_extensions/py38_cu102 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py38_cu102/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o 
FAILED: cpu_adam.o 
c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o 
In file included from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:3:0,
                 from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                 from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
                 from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp:5:
/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:10:10: fatal error: Python.h: No such file or directory
 #include <Python.h>
          ^~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1740, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 470, in load
    return self.jit_load(verbose)
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 512, in jit_load
    op_module = load(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1144, in load
    return _jit_compile(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1357, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1469, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1756, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'cpu_adam'

What is the easiest way to solve it?

@jeffra
Collaborator

jeffra commented Aug 29, 2022

Hi @arijitthegame, thanks for reporting your issue. The error here is that the compiler cannot find Python.h, so I think you'll want to make sure you have python-dev (the Python development headers) installed. DeepSpeed compiles several custom CUDA/C++ kernels that have Python bindings, and building them needs those headers.

Can you try installing python-dev? Examples: https://stackoverflow.com/questions/21530577/fatal-error-python-h-no-such-file-or-directory
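
A quick way to confirm the diagnosis before installing anything is to ask the interpreter where it expects its headers and check whether Python.h is actually there. This is only a minimal sketch: `find_python_header` is a hypothetical helper name, and it assumes the build looks in the directory reported by `sysconfig`, which is consistent with the `-isystem /usr/include/python3.8` flag in the log above but is not guaranteed for every setup.

```python
import os
import sysconfig


def find_python_header(include_dir=None):
    """Return the path to Python.h inside include_dir, or None if it is missing.

    When include_dir is omitted, use the include directory that the running
    interpreter reports (assumption: the extension build uses the same one).
    """
    if include_dir is None:
        include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return header if os.path.isfile(header) else None


# Report the result for whichever interpreter runs this script.
found = find_python_header()
if found:
    print("Python.h found at", found)
else:
    print("Python.h is missing from", sysconfig.get_paths()["include"])
```

Run it with the same interpreter (and virtual environment) that imports deepspeed; if it reports the header as missing, the development package for that exact Python version is the likely gap.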

@arijitthegame
Author

arijitthegame commented Aug 29, 2022

Thank you so much for your reply. When I try to install python-dev, I get:
python3-dev is already the newest version (3.6.7-1~18.04).

EDIT: I just needed to install the python3.x-dev package that matches my Python version, and it works now. Thank you so much!!
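
For anyone landing here with the same symptom: in the log above the build runs under Python 3.8, while apt reports python3-dev at 3.6.7, so the generic package supplies headers for a different interpreter. The sketch below derives the versioned package name from the running interpreter. It assumes the Debian/Ubuntu `pythonX.Y-dev` naming convention, `dev_package_name` is a hypothetical helper, and whether that package is available depends on your distribution release and configured repositories.

```python
import sys


def dev_package_name(version_info=sys.version_info):
    """Build the versioned dev package name, e.g. 'python3.8-dev'.

    Assumption: Debian/Ubuntu-style naming (pythonX.Y-dev).
    """
    major, minor = version_info[0], version_info[1]
    return f"python{major}.{minor}-dev"


# Print the install command to run by hand for the current interpreter.
print("sudo apt-get install", dev_package_name())
```

Run it inside the environment that fails to build so the suggested name follows that interpreter rather than the system default.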
