I apologize if this is a duplicate issue. I just installed DeepSpeed with pip alongside PyTorch 1.11, but I am still having issues building cpu_adam.
python -c "import deepspeed; deepspeed.ops.op_builder.CPUAdamBuilder().load() "
Installed CUDA version 10.0 does not match the version torch was compiled with 10.2 but since the APIs are compatible, accepting this combination
Using /home/ubuntu/.cache/torch_extensions/py38_cu102 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py38_cu102/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o
FAILED: cpu_adam.o
c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/TH -isystem /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__AVX256__ -c /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o
In file included from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:3:0,
from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
from /home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp:5:
/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:10:10: fatal error: Python.h: No such file or directory
#include <Python.h>
^~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1740, in _run_ninja_build
subprocess.run(
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 470, in load
return self.jit_load(verbose)
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 512, in jit_load
op_module = load(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1144, in load
return _jit_compile(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1357, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1469, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/ubuntu/nlp_prompting/env/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1756, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'cpu_adam'
What is the easiest way to solve it?
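Hi @arijitthegame, thanks for reporting your issue. The error here is the compiler failing to find Python.h, so you'll want to make sure the Python development headers (python-dev) are installed. On Ubuntu with Python 3.8 that is typically the python3-dev or python3.8-dev package. DeepSpeed compiles several custom CUDA/C++ kernels that have Python bindings, and building those bindings requires the headers.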
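Before rerunning the build, a quick check like the one below can show whether the headers are visible to the interpreter. This is only a minimal sketch: `find_python_header` is a hypothetical helper name, not part of DeepSpeed or PyTorch, and it assumes the extension build looks in the include directory reported by the standard-library `sysconfig` module (here that would be `/usr/include/python3.8`, matching the `-isystem` flag in the log above).

```python
import os
import sysconfig


def find_python_header(include_dir=None):
    """Return the full path to Python.h if it exists, otherwise None.

    include_dir defaults to the interpreter's own include directory as
    reported by sysconfig (an assumption about where the build looks).
    """
    if include_dir is None:
        include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return header if os.path.isfile(header) else None


if __name__ == "__main__":
    header = find_python_header()
    if header is None:
        # Missing header: the Python development package is likely not installed.
        print("Python.h not found in", sysconfig.get_paths()["include"])
    else:
        print("Found", header)
```

If this reports that Python.h is missing, installing the development package and then rerunning the `CPUAdamBuilder().load()` command from the original post should let the compile step get past the `#include <Python.h>` line.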