CPU kernel acceleration #15

Closed
mengjingyouling opened this issue Apr 14, 2022 · 19 comments

Comments

@mengjingyouling

A CPU kernel is implemented in the project. We want to know which CPUs support it. What speedup does it provide?
Thank you very much.

@mengjingyouling
Author

We found that it is caused by the two parameters act_integer_bits and act_fraction_bits; we use half-precision training. When act_integer_bits and act_fraction_bits are set to 8 bits, the result is correct, but when they are set to 16 bits, it is wrong. What is the reason? Thank you!
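
One possible explanation, offered as a guess and not confirmed against the repository's code: if activations are rounded to a fixed-point grid while the tensors are stored in float16, a scale factor of 2^16 already exceeds the largest finite float16 value (65504), so the scaling step overflows. The NumPy sketch below illustrates this. `to_fixed_point` and its formula are hypothetical; they only assume that quantization multiplies by 2^fraction_bits, rounds, and divides back.

```python
import numpy as np

def to_fixed_point(x, integer_bits, fraction_bits, dtype):
    """Round x to a signed fixed-point grid, doing all arithmetic in `dtype`.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    with np.errstate(all="ignore"):  # silence overflow/invalid warnings
        x = np.asarray(x).astype(dtype)
        # In float16, 2**16 cannot be represented and becomes inf.
        scale = np.asarray(2.0 ** fraction_bits).astype(dtype)
        max_val = np.asarray(2.0 ** (integer_bits - 1)).astype(dtype)
        quantized = np.round(x * scale) / scale
        return np.clip(quantized, -max_val, max_val)

print(to_fixed_point(0.3, 8, 8, np.float16))    # finite value
print(to_fixed_point(0.3, 16, 16, np.float16))  # scale overflowed
print(to_fixed_point(0.3, 16, 16, np.float32))  # finite value
```

Under these assumptions, the 8-bit setting stays finite in float16 for small inputs, the 16-bit setting produces NaN in float16, and the same 16-bit call is finite in float32.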

@mostafaelhoushi
Owner

Sorry for the delay. Let me look into the code.
When you don't use the CPU kernel, and use CUDA instead, are the results correct?

@mengjingyouling
Author

When we run the "sh install_kernels.sh" command, an error occurs:
unable to execute '/usr/local/cuda-10.0/bin/nvcc': No such file or directory
error: command '/usr/local/cuda-10.0/bin/nvcc' failed with exit status 1

@mengjingyouling
Author

(deepshift) ubuntu@ubuntu-NF5280M5:~/zj/DeepShift/pytorch$ sh install_kernels.sh
running install
running bdist_egg
running egg_info
writing unoptimized_cuda.egg-info/PKG-INFO
writing dependency_links to unoptimized_cuda.egg-info/dependency_links.txt
writing top-level names to unoptimized_cuda.egg-info/top_level.txt
reading manifest file 'unoptimized_cuda.egg-info/SOURCES.txt'
writing manifest file 'unoptimized_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'unoptimized_cuda' extension
gcc -pthread -B /home/ubuntu/anaconda3/envs/deepshift/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/TH -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda-10.0/include -I/home/ubuntu/anaconda3/envs/deepshift/include/python3.6m -c unoptimized_cuda.cpp -o build/temp.linux-x86_64-3.6/unoptimized_cuda.o -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=unoptimized_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda-10.0/bin/nvcc -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/TH -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda-10.0/include -I/home/ubuntu/anaconda3/envs/deepshift/include/python3.6m -c unoptimized.cu -o build/temp.linux-x86_64-3.6/unoptimized.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=unoptimized_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
unable to execute '/usr/local/cuda-10.0/bin/nvcc': No such file or directory
error: command '/usr/local/cuda-10.0/bin/nvcc' failed with exit status 1
/home/ubuntu/zj/DeepShift/pytorch
running install
running bdist_egg
running egg_info
writing deepshift_cpu.egg-info/PKG-INFO
writing dependency_links to deepshift_cpu.egg-info/dependency_links.txt
writing top-level names to deepshift_cpu.egg-info/top_level.txt
reading manifest file 'deepshift_cpu.egg-info/SOURCES.txt'
writing manifest file 'deepshift_cpu.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
creating build/bdist.linux-x86_64/egg
copying build/lib.linux-x86_64-3.6/deepshift_cpu.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg
creating stub loader for deepshift_cpu.cpython-36m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/deepshift_cpu.py to deepshift_cpu.cpython-36.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying deepshift_cpu.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying deepshift_cpu.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying deepshift_cpu.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying deepshift_cpu.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
__pycache__.deepshift_cpu.cpython-36: module references __file__
creating 'dist/deepshift_cpu-0.0.0-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing deepshift_cpu-0.0.0-py3.6-linux-x86_64.egg
removing '/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/deepshift_cpu-0.0.0-py3.6-linux-x86_64.egg' (and everything under it)
creating /home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/deepshift_cpu-0.0.0-py3.6-linux-x86_64.egg
Extracting deepshift_cpu-0.0.0-py3.6-linux-x86_64.egg to /home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages
deepshift-cpu 0.0.0 is already the active version in easy-install.pth

Installed /home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/deepshift_cpu-0.0.0-py3.6-linux-x86_64.egg
Processing dependencies for deepshift-cpu==0.0.0
Finished processing dependencies for deepshift-cpu==0.0.0
/home/ubuntu/zj/DeepShift/pytorch
running install
running bdist_egg
running egg_info
writing deepshift_cuda.egg-info/PKG-INFO
writing dependency_links to deepshift_cuda.egg-info/dependency_links.txt
writing top-level names to deepshift_cuda.egg-info/top_level.txt
reading manifest file 'deepshift_cuda.egg-info/SOURCES.txt'
writing manifest file 'deepshift_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'deepshift_cuda' extension
gcc -pthread -B /home/ubuntu/anaconda3/envs/deepshift/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/TH -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda-10.0/include -I/home/ubuntu/anaconda3/envs/deepshift/include/python3.6m -c shift_cuda.cpp -o build/temp.linux-x86_64-3.6/shift_cuda.o -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=deepshift_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda-10.0/bin/nvcc -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/TH -I/home/ubuntu/anaconda3/envs/deepshift/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda-10.0/include -I/home/ubuntu/anaconda3/envs/deepshift/include/python3.6m -c shift.cu -o build/temp.linux-x86_64-3.6/shift.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=deepshift_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
unable to execute '/usr/local/cuda-10.0/bin/nvcc': No such file or directory
error: command '/usr/local/cuda-10.0/bin/nvcc' failed with exit status 1

@mengjingyouling
Author

Our path is /usr/local/cuda-11.1/bin/nvcc. What should we do?

@mengjingyouling
Author

@mostafaelhoushi

@mostafaelhoushi
Owner

Try to run:

export PATH=$PATH:/usr/local/cuda-11.1/bin/

and see if it works

@mostafaelhoushi
Owner

I would like to clarify that:

  • The CUDA kernels should work, but they are inefficient. A lot of work is required to make them more efficient.
  • I recall that the CPU kernels may need to be updated or fixed. They are inefficient as well.

@mengjingyouling
Author

We ran the command:
export PATH=$PATH:/usr/local/cuda-11.1/bin/

but it still gives the same error as before:
unable to execute '/usr/local/cuda-10.0/bin/nvcc': No such file or directory
error: command '/usr/local/cuda-10.0/bin/nvcc' failed with exit status 1

Does the kernel in this project support only CUDA 10.0? What should we do? Thank you very much.

@mengjingyouling
Author

We solved it by running:

export CUDA_HOME=/usr/local/cuda
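
A likely reason the PATH change had no effect while setting CUDA_HOME did: as far as I can tell, PyTorch's extension builder consults the CUDA_HOME environment variable before it looks for nvcc on PATH, so a stale CUDA_HOME pointing at cuda-10.0 would take precedence. The sketch below mirrors that lookup order using only the standard library. `guess_cuda_home` is a hypothetical helper, not PyTorch's API, and the order shown is an assumption about `torch.utils.cpp_extension`.

```python
import os
import shutil

def guess_cuda_home(env=None, which=shutil.which):
    """Approximate how a CUDA toolkit root might be resolved (assumed order)."""
    env = os.environ if env is None else env
    # 1. An explicit environment variable takes precedence.
    home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if home:
        return home
    # 2. Otherwise derive the root from the nvcc found on PATH (<root>/bin/nvcc).
    nvcc = which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(nvcc))
    # 3. Fall back to the conventional symlink, if it exists.
    return "/usr/local/cuda" if os.path.exists("/usr/local/cuda") else None

print(guess_cuda_home())
```

Running this in the build environment shows which toolkit root the assumed lookup would pick, which can be compared with the nvcc path in the error message.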

@mostafaelhoushi
Owner

This is great. Please don't hesitate to ask if you have further questions.

@mengjingyouling
Author

We encountered a new error when running the command "sh install_kernels.sh":

Installed /home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/deepshift_cpu-0.0.0-py3.6-linux-x86_64.egg
Processing dependencies for deepshift-cpu==0.0.0
Finished processing dependencies for deepshift-cpu==0.0.0
/home/ubuntu/zj/yolov3/pytorch
running install
running bdist_egg
running egg_info
writing deepshift_cuda.egg-info/PKG-INFO
writing dependency_links to deepshift_cuda.egg-info/dependency_links.txt
writing top-level names to deepshift_cuda.egg-info/top_level.txt
/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/torch/utils/cpp_extension.py:381: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'deepshift_cuda.egg-info/SOURCES.txt'
writing manifest file 'deepshift_cuda.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
Traceback (most recent call last):
File "setup.py", line 13, in <module>
'build_ext': BuildExtension
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/setuptools/command/install.py", line 67, in run
self.do_egg_install()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/setuptools/command/install.py", line 109, in do_egg_install
self.run_command('bdist_egg')
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/setuptools/command/bdist_egg.py", line 164, in run
cmd = self.call_command('install_lib', warn_dir=0)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
self.run_command(cmdname)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/setuptools/command/install_lib.py", line 11, in run
self.build()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/command/install_lib.py", line 107, in build
self.run_command('build_ext')
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 404, in build_extensions
self._check_cuda_version()
File "/home/ubuntu/anaconda3/envs/yolov3_new/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 781, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (11.1) mismatches the version that was used to compile
PyTorch (10.2). Please make sure to use the same CUDA versions.

@mengjingyouling
Author

Does your code support only a specific CUDA version? The CUDA version we use is 11.1. What can we do to support it?
Thank you very much! @mostafaelhoushi

@mostafaelhoushi
Owner

I think the error is due to the mismatch of your PyTorch version and CUDA version (rather than my code).

I checked the PyTorch website ( https://pytorch.org/get-started/previous-versions/ ) and found this installation command for CUDA 11.1. I suggest creating a new conda environment and installing PyTorch with this command:

# CUDA 11.1
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
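
Before rebuilding, it can help to confirm which toolkit release nvcc reports and compare it with the CUDA version PyTorch was built with (`torch.version.cuda`). A minimal sketch, assuming the usual `release X.Y` line in `nvcc --version` output; `parse_nvcc_release` and `same_major` are hypothetical helpers, and comparing only the major version is an assumption about what the build check enforces.

```python
import re

def parse_nvcc_release(nvcc_output):
    """Extract 'X.Y' from the text printed by `nvcc --version`, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None

def same_major(toolkit_version, torch_cuda_version):
    """True when both version strings share the same major number."""
    return toolkit_version.split(".")[0] == torch_cuda_version.split(".")[0]

# Sample line in the format nvcc prints; the versions match the error above.
sample = "Cuda compilation tools, release 11.1, V11.1.105"
detected = parse_nvcc_release(sample)
print(detected, same_major(detected, "10.2"))
```

In a real environment, the sample string would be replaced by the output of `nvcc --version` and "10.2" by the value of `torch.version.cuda`.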

@mengjingyouling
Author

Thank you very much for your solution. The kernel is installed correctly.
We would also like to discuss another question with you. In your paper, the shift network is applied to classification networks, not object detection. What do you think about using it for detection? Would the accuracy decline?

@mengjingyouling
Author

mengjingyouling commented Apr 21, 2022

Because using a single shift leads to some accuracy loss, we want to shift twice to reduce it. For example: 10 = 8 + 2 (shift by 3 bits + shift by 1 bit). Therefore, we modified the code as follows:

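import numpy as np
import torch

# Note: round(x, rounding) must be a project-level helper here;
# Python's built-in round does not accept a rounding mode.
def get_shift_and_sign(x, rounding='deterministic'):
    sign = torch.sign(x)
    x_abs = torch.abs(x)
    # First shift: rounded log2 of |x|
    shift1 = round(torch.log(x_abs) / np.log(2), rounding)
    wr1 = 2 ** shift1
    # Residual left after subtracting the first power of two
    w1 = x_abs - wr1
    # Second shift: rounded log2 of the residual
    shift2 = round(torch.log(w1) / np.log(2), rounding)
    return shift1, shift2, sign

def round_power_of_2(x, rounding='deterministic'):
    shift1, shift2, sign = get_shift_and_sign(x, rounding)
    x_rounded = (2.0 ** shift1 + 2.0 ** shift2) * sign
    return x_rounded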

However, the input to the forward function of class Conv2dShiftQ(_ConvNdShiftQ) then becomes NaN, which we suspect is caused by data overflow:

class Conv2dShiftQ(_ConvNdShiftQ):
    ... ....
    ... ...

    #@weak_script_method
    def forward(self, input):
        print("--------------------------------------forward---------------------------------------------------")
        print("input======",input)

Can you give some suggestions to solve it? Thank you very much.
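
One possible cause, offered as a guess and not a confirmed diagnosis: when the first log2 is rounded to the nearest integer, 2 ** shift1 can exceed |x|, so the residual becomes negative and its logarithm is NaN (and a zero residual gives -inf). The NumPy sketch below shows that failure and one way around it: take the floor for the first shift so the residual is never negative, and skip the second term when the residual is zero. It is written in NumPy for brevity and is not the repository's code; `naive_second_shift` and `two_shift_round` are hypothetical names.

```python
import numpy as np

def naive_second_shift(x_abs):
    """Second-shift exponent when the first log2 is rounded to nearest."""
    shift1 = np.round(np.log2(x_abs))
    with np.errstate(invalid="ignore", divide="ignore"):
        # The residual may be negative (NaN) or zero (-inf).
        return np.log2(x_abs - 2.0 ** shift1)

def two_shift_round(x):
    """Approximate x by sign * (2**s1 + 2**s2), avoiding log of values <= 0."""
    x = np.asarray(x, dtype=np.float64)
    sign = np.sign(x)
    x_abs = np.abs(x)
    out = np.zeros_like(x_abs)
    nonzero = x_abs > 0
    # Floor keeps 2**shift1 <= |x|, so the residual is non-negative.
    first = 2.0 ** np.floor(np.log2(x_abs[nonzero]))
    residual = x_abs[nonzero] - first
    second = np.zeros_like(residual)
    positive = residual > 0
    second[positive] = 2.0 ** np.round(np.log2(residual[positive]))
    out[nonzero] = first + second
    return sign * out

print(naive_second_shift(np.array([3.0])))       # NaN: 3 rounds up to 4, residual is -1
print(two_shift_round([10.0, -3.0, 6.0, 0.0]))   # no NaN in the result
```

With this approach the example from above, 10 = 8 + 2, is reproduced exactly, and exact zeros pass through unchanged.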

@mostafaelhoushi
Owner

Thanks @mengjingyouling. I will close this issue; I have started a new issue, #16, to discuss the other question.

@shihuihong214

Thank you very much for your solution. The kernel is installed correctly. We would also like to discuss another question with you. In your paper, the shift network is applied to classification networks, not object detection. What do you think about using it for detection? Would the accuracy decline?

@mengjingyouling Hi, I recently tried to install the shift kernels with torch 1.10.0 and CUDA 11.1, but they failed to compile. Have you ever successfully compiled the shift kernels with the same torch and CUDA versions? I would appreciate any guidance. Thanks.
