why you not support CPU #34

azuryl · 2019-10-25T10:21:26Z

Is it difficult to implement in code?

yossibiton · 2019-11-12T09:51:26Z

I agree. A CPU version would be very useful for debugging purposes.

I'm trying to use the CenterNet code, which relies on your code (https://github.com/xingyizhou/CenterNet/tree/master/src/lib/models/networks/DCNv2).
It fails to run the inference demo with most of the models because of memory issues (I have a 4 GB GPU).

abhigoku10 · 2020-01-04T11:35:28Z

@azuryl @yossibiton @CharlesShang Is there a CPU version of DCNv2? If not, when can we expect one?

palver7 · 2020-03-17T08:35:08Z

@abhigoku10, @yossibiton, @azuryl, I have modified DCNv2 from this repository to add the CPU functionality. I have submitted a pull request to Charles Shang, but so far there is no response from him. Have a look and try my implementation: https://github.com/palver7/DCNv2 .

@CharlesShang Please have a look and comment/review on my pull request.

macqueen09 · 2020-05-20T09:15:46Z

@palver7 Your link https://github.com/palver7/DCNv2 returns a 404. Where can I get the CPU version of DCNv2?
Thanks very much.

abhigoku10 · 2020-05-20T09:51:32Z

@palver7 Thanks for sharing it, but I am getting a 404 error. Can you share the link again?

palver7 · 2020-05-21T13:40:21Z

Hi @macqueen09 @abhigoku10, Charles Shang has merged my repo with his, so DCNv2 in this repo can now run on either CPU or GPU. Because of this, I no longer need to maintain my repo, and I deleted it. That is why you get the 404 error. You can download DCNv2 again and run python3 testcpu.py to see if it runs on your CPU.

abhigoku10 · 2020-05-21T14:19:16Z

@palver7 Can you share the location of your repo? I tried to find it but could not see it in your profile. Thanks for doing this.

palver7 · 2020-05-22T08:24:27Z

@abhigoku10 I have deleted DCNv2 from my account. Check the README at https://github.com/CharlesShang/DCNv2 again. It now has a line that says to run python testcpu.py to check whether it runs on CPU. That line came from my merged changes.

Also, if you check the files inside the src/cpu directory, you will see that they now contain actual code instead of the previous "not implemented on cpu" error message placeholders. You can now use Charles' DCNv2 on CPU as well as GPU.

abhigoku10 · 2020-05-23T14:41:02Z

@palver7 @CharlesShang Thanks a lot for the work you have done!

tabsun · 2020-06-19T09:24:35Z

@palver7 Great work!
I have used your CPU version of DCNv2 in mmdetection for prediction, and it gives correct results.
But the CPU DCNv2 is really slow. In my case a single DCN operation takes 200-600 ms, while the GPU needs only 3 ms. For networks with multiple DCN layers, speed is a real concern. When I tried to speed it up, I read the code and concluded there was not much to do.
Do you have any advice for a better implementation, or is there another implementation we can refer to?

Update:
I added OpenMP to im2col. It is a good tool for speeding up loop operations.
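
For readers wondering what adding OpenMP to an im2col loop can look like, here is a minimal sketch. It is not the code from this repository: it implements a plain (non-deformable) im2col with stride 1 and no dilation, and the function name im2col_omp and its parameters are hypothetical, chosen only for illustration. The point it shows is that each row of the column buffer is written by exactly one loop iteration, so the outer loop can be parallelized with a single pragma.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain im2col for one image (stride 1, no dilation, zero padding).
// Hypothetical helper for illustration; NOT the deformable im2col in DCNv2.
// input : channels x height x width, row-major
// return: (channels * k * k) x (out_h * out_w), row-major
std::vector<float> im2col_omp(const std::vector<float>& input,
                              int channels, int height, int width,
                              int k, int pad) {
    assert(k > 0 && pad >= 0);
    assert(input.size() ==
           static_cast<std::size_t>(channels) * height * width);
    const int out_h = height + 2 * pad - k + 1;
    const int out_w = width + 2 * pad - k + 1;
    assert(out_h > 0 && out_w > 0);
    const int rows = channels * k * k;
    std::vector<float> cols(
        static_cast<std::size_t>(rows) * out_h * out_w, 0.0f);

    // Every iteration writes a distinct row of `cols`, so iterations are
    // independent and can run on separate threads without locking.
    #pragma omp parallel for schedule(static)
    for (int row = 0; row < rows; ++row) {
        const int c  = row / (k * k);   // input channel
        const int ki = (row / k) % k;   // kernel row offset
        const int kj = row % k;         // kernel column offset
        for (int oh = 0; oh < out_h; ++oh) {
            const int ih = oh - pad + ki;
            if (ih < 0 || ih >= height) continue;  // padding stays zero
            for (int ow = 0; ow < out_w; ++ow) {
                const int iw = ow - pad + kj;
                if (iw < 0 || iw >= width) continue;
                cols[(static_cast<std::size_t>(row) * out_h + oh) * out_w + ow] =
                    input[(static_cast<std::size_t>(c) * height + ih) * width + iw];
            }
        }
    }
    return cols;
}
```

With GCC or Clang the pragma takes effect only when compiling with -fopenmp; without that flag it is ignored and the loop runs serially, which makes it easy to compare timings.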

palver7 · 2020-06-23T04:58:33Z

@tabsun Hi, I am happy to hear the CPU implementation works for you. Thanks for sharing the OpenMP tip too. I was going to suggest that you try writing a CPU version of the THCudaBlas_SgemmBatched routine, since that is what Charles used (in the dcn_v2_cuda.cu file) to speed up the CUDA version. I changed that to the ordinary THFloatBlas_gemm because I could not find a CPU counterpart to the CUDA batched GEMM routine.
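
One way to approximate a batched GEMM on the CPU, as suggested above, is to loop over the batch and call an ordinary GEMM for each item, running the batch loop in parallel with OpenMP. The sketch below is only an illustration under stated assumptions: gemm_naive and gemm_batched_omp are hypothetical names, the matrices are row-major and stored contiguously per batch, and the signature does not mirror the CUDA routine used in this repo. In practice the naive inner GEMM would be swapped for a tuned BLAS call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive row-major GEMM: C = alpha * A(MxK) * B(KxN) + beta * C(MxN).
// Hypothetical stand-in for a real BLAS sgemm call.
void gemm_naive(int M, int N, int K, float alpha,
                const float* A, const float* B, float beta, float* C) {
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < K; ++p) {
                acc += A[i * K + p] * B[p * N + j];
            }
            C[i * N + j] = alpha * acc + beta * C[i * N + j];
        }
    }
}

// Batched GEMM over contiguous storage: batch b of A starts at b*M*K,
// of B at b*K*N, and of C at b*M*N (an assumption of this sketch).
void gemm_batched_omp(int batch, int M, int N, int K, float alpha,
                      const std::vector<float>& A,
                      const std::vector<float>& B,
                      float beta, std::vector<float>& C) {
    assert(A.size() == static_cast<std::size_t>(batch) * M * K);
    assert(B.size() == static_cast<std::size_t>(batch) * K * N);
    assert(C.size() == static_cast<std::size_t>(batch) * M * N);

    // Each batch item reads and writes its own slices only, so the
    // batch loop can be split across threads safely.
    #pragma omp parallel for schedule(static)
    for (int b = 0; b < batch; ++b) {
        gemm_naive(M, N, K, alpha,
                   A.data() + static_cast<std::size_t>(b) * M * K,
                   B.data() + static_cast<std::size_t>(b) * K * N,
                   beta,
                   C.data() + static_cast<std::size_t>(b) * M * N);
    }
}
```

Parallelizing over the batch keeps each GEMM single-threaded, which tends to suit many small matrices; for a few large matrices it may be better to let a multithreaded BLAS parallelize inside each call instead.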

yeyuanzheng177 mentioned this issue on Dec 9, 2019: "When I use apex on 2080ti, I get the following error, how can I solve it?" #42 (Open)

palver7 mentioned this issue on Feb 19, 2020: "modified DCNv2 to work on CPU and GPU" #52 (Merged)
