
I found ChannelwiseConvolution is slow in cpu. #6

Closed
KeyKy opened this issue Jun 15, 2017 · 2 comments
KeyKy commented Jun 15, 2017

I used your ChannelwiseConvolution to implement MobileNet, but I only get 2 s/image without MKL (0.90 s/image with MKL) on CPU, while tensorflow-mobilenet runs at 0.059 s/image. Could you suggest how to improve the speed on CPU?

cypw (Owner) commented Jun 15, 2017

@KeyKy

The running speed of the channel-wise convolution operation really depends on the parallelization strategy.

My implementation uses BatchGEMM, which is slightly faster on very small feature maps (e.g. size=7x7 or size=14x14).
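The owner's actual BatchGEMM code is not shown in this thread, but the idea can be sketched in NumPy under simple assumptions (stride 1, no padding, hypothetical function names): each channel's patches are lowered to an im2col matrix, and all channels are multiplied in one batched GEMM via `np.matmul`, which is checked here against a naive per-channel loop.

```python
import numpy as np

def depthwise_conv_naive(x, w):
    """Channel-wise (depthwise) convolution: one k x k filter per channel.
    x: (C, H, W), w: (C, k, k); stride 1, no padding."""
    C, H, W = x.shape
    k = w.shape[1]
    oh, ow = H - k + 1, W - k + 1
    out = np.zeros((C, oh, ow))
    for c in range(C):
        for i in range(oh):
            for j in range(ow):
                out[c, i, j] = np.sum(x[c, i:i + k, j:j + k] * w[c])
    return out

def depthwise_conv_batchgemm(x, w):
    """Same operation lowered to one batched GEMM:
    (C, 1, k*k) @ (C, k*k, oh*ow) -> (C, 1, oh*ow)."""
    C, H, W = x.shape
    k = w.shape[1]
    oh, ow = H - k + 1, W - k + 1
    cols = np.empty((C, k * k, oh * ow))  # per-channel im2col buffers
    for i in range(oh):
        for j in range(ow):
            cols[:, :, i * ow + j] = x[:, i:i + k, j:j + k].reshape(C, k * k)
    # np.matmul broadcasts over the leading C axis: one GEMM per channel
    return np.matmul(w.reshape(C, 1, k * k), cols).reshape(C, oh, ow)
```

The batched form trades the im2col copy for a single large matrix multiply, which is why it tends to pay off only when the per-channel GEMMs would otherwise be too small to parallelize well.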

For larger feature maps (e.g. size=56x56 and size=28x28), I'd recommend using the official convolution layer with the option `num_group = num_filter`.
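To see why `num_group = num_filter` reproduces channel-wise convolution, here is a plain NumPy sketch of grouped convolution (stride 1, no padding; this illustrates the semantics, not MXNet's implementation): when the number of groups equals the number of input channels and filters, each group holds exactly one channel and one filter.

```python
import numpy as np

def grouped_conv(x, w, num_group):
    """Grouped convolution, stride 1, no padding.
    x: (C_in, H, W); w: (C_out, C_in // num_group, k, k)."""
    C_in, H, W = x.shape
    C_out, _, k, _ = w.shape
    gin = C_in // num_group    # input channels per group
    gout = C_out // num_group  # output channels (filters) per group
    oh, ow = H - k + 1, W - k + 1
    out = np.zeros((C_out, oh, ow))
    for g in range(num_group):
        xg = x[g * gin:(g + 1) * gin]      # this group's input channels
        for o in range(gout):
            f = w[g * gout + o]            # (gin, k, k) filter
            for i in range(oh):
                for j in range(ow):
                    out[g * gout + o, i, j] = np.sum(xg[:, i:i + k, j:j + k] * f)
    return out
```

With `num_group == C_in == C_out`, each filter has shape `(1, k, k)` and only sees its own channel, which is exactly the channel-wise case; the built-in grouped path can then use the framework's optimized (MKL-backed) convolution kernels.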

But still, I don't think very high training/testing speed can be achieved using only these high-level interfaces. Deeply optimized CUDA code is necessary for fast channel-wise convolution. : )

KeyKy (Author) commented Jun 15, 2017

@cypw Thanks. I now use the official convolution layer with groups for the larger feature maps and get 0.401 s/image (CPU) in MXNet with MKL. That is fast enough for some of my classification tasks.
