Questions about stream and Stencil #323

Closed
XinhuaZhang opened this issue Jul 11, 2016 · 1 comment
Labels
question question from the user

Comments

@XinhuaZhang

XinhuaZhang commented Jul 11, 2016

  1. Suppose I have 200 data items. Is there any efficiency difference between streaming all of them at once and streaming the first 100 and then the second 100 separately? In the latter case, will two CUDA kernels be created?
  2. I would like to use accelerate for spatial pooling, but the largest stencil pattern has size 9. Is there a way to pool with an arbitrary window size? Also, on a DIM3 array I have to use a 3D stencil, say Stencil9x9x3. I tried to create a Stencil9x9x1, but it didn't work. The cuDNN library currently provides very efficient APIs for operations such as convolution and pooling. Is there any plan for the accelerate package to use these APIs?
@tmcdonell
Member

Do you mean 200 arrays, or a single 200 element array?

At a high level, there won't be much difference in either case: behind the scenes the kernels are cached, so they only need to be compiled once. It will still be more efficient to use run1 or stream wherever possible.
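To make the run1 suggestion concrete, here is a minimal sketch that assumes the "200 arrays" reading of the question. The computation `sumSquares` and the names `step`, `inputs`, `allAtOnce` and `inTwoParts` are made up for illustration. It uses the reference interpreter from the accelerate package only to stay self-contained; the interpreter does not compile kernels, so on a GPU you would import `run1` from your backend's module instead (an assumption about your setup).

```haskell
import qualified Data.Array.Accelerate             as A
import           Data.Array.Accelerate.Interpreter (run1)

-- Hypothetical per-array computation: sum of squares of a vector.
sumSquares :: A.Acc (A.Vector Float) -> A.Acc (A.Scalar Float)
sumSquares xs = A.fold (+) 0 (A.map (\x -> x * x) xs)

-- Partially apply run1 once and reuse the resulting function, so that a
-- compiling backend can reuse the same compiled program for every input.
step :: A.Vector Float -> A.Scalar Float
step = run1 sumSquares

-- 200 small input arrays (illustrative data only).
inputs :: [A.Vector Float]
inputs = [ A.fromList (A.Z A.:. 4) [fromIntegral i ..] | i <- [1 .. 200 :: Int] ]

-- Process everything in one pass ...
allAtOnce :: [A.Scalar Float]
allAtOnce = map step inputs

-- ... or as two batches of 100; both go through the same `step`.
inTwoParts :: [A.Scalar Float]
inTwoParts = map step firstHalf ++ map step secondHalf
  where (firstHalf, secondHalf) = splitAt 100 inputs

main :: IO ()
main = print (concatMap A.toList allAtOnce == concatMap A.toList inTwoParts)
```

Both variants call the same `step`, so splitting the inputs into two batches does not introduce a second program in this sketch.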

What do you mean by spatial pooling? Tiling? Stencils weren't meant to describe tiling (although it is possible to implement them in this way, under the hood). At the moment Accelerate focuses on expressing what to compute (e.g. stencil), rather than how to compute it (e.g. stencil with NxM tiling), although that is something we have been thinking about.

What do you mean when you say that the 9x9x1 stencil didn't work? It is not necessary to use all elements of the stencil pattern, so a 9x9x1 stencil can be implemented as a 9x9x3 stencil where you only use the centre element in the last dimension. In terms of the generated code, there will be no difference. Did you run into some problem trying to implement it in this way?
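As a sketch of the "only use the centre element" idea, the example below uses a 3x3x3 stencil rather than 9x9x3 to keep it short. It ignores the two outer planes and averages the 3x3 centre plane, which behaves like a 3x3x1 stencil. It assumes the accelerate 1.x API, where the boundary condition is written `clamp` (older releases spell this differently), and it assumes the outermost tuple component corresponds to the outermost array dimension. The names `planeAverage`, `avg` and `constInput` are made up for illustration.

```haskell
import qualified Data.Array.Accelerate             as A
import           Data.Array.Accelerate.Interpreter (run)

-- 3x3 box average within each plane, written as a 3x3x3 stencil whose
-- two outer planes are pattern-matched away and never used.
planeAverage :: A.Acc (A.Array A.DIM3 Float) -> A.Acc (A.Array A.DIM3 Float)
planeAverage = A.stencil avg A.clamp
  where
    avg :: A.Stencil3x3x3 Float -> A.Exp Float
    avg (_, ((a, b, c), (d, e, f), (g, h, i)), _) =
      (a + b + c + d + e + f + g + h + i) / 9

-- Illustrative input: two 3x3 planes filled with a constant value.
constInput :: A.Array A.DIM3 Float
constInput = A.fromList (A.Z A.:. 2 A.:. 3 A.:. 3) (repeat 2)

main :: IO ()
main = print (A.toList (run (planeAverage (A.use constInput))))
```

With a clamped boundary and a constant input, every 3x3 neighbourhood holds the same value, so the output equals the input. That makes this a convenient sanity check before moving to a larger pattern.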

Which parts of the cuDNN library are you interested in? Do you have an example program? We are quite motivated by having real test cases and examples that demonstrate what new features we should focus on...

@tmcdonell tmcdonell added the question question from the user label Jul 12, 2016