
Move to an organization & renaming tiny-cnn #235

Closed · nyanp opened this issue Jul 17, 2016 · 36 comments
@nyanp (Member) commented Jul 17, 2016

We've decided to move tiny-cnn to an organization account to accelerate its development (discussed in #226).

Since it is clear that we are expanding the scope of tiny-cnn from convolutional networks to general networks, the project name tiny-cnn seems a bit inaccurate now. I want to change the project name to a more appropriate one (if we agree) when we transfer the repository.

In #226 we came up with these three proposals:

  • tiny-dnn (convolutional net -> deep net)
  • hornet (a loose acronym of Header-Only Network)
  • tiny-cnn (of course we can keep its name)

Whichever we choose, the project name doesn't affect the library API except for its namespace, and hyperlinks, forks, and pull requests will be correctly redirected to the new repository.

Please feel free to give me your feedback if you have suggestions for the naming! We want to decide on the name and move to the new account by around 7/25.

@H4kor (Contributor) commented Jul 17, 2016

Sounds good.

I like the name hornet, but a tiny-* name might be better in terms of visibility/saying what it is.

@edgarriba (Member)

For consistency I think tiny-dnn is okay. However, hornet could reflect a rebirth of the library.

@bhack (Contributor) commented Jul 17, 2016

/cc @mtamburrano

@bhack (Contributor) commented Jul 17, 2016

I'll start by pushing some names: .hdnn, lightdnn, lightnet, tinygraph, tinyflow, hornet

@patrikhuber (Contributor)

I would vote for tiny-dnn. "tiny-cnn" is already quite known, and with "tiny-dnn" the familiarity with the name is not lost. It's a nice name.

@edgarriba (Member)

So people, what's the decision?

@nyanp (Member, Author) commented Jul 25, 2016

tiny-dnn seems good for everyone. It's a nice name to me, and it has no namespace problem. Let's std::move to tiny-dnn :)

@edgarriba (Member)

nice!

@bhack green light for the logo

@garybradski

How about an actually tiny name: dnn?

@nyanp (Member, Author) commented Jul 27, 2016

@garybradski
Thanks, but I think it's too short for the library name. I'd vote for tiny-dnn.

@garybradski

OK, tiny-dnn. I had some questions about this module. We have some other deep net modules:
https://github.com/opencv/opencv_contrib/tree/master/modules

  • dnn (not a very helpful README.md -- a module to run Caffe deep nets)
  • dnns_easily_fooled (separate; good README ...)
  • cnn_3dobj (good README ... but wouldn't it be better as an example of tiny-dnn?)

Anyhow, I'm concerned that a new user won't know where to start. Maybe the READMEs have to be cross-linked and, for my intent, most of the real work should go into tiny-dnn, which is the most complete now. I'm also concerned about exactly how tiny-dnn will work. Will it depend on tiny-cnn, or be a completely standalone module? It will be much cleaner if it is completely standalone. Is it OK that tiny-cnn effectively becomes tiny-dnn? Of course, full credit can be given both in the module and on the contributors page:
http://code.opencv.org/projects/opencv/wiki/Contributors?version=3

@nyanp (Member, Author) commented Jul 28, 2016

@garybradski

Anyhow, I'm concerned that a new user won't know where to start.

I'm worried about it, too. I think the current dnn module (which lacks too many building blocks to be usable) should be replaced with a new dnn based on tiny-dnn. Reorganizing cnn_3dobj as an example of tiny-dnn seems good. /cc @wangyida: any idea?

I'm also concerned about exactly how tiny-dnn will work

We'll just rename tiny-cnn to tiny-dnn; they're one and the same. Basically tiny-dnn is standalone, and we only need extra libraries to enable optional functionality (LibDNN for GPU, NNPACK for x86 speedup, and Protocol Buffers for importing Caffe models).

Of course, full credit can be given both in the module and on the contributors page:

I'm happy to hear that :) Of course you might want to add two great GSoC students bridging OpenCV and tiny-dnn to this list.

@bhack (Contributor) commented Jul 28, 2016

/cc @ludv1x

@wangyida (Contributor)

@garybradski @nyanp cnn_3dobj is an interesting example of pose estimation together with category prediction; it could be included as an example rather than a standalone module. But the rendering process depends on the VTK toolkit.

@bhack (Contributor) commented Jul 28, 2016

This is the commit log for the other GSoC project https://github.com/ludv1x/opencv_contrib/commits/dnn-dev.
There are also opencv/opencv_contrib#738 and opencv/opencv_contrib#724

@garybradski

As stated in email, and beyond this scope, but:

  • Running models produced by Caffe, as done in dnn, would now be better focused on running them on tiny-dnn
  • cnn_3dobj would be best ported to tiny-dnn
  • dnns_easily_fooled is probably OK as is

@edgarriba (Member) commented Jul 30, 2016

@ludv1x I've seen you are contributing OpenCL kernels to the dnn module.
Could we somehow collaborate to have the same in tiny-dnn?
/cc @garybradski @bhack

@garybradski The current dnn module seems to have an extended core, so we'll have some parallelism with the new "tiny" module. What's the plan?

@bhack (Contributor) commented Jul 30, 2016

@edgarriba @ludv1x is mentored by @Nerei. /cc @vpisarev

@wangyida (Contributor)

@garybradski The cnn_3dobj module could be included in tiny-dnn because its core function is a triplet loss, which Caffe already provides. I enclosed a trained model of less than 5 MB in the cnn_3dobj module; it could be used directly with the caffe_converter in tiny-cnn.
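
Roughly, that import path would look like the sketch below. This is a minimal sketch only: the entry-point names are recalled from the caffe_converter example and may differ by version, and the file names are placeholders, so treat all of them as assumptions.

```cpp
// Minimal sketch: importing a Caffe-trained model (e.g. the <5 MB
// cnn_3dobj model) into tiny-cnn. Requires building with Protocol Buffers.
// The two entry points below are recalled from the caffe_converter
// example and may differ between versions; file names are placeholders.
#include "tiny_cnn/tiny_cnn.h"
#include "tiny_cnn/io/caffe/layer_factory.h"

int main() {
    // Build the network topology from the deploy .prototxt...
    auto net = tiny_cnn::create_net_from_caffe_prototxt("cnn_3dobj.prototxt");

    // ...then load the trained weights from the .caffemodel binary.
    tiny_cnn::reload_weight_from_caffe_protobinary("cnn_3dobj.caffemodel",
                                                   net.get());
    return 0;
}
```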

@ludv1x commented Jul 30, 2016

@edgarriba The contributed kernels are based on the ones from the OpenCL-Caffe repo (which is distributed under a BSD licence), so you can use them freely.
You can also try the kernels from the opencl branch of the main Caffe repo, which is under active development now; those kernels seem better optimized (at least for convolution).

So, I don't see serious problems with the OCL kernels.
Do I understand correctly that development should focus on adding core OpenCL functionality to tiny-cnn (i.e. implementing its own opencv2/core/ocl.hpp-style API)?

@bhack (Contributor) commented Jul 31, 2016

@ludv1x See #208

@edgarriba (Member)

@ludv1x the plan is to port kernels from the opencl branch.

Notice that for convolution we are porting LibDNN, which is the standalone version of Greentea. @naibaf7 is giving support on this task.

For general OpenCL functionality we decided to use CLCudaAPI, which is a header-only project with a common interface for OpenCL and CUDA.
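
For reference, basic CLCudaAPI usage looks roughly like this. It's a minimal sketch following CLCudaAPI's README sample; exact constructor and method signatures may differ between versions.

```cpp
// Minimal CLCudaAPI sketch: pick a device, create a context and a queue,
// and copy a host vector into a device buffer. Based on the CLCudaAPI
// README sample; signatures may differ between versions.
#include "clpp11.h"  // OpenCL back-end; use "cupp11.h" for the CUDA back-end
#include <vector>

int main() {
    // Select the first platform and device, then create a context and queue.
    auto platform = CLCudaAPI::Platform(size_t{0});
    auto device   = CLCudaAPI::Device(platform, size_t{0});
    auto context  = CLCudaAPI::Context(device);
    auto queue    = CLCudaAPI::Queue(context, device);

    // Allocate a device buffer and copy host data into it.
    auto host   = std::vector<float>(1024, 1.0f);
    auto buffer = CLCudaAPI::Buffer<float>(context, host.size());
    buffer.Write(queue, host.size(), host);  // blocking host->device copy
    queue.Finish();                          // wait until the queue is drained
    return 0;
}
```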

@zhangqianhui (Contributor)

tinydnn
I think the name needs to underline the "tiny", compared to Caffe or MXNet.

@pansk (Contributor) commented Aug 11, 2016

I got distracted for a few months, and this project exploded and even changed its name...
Well, nice job!

Meanwhile, I studied cuDNN, and now I'm catching up with this huge batch of updates (and updating my code as well)... I hope I can provide some help again soon :)

@edgarriba (Member)

@pansk welcome back to the party!
Notice that we're putting some effort into supporting GPU via OpenCL, with a refactored ops-style backend.

Check the new ops proposal for custom convolution:
https://github.com/edgarriba/tiny-cnn/blob/origin/core-framework-patch/tiny_cnn/core/kernels/conv2d_tiny_op.h

And for OpenCL (not working yet):
https://github.com/edgarriba/tiny-cnn/blob/opencl-ops-patch/tiny_cnn/core/kernels/conv2d_opencl_op.h

@pansk (Contributor) commented Aug 11, 2016

Is tensor_t still, as it seems, a std::vector<...>?

That approach requires you to move back and forth to CPU memory between two layers, and it's quite inefficient (it will probably destroy any performance gain from the GPU).
In an email thread with @nyanp we discussed having one structure that represents i/o tensors and another that represents parameters (weights, biases, convolution coefficients, ...).

The first structure was supposed to move "automatically" between the CPU and the GPU when needed (e.g., when interleaving CPU and GPU layers, which is probably still inefficient but should be allowed for prototyping new, complex layers), while the second was conceived to be fully resident on the GPU (for GPU backends) and was supposed to be downloaded/uploaded only for serialization/deserialization purposes (probably manually).

I also see an interoperability issue between CLCudaAPI and cuDNN (and most other cu* libraries): the former wants to wrap the kernel, while the others don't provide kernels but are built on top of a CUDA stream. Please note that I didn't explore CLCudaAPI thoroughly, so I might be missing some details that make this last comment useless or wrong.
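
A minimal sketch of that two-structure idea, for concreteness. All names here are hypothetical, and device_buffer stands in for whatever OpenCL/CUDA allocation wrapper ends up being used:

```cpp
// Sketch of the two-structure design discussed above. All names are
// hypothetical; "device_buffer" stands in for an OpenCL/CUDA allocation.
#include <cstddef>
#include <vector>

struct device_buffer { /* handle to a GPU allocation, omitted here */ };

// I/O tensor: moves between host and device automatically when accessed,
// so interleaving CPU and GPU layers stays possible (if inefficient).
class synced_tensor {
public:
    const std::vector<float>& host()   { sync_to_host();   return host_data_; }
    device_buffer&            device() { sync_to_device(); return dev_data_; }

private:
    void sync_to_host()   { if (!host_valid_) { /* download from GPU */ host_valid_ = true; } }
    void sync_to_device() { if (!dev_valid_)  { /* upload to GPU */     dev_valid_  = true; } }

    std::vector<float> host_data_;
    device_buffer      dev_data_;
    bool host_valid_ = true;
    bool dev_valid_  = false;
};

// Parameters: resident on the device for the whole run; host copies are
// made only for (de)serialization, and always through an explicit call.
class device_parameters {
public:
    explicit device_parameters(std::size_t size) : size_(size) { /* allocate on GPU */ }
    void upload(const std::vector<float>& weights) { /* host -> device, explicit */ }
    std::vector<float> download_for_serialization() const {
        return std::vector<float>(size_);  // device -> host, explicit
    }
private:
    device_buffer dev_data_;
    std::size_t   size_;
};
```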

@naibaf7 commented Aug 11, 2016

@pansk
Once CLCudaAPI has initialized a device, you can get the underlying memory pointer for both CUDA and OpenCL. You can then use normal CUDA or OpenCL code separately from CLCudaAPI.
The only caveat is to keep the CLCudaAPI memory wrapper in scope (so the underlying memory isn't freed behind your back) while handling the underlying memory objects...
In short, cuDNN, cuBLAS, LibDNN, etc. can be used in conjunction with CLCudaAPI without issues.

I would also recommend keeping one data structure for both weights and data. If coded correctly, the transfers should not be triggered often anyway, and it allows more flexibility. You could add a profiling method to these memory objects to find unnecessary transfers.
If the additional memory consumption bugs you, there could be a method to explicitly release a memory object from the device or the CPU, and reallocate the memory when the next transfer to the CPU or device is requested.
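
A minimal sketch of that unified alternative, with transfer counters for profiling and explicit release. All names are hypothetical:

```cpp
// Sketch of the unified-structure alternative: one buffer for both weights
// and data, with transfer counters for profiling and explicit release.
// All names are hypothetical; GPU calls are elided as comments.
#include <cstddef>
#include <vector>

class unified_buffer {
public:
    // Counters make unnecessary host<->device transfers easy to spot.
    std::size_t uploads() const   { return uploads_; }
    std::size_t downloads() const { return downloads_; }

    const std::vector<float>& host() {
        if (!host_valid_) { /* download from GPU */ ++downloads_; host_valid_ = true; }
        return host_data_;
    }
    void* device() {
        if (!dev_valid_)  { /* upload to GPU */     ++uploads_;   dev_valid_  = true; }
        return dev_ptr_;
    }

    // Explicitly free one side; the next access re-allocates and transfers.
    void release_host()   { host_data_.clear(); host_data_.shrink_to_fit(); host_valid_ = false; }
    void release_device() { /* free GPU allocation */ dev_ptr_ = nullptr;   dev_valid_  = false; }

private:
    std::vector<float> host_data_;
    void* dev_ptr_ = nullptr;
    bool host_valid_ = true;
    bool dev_valid_  = false;
    std::size_t uploads_ = 0;
    std::size_t downloads_ = 0;
};
```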

@edgarriba (Member)

@naibaf7
appreciated details, thx!

@pansk

Is tensor_t still, as it seems, a std::vector<...>?

Yes, tensor_t is a vector of vectors that was introduced to handle batch normalization.
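
(From memory, the current definitions are roughly the following; the real code uses an aligned allocator and lives in tiny_cnn/util/util.h, so treat this as an approximation:)

```cpp
// Approximate shape of tiny-cnn's core typedefs (from memory; the actual
// vec_t uses an aligned allocator, and float_t can be double via a macro).
#include <vector>

typedef float float_t;                // configurable precision
typedef std::vector<float_t> vec_t;   // one sample, or one parameter blob
typedef std::vector<vec_t> tensor_t;  // a batch: a vector of samples
```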

The approach I propose is just a proof of concept to get a minimal GPU pipeline working, and, as pointed out, proper memory-transfer handling is needed in order to take advantage of the GPU. Things on the TODO list: model tensors as classes and implement a proper memory-transfer module.

Since GSoC is about to finish, it's probably time to rewrite the roadmap and assign tasks. /cc @nyanp @bhack

@pansk (Contributor) commented Aug 11, 2016

@naibaf7
Thank you for the clarification about CLCudaAPI. Another thing that could be interesting in the future is the ability to use more than one CUDA stream. Is that possible, as far as you know?

About the data structure, you're correct: another solution is to use a single data structure for both.

At the moment data and weights are already held in two different structures (tensor_t and vec_t respectively).

I quite liked the solution with two different structures, partly because of memory occupation but mostly because it makes "coding correctly" obvious. It's always possible to check with a profiler later, but I generally prefer making such bad conditions hard to code in the first place rather than tracking them down after they've been committed and used for a while... In the end, for most platforms, I expect one of the structures to be a decorated version of the other, so there's no real code duplication.

Anyway, I'm not strongly against a unified structure at all; I just have a mild preference for the other solution.

@naibaf7 commented Aug 11, 2016

@pansk
Not sure if it handles multiple queues (OpenCL) or streams (CUDA) yet, but I'm sure it would be easy to add.

@CNugteren
Do you want to clarify?

@edgarriba (Member)

@pansk
I think that having a unified structure is simpler in terms of design. I agree that we have different types of data representing different concepts, but from the point of view of computation they are all matrix/tensor-based objects, and I think that with a proper definition we can represent them all. Could you explain your concern about memory occupation in more detail?

@bhack (Contributor) commented Aug 11, 2016

There is a queue/stream object constructor; see https://github.com/CNugteren/CLCudaAPI/blob/master/doc/api.md

@CNugteren commented Aug 11, 2016

Indeed, you can have multiple streams/queues, and you can enqueue kernels or memory copy operations (asynchronous or synchronous) to different streams. You can also wait for all tasks in a certain stream/queue to be completed.

If any other features are needed, CLCudaAPI can be extended. However, note that it can only support features which are available in both CUDA and OpenCL.
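
A sketch of what that looks like with two queues. This is based on CLCudaAPI's documented API (doc/api.md); the async method names are recalled from memory and may differ by version:

```cpp
// Sketch: two CLCudaAPI queues with asynchronous writes, drained
// independently. Based on doc/api.md; signatures recalled from memory.
#include "clpp11.h"
#include <vector>

int main() {
    auto platform = CLCudaAPI::Platform(size_t{0});
    auto device   = CLCudaAPI::Device(platform, size_t{0});
    auto context  = CLCudaAPI::Context(device);

    // Two independent queues on the same device.
    auto queue_a = CLCudaAPI::Queue(context, device);
    auto queue_b = CLCudaAPI::Queue(context, device);

    auto host_a = std::vector<float>(1024, 1.0f);
    auto host_b = std::vector<float>(1024, 2.0f);
    auto buf_a  = CLCudaAPI::Buffer<float>(context, host_a.size());
    auto buf_b  = CLCudaAPI::Buffer<float>(context, host_b.size());

    // Asynchronous copies may overlap across the two queues...
    buf_a.WriteAsync(queue_a, host_a.size(), host_a);
    buf_b.WriteAsync(queue_b, host_b.size(), host_b);

    // ...and each queue can be waited on independently.
    queue_a.Finish();
    queue_b.Finish();
    return 0;
}
```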

@pansk (Contributor) commented Aug 12, 2016

@edgarriba Memory concerns are two-fold:

  • memory occupation: keeping memory allocated on the CPU when it is only needed for serialization/deserialization could be a problem on some embedded platforms (we should provide an explicit method to free CPU or GPU memory when it is no longer needed)
  • memory bandwidth: allowing parameters to be "easily" moved between GPU and CPU makes it easy to write bad code without noticing, and requires profiling to spot. If, on the other hand, moving parameters requires additional code, contributors will naturally tend to produce "faster" code (reducing the number of parameter moves back and forth, or even moving some logic to the GPU when needed).

@naibaf7, @bhack, @CNugteren Thank you, I'll have a deeper look at CLCudaAPI in the next days.

@bhack (Contributor) commented Aug 20, 2016

I think we can close this.

@nyanp (Member, Author) commented Aug 21, 2016

Exactly.

nyanp closed this as completed Aug 21, 2016