Generalize the network into graph of Blob nodes and Layer edges #166
This is an excellent suggestion. Now that Caffe has learned to experiment with DAGs, weight sharing is a natural next generalization that we have been discussing in lab. Making a public issue like this will help focus the plan. #57 and #119 must be addressed first. Development in earnest of #57 will start after March 7. #119 effectively doubles the size of a model that can be trained on a single GPU, but requires careful changes to the solver, so it should perhaps wait until after #57. Then this and #119 can be pursued. Thanks, especially for the clarity of this proposal and the references.
Thanks for your support! March 7 has been mentioned several times. What is the blocking factor until then?
The ECCV '14 paper submission deadline. We are a research group, after all, and there's only so much ☕ |
We've decided on an alternative design for weight sharing. A new
Closing as #500 will solve. |
Multiple factors motivate this proposal, including #57, #119, #129, the papers and code on Generative Stochastic Networks (GSN) [1, 2, 3, 4, 5], and the scene-labeling paper using a Recurrent Convolutional Neural Network (RCNN) [6].
In the new design, Blob becomes BlobNode and gains source nodes and target nodes. Edges are represented by LayerEdge, which no longer contains bottom or top blobs. Nodes are independent of the processing layers. Both nodes and edges can be reused (sharing data or weight parameters) according to the structure of the network, which in general is not a linear stack of layers from bottom to top but a true graph, like a social graph.
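A minimal C++ sketch of the proposed structure, to make the sharing mechanism concrete. All names here (`BlobNode`, `LayerEdge`, `Connect`) are illustrative assumptions for this proposal, not Caffe's actual API; the point is only that a node can be wired into several edges, which is exactly how data or weights get shared.

```cpp
#include <string>
#include <vector>

struct LayerEdge;

// A BlobNode holds data (or parameters) and records which edges
// produce it and which edges consume it.
struct BlobNode {
  std::string name;
  std::vector<LayerEdge*> producers;  // edges that write this node
  std::vector<LayerEdge*> consumers;  // edges that read this node
};

// A LayerEdge is the processing step; it points at nodes instead of
// owning bottom/top blobs.
struct LayerEdge {
  std::string type;
  std::vector<BlobNode*> source_nodes;  // inputs
  std::vector<BlobNode*> target_nodes;  // outputs
};

// Wire an edge between existing nodes. Because nodes are created
// independently, the same node may appear as an input to any number
// of edges -- that reuse is the sharing.
void Connect(LayerEdge* edge,
             const std::vector<BlobNode*>& inputs,
             const std::vector<BlobNode*>& outputs) {
  edge->source_nodes = inputs;
  edge->target_nodes = outputs;
  for (BlobNode* n : inputs) n->consumers.push_back(edge);
  for (BlobNode* n : outputs) n->producers.push_back(edge);
}
```

For example, connecting two convolution edges to one shared weight node makes that node's consumer list contain both edges, so a solver walking the graph sees the parameter once but applies it twice.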
[1] Yoshua Bengio, Éric Thibodeau-Laufer, Jason Yosinski. Deep Generative Stochastic Networks Trainable by Backprop. arXiv:1306.1091 [cs.LG]. 2013.
[2] Yoshua Bengio, Li Yao, Guillaume Alain, Pascal Vincent. Generalized Denoising Auto-Encoders as Generative Models. NIPS, 2013.
[3] Li Yao. Efficient implementation of Generative Stochastic Networks. https://github.com/yaoli/GSN. 2013.
[4] @lightcatcher. Generative Stochastic Network (GSN) model. lisa-lab/pylearn2#392. 2013.
[5] Jian Zhou, Olga Troyanskaya. Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction. JMLR W&CP 32(1):745–753, 2014.
[6] Pedro Pinheiro, Ronan Collobert. Recurrent Convolutional Neural Networks for Scene Labeling. JMLR W&CP 32(1):82–90, 2014.