
Welcome to MyCaffe!

❤️ Sponsor to help us keep innovating for you!

MyCaffe is a complete C# re-write of the native C++ CAFFE[1] open-source project.

MyCaffe allows Windows C# software developers to use and expand deep learning solutions in their native C# language. All but a few layers, and nearly every unit test, are now provided in C#. Windows programmers can now write their own custom layers in C#, yet still enjoy the benefit of an efficient deep learning architecture that supports multi-GPU training on up to 8 headless GPUs using NCCL 1.3.4 ('Nickel').

Now you can create custom layers for MyCaffe in native C# using the full extent of the Windows .NET Framework!

We have made a large effort to keep the MyCaffe C# code true to the original CAFFE[1], down to the comments, in the hope of making it even easier for everyone to extend the general CAFFE architecture. In addition, MyCaffe uses the same Protocol Buffer file format for solver descriptions, model descriptions and model binary files, which allows easy exchange between the MyCaffe and C++ CAFFE platforms.

Most of the MyCaffe C# code is very similar to the C++ CAFFE code, because our goal is to extend the CAFFE platform to C# programmers while maintaining compatibility with CAFFE's solver descriptions, model descriptions and binary weight format.

The C# based MyCaffe open-source project is independently maintained by SignalPop LLC and made available under the Apache 2.0 License.

Supported Development Environments:

* Visual Studio 2022 & CUDA 11.8.0 & cuDNN 8.8.0 (current test pass)
* Visual Studio 2022 & CUDA 12.2.2 & cuDNN 8.9.5

NOTE: Compute capability 5.3 or above is required for CUDA 11.8.0/cuDNN 8.8.0 when using __half-sized memory.

NOTE: Only compute capability 5.2 and above is supported with CUDA 11.8.0/cuDNN 8.8.0, because compute capability 5.1 and lower was phased out in CUDA 11.8.
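The compute capability requirements in the notes above reduce to two threshold checks. As a minimal sketch, assuming the caller already knows the GPU's compute capability as a major/minor pair, the classification could look like this. The `ComputeCheck` class and its `Describe` method are hypothetical helpers written for illustration; they are not part of the MyCaffe API.

```csharp
using System;

// Hypothetical helper (not part of MyCaffe): classifies a GPU's compute
// capability against the CUDA 11.8.0/cuDNN 8.8.0 thresholds stated above.
public static class ComputeCheck
{
    // Minimum compute capability supported with CUDA 11.8.0/cuDNN 8.8.0.
    private static readonly Version MinSupported = new Version(5, 2);

    // Minimum compute capability required when using __half-sized memory.
    private static readonly Version MinHalf = new Version(5, 3);

    public static string Describe(int major, int minor)
    {
        var capability = new Version(major, minor);

        if (capability < MinSupported)
            return "unsupported";
        if (capability < MinHalf)
            return "supported, no __half";
        return "supported, __half available";
    }
}
```

For example, `ComputeCheck.Describe(5, 2)` returns `"supported, no __half"`, while `ComputeCheck.Describe(6, 1)` returns `"supported, __half available"`.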

For detailed notes on building MyCaffe, please see the INSTALL.md file.

IMPORTANT: The open-source MyCaffe project on GitHub is considered 'pre-release' and may have bugs. When you find bugs or other issues, please report them here - or better yet, get involved and propose a fix!

MyCaffe supports several new models, each with its train_val and solver prototxt ready to go:

  • Domain-Adversarial Neural Networks (DANN) as described in [2] with support for source and target datasets.
  • ResNet-56 on the Cifar-10 dataset as described in [3].
  • Deep convolutional auto-encoder neural networks with pooling as described in [4].
  • Policy Gradient Reinforcement Learning networks as described in [5].
  • Recurrent Learning of Char-RNN as described in [8] and [9].
  • Neural Style Transfer as described in [10] and [11] using the VGG model described in [12].
  • Deep Q-Learning [14][15] with Noisy-Net [16] and Prioritized Replay Buffer [17]
  • Siamese Network [18][19]
  • Deep Metric Learning with Triplet Network [20][21]
  • Single-Shot Multi-Box (SSD) Object Detection [22][23]
  • Seq2Seq with Attention [24][25][26] (see MyCaffe-Samples at https://github.com/MyCaffe/MyCaffe-Samples/tree/master/Seq2Seq)
  • Transformer Models (ChatGPT and GPT) [24][27]
  • Temporal Fusion Transformer Models [28][29]

For more information on the MyCaffe implementation of Policy Gradient Reinforcement Learning, see MyCaffe: A Complete C# Re-Write of Caffe with Reinforcement Learning by D. Brown, 2018.

MyCaffe now supports the Arcade-Learning-Environment [6], which is based on the Stella Atari-2600 emulator [7], via the AleControl from SignalPop.
For more information, get the AleControl on NuGet, or visit the AleControl on GitHub.

License and Citation

MyCaffe is released under the [Apache License 2.0](https://github.com/MyCaffe/MyCaffe/blob/master/LICENSE).

Please cite MyCaffe in your publications and projects if MyCaffe helps you in your research or applications:


	@article{brown2018mycaffe,
	  Author = {Brown, David W.},
	  Journal = {arXiv preprint arXiv:1810.02272},
	  Title = {MyCaffe: A Complete C\# Re-Write of Caffe with Reinforcement Learning},
	  Year = {2018},
	  Link = {https://arxiv.org/abs/1810.02272}
	}

Donate

To support this project, kindly send donations to:
ETH (Ethereum): 0xb0d26F749FC3aE8cadb29bA4E224CA4C9Af99e20

References

[1] [CAFFE: Convolutional Architecture for Fast Feature Embedding](https://arxiv.org/abs/1408.5093) by Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell, 2014.

[2] Domain-Adversarial Training of Neural Networks by Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky, 2015.

[3] ResNet 20/32/44/56/110 for CIFAR10 with caffe by Yihui He, 2016.

[4] A Deep Convolutional Auto-Encoder with Pooling - Unpooling Layers in Caffe by Volodymyr Turchenko, Eric Chalmers and Artur Luczak, 2017.

[5] Deep Reinforcement Learning: Pong from Pixels by Andrej Karpathy, 2015.

[6] The Arcade Learning Environment: An Evaluation Platform for General Agents by Marc G. Bellemare, Yavar Naddaf, Joel Veness and Michael Bowling, 2012-2013. Source code available on GitHub at mgbellemare/Arcade-Learning-Environment.

[7] Stella - A multi-platform Atari 2600 VCS emulator by Bradford W. Mott, Stephen Anthony and The Stella Team, 1995-2018. Source code available on GitHub at stella-emu/stella.

[8] The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy, 2015.

[9] GitHub: adepierre/caffe-char-rnn by adepierre, 2017.

[10] A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge, 2015, arXiv:1508.06576.

[11] GitHub: ftokarev/caffe by ftokarev, 2017.

[12] Very Deep Convolutional Networks for Large-Scale Image Recognition by K. Simonyan and A. Zisserman, arXiv:1409.1556.

[14] GitHub: Google/dopamine, licensed under the Apache 2.0 License.

[15] Dopamine: A Research Framework for Deep Reinforcement Learning by Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare, 2018, arXiv:1812.06110

[16] Noisy Networks for Exploration by Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg, 2018, arXiv:1706.10295

[17] Prioritized Experience Replay by Tom Schaul, John Quan, Ioannis Antonoglou, David Silver, 2016, arXiv:1511.05952

[18] Siamese Network Training with Caffe by Yangqing Jia and Evan Shelhamer, BAIR.

[19] Siamese Neural Networks for One-shot Image Recognition by G. Koch, R. Zemel and R. Salakhutdinov, ICML 2015 Deep Learning Workshop, 2015.

[20] Deep metric learning using Triplet network by E. Hoffer and N. Ailon, 2014, 2018, arXiv:1412.6622.

[21] In Defense of the Triplet Loss for Person Re-Identification by A. Hermans, L. Beyer, and B. Leibe, 2017, arXiv:1703.07737v2.

[22] SSD: Single Shot MultiBox Detector by Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg, 2016.

[23] GitHub: SSD: Single Shot MultiBox Detector, by weiliu89/caffe, 2016

[24] Attention Is All You Need by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin, 2017, arXiv:1706.03762

[25] GitHub: mashmawy/Seq2SeqLearn by Mohamed Ashmawy, 2017

[26] GitHub: HectorPulido/Chatbot-seq2seq-C- by Hector Pulido, 2018

[27] GitHub: devjwsong/transformer-translator-pytorch by Jaewoo (Kyle) Song, 2021

[28] Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting by Bryan Lim, Sercan O. Arik, Nicolas Loeff and Tomas Pfister, 2019, arXiv:1912.09363

[29] GitHub: PlaytikaOSS/tft-torch by Playtika Research, 2021

For more information on the C++ CAFFE open-source project, please see the following link.