# Overview of CNN Libraries
---
#### By: Jesse Brizzi
#### jbrizzi@cs.stonybrook.edu

### If you want to run this code, you can use an Amazon EC2 server

[set up a deep learning server](http://markus.com/install-theano-on-aws/)

# What is a CNN?

[MNIST demo](http://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html)
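
The core operation behind a CNN is a small filter slid across an image, producing a feature map that is then passed through a non-linearity. The sketch below illustrates that idea in plain Python. The 4x4 image, the hand-picked edge filter, and the function names `conv2d` and `relu` are all made up for illustration; a real network learns its filter values during training and uses a GPU library rather than nested loops.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 sliding-window filter over a 2D list.

    Like most CNN libraries, this computes a cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


def relu(feature_map):
    """Element-wise max(0, x), the usual CNN non-linearity."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d(image, kernel))
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the edge sits, which is the sense in which a filter "detects" a feature.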

## List of current available libraries
---

**Python**

1.  [Theano](http://deeplearning.net/software/theano) is a Python library for defining and evaluating mathematical expressions with numerical arrays. It makes it easy to write deep learning algorithms in Python. Many other libraries are built on top of Theano:

    1.  [Keras](http://keras.io/) is a minimalist, highly modular neural network library in the spirit of Torch, written in Python, that uses Theano under the hood for optimized tensor manipulation on GPU and CPU.

    2.  [Pylearn2](http://deeplearning.net/software/pylearn2/) is a library that wraps a lot of models and training algorithms such as Stochastic Gradient Descent that are commonly used in Deep Learning. Its functional libraries are built on top of Theano.

    3.  [Lasagne](https://github.com/Lasagne/Lasagne) is a lightweight library to build and train neural networks in Theano. It is governed by the principles of simplicity, transparency, modularity, pragmatism, focus and restraint.

    4.  [Blocks](https://github.com/mila-udem/blocks) is a framework that helps you build neural network models on top of Theano.

2.  [Caffe](http://caffe.berkeleyvision.org/) is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Google's [DeepDream](http://venturebeat.com/2015/07/01/google-open-sources-its-software-for-making-trippy-images-with-deep-learning/) is based on the Caffe framework. Caffe is a BSD-licensed C++ library with a Python interface.

3.  [_nolearn_](https://github.com/dnouri/nolearn) contains a number of wrappers and abstractions around existing neural network libraries, most notably [Lasagne](http://lasagne.readthedocs.org/), along with a few machine learning utility modules.

4.  [Gensim](http://radimrehurek.com/gensim/) is a deep learning toolkit implemented in Python, intended for handling large text collections using efficient algorithms.

5.  [Chainer](http://chainer.org/) bridges the gap between algorithms and implementations of deep learning. It is powerful, flexible and intuitive, and is described as a [flexible framework](http://www.slideshare.net/beam2d/introduction-to-chainer-a-flexible-framework-for-deep-learning) for deep learning.

6.  [deepnet](https://github.com/nitishsrivastava/deepnet) is a GPU-based python implementation of deep learning algorithms like Feed-forward Neural Nets, Restricted Boltzmann Machines, Deep Belief Nets, Autoencoders, Deep Boltzmann Machines and Convolutional Neural Nets.

7.  [Hebel](https://github.com/hannes-brt/hebel) is a library for deep learning with neural networks in Python using GPU acceleration with CUDA through PyCUDA. It implements the most important types of neural network models and offers a variety of different activation functions and training methods such as momentum, Nesterov momentum, dropout, and early stopping.

8.  [CXXNET](https://github.com/dmlc/cxxnet) is a fast, concise, distributed deep learning framework based on MShadow. It is a lightweight and easily extensible C++/CUDA neural network toolkit with a friendly Python/Matlab interface for training and prediction.

9.  [DeepPy](https://github.com/andersbll/deeppy) is a Pythonic deep learning framework built on top of NumPy.

10.  [DeepLearning](https://github.com/vishwa-raman/DeepLearning) is a deep learning library developed with C++ and Python.

11.  [Neon](https://github.com/NervanaSystems/neon) is Nervana's Python-based deep learning framework.

**Matlab**

1.  [ConvNet](https://github.com/sdemyanov/ConvNet): a convolutional neural net is a type of deep learning classification algorithm that learns useful features from raw data by itself, by tuning its weights.

2.  [DeepLearnToolBox](https://github.com/rasmusbergpalm/DeepLearnToolbox) is a MATLAB/Octave toolbox for deep learning that includes Deep Belief Nets, Stacked Autoencoders, and Convolutional Neural Nets.

3.  [cuda-convnet](https://code.google.com/p/cuda-convnet/) is a fast C++/CUDA implementation of convolutional (or more generally, feed-forward) neural networks. It can model arbitrary layer connectivity and network depth. Any directed acyclic graph of layers will do. Training is done using the backpropagation algorithm.

4.  [MatConvNet](http://www.vlfeat.org/matconvnet/ "MatConvNet") is a MATLAB toolbox implementing Convolutional Neural Networks (CNNs) for computer vision applications. It is simple, efficient, and can run and learn state-of-the-art CNNs.

5.  [MatCaffe](http://caffe.berkeleyvision.org/) is the MATLAB interface to Caffe, a deep learning framework made with expression, speed, and modularity in mind. Caffe is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Google's [DeepDream](http://venturebeat.com/2015/07/01/google-open-sources-its-software-for-making-trippy-images-with-deep-learning/) is based on the Caffe framework. Caffe itself is a BSD-licensed C++ library.

**C++**

1.  [eblearn](http://eblearn.sourceforge.net/index.shtml) is an open-source C++ machine learning library by New York University’s machine learning lab, led by Yann LeCun. In particular, it provides implementations of convolutional neural networks with energy-based models, along with a GUI, demos and tutorials.

2.  [SINGA](http://www.comp.nus.edu.sg/~dbsystem/singa/) is designed to be general enough to implement the distributed training algorithms of existing systems. It is supported by the Apache Software Foundation.

3.  NVIDIA [DIGITS](https://developer.nvidia.com/digits) is a new system for developing, training and visualizing deep neural networks. It puts the power of deep learning into an intuitive browser-based interface, so that data scientists and researchers can quickly design the best DNN for their data using real-time network behavior visualization.

4.  [Intel® Deep Learning Framework](https://01.org/intel-deep-learning-framework) provides a unified framework for Intel® platforms accelerating Deep Convolutional Neural Networks.

**Java**

1.  [N-Dimensional Arrays for Java](http://nd4j.org/) (ND4J) is a set of scientific computing libraries for the JVM. They are meant to be used in production environments, which means routines are designed to run fast with minimum RAM requirements.

2.  [Deeplearning4j](http://deeplearning4j.org/) is the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala. It is designed to be used in business environments, rather than as a research tool.

3.  [Encog](http://www.heatonresearch.com/encog) is an advanced machine learning framework that supports Support Vector Machines, Artificial Neural Networks, Genetic Programming, Bayesian Networks, Hidden Markov Models and Genetic Algorithms.

**JavaScript**

1.  [Convnet.js](http://cs.stanford.edu/people/karpathy/convnetjs/) is a JavaScript library for training deep learning models (mainly neural networks) entirely in a browser. No software requirements, no compilers, no installations, no GPUs, no sweat.

**Lua**

1.  [Torch](http://torch.ch/) is a scientific computing framework with wide support for machine learning algorithms. It is easy to use and efficient, thanks to a fast scripting language, LuaJIT, and an underlying C/CUDA implementation.

**Julia**

1.  [Mocha](https://github.com/pluskid/Mocha.jl) is a deep learning framework for Julia, inspired by the C++ framework Caffe. Its efficient implementations of general stochastic gradient solvers and common layers can be used to train deep or shallow (convolutional) neural networks, with optional unsupervised pre-training via (stacked) auto-encoders. Its best features include a modular architecture, a high-level interface, portability with speed, and compatibility.

**Lisp**

1.  [Lush (Lisp Universal Shell)](http://lush.sourceforge.net/) is an object-oriented programming language designed for researchers, experimenters, and engineers interested in large-scale numerical and graphic applications. It comes with a rich set of deep learning libraries as part of its machine learning libraries.

**Haskell**

1.  [DNNGraph](https://github.com/ajtulloch/dnngraph) is a deep neural network model generation DSL in Haskell.

**.NET**

1.  [Accord.NET](http://accord-framework.net/ "Accord.NET") is a .NET machine learning framework combined with audio and image processing libraries, written entirely in C#. It is a complete framework for building production-grade computer vision, computer audition, signal processing and statistics applications.

**R**

1.  [darch](http://cran.um.ac.ir/web/packages/darch/index.html "darch") is a package for generating neural networks with many layers (deep architectures). Training methods include pre-training with the contrastive divergence method and fine-tuning with well-known training algorithms such as backpropagation or conjugate gradient.
2.  [deepnet](https://cran.r-project.org/web/packages/deepnet/index.html "deepnet") implements some deep learning architectures and neural network algorithms, including BP, RBM, DBN, deep autoencoders and so on.



Source: [http://www.teglor.com/b/deep-learning-libraries-language-cm569/](http://www.teglor.com/b/deep-learning-libraries-language-cm569/)

# convnet-benchmarks
---

Easy benchmarking of all public open-source implementations of convnets.
A summary is provided in the section below.

Machine: `6-core Intel Core i7-5930K CPU @ 3.50GHz` + `NVIDIA Titan X` + `Ubuntu 14.04 x86_64`

## Imagenet Winners Benchmarking
Each figure is the time for a full forward + backward pass, averaged over 10 runs. Dropout and softmax layers are ignored.
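
The measurement described above can be sketched as a small timing loop. This is a minimal illustration, not the benchmark's actual harness: `benchmark`, `fake_forward_backward`, and the warm-up run are assumptions made for the example. On a GPU the device must also be synchronized before the clock is read, because CUDA kernels launch asynchronously.

```python
import time


def benchmark(step, runs=10, warmup=1):
    """Return the mean wall-clock time of step() in milliseconds.

    The warm-up calls are an assumption of this sketch; they keep one-off
    setup costs (allocation, kernel compilation) out of the average.
    """
    for _ in range(warmup):
        step()
    start = time.perf_counter()
    for _ in range(runs):
        step()
        # With a real GPU library, synchronize the device here.
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def fake_forward_backward():
    """Hypothetical stand-in for one forward + backward pass."""
    sum(i * i for i in range(10000))


mean_ms = benchmark(fake_forward_backward)
print("%.3f ms per forward + backward pass" % mean_ms)
```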

**[AlexNet (One Weird Trick paper)](https://code.google.com/p/cuda-convnet2/source/browse/layers/layers-imagenet-1gpu.cfg)** - Input 128x3x224x224

| Library         | Class                                                                                                                | Time (ms)  | forward (ms) | backward (ms) |
|:------------------------:|:-----------------------------------------------------------------------------------------------------------:| ----------:| ------------:| -------------:|
| **Nervana-fp16**    | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                    |   **92**   |  **29**      |    **62**     |
| CuDNN[R3]-fp16      | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)       |      96    |  30          |   66          |
| CuDNN[R3]-fp32      | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)       |      96    |  32          |   64          |
| Nervana-fp32        | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                    |      101   |  32          |    69         |
| fbfft                    | [fbnn.SpatialConvolution](https://github.com/facebook/fbcunn/tree/master/src/fft)                           |      104   |  31          |    72         |
| cudaconvnet2*            | [ConvLayer](https://github.com/soumith/cuda-convnet2.torch/blob/master/cudaconv3/src/filter_acts.cu)        |      177   |  42          |   135         |
| CuDNN[R2] *             | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)       |      231   |  70          |   161         |
| Caffe (native)           | [ConvolutionLayer](https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cu)                |      324   | 121          |   203         |
| Torch-7 (native)         | [SpatialConvolutionMM](https://github.com/torch/cunn/blob/master/SpatialConvolutionMM.cu)                   |      342   | 132          |   210         |
| CL-nn (Torch)            | [SpatialConvolutionMM](https://github.com/hughperkins/clnn/blob/master/SpatialConvolutionMM.cl)             |      963   | 388          |   574         |
| Caffe-CLGreenTea         | [ConvolutionLayer](https://github.com/naibaf7/caffe)             |      1442   | 210          |   1232         |
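
To make the gaps in the AlexNet table easier to read, the sketch below converts a few of its rows into speedups relative to native Caffe, and checks that forward + backward matches the total to within rounding. The numbers are copied from the table; the choice of Caffe as the baseline and the helper name `speedup_vs_baseline` are assumptions of this example.

```python
# (library, total ms, forward ms, backward ms), copied from the AlexNet table.
alexnet = [
    ("Nervana-fp16", 92, 29, 62),
    ("CuDNN[R3]-fp32", 96, 32, 64),
    ("fbfft", 104, 31, 72),
    ("Caffe (native)", 324, 121, 203),
    ("Torch-7 (native)", 342, 132, 210),
]


def speedup_vs_baseline(rows, baseline_name):
    """Map each library to baseline_total / library_total."""
    totals = {name: total for name, total, _, _ in rows}
    baseline_ms = totals[baseline_name]
    return {name: round(baseline_ms / float(total), 2)
            for name, total in totals.items()}


# Sanity check: the columns are rounded independently, so allow 1 ms of slack.
consistent = all(abs(total - (fwd + bwd)) <= 1
                 for _, total, fwd, bwd in alexnet)

speedups = speedup_vs_baseline(alexnet, "Caffe (native)")
for name, factor in sorted(speedups.items(), key=lambda kv: -kv[1]):
    print("%-18s %.2fx" % (name, factor))
```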

**[Overfeat [fast]](http://arxiv.org/abs/1312.6229)** - Input 128x3x231x231

| Library                  | Class                                                                                                                    | Time (ms)         | forward (ms)            | backward (ms)            |
|:------------------------:|:------------------------------------------------------------------------------------------------------------------------:| -----------------:| -----------------------:| ------------------------:|
| **CuDNN[R3]-fp16**       | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |         **313**       |  **107**                    |  **206**             |
| CuDNN[R3]-fp32       | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |         326       |  113                    |   213                    |
| fbfft                    | [SpatialConvolutionCuFFT](https://github.com/facebook/fbcunn/tree/master/src/fft)                                        |         342       |  114                    |   227                    |
| Nervana-fp16          | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                                 |         355       |  112                    |   242                    |
| Nervana-fp32            | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                                 |         398       |  124                    |   273                    |
| cudaconvnet2*            | [ConvLayer](https://github.com/soumith/cuda-convnet2.torch/blob/master/cudaconv3/src/filter_acts.cu)                     |         723       |  176                    |   547                    |
| CuDNN[R2] *             | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |         810       |  234                    |   576                    |
| Caffe                    | [ConvolutionLayer](https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cu)                             |         823       |  355                    |   468                    |
| Torch-7 (native)         | [SpatialConvolutionMM](https://github.com/torch/cunn/blob/master/SpatialConvolutionMM.cu)                                |         878       |  379                    |   499                    |
| CL-nn (Torch)            | [SpatialConvolutionMM](https://github.com/hughperkins/clnn/blob/master/SpatialConvolutionMM.cl)                          |         963       |  388                    |   574                    |
| Caffe-CLGreenTea         | [ConvolutionLayer](https://github.com/naibaf7/caffe)             |      2857   | 616          |   2240         |

**[OxfordNet [Model-A]](http://arxiv.org/abs/1409.1556/)** - Input 64x3x224x224

| Library                  | Class                                                                                                                    | Time (ms)         | forward (ms)            | backward (ms)            |
|:------------------------:|:------------------------------------------------------------------------------------------------------------------------:| -----------------:| -----------------------:| ------------------------:|
| **Nervana-fp16**    | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                                 |    **529**        |  **167**                |   **362**                |
| Nervana-fp32        | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                                 |        590        |  180                    |   410                    |
| CuDNN[R3]-fp16      | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |       615         |  179                    |   436                    |
| CuDNN[R3]-fp32      | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |       615         |  196                    |   418                    |
| fbfft                    | [SpatialConvolutionCuFFT](https://github.com/facebook/fbcunn/tree/master/src/fft)                                        |       1092        |  355                    |   737                    |
| cudaconvnet2*            | [ConvLayer](https://github.com/soumith/cuda-convnet2.torch/blob/master/cudaconv3/src/filter_acts.cu)                     |       1229        |  408                    |   821                    |
| CuDNN[R2] *             | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |       1099        |  342                    |   757                    |
| Caffe                    | [ConvolutionLayer](https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cu)                             |       1068        |  323                    |   745                    |
| Torch-7 (native)         | [SpatialConvolutionMM](https://github.com/torch/cunn/blob/master/SpatialConvolutionMM.cu)                                |       1105        |  350                    |   755                    |
| CL-nn (Torch)            | [SpatialConvolutionMM](https://github.com/hughperkins/clnn/blob/master/SpatialConvolutionMM.cl)                          |       3437        |  875                    |   2562                   |
| Caffe-CLGreenTea         | [ConvolutionLayer](https://github.com/naibaf7/caffe)             |      5620   | 988          |   4632         |

**[GoogleNet V1](http://research.google.com/pubs/pub43022.html)** - Input 128x3x224x224

| Library                  | Class                                                                                                                    | Time (ms)         | forward (ms)            | backward (ms)            |
|:------------------------:|:------------------------------------------------------------------------------------------------------------------------:| -----------------:| -----------------------:| ------------------------:|
| **Nervana-fp16**    | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                                 |    **283**        |  **85**                 |   **197**                |
| Nervana-fp32        | [ConvLayer](https://github.com/soumith/convnet-benchmarks/blob/master/nervana/README.md)                                 |        322        |  90                     |   232                    |
| CuDNN[R3]-fp32       | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |       431         |  117                    |   313                    |
| CuDNN[R3]-fp16       | [cudnn.SpatialConvolution](https://github.com/soumith/cudnn.torch/blob/master/SpatialConvolution.lua)                    |       501         |  109                    |   392                    |
| Caffe                    | [ConvolutionLayer](https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cu)                             |       1935        |  786                    |   1148                   |
| CL-nn (Torch)            | [SpatialConvolutionMM](https://github.com/hughperkins/clnn/blob/master/SpatialConvolutionMM.cl)                          |       7016        |  3027                   |   3988                   |
| Caffe-CLGreenTea         | [ConvolutionLayer](https://github.com/naibaf7/caffe)             |      9462   | 746          |   8716         |

Source: [https://github.com/soumith/convnet-benchmarks](https://github.com/soumith/convnet-benchmarks)


# The Big 3

**Caffe**
- C++ core with Python and MATLAB interfaces; networks are described in protobuf text files (`.prototxt`)
- Can only implement CNNs
- Huge, active community
- Most computer vision research uses Caffe
- Can implement your own layers in C++/CUDA/Python
- Collaborates with Nvidia; first to use new cuDNN libraries
- Network structure has to be defined in a `.prototxt` file rather than in code
- OSX and Linux

**Theano**
- Python library
- Built around the SciPy/NumPy environment
- General machine learning library
- The most widely used library outside of computer vision
- Define net structure in Python
- Very low level, but has a lot of useful high-level libraries built on top
    - nolearn
    - Lasagne
- OSX, Linux, and Windows

**Torch**
- Lua library
- Used by Google and Facebook
- Very low level
- OSX and Linux



## My advice, USE UBUNTU 14.04 LTS

# What GPU Do I Want?

|   GPU Name  |  Cost | Memory |   Power Requirement  | Single Precision Speed |
|:-----------:|:-----:|:------:|:--------------------:|:----------------------:|
|  GTX 750ti  |  \$150 |   2GB  |     55w - No PCIe    |       1306 GFLOPS      |
|   GTX 970   |  \$330 |  3.5GB |    145w - 2x 6-Pin   |       3494 GFLOPS      |
|  GTX 980ti  |  \$650 |   6GB  | 250w - 6-Pin + 8-Pin |       5632 GFLOPS      |
| GTX Titan X | \$1000 |  12GB  | 250w - 6-Pin + 8-Pin |       6144 GFLOPS      |
| GTX Titan Z | \$1550 |  6GBx2 |   375w - 2x 8-Pin    |     4061 GFLOPS x2     |
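
One way to compare the cards in the table is throughput per dollar and per watt. The sketch below does that arithmetic using only the figures from the table. The helper name `value_metrics` is made up for this example, and counting both GPUs of the Titan Z assumes perfect scaling across the two chips, which real workloads rarely reach.

```python
# (name, cost in USD, board power in watts, single-precision GFLOPS),
# copied from the table above.
gpus = [
    ("GTX 750ti", 150, 55, 1306),
    ("GTX 970", 330, 145, 3494),
    ("GTX 980ti", 650, 250, 5632),
    ("GTX Titan X", 1000, 250, 6144),
    ("GTX Titan Z", 1550, 375, 2 * 4061),  # assumes both GPUs are fully used
]


def value_metrics(rows):
    """Return (name, GFLOPS per dollar, GFLOPS per watt) for each card."""
    return [(name, gflops / float(cost), gflops / float(watts))
            for name, cost, watts, gflops in rows]


metrics = value_metrics(gpus)
best_per_dollar = max(metrics, key=lambda m: m[1])[0]
best_per_watt = max(metrics, key=lambda m: m[2])[0]

for name, per_dollar, per_watt in metrics:
    print("%-12s %6.2f GFLOPS/$ %6.2f GFLOPS/W" % (name, per_dollar, per_watt))
print("Best per dollar:", best_per_dollar)
print("Best per watt:  ", best_per_watt)
```

Raw throughput per dollar ignores memory size, which often decides whether a network fits on the card at all.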


## Tesla? AMD? Titan Black?
- Nearly all CNN and neural net libraries use CUDA as their back end and only need 32-bit single precision (or 16-bit half precision).
- AMD cards do not support CUDA.
- Tesla GPUs and older Nvidia GPUs (like the Titan, Titan Black, and 700 series) use an older architecture that focuses more on 64-bit double-precision performance and has poor 32-bit performance for the cost.
    - The GTX 750ti is an exception, as it uses the newer architecture.
    - The Titan Z is only useful if you need to fit 2 GPUs in a computer with only 1 open PCIe slot.
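
Precision affects memory as well as speed. As a rough illustration, the sketch below computes the size of one AlexNet input batch from the benchmark section (128x3x224x224, read here as batch x channels x height x width) in single and half precision. The helper name `batch_bytes` is made up for this example, and the figure covers the input tensor only; activations, weights, and gradients need far more memory.

```python
def batch_bytes(shape, bytes_per_value):
    """Bytes needed to store a dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_value


# Input shape of the AlexNet benchmark, assumed to be
# batch x channels x height x width.
alexnet_input = (128, 3, 224, 224)

fp32_mib = batch_bytes(alexnet_input, 4) / 2.0 ** 20  # 4 bytes per float32
fp16_mib = batch_bytes(alexnet_input, 2) / 2.0 ** 20  # 2 bytes per float16

print("fp32 input batch: %.2f MiB" % fp32_mib)
print("fp16 input batch: %.2f MiB" % fp16_mib)
```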

# Can My Computer Use the GPU?

### An open PCIe x16 slot
![pcie slots](https://upload.wikimedia.org/wikipedia/commons/0/0c/PCI_und_PCIe_Slots.jpg)

### A Power supply with enough wattage and PCIe power cables
![6-pin and 8-pin PCIe power connectors](http://cdn.head-fi.org/5/56/56ff61d3_pcie-connectors.jpeg)
![powersupply sticker](http://www.techpowerup.com/reviews/Corsair/GS800/images/psu_label.jpg)
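
The power check can be sketched as simple arithmetic. Everything here is an assumption for illustration: the function names, the 200 W estimate for the rest of the system, the 800 W supply, and the 20% headroom are placeholders, not recommendations. Only the GPU wattage and connector counts come from the GPU table above; read the real limits off your power supply's label.

```python
def psu_can_drive(psu_watts, rest_of_system_watts, gpu_watts, headroom=0.2):
    """True if the PSU covers the estimated load plus a safety margin.

    headroom is a hypothetical margin (20% by default), not a standard.
    """
    required = (rest_of_system_watts + gpu_watts) * (1.0 + headroom)
    return psu_watts >= required


def has_connectors(available, required):
    """True if the PSU offers at least the PCIe power plugs the GPU needs."""
    return all(available.get(kind, 0) >= count
               for kind, count in required.items())


# GTX Titan X figures from the GPU table: 250 W, one 6-pin plus one 8-pin.
titan_x_watts = 250
titan_x_plugs = {"6-pin": 1, "8-pin": 1}

# Hypothetical machine: 800 W supply, about 200 W for everything else.
psu_plugs = {"6-pin": 2, "8-pin": 1}
enough_power = psu_can_drive(800, 200, titan_x_watts)
enough_plugs = has_connectors(psu_plugs, titan_x_plugs)

print("Enough wattage:", enough_power)
print("Enough PCIe power cables:", enough_plugs)
```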

# Useful Education Links
- [Stanford CS231n](http://cs231n.stanford.edu/)
- [ConvnetJS Demos](http://cs.stanford.edu/people/karpathy/convnetjs/)
- [Neural Networks and Deep Learning eBook](http://neuralnetworksanddeeplearning.com/index.html)