A set of models which allow easy creation of Keras models to be used for classification purposes. Also contains modules which offer implementations of recent papers.
SparseNets are a modification of DenseNet and its dense connectivity pattern to reduce memory requirements drastically while still having similar or better performance.
Keras implementation of Non-local blocks from the paper "Non-local Neural Networks".
- Support for "Gaussian", "Embedded Gaussian" and "Dot" instantiations of the Non-Local block.
- Support for shielded computation mode (reduces computation by 4x)
- Support for the "Concatenation" instantiation will be added once the authors release their code.
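As a rough illustration of what the "Embedded Gaussian" instantiation computes, here is a NumPy sketch of the paper's equations over a flattened feature map (illustrative only, not the Keras code in this repository; all names are made up):

```python
import numpy as np

def embedded_gaussian_nonlocal(x, w_theta, w_phi, w_g, w_out):
    """Illustrative non-local block over a flattened feature map.

    x: (n, c) - n spatial positions with c channels each.
    w_theta / w_phi / w_g: (c, c_inner) embedding projections.
    w_out: (c_inner, c) projection back to c channels.
    """
    theta, phi, g = x @ w_theta, x @ w_phi, x @ w_g
    f = theta @ phi.T                              # (n, n) pairwise affinities
    f = np.exp(f - f.max(axis=1, keepdims=True))   # row-wise softmax ...
    f = f / f.sum(axis=1, keepdims=True)           # ... over all positions
    y = f @ g                                      # aggregate features globally
    return x + y @ w_out                           # residual connection

rng = np.random.default_rng(0)
n, c, c_inner = 16, 8, 4
x = rng.standard_normal((n, c))
w = lambda shape: 0.1 * rng.standard_normal(shape)
out = embedded_gaussian_nonlocal(x, w((c, c_inner)), w((c, c_inner)),
                                 w((c, c_inner)), w((c_inner, c)))
print(out.shape)  # (16, 8): same shape as the input, so the block can sit
                  # between any two layers of a network
```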
Available at : Non-Local Neural Networks in Keras
An implementation of "NASNet" models from the paper Learning Transferable Architectures for Scalable Image Recognition in Keras 2.0+.
Supports building NASNet Large (6 @ 4032), NASNet Mobile (4 @ 1056) and custom NASNets.
Available at : Neural Architecture Search Net (NASNet) in Keras
Implementation of Squeeze and Excite networks in Keras. Supports ResNet and Inception v3 models currently. Support for Inception v4 and Inception-ResNet-v2 will also come once the paper comes out.
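The squeeze-and-excite operation itself is small enough to sketch in NumPy: global-average-pool the feature map, pass the channel vector through a two-layer bottleneck, and rescale each channel by the resulting sigmoid weight (an illustrative sketch of the paper's block, not this repository's Keras code):

```python
import numpy as np

def se_block(x, w_squeeze, w_excite):
    """Illustrative squeeze-and-excite on one (h, w, c) feature map.

    w_squeeze: (c, c // r), w_excite: (c // r, c), with r the reduction ratio.
    """
    z = x.mean(axis=(0, 1))                    # squeeze: global average pool -> (c,)
    s = np.maximum(z @ w_squeeze, 0.0)         # bottleneck FC + ReLU
    s = 1.0 / (1.0 + np.exp(-(s @ w_excite)))  # FC + sigmoid -> weights in (0, 1)
    return x * s                               # excite: rescale each channel

rng = np.random.default_rng(0)
h, w, c, r = 4, 4, 16, 4
x = rng.standard_normal((h, w, c))
out = se_block(x, rng.standard_normal((c, c // r)), rng.standard_normal((c // r, c)))
print(out.shape)  # (4, 4, 16): same shape, each channel scaled by its weight
```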
Available at : Squeeze and Excite Networks in Keras
Implementation of Dual Path Networks, which combine the grouped convolutions of ResNeXt with the dense connections of DenseNet into two parallel paths.
Available at : Dual Path Networks in Keras
Implementation of MobileNet models from the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications in Keras 2.0+.
Contains code for building the MobileNet model (optimized for datasets similar to ImageNet) and weights for the model trained on ImageNet.
Also contains MobileNet V2 model implementations + weights.
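The efficiency of MobileNets comes from replacing standard convolutions with depthwise separable ones, and the parameter savings can be computed directly (a quick back-of-the-envelope calculation, not code from this repository):

```python
def standard_conv_params(k, c_in, c_out):
    # a k x k standard convolution mixes space and channels in one step
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # k x k depthwise conv (one filter per input channel) + 1x1 pointwise conv
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 256, 256)        # 589824
sep = depthwise_separable_params(3, 256, 256)  # 67840
print(std // sep)  # the separable layer is roughly 8-9x smaller here
```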
Available at : MobileNets in Keras
Implementation of ResNeXt models from the paper Aggregated Residual Transformations for Deep Neural Networks in Keras 2.0+.
Contains code for building the general ResNeXt model (optimized for datasets similar to CIFAR) and ResNeXtImageNet (optimized for the ImageNet dataset).
Available at : ResNeXt in Keras
Implementations of the Inception-v4, Inception-ResNet-v1 and Inception-ResNet-v2 architectures in Keras using the Functional API. The paper on these architectures is available at "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning".
The models are plotted and shown in the architecture sub-folder. Due to the lack of suitable training data (the ILSVRC 2015 dataset) and limited GPU processing power, weights are not provided.
Contains : Inception v4, Inception-ResNet-v1 and Inception-ResNet-v2
Available at : Inception v4 in Keras
Implementation of Wide Residual Networks from the paper Wide Residual Networks
It can be used by importing the wide_residual_network script and using the create_wide_residual_network() method. There are several parameters which can be changed to increase the depth or width of the network.
Note that the number of layers can be calculated by the formula : nb_layers = 4 + 6 * N
```python
from keras.layers import Input
from keras.models import Model

import wide_residual_network as wrn

ip = Input(shape=(3, 32, 32))  # For CIFAR 10

wrn_28_10 = wrn.create_wide_residual_network(ip, nb_classes=10, N=4, k=10, dropout=0.0, verbose=1)

model = Model(ip, wrn_28_10)
```
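As a quick sanity check of the nb_layers = 4 + 6 * N formula (N = 4 yields the WRN-28-10 built above):

```python
def wrn_depth(N):
    # nb_layers = 4 + 6 * N, where N is the number of blocks per group
    return 4 + 6 * N

print(wrn_depth(4))  # 28 -> WRN-28-10 as in the snippet above
print(wrn_depth(2))  # 16 -> e.g. a WRN-16-k model
```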
Contains weights for WRN-16-8 and WRN-28-8 models trained on the CIFAR-10 Dataset.
Available at : Wide Residual Network in Keras
Implementation of DenseNet from the paper Densely Connected Convolutional Networks.
- Run the cifar10.py script to train the DenseNet 40 model
- Comment out the model.fit_generator(...) line and uncomment the model.load_weights("weights/DenseNet-40-12-CIFAR10.h5") line to test the classification accuracy.
Contains weights for DenseNet-40-12 and DenseNet-Fast-40-12, trained on CIFAR 10.
Available at : DenseNet in Keras
Implementation of the paper "Residual Networks of Residual Networks: Multilevel Residual Networks"
To create RoR ResNet models, use the ror.py script:

```python
from keras import backend as K

import ror

input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)

model = ror.create_residual_of_residual(input_dim, nb_classes=100, N=2, dropout=0.0)  # creates RoR-3-110 (ResNet)
```
To create RoR Wide Residual Network models, use the ror_wrn.py script:

```python
from keras import backend as K

import ror_wrn as ror

input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)

model = ror.create_pre_residual_of_residual(input_dim, nb_classes=100, N=6, k=2, dropout=0.0)  # creates RoR-3-WRN-40-2 (WRN)
```
Contains weights for RoR-3-WRN-40-2 trained on CIFAR 10
Available at : Residual Networks of Residual Networks in Keras
Neural Architecture Search
PySHAC is a Python library for easily using the Sequential Halving and Classification algorithm from the paper Parallel Architecture and Hyperparameter Search via Successive Halving and Classification.
Basic implementation of the Encoder RNN from the paper ["Progressive Neural Architecture Search"](https://arxiv.org/abs/1712.00559), which is an improvement over the original Neural Architecture Search paper since it requires far less time and resources.
- Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Encoder RNN.
- Define a state space by using StateSpace, a manager which adds states and handles communication between the Encoder RNN and the user. Submit custom operations and parse locally as required.
- The Encoder RNN is trained using a modified Sequential Model-Based Optimization algorithm from the paper. Some stability modifications were made to prevent extreme variance during training from causing training runs to fail.
- NetworkManager handles the training and reward computation of a Keras model
Available at : Progressive Neural Architecture Search in Keras
Basic implementation of the Controller RNN from the papers "Neural Architecture Search with Reinforcement Learning" and "Learning Transferable Architectures for Scalable Image Recognition".
- Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Controller RNN.
- Define a state space by using StateSpace, a manager which adds states and handles communication between the Controller RNN and the user.
- Reinforce manages the training and evaluation of the Controller RNN
- NetworkManager handles the training and reward computation of a Keras model
Available at : Neural Architecture Search in Keras
Keras Segmentation Models
A set of models which allow easy creation of Keras models to be used for segmentation tasks.
Implementation of the paper The One Hundred Layers Tiramisu : Fully Convolutional DenseNets for Semantic Segmentation
Simply import the densenet_fc.py script and call the create method:

```python
import densenet_fc as dc

model = dc.create_fc_dense_net(img_dim=(3, 224, 224), nb_dense_block=5, growth_rate=12, nb_filter=16, nb_layers=4)
```
Keras Recurrent Neural Networks
A set of scripts which can be used to add custom Recurrent Neural Networks to Keras.
This model utilizes just 2 gates, the forget (f) and context (c) gates, out of the 4 gates in a regular LSTM RNN, and uses Chrono Initialization to achieve better performance than regular LSTMs while using fewer parameters and a less complicated gating structure.

Simply import the janet.py file into your repo and use the JANET layer.

It is not advisable to use the JANETCell directly wrapped around an RNN layer, as this will not allow the max timesteps calculation that is needed for proper training using the Chrono Initializer for the forget gate.

The chrono_lstm.py script contains the ChronoLSTM model, as it requires minimal modifications to the original LSTM layer to use the ChronoInitializer for the forget and input gates.

The same usage restrictions as for the JANET layer apply: use the ChronoLSTM layer directly instead of the ChronoLSTMCell wrapped around an RNN layer.

```python
from janet import JANET
from chrono_lstm import ChronoLSTM
...
```

To use just the ChronoInitializer, import the chrono_initializer.py script.
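Chrono Initialization sets the forget-gate bias to log(U(1, T_max − 1)) and the input-gate bias to its negative, which is why the layer must know the maximum number of timesteps. A NumPy sketch of the initialization (illustrative, not this repository's implementation):

```python
import numpy as np

def chrono_biases(units, max_timesteps, rng):
    """Forget-gate bias ~ log(Uniform(1, T_max - 1)); input-gate bias is its negative."""
    b_forget = np.log(rng.uniform(1.0, max_timesteps - 1.0, size=units))
    return b_forget, -b_forget

rng = np.random.default_rng(0)
b_f, b_i = chrono_biases(units=64, max_timesteps=784, rng=rng)
# Large positive forget biases keep the gates nearly open at the start of
# training, so gradients can flow across long sequences from step one.
print(b_f.min() >= 0.0)  # True: the log of a value >= 1 is never negative
```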
Implementation of the paper Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN for Keras 2.0+. IndRNN is a recurrent unit that can run over extremely long time sequences, able to learn the addition problem over 5000 timesteps, where most other models fail.
Usage of IndRNNCells:

```python
from ind_rnn import IndRNNCell, RNN

cells = [IndRNNCell(128), IndRNNCell(128)]

ip = Input(...)
x = RNN(cells)(ip)
...
```
Usage of the IndRNN layer:

```python
from ind_rnn import IndRNN

ip = Input(...)
x = IndRNN(128)(ip)
...
```
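What makes IndRNN different from a vanilla RNN is that the recurrent weight is a per-unit scalar rather than a dense matrix, so units evolve independently across time. A NumPy sketch of one recurrence step (illustrative, not the repository's cell code):

```python
import numpy as np

def ind_rnn_step(x_t, h_prev, W, u, b):
    # each unit has a single scalar recurrent weight (u is a vector), so the
    # units evolve independently over time; bounding |u| controls gradient growth
    return np.maximum(x_t @ W + u * h_prev + b, 0.0)   # ReLU activation

rng = np.random.default_rng(0)
units, input_dim, timesteps = 8, 4, 1000
W = 0.1 * rng.standard_normal((input_dim, units))
u = rng.uniform(0.0, 1.0, size=units)  # per-unit recurrent weights, |u| < 1
b = np.zeros(units)

h = np.zeros(units)
for _ in range(timesteps):
    h = ind_rnn_step(rng.standard_normal(input_dim), h, W, u, b)
print(h.shape, np.isfinite(h).all())  # (8,) True: stable over 1000 steps
```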
Implementation of the paper Training RNNs as Fast as CNNs for Keras 2.0+. SRU is a recurrent unit that can run over 10 times faster than the cuDNN LSTM, with no loss of accuracy on many tested tasks, when implemented with a custom CUDA kernel.
This is a naive implementation with some speed gains over the generic LSTM cells, however its speed is not yet 10x that of cuDNN LSTMs.
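The source of SRU's speed is that none of its matrix multiplications depend on the previous hidden state, so they can all be computed up front in parallel over time; only cheap elementwise work remains sequential. A NumPy sketch of the recurrence (illustrative, assuming equal input and hidden sizes for the highway connection):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru(x, W, Wf, Wr, bf, br):
    """x: (timesteps, d). All matmuls depend only on the inputs, so they are
    batched over time; the sequential loop is purely elementwise."""
    xt = x @ W
    f = sigmoid(x @ Wf + bf)   # forget gates for all timesteps at once
    r = sigmoid(x @ Wr + br)   # highway gates for all timesteps at once
    c = np.zeros(x.shape[1])
    h = np.empty_like(x)
    for t in range(x.shape[0]):
        c = f[t] * c + (1.0 - f[t]) * xt[t]              # internal state
        h[t] = r[t] * np.tanh(c) + (1.0 - r[t]) * x[t]   # highway connection
    return h

rng = np.random.default_rng(0)
T, d = 20, 8
x = rng.standard_normal((T, d))
W, Wf, Wr = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
out = sru(x, W, Wf, Wr, np.zeros(d), np.zeros(d))
print(out.shape)  # (20, 8)
```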
Implementation of the paper Multiplicative LSTM for sequence modelling for Keras 2.0+. Multiplicative LSTMs have been shown to achieve state-of-the-art or close to SotA results for sequence modelling datasets. They also perform better than stacked LSTM models for the Hutter-prize dataset and the raw wikipedia dataset.
Import the multiplicative_lstm.py script into your repository, and import the MultiplicativeLSTM layer.

E.g. you can replace Keras LSTM layers with MultiplicativeLSTM layers:

```python
from multiplicative_lstm import MultiplicativeLSTM
```
Implementation of the paper MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks for Keras 2.0+. Minimal RNNs are a recurrent neural network architecture that achieves comparable performance to the popular gated RNNs with a simplified structure. They employ minimal updates within the RNN, which not only leads to efficient learning and testing but, more importantly, better interpretability and trainability.
Import minimal_rnn.py and use either the MinimalRNNCell or the MinimalRNN layer:

```python
from minimal_rnn import MinimalRNN  # this imports the layer rather than the cell

ip = Input(...)  # Rank 3 input shape
x = MinimalRNN(units=128)(ip)
...
```
Implementation of the paper Nested LSTMs for Keras 2.0+. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking: the value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. In the paper's experiments, Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters on various character-level language modeling tasks, and the inner memories of an NLSTM learn longer-term dependencies than the higher-level units of a stacked LSTM.
```python
from nested_lstm import NestedLSTM

ip = Input(shape=(nb_timesteps, input_dim))
x = NestedLSTM(units=64, depth=2)(ip)
...
```
A set of scripts which can be used to add advanced functionality to Keras.
Switchable Normalization is a normalization technique that is able to learn different normalization operations for different normalization layers in a deep neural network in an end-to-end manner.
Keras port of the implementation of the paper Differentiable Learning-to-Normalize via Switchable Normalization.
Code ported from the switchnorm official repository.
This only implements the moving average version of batch normalization component from the paper. The batch average technique cannot be easily implemented in Keras as a layer, and therefore it is not supported.
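Conceptually, Switchable Normalization computes the statistics of Instance, Layer and Batch Normalization simultaneously and blends them with learned softmax weights before normalizing. A NumPy sketch of that idea (illustrative only, omitting the learned scale/shift and the moving averages used at inference):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def switch_norm(x, mean_logits, var_logits, eps=1e-5):
    """x: (N, H, W, C). Blend Instance, Layer and Batch Norm statistics using
    two learned 3-way softmax weight vectors, then normalize."""
    axes = {"in": (1, 2), "ln": (1, 2, 3), "bn": (0, 1, 2)}
    means = {k: x.mean(axis=a, keepdims=True) for k, a in axes.items()}
    varis = {k: x.var(axis=a, keepdims=True) for k, a in axes.items()}
    wm, wv = softmax(mean_logits), softmax(var_logits)
    mean = sum(w * means[k] for w, k in zip(wm, axes))
    var = sum(w * varis[k] for w, k in zip(wv, axes))
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = 3.0 * rng.standard_normal((2, 4, 4, 3)) + 1.0
out = switch_norm(x, np.zeros(3), np.zeros(3))  # zero logits: equal 1/3 weights
print(out.shape)  # (2, 4, 4, 3)
```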
Simply import switchnorm.py and replace the BatchNormalization layer with the SwitchNormalization layer:

```python
from switchnorm import SwitchNormalization

ip = Input(...)
...
x = SwitchNormalization(axis=-1)(x)
...
```
A Keras implementation of Group Normalization by Yuxin Wu and Kaiming He.
Useful for fine-tuning large models on smaller batch sizes than in the research setting (where batch sizes are very large due to multiple GPUs). Similar to Batch Renormalization, but performs significantly better on ImageNet.
As can be seen, GN is independent of batch size, which is crucial for fine-tuning large models that cannot be retrained with small batch sizes due to Batch Normalization's dependence on large batch sizes to compute the statistics of each batch and update its moving average parameters properly.
Drop-in replacement for BatchNormalization layers from Keras. The important parameter that differs from BatchNormalization is called groups. This must be set appropriately, and is subject to certain constraints:

- It needs to be an integer by which the number of channels is divisible.
- 1 <= groups <= #channels, where #channels is the number of channels in the incoming layer.
```python
from group_norm import GroupNormalization

ip = Input(shape=(...))
x = GroupNormalization(groups=32, axis=-1)(ip)
...
```
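The constraints above simply say that groups must divide the channel count. A quick helper (hypothetical, for illustration) to list the valid settings for a given layer:

```python
def valid_groups(channels):
    # 1 <= groups <= #channels and #channels must be divisible by groups
    return [g for g in range(1, channels + 1) if channels % g == 0]

print(valid_groups(32))  # [1, 2, 4, 8, 16, 32]
# groups=1 normalizes over all channels together (like Layer Normalization);
# groups=#channels normalizes each channel alone (like Instance Normalization)
```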
Keras wrapper class for Normalized Gradient Descent from kmkolasinski/max-normed-optimizer, which can be applied to almost all Keras optimizers.
Partially implements Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network for all base Keras optimizers, and allows flexibility to choose any normalizing function. It does not implement adaptive learning rates however.
```python
from keras.optimizers import Adam, SGD
from optimizer import NormalizedOptimizer

sgd = SGD(0.01, momentum=0.9, nesterov=True)
sgd = NormalizedOptimizer(sgd, normalization='l2')

adam = Adam(0.001)
adam = NormalizedOptimizer(adam, normalization='l2')
```
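Conceptually, the wrapper divides each gradient tensor by its chosen norm before the wrapped optimizer's usual update rule is applied. A NumPy sketch of the 'l2' case (illustrative only):

```python
import numpy as np

def l2_normalized_grad(grad, eps=1e-7):
    # rescale the gradient tensor to (approximately) unit L2 norm before the
    # wrapped optimizer's update rule sees it
    return grad / (np.linalg.norm(grad) + eps)

g = np.array([3.0, 4.0])                      # raw gradient with norm 5
print(np.linalg.norm(l2_normalized_grad(g)))  # ~1.0, whatever the raw magnitude
```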
A set of example notebooks and scripts which detail the usage and pitfalls of Eager Execution Mode in Tensorflow using Keras high level APIs.
Implementation of One-Cycle Learning rate policy from the papers by Leslie N. Smith.
- A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay
- Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates
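The shape of the policy can be sketched in a few lines: the learning rate ramps linearly up to a maximum, back down again, then anneals toward zero for the last stretch of training. This is a simplified illustration of the schedule, not this repository's callback (the parameter names are made up):

```python
def one_cycle_lr(step, total_steps, max_lr, div_factor=10.0, end_fraction=0.1):
    """Simplified triangular one-cycle schedule: ramp from max_lr / div_factor
    up to max_lr, ramp back down, then anneal toward zero at the end."""
    anneal_start = int(total_steps * (1.0 - end_fraction))
    base_lr = max_lr / div_factor
    if step < anneal_start:
        half = anneal_start / 2.0
        pct = 1.0 - abs(step - half) / half   # 0 at the edges, 1 at the midpoint
        return base_lr + (max_lr - base_lr) * pct
    pct = (step - anneal_start) / (total_steps - anneal_start)
    return base_lr * (1.0 - pct)              # final annealing phase

lrs = [one_cycle_lr(s, total_steps=100, max_lr=1.0) for s in range(100)]
print(lrs[0], max(lrs))  # starts at max_lr / div_factor, peaks at max_lr
```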
Batch Renormalization algorithm implementation in Keras 1.2.1. Original paper by Sergey Ioffe: Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models.
Import the batch_renorm.py script into your repository, and import the BatchRenormalization layer.

E.g. you can replace Keras BatchNormalization layers with BatchRenormalization layers:

```python
from batch_renorm import BatchRenormalization
```
Implementation of the paper Snapshot Ensembles
The technique is simple to implement in Keras, using a custom callback. These callbacks can be built using the SnapshotCallbackBuilder class in snapshot.py, and other models can simply use the same callback builder to be trained in a similar manner.
- Download the 6 WRN-16-4 weights that are provided in the Release tab of the project and place them in the weights directory
- Run the train_cifar_10.py script to train the WRN-16-4 model on CIFAR-10 dataset (not required since weights are provided)
- Run the predict_cifar_10.py script to make an ensemble prediction.
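At the heart of the snapshot technique is a cyclic cosine-annealed learning rate: the LR restarts at its initial value at the start of each cycle and anneals toward zero, with a snapshot saved at each minimum. A sketch of that schedule from the paper (the numbers below are just an example, not this repository's training configuration):

```python
import math

def snapshot_lr(epoch, total_epochs, n_cycles, initial_lr):
    """Cyclic cosine annealing: the LR restarts at initial_lr at the start of
    each cycle and anneals to ~0; a snapshot is saved at each minimum."""
    epochs_per_cycle = total_epochs // n_cycles
    pos = epoch % epochs_per_cycle
    return (initial_lr / 2.0) * (math.cos(math.pi * pos / epochs_per_cycle) + 1.0)

lrs = [snapshot_lr(e, total_epochs=100, n_cycles=5, initial_lr=0.1) for e in range(100)]
print(lrs[0], lrs[20])  # the LR restarts at the start of every 20-epoch cycle
```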
Contains weights for WRN-CIFAR100-16-4 and WRN-CIFAR10-16-4 (snapshot ensemble weights - ranging from 1-5 and including single best model)
Available at : Snapshot Ensembles in Keras