# Using RCNN: Examples of CNN networks

This notebook trains the same CNN in two ways:

1. Using the compiled libraries (GSL)
2. Using the native R libraries

#### Load Libraries

In [1]:
library(rcnn);

## CNNs: the MNIST example

### Load Dataset

* Previously it was loaded from an RDS file.
* Now it is included among the package datasets as "mnist".

In [2]:
data(mnist)

In [3]:
img_size <- c(28,28);

training_x <- array(mnist$train$x, c(nrow(mnist$train$x), 1, img_size)) / 255;
training_y <- binarization(mnist$train$y);

testing_x <- array(mnist$test$x, c(nrow(mnist$test$x), 1, img_size)) / 255;
testing_y <- binarization(mnist$test$y);
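The reshaping step can be illustrated on toy data. This is a minimal sketch in base R, assuming that `binarization` returns a one-hot matrix with one row per sample (the two-index slicing used later in this notebook suggests a matrix). The helper `one_hot` is a hypothetical stand-in for illustration, not the package function.

```r
# Toy stand-in for mnist$train$x: 3 "images" of 2x2 pixels, one image per row.
x <- matrix(0:11 * 20, nrow = 3)
img_size <- c(2, 2)

# Same call shape as in the notebook: samples x channels x height x width, scaled to [0, 1].
x4d <- array(x, c(nrow(x), 1, img_size)) / 255
dim(x4d)

# Hypothetical one-hot encoder (an assumption about what binarization produces).
one_hot <- function(y, classes = sort(unique(y))) {
    m <- matrix(0, nrow = length(y), ncol = length(classes),
                dimnames = list(NULL, classes))
    m[cbind(seq_along(y), match(y, classes))] <- 1
    m
}

y_onehot <- one_hot(c(0, 2, 1), classes = 0:2)
y_onehot
```

Because R fills arrays in column-major order, `x4d[i, 1, , ]` holds row `i` of `x` laid out column by column.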

Let's reduce the dataset to the first 1000 training and 1000 testing samples for this example

In [4]:
training_x <- training_x[1:1000,,,, drop=FALSE];
training_y <- training_y[1:1000,, drop=FALSE];

testing_x <- testing_x[1:1000,,,, drop=FALSE];
testing_y <- testing_y[1:1000,, drop=FALSE];
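The `drop=FALSE` argument matters here because the channel dimension has size 1. A short sketch on a toy array shows what base R does with and without it:

```r
# Toy 4D array: 2 samples, 1 channel, 3x3 pixels.
x <- array(seq_len(2 * 1 * 3 * 3), c(2, 1, 3, 3))

# Without drop = FALSE, R silently removes every dimension of extent 1.
dims_dropped <- dim(x[1:2, , , ])                 # 2 3 3: the channel dimension is gone
dims_kept    <- dim(x[1:2, , , , drop = FALSE])   # 2 1 3 3: still a 4D array

dims_dropped
dims_kept
```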

### Prepare the Network

The layer descriptor must be a list of vectors, one per layer, each holding that layer's type and hyperparameters. See the help for train.cnn for the available layer types and the properties of each.

In [5]:
layers <- list(
    c('type' = "CONV", 'n_channels' = 1, 'n_filters' = 4, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 4, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2),
    c('type' = "RELU", 'n_channels' = 4),
    c('type' = "CONV", 'n_channels' = 4, 'n_filters' = 16, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 16, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2),
    c('type' = "RELU", 'n_channels' = 16),
    c('type' = "FLAT", 'n_channels' = 16),
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 64, 'scale' = 0.1),
    c('type' = "GBRL", 'n_visible' = 64, 'n_hidden' = 32, 'scale' = 0.1, 'n_gibbs' = 4),
    c('type' = "RELV"),
    c('type' = "LINE", 'n_visible' = 32, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
);
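The value `n_visible = 784` in the first LINE layer has to match the size of the flattened output of the convolutional part. The sketch below reproduces that arithmetic under two assumptions that are not stated in the notebook: a CONV layer with `border_mode = 'same'` keeps the image size, and a POOL layer with stride 2 halves each side, rounding up. It also shows that, since `c()` coerces mixed values to one type, every entry of a layer vector is stored as a character string.

```r
# Convolutional part of the descriptor above.
conv_part <- list(
    c('type' = "CONV", 'n_channels' = 1, 'n_filters' = 4, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 4, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2),
    c('type' = "CONV", 'n_channels' = 4, 'n_filters' = 16, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 16, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2)
)

# Mixed numbers and strings in c() are coerced to character.
is.character(conv_part[[1]])

side <- 28       # MNIST images are 28x28
channels <- 1
for (l in conv_part) {
    if (l[["type"]] == "CONV") {
        channels <- as.numeric(l[["n_filters"]])           # assumed: 'same' keeps the side length
    } else if (l[["type"]] == "POOL") {
        side <- ceiling(side / as.numeric(l[["stride"]]))  # assumed: output side = ceiling(side / stride)
    }
}

flat_size <- channels * side * side
flat_size   # 16 * 7 * 7 = 784
```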

### Train the CNN

The training function receives as inputs:

* training_x: the training dataset
* training_y: the labels for the training dataset
* layers: the descriptor of the network layers

It also receives the following hyperparameters:

* batch_size: the size of each training mini-batch
* training_epochs: the number of training epochs
* learning_rate: the learning rate for Gradient Descent
* momentum: the momentum for Gradient Descent (**Not Implemented Yet**)
* rand_seed: the random seed for selecting samples and stochastic layers
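To give a feel for how `batch_size`, `training_epochs` and `rand_seed` interact, here is a generic mini-batch loop in base R. It is only a sketch of the usual scheme (shuffle the sample indices each epoch, then cut them into batches) and is not taken from rcnn's internals, which may select samples differently.

```r
n_samples <- 1000
batch_size <- 10
training_epochs <- 3

set.seed(1234)   # fixing the seed makes the shuffling reproducible

updates <- 0
for (epoch in seq_len(training_epochs)) {
    idx <- sample(n_samples)                                      # shuffled sample indices
    batches <- split(idx, ceiling(seq_along(idx) / batch_size))   # list of index vectors
    for (b in batches) {
        # A real trainer would run a forward and backward pass on samples b here.
        updates <- updates + 1
    }
}

updates   # 100 batches per epoch * 3 epochs = 300
```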

In [6]:
mnist_cnn <- train.cnn(training_x,
                       training_y,
                       layers,
                       batch_size = 10,
                       training_epochs = 3,
                       learning_rate = 1e-3,
                       rand_seed = 1234
);

### Predict using the CNN

In [7]:
prediction <- predict(mnist_cnn, testing_x);
str(prediction)

List of 2
 $ score: num [1:1000, 1:10] 0.1 0.1 0.1 0.1 0.1 ...
 $ class: int [1:1000] 7 7 7 7 7 7 7 7 7 7 ...
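A prediction of this shape can be compared with the one-hot labels to get an accuracy. The sketch below uses toy matrices rather than the real `prediction` and `testing_y`, and compares column indices on both sides. Whether the `class` element holds digit labels (0 to 9) or 1-based column indices is not stated here, so check that before comparing it with labels directly.

```r
# Toy scores for 2 samples and 3 classes (stand-in for prediction$score).
score <- matrix(c(0.1, 0.7, 0.2,
                  0.6, 0.3, 0.1), nrow = 2, byrow = TRUE)

# Toy one-hot labels (stand-in for testing_y).
truth <- matrix(c(0, 1, 0,
                  0, 0, 1), nrow = 2, byrow = TRUE)

pred_col <- max.col(score, ties.method = "first")   # column with the highest score per row
true_col <- max.col(truth, ties.method = "first")   # column holding the 1 per row

accuracy <- mean(pred_col == true_col)
accuracy   # 0.5: the first sample is right, the second is wrong
```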


### Update / Re-train a CNN

You can pass a trained CNN as initial values for a new training run through init_cnn. The structure (layers) is copied from the old CNN, so the call below omits the layers argument. The function returns a new, updated copy of the old CNN.

In [8]:
mnist_cnn_update <- train.cnn(training_x,
                              training_y,
                              batch_size = 10,
                              training_epochs = 3,
                              learning_rate = 1e-3,
                              rand_seed = 1234,
                              init_cnn = mnist_cnn
);

## Using the R native functions

In [9]:
rm (list = ls());

### Load the R sources

In [10]:
setwd("..");
source("./cnn.R");
setwd("./notebooks");
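An alternative to switching the working directory by hand is the `chdir` argument of base R's `source`, which changes into the script's directory only while the file is evaluated. The sketch below demonstrates this on a throwaway script in a temporary directory. Applied to this notebook it would read `source("../cnn.R", chdir = TRUE)`, assuming `cnn.R` resolves its own dependencies relative to its location.

```r
# Create a throwaway script in a temporary directory.
script_dir <- tempfile("src_")
dir.create(script_dir)
writeLines("sourced_from <- basename(getwd())", file.path(script_dir, "demo.R"))

before <- getwd()
source(file.path(script_dir, "demo.R"), chdir = TRUE)   # runs with script_dir as working directory
after <- getwd()

sourced_from == basename(script_dir)   # the script saw its own directory
identical(before, after)               # the caller's working directory is restored
```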

### Load Dataset

In [11]:
mnist <- readRDS("../datasets/mnist.rds");

In [12]:
img_size <- c(28,28);

training_x <- array(mnist$train$x, c(nrow(mnist$train$x), 1, img_size)) / 255;
training_y <- binarization(mnist$train$y);

testing_x <- array(mnist$test$x, c(nrow(mnist$test$x), 1, img_size)) / 255;
testing_y <- binarization(mnist$test$y);

In [13]:
training_x <- training_x[1:1000,,,, drop=FALSE];
training_y <- training_y[1:1000,, drop=FALSE];

testing_x <- testing_x[1:1000,,,, drop=FALSE];
testing_y <- testing_y[1:1000,, drop=FALSE];

### Prepare the Convolutional MLP Network

The same layer descriptor works for the native R version

In [14]:
layers <- list(
    c('type' = "CONV", 'n_channels' = 1, 'n_filters' = 4, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 4, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2),
    c('type' = "RELU", 'n_channels' = 4),
    c('type' = "CONV", 'n_channels' = 4, 'n_filters' = 16, 'filter_size' = 5, 'scale' = 0.1, 'border_mode' = 'same'),
    c('type' = "POOL", 'n_channels' = 16, 'scale' = 0.1, 'win_size' = 3, 'stride' = 2),
    c('type' = "RELU", 'n_channels' = 16),
    c('type' = "FLAT", 'n_channels' = 16),
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 64, 'scale' = 0.1),
    c('type' = "GBRL", 'n_visible' = 64, 'n_hidden' = 32, 'scale' = 0.1, 'n_gibbs' = 4),
    c('type' = "RELV"),
    c('type' = "LINE", 'n_visible' = 32, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
);

### Train the CNN

The native R CNN is trained like the one in the package

In [15]:
cnn1 <- train_cnn(training_x = training_x,
                  training_y = training_y,
                  layers = layers,
                  batch_size = 10,
                  training_epochs = 3,
                  learning_rate = 1e-3,
                  rand_seed = 1234
);

[1] "Epoch 1 : Mean Loss 2.76630654273982"
[1] "Epoch 1 took 0.493213780721029 minutes"
[1] "Epoch 2 : Mean Loss 2.74857577315278"
[1] "Epoch 2 took 0.508029389381409 minutes"
[1] "Epoch 3 : Mean Loss 2.73429835800919"
[1] "Epoch 3 took 0.501638579368591 minutes"


### Predict using the CNN

In the native version, predict_cnn is not registered as an S3 method, so it is called directly instead of through predict

In [16]:
prediction <- predict_cnn(cnn1, testing_x);
str(prediction)

List of 2
 $ score: num [1:1000, 1:10] 0.0491 0.0673 0.0752 0.0752 0.0752 ...
 $ class: int [1:1000] 5 4 5 5 5 5 4 4 4 5 ...
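The difference between the two calling styles comes from S3 dispatch: `predict(obj, ...)` looks up a method named after the class of `obj`, while a plain function has to be called by its own name. A toy example in base R, using a made-up class `toy` that has nothing to do with rcnn:

```r
# A minimal object with an S3 class attribute.
toy_model <- structure(list(offset = 1), class = "toy")

# S3 method: found by the generic predict() through the class name.
predict.toy <- function(object, newdata, ...) newdata + object$offset

# Plain function: no dispatch, must be called directly.
predict_toy <- function(object, newdata) newdata + object$offset

via_generic <- predict(toy_model, 1:3)       # dispatches to predict.toy
via_direct  <- predict_toy(toy_model, 1:3)   # ordinary function call

via_generic   # 2 3 4
via_direct    # 2 3 4
```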


### Update / Re-train CNN

As in the package version, the trained network is passed through init_cnn. In the output below, the mean loss continues to decrease from where the first run ended

In [17]:
cnn1_update <- train_cnn(training_x = training_x,
                         training_y = training_y,
                         layers = layers,
                         batch_size = 10,
                         training_epochs = 3,
                         learning_rate = 1e-3,
                         rand_seed = 1234,
                         init_cnn = cnn1
);



[1] "Epoch 1 : Mean Loss 2.72344908072433"
[1] "Epoch 1 took 0.494622011979421 minutes"
[1] "Epoch 2 : Mean Loss 2.7149861772083"
[1] "Epoch 2 took 0.495480044682821 minutes"
[1] "Epoch 3 : Mean Loss 2.70824786344924"
[1] "Epoch 3 took 0.493123185634613 minutes"
