# Using RCNN: Examples of MLP networks

The examples are shown in two variants:

1. Using the compiled libraries (GSL)
2. Using the native R libraries

CNNs and MLPs are trained and used by describing the layers that compose the neural network. CNNs use convolution-related layers plus flat layers, while MLPs use only the latter. Inputs for CNNs must be 4-dimensional arrays, while inputs for MLPs must be 2-dimensional.
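To make the difference in input shapes concrete, here is a minimal base R sketch with random toy data. The 4D axis order used here, [batch, channels, height, width], is an assumption made for illustration; check the package help for the order it actually expects.

```r
# Toy batch: 5 grayscale images of 28 x 28 pixels, filled with random values.
set.seed(1)
n_images <- 5
img_size <- c(28, 28)
flat <- matrix(runif(n_images * prod(img_size)), nrow = n_images)

# MLP-style input: 2D, one row per sample and one column per pixel.
dim(flat)

# CNN-style input: 4D. The axis order [batch, channels, height, width]
# is an assumption for this sketch, not taken from the package docs.
as_4d <- array(flat, dim = c(n_images, 1, img_size))
dim(as_4d)

# Reshaping only changes the dim attribute: flattening again recovers the matrix.
back <- array(as_4d, dim = dim(flat))
stopifnot(identical(back, flat))
```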

#### Load Libraries

In [1]:
library(rcnn);

## MLPs: the MNIST example

### Load Dataset

* Previously it was loaded from an RDS file.
* Now it is included in the package's datasets as "mnist".

In [2]:
data(mnist)

For MLPs, input arrays must be 2-dimensional (for CNNs they must be 4-dimensional), so here we flatten each image into one row, giving inputs of shape [batch_size x features]. The pixel values are also divided by 255.

In [3]:
img_size <- c(28,28);

training_x <- array(mnist$train$x, c(nrow(mnist$train$x), prod(img_size))) / 255;
training_y <- binarization(mnist$train$y);

testing_x <- array(mnist$test$x, c(nrow(mnist$test$x), prod(img_size))) / 255;
testing_y <- binarization(mnist$test$y);
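The labels are turned into a matrix with one column per class. As an illustration of that idea only, here is a base R sketch of one-hot encoding. The helper `one_hot` is hypothetical and is not the package's `binarization`, and the column order that `binarization` produces is not shown in this notebook.

```r
# Hypothetical helper (not the package's `binarization`):
# one-hot encode a vector of labels, one column per class.
one_hot <- function(labels, classes = sort(unique(labels))) {
    out <- matrix(0, nrow = length(labels), ncol = length(classes))
    colnames(out) <- classes
    # Set a single 1 per row, in the column matching that row's label.
    out[cbind(seq_along(labels), match(labels, classes))] <- 1
    out
}

labels <- c(5, 0, 4, 1, 9)
encoded <- one_hot(labels, classes = 0:9)

dim(encoded)      # one row per label, one column per class
rowSums(encoded)  # every row contains exactly one 1

# With classes 0:9, digit d lands in (1-based) column d + 1.
max.col(encoded)
```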

Let's reduce the dataset to the first 1000 samples of each split for this example.

In [4]:
training_x <- training_x[1:1000,, drop=FALSE];
training_y <- training_y[1:1000,, drop=FALSE];

testing_x <- testing_x[1:1000,, drop=FALSE];
testing_y <- testing_y[1:1000,, drop=FALSE];
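The `drop=FALSE` argument in the subsetting above keeps the result a matrix even when a single row is selected, which matters because the network expects 2-dimensional input. A small base R illustration with a toy matrix:

```r
m <- matrix(1:6, nrow = 3)

# With drop = FALSE a one-row selection is still a 1 x 2 matrix.
kept <- m[1, , drop = FALSE]
dim(kept)

# Without it, R drops the result to a plain vector with no dim attribute.
dropped <- m[1, ]
is.null(dim(dropped))
```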

### Prepare the MLP Network

The layer descriptor must be a list of vectors with the hyperparameters. Check the help for train.cnn to see the list of layers and the properties of each kind.

For MLPs (unlike CNNs), layers are limited to those that accept matrices as inputs, and the network input itself has to be a matrix (not a 4D array).

In [5]:
layers <- list(
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 64, 'scale' = 0.1),
    c('type' = "GBRL", 'n_visible' = 64, 'n_hidden' = 32, 'scale' = 0.1, 'n_gibbs' = 4),
    c('type' = "RELV"),
    c('type' = "LINE", 'n_visible' = 32, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
);
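Since each layer is described independently, it is easy to mistype a size. The sketch below checks that the widths chain together. It assumes, from the field names alone, that `n_visible` and `n_inputs` are a layer's input width, that `n_hidden` is its output width, and that layers without `n_hidden` keep the width unchanged. The helpers `layer_num` and `check_widths` are hypothetical and not part of the package.

```r
layers <- list(
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 64, 'scale' = 0.1),
    c('type' = "GBRL", 'n_visible' = 64, 'n_hidden' = 32, 'scale' = 0.1, 'n_gibbs' = 4),
    c('type' = "RELV"),
    c('type' = "LINE", 'n_visible' = 32, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
)

# c() coerces mixed values to one type, so each descriptor is a character vector.
stopifnot(is.character(layers[[1]]))

# Hypothetical helper: read one field as a number, NA if the layer lacks it.
layer_num <- function(layer, field) {
    if (field %in% names(layer)) as.numeric(layer[[field]]) else NA_real_
}

# Hypothetical check: follow the data width through the stack of layers.
check_widths <- function(layers, n_features) {
    width <- n_features
    for (i in seq_along(layers)) {
        n_in <- layer_num(layers[[i]], "n_visible")
        if (is.na(n_in)) n_in <- layer_num(layers[[i]], "n_inputs")
        if (!is.na(n_in) && n_in != width) {
            stop(sprintf("layer %d (%s) expects %g inputs but receives %g",
                         i, layers[[i]][["type"]], n_in, width))
        }
        n_out <- layer_num(layers[[i]], "n_hidden")
        if (!is.na(n_out)) width <- n_out
    }
    width  # width of the final output
}

check_widths(layers, n_features = 784)
```

For the descriptor above the check returns 10, the number of output classes.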

### Train the MLP

The training function train.cnn receives as inputs:

* training_x: the training dataset
* training_y: the labels for the training dataset
* layers: the descriptor of the network layers

It also receives the following hyperparameters:

* batch_size: the size of each training mini-batch
* training_epochs: the number of training epochs
* learning_rate: the learning rate for Gradient Descent
* rand_seed: the random seed for selecting samples and stochastic layers

In [6]:
mnist_mlp <- train.cnn(training_x,
                       training_y,
                       layers,
                       batch_size = 10,
                       training_epochs = 3,
                       learning_rate = 1e-3,
                       rand_seed = 1234
);
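As a rough sense of scale for the settings in the call above, the arithmetic below counts the mini-batch updates. It assumes that each epoch visits every sample once and that there is one update per mini-batch, which this notebook does not state explicitly.

```r
n_samples <- 1000       # rows kept in training_x
batch_size <- 10
training_epochs <- 3

# Assumption: one pass over all samples per epoch, one update per mini-batch.
batches_per_epoch <- ceiling(n_samples / batch_size)
total_updates <- batches_per_epoch * training_epochs

batches_per_epoch  # 100
total_updates      # 300
```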

### Predict using the MLP

In [7]:
prediction <- predict(mnist_mlp, testing_x);
str(prediction)

List of 2
 $ score: num [1:1000, 1:10] 0.1 0.1 0.1 0.1 0.1 ...
 $ class: int [1:1000] 7 7 7 7 7 7 7 7 7 7 ...
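The `str` output shows only the first few values, so a summary such as accuracy is more informative. The sketch below uses toy data and a hypothetical helper `accuracy`. It assumes that the predicted class is a 1-based column index aligned with the columns of the one-hot label matrix.

```r
# Hypothetical helper: fraction of rows where the predicted class
# matches the column holding the 1 in the one-hot label matrix.
accuracy <- function(predicted_class, onehot_labels) {
    true_class <- max.col(onehot_labels, ties.method = "first")
    mean(predicted_class == true_class)
}

# Toy data: 4 samples, 3 classes (true classes 1, 2, 3, 2).
toy_labels <- rbind(c(1, 0, 0),
                    c(0, 1, 0),
                    c(0, 0, 1),
                    c(0, 1, 0))
toy_pred <- c(1L, 2L, 2L, 2L)

accuracy(toy_pred, toy_labels)  # 3 of 4 correct: 0.75

# A confusion table shows which classes get mixed up.
table(predicted = toy_pred,
      actual = max.col(toy_labels, ties.method = "first"))
```

Under the same assumption, the notebook's own result could be summarized with `accuracy(prediction$class, testing_y)`.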


### Update / Re-train the MLP

You can pass a trained MLP through the init_cnn argument to provide the initial values for a new MLP. The structure (layers) is copied from the old MLP, so the layers argument can be omitted. The function returns a new, updated copy of the old MLP.

In [8]:
mnist_mlp_update <- train.cnn(training_x,
                              training_y,
                              batch_size = 10,
                              training_epochs = 3,
                              learning_rate = 1e-3,
                              rand_seed = 1234,
                              init_cnn = mnist_mlp
);

In [9]:
prediction <- predict(mnist_mlp_update, testing_x);
str(prediction)

List of 2
 $ score: num [1:1000, 1:10] 0.1 0.1 0.1 0.1 0.1 ...
 $ class: int [1:1000] 7 7 7 7 7 7 7 7 7 7 ...


## Using the R native functions

In [10]:
rm (list = ls());

### Load the R sources

In [11]:
setwd("..");
source("./cnn.R");
setwd("./notebooks");

### Load Dataset

In [12]:
mnist <- readRDS("../datasets/mnist.rds");

In [13]:
img_size <- c(28,28);

training_x <- array(mnist$train$x, c(nrow(mnist$train$x), prod(img_size))) / 255;
training_y <- binarization(mnist$train$y);

testing_x <- array(mnist$test$x, c(nrow(mnist$test$x), prod(img_size))) / 255;
testing_y <- binarization(mnist$test$y);

In [14]:
training_x <- training_x[1:1000,, drop=FALSE];
training_y <- training_y[1:1000,, drop=FALSE];

testing_x <- testing_x[1:1000,, drop=FALSE];
testing_y <- testing_y[1:1000,, drop=FALSE];

### Prepare the MLP Network

The same layer descriptor works for the native R version.

In [15]:
layers <- list(
    c('type' = "LINE", 'n_visible' = 784, 'n_hidden' = 64, 'scale' = 0.1),
    c('type' = "GBRL", 'n_visible' = 64, 'n_hidden' = 32, 'scale' = 0.1, 'n_gibbs' = 4),
    c('type' = "RELV"),
    c('type' = "LINE", 'n_visible' = 32, 'n_hidden' = 10, 'scale' = 0.1),
    c('type' = "SOFT", 'n_inputs' = 10)
);

### Train the MLP

The native R MLP is trained like the one in the package, calling train_cnn instead of train.cnn.

In [16]:
mlp1 <- train_cnn(training_x = training_x,
                  training_y = training_y,
                  layers = layers,
                  batch_size = 10,
                  training_epochs = 3,
                  learning_rate = 1e-3,
                  rand_seed = 1234
);

[1] "Epoch 1 : Mean Loss 2.74712236792918"
[1] "Epoch 1 took 0.00619792540868123 minutes"
[1] "Epoch 2 : Mean Loss 2.75131438678832"
[1] "Epoch 2 took 0.00450321038564046 minutes"
[1] "Epoch 3 : Mean Loss 2.74430054101487"
[1] "Epoch 3 took 0.00607141256332397 minutes"


### Predict using the MLP

In the native version, predict_cnn is a plain function rather than an S3 method, so it is called directly instead of through predict.

In [17]:
prediction <- predict_cnn(mlp1, testing_x);
str(prediction)

List of 2
 $ score: num [1:1000, 1:10] 0.1073 0.1845 0.0831 0.1138 0.1062 ...
 $ class: int [1:1000] 2 1 4 2 6 5 6 7 7 4 ...
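The difference between an S3 method and a plain function can be shown with a toy example in base R. Everything here (`toy_model`, `predict.toy_model`, `predict_toy`) is hypothetical and unrelated to the package internals; it only illustrates how `predict` dispatches on the class of its first argument.

```r
# Hypothetical toy model: a list tagged with an S3 class.
toy_model <- structure(list(offset = 1), class = "toy_model")

# S3 method: the generic predict() finds it through the object's class.
predict.toy_model <- function(object, newdata, ...) {
    newdata + object$offset
}
predict(toy_model, c(1, 2, 3))

# Plain function: no dispatch, so it must be called by its own name.
predict_toy <- function(model, newdata) {
    newdata + model$offset
}
predict_toy(toy_model, c(1, 2, 3))
```

Both calls return the same values; only the way the function is found differs.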


### Update / Re-train the MLP

In [18]:
mlp1_update <- train_cnn(training_x = training_x,
                         training_y = training_y,
                         layers = layers,
                         batch_size = 10,
                         training_epochs = 3,
                         learning_rate = 1e-3,
                         rand_seed = 1234,
                         init_cnn = mlp1
);



[1] "Epoch 1 : Mean Loss 2.73734172067997"
[1] "Epoch 1 took 0.00547860860824585 minutes"
[1] "Epoch 2 : Mean Loss 2.73679329022619"
[1] "Epoch 2 took 0.00481817722320557 minutes"
[1] "Epoch 3 : Mean Loss 2.72631982060983"
[1] "Epoch 3 took 0.00557696421941121 minutes"


In [19]:
prediction <- predict_cnn(mlp1_update, testing_x);
str(prediction)

List of 2
 $ score: num [1:1000, 1:10] 0.0905 0.1335 0.1203 0.1743 0.0759 ...
 $ class: int [1:1000] 2 4 2 2 2 2 10 2 9 4 ...
