# Understanding Random Multi-Model Deep Learning (RMDL)

## Using the RMDL Package

Random Multi-model Deep Learning (RMDL) for classification is a framework that tests various network topologies with varying hyperparameters on the given data to find the model with the highest accuracy.

An implementation of RMDL is available in the Git repository: https://github.com/kk7nc/RMDL

RMDL addresses the problem of finding the optimal deep neural network architecture while simultaneously improving the accuracy and robustness of deep learning architectures. RMDL is an ensemble of three kinds of network architecture:

1. Feed Forward Network (FFN), referred to as DNN in the code below

2. Convolutional Neural Network (CNN)

3. Recurrent Neural Network (RNN)

The overall architecture is shown below.

![](figures/rmdl_architecture.png)

Figure: RMDL schematic architecture

Parameters such as the number of layers in the FFN, CNN, and RNN are chosen randomly. A defined number of random configurations is tested on the data, and the best model is returned. Because it has these three components, RMDL can work with all types of data, such as text, images, video, and fully structured data. In total, 9 deep learning models are generated, 3 each from the DNN (FFN), RNN, and CNN families. All of them are unique owing to random creation.
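The random generation step can be pictured with a small sketch. This is not RMDL's own code: the function name `sample_configs` and the layer and unit ranges are assumptions chosen purely for illustration, and the sketch only draws configurations without training anything.

```python
import random


def sample_configs(n_models, layer_range=(1, 5), unit_range=(64, 512), seed=None):
    """Draw `n_models` random (layers, units) configurations."""
    rng = random.Random(seed)
    configs = []
    for _ in range(n_models):
        configs.append({
            "layers": rng.randint(*layer_range),  # inclusive bounds
            "units": rng.randint(*unit_range),
        })
    return configs


# Three random configurations for each of the three network families.
ensemble = {family: sample_configs(3, seed=i)
            for i, family in enumerate(("DNN", "RNN", "CNN"))}
for family, configs in ensemble.items():
    print(family, configs)
```

Each family gets its own list of candidate configurations, so three draws per family give the nine candidates mentioned above.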




## Installation

Installation can be done in two ways. From source:

```bash
git clone https://github.com/kk7nc/RMDL.git
cd RMDL
python setup.py install
```

Or from PyPI:

```bash
pip install rmdl
```
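After installing, it can be useful to confirm that the package is visible to the interpreter before running the rest of the tutorial. A minimal sketch using only the standard library; the helper name `is_installed` is made up for this example, and `RMDL` is the import name used later in this document.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("RMDL"):
    print("RMDL is available")
else:
    print("RMDL not found; install it with one of the commands above")
```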

## Importing Requirements

```python
# In a notebook, run `!pip install rmdl` first if the package is not yet installed.
import nltk
import numpy as np
from nltk.corpus import reuters
from sklearn.datasets import fetch_20newsgroups
from sklearn.preprocessing import MultiLabelBinarizer

from RMDL import RMDL_Text as RMDL

nltk.download("reuters")  # fetch the Reuters corpus if it is not cached locally
```


# Using RMDL

## Training on `reuters` data
Reuters text classification is a benchmark dataset for multi-label and multi-class classification. The dataset has 90 classes, 7769 training documents, and 3019 testing documents. More about this dataset is available here.

**Loading and preprocessing data:** The Reuters corpus ships with the NLTK package; if it is not present locally, it can be downloaded with `nltk.download("reuters")`. Once downloaded, the train and test document IDs can be loaded as shown below.

```python
documents = reuters.fileids()
# File IDs are prefixed with their split, so the prefix separates train from test.
train_docs_id = [doc for doc in documents if doc.startswith("train")]
test_docs_id = [doc for doc in documents if doc.startswith("test")]
```
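The prefix-based split can be tried without downloading the corpus. A minimal sketch, assuming file IDs of the form `training/<n>` and `test/<n>`; the IDs listed here are made up for illustration.

```python
# Hypothetical file IDs in the style returned by reuters.fileids().
sample_ids = ["test/14826", "training/1", "training/10", "test/14828"]

# The same startswith() check used above splits the IDs by their prefix.
train_ids = [doc for doc in sample_ids if doc.startswith("train")]
test_ids = [doc for doc in sample_ids if doc.startswith("test")]

print(train_ids)
print(test_ids)
```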

Next, the inputs X and the labels y are separated for the train and test sets. The labels y are binarized using `sklearn.preprocessing.MultiLabelBinarizer`.

```python
X_train = [reuters.raw(doc_id) for doc_id in train_docs_id]
X_test = [reuters.raw(doc_id) for doc_id in test_docs_id]
mlb = MultiLabelBinarizer()
# Fit the label set on the training categories only, then reuse it for the test set.
y_train = mlb.fit_transform([reuters.categories(doc_id)
                             for doc_id in train_docs_id])
y_test = mlb.transform([reuters.categories(doc_id)
                        for doc_id in test_docs_id])
```

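```python
# Collapse each multi-hot label row to a single class index.
# np.argmax returns the first maximal entry, so a document with several
# categories keeps only the first one in mlb.classes_ order.
y_train = np.argmax(y_train, axis=1)
y_test = np.argmax(y_test, axis=1)
```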
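To see what the binarize-then-argmax steps do to the labels, here is a dependency-free sketch on made-up category lists. The helpers `binarize` and `first_label_index` are hypothetical stand-ins that mimic the behaviour of `MultiLabelBinarizer` and a row-wise argmax; they are not the library implementations.

```python
def binarize(label_sets):
    """Return (classes, rows): sorted class names and one multi-hot row per document."""
    classes = sorted({label for labels in label_sets for label in labels})
    rows = [[1 if c in labels else 0 for c in classes] for labels in label_sets]
    return classes, rows


def first_label_index(row):
    """Index of the first maximal entry, mirroring argmax on a single row."""
    return row.index(max(row))


# Made-up category lists for three documents.
toy_labels = [["grain", "wheat"], ["earn"], ["crude", "grain"]]
classes, rows = binarize(toy_labels)
single = [first_label_index(row) for row in rows]

print(classes)  # ['crude', 'earn', 'grain', 'wheat']
print(rows)     # [[0, 0, 1, 1], [0, 1, 0, 0], [1, 0, 1, 0]]
print(single)   # [2, 1, 0]
```

Note that the first and third documents each lose one of their two categories: the reduction turns a multi-label problem into a single-label one.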

**Training the network:** The data prepared in the previous step is passed to the RMDL module as shown below. The variable `Random_Deep` specifies how many random models to build for each of the feed-forward (DNN), RNN, and CNN sub-networks, in that order, and `n_epochs` gives the number of training epochs for each type in the same order. The `Text_Classification` function takes X and y for train and test, the batch size, these model settings, and the directory of the GloVe embeddings as input.

```python
batch_size = 100
sparse_categorical = True  # labels are integer class indices, not one-hot rows
n_epochs = [120, 120, 120]  # DNN, RNN, CNN
Random_Deep = [3, 3, 3]  # DNN, RNN, CNN
RMDL.Text_Classification(X_train, y_train, X_test, y_test,
                         batch_size=batch_size,
                         sparse_categorical=sparse_categorical,
                         random_deep=Random_Deep,
                         epochs=n_epochs,
                         GloVe_dir="../embedidngs/glove.6B/")
```
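To make the ordering of the two lists explicit, the sketch below pairs them up by model type. The helper `training_plan` is hypothetical and not part of RMDL. It assumes that both lists follow the DNN, RNN, CNN order given in the code comments, and that `Random_Deep` holds the number of models per type, which matches the 9 models (3 per type) described earlier.

```python
MODEL_TYPES = ("DNN", "RNN", "CNN")


def training_plan(random_deep, epochs):
    """Pair each model type with its model count and epoch budget."""
    if len(random_deep) != len(MODEL_TYPES) or len(epochs) != len(MODEL_TYPES):
        raise ValueError("expected one entry per model type: DNN, RNN, CNN")
    return {
        name: {"models": count, "epochs": n}
        for name, count, n in zip(MODEL_TYPES, random_deep, epochs)
    }


plan = training_plan([3, 3, 3], [120, 120, 120])
total_models = sum(entry["models"] for entry in plan.values())
print(plan)
print(total_models)  # 9
```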

## Training on `20 Newsgroups` data (additional example)

```python
newsgroups_train = fetch_20newsgroups(subset='train')
newsgroups_test = fetch_20newsgroups(subset='test')
```

```python
X_train = newsgroups_train.data
X_test = newsgroups_test.data
y_train = newsgroups_train.target
y_test = newsgroups_test.target
```

```python
batch_size = 100
sparse_categorical = True  # targets are integer class indices
print(len(X_train))  # number of training documents
n_epochs = [500, 500, 500]  # DNN, RNN, CNN
Random_Deep = [3, 3, 3]  # DNN, RNN, CNN
```

```python
RMDL.Text_Classification(X_train, y_train, X_test, y_test,
                         batch_size=batch_size,
                         sparse_categorical=sparse_categorical,
                         random_deep=Random_Deep,
                         epochs=n_epochs)
```