<a href="https://colab.research.google.com/github/vinayakShenoy/DL4CV/blob/master/ensembles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Improving Accuracy with Ensembles
- The term “ensemble methods” generally refers to training a “large” number of models (where the exact value of “large” depends on the classification task) and then combining their output predictions via voting or averaging to increase classification accuracy.
- By averaging the predictions of multiple machine learning models, we can typically outperform a single model chosen at random from the group.
- As in Random Forests, where we train multiple Decision Trees, here we train multiple networks and ask each one to return the class probabilities for a given input data point. These probabilities are averaged, and the class with the highest average probability is taken as the final classification.
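
The combining step described above can be sketched with NumPy alone. The probability arrays below are made-up stand-ins for the outputs of three hypothetical networks, and `average_probabilities` and `majority_vote` are illustrative helper names, not part of any library.

```python
import numpy as np

def average_probabilities(prob_list):
    """Average per-model class probabilities (soft voting)."""
    # stack to (num_models, num_samples, num_classes), then average over models
    return np.mean(np.stack(prob_list, axis=0), axis=0)

def majority_vote(prob_list, num_classes):
    """Each model votes for its top class; the most common vote wins (hard voting)."""
    # votes has shape (num_models, num_samples)
    votes = np.stack([p.argmax(axis=1) for p in prob_list], axis=0)
    # count votes per class for every sample: shape (num_classes, num_samples)
    counts = np.apply_along_axis(np.bincount, 0, votes, minlength=num_classes)
    return counts.argmax(axis=0)

# made-up probabilities: three hypothetical models, two samples, three classes
probs = [
    np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]),
    np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]),
    np.array([[0.2, 0.7, 0.1], [0.2, 0.6, 0.2]]),
]

avg = average_probabilities(probs)
soft_labels = avg.argmax(axis=1)
hard_labels = majority_vote(probs, num_classes=3)
print("averaged probabilities:\n", avg)
print("soft voting labels:", soft_labels)
print("hard voting labels:", hard_labels)
```

In this toy data the two schemes disagree on the first sample: two models narrowly prefer class 0, but the third is confident enough in class 1 to win once the probabilities are averaged. The rest of this notebook uses averaging.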

## Jensen's Inequality
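- Jensen’s Inequality implies that, for a convex loss function, a convex combination (for example, the average) of the models' predictions has an error less than or equal to the average error of the individual models. The ensemble is therefore never worse than its average member, although it can still be worse than its best one.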
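
A small numerical check can make the inequality concrete. The sketch below uses cross-entropy, which is convex in the probability assigned to the true class, together with made-up probabilities from three hypothetical models. It is an illustration under those assumptions, not a proof.

```python
import numpy as np

# made-up probabilities that three hypothetical models assign to the TRUE class
# of four test samples (rows: models, columns: samples)
p_true = np.array([
    [0.9, 0.2, 0.6, 0.7],
    [0.5, 0.8, 0.4, 0.9],
    [0.7, 0.5, 0.1, 0.8],
])

# cross-entropy of each individual model, averaged over the samples
individual_losses = -np.log(p_true).mean(axis=1)

# cross-entropy of the ensemble, which averages the probabilities first
ensemble_loss = -np.log(p_true.mean(axis=0)).mean()

print("individual losses:      ", individual_losses)
print("average individual loss:", individual_losses.mean())
print("ensemble loss:          ", ensemble_loss)
```

The ensemble loss comes out no larger than the average individual loss, which is what the inequality guarantees for a convex loss. It says nothing about how the ensemble compares with the single best model.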
---
## Constructing an ensemble of CNNs

In [None]:
!pip install import_ipynb
!git clone https://github.com/vinayakShenoy/DL4CV
%cd DL4CV

Collecting import_ipynb
  Downloading https://files.pythonhosted.org/packages/63/35/495e0021bfdcc924c7cdec4e9fbb87c88dd03b9b9b22419444dc370c8a45/import-ipynb-0.1.3.tar.gz
Building wheels for collected packages: import-ipynb
  Building wheel for import-ipynb (setup.py) ... done
  Created wheel for import-ipynb: filename=import_ipynb-0.1.3-cp36-none-any.whl size=2976 sha256=34e56a46f5300ed6465d92b09e9201ccfc7f09d0170002977e91d93e5460d842
  Stored in directory: /root/.cache/pip/wheels/b4/7b/e9/a3a6e496115dffdb4e3085d0ae39ffe8a814eacc44bbf494b5
Successfully built import-ipynb
Installing collected packages: import-ipynb
Successfully installed import-ipynb-0.1.3
Cloning into 'DL4CV'...
remote: Enumerating objects: 119, done.
remote: Counting objects: 100% (119/119), done.
remote: Compressing objects: 100% (101/101), done.
remote: Total 119 (delta 35), reused 37 (delta 6), pack-reused 0
Receiving objects: 100% (119/119), 3.24 MiB | 5.69 MiB/s, done.
Resolving deltas: 1

In [None]:
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import load_model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.datasets import cifar10
import os
import glob
import numpy as np
import matplotlib.pyplot as plt
import import_ipynb
from pyimage.nn.MiniVGGNet import MiniVGGNet

importing Jupyter notebook from /content/DL4CV/pyimage/nn/MiniVGGNet.ipynb


In [None]:
args = {
    "output":"output",
    "models":"models",
    "num-models":5
}

In [None]:
# load the training and testing data, then scale it into the range [0,1]
((trainX, trainY), (testX, testY)) = cifar10.load_data()
trainX = trainX.astype("float")/255.0
testX = testX.astype("float")/255.0

# convert the labels from integers to vectors
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
# reuse the encoding fitted on the training labels instead of refitting
testY = lb.transform(testY)

# init the label names for cifar10
labelNames = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [None]:
# construct image generator for data augmentation
aug = ImageDataGenerator(rotation_range=10, width_shift_range=0.1, height_shift_range=0.1,
                         horizontal_flip=True, fill_mode="nearest")

In [None]:
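# make sure the output directories exist before anything is written to them
os.makedirs(args["output"], exist_ok=True)
os.makedirs(args["models"], exist_ok=True)

for i in range(args["num-models"]):
  # init the optimizer and model
  print("INFO training model {}/{}".format(i+1, args["num-models"]))
  # note: `decay` is accepted by older TF 2.x optimizers; more recent Keras
  # releases may require a learning-rate schedule instead
  opt = SGD(learning_rate=0.01, decay=0.01/40, momentum=0.9, nesterov=True)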
  model = MiniVGGNet.build(width=32, height=32, depth=3, classes=10)
  model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

  # train the network
  H = model.fit(aug.flow(trainX, trainY, batch_size=64),
                validation_data=(testX, testY), epochs=40,
                steps_per_epoch=len(trainX)//64, verbose=1)
  
  # save the model to disk
  p = [args["models"], "model_{}.model".format(i)]
  model.save(os.path.sep.join(p))

  # evaluate the network
  predictions = model.predict(testX, batch_size=64)
  report = classification_report(testY.argmax(axis=1), 
                                 predictions.argmax(axis=1), target_names=labelNames)

  # save the classification report to file
  p = [args["output"], "model_{}.txt".format(i)]
  with open(os.path.sep.join(p), "w") as f:
    f.write(report)

  # plot the training loss and accuracy
  p = [args["output"], "model_{}.png".format(i)]
  plt.style.use("ggplot")
  plt.figure()
  plt.plot(np.arange(0, 40), H.history["loss"], label="train_loss")
  plt.plot(np.arange(0, 40), H.history["val_loss"], label="val_loss")
  plt.plot(np.arange(0, 40), H.history["accuracy"], label="train_acc")
  plt.plot(np.arange(0, 40), H.history["val_accuracy"], label="val_acc")
  plt.title("Training Loss and Accuracy for model {}".format(i))
  plt.xlabel("Epoch #")
  plt.ylabel("Loss/Accuracy")
  plt.legend()
  plt.savefig(os.path.sep.join(p))
  plt.close()

In [None]:
(testX, testY) = cifar10.load_data()[1]
testX = testX.astype("float")/255.0

# init labelNames 
labelNames = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]

# convert the labels from the integers to vectors
lb = LabelBinarizer()
testY = lb.fit_transform(testY)
 
# construct the path pattern used to collect the models, then build the list of model paths
modelPaths = os.path.sep.join([args["models"], "*.model"])
modelPaths = list(glob.glob(modelPaths))

In [None]:
models = []

# loop over model paths, load each model and add it to list of models
for (i, modelPath) in enumerate(modelPaths):
  print("INFO loading model {}/{}".format(i+1,
                                          len(modelPaths)))
  models.append(load_model(modelPath))

In [None]:
# init the list of predictions
print("INFO evaluating ensemble")
predictions = []

for model in models:
  # use the current model to make predictions on the testing data,
  # then store these predictions in the aggregate predictions list
  predictions.append(model.predict(testX, batch_size=64))

# predictions is now a list with one (10000, 10) array per model, i.e. shape
# (5, 10000, 10) for 5 models, 10000 images and 10 class probabilities;
# averaging over axis 0 collapses it to a single (10000, 10) array
predictions = np.average(predictions, axis=0)
print(classification_report(testY.argmax(axis=1),
                            predictions.argmax(axis=1), target_names=labelNames))
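
To see how much the ensemble gains over its members, one option is to score each model's predictions separately before averaging them. The sketch below does this on a tiny made-up example; `accuracy` and `compare_to_ensemble` are hypothetical helper names. In this notebook, `prob_list` would be the list of `model.predict(testX, batch_size=64)` outputs (before it is overwritten by the average) and `labels` would be `testY.argmax(axis=1)`.

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    return float(np.mean(probs.argmax(axis=1) == labels))

def compare_to_ensemble(prob_list, labels):
    """Return per-model accuracies and the accuracy of the averaged ensemble."""
    individual = [accuracy(p, labels) for p in prob_list]
    ensemble = accuracy(np.mean(prob_list, axis=0), labels)
    return individual, ensemble

# made-up data: four samples, three classes, three hypothetical models
labels = np.array([0, 1, 2, 1])
prob_list = [
    np.array([[0.4, 0.5, 0.1], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7], [0.2, 0.7, 0.1]]),
    np.array([[0.8, 0.1, 0.1], [0.5, 0.4, 0.1], [0.2, 0.1, 0.7], [0.1, 0.8, 0.1]]),
    np.array([[0.7, 0.2, 0.1], [0.2, 0.7, 0.1], [0.1, 0.5, 0.4], [0.3, 0.6, 0.1]]),
]

individual, ensemble = compare_to_ensemble(prob_list, labels)
print("individual accuracies:", individual)
print("ensemble accuracy:", ensemble)
```

Each toy model makes one mistake on a different sample, so the averaged probabilities correct all three errors. Real networks tend to make correlated mistakes, so the gain on CIFAR-10 should be expected to be smaller than in this contrived case.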