
Example of Feed Forward Network using mlpack

Abhinav Gupta edited this page Apr 19, 2016 · 3 revisions

[Information about dataset]

This dataset represents a set of possible advertisements on Internet pages. The features encode the geometry of the image (if available) as well as phrases occurring in the URL, the image's URL and alt text, the anchor text, and words occurring near the anchor text. The task is to predict whether an image is an advertisement ("ad") or not ("nonad").

Dataset fetched from: http://archive.ics.uci.edu/ml/datasets/Internet+Advertisements [1]

[1] Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

[Information about FFN]

Feed-forward neural networks (FFNNs) are the simplest form of neural network. An FFNN has an input layer, followed by one or more hidden layers and an output layer. Every unit in one layer is connected to every unit in the next layer, and there are no connections among units within a layer.

Input layer: takes the feature vector as input to the network.
Hidden layer: applies its learned parameters (weights and biases) to transform the input it receives.
Output layer: returns the output of the network (in this case, the classification vector).
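
The layer structure described above can be sketched without any library as a forward pass through two fully connected layers. This is only an illustration, not mlpack's implementation: the names `DenseForward` and `FeedForward`, the row-per-output-unit weight layout, and the use of the logistic activation in both layers are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
// Hypothetical layout: weights[j][i] connects input unit i to output unit j.
using Matrix = std::vector<Vector>;

// Logistic (sigmoid) activation.
double Logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One fully connected layer: every input unit feeds every output unit,
// and units within the layer are not connected to each other.
Vector DenseForward(const Matrix& weights, const Vector& bias,
                    const Vector& input)
{
  assert(weights.size() == bias.size());
  Vector output(weights.size());
  for (std::size_t j = 0; j < weights.size(); ++j)
  {
    assert(weights[j].size() == input.size());
    double sum = bias[j];
    for (std::size_t i = 0; i < input.size(); ++i)
      sum += weights[j][i] * input[i];
    output[j] = Logistic(sum);
  }
  return output;
}

// Input layer -> hidden layer -> output layer.
Vector FeedForward(const Matrix& w1, const Vector& b1,
                   const Matrix& w2, const Vector& b2,
                   const Vector& features)
{
  const Vector hidden = DenseForward(w1, b1, features);
  return DenseForward(w2, b2, hidden);
}
```

Calling `FeedForward` with a feature vector returns one value per output unit; thresholding that value would give a class label, which is roughly the role an output layer such as a binary classification layer plays.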

[Dataset preprocessing steps]

Missing values in the Height and Width attributes are filled with the most frequent value of that attribute. Missing values in the binary attributes are filled with 0 (False). For classification, ad is denoted by 1 and nonad by 0.

(Note: the data is currently preprocessed with pandas in Python. The same preprocessing could be implemented with the Armadillo library, which would encourage users to preprocess data with mlpack's tools.)
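
The three preprocessing rules above can be sketched in plain C++ as follows. This is not the pandas script that was actually used; it assumes each column is held as a vector of strings and that a missing cell is stored as "?" (both are assumptions of this sketch), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Assumption: a missing cell is stored as the string "?".
bool IsMissing(const std::string& cell) { return cell == "?"; }

// Most frequent non-missing value of a column. On a tie, the value that
// sorts first as a string wins. Returns "" if every cell is missing.
std::string MostFrequent(const std::vector<std::string>& column)
{
  std::map<std::string, std::size_t> counts;
  for (const std::string& cell : column)
    if (!IsMissing(cell))
      ++counts[cell];

  std::string best;
  std::size_t bestCount = 0;
  for (const auto& entry : counts)
  {
    if (entry.second > bestCount)
    {
      best = entry.first;
      bestCount = entry.second;
    }
  }
  return best;
}

// Replace every missing cell of a column with the given fill value.
void FillMissing(std::vector<std::string>& column, const std::string& fill)
{
  for (std::string& cell : column)
    if (IsMissing(cell))
      cell = fill;
}

// Encode the class label: ad -> 1, nonad -> 0.
int EncodeLabel(const std::string& label)
{
  assert(label == "ad" || label == "nonad");
  return label == "ad" ? 1 : 0;
}
```

For a Height or Width column one would call `FillMissing(column, MostFrequent(column))`, and for a binary attribute `FillMissing(column, "0")`.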

[Example - code]

#include <mlpack/core.hpp>

#include <mlpack/methods/ann/activation_functions/logistic_function.hpp>
#include <mlpack/methods/ann/activation_functions/tanh_function.hpp>

#include <mlpack/methods/ann/init_rules/random_init.hpp>

#include <mlpack/methods/ann/layer/bias_layer.hpp>
#include <mlpack/methods/ann/layer/linear_layer.hpp>
#include <mlpack/methods/ann/layer/base_layer.hpp>
#include <mlpack/methods/ann/layer/dropout_layer.hpp>
#include <mlpack/methods/ann/layer/binary_classification_layer.hpp>

#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/performance_functions/mse_function.hpp>
#include <mlpack/core/optimizers/rmsprop/rmsprop.hpp>
#include <mlpack/methods/pca/pca.hpp>

using namespace mlpack;
using namespace mlpack::ann;
using namespace mlpack::optimization;
using namespace arma;
using namespace mlpack::pca;

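// Builds a feed-forward network with one hidden layer, trains it with RMSprop,
// and returns the classification error on the test set.
// Note: PerformanceFunction is used as the activation function of each layer;
// PerformanceFunctionType is the error function the network is trained with.
// classificationErrorThreshold is accepted but not used in this example.
template<
    typename PerformanceFunction,
    typename OutputLayerType,
    typename PerformanceFunctionType,
    typename MatType = arma::mat
>
double BuildNetwork(MatType& trainData,
                    MatType& trainLabels,
                    MatType& testData,
                    MatType& testLabels,
                    const size_t hiddenLayerSize,
                    const size_t maxEpochs,
                    const double classificationErrorThreshold)
{
  /*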
   * Construct a feed forward network with trainData.n_rows input nodes,
   * hiddenLayerSize hidden nodes and trainLabels.n_rows output nodes. The
   * network structure looks like:
   *
   *  Input         Hidden        Output
   *  Layer         Layer         Layer
   * +-----+       +-----+       +-----+
   * |     |       |     |       |     |
   * |     +------>|     +------>|     |
   * |     |     +>|     |     +>|     |
   * +-----+     | +--+--+     | +-----+
   *             |             |
   *  Bias       |  Bias       |
   *  Layer      |  Layer      |
   * +-----+     | +-----+     |
   * |     |     | |     |     |
   * |     +-----+ |     +-----+
   * |     |       |     |
   * +-----+       +-----+
   */

  // Input -> hidden: linear map plus bias, followed by the activation function.
  LinearLayer<> inputLayer(trainData.n_rows, hiddenLayerSize);
  BiasLayer<> inputBiasLayer(hiddenLayerSize);
  BaseLayer<PerformanceFunction> inputBaseLayer;

  // Hidden -> output: linear map plus bias, followed by the activation function.
  LinearLayer<> hiddenLayer1(hiddenLayerSize, trainLabels.n_rows);
  BiasLayer<> hiddenBiasLayer1(trainLabels.n_rows);
  BaseLayer<PerformanceFunction> outputLayer;

  // Turns the network output into a class label.
  OutputLayerType classOutputLayer;

  auto modules = std::tie(inputLayer, inputBiasLayer, inputBaseLayer,
                          hiddenLayer1, hiddenBiasLayer1, outputLayer);

  FFN<decltype(modules), decltype(classOutputLayer), RandomInitialization,
      PerformanceFunctionType> net(modules, classOutputLayer);

  // RMSprop arguments: step size, alpha, epsilon, maximum iterations, tolerance.
  RMSprop<decltype(net)> opt(net, 0.01, 0.88, 1e-8,
      maxEpochs * trainData.n_cols, 1e-18);

  std::cout << "Starting Training the network" << std::endl;

  net.Train(trainData, trainLabels, opt);
  
  MatType prediction;
  net.Predict(testData, prediction);

  // Count the test points whose prediction matches the label exactly.
  size_t correct = 0;
  for (size_t i = 0; i < testData.n_cols; i++)
  {
    if (arma::sum(arma::sum(
        arma::abs(prediction.col(i) - testLabels.col(i)))) == 0)
    {
      correct++;
    }
  }

  // Error rate = 1 - accuracy.
  double classificationError = 1 - double(correct) / testData.n_cols;
  return classificationError;
}

int main()
{

  // Load the training set; the third argument makes a load failure fatal.
  // After loading, each column is one data point and the last row holds the
  // class labels.
  mat dataset, trainData;
  data::Load("train_predictors_2.csv", dataset, true);
  trainData = dataset.submat(0, 0, dataset.n_rows - 2,
      dataset.n_cols - 1);
  mat trainLabels = dataset.submat(dataset.n_rows - 1, 0, dataset.n_rows - 1,
      dataset.n_cols - 1);

  // Load the test set the same way.
  data::Load("test_predictors_2.csv", dataset, true);
  mat testData = dataset.submat(0, 0, dataset.n_rows - 2,
      dataset.n_cols - 1);
  mat testLabels = dataset.submat(dataset.n_rows - 1, 0, dataset.n_rows - 1,
      dataset.n_cols - 1);

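  // Reduce both datasets to 10 dimensions with PCA. Apply() returns the
  // fraction of variance retained.
  // Note: PCA is fitted separately on the training and test sets here, so the
  // two sets are not guaranteed to be projected onto the same components.
  PCA p;
  double trainVarRetained, testVarRetained;
  std::cout << "Applying PCA on training dataset\n";
  trainVarRetained = p.Apply(trainData, 10);
  std::cout << "PCA complete on training dataset\n" << trainData.n_rows << "\n";
  std::cout << "Applying PCA on test dataset\n";
  testVarRetained = p.Apply(testData, 10);
  std::cout << "PCA complete on test dataset\n" << testData.n_rows << "\n";
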
  double classificationError = BuildNetwork<LogisticFunction,
                                            BinaryClassificationLayer,
                                            MeanSquaredErrorFunction>
      (trainData, trainLabels, testData, testLabels, 8, 200, 0.1);
  std::cout << "Classification Error: " << classificationError << std::endl;
  return 0;
}



/**
g++ -std=c++11 -g internet_ads_ffn.cpp -lmlpack -lblas -llapack -larmadillo -lboost_serialization -lboost_program_options
 **/

/**
Output:
Applying PCA on training dataset
PCA complete on training dataset
10
Applying PCA on test dataset
PCA complete on test dataset
10
Starting Training the network
Classification Error: 0.00358423
**/
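
The classification error printed above is one minus the fraction of test points whose predicted label equals the true label. A minimal stand-alone sketch of that arithmetic, using plain integer labels instead of Armadillo matrices (the function name and the use of `std::vector<int>` are assumptions of this sketch), could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Error rate = 1 - (number of matching labels / number of labels).
double ClassificationError(const std::vector<int>& predicted,
                           const std::vector<int>& actual)
{
  assert(predicted.size() == actual.size());
  assert(!predicted.empty());

  std::size_t correct = 0;
  for (std::size_t i = 0; i < predicted.size(); ++i)
    if (predicted[i] == actual[i])
      ++correct;

  return 1.0 - static_cast<double>(correct) / predicted.size();
}
```

For example, with predictions {1, 0, 1, 1} and labels {1, 0, 0, 1}, three of four labels match and the function returns 0.25.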