<img src="http://oproject.org/img/ROOTR.png" height="30%" width="30%">
<img src="http://oproject.org/img/tmvalogo.png" height="30%" width="30%">

# The RMVA Interface: TMVA and R

## Required headers

In [1]:
#include "TFile.h"
#include "TRInterface.h"
#include "TMVA/Tools.h"
#include "TMVA/Factory.h"
#include "TMVA/DataLoader.h"
#include "TMVA/MethodC50.h"
#include "TMVA/MethodRSNNS.h"
#include "TMVA/MethodRXGB.h"

## Declare Factory

In [2]:
TMVA::Tools::Instance();

auto inputFile = TFile::Open("https://raw.githubusercontent.com/iml-wg/tmvatutorials/master/inputdata.root");
auto outputFile = TFile::Open("TMVAOutputCV.root", "RECREATE");

TMVA::Factory factory("TMVAClassification", outputFile,
                      "!V:ROC:!Correlations:!Silent:Color:!DrawProgressBar:AnalysisType=Classification" ); 

--- Factory                  : You are running ROOT Version: 6.07/07, Apr 1, 2016
--- Factory                  : 
--- Factory                  : _/_/_/_/_/ _|      _|  _|      _|    _|_|   
--- Factory                  :    _/      _|_|  _|_|  _|      _|  _|    _| 
--- Factory                  :   _/       _|  _|  _|  _|      _|  _|_|_|_| 
--- Factory                  :  _/        _|      _|    _|  _|    _|    _| 
--- Factory                  : _/         _|      _|      _|      _|    _| 
--- Factory                  : 
--- Factory                  : ___________TMVA Version 4.2.1, Feb 5, 2015
--- Factory                  : 
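
The factory options above are passed as a single colon-separated string: `Key=Value` pairs set a value, a bare name such as `ROC` switches a flag on, and a leading `!` (as in `!V`) switches it off. The sketch below illustrates that convention with plain standard C++. `parseOptions` is a hypothetical helper written for this illustration, not a TMVA function, and TMVA's own option parsing may differ in its details.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of TMVA): split a colon-separated option
// string into key/value pairs. Bare flags are stored as "true", flags
// prefixed with '!' are stored as "false".
std::map<std::string, std::string> parseOptions(const std::string& options) {
    std::map<std::string, std::string> parsed;
    std::istringstream stream(options);
    std::string token;
    while (std::getline(stream, token, ':')) {
        if (token.empty()) continue;                  // ignore empty tokens
        const auto eq = token.find('=');
        if (eq != std::string::npos) {
            parsed[token.substr(0, eq)] = token.substr(eq + 1);  // Key=Value
        } else if (token[0] == '!') {
            parsed[token.substr(1)] = "false";        // negated flag
        } else {
            parsed[token] = "true";                   // plain flag
        }
    }
    return parsed;
}
```

Calling `parseOptions("!V:ROC:AnalysisType=Classification")` would give `V` as `"false"`, `ROC` as `"true"` and `AnalysisType` as `"Classification"`.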


## Declare DataLoader

In [3]:
TMVA::DataLoader loader("dataset");

//adding variables to dataset
loader.AddVariable("var1");
loader.AddVariable("var2");
loader.AddVariable("var3");
loader.AddVariable("var4");

## Setting up Dataset

In [4]:
TTree *tsignal, *tbackground;
inputFile->GetObject("Sig", tsignal);
inputFile->GetObject("Bkg", tbackground);

TCut mycuts, mycutb;  //empty cuts: no preselection on signal or background

loader.AddSignalTree     (tsignal, 1);      //signal weight = 1
loader.AddBackgroundTree (tbackground, 1);  //background weight = 1 

loader.PrepareTrainingAndTestTree(mycuts, mycutb,
"nTrain_Signal=1000:nTrain_Background=1000:SplitMode=Random:NormMode=NumEvents:!V"); 

--- DataSetInfo              : Dataset[dataset] : Added class "Signal"	 with internal class number 0
--- dataset                  : Add Tree Sig of type Signal with 6000 events
--- DataSetInfo              : Dataset[dataset] : Added class "Background"	 with internal class number 1
--- dataset                  : Add Tree Bkg of type Background with 6000 events
--- dataset                  : Preparing trees for training and testing...
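
Each input tree holds 6000 events, and the options request 1000 training events per class. Assuming that TMVA assigns every event not used for training to the test sample when no `nTest_*` option is given, this leaves 5000 test events per class, 10000 in total, which agrees with the test sample size reported in the evaluation log further down. The sketch below only spells out that bookkeeping. `SplitSizes` and `splitSizes` are hypothetical names for this illustration, not TMVA code.

```cpp
#include <cassert>

// Illustrative bookkeeping only; not TMVA code.
struct SplitSizes {
    long train;  // events used for training
    long test;   // events left over for testing
};

// Assumption: all events that are not requested for training end up in the
// test sample (the case when no nTest_* option is specified).
SplitSizes splitSizes(long total, long nTrain) {
    assert(nTrain >= 0 && nTrain <= total);
    return {nTrain, total - nTrain};
}
```

For this notebook, `splitSizes(6000, 1000)` yields 1000 training and 5000 test events for each of the two classes.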


## Booking methods
RMVA provides the following methods; each link documents the available booking options:

- C50 Boosted Decision Trees http://oproject.org/tiki-index.php?page=RMVA#C50Booking
- RMLP Neural Networks http://oproject.org/tiki-index.php?page=RMVA#RSNNSMLP 
- Extreme Gradient Boosted (RXGB) Decision Trees http://oproject.org/tiki-index.php?page=RMVA#RXGBBooking

In [5]:
//C50 Boosted Decision Trees (BDTs)
factory.BookMethod(&loader, TMVA::Types::kC50, "C50",
                   "!H:NTrials=5:Rules=kTRUE:ControlSubSet=kFALSE:ControlBands=10:ControlWinnow=kFALSE:ControlNoGlobalPruning=kTRUE:ControlCF=0.25:ControlMinCases=2:ControlFuzzyThreshold=kTRUE:ControlSample=0:ControlEarlyStopping=kTRUE:!V" );
   
//Neural Networks using RSNNS package
factory.BookMethod(&loader, TMVA::Types::kRSNNS, "RMLP",
                   "!H:VarTransform=N:Size=c(5):Maxit=10:InitFunc=Randomize_Weights:LearnFunc=Std_Backpropagation:LearnFuncParams=c(0.2,0):!V" );

//eXtreme Gradient Boosted XGB Decision Trees
factory.BookMethod(&loader, TMVA::Types::kRXGB, "RXGB","!V:NRounds=20:MaxDepth=2:Eta=1" );

//TMVA BDTs
factory.BookMethod(&loader,TMVA::Types::kBDT, "BDT",
                   "!V:NTrees=50:MinNodeSize=2.5%:MaxDepth=2:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20" );

--- Factory                  : Booking method: C50 DataSet Name: dataset
--- DataSetFactory           : Dataset[dataset] : Splitmode is: "RANDOM" the mixmode is: "SAMEASSPLITMODE"
--- DataSetFactory           : Dataset[dataset] : Create training and testing trees -- looping over class "Signal" ...
--- DataSetFactory           : Dataset[dataset] : Weight expression for class 'Signal': ""
--- DataSetFactory           : Dataset[dataset] : Create training and testing trees -- looping over class "Background" ...
--- DataSetFactory           : Dataset[dataset] : Weight expression for class 'Background': ""
--- DataSetFactory           : Dataset[dataset] : Number of events in input trees (after possible flattening of arrays):
--- DataSetFactory           : Dataset[dataset] :     Signal          -- number of events       : 6000   / sum of weights: 6000 
--- DataSetFactory           : Dataset[dataset] :     Background      -- number of events       : 6000   / sum of weights: 600
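
The RMLP booking uses `VarTransform=N`, which requests TMVA's normalisation transformation. To our understanding this rescales each input variable linearly to the range [-1, 1], which is usually helpful for neural networks. The sketch below shows that kind of rescaling for a single variable in plain C++. `normalise` is a hypothetical helper and not TMVA's implementation, and mapping a constant column to 0 is a choice made here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not TMVA's implementation: rescale values linearly so
// that the smallest value maps to -1 and the largest to +1.
std::vector<double> normalise(const std::vector<double>& values) {
    assert(!values.empty());
    const auto mm = std::minmax_element(values.begin(), values.end());
    const double lo = *mm.first;
    const double hi = *mm.second;
    std::vector<double> out;
    out.reserve(values.size());
    for (double v : values) {
        // A constant column has no range; map it to 0 to avoid dividing by zero.
        out.push_back(hi == lo ? 0.0 : 2.0 * (v - lo) / (hi - lo) - 1.0);
    }
    return out;
}
```

Applied to the values 0, 5 and 10, the helper maps the minimum to -1, the midpoint to 0 and the maximum to +1.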

## Training the Methods

In [6]:
factory.TrainAllMethods();

--- Factory                  :  
--- Factory                  : Train all methods for Classification ...
--- Factory                  : 
--- Factory                  : current transformation string: 'I'
--- Factory                  : Dataset[dataset] : Create Transformation "I" with events from all classes.
--- Id                       : Transformation, Variable selection : 
--- Id                       : Input : variable 'var1' (index=0).   <---> Output : variable 'var1' (index=0).
--- Id                       : Input : variable 'var2' (index=1).   <---> Output : variable 'var2' (index=1).
--- Id                       : Input : variable 'var3' (index=2).   <---> Output : variable 'var3' (index=2).
--- Id                       : Input : variable 'var4' (index=3).   <---> Output : variable 'var4' (index=3).
--- Id                       : Preparing the Identity transformation...
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Facto

## Testing and Evaluating the Methods

In [7]:
factory.TestAllMethods();
factory.EvaluateAllMethods();    

--- Factory                  : Test all methods...
--- Factory                  : Test method: C50 for Classification performance
--- C50                      : Dataset[dataset] : Evaluation of C50 on testing sample (10000 events)
--- C50                      : 
--- C50                      : --- Loading State File From: weights/C50Model.RData
--- C50                      : 
--- C50                      : Dataset[dataset] : Elapsed time for evaluation of 10000 events: 0.546 sec
--- Factory                  : Test method: RMLP for Classification performance
--- RMLP                     : Dataset[dataset] : Evaluation of RMLP on testing sample (10000 events)
--- RMLP                     : 
--- RMLP                     : --- Loading State File From: weights/RMLPModel.RData
--- RMLP                     : 
--- RMLP                     : Dataset[dataset] : Elapsed time for evaluation of 10000 events: 3.49 sec
--- Factory                  : Te

## Plotting the ROC Curve
We enable JSROOT, ROOT's JavaScript-based interactive visualisation, and then draw the ROC curves of the booked methods.

In [8]:
%jsroot on
auto c = factory.GetROCCurve(&loader);
c->Draw();
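
A ROC curve is often summarised by the area under it (AUC). One way to read that number is as the probability that a randomly chosen signal event receives a higher classifier score than a randomly chosen background event. The sketch below computes the AUC directly from that definition in plain C++. `rocAuc` is a hypothetical helper for illustration and is not how TMVA computes its ROC integral.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not TMVA's implementation: area under the ROC curve,
// computed as the fraction of (signal, background) pairs in which the signal
// event has the higher score. Ties count as half a win.
double rocAuc(const std::vector<double>& signalScores,
              const std::vector<double>& backgroundScores) {
    assert(!signalScores.empty() && !backgroundScores.empty());
    double wins = 0.0;
    for (double s : signalScores) {
        for (double b : backgroundScores) {
            if (s > b) wins += 1.0;
            else if (s == b) wins += 0.5;
        }
    }
    return wins / (static_cast<double>(signalScores.size()) *
                   static_cast<double>(backgroundScores.size()));
}
```

A perfectly separating classifier gives 1.0 and random guessing gives about 0.5. The double loop is quadratic in the number of events, which is fine for a small illustration; a rank-based computation would scale better on large samples.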