# Classification with TMVA and Keras
In this example, we use the same BDT as before and add a neural network built with the [Keras](https://keras.io/) deep learning library.

In [1]:
from ROOT import TMVA, TFile, TTree, TCut
from subprocess import call
from os.path import isfile

Welcome to JupyROOT 6.14/04


In [2]:
# Setup TMVA
TMVA.Tools.Instance()
TMVA.PyMethodBase.PyInitialize()

output = TFile.Open('TMVA.root', 'RECREATE')
factory = TMVA.Factory('TMVAClassification', output,
                       '!V:!Silent:Color:DrawProgressBar:AnalysisType=Classification')


In [3]:
data = TFile.Open('../data/higgs_small.root')
# data = TFile.Open('../data/higgs.root')
signal = data.Get('TreeS')
background = data.Get('TreeB')

dataloader = TMVA.DataLoader('dataset')
for branch in signal.GetListOfBranches():
    dataloader.AddVariable(branch.GetName())

dataloader.AddSignalTree(signal, 1.0)
dataloader.AddBackgroundTree(background, 1.0)
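# Note: 300000 training events per class is sized for the full higgs.root.
# higgs_small.root holds only 5296 signal and 4703 background events,
# so with the small file this request exceeds what is available.
dataloader.PrepareTrainingAndTestTree(TCut(''),
                                      'nTrain_Signal=300000:nTrain_Background=300000:SplitMode=Random:NormMode=NumEvents:!V')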

DataSetInfo              : [dataset] : Added class "Signal"
                         : Add Tree TreeS of type Signal with 5296 events
DataSetInfo              : [dataset] : Added class "Background"
                         : Add Tree TreeB of type Background with 4703 events
                         : Dataset[dataset] : Class index : 0  name : Signal
                         : Dataset[dataset] : Class index : 1  name : Background


---
First we import the Keras libraries, then we define the neural network. The input shape must be specified and must equal the number of input variables, which is 28 in this example.

In [4]:
from keras import optimizers
from keras.models import Sequential, save_model
from keras.layers import Dense, Dropout

Using TensorFlow backend.


In [5]:
# Generate model
model = Sequential()
# Input layer
model.add(Dense(150, input_shape=(28,), activation='relu'))
# Hidden layers
for i in range(2):
    model.add(Dense(150, activation='relu'))
# Dropout for regularization
model.add(Dropout(0.5))
# Output layer: one node per class (signal, background)
model.add(Dense(2, activation='softmax'))
# Set loss and optimizer
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(), metrics=['accuracy'])

# Store model to file
model.save('model.h5')
# Print model summary (layers, parameters, etc.)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 150)               4350      
_________________________________________________________________
dense_2 (Dense)              (None, 150)               22650     
_________________________________________________________________
dense_3 (Dense)              (None, 150)               22650     
_________________________________________________________________
dropout_1 (Dropout)          (None, 150)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 2)                 302       
Total params: 49,952
Trainable params: 49,952
Non-trainable params: 0
_________________________________________________________________


2018-09-06 20:08:51.959651: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-06 20:08:51.960727: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
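
The parameter counts in the model summary can be cross-checked by hand, which is a quick way to confirm that the input shape matches the number of variables. The following is a minimal sketch that assumes only the layer widths defined above; `dense_params` is a helper introduced here for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units

# Layer widths taken from the model above: 28 inputs, three layers of
# 150 units, and a 2-unit softmax output.
widths = [28, 150, 150, 150, 2]

# Dropout has no trainable parameters, so only the Dense layers contribute.
per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(widths, widths[1:])]
total = sum(per_layer)

print(per_layer)  # [4350, 22650, 22650, 302]
print(total)      # 49952
```

These values agree with the `Param #` column and the total of 49,952 reported by `model.summary()`.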


In [6]:
factory.BookMethod(dataloader, TMVA.Types.kPyKeras, 'DNN',
                   'H:!V:VarTransform=G,D:FilenameModel=model.h5:NumEpochs=10:BatchSize=64')

factory.BookMethod(dataloader, TMVA.Types.kBDT, 'BDT',
                   '!H:!V:VarTransform=G,D:NTrees=800:MaxDepth=3:nCuts=20')

TypeError: none of the 3 overloaded methods succeeded. Full details:
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TString theMethodName, TString methodTitle, TString theOption = "") =>
    could not convert argument 2
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TMVA::Types::EMVA theMethod, TString methodTitle, TString theOption = "") =>
    FATAL error (C++ exception of type runtime_error)
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader*, TMVA::Types::EMVA, TString, TString, TMVA::Types::EMVA, TString) =>
    takes at least 6 arguments (4 given)

Factory                  : Booking method: DNN
                         : 
DNN                      : [dataset] : Create Transformation "G" with events from all classes.
                         : 
                         : Transformation, Variable selection : 
                         : Input : variable 'lepton_pT' <---> Output : variable 'lepton_pT'
                         : Input : variable 'lepton_eta' <---> Output : variable 'lepton_eta'
                         : Input : variable 'lepton_phi' <---> Output : variable 'lepton_phi'
                         : Input : variable 'missing_energy_magnitude' <---> Output : variable 'missing_energy_magnitude'
                         : Input : variable 'missing_energy_phi' <---> Output : variable 'missing_energy_phi'
                         : Input : variable 'jet_1_pt' <---> Output : variable 'jet_1_pt'
                         : Input : variable 'jet_1_eta' <---> Output : variable 'jet_1_eta'
                         : Input : 
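
The option strings passed to `BookMethod` are colon-separated lists of flags and `key=value` settings, and they are easy to mistype when written by hand. One possible way to assemble them is sketched below. `tmva_options` is a hypothetical helper, not part of TMVA, and it relies on keyword arguments keeping their order (Python 3.7 or later).

```python
def tmva_options(*flags, **settings):
    """Join flags and key=value settings into a colon-separated option string.

    Hypothetical helper, not part of TMVA: flags are passed through as
    written (for example 'H' or '!V'), settings are rendered as 'key=value'.
    """
    parts = list(flags) + ['{}={}'.format(k, v) for k, v in settings.items()]
    # Reject empty entries so a stray '::' can never appear in the result.
    if any(not p for p in parts):
        raise ValueError('empty option in {!r}'.format(parts))
    return ':'.join(parts)


# Rebuild the option string used for the Keras method above.
dnn_options = tmva_options('H', '!V', VarTransform='G,D',
                           FilenameModel='model.h5',
                           NumEpochs=10, BatchSize=64)
print(dnn_options)
# H:!V:VarTransform=G,D:FilenameModel=model.h5:NumEpochs=10:BatchSize=64
```

The resulting string could then be passed as the last argument of `factory.BookMethod` in place of the literal.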

In [7]:
# Run training, test and evaluation
factory.TrainAllMethods()
factory.TestAllMethods()
factory.EvaluateAllMethods()


Exception: void TMVA::Factory::TrainAllMethods() =>
    FATAL error (C++ exception of type runtime_error)

Factory                  : Train all methods
DataSetFactory           : [dataset] : Number of events in input trees
                         : 
<FATAL>                         : Dataset[dataset] : More events requested for training (300000) than available (5296)!
***> abort program execution
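
The fatal error above arises because 300000 training events per class are requested while the small input file provides only 5296 signal and 4703 background events. One way to avoid it is to derive the training sizes from the available event counts instead of hard-coding them. The sketch below illustrates the idea. `training_sizes` and its `train_fraction` default of 0.5 are assumptions made here for illustration, not part of TMVA.

```python
def training_sizes(n_signal, n_background, requested, train_fraction=0.5):
    """Cap the requested training size per class at a fraction of what exists.

    Hypothetical helper: train_fraction is an assumed choice that leaves
    the remaining events of each class for testing.
    """
    n_sig = min(requested, int(n_signal * train_fraction))
    n_bkg = min(requested, int(n_background * train_fraction))
    return n_sig, n_bkg


# Event counts reported by the DataLoader for higgs_small.root.
n_sig, n_bkg = training_sizes(5296, 4703, requested=300000)

split_options = ('nTrain_Signal={}:nTrain_Background={}'
                 ':SplitMode=Random:NormMode=NumEvents:!V').format(n_sig, n_bkg)
print(split_options)
# nTrain_Signal=2648:nTrain_Background=2351:SplitMode=Random:NormMode=NumEvents:!V
```

In the notebook, the counts could be read from the trees with `signal.GetEntries()` and `background.GetEntries()`, and `split_options` passed to `dataloader.PrepareTrainingAndTestTree` in place of the hard-coded string. With the full `higgs.root` file the cap would only apply if fewer than 600000 events per class were available.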
