# TMVA_Tutorial_Classification_Tmva
TMVA classification example with the following objective:
 * Train a BDT with TMVA


Modified from [ClassificationKeras.py](https://root.cern/doc/master/ClassificationKeras_8py.html) and [TMVAClassification.C](https://root.cern/doc/master/TMVAClassification_8C.html)


**Author:** Lailin XU  
<i><small>This notebook tutorial was automatically generated with <a href= "https://github.com/root-project/root/blob/master/documentation/doxygen/converttonotebook.py">ROOTBOOK-izer</a> from the macro found in the ROOT repository  on Tuesday, April 27, 2021 at 01:00 AM.</small></i>

In [1]:
from ROOT import TMVA, TFile, TTree, TCut
from subprocess import call
from os.path import isfile
 

Welcome to JupyROOT 6.22/06


Set up TMVA
=======================

In [2]:
TMVA.Tools.Instance()
(TMVA.gConfig().GetVariablePlotting()).fMaxNumOfAllowedVariablesForScatterPlots = 5 
 
outfileName = 'TMVA_tutorial_cla_1.root'
output = TFile.Open(outfileName, 'RECREATE')

Create the factory object. Later you can choose the methods whose performance you'd like to investigate. Apart from the dataloader created below, the factory is the only TMVA object you have to interact with.

   The first argument is the base name of all the weight files written to the weights/ directory.
   The second argument is the output file for the training results.
   The third argument is a colon-separated option string.
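
As a rough illustration of how the first factory argument is used, the sketch below builds the path where a weight file would be expected for each method booked in this notebook. The helper `expected_weight_file` is hypothetical, and the `<dataset>/weights/<job>_<method>.weights.xml` layout is an assumption based on TMVA's usual behaviour; it is worth checking against the files your ROOT version actually writes.

```python
from pathlib import PurePosixPath


def expected_weight_file(dataset_dir, job_name, method_title):
    """Hypothetical helper: path where TMVA is assumed to write the
    XML weight file for one booked method."""
    file_name = f"{job_name}_{method_title}.weights.xml"
    return PurePosixPath(dataset_dir) / "weights" / file_name


# Names taken from this notebook: DataLoader('dataset'), the job name
# "TMVAClassification", and the method titles "BDT" and "BDTG".
for title in ("BDT", "BDTG"):
    print(expected_weight_file("dataset", "TMVAClassification", title))
```

Changing the job name therefore changes every weight-file name at once, which helps keep several trainings apart in the same directory.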

In [3]:
factory = TMVA.Factory("TMVAClassification", output,
  "!V:!Silent:Color:!DrawProgressBar:Transformations=I;D;P;G,D:AnalysisType=Classification")
 

Load data
=======================
Background

In [4]:
trfile_B = "example_data/SM_ttbar.root"

Signal

In [5]:
trfile_S = "example_data/Zp1TeV_ttbar.root"
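# NOTE: tmva_reg_example.root is not read anywhere in this notebook;
# the inputs are the two files under example_data/.
if not isfile('tmva_reg_example.root'):
    call(['curl', '-L', '-O', 'http://root.cern.ch/files/tmva_reg_example.root'])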
 
data_B = TFile.Open(trfile_B)
data_S = TFile.Open(trfile_S)
trname = "tree"
tree_B = data_B.Get(trname)
tree_S = data_S.Get(trname)
 
dataloader = TMVA.DataLoader('dataset')
for branch in tree_S.GetListOfBranches():
    name = branch.GetName()
    if name not in ["mtt_truth", "weight", "nlep", "njets"]:
        dataloader.AddVariable(name)
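
The variable selection loop can be tried out without ROOT by filtering plain branch names. This is a minimal sketch: the helper `select_input_variables` is hypothetical, and the branch list is an illustrative subset that mixes the excluded names with a few variables that appear in the training log.

```python
# Branches that are read from the tree but not used as input variables
EXCLUDED = {"mtt_truth", "weight", "nlep", "njets"}


def select_input_variables(branch_names, excluded=EXCLUDED):
    """Hypothetical helper: keep the branches that are not excluded,
    preserving their original order."""
    return [name for name in branch_names if name not in excluded]


# Illustrative subset of branch names, not the full tree content
branches = ["mtt_truth", "mtt_reco", "weight", "nlep", "njets",
            "lep1_pt", "lep1_eta"]
selected = select_input_variables(branches)
print(selected)
```

Each name in `selected` corresponds to one `dataloader.AddVariable(name)` call in the cell above.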
 

Add Signal and background trees

In [6]:
dataloader.AddSignalTree(tree_S, 1.0)
dataloader.AddBackgroundTree(tree_B, 1.0)

DataSetInfo              : [dataset] : Added class "Signal"
                         : Add Tree tree of type Signal with 24634 events
DataSetInfo              : [dataset] : Added class "Background"
                         : Add Tree tree of type Background with 62710 events


To set individual event weights (the variables must exist in the original TTree), use:

   dataloader.SetSignalWeightExpression("weight")
   dataloader.SetBackgroundWeightExpression("weight")

Tell the dataloader how to use the training and testing events.

If no numbers of events are given, half of the events in the tree are used
for training, and the other half for testing:

   dataloader.PrepareTrainingAndTestTree(mycut, "SplitMode=Random:!V")

To specify the numbers of training and testing events explicitly, use:

   dataloader.PrepareTrainingAndTestTree(mycut,
        "nTrain_Signal=3000:nTrain_Background=3000:nTest_Signal=3000:nTest_Background=3000:SplitMode=Random:!V")

In [7]:
dataloader.PrepareTrainingAndTestTree(TCut(''), "nTrain_Signal=10000:nTrain_Background=10000:SplitMode=Random:NormMode=NumEvents:!V")
 

                         : Dataset[dataset] : Class index : 0  name : Signal
                         : Dataset[dataset] : Class index : 1  name : Background
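
The split requested in the cell above can be checked with simple arithmetic. The sketch assumes that, when no `nTest_*` option is given, every event left over after the training selection goes to the test sample; the signal numbers in the booking log further down (10000 training, 14634 testing) are consistent with that. The helper `split_counts` is hypothetical.

```python
def split_counts(n_total, n_train, n_test=0):
    """Hypothetical helper mirroring the assumed rule: n_test == 0
    means 'use every event not selected for training'."""
    if n_train > n_total:
        raise ValueError("more training events requested than available")
    remaining = n_total - n_train
    if n_test == 0:
        n_test = remaining
    elif n_test > remaining:
        raise ValueError("more testing events requested than remain")
    return n_train, n_test


# Tree sizes reported when the signal and background trees were added
signal = split_counts(24634, 10000)
background = split_counts(62710, 10000)
print("signal (train, test):    ", signal)
print("background (train, test):", background)
```

Under this assumption the test sample is much larger for background than for signal, which is worth keeping in mind when comparing raw event counts after evaluation.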


Generate model

BDT

In [8]:
factory.BookMethod( dataloader,  TMVA.Types.kBDT, "BDT",
  "!H:!V:NTrees=100:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20")
factory.BookMethod( dataloader,  TMVA.Types.kBDT, "BDTG",
  "!H:!V:NTrees=1000:MinNodeSize=2.5%:BoostType=Grad:Shrinkage=0.10:UseBaggedBoost:BaggedSampleFraction=0.5:nCuts=20:MaxDepth=2")

 

<cppyy.gbl.TMVA.MethodBDT object at 0xa752b70>

Factory                  : Booking method: BDT
                         : 
                         : Building event vectors for type 2 Signal
                         : Dataset[dataset] :  create input formulas for tree tree
                         : Building event vectors for type 2 Background
                         : Dataset[dataset] :  create input formulas for tree tree
DataSetFactory           : [dataset] : Number of events in input trees
                         : 
                         : 
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Signal     -- training events            : 10000
                         : Signal     -- testing events             : 14634
                         : Signal     -- training and testing events: 24634
                         : Background -- training events            : 10000
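
To make the `BoostType=AdaBoost` and `AdaBoostBeta=0.5` settings more concrete, here is a small pure-Python sketch of a single AdaBoost reweighting step. It assumes the textbook form described in the TMVA Users Guide, in which misclassified events are multiplied by ((1 - err)/err)^beta and all weights are then rescaled so that their sum is unchanged. The helpers `parse_options` and `adaboost_step` are hypothetical and are not part of TMVA.

```python
def parse_options(option_string):
    """Hypothetical helper: split a TMVA-style option string into a dict.
    'Name=value' keeps the value as a string, a bare 'Name' maps to True,
    and '!Name' maps to False."""
    options = {}
    for token in option_string.split(":"):
        if not token:
            continue
        if "=" in token:
            key, value = token.split("=", 1)
            options[key] = value
        elif token.startswith("!"):
            options[token[1:]] = False
        else:
            options[token] = True
    return options


def adaboost_step(weights, misclassified, beta):
    """One reweighting step; returns (new_weights, boost_factor)."""
    total = sum(weights)
    err = sum(w for w, bad in zip(weights, misclassified) if bad) / total
    # Simplification for this sketch: only handle a usable weak learner
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    factor = ((1.0 - err) / err) ** beta
    boosted = [w * factor if bad else w
               for w, bad in zip(weights, misclassified)]
    scale = total / sum(boosted)  # keep the sum of weights constant
    return [w * scale for w in boosted], factor


# Option string of the "BDT" method booked above
opts = parse_options(
    "!H:!V:NTrees=100:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:"
    "AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:"
    "SeparationType=GiniIndex:nCuts=20")
beta = float(opts["AdaBoostBeta"])

# Toy sample: five unit-weight events, one of them misclassified
weights, factor = adaboost_step([1.0] * 5,
                                [True, False, False, False, False], beta)
print("boost factor:", factor)
print("new weights: ", weights)
```

With `AdaBoostBeta` below 1 the boost factor is damped, so each tree shifts the event weights less aggressively than in plain AdaBoost.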
            

Run TMVA

In [9]:
factory.TrainAllMethods()
factory.TestAllMethods()
factory.EvaluateAllMethods()

output.Close()

Factory                  : Train all methods
Factory                  : [dataset] : Create Transformation "I" with events from all classes.
                         : 
                         : Transformation, Variable selection : 
                         : Input : variable 'mtt_reco' <---> Output : variable 'mtt_reco'
                         : Input : variable 'lep1_pt' <---> Output : variable 'lep1_pt'
                         : Input : variable 'lep1_eta' <---> Output : variable 'lep1_eta'
                         : Input : variable 'lep1_phi' <---> Output : variable 'lep1_phi'
                         : Input : variable 'lep1_m' <---> Output : variable 'lep1_m'
                         : Input : variable 'lep2_pt' <---> Output : variable 'lep2_pt'
                         : Input : variable 'lep2_eta' <---> Output : variable 'lep2_eta'
                         : Input : variable 'lep2_phi' <---> Output : variable 'lep2_phi'
                         : Input : variable 'le

Error in <TDecompLU::InvertLU>: matrix is singular, 1 diag elements < tolerance of 2.2204e-16
Error in <TDecompLU::InvertLU>: matrix is singular, 1 diag elements < tolerance of 2.2204e-16
Error in <TDecompLU::InvertLU>: matrix is singular, 1 diag elements < tolerance of 2.2204e-16


Draw all canvases 

In [10]:
from ROOT import gROOT 
gROOT.GetListOfCanvases().Draw()