# TMVA_Tutorial_Regression_Tmva
TMVA regression example with the following objective:
 * Train a BDT with TMVA


Modified from [RegressionKeras.py](https://root.cern/doc/master/RegressionKeras_8py.html) and [TMVARegression.C](https://root.cern/doc/master/TMVARegression_8C.html)


**Author:** Lailin XU  
<i><small>This notebook tutorial was automatically generated with <a href= "https://github.com/root-project/root/blob/master/documentation/doxygen/converttonotebook.py">ROOTBOOK-izer</a> from the macro found in the ROOT repository  on Monday, April 26, 2021 at 03:48 PM.</small></i>

In [1]:
from ROOT import TMVA, TFile, TTree, TCut
from subprocess import call
from os.path import isfile
 

Welcome to JupyROOT 6.22/07


Setup TMVA

In [2]:
TMVA.Tools.Instance()
(TMVA.gConfig().GetVariablePlotting()).fMaxNumOfAllowedVariablesForScatterPlots = 5 
 
outfileName = 'TMVA_tutorial_reg_1.root'
output = TFile.Open(outfileName, 'RECREATE')
factory = TMVA.Factory('TMVARegression', output, '!V:!Silent:Color:!DrawProgressBar:Transformations=D,G:AnalysisType=Regression')
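
The option string passed to `TMVA.Factory` is a colon-separated list: `key=value` sets a value, a bare name switches a flag on, and a leading `!` switches it off. The sketch below splits the string used above into a dictionary to make that explicit. `parse_tmva_options` is a hypothetical helper written for this note, not part of ROOT, and it only mimics the syntax; TMVA's own parser may treat edge cases differently.

```python
def parse_tmva_options(option_string):
    """Split a TMVA-style option string into a dict (hypothetical helper, not a ROOT API)."""
    options = {}
    for token in option_string.split(":"):
        if not token:
            # Skip empty tokens, e.g. those produced by an accidental "::"
            continue
        if "=" in token:
            key, value = token.split("=", 1)
            options[key] = value
        elif token.startswith("!"):
            options[token[1:]] = False   # "!V" means: verbose off
        else:
            options[token] = True        # "Color" means: colour output on
    return options


factory_options = parse_tmva_options(
    "!V:!Silent:Color:!DrawProgressBar:Transformations=D,G:AnalysisType=Regression"
)
for key, value in factory_options.items():
    print(f"{key:20s} {value}")
```

Here `Transformations=D,G` asks the factory to also test the decorrelation (`D`) and Gaussianisation (`G`) transformations of the input variables, and `AnalysisType=Regression` selects regression instead of classification.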
 

Load data

In [3]:
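trfile = "SM_ttbar.root"  # local input file; it must already be in the working directory
# Carried over from RegressionKeras.py: fetches the stock TMVA example file, which is not read below
if not isfile('tmva_reg_example.root'):
    call(['curl', '-L', '-O', 'http://root.cern.ch/files/tmva_reg_example.root'])
 
data = TFile.Open(trfile)
# Stop here if the file is missing or unreadable instead of failing later on data.Get()
if not data or data.IsZombie():
    raise RuntimeError("Error! file not opened: " + trfile)
trname = "tree"
tree = data.Get(trname)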
 
dataloader = TMVA.DataLoader('dataset')
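# Use every branch as an input variable, except those containing 'mtt' (the target and related quantities)
for branch in tree.GetListOfBranches():
    name = branch.GetName()
    if 'mtt' not in name:
        dataloader.AddVariable(name)
dataloader.AddTarget('mtt_truth')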
 
dataloader.AddRegressionTree(tree, 1.0)
dataloader.PrepareTrainingAndTestTree(TCut(''), 'nTrain_Regression=10000:SplitMode=Random:NormMode=NumEvents:!V')
 

DataSetInfo              : [dataset] : Added class "Regression"
                         : Add Tree tree of type Regression with 62710 events
                         : Dataset[dataset] : Class index : 0  name : Regression
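
The variable selection and the train/test split in the cell above can be checked without ROOT. This is a minimal sketch, assuming a branch list: the first nine names are taken from the training log of this notebook (which is truncated, so the real tree may hold more), and `mtt_reco` is a hypothetical extra branch included only to exercise the filter.

```python
# Branch names: the first nine appear in the training log of this tutorial.
# 'mtt_reco' is a hypothetical branch, added only to show the filter at work.
branch_names = [
    "nlep",
    "lep1_pt", "lep1_eta", "lep1_phi", "lep1_m",
    "lep2_pt", "lep2_eta", "lep2_phi", "lep2_m",
    "mtt_reco", "mtt_truth",
]

# Same rule as the dataloader loop: skip every branch whose name contains 'mtt'
variables = [name for name in branch_names if "mtt" not in name]
target = "mtt_truth"

# Split sizes: nTrain_Regression=10000 is requested explicitly; the log of this
# run shows that the remaining events end up in the test sample.
n_total = 62710
n_train = 10000
n_test = n_total - n_train

print("input variables:", variables)
print("target:", target)
print("train / test events:", n_train, "/", n_test)
```

The resulting test count agrees with the 52710 testing events reported by `DataSetFactory` further down.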


Generate model

BDT

In [4]:
factory.BookMethod( dataloader,  TMVA.Types.kBDT, "BDT",
  "!H:!V:NTrees=100:MinNodeSize=1.0%:BoostType=AdaBoostR2:SeparationType=RegressionVariance:nCuts=20:PruneMethod=CostComplexity:PruneStrength=30" )
# Note: MaxDepth is set twice below (3, then 4); keep only the intended value
factory.BookMethod( dataloader,  TMVA.Types.kBDT, "BDTG",
  "!H:!V:NTrees=2000:BoostType=Grad:Shrinkage=0.1:UseBaggedBoost:BaggedSampleFraction=0.5:nCuts=20:MaxDepth=3:MaxDepth=4" )

<cppyy.gbl.TMVA.MethodBDT object at 0x12e4ada00>

Factory                  : Booking method: BDT
                         : 
                         : Building event vectors for type 2 Regression
                         : Dataset[dataset] :  create input formulas for tree tree
DataSetFactory           : [dataset] : Number of events in input trees
                         : 
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Regression -- training events            : 10000
                         : Regression -- testing events             : 52710
                         : Regression -- training and testing events: 62710
                         : 
DataSetInfo              : Correlation matrix (Regression):
                         : --------------------------------------------------------------------------------------------------------------------------------------------------------
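
To give a feeling for what `BoostType=Grad`, `NTrees` and `Shrinkage` control in the `BDTG` booking, here is a minimal pure-Python sketch of gradient boosting for regression. It is an illustration under simplifying assumptions, not TMVA's implementation: it uses a squared-error loss, depth-1 trees (stumps), a single made-up input variable, and no bagging. All function names are hypothetical.

```python
# Gradient boosting for 1-D regression with decision stumps (depth-1 trees).
# Illustration only: squared-error loss, pure Python, not TMVA's implementation.

def fit_stump(xs, residuals):
    """Return (cut, left_mean, right_mean) minimising the summed squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        cut = 0.5 * (lo + hi)  # candidate threshold between neighbouring points
        left = [r for x, r in zip(xs, residuals) if x < cut]
        right = [r for x, r in zip(xs, residuals) if x >= cut]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((r - mean_l) ** 2 for r in left)
               + sum((r - mean_r) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, cut, mean_l, mean_r)
    return best[1:]


def stump_value(stump, x):
    cut, mean_l, mean_r = stump
    return mean_l if x < cut else mean_r


def train(xs, ys, n_trees, shrinkage):
    """Fit n_trees stumps, each one to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)          # start from the mean of the target
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        # For squared error, the negative gradient is simply the residual
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrinkage scales down each tree's correction (the learning rate)
        preds = [p + shrinkage * stump_value(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, shrinkage


def predict(model, x):
    base, stumps, shrinkage = model
    return base + shrinkage * sum(stump_value(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Made-up toy data: y = x^2 on a regular grid
xs = [i / 10.0 for i in range(50)]
ys = [x * x for x in xs]

model = train(xs, ys, n_trees=200, shrinkage=0.1)
mse_constant = mse(ys, [model[0]] * len(xs))
mse_boosted = mse(ys, [predict(model, x) for x in xs])
print(f"training MSE, constant model: {mse_constant:.3f}")
print(f"training MSE, boosted model:  {mse_boosted:.3f}")
```

A smaller shrinkage makes each tree contribute less, so more trees are needed to reach the same training error; this is the trade-off between `Shrinkage=0.1` and `NTrees=2000` in the booking string.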

Neural network (MLP)

In [5]:
# MLP booking, left disabled in this tutorial:
# factory.BookMethod( dataloader,  TMVA.Types.kMLP, "MLP",
#   "!H:!V:VarTransform=Norm:NeuronType=tanh:NCycles=20000:HiddenLayers=N+20:TestRate=6:TrainingMethod=BFGS:Sampling=0.3:SamplingEpoch=0.8:ConvergenceImprove=1e-6:ConvergenceTests=15:!UseRegulator" )
 

Run TMVA

In [6]:
factory.TrainAllMethods()
factory.TestAllMethods()
factory.EvaluateAllMethods()

output.Close()

Factory                  : Train all methods
Factory                  : [dataset] : Create Transformation "D" with events from all classes.
                         : 
                         : Transformation, Variable selection : 
                         : Input : variable 'nlep' <---> Output : variable 'nlep'
                         : Input : variable 'lep1_pt' <---> Output : variable 'lep1_pt'
                         : Input : variable 'lep1_eta' <---> Output : variable 'lep1_eta'
                         : Input : variable 'lep1_phi' <---> Output : variable 'lep1_phi'
                         : Input : variable 'lep1_m' <---> Output : variable 'lep1_m'
                         : Input : variable 'lep2_pt' <---> Output : variable 'lep2_pt'
                         : Input : variable 'lep2_eta' <---> Output : variable 'lep2_eta'
                         : Input : variable 'lep2_phi' <---> Output : variable 'lep2_phi'
                         : Input : variable 'lep2_m' <-

Error in <TDecompLU::DecomposeLUCrout>: matrix is singular
Error in <TDecompLU::InvertLU>: matrix is singular, 0 diag elements < tolerance of 2.2204e-16
Error in <TH1F::Smooth>: Smooth only supported for histograms with >= 3 bins. Nbins = 2
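
The `matrix is singular` errors above appear while the transformations are being set up. One plausible cause, stated here as an assumption and not verified against this dataset, is that the decorrelation transformation (`D`) has to invert a matrix built from the covariance of the inputs, and that matrix is not invertible when an input is constant or is a linear combination of others. The sketch below uses made-up numbers to show the effect for two variables, one of them constant.

```python
def covariance_matrix(columns):
    """Sample covariance matrix of a list of equally long columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    return [
        [
            sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b)) / (n - 1)
            for col_b, mean_b in zip(columns, means)
        ]
        for col_a, mean_a in zip(columns, means)
    ]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Made-up values for illustration, not taken from SM_ttbar.root
pt = [25.0, 40.0, 31.5, 58.0, 47.2]
constant_mass = [0.0] * len(pt)   # a variable that never changes

cov = covariance_matrix([pt, constant_mass])
det = det2(cov)
print("covariance matrix:", cov)
print("determinant:", det)
```

A zero determinant means the matrix has no inverse. If this is what happens here, dropping constant or redundant input variables, or removing `D` from `Transformations`, would be the first things to try.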


Draw all canvases 

In [7]:
from ROOT import gROOT 
gROOT.GetListOfCanvases().Draw()