# Example for running regression training

See $ROOTSYS/tutorials/tmva/TMVARegression.C for more details.
First get the path to the classes. The environment variable HSCODE must be set in your shell setup file (.bashrc, .tcshrc, etc.).
Then load and compile the hsmva classes via ROOT's ACLiC (the trailing + on the macro name requests compilation).


In [11]:
hscode=gSystem->Getenv("HSCODE"); //or TString hscode ="/full path/not including /hsmva"
gROOT->LoadMacro(TString(hscode)+"/hsmva/LoadMacros.C+");
LoadMacros();

Info in <ACLiC>: unmodified script has already been compiled and loaded
Info in <ACLiC>: unmodified script has already been compiled and loaded
Info in <ACLiC>: unmodified script has already been compiled and loaded
Info in <ACLiC>: unmodified script has already been compiled and loaded
Info in <ACLiC>: unmodified script has already been compiled and loaded
Info in <ACLiC>: unmodified script has already been compiled and loaded
Info in <ACLiC>: unmodified script has already been compiled and loaded
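
If HSCODE is not set, `gSystem->Getenv` returns a null pointer and the macro path would likely reduce to `/hsmva/LoadMacros.C+`, which fails with a less obvious error. A minimal sketch of a guard in plain C++ is given here; `RequireEnv` and `LoadMacrosPath` are hypothetical helper names, not part of hsmva or ROOT.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Return the value of an environment variable, or throw if it is unset or empty.
std::string RequireEnv(const std::string& name) {
  const char* value = std::getenv(name.c_str());
  if (value == nullptr || *value == '\0') {
    throw std::runtime_error("environment variable " + name + " is not set");
  }
  return value;
}

// Build the macro path used in the cell above: <HSCODE>/hsmva/LoadMacros.C+
// (hypothetical helper; the trailing '+' asks ACLiC to compile the macro).
std::string LoadMacrosPath(const std::string& hscode) {
  return hscode + "/hsmva/LoadMacros.C+";
}
```

In a notebook cell this could be used as `gROOT->LoadMacro(LoadMacrosPath(RequireEnv("HSCODE")).c_str());`, so that a missing variable is reported by name before any compilation starts.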


In [2]:
unique_ptr<TFile> infile; //use unique pointer so deleted at end
TTree * tree=nullptr;
unique_ptr<TrainReg> train;

Get the TMVA tutorial data file and load the variables and target into a tree.

In [3]:
%%cpp -d
#include "GetTutorialFile.h"

In [4]:
infile.reset(GetRegressionFile());
tree=(TTree*)infile->Get("TreeR");

Info in <TFile::OpenFromCache>: using local cache copy of http://root.cern.ch/files/tmva_reg_example.root [./files/tmva_reg_example.root]


--- TMVARegression           : Using input file: ./files/tmva_reg_example.root


Prepare the regression training data. The tree should contain branches for the value to be fitted (the target) and
the variables it depends on.

In [5]:
train.reset(new TrainReg("TMVARegressionTut"));

train->SetOutDir("/work/dump"); //Set this to a sensible output directory
train->SetTarget("fvalue"); //function value to "fit"
train->IgnoreBranches("");//Any branches in tree not used must be flagged!
train->AddRegTree(tree); //Add tree with observables and targets
train->SetNTrainTest(5000,5000); //Number of train and test events
train->PrepareTrees(); //make test and training trees


HSMVA::TrainingInterface::LoadTreeVars ingoring branch fvalue
DataSetInfo              : [TMVARegressionTut] : Added class "Regression"
                         : Add Tree TreeR of type Regression with 10000 events
                         : Dataset[TMVARegressionTut] : Class index : 0  name : Regression
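
`SetNTrainTest(5000,5000)` asks for 5000 training and 5000 testing events out of the 10000 in the tree. How hsmva and TMVA select them internally is not shown in this notebook; the sketch below only illustrates the idea of a reproducible, non-overlapping random split. `SplitTrainTest` and its fixed seed are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <stdexcept>
#include <utility>
#include <vector>

using IndexList = std::vector<std::size_t>;

// Randomly pick nTrain training and nTest testing event indices out of nEvents,
// with no event used twice. The seed is fixed so the split is reproducible.
std::pair<IndexList, IndexList> SplitTrainTest(std::size_t nEvents,
                                               std::size_t nTrain,
                                               std::size_t nTest,
                                               unsigned seed = 100) {
  if (nTrain + nTest > nEvents) {
    throw std::invalid_argument("nTrain + nTest exceeds the number of events");
  }
  // Shuffle all event indices once, then cut the shuffled list in two.
  IndexList indices(nEvents);
  std::iota(indices.begin(), indices.end(), std::size_t{0});
  std::mt19937 rng(seed);
  std::shuffle(indices.begin(), indices.end(), rng);

  const auto trainEnd = indices.begin() + static_cast<std::ptrdiff_t>(nTrain);
  const auto testEnd = trainEnd + static_cast<std::ptrdiff_t>(nTest);
  return {IndexList(indices.begin(), trainEnd), IndexList(trainEnd, testEnd)};
}
```

Keeping the two samples disjoint is what makes the later test-sample RMS an honest estimate: events seen during training never contribute to it.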


Prepare regression training methods. 

In [6]:
 //Can Book methods either via standard TMVA::Factory interface...
train->BookMethod(TMVA::Types::kBDT, "BDT","!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20");

  //..or predefined methods (See HSMVA::MethodConfigure.h)
train->BookMethod(HS::MVA::Meths.MLP);


Factory                  : Booking method: BDT
                         : 
DataSetFactory           : [TMVARegressionTut] : Number of events in input trees
                         : 
                         : Number of training and testing events
                         : ---------------------------------------------------------------------------
                         : Regression -- training events            : 5000
                         : Regression -- testing events             : 5000
                         : Regression -- training and testing events: 10000
                         : 
DataSetInfo              : Correlation matrix (Regression):
                         : ------------------------
                         :             var1    var2
                         :    var1:  +1.000  +0.004
                         :    var2:  +0.004  +1.000
                         : ------------------------
DataSetFactory           : [TMVARegressionTut] :  
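
TMVA option strings are colon-separated tokens: `key=value` pairs plus bare flags, where a leading `!` switches a flag off. Long strings such as the BDT one booked above are easier to review when assembled from a list. A small sketch follows; `JoinOptions` is a hypothetical helper and not part of TMVA or hsmva.

```cpp
#include <string>
#include <vector>

// Join option tokens into a TMVA-style option string, e.g. "!H:!V:NTrees=100".
// Empty tokens are skipped so that optional settings can be left blank.
std::string JoinOptions(const std::vector<std::string>& options) {
  std::string joined;
  for (const auto& opt : options) {
    if (opt.empty()) continue;
    if (!joined.empty()) joined += ':';
    joined += opt;
  }
  return joined;
}
```

For regression, the TMVARegression.C tutorial books its BDT with `BoostType=AdaBoostR2` and `SeparationType=RegressionVariance` rather than the classification-style `AdaBoost` and `GiniIndex` used here. If the BDT ranks poorly in the evaluation, those options may be worth trying; check the exact values against your ROOT version.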
               

In [7]:
train->DoTraining();


Factory                  : Train all methods
Factory                  : [TMVARegressionTut] : Create Transformation "I" with events from all classes.
                         : 
                         : Transformation, Variable selection : 
                         : Input : variable 'var1' <---> Output : variable 'var1'
                         : Input : variable 'var2' <---> Output : variable 'var2'
TFHandler_Factory        : Variable        Mean        RMS   [        Min        Max ]
                         : -----------------------------------------------------------
                         :     var1:     2.5097     1.4575   [  0.0010317     4.9985 ]
                         :     var2:     2.4841     1.4381   [ 0.00071490     4.9997 ]
                         :   fvalue:     135.12     85.048   [     1.8147     394.84 ]
                         : -----------------------------------------------------------
                         : Ranking input variables (method unsp

                         : Elapsed time for training with 5000 events: 1.52 sec         
                         : Dataset[TMVARegressionTut] : Create results for training
                         : Dataset[TMVARegressionTut] : Evaluation of BDT on training sample


                         : Dataset[TMVARegressionTut] : Elapsed time for evaluation of 5000 events: 0.147 sec       
                         : Create variable histograms
                         : Create regression target histograms
                         : Create regression average deviation
                         : Results created
                         : Creating xml weight file: TMVARegressionTut/weights/TMVARegressionTut_BDT.weights.xml
                         : TMVARegressionTut/Training.root:/TMVARegressionTut/Method_BDT/BDT
Factory                  : Training finished
                         : 
Factory                  : Train method: MLP for Regression
                         : 
                         : 
                         : H e l p   f o r   M V A   m e t h o d   [ MLP ] :
                         : 
                         : --- Short description:
                         : 
                         : The MLP artificial neural ne

                         : Elapsed time for training with 5000 events: 11.4 sec         
                         : Dataset[TMVARegressionTut] : Create results for training
                         : Dataset[TMVARegressionTut] : Evaluation of MLP on training sample
                         : Dataset[TMVARegressionTut] : Elapsed time for evaluation of 5000 events: 0.00718 sec       
                         : Create variable histograms
                         : Create regression target histograms
                         : Create regression average deviation
                         : Results created
                         : Creating xml weight file: TMVARegressionTut/weights/TMVARegressionTut_MLP.weights.xml
                         : Write special histos to file: TMVARegressionTut/Training.root:/TMVARegressionTut/Method_MLP/MLP
Factory                  : Training finished
                         : 
Factory                  : === Destroy and recreate all methods via weig

                         : Dataset[TMVARegressionTut] : Elapsed time for evaluation of 5000 events: 0.123 sec       
                         : Create variable histograms
                         : Create regression target histograms
                         : Create regression average deviation
                         : Results created
Factory                  : Test method: MLP for Regression performance
                         : 
                         : Dataset[TMVARegressionTut] : Create results for testing
                         : Dataset[TMVARegressionTut] : Evaluation of MLP on testing sample
                         : Dataset[TMVARegressionTut] : Elapsed time for evaluation of 5000 events: 0.0101 sec       
                         : Create variable histograms
                         : Create regression target histograms
                         : Create regression average deviation
                         : Results created
Factory                  : Evaluate all m

                         : Elapsed time for evaluation of 5000 events: 0.119 sec       
                         : TestRegression (training)
                         : Calculate regression for all events


                         : Elapsed time for evaluation of 5000 events: 0.12 sec       
TFHandler_BDT            : Variable        Mean        RMS   [        Min        Max ]
                         : -----------------------------------------------------------
                         :     var1:     2.4802     1.4454   [ 0.00020069     5.0000 ]
                         :     var2:     2.4836     1.4437   [ 0.00091723     5.0000 ]
                         :   fvalue:     133.97     84.519   [     1.6186     391.85 ]
                         : -----------------------------------------------------------


                         : Evaluate regression method: MLP
                         : TestRegression (testing)
                         : Calculate regression for all events
                         : Elapsed time for evaluation of 5000 events: 0.00835 sec       
                         : TestRegression (training)
                         : Calculate regression for all events
                         : Elapsed time for evaluation of 5000 events: 0.00715 sec       
TFHandler_MLP            : Variable        Mean        RMS   [        Min        Max ]
                         : -----------------------------------------------------------
                         :     var1: -0.0078286    0.57846   [    -1.0003     1.0006 ]
                         :     var2: -0.0066317    0.57761   [   -0.99992     1.0001 ]
                         :   fvalue:   -0.32752    0.43009   [    -1.0010    0.98480 ]
                         : -----------------------------------------------------------
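
The TFHandler_MLP table shows inputs and target in roughly [-1, 1], whereas the TFHandler_BDT table shows raw values. This is consistent with a min-max normalisation that uses the training-sample ranges printed by TFHandler_Factory. That is an inference from the numbers, since the options of the predefined MLP are not shown in this notebook. A sketch of the mapping, with `NormaliseToUnitRange` a hypothetical name:

```cpp
#include <stdexcept>

// Map x linearly so that xmin -> -1 and xmax -> +1.
// Values outside [xmin, xmax] land slightly outside [-1, 1], as seen for
// test events that fall outside the training range.
double NormaliseToUnitRange(double x, double xmin, double xmax) {
  if (!(xmax > xmin)) {
    throw std::invalid_argument("xmax must be larger than xmin");
  }
  return 2.0 * (x - xmin) / (xmax - xmin) - 1.0;
}
```

With the training range of fvalue, [1.8147, 394.84], the mean 133.97 from the TFHandler_BDT table maps to about -0.3275, which matches the fvalue mean in the TFHandler_MLP table.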
TFHandle

                         : 
                         : Evaluation results ranked by smallest RMS on test sample:
                         : ("Bias" quotes the mean deviation of the regression from true target.
                         :  "MutInf" is the "Mutual Information" between regression and target.
                         :  Indicated by "_T" are the corresponding "truncated" quantities ob-
                         :  tained when removing events deviating more than 2sigma from average.)
                         : --------------------------------------------------------------------------------------------------
                         : --------------------------------------------------------------------------------------------------
                         : TMVARegressionTut    MLP            :  -0.0687    0.219     2.33     1.83  |  3.154  3.150
                         : TMVARegressionTut    BDT            :     30.6     37.5     76.2     68.5  |  1.102  1.033
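
The legend above defines the ranking quantities in words. The sketch below turns that wording into code: the deviation is the regression output minus the true target, the bias is the mean deviation, the RMS is taken here as the spread of the deviations about the bias, and the truncated values are recomputed after dropping events more than 2 RMS away from the bias. The exact RMS convention is an assumption, and `DeviationStats`, `ComputeStats` and `ComputeTruncatedStats` are hypothetical names rather than TMVA API.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct DeviationStats {
  double bias;    // mean of the deviations
  double rms;     // spread of the deviations about the bias
  std::size_t n;  // number of events used
};

// Bias and RMS of a list of deviations (regression output minus target).
DeviationStats ComputeStats(const std::vector<double>& dev) {
  if (dev.empty()) return {0.0, 0.0, 0};
  double sum = 0.0, sum2 = 0.0;
  for (double d : dev) {
    sum += d;
    sum2 += d * d;
  }
  const double n = static_cast<double>(dev.size());
  const double bias = sum / n;
  const double variance = std::max(sum2 / n - bias * bias, 0.0);
  return {bias, std::sqrt(variance), dev.size()};
}

// Same quantities after removing events further than nSigma * RMS from the bias.
DeviationStats ComputeTruncatedStats(const std::vector<double>& dev,
                                     double nSigma = 2.0) {
  const DeviationStats full = ComputeStats(dev);
  std::vector<double> kept;
  for (double d : dev) {
    if (std::fabs(d - full.bias) <= nSigma * full.rms) kept.push_back(d);
  }
  return ComputeStats(kept);
}
```

A large gap between the plain and truncated values points to a few badly predicted events, while similar values, as for both methods in the table, suggest the deviations are spread fairly evenly.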
             

In [8]:
%jsroot on

In [9]:
train->DrawTargetDeviations();

--- Opening root file /work/dump/TMVARegressionTut/Training.root in read mode
--- Plotting deviation for method: BDT
plotting logo
--- --------------------------------------------------------------------
--- If you want to save the image as eps, gif or png, please comment out 
--- the corresponding lines (line no. 239-241) in tmvaglob.C
--- --------------------------------------------------------------------
ERROR in TPostScript::Open: Cannot open file:TMVARegressionTut/plots/deviation_BDT_target_test_c0.eps
--- Plotting deviation for method: MLP
plotting logo
--- --------------------------------------------------------------------
--- If you want to save the image as eps, gif or png, please comment out 
--- the corresponding lines (line no. 239-241) in tmvaglob.C
--- --------------------------------------------------------------------
ERROR in TPostScript::Open: Cannot open file:TMVARegressionTut/plots/deviation_MLP_target_test_c0.eps


Error in <TPostScript::Text>: Cannot open temporary file: TMVARegressionTut/plots/deviation_BDT_target_test_c0.eps_tmp_21804

Error in <TASImage::WriteImage>: error writing file TMVARegressionTut/plots/deviation_BDT_target_test_c0.png


Created  2 canvases with names:  
--- canvas name :   canvas1
--- canvas name :   canvas2


Error in <TPostScript::Text>: Cannot open temporary file: TMVARegressionTut/plots/deviation_MLP_target_test_c0.eps_tmp_21804

Error in <TASImage::WriteImage>: error writing file TMVARegressionTut/plots/deviation_MLP_target_test_c0.png


In [10]:
train->DrawVariableDeviations();

--- Plotting deviation for method: BDT
plotting logo
--- --------------------------------------------------------------------
--- If you want to save the image as eps, gif or png, please comment out 
--- the corresponding lines (line no. 239-241) in tmvaglob.C
--- --------------------------------------------------------------------
ERROR in TPostScript::Open: Cannot open file:TMVARegressionTut/plots/deviation_BDT_vars_training_c0.eps
plotting logo
--- --------------------------------------------------------------------
--- If you want to save the image as eps, gif or png, please comment out 
--- the corresponding lines (line no. 239-241) in tmvaglob.C
--- --------------------------------------------------------------------
ERROR in TPostScript::Open: Cannot open file:TMVARegressionTut/plots/deviation_BDT_vars_training_c1.eps
--- Plotting deviation for method: MLP
plotting logo
--- --------------------------------------------------------------------
--- If you want to save the image as 

Error in <TPostScript::Text>: Cannot open temporary file: TMVARegressionTut/plots/deviation_BDT_vars_training_c0.eps_tmp_21804

Error in <TASImage::WriteImage>: error writing file TMVARegressionTut/plots/deviation_BDT_vars_training_c0.png
Error in <TPostScript::Text>: Cannot open temporary file: TMVARegressionTut/plots/deviation_BDT_vars_training_c1.eps_tmp_21804

Error in <TASImage::WriteImage>: error writing file TMVARegressionTut/plots/deviation_BDT_vars_training_c1.png
Error in <TPostScript::Text>: Cannot open temporary file: TMVARegressionTut/plots/deviation_MLP_vars_training_c0.eps_tmp_21804

Error in <TASImage::WriteImage>: error writing file TMVARegressionTut/plots/deviation_MLP_vars_training_c0.png
Error in <TPostScript::Text>: Cannot open temporary file: TMVARegressionTut/plots/deviation_MLP_vars_training_c1.eps_tmp_21804

Error in <TASImage::WriteImage>: error writing file TMVARegressionTut/plots/deviation_MLP_vars_training_c1.png
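
Both drawing cells report `Cannot open file` for paths under TMVARegressionTut/plots/. A likely cause, which the log does not state, is that this relative directory does not exist in the notebook's working directory, since the training output was written under /work/dump. A sketch of creating it beforehand with C++17 `std::filesystem` follows; `EnsurePlotDir` is a hypothetical helper, and a kernel built without C++17 would need `gSystem->mkdir(path, kTRUE)` instead.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Create <baseDir>/<datasetName>/plots (and any missing parents).
// Returns true if the directory exists afterwards.
bool EnsurePlotDir(const std::string& datasetName,
                   const std::string& baseDir = ".") {
  namespace fs = std::filesystem;
  const fs::path plotDir = fs::path(baseDir) / datasetName / "plots";
  std::error_code ec;  // avoid exceptions; report failure via the return value
  fs::create_directories(plotDir, ec);
  return !ec && fs::is_directory(plotDir);
}
```

Calling `EnsurePlotDir("TMVARegressionTut");` before `train->DrawTargetDeviations();` should let the eps and png files be written, provided the missing directory is indeed the cause. The canvases themselves are still created either way, as the log shows.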
