<div>
    <div style="float:left;">
        <img src="http://oproject.org/tiki-download_file.php?fileId=8&display&x=450&y=128" width="50%" />
    </div>
    <div style="float:left;">
        <img src="http://gfif.udea.edu.co/root/tmva/img/tmva_logo.gif" width="50%"/>
    </div>
</div>

# JsMVA
<hr style="border-top-width: 4px; border-top-color: #34609b;">


In [1]:
import ROOT
from ROOT import TFile, TMVA, TCut

Welcome to JupyROOT 6.07/07


## Import JsMVA and enable JS visualization

In [2]:
import sys, os
sys.path.append(os.path.expanduser("../src/python"))
import JsMVA

In [3]:
%jsmva on

# Dataset information

In [4]:
infname     = "files/tmva_class_example.root"
dataset     = "files/tmva_class_example"
treeNameSig = "TreeS"
treeNameBkg = "TreeB"
outfname    = "files/TMVA.root"
verbose     = True

## Declare Factory and DataLoader

In [5]:
outputFile = TFile( outfname, 'RECREATE' )

TMVA.Tools.Instance()

factory = TMVA.Factory( "TMVAClassification", outputFile, 
                            "!V:Color:DrawProgressBar:Transformations=I;D;P;G,D:AnalysisType=Classification" )

# Set verbosity
factory.SetVerbose( verbose )

loader = TMVA.DataLoader(dataset)

--- Factory                  : You are running ROOT Version: 6.07/07, Apr 1, 2016
--- Factory                  : 
--- Factory                  : _/_/_/_/_/ _|      _|  _|      _|    _|_|   
--- Factory                  :    _/      _|_|  _|_|  _|      _|  _|    _| 
--- Factory                  :   _/       _|  _|  _|  _|      _|  _|_|_|_| 
--- Factory                  :  _/        _|      _|    _|  _|    _|    _| 
--- Factory                  : _/         _|      _|      _|      _|    _| 
--- Factory                  : 
--- Factory                  : ___________TMVA Version 4.2.1, Feb 5, 2015
--- Factory                  : 


## Adding variables to DataLoader

In [6]:
loader.AddVariable( "myvar1 := var1+var2", 'F' )
loader.AddVariable( "myvar2 := var1-var2", "Expression 2", 'F' )
loader.AddVariable( "var3",                "Variable 3", 'F' )
loader.AddVariable( "var4",                "Variable 4", 'F' )

loader.AddSpectator( "spec1:=var1*2",  "Spectator 1",  'F' )
loader.AddSpectator( "spec2:=var1*3",  "Spectator 2",  'F' )

## Download the dataset from the CERN server if it is not available locally

In [7]:
# Note: AccessPathName returns a non-zero (true) value when the path is NOT accessible
if ROOT.gSystem.AccessPathName( "./"+infname ) != 0: 
    ROOT.gSystem.Exec( "cd files; wget https://root.cern.ch/" + infname)
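
The same check-and-download step can be sketched in plain Python, without `gSystem` or `wget`. The helper names `dataset_url` and `ensure_dataset` are hypothetical and not part of ROOT or JsMVA; the base URL is the one used in the cell above.

```python
import os
import urllib.request

BASE_URL = "https://root.cern.ch/"  # same server as in the cell above


def dataset_url(relative_path, base_url=BASE_URL):
    """Build the download URL for a file path relative to the server root."""
    return base_url.rstrip("/") + "/" + relative_path.lstrip("/")


def ensure_dataset(relative_path, base_url=BASE_URL):
    """Download the file only if it is missing; return True if a download happened."""
    if os.path.exists(relative_path):
        return False
    # Create the target directory (for example "files") if needed
    os.makedirs(os.path.dirname(relative_path) or ".", exist_ok=True)
    urllib.request.urlretrieve(dataset_url(relative_path, base_url), relative_path)
    return True


print(dataset_url("files/tmva_class_example.root"))
```

Calling `ensure_dataset(infname)` would then replace the shell command; it does nothing when the file is already present.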

## Setting up dataset from Trees

In [8]:
# Avoid shadowing Python's built-in input()
inputFile = TFile.Open( infname )

# Get the signal and background trees for training
signal      = inputFile.Get( treeNameSig )
background  = inputFile.Get( treeNameBkg )
    
# Global event weights (see below for setting event-wise weights)
signalWeight     = 1.0
backgroundWeight = 1.0

mycuts = TCut("")
mycutb = TCut("")

loader.AddSignalTree(signal, signalWeight)
loader.AddBackgroundTree(background, backgroundWeight)
loader.fSignalWeight = signalWeight
loader.fBackgroundWeight = backgroundWeight
loader.fTreeS = signal
loader.fTreeB = background
loader.PrepareTrainingAndTestTree(mycuts,
                                  mycutb,
                                  "nTrain_Signal=0:nTrain_Background=0:SplitMode=Random:NormMode=NumEvents:!V")

--- DataSetInfo              : Dataset[files/tmva_class_example] : Added class "Signal"	 with internal class number 0
--- files/tmva_class_example : Add Tree TreeS of type Signal with 6000 events
--- DataSetInfo              : Dataset[files/tmva_class_example] : Added class "Background"	 with internal class number 1
--- files/tmva_class_example : Add Tree TreeB of type Background with 6000 events
--- files/tmva_class_example : Preparing trees for training and testing...
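
With `nTrain_Signal=0`, `nTrain_Background=0` and `SplitMode=Random`, each class is expected to be shuffled and divided roughly in half between training and testing. The following is a minimal pure-Python sketch of that idea; `random_split` is a hypothetical helper, not a TMVA function, and the half-and-half convention is an assumption based on the TMVA Users Guide.

```python
import random


def random_split(events, n_train=0, seed=100):
    """Shuffle events and split them into (train, test).

    n_train=0 mimics the 'split in half' convention assumed here.
    """
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    if n_train == 0:
        n_train = len(shuffled) // 2
    return shuffled[:n_train], shuffled[n_train:]


signal_events = range(6000)  # stand-in for the 6000 entries of TreeS
train, test = random_split(signal_events)
print(len(train), len(test))
```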


## Visualizing input variables

In [9]:
loader.DrawInputVariable("myvar1")

--- DataSetFactory           : Dataset[files/tmva_class_example] : Splitmode is: "RANDOM" the mixmode is: "SAMEASSPLITMODE"
--- DataSetFactory           : Dataset[files/tmva_class_example] : Create training and testing trees -- looping over class "Signal" ...
--- DataSetFactory           : Dataset[files/tmva_class_example] : Weight expression for class 'Signal': ""
--- DataSetFactory           : Dataset[files/tmva_class_example] : Create training and testing trees -- looping over class "Background" ...
--- DataSetFactory           : Dataset[files/tmva_class_example] : Weight expression for class 'Background': ""
--- DataSetFactory           : Dataset[files/tmva_class_example] : Number of events in input trees (after possible flattening of arrays):
--- DataSetFactory           : Dataset[files/tmva_class_example] :     Signal          -- number of events       : 6000   / sum of weights: 6000 
--- DataSetFactory           : Dataset[files/tmva_class_example] :     Background      -- number

### We can also visualize transformations on input variables

In [10]:
loader.DrawInputVariable("myvar1", processTrfs="D")  # possible transformation strings: I;N;D;P;U;G,D

--- files/tmva_class_example : Dataset[files/tmva_class_example] : Create Transformation "D" with events from all classes.
--- Deco                     : Transformation, Variable selection : 
--- Deco                     : Input : variable 'myvar1' (index=0).   <---> Output : variable 'myvar1' (index=0).
--- Deco                     : Input : variable 'myvar2' (index=1).   <---> Output : variable 'myvar2' (index=1).
--- Deco                     : Input : variable 'var3' (index=2).   <---> Output : variable 'var3' (index=2).
--- Deco                     : Input : variable 'var4' (index=3).   <---> Output : variable 'var4' (index=3).
--- Deco                     : Preparing the Decorrelation transformation...
--- TFHandler_DataLoader     : -----------------------------------------------------------
--- TFHandler_DataLoader     : Variable        Mean        RMS   [        Min        Max ]
--- TFHandler_DataLoader     : -----------------------------------------------------------
--- TFHand
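
To make the effect of the decorrelation transformation concrete, here is a small two-variable sketch in plain Python. It is not TMVA's implementation: TMVA is understood to use the symmetric square root of the covariance matrix, while this sketch swaps in the Cholesky factor, which also yields an identity covariance matrix. All function names and the toy data are assumptions made for illustration.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def decorrelate(xs, ys):
    """Return (us, vs) whose covariance matrix is the identity.

    Uses the Cholesky factor L of the covariance matrix C = L L^T and
    applies L^-1 to each event (a stand-in for the symmetric square root).
    """
    l11 = math.sqrt(covariance(xs, xs))
    l21 = covariance(xs, ys) / l11
    l22 = math.sqrt(covariance(ys, ys) - l21 ** 2)
    us = [x / l11 for x in xs]
    vs = [(y - l21 * u) / l22 for y, u in zip(ys, us)]
    return us, vs


rng = random.Random(1)
var1 = [rng.gauss(0.0, 1.0) for _ in range(2000)]
var2 = [0.8 * a + 0.6 * rng.gauss(0.0, 1.0) for a in var1]  # correlated with var1
u, v = decorrelate(var1, var2)
print("covariance before:", covariance(var1, var2))
print("covariance after: ", covariance(u, v))
```

After the transformation the off-diagonal covariance vanishes up to floating-point error, and both variances equal one.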

## Correlation matrix of input variables

In [11]:
loader.DrawCorrelationMatrix("Signal")
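
The plotted matrix contains the linear correlation coefficients between the input variables. As a rough sketch of what is being computed, the Pearson coefficient can be written in a few lines of plain Python; `pearson` and `correlation_matrix` are hypothetical helpers, and the numbers are made-up toy values rather than entries from the dataset.

```python
import math


def pearson(xs, ys):
    """Linear (Pearson) correlation coefficient of two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """columns maps a variable name to its values; returns a nested dict."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


var1 = [1.0, 2.0, 3.0, 4.0]
var2 = [0.5, -1.0, 2.0, 1.5]
columns = {
    "myvar1": [a + b for a, b in zip(var1, var2)],  # var1+var2, as declared earlier
    "myvar2": [a - b for a, b in zip(var1, var2)],  # var1-var2
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, row)
```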

## Booking methods

In [12]:
factory.BookMethod( loader, TMVA.Types.kCuts, "Cuts",
                            "!H:!V:FitMethod=MC:EffSel:SampleSize=200000:VarProp=FSmart" )

factory.BookMethod(  loader,TMVA.Types.kSVM, "SVM", "Gamma=0.25:Tol=0.001:VarTransform=Norm" )

factory.BookMethod(  loader,TMVA.Types.kMLP, "MLP", 
                   "!H:!V:NeuronType=tanh:VarTransform=N:NCycles=600:HiddenLayers=N+5:TestRate=5:!UseRegulator" )

factory.BookMethod(  loader,TMVA.Types.kLD, "LD", "H:!V:VarTransform=None:CreateMVAPdfs:PDFInterpolMVAPdf=Spline2:NbinsMVAPdf=50:NsmoothMVAPdf=10" )

layoutString = "Layout=TANH|100,TANH|50,TANH|10,LINEAR"

training0 = "LearningRate=1e-1,Momentum=0.0,Repetitions=1,ConvergenceSteps=300,BatchSize=20,TestRepetitions=15,"
training0+= "WeightDecay=0.001,Regularization=NONE,DropConfig=0.0+0.5+0.5+0.5,DropRepetitions=1,Multithreading=True"
training1 = "LearningRate=1e-2,Momentum=0.5,Repetitions=1,ConvergenceSteps=300,BatchSize=30,TestRepetitions=7,"
training1+= "WeightDecay=0.001,Regularization=L2,Multithreading=True,DropConfig=0.0+0.1+0.1+0.1,DropRepetitions=1"
training2 = "LearningRate=1e-2,Momentum=0.3,Repetitions=1,ConvergenceSteps=300,BatchSize=40,TestRepetitions=7,"
training2+= "WeightDecay=0.0001,Regularization=L2,Multithreading=True"
training3 = "LearningRate=1e-3,Momentum=0.1,Repetitions=1,ConvergenceSteps=200,BatchSize=70,TestRepetitions=7,"
training3+= "WeightDecay=0.0001,Regularization=NONE,Multithreading=True"

trainingStrategyString = "TrainingStrategy="
trainingStrategyString += training0 + "|" + training1 + "|" + training2 + "|" + training3

nnOptions = "!H:V:VarTransform=Normalize:ErrorStrategy=CROSSENTROPY"

nnOptions += ":" 
nnOptions += layoutString
nnOptions += ":"
nnOptions += trainingStrategyString

factory.BookMethod(loader, TMVA.Types.kDNN, "DNN", nnOptions )

factory.BookMethod( loader, TMVA.Types.kLikelihood, "Likelihood",
    "H:!V:!TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50" )

factory.BookMethod( loader, TMVA.Types.kBDT, "BDT",
    "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20" )

<ROOT.TMVA::MethodBDT object ("BDT") at 0x6fa9ec0>

--- Factory                  : Booking method: Cuts DataSet Name: files/tmva_class_example
--- Cuts                     : Use optimization method: "Monte Carlo"
--- Cuts                     : Use efficiency computation method: "Event Selection"
--- Cuts                     : Use "FSmart" cuts for variable: 'myvar1'
--- Cuts                     : Use "FSmart" cuts for variable: 'myvar2'
--- Cuts                     : Use "FSmart" cuts for variable: 'var3'
--- Cuts                     : Use "FSmart" cuts for variable: 'var4'
--- Factory                  : Booking method: SVM DataSet Name: files/tmva_class_example
--- SVM                      : Dataset[files/tmva_class_example] : Create Transformation "Norm" with events from all classes.
--- Norm                     : Transformation, Variable selection : 
--- Norm                     : Input : variable 'myvar1' (index=0).   <---> Output : variable 'myvar1' (index=0).
--- Norm                     : Input : v
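
The option strings passed to `BookMethod` follow a simple pattern: settings are separated by `:`, boolean flags appear bare or negated with `!`, and the DNN training strategy joins settings with `,` and phases with `|`. Assembling them by hand is error-prone, so one possible approach is a pair of small helpers. Both `tmva_options` and `dnn_training_strategy` are hypothetical conveniences, not part of TMVA or JsMVA.

```python
def tmva_options(**options):
    """Join keyword arguments into a colon-separated option string.

    True becomes a bare flag ("H"), False a negated flag ("!H"),
    anything else becomes "Key=value".
    """
    parts = []
    for key, value in options.items():
        if value is True:
            parts.append(key)
        elif value is False:
            parts.append("!" + key)
        else:
            parts.append("{}={}".format(key, value))
    return ":".join(parts)


def dnn_training_strategy(phases):
    """phases is a list of dicts; settings are joined by ',', phases by '|'."""
    return "TrainingStrategy=" + "|".join(
        ",".join("{}={}".format(key, value) for key, value in phase.items())
        for phase in phases
    )


bdt_options = tmva_options(H=False, V=False, NTrees=850, MaxDepth=3, BoostType="AdaBoost")
print(bdt_options)  # !H:!V:NTrees=850:MaxDepth=3:BoostType=AdaBoost

strategy = dnn_training_strategy([
    {"LearningRate": "1e-1", "BatchSize": 20},
    {"LearningRate": "1e-2", "BatchSize": 30},
])
print(strategy)
```

The resulting strings can be passed as the last argument of `factory.BookMethod` in the same way as the literals above.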

# Train Methods

In [13]:
factory.TrainAllMethods()

--- Cuts                     : Dataset[files/tmva_class_example] : Begin training
--- FitterBase               : <MCFitter> Sampling, please be patient ...
--- FitterBase               : Elapsed time: 8.29 sec
--- Cuts                     : ------------------------------------------
--- Cuts                     : Cut values for requested signal efficiency: 0.1
--- Cuts                     : Corresponding background efficiency       : 0.0276667
--- Cuts                     : Transformation applied to input variables : None
--- Cuts                     : ------------------------------------------
--- Cuts                     : Cut[ 0]:   -8.07609 < myvar1 <=      1e+30
--- Cuts                     : Cut[ 1]:     -1e+30 < myvar2 <=   -1.39928
--- Cuts                     : Cut[ 2]:   -4.38911 <   var3 <=      1e+30
--- Cuts                     : Cut[ 3]:  -0.326104 <   var4 <=      1e+30
--- Cuts                     : ---------------------------------
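
The log above lists one rectangular window per variable for a requested signal efficiency of 0.1. A minimal sketch of how such windows select events is shown below; the cut values are copied from the log, while `passes_cuts` and the example event are made up for illustration.

```python
# Cut window per variable, copied from the log above: low < value <= high
CUTS = {
    "myvar1": (-8.07609, 1e30),
    "myvar2": (-1e30, -1.39928),
    "var3": (-4.38911, 1e30),
    "var4": (-0.326104, 1e30),
}


def passes_cuts(event, cuts=CUTS):
    """Return True if every variable lies inside its cut window."""
    return all(low < event[name] <= high for name, (low, high) in cuts.items())


# Made-up event values, not taken from the dataset
event = {"myvar1": 0.3, "myvar2": -2.0, "var3": 1.1, "var4": 0.4}
print(passes_cuts(event))
```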


## Testing the methods

In [None]:
factory.TestAllMethods()

## Evaluate the methods

In [None]:
factory.EvaluateAllMethods()

## Classifier Output Distributions

In [None]:
factory.DrawOutputDistribution(dataset, "MLP")

## Classifier Probability Distributions

In [None]:
factory.DrawProbabilityDistribution(dataset, "LD")

## ROC curve

In [None]:
factory.DrawROCCurve(dataset)
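
For readers who want to see what a ROC curve is made of, the sketch below computes signal efficiency versus background rejection, plus the area under the curve, from two lists of classifier scores. It is plain Python with made-up scores; `roc_points` and `roc_auc` are hypothetical helpers and do not reflect how TMVA computes the curve internally.

```python
def roc_points(signal_scores, background_scores):
    """Return (signal efficiency, background rejection) pairs, one per threshold."""
    thresholds = sorted(set(signal_scores) | set(background_scores))
    points = []
    for t in thresholds:
        sig_eff = sum(s >= t for s in signal_scores) / len(signal_scores)
        bkg_rej = sum(b < t for b in background_scores) / len(background_scores)
        points.append((sig_eff, bkg_rej))
    return points


def roc_auc(signal_scores, background_scores):
    """Probability that a random signal event outscores a random background event.

    Ties count as one half.
    """
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(signal_scores) * len(background_scores))


signal_scores = [0.9, 0.8, 0.4]       # made-up classifier outputs
background_scores = [0.1, 0.3, 0.5]
print(roc_points(signal_scores, background_scores))
print(roc_auc(signal_scores, background_scores))
```

An area of 1.0 means perfect separation, while 0.5 corresponds to random guessing.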

## Classifier Cut Efficiencies

In [None]:
factory.DrawCutEfficiencies(dataset, "MLP")  # must name a booked method; "KNN" was never booked above
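
A cut-efficiency plot shows how signal efficiency, background efficiency and derived quantities change as the cut on the classifier output moves. A small sketch of those quantities for a single cut value is given below; `cut_efficiencies` is a hypothetical helper, the scores are made up, and the significance definition S/sqrt(S+B) is one common choice rather than a statement about the plot's exact formula.

```python
import math


def cut_efficiencies(signal_scores, background_scores, cut):
    """Efficiencies, purity and significance for events with score > cut."""
    n_sig = sum(s > cut for s in signal_scores)
    n_bkg = sum(b > cut for b in background_scores)
    total = n_sig + n_bkg
    return {
        "signal_efficiency": n_sig / len(signal_scores),
        "background_efficiency": n_bkg / len(background_scores),
        "purity": n_sig / total if total else 0.0,
        "significance": n_sig / math.sqrt(total) if total else 0.0,  # S/sqrt(S+B)
    }


signal_scores = [0.9, 0.8, 0.4, 0.7]      # made-up classifier outputs
background_scores = [0.1, 0.3, 0.5, 0.6]
result = cut_efficiencies(signal_scores, background_scores, cut=0.55)
print(result)
```

Scanning `cut` over a range of values and plotting each entry would give curves analogous to the ones drawn by the cell above.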

## Draw Neural Network

* Hover over a node or weight to focus on it
* Zooming and click-and-drag panning are supported
* Double-click to reset the view

In [None]:
factory.DrawNeuralNetwork(dataset, "MLP")
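
The MLP was booked with `HiddenLayers=N+5`. Assuming, as described in the TMVA documentation, that `N` stands for the number of input variables, the four variables declared earlier give one hidden layer of nine neurons. The sketch below expands such a specification; `hidden_layer_sizes` is a hypothetical helper that handles only the simple `N`, `N+k`, `N-k` and plain-integer forms.

```python
def hidden_layer_sizes(spec, n_inputs):
    """Expand a spec such as "N+5" or "N,N-1" into a list of layer sizes."""
    sizes = []
    for layer in spec.split(","):
        layer = layer.strip()
        if layer.startswith("N"):
            offset = layer[1:]  # "", "+5" or "-1"
            sizes.append(n_inputs + (int(offset) if offset else 0))
        else:
            sizes.append(int(layer))
    return sizes


# Four input variables: myvar1, myvar2, var3, var4
print(hidden_layer_sizes("N+5", 4))
```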

## Draw Deep Neural Network

In [None]:
factory.DrawNeuralNetwork(dataset, "DNN")

## Draw Decision Tree

* Hover over a node or weight to show the decision path
* Zooming and click-and-drag panning are supported
* Double-click to reset the view
* Click on a node to:
    * hide its subtree (a node whose children are hidden gets a green border)
    * rescale the view: bigger nodes, bigger text
    * show the subtree again with a second click

In [None]:
factory.DrawDecisionTree(dataset, "BDT")

## Close the factory's output file

In [None]:
outputFile.Close()