Since we're doing a dijet analysis, we'll want to use TLorentzVectors to do things like computing the invariant mass of a two-jet system. But TLorentzVectors are notoriously slow in PyROOT. Even if they weren't, looping over big trees event by event is something you should really never do in PyROOT. Pretty much everything besides those CPU-intensive tasks is nicer in PyROOT, though :-P
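
To make "invariant mass of a two-jet system" concrete, here is a minimal pure-Python sketch of the arithmetic that a four-vector class performs for you. The function names `four_vector` and `dijet_mass` are made up for this illustration and are not part of the HATS sample code; jets are assumed to be given as `(pt, eta, phi, mass)` tuples in consistent units (e.g. GeV).

```python
import math


def four_vector(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to Cartesian components (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def dijet_mass(jet1, jet2):
    """Invariant mass of the sum of two jets, each a (pt, eta, phi, mass) tuple."""
    e1, px1, py1, pz1 = four_vector(*jet1)
    e2, px2, py2, pz2 = four_vector(*jet2)
    e, px, py, pz = e1 + e2, px1 + px2, py1 + py2, pz1 + pz2
    m2 = e * e - px * px - py * py - pz * pz
    # Rounding can push m2 slightly below zero; this sketch clamps it to zero.
    return math.sqrt(max(m2, 0.0))


# Two massless, back-to-back jets with pt = 100 at eta = 0
print(dijet_mass((100.0, 0.0, 0.0, 0.0), (100.0, 0.0, math.pi, 0.0)))
```

Doing this per event in an interpreted loop over millions of entries is exactly the kind of work that is better pushed into compiled C++.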

Please take a look at the minimal changes made to `hatsTrees.C` and `hatsTrees.h`, which you can find in the `sample_code` directory. A good philosophy when using `TTree::MakeClass()` is to change as little of the generated code as possible. Please read the diff below. It contains useful tips on e.g. setting up the class to take arguments. Not counting the comments, only about 30 lines of code are added, but they're sufficient for all the heavy lifting in the calculation of complicated physical variables.

In [8]:
!diff --side-by-side hatsTrees.C sample_code/hatsTrees.C

#define hatsTrees_cxx						#define hatsTrees_cxx
#include "hatsTrees.h"						#include "hatsTrees.h"
#include <TH2.h>					      |	#include <TFile.h>
#include <TStyle.h>					      |	#include <TTree.h>
#include <TCanvas.h>					      |	#include <TLorentzVector.h>
							      >	#include <iostream>

void hatsTrees::Loop()					      |	/**     HATS comment
							      >	 * First we modify the Loop() method such that we can pass it
							      >	 * Also note that above, we added the includes want for basic
							      >	 * including TLorentzVectors, which are very slow in pyROOT
							      >	 */
							      >
							      >	void hatsTrees::Loop(std::string outFileName)
{								{
//   In a ROOT session, you can do:				//   In a ROOT session, you can do:
//      root> .L hatsTrees.C					//      root> .L hatsTrees.C
//      root> hatsTrees t					//      root> hatsTrees t
//      root> t.GetEntry(12); // Fill t data members with ent	//      root> t.GetEntry(12); // Fill t 

In [7]:
!diff hatsTrees.h sample_code/hatsTrees.h

414c414,417
<    virtual void     Loop();
---
>    /**    HATS comment
>     * We have to modify the Loop() method declaration here too  
>     */
>    virtual void     Loop(std::string outFileName);


Now that we've prepared our C++ class to do the heavy lifting, we will write a Python script that loads it and uses it to process our big datasets, while leveraging Python for the things that are annoying in C++. We'll design it to be suitable for use in batch submissions. Please follow along by looking at `sample_code/runHatsTrees.py`.
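
As a rough orientation before opening the real script, the sketch below shows one way such a batch-friendly driver could be laid out. It is not the contents of `runHatsTrees.py`: the command-line options (`--input`, `--output`, `--tree`, `--macro`) and the function names are assumptions made for this illustration. The only things taken from above are the class name `hatsTrees` and the modified `Loop(std::string outFileName)` signature.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options (all option names here are hypothetical)."""
    parser = argparse.ArgumentParser(description="Run hatsTrees::Loop over input files")
    parser.add_argument("-i", "--input", nargs="+", required=True,
                        help="one or more input ROOT files")
    parser.add_argument("-o", "--output", required=True,
                        help="output file name passed to Loop()")
    parser.add_argument("-t", "--tree", required=True,
                        help="name of the TTree inside the input files")
    parser.add_argument("--macro", default="hatsTrees.C",
                        help="path to the MakeClass source file to compile")
    return parser.parse_args(argv)


def run(args):
    """Compile the C++ class, chain the inputs, and call Loop()."""
    import ROOT  # imported here so option parsing works without ROOT set up

    ROOT.gROOT.SetBatch(True)
    # The trailing '+' asks ROOT to compile the macro with ACLiC.
    ROOT.gROOT.ProcessLine(".L %s+" % args.macro)

    chain = ROOT.TChain(args.tree)
    for file_name in args.input:
        chain.Add(file_name)

    # MakeClass-generated constructors accept a TTree pointer; a TChain is a TTree.
    analyzer = ROOT.hatsTrees(chain)
    analyzer.Loop(args.output)
```

A batch job would then call something like `run(parse_args())` with a different `--input` list and `--output` name per job, which is what makes a script-with-arguments layout convenient for submissions.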