
DON’T PANIC

A Hitchhiker’s guide to High Energy Physics

Preface

Wow! I wrote this approximately eight years ago. The following is… a lot. I wrote it during the evenings to unwind, and I was reading The Hitchhiker’s Guide to the Galaxy at the time, so I took every opportunity to make esoteric references to Douglas Adams. I think each individual section is still pretty readable, and a lot of the information is evergreen. The link density and trivia is insane. Distance has made me realize that it’s fairly overwhelming. Now that I’ve been outside HEP for a few years (almost five!) I understand just how intense getting into research is. It doesn’t look like it’s gotten better: ROOT transitioned to cmake, most projects are on git, and a decent fraction appear to have doubled down on docker.

I’ve updated some of the prescriptions to future-proof them and fixed some link-rot. If you’re a fan say so! I occasionally stalk the internet and find references to it so I know there are some readers.

If you’re getting into this business today, emphasize learning transferable skills. Learn how to write maintainable C++, learn docker, learn how to use Jupyter notebooks, only use ROOT if you have to (the rest of Data Science uses numpy, pandas, and python).

If I were to write this again today, it would sound very different, but I’m still very proud of it.

Introduction

Welcome to High Energy Physics. Like the Galaxy, HEP is a wonderful and mysterious place filled with amazing things. The intent of this guide is to provide a jump-start to HEP research. Its purpose is to be entertaining and uninformative. The information in this guide was hard won; The Editor does not intend it to be any different for our dear reader. This guide is not intended to make the reader’s life easier. It is not intended to teach anything. It is intended to provide the dear reader with a glossary of concepts for further research.

If the reader finds this guide entertaining, but uninformative, then please send a bottle of Old Janx Spirit to The Editor. The Guide is not intended to be read linearly. Jump to the section in need, and if that section contains links to other sections, then please see above regarding edits. Prepare to be frustrated.

Navigating the guide

A note on key notation: Key sequences follow the “emacs” style of notating keys. Therefore “C-c” means hold down the Control key while pressing the c key. “M-a” means hold down the “Meta” key and press the a key. On most keyboards the Meta key is the key labeled “Alt”. There is also the “Super” key; on most PC keyboards this key has the “Windows icon” on it. The function keys 1-12 are notated as F1 etc. The escape key is usually notated ESC, and the enter key is notated RET for Return. A sequence is denoted by a series of keys joined by spaces or dashes. A dash means hold the two keys at the same time; a space means release the previous keys and continue with the next instruction. Some examples:

  • “C-c C-a”: hold Control and press c, release, then hold Control and press a
  • “C-c a”: hold Control and press c, release, then press a
  • “C-M-f”: hold Control and Meta together and press f

The online version of the guide includes a Table of Contents; simply mouse over it and a full list of topics will pop up. You can click any topic to jump to that section. If you read this in org-mode, the file will open folded. It will look like:

#+TITLE: DON'T PANIC
#+AUTHOR: David Bjergaard
#+EMAIL:  hhg-svc@ccbar.us
#+OPTIONS:   H:5 num:nil toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
#+OPTIONS:   TeX:t LaTeX:t skip:nil d:nil todo:t pri:nil tags:not-in-toc
#+STYLE:    <link rel="stylesheet" type="text/css" href="./css/style.css" />
#+LaTeX_CLASS: article

 * A Hitchhiker's guide to High Energy Physics...

Place your cursor at the beginning of * A Hitchhiker's... and hit TAB; this will expand the topics, allowing you to see an overview of the document. Move to whichever topic you’re interested in and hit TAB again to expand that section.

If you are reading this in Vim without Vim-OrgMode, then you will have no folding and the whole document will be expanded. Jump to various headlines by searching for ”**” (two levels deep), ”***” (three levels deep), ”****” (four levels deep), etc.

If you are reading this as a PDF, there is no Table of Contents. Just scroll to the section you are interested in.

NOTE If you are reading this on GitHub, a link to another section will probably be broken due to a bug in how GitHub parses org-flavored markdown.

Disclaimer

All bottles of donated Old Janx Spirit are redirected to our lawyers, who are out enjoying them now. They insisted that we include this disclaimer:

The Hitchhiker’s Guide to High Energy Physics is a set of documents and software herein referred to as “The Guide”

The Guide is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.

The Guide is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this software. If not, see http://www.gnu.org/licenses/.

Also, note that the content of this guide is satirical in nature, and is intended to be sarcastic. If you have a weak heart or are easily offended, it may be better to seek out other sources of information.

Obtaining a copy and supporting material

All of the sources for the guide are hosted on GitHub.

Here are some quick-links:

To get a local copy:

git clone https://github.com/dmb2/hitchhikers-guide-to-hep.git

Then you’ll have a copy of the org file, as well as the compiled macros and compiled programs.

For Windows Hitchhikers

Everyone should read For Linux Hitchhikers to understand what functionality they’ll need (especially when working with or on remote machines).

While it is possible to practice HEP from the comfort of Bill Gates’ brainchild, it is not recommended by The Editor. (He doesn’t run Windows anyway; daylight scares him.) If you insist on using Windows, the following is a list of useful software.

Software you will need

  • For Linux Hitchhikers
  • PuTTY (ssh client for Windows): Secure SHell is the standard way of accessing *nix machines remotely. PuTTY is the Windows client for this.
  • ROOT: The industry standard for High Energy Physics analysis. Beware: this program uses an Infinite Improbability Drive to perform analysis.
  • Xming, an X11 server for Windows: This allows you to tunnel X11 applications (such as ROOT’s histogram interface) to your Windows desktop. This way your data (and ROOT) can live on a remote machine, but you can still interact with them as if they were on your desktop. (You need a fast internet connection to do this.) Xming comes in two flavors: the “web” version, which is only available to people who donate to the project, and a slightly less up-to-date public version available for free. Choose the public version. See Getting Started and Troubleshooting for tips on getting set up.
  • WSL: Windows 10 ships the Windows Subsystem for Linux, which provides a unix-like environment inside Windows.
  • Cygwin: Adds a substrate of the GNU system to Windows (in addition to an X11 server); you can use this to create a more Unix-like environment to work from.
  • VirtualBox: Allows you to boot operating systems within operating systems (useful if you don’t want to dual boot Ubuntu); see For Linux Hitchhikers after you’ve set up a working distro.

See here for a nice picture-book tutorial on installing Ubuntu through VirtualBox on Windows.

For Linux Hitchhikers

Software you will need

  • Screen: This lets you pick up where you left off if your ssh connection drops, here is a good conceptual introduction. If you use screen on lxplus, you’ll have to re-initialize your kerberos tokens after logging in with kinit -5, otherwise you won’t have read access to your files.
  • tmux: Offers the same features as screen.
  • ROOT: The industry standard for High Energy Physics analysis. Beware: this program uses an Infinite Improbability Drive to perform analysis.
  • BASH: The command shell of choice for ATLAS Physicists. You may think you could use ZSH, but it’s better just to stick with what everyone else uses. CMS Physicists prefer TCSH for some weird reason.
  • Editor: Choose your religion wisely; it will eventually permeate your being and change the way you approach life in general.

The Terminal

You will, regardless of which operating system you use, be typing commands into a terminal. It’s inevitable, powerful, and intimidating to new users. HEP hitchhikers should feel at home. Proficiency with the command line is essential to being a functioning HEP researcher.

The terminal is like the Galaxy Hitchhiker’s utility towel. Every hitchhiker needs a terminal, and each hitchhiker customizes their towel to their needs.

If you’ve never touched a terminal before, and don’t know what the command line is, there are two options:

  1. The great pedagogical introduction (12 page PDF) by Carl Mason
  2. “An Introduction to Unix” (a comprehensive, modern take on the unix ecosystem) by Oliver Elliott

You should read both: Carl Mason’s tutorial should be read “cover-to-cover”, whereas Oliver’s is written very much in the spirit of this guide, so bookmark it and refer to it after you’ve worked through Carl Mason’s tutorial.

Line Editing

Most modern operating systems’ default shell is bash. Be aware that bash’s line editor is set up to respect emacs keybindings: “C-a” moves to the beginning of the line, “C-e” to the end, etc. You can change to vi bindings by typing:

set -o vi

If you forget which mode you’re in, check it by typing:

set -o

If you want these changes to be permanent, add them to your .bashrc. If these commands give you an error, type:

echo $SHELL

And see what it says (/bin/bash if it’s bash; it may be /bin/zsh or /bin/tcsh). If it is not bash, then you need to google information for the line editor of whichever shell you are using.

Managing Jobs

Sometimes it will be convenient to spawn a process and continue working in the current shell. Usually this is accomplished by redirecting the stdout and stderr to a file:

myLongRunningCommand foo bar baz 42 &> theProcess.log &

When you launch the command, you’ll see something like:

[1] 19509

The number 19509 is the PID of the process. If you have multiple jobs going they can be summarized by typing jobs

[1]   Running                 myLongRunningCommand foo bar baz 42 &> theProcess.log &
[2]-  Running                 myLongRunningCommand foo bar baz 41 &> theProcess.log &
[3]+  Running                 myLongRunningCommand foo bar baz 40 &> theProcess.log &

Occasionally you’ll realize that you don’t want the jobs to run anymore, so to kill them:

kill %2

where %2 is the job number you are referencing. You’ll see something like:

[2]-  Terminated              myLongRunningCommand foo bar baz 41 &> theProcess.log &

Configuring SSH

Many of these tips are lifted from here. Put this in your ~/.ssh/config file:

ControlMaster auto
ControlPath /tmp/ssh_mux_%h_%p_%r
ControlPersist yes
ServerAliveInterval 30
ServerAliveCountMax 1

It is possible to setup ssh shorthand to route you to remote machines. The syntax (in ~/.ssh/config) is:

Host shortname
  #expands to shortname.remote.location.edu
  HostName %h.remote.location.edu 
  User username
  ForwardX11 yes #equivalent to passing -X to ssh
  IdentityFile ~/.ssh/id_rsa #path to your private key

SSH Keys

Follow this guide, stop at step 3. Now, when you need to start using a new machine:

ssh-copy-id user@remote.machine.name

Then enter your password. Now, when you type ssh user@remote.machine.name you will authenticate yourself with your newly minted RSA key, and you won’t have to enter your password. The downside is that you’ll have to enter your key’s passphrase to unlock it. See below for a way to unlock it once per session.

NOTE While it is cryptographically more secure to authenticate yourself with ssh keys, if your machine is compromised (ie stolen or hacked) your ssh keys can provide the attacker with easier access to all the machines you had access to. This means you should:

  1. Use a strong passphrase, not a password. You need to maximize the number of bits of entropy in your key in order to make it difficult to crack should the keys fall into enemy hands.
  2. Inform the admins of any machines you had access to if your machine is compromised
  3. Encrypt your ssh keys (and other sensitive information) in a private directory that only you can access
  4. NEVER EVER store your ssh keys on a third party site (like Dropbox or similar services)

SSH Agent

If you have ssh-agent running (through the gnome-keyring service on Ubuntu, or directly in your .xinitrc through ssh-agent blah) you can type ssh-add when you log in and it will add your ssh key to the keyring, then you can ssh to any machine that you have copied your key to without entering the password!

NOTE Once you’ve added your key to the ssh-agent, anyone can sit down at your keyboard and log into a remote machine as you! This means if you step away from your computer (even for a moment) you should lock the screen or log out.

Version Control Systems

The major version control system in HEP is Git. Git is a set of tools and utilities that allow collaboration on large pieces of software.

Git also provides programmers with a convenient “paper trail” through the course of developing a piece of software. It allows them to revert the source code they are working on to any state that they’ve previously checked in.

Git was written by Linus Torvalds, the hacker behind Linux, to manage the Linux kernel, a massive piece of software. Git’s model for managing source code is slightly different from older, centralized systems: you maintain the entire repository in your local copy. This makes committing, managing, and branching very fast. It also means you can work with all of the advantages of a version control system without internet access. Usually there is also a copy of the repository on a remote server; Git handles syncing the two repositories when instructed. This can lead to confusion if you’ve used other versioning systems, but it shouldn’t be a problem if you have no expectations.

Some good Git tutorials:

Intermediate or advanced topics:

*rc Files

*rc files are special files that a program reads (and sometimes executes) every time it starts. They almost exclusively live in the user’s home directory, and they override or supplement system-wide defaults. Sometimes they have a special syntax for setting options, sometimes they are written in a scripting language. The most relevant rc files for a new hitchhiker are:

  • .rootrc: the file that sets ROOT options
  • .bashrc: executed every time you open a terminal in bash
  • .tcshrc: as above but for tcsh (hopefully you aren’t using this!)
  • .zshrc: as above for the zsh shell
  • .vimrc: configuration options for the venerable vim editor
  • .screenrc: options for GNU screen
  • .emacs: written in emacs lisp, executed on startup, and breaks the rc naming scheme. Advanced emacs users have multi-thousand line rc files

There are other files, if you want to know about them you can do:

ls .*rc

And google the ones that look interesting. Alternatively you can look at the system defaults:

ls /etc/*rc

Sometimes it’s useful to copy the system file to your home directory and then edit it there in order to add your customizations. Some programs document their options that way.

In recent years programs have begun migrating their configuration files and options to the .config subdirectory of a user’s $HOME directory.

For Mac OS X Hitchhikers

Everyone should read For Linux Hitchhikers to understand what functionality they’ll need (especially when working with or on remote machines). As a Mac user, you should also read ”It just works… or does it? The dark side of Macs in HEP” by Andy Buckley. It explains in detail issues with software development on a Mac. It is an opinion piece, so don’t expect it to be balanced. Also, consider asking your supervisor for an account on a Linux box and never look back.

Software you will need

  • XQuartz: Like Xming for Windows, XQuartz runs a local X11 server for tunneling X11 applications over SSH. Unlike Windows, you don’t need a separate SSH program; ssh is built in.
  • Terminal.app: This is Mac OS’s default terminal emulator. It comes with Mac OS, so you shouldn’t need to install it. You should be aware of it though.
  • ROOT: The industry standard for High Energy Physics analysis. Beware: this program uses an Infinite Improbability Drive to perform analysis.
  • Aquamacs: A port of Emacs that runs as a standard OS X application using Aqua. This integrates Emacs with the Mac OS UI. Through the long history of corporate acquisitions a lot of Emacs hackers (from NeXTSTEP) ended up at Apple, so you will find that Mac OS integrates the Emacs experience more fundamentally than any other OS in existence. (This doesn’t mean you need to use Emacs if you use Mac OS, just that your muscle memory will thank you subconsciously.)
  • MacPorts: A system for compiling and installing open source software on the Mac
  • Home Brew: A package manager for Mac OS, allowing you to install various utilities that don’t necessarily come pre-installed with Mac OS.

Editors

Like the major world religions, there are also major editors. In the *nix ecosystem there are two main editors: Emacs and Vim. There are others, but they are many, and beyond the scope of this guide.

The most important thing to do after choosing an editor is to work through its corresponding tutorial. An oft-heard recommendation is that “Emacs is easier to learn than vi(m)”. A more accurate statement may be that it is easier to make things happen in Emacs than in Vim, but the two editors are in some sense the yin and yang of text. True enlightenment in either editor takes roughly the same amount of time after completing the corresponding tutorial.

Finding an editor Guru

After you have finished the tutorial for your editor of choice, it’s time to find a guru. Gurus are best located by asking around. If you are talking with someone and notice they use your editor, don’t be afraid to ask them how they did something. Most of the time the Guru will be flattered and may even volunteer to help you with any other editor-related questions.

Editor Guru etiquette

While it is generally OK to ask your Guru any editor related question, it is best to keep questions restricted to the editor in question. Flame wars have been fought for decades over which is the “one true editor.”

In order to prevent a faux pas, it is best to make sure you know which editor your guru uses. This is especially true in the case of a vi(m) or Emacs guru.

Another thing to be careful of is repeatedly asking basic questions. Again, some gurus will tolerate this at the beginning, but after a point the guru expects you to master the basics (on your own). The most valuable knowledge your guru can impart is not written in the tutorial that came with the editor.

Keeping your Guru happy

Gurus subsist mainly on a liquid diet of caffeinated beverages during the day and beer (occasionally wine) at night. It is important that your Guru remain well lubricated. It is generally considered a good gesture to offer your Guru his/her beverage of choice if you’ve found him/her to be especially helpful on your path to enlightenment.

Emacs

The end goal of any student of the Church of Emacs is to obtain proficiency in reprogramming the editor to solve the task at hand. This ultimately stems from the philosophy of lisp (this gift was given to us by St. IGNUcious, an AI Lab hacker from MIT, where Emacs was born). In lisp, the flexibility of the language allows it to be re-written to solve the problem as clearly as possible. In Emacs, an enlightened user will write a substrate of elisp (Emacs’ dialect of lisp) in order to solve the editing problem at hand.

While customizing and writing your .emacs (the initialization file loaded by Emacs from your home directory) is a spiritual journey, there are those who have done their best to illuminate the path. A brief guide to customization philosophies can be found here.

The Editor finds the following packages essential:

  • tramp: If you’re reading this in Emacs, you can follow the link with “C-c C-o”. It is the most important aspect of Emacs for HEP users. It allows you to “visit” files on remote machines from the Emacs running on your desktop. It does this through ssh. To visit a remote file, type “C-x C-f” and then type ‘/ssh:user@remote.host:~/remote/path’; note that tab completion works remotely just the same as visiting a file locally! Tramp is also aware of ssh aliases in ~/.ssh/config, see Configuring SSH.
  • Calc: “Calc is an advanced desk calculator and mathematical tool written by Dave Gillespie that runs as part of the GNU Emacs environment.” It handles barns and electron volts out of the box!
  • filladapt: a mode for more intelligently filling text in paragraphs
  • flyspell: a spell checker that highlights misspelled words (will check in comments if in a programming mode)
  • rect-mark: Adds facilities for marking, yanking, and otherwise editing column-formatted text.
  • dired (another info link): a directory editor for manipulating files in the Emacs way
  • solarized-theme: A theme by Ethan Schoonover, comes in dark and light variants that actually complement each other well, another good one is zenburn or gruvbox
  • ibuffer: changes the buffer interface and allows you to group buffers based on various buffer attributes
  • paredit: Enhances Emacs’s awareness of parenthetic structure
  • smartparens: Electrically pairs and deletes delimiters when appropriate (never miss a closing brace again!)
  • auto-complete: When setup properly, tab completes anything at any point depending on past input or names in other buffers.
  • auctex: LaTeX editing facilities (for when org-mode doesn’t quite cut it)
  • org-mode: This guide is written in org-mode. Org-mode can manage todo lists, write websites, serve as a lab notebook, execute code for literate programming and many other things. More relevant for physicists is the org-mode cookbook. People switch to Emacs just to get org-mode!

Init files of famous Emacs hackers are (in no order of awesomeness) Magnar Sveen, Technomancy, John Wiegley. There are also software packages that intend to comprehensively change the Emacs out of the box to a better user experience. The two most famous are Prelude and Emacs Live. An example (slightly annotated) init file can be found here.

Finally, there are some Emacs gurus who post blogs on the internet. Some particularly useful ones are Emacs Redux, Mastering Emacs, and Emacs Fu.

Various religious texts granting Emacs users various powers (such as reading email, chatting, tweeting, playing games, listening to music) can be found at the Emacs Wiki.

Vim

If Emacs is like Catholicism, then Vim is like Buddhism. Vim is the modern incarnation of vi, a modal text editor that descended from ed. The modal way of editing is by expressing in a few keystrokes how the text should be manipulated. This is in contrast to Emacs, where text is manipulated directly. This fundamental difference is the source of much confusion for new users, and is also why many people recommend Emacs as “being easier to learn.” This should not deter new users from learning vi(m), as its editing facilities are substantial.

A functional .vimrc looks like:

syntax on
set cursorline
set hlsearch
set ic
set incsearch
set ruler
set shiftwidth=4
set tabstop=4
set wrap

To learn Vim, type vimtutor at the command line and follow the instructions. Take your time, and repeat the tutorial once or twice over a few days. In the meantime, editors such as gedit or nano offer a more traditional experience. As your Vim skills improve, you will feel more comfortable with Vim and can stop using the less powerful editors.

Some useful links include:

Others

Followers of the Unix way realize that there are situations where using a set of shell commands piped together may fit the task at hand more efficiently than either of the other two editors. Tools you should be familiar with are:

Always keep in mind

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. – Jamie Zawinski

Software Design

Well designed software is a true marvel, in the same way architecture is a marvel. You are a stone mason, and you are building a cathedral. Repeat that last sentence every time you want to take a shortcut when coding. A cathedral can’t stand on a flimsy foundation.

In order to help you on your way, you should read the following:

Good software design is very hard, but when you have the pleasure of using well designed software, it is a true joy. Some examples of good HEP software:

  • Rivet: Robust Independent Validation of Experiment and Theory
  • Fastjet: Software package for jet finding

A brief introduction to C++

Caveat Emptor This information was written for C++03, it was old when I wrote it (2014), C++11 was too shiny and new. Now in 2022, C++11 is old and crufty. Modern C++ memory management looks very different than it did a decade ago. I’ve been out of the game too long to know what HEP is doing… They’re probably using C++11.

C++ is the industry standard programming language for analysis in HEP. Even if you are fortunate enough to do most of your work in Python, you will eventually be calling C++ code, and should understand some core concepts in order to debug problems should they arise.

Things to keep in mind:

  • This portion of the guide covers C++ at a high level. Very little specific syntax will be covered. When you have a C++ question, google is your friend.
  • When writing in any language, prefer that language’s idioms. Don’t write python in C++, C in C++, or C++ in python.
  • C++ is a vast language, however being familiar with its roots, C, is invaluable.
  • If faced with a decision between learning C++ vs Python, prefer C++. C++’s syntax is more rigid and requires more overhead. Once you know C++, python is much easier to pick up.
  • There’s always an exception to the rule, just make sure it’s the right exception!

C++ is an imperative, object oriented language. It started out as a “C with classes” but has since bolted on significant language features different from C. Proficiency with C++ should be aimed towards comfortable use of the template meta-programming features of the language, although it is entirely possible to spend an entire career writing C++ without exercising this feature (just read the ROOT source code).

Pointers

Required Reading: The Descent to C

As C++ has evolved from C, it retains parts of C’s low level nature. Part of this is the need to be explicit about managing memory manually. This is in stark contrast to languages such as Java or Python where memory management is handled for the programmer.

A consequence of this is the ability to address specific cells of memory (the smallest accessible unit, typically a byte). An object (int, double, float, char, string, etc) may span several memory cells. A pointer is the computer’s representation of a memory cell’s location in memory, ie a memory address. Ultimately the programmer is interested in the data contained in the set of memory cells “pointed to” by the pointer. The act of retrieving this data is called ”dereferencing a pointer”.

As in physics, facility with manipulating pointers is best gained through experience; however, many analogies have been developed to ease confusion. One analogy is street addresses: a street address is a sequence of numbers (the pointer) which instructs someone, a mailman say (the computer), how to find a specific location. Once at that location, it is possible to manipulate objects located at that address (deliver mail if you’re the mailman, break the mailbox if you’re a bored teenager, knock on the door if you’re a vacuum salesman, etc).

Now some syntax:

Foo* bar = new Foo("Baz",42,"What is the question?");
std::cout << "object bar lives at memory address:"<<bar<<std::endl;
std::cout << "bar calculated a question to the answer to \"The Ultimate Question\" as "<<bar->TheAnswer()<<std::endl;
std::cout <<"Another way to get the answer is: "<<(*bar).TheAnswer()<<std::endl;

Lots of interesting things have been introduced here. Let’s look at a possible output of this program:

object bar lives at memory address: 0xd29ad0
bar calculated a question to the answer to "The Ultimate Question" as "What is 6x9?"
Another way to get the answer is: "What is 6x9?"

What happened? Let’s look at the first line

Foo* bar = new Foo("Baz",42,"What is the question?");

Foo* is a pointer of type Foo. It’s an address to a chunk of memory that contains an instance of Foo.

Question: Why does the compiler need to know that it’s a Foo type object at that address?

Answer: Foo might span several memory cells, in which case the compiler must know how many memory cells to move if you ask for the bar+1 spot. In fact, in C there is a concept called the void*, a type-less pointer that is an address to anything. It is the programmer’s responsibility to cast the void* to the correct type.
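As a minimal sketch of that void* bookkeeping (purely illustrative; don’t write new code like this):

// needs #include <cstdlib> for malloc/free
void* blob = malloc(sizeof(double)); // an address with no type information attached
*(double*)blob = 3.14159;            // the programmer must cast before dereferencing
double pi = *(double*)blob;          // read it back as a typed value
free(blob);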

OK, so we have a pointer to an object of type Foo called bar.

Question: What happens on the right hand side of the assignment operator (=)?

Answer: C++ reserves the keyword “new” for memory allocation. The “new” keyword takes a class constructor on the right hand side, and returns a memory address on the left hand side. This address gets stored in the variable bar.

Operationally, the “new” keyword allocates a chunk of memory to hold the object on the right hand side, and returns a pointer to the beginning of the chunk.

What happens when we want to access the memory that the pointer points to? There is another operation called “dereferencing” which goes to the address pointed to and returns the object contained at that point in memory. Consider the following snippet:

double* foo = new double(3.14159);
double pi = *foo;
std::cout <<"Pi is: "<<pi<<std::endl;

Here, a chunk of memory has the value 3.14159 written to it, then that value is retrieved and stored in another location of memory called pi. That data is then written out to the terminal by std::cout and std::endl.

Now we can understand this line:

std::cout <<"Another way to get the answer is: "<<(*bar).TheAnswer()<<std::endl;

It means: retrieve the object pointed to by bar, and call the method ”TheAnswer()” on it. Programmers abhor syntax that can easily get them into trouble, so the language designers (of C) added a shorthand for this kind of operation (the -> operator):

Foo* bar=new Foo();
if(bar->Value()==(*bar).Value()){
  std::cout<<"They're the same!"<<std::endl;
}

Quiz: What will the output of this snippet be?

Why are pointers useful at all?

Clever hitchhikers will notice that this appears to be a bunch of bureaucratic mucking about with pointless details, most of the time it is. Since most of HEP deals with pointless details bureaucratically, a lot of HEP code uses pointers.

To understand the real purpose of pointers, we must examine dynamic allocation. Consider the following code:

double* foo(){
  double* bar = new double(0.0);

  {
    double baz=42;
    *bar=baz;//dereference bar, and store the value of baz 
  }
  //baz is out of scope
  return bar;
}
int main(void){
  double* foobar=foo();
  std::cout <<"The Answer to the Ultimate question is :"<<*foobar<<std::endl;
  delete foobar;
  return 0;
}

Let’s execute the code in our mind:

  1. execute main(void)
  2. a double* named foobar is allocated.
  3. foo() is executed
    1. a double* named bar is allocated
    2. new initializes a double with value 0.0
    3. new assigns the address containing that double to bar
    4. enter the braces, intialize a double named baz with value 42
    5. dereference bar and copy the value of baz into it
    6. exit the braces; baz goes out of scope and its storage is released
    7. return the address containing the value of bar;
  4. Assign the value returned by foo() to foobar
  5. stream the string “The Answer…” to stdout
  6. dereference foobar to obtain the value that was stored through bar, 42, and stream that to stdout
  7. add a newline to the output and flush the result to the terminal with std::endl
  8. free the memory allocated for bar when foo was called.
  9. return 0
  10. exit the program

References

A similar concept present in C++ (but not C) is the reference. References can be thought of as aliases (the way Dave is an alias of David). Their syntax is:

int foo=42;
int& theAnswer=foo;
foo=0;
std::cout<<theAnswer<<std::endl;

Here foo is initialized to the value 42, then a reference named theAnswer is declared and assigned to foo. All this does is make a new name for the same object. What does the program output?

The answer is 0. References seem pointless (pun intended) until they’re used in function definitions:

void bar(int& foo){
  //complicated calculation for foo
  foo++;
}
int main(void){
  int baz=41;
  bar(baz);
  std::cout<<baz<<std::endl;
}

Question: What is the output of this program?

Answer: 42
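A closely related idiom that appears everywhere in HEP code is passing arguments by const reference: the function receives an alias (so no copy of a potentially large object is made), but it promises not to modify it. A minimal sketch, with made-up names:

#include <cstddef>
#include <iostream>
#include <vector>

// const reference: no copy, and the compiler enforces read-only access
double sumPt(const std::vector<double>& pts){
  double sum = 0.0;
  for(std::size_t i = 0; i < pts.size(); ++i) sum += pts[i];
  return sum;
}

int main(void){
  std::vector<double> jetPts;
  jetPts.push_back(25.0);
  jetPts.push_back(42.0);
  std::cout << sumPt(jetPts) << std::endl; // prints 67
  return 0;
}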

Methods

Methods, or functions, are defined as:

return_type function_name(arg1_type arg1, arg2_type arg2, ...){
  //statements that define function_name
}

It is possible to “forward declare” functions, these are “promises to the compiler” that you have a function with a particular signature:

double foo(double,double);
// important other stuff
double foo(double theta, double phi){
  return sin(theta)*sin(theta) + cos(phi)*cos(phi);
}

Notice that the compiler doesn’t need to know the names of the arguments in the forward declaration.

Before we move on to the next topic, a note on methods. Most of the time during development, you only have a few helper functions. This is fine! Just write your helper functions in a header file and include them. Write the main function and move on with your life. There are many examples in HEP where methods have been pigeon-holed into classes. The result is a cumbersome interface for the user (YOU!) or, more importantly, your supervisor. With that in mind, let’s move on to classes.
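Before we do, here is a minimal sketch of the header-of-helpers approach just described (the helper function and file names are hypothetical):

// helpers.h -- a header of free functions, no class required
#ifndef HELPERS_H
#define HELPERS_H
#include <cmath>
inline double deltaPhi(double phi1, double phi2){
  double dphi = phi1 - phi2;
  while(dphi >  M_PI) dphi -= 2*M_PI;
  while(dphi < -M_PI) dphi += 2*M_PI;
  return dphi;
}
#endif

// analysis.cxx -- include the helpers and get on with your life
#include <iostream>
#include "helpers.h"
int main(void){
  std::cout << deltaPhi(3.0,-3.0) << std::endl; // wraps the difference into (-pi,pi]
  return 0;
}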

Classes

STOP! Re-read the advice about helper functions at the end of the previous section.

Now, ask yourself: Do I really need a class? No. Ok, great!

Yes. Are you sure? Maybe your needs are better served with a few functions and a well defined interface.

Do you have complicated data structures that need to be operated on by many methods? No? Maybe your needs are better served with a few functions and a well defined interface.

Yes? Maybe you should rethink your design.

You’ve rethought it and realized that you have to use a class because the person before you did, now there isn’t a clean way to do it any other way. OK, classes.

Editor’s Note: Our dear reader may have been studying this guide and thought to themselves: “Gee whiz, there is a lot of great advice here, I should follow this exactly or dragons might eat my analysis program!” While this may be a possibility, the previous advice on classes may be a little, well, misguided.

More sound advice may sound like this: When beginning development, make your first (and maybe second) drafts of the program without classes. This will force you to recognize which parts of your program fit together (data+methods) and which are separate. Then, revise your program to include classes with the appropriate encapsulation of data and methods. This will ensure that you write a clean interface, and that you don’t end up with 20 methods that all take the same four arguments.

The basic syntax is:

class A{
public:
  A():a(0),b(0){};
  A(int _a):a(_a),b(0){};
  ~A(){};
  void SetA(int new_a){a=new_a;};
  int GetA(){return a;};
  void SetB(int new_b){b=new_b;};
  int GetB(){return b;};
private:
  int a;
protected:
  int b;
};

This is a trivial example, and it breaks many rules about naming conventions and clarity. It is not a good example. It should not be used as a good example to win arguments about concise code. In fact, you probably shouldn’t have read it.

Important features of the code: Anything after the public keyword is accessible to the outside world, ie:

A myObj;
myObj.SetA(10);
myObj.GetA();
myObj.SetB(42);
myObj.GetB();

Is all valid code, anything that is written after private, is just that. You cannot access it outside of the class:

myObj.a; //Compiler error
myObj.b; //protected is a special form of private

The protected keyword is for class inheritance. It says that these variables and methods are private to users of the class, but if another class inherits from this one, it can access them.

Normally, a derived class can only access the public members of its base class; the private members are there, but off limits. Protected offers a way to keep data encapsulated from the outside world while still sharing it within an inheritance hierarchy.
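A minimal sketch of what that means in practice (the derived class B and its methods are hypothetical, building on the class A above):

class B : public A {
public:
  int GetBPlusOne(){return b+1;};    // fine: b is protected in A, so B can see it
  // int GetAPlusOne(){return a+1;}; // compiler error: a is private to A
};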

In case you haven’t picked up on it, classes are one of the hairier aspects of C++. It’s better if you refer to some other resource for a tutorial on classes, as their subtleties are beyond the scope of The Guide.

An even briefer introduction to Python

Python is a wonderful language. It is expressive and allows rapid prototyping with a shell type environment. Try learning it the hard way. Another approach is to google what you’re trying to do, and make it run on a small test case.

When writing code, it is best to be idiomatic. This is especially true in python. Python’s driving philosophy is “one right way” but since python is being developed by multiple hackers, there are “many right ways.”

Here are some references; learn Python 3.

PyROOT: ROOT bindings in python

Mostly Harmless.

ROOT

For better or worse, for richer or poorer (always poorer), HEP Physicists are married to (read: stuck with) ROOT. It’s the Unix of HEP. There is method in the madness, though it is not clear what the method is (just yet). The Editor is fairly certain that ROOT uses Bistromathics for many of its statistical operations.

ROOT is the industry standard tool for analyzing and manipulating gobs of data. Other statistical analysis tools crash and burn on the datasets that ROOT eats for breakfast.

Recently, ROOT has begun the transition from the 5 series to the 6 series. This transition has resulted in huge changes to the front and back-end of ROOT. Below there will be some comparisons to both versions. When something applies to the 5 series, it will be annotated ROOTv5. It should be assumed that ROOT or ROOTv6 refers to the newest 6 series. Without further ado, let’s set it up and get to work!

Installing and setting up

The ROOT devs have done a great job of documenting how to install pre-compiled versions on various OSes. Sadly it’s not straightforward on Debian-based (aka Ubuntu) systems. Windows binaries are available, but they’re beta-quality.

Building From source

These instructions are for *nix based systems (ie it was written for Ubuntu, but MacOS shouldn’t be much different and Windows is out of the question).

When choosing a version of ROOT, always pick the ‘pro’ (pro for production) version. It’s the latest stable version recommended by the ROOT devs.

Nota Bene If you’re doing this on Mac OS, you’ll need to use brew install blah instead of apt-get install blah, and the package names will probably be different.

If you haven’t yet, read the section of the guide relevant to your OS.

For our install of ROOT, we’ll be compiling and running it locally. This has a few advantages:

  • “Uninstalling” is easy, either unset the environment variables pointing to ROOT, or completely delete the folder that root lives in (in this example ~/root)
  • Having multiple versions side-by-side is possible, you could have:
    • ~/root-clang a version of root compiled with clang
    • ~/root-5.34 a stable version of ROOT
    • ~/root-5.99 the beta version of ROOT 6
    • To use any of them you would just have to source ~/root-ver/bin/thisroot.sh
  • You don’t need root (administrative) access on the machine (as long as the pre-reqs are installed). This is generally nice since it decouples ROOT from the hosting OS.

Getting the Pre-Requisites

Now, all of the following information is documented at root.cern.ch, but it is even less organized than this guide. The following is a straight shot from no source code to a fully working ROOT binary on a clean install of Ubuntu (Ubuntu 21.10).

End to end, this takes ~40min on a machine circa 2011, so budget some coffee time.

sudo apt-get install git dpkg-dev cmake g++ gcc binutils libx11-dev libxpm-dev \
     libxft-dev libxext-dev python libssl-dev  gfortran libpcre3-dev \
     xlibmesa-glu-dev libglew1.5-dev libftgl-dev \
     libmysqlclient-dev libfftw3-dev libcfitsio-dev \
     graphviz-dev libavahi-compat-libdnssd-dev \
     libldap2-dev python-dev libxml2-dev libkrb5-dev \
     libgsl0-dev qtwebengine5-dev

This takes about 1 GB of disk space. You may already have some of these packages if you’ve installed build-essential or git before. In either case, apt is smart enough to see that and not re-install them.

The above list of packages are for a full-blown, all-features-enabled version of ROOT. If you want a stripped down version, you’ll have to get the pre-reqs from http://root.cern.ch.

May the Source be with you

Let’s get a copy of the source:

git clone http://root.cern.ch/git/root.git
cd root

For future reference, if you want to update:

git pull
git tag -l 
git checkout -b tag-name tag-name

git tag -l lists all the available tags, choose the one you want and substitute it for tag-name

For now, lets checkout the latest pro branch:

git checkout -b $(git tag -l | tail -n 1) $(git tag -l | tail -n 1)

This will check out the latest tag as a branch and give it the same name. It assumes that when the ROOT devs tag a release it’s for stable production use… but the prescription is future-proof. If you’re new to ”version control systems” or ”source control management”, it’s useful to do a tutorial to learn the basics. HEP and open source projects have mostly migrated to git.

Building

ROOT now uses cmake to build. It’s slightly more straightforward (and much quicker) than the old configure-and-make workflow, but it has some idiosyncrasies. Specifically (and not explicitly spelled out in the ROOT docs), cmake requires three separate spaces: a place for build artifacts (object files and libraries), a place for the source code (managed by git), and an install target. Traditionally the install target is somewhere in the system directories where other programs are installed. In ROOT’s case, there are many situations where it is necessary to have multiple versions available on one system. The ROOT devs created a ’thisroot.sh’ script which, when sourced into a user’s shell session, makes the ROOT in that folder the currently active one. It does this by setting the appropriate environment variables that instruct the operating system to search those directories for programs and required libraries.

# assumes we're in the root source directory
mkdir -p root_build root_install; 
cd root_build; 
cmake -DCMAKE_INSTALL_PREFIX=../root_install/ ../

After configuring you should see:

...
Install path    /home/username/root/root_install
...
-- Enabled support for:  asimage builtin_afterimage builtin_clang builtin_cling builtin_gl2ps builtin_llvm builtin_lz4 builtin_lzma builtin_nlohmannjson builtin_openui5 builtin_tbb builtin_vdt builtin_xrootd builtin_xxhash clad dataframe exceptions fftw3 fitsio gdml http imt mathmore mlp minuit2 mysql opengl pyroot roofit webgui root7 rpath runtime_cxxmodules shared ssl tmva tmva-cpu spectrum vdt x11 xml xrootd
-- Configuring done
-- Generating done
-- Build files have been written to: /home/username/root/root_build

After cmake runs, it creates a Makefile for the rest of the build process. Make sure the install path is pointing to root_install; if it’s pointing to /usr/local you’ll need administrator privileges to install, and it’ll be harder to uninstall cleanly. Now you’re ready to build:

make -j $(cat /proc/cpuinfo | grep processor | cut -d ':' -f 2 | tail -n 1)

The -j option tells make how many jobs it can run simultaneously. Without -j, only one job runs. A good rule of thumb is to choose the number of cores you have on your computer. If you are compiling on a remote computer, it is probably shared by others, in which case it is good etiquette to run your jobs single threaded. On your laptop, you should choose one less than the number of cores you have (so you don’t notice a slow-down while it’s building in the background).

See man make and look under the option “-j [jobs]” for more detailed information about this switch. Some software doesn’t like being built in parallel. If a build fails (root won’t) run it without the ’-j’ option.

This will take (depending on your hardware (even in 2022)) between 20-45min, so now is a good time for a cup of tea, or coffee with your editor guru.

When it’s finished, it will print out:

[lots of boring crap]

[100%] Linking CXX executable ../../bin/hist2workspace
[100%] Built target hist2workspace

If you do not get this output, but the build just ends with [lots of boring crap] find a senior grad student and have them look at the [lots of boring crap] (it won’t be boring to them). They will be able to instruct you on what went wrong.

If you don’t have a senior grad student handy, try googling some of the output and seeing if you can get anywhere. There is also the ROOT Talk Forums.

Finally, you need to ‘install’ root.

make install

Using your new superpower

Now, when you want to use root, you can run the command:

source ~/root/root_install/bin/thisroot.sh

If you are using tcsh (you shouldn’t be) you would need to run:

source ~/root/root_install/bin/thisroot.csh

There are differing opinions about whether or not you should put something like this in your bashrc. One school of thought (especially applicable when you bounce between different versions) is that you should keep your environment as clean as possible and only set up what you need. In that case, adding:

alias setupROOT='source ${HOME}/root/root_install/bin/thisroot.sh'

To your ~/.bashrc file is enough.

Then, whenever you need root, you have to run setupROOT before you can run root.

Another school of thought is that you should always have some copy of root available if possible. In that case the following will always set up root when bash runs, if the setup file exists:

[ -f ~/root/root_install/bin/thisroot.sh ] && source ~/root/root_install/bin/thisroot.sh

Now you can start root by typing root at the command line. You should see:

   ------------------------------------------------------------------
  | Welcome to ROOT 6.25/02                        https://root.cern |
  | (c) 1995-2021, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Feb 20 2022, 01:06:19                 |
  | From tags/v6-25-02@v6-25-02                                      |
  | With c++ (Ubuntu 11.2.0-7ubuntu2) 11.2.0                         |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------

root [0]

To start root without the splash screen type root -l in which case you see:

root [0]

Getting Started with xrootd

It appears that xrootd is built with root by default. If you want to be able to access files hosted at CERN, you’ll have to have a kerberos client installed (either krb5-user or heimdal-clients on Debian or Ubuntu systems). Then, before connecting, run kinit -5 ${CERN_USER}@CERN.CH to generate a kerberos ticket.

This might be a little in the weeds, but if your event processing bandwidth is smaller than the network bandwidth, you can achieve the same performance as a local file-system without needing to copy large data sets locally. Of course there are lots of considerations, not least of which is where your machine is relative to the data on the network.
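As a sketch of what this looks like in practice (the server and path are placeholders; substitute the xrootd endpoint and file your experiment actually provides), a remote file opens just like a local one:

TFile* file = TFile::Open("root://eosuser.cern.ch//eos/user/u/username/MeaningOfLife.root");
if(!file || file->IsZombie()){
  std::cout << "Could not open remote file" << std::endl;
} else {
  TH1F* hist = (TH1F*)file->Get("Hist1");
  if(hist) hist->Draw();
}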

A Path to ROOT enlightenment

There are three methods of running code through ROOT to produce results. These methods are listed below, each more sophisticated than the last. They also include example code intended as a starting point for hacking.

As a reminder, see Obtaining a copy and supporting material for instructions on obtaining and running the example code.

Level 1: Macros

The first, and simplest way to execute ROOT related code is the humble macro. A macro is a set of ROOT commands enclosed by braces. For example:

{
  TFile* file= TFile::Open("MeaningOfLife.root");
  TH1F* hist = (TH1F*)file->Get("Hist1");
  cout << hist->GetNbinsX() <<endl;
}

For ROOT 6: ROOTv6 macros are compiled with the LLVM backend “just in time” (JIT). This means that you have to type valid C++ in order to make things work correctly. This alleviates a lot of the issues that make Macros fragile, and so it is easier to transition from Level 1 to Level 2, and you can achieve more with a macro than before.

For ROOT 5: While not immediately obvious, ROOTv5 macros are not written in C or C++, but CINT. CINT covers “most of” ANSI C and ISO C++ 2003. There are some important differences:

  • foo.blah and foo->blah are interchangeable
  • a semicolon ‘;’ at the end of a line is optional
  • No need to ”#include” headers

As you progress in writing more sophisticated C++, you will run into CINT’s shortcomings as a C++ interpreter. It is recommended that you move to Level 2 or 3 before this happens.

While it is possible to write complicated CINT macros (files with multiple function definitions) it is not recommended. CINT has a habit of keeping up the appearance of doing one thing when in reality something entirely different is happening “behind the scenes”.

CINT is best used for quick scripts to plot histograms already saved to a disk, or to inspect a few branches from a TTree. More sophisticated analyses are better served by Levels 2 and 3.
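For example, a quick branch-inspection macro of that flavor might look like the following (the file, tree, and branch names are made up; substitute your own):

{
  TFile* file = TFile::Open("MeaningOfLife.root");
  TTree* tree = (TTree*)file->Get("EventTree");
  tree->Print();        // list the branches and their types
  tree->Draw("jet_pt"); // histogram a single branch
}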

Level 2: Compiled Macros

Compiled macros are full-blown C++ programs. Generally there is a “steering macro” that handles compiling and loading the required libraries. An example steering macro:

{
  //may need to load other libraries or files that depend on analysis.C
  gROOT->ProcessLine(".L analysis.C++");
  gROOT->ProcessLine("doAnalysis()");
}

The compiled macro itself looks more like a traditional C++ program:

#include <cstdlib>
#include "TFile.h"
#include "TH1F.h"

int doAnalysis(){
  
  return 42;
}

Since the ROOT binary already defines a “main”, an error will occur if you define another function named “main”; therefore we use the verb “doAnalysis”.

The steering macro that compiles each source file can become arbitrarily complex. To some this may read “flexible”, to others it may read “disorganized”. If your analysis grows into a multi-file program, it’s probably time to ascend to Level 3.

Level 3: Compiled Programs

A compiled program is just that. Here ROOT takes the role of a rich set of libraries for composing a C++ based analysis.

An example program (and supporting Makefile) is included here.

Makefiles come with their own overhead, but the make system is very powerful. The make manual is very readable with many examples.
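If the linked example isn’t handy, here is a minimal sketch of a Level 3 program (the histogram name is hypothetical); the first comment shows one way to compile it against ROOT using root-config:

// compile with: g++ -o myAnalysis myAnalysis.cxx $(root-config --cflags --libs)
#include <iostream>
#include "TFile.h"
#include "TH1F.h"

int main(int argc, char* argv[]){
  if(argc < 2){
    std::cerr << "Usage: " << argv[0] << " input.root" << std::endl;
    return 1;
  }
  TFile* file = TFile::Open(argv[1]);
  if(!file || file->IsZombie()){
    std::cerr << "Could not open " << argv[1] << std::endl;
    return 1;
  }
  TH1F* hist = (TH1F*)file->Get("Hist1");
  if(hist) std::cout << "Hist1 has " << hist->GetEntries() << " entries" << std::endl;
  file->Close();
  return 0;
}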

A note on Enlightenment

Master Foo, of Rootless Root, gives the sage advice:

“When you are hungry, eat; when you are thirsty, drink; when you are tired, sleep.”

To spell it out (and to prevent the reader from enlightenment), it is wise to choose the use of ROOT which is most appropriate for a task at hand. The practicing HEP physicist is proficient with all three levels, and can pick and choose which approach is best for the task at hand.

PyROOT

PyROOT is a set of Python bindings to ROOT. It works fairly well out of the box, but there are some things to keep in mind.

  • Idiomatic python avoids ”from ROOT import *”; prefer ”from ROOT import blah”
  • the ROOT devs know you aren’t going to be idiomatic, so instead they’ve implemented a lazy loading system (ROOT is huge, so ”from ROOT import *” would take forever). This may be confusing if you’re a python expert and expect exploration commands like dir() to work with ROOT.
  • If performance matters, try to stay in C++ land (ie call C++ functions from python) as much as possible
    • If performance really matters, write it in python and then port it to C++. This is fairly advanced, but not impossible. You’ll have to generate CINT dictionaries for your source files.

If you’re looking for a more “pythonic” (not my word) experience, maybe give rootpy a shot. See also An even briefer introduction to Python for resources to learn python itself.

Fitting Data with RooFit

RooFit is a shiny penny compared to the rest of the ROOT ecosystem. It has its own quirks and idioms, but the interfaces are fairly reasonable and the manual is well written. The latest version of the manual and quickstart can always be found here:

Direct links are here, though they aren’t guaranteed to be current:

Styling Plots

A well designed graph is truly a work of art. The path from paltry graphics spit out by ROOT to something worthy of framing (and yes, truly amazing data visualizations are routinely framed) is long and fraught with naysayers who will insist that you are doing it wrong. Ignore them, and do what needs to be done to communicate your hard won data clearly and concisely. To get you started, give ”Principles of Information Display for Visualization Practitioners” a read. It is an executive summary of Edward Tufte’s works on visualizing information. If what you read resonates with you, then please read Tufte’s books. They make for delightful coffee time distractions. After reading his books you will start to see his influence in particularly nice graphics. You will also see many sins committed by other practitioners. Choose your role-models wisely.

Sage advice (passed down from The Editor’s first mentor):

When you make a plot, take the time to make it publication quality and reproducible.

This means two things:

  1. Make it good enough to go into a paper
  2. Prefer generating it with C++/Python over any other format (data inputs will frequently change at the last minute, and being able to “hit a button” and get the plot is very useful unless you have a room full of Mechanical Turks lying around)

This also means it’s probably a good idea to keep *.root files containing your histograms for last minute style changes if you are writing a presentation.
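In practice that just means writing the finished histograms to a file at the end of your analysis code, something like the following (assuming hist is a histogram you have already filled):

TFile* out = TFile::Open("histograms.root","RECREATE");
hist->Write(); // stores the histogram under its current name
out->Close();  // later: reopen, restyle, and redraw without rerunning the analysis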

Producing a publication quality plot can be challenging; however, ROOT includes the concept of a “Style” which can be applied. These are global rules for how plots should be printed. In previous versions of ROOT, the default style was notoriously bad. In The Editor’s humble opinion, this was done to simultaneously encourage each physicist to set their own standard, and to immediately identify ROOT newbies from seasoned ROOT hackers.

Now, things are better, though the idea that “each physicist set their own standard” has stuck, and so there are many styles floating around.

Example Style

An example style (probably from the CMS TDR, the details are lost to time):

{
  TStyle *tdrStyle = new TStyle("tdrStyle","Style for P-TDR");

  cout << "TDR Style initialized" << endl;

// For the canvas:
  tdrStyle->SetCanvasBorderMode(0);
  tdrStyle->SetCanvasColor(kWhite);
  tdrStyle->SetCanvasDefH(600); //Height of canvas
  tdrStyle->SetCanvasDefW(600); //Width of canvas
  tdrStyle->SetCanvasDefX(0);   //Position on screen
  tdrStyle->SetCanvasDefY(0);

// For the Pad:
  tdrStyle->SetPadBorderMode(0);
  // tdrStyle->SetPadBorderSize(Width_t size = 1);
  tdrStyle->SetPadColor(kWhite);
  tdrStyle->SetPadGridX(false);
  tdrStyle->SetPadGridY(false);
  tdrStyle->SetGridColor(0);
  tdrStyle->SetGridStyle(3);
  tdrStyle->SetGridWidth(1);

// For the frame:
  tdrStyle->SetFrameBorderMode(0);
  tdrStyle->SetFrameBorderSize(1);
  tdrStyle->SetFrameFillColor(0);
  tdrStyle->SetFrameFillStyle(0);
  tdrStyle->SetFrameLineColor(1);
  tdrStyle->SetFrameLineStyle(1);
  tdrStyle->SetFrameLineWidth(1);

// For the histo:
  // tdrStyle->SetHistFillColor(1);
  // tdrStyle->SetHistFillStyle(0);
  tdrStyle->SetHistLineColor(1);
  tdrStyle->SetHistLineStyle(0);
  tdrStyle->SetHistLineWidth(1);

  tdrStyle->SetEndErrorSize(2);
  //tdrStyle->SetErrorMarker(20);
  tdrStyle->SetErrorX(0.);
  
  tdrStyle->SetMarkerStyle(20);

//For the fit/function:
  tdrStyle->SetOptFit(1);
  tdrStyle->SetFitFormat("5.4g");
  tdrStyle->SetFuncColor(2);
  tdrStyle->SetFuncStyle(1);
  tdrStyle->SetFuncWidth(1);

//For the date:
  tdrStyle->SetOptDate(0);
  // tdrStyle->SetDateX(Float_t x = 0.01);
  // tdrStyle->SetDateY(Float_t y = 0.01);

// For the statistics box:
  tdrStyle->SetOptFile(0);
  tdrStyle->SetOptStat(0); // To display the mean and RMS:   SetOptStat("mr");
  tdrStyle->SetStatColor(kWhite);
  tdrStyle->SetStatFont(42);
  tdrStyle->SetStatFontSize(0.025);
  tdrStyle->SetStatTextColor(1);
  tdrStyle->SetStatFormat("6.4g");
  tdrStyle->SetStatBorderSize(1);
  tdrStyle->SetStatH(0.1);
  tdrStyle->SetStatW(0.15);
  // tdrStyle->SetStatStyle(Style_t style = 1001);
  // tdrStyle->SetStatX(Float_t x = 0);
  // tdrStyle->SetStatY(Float_t y = 0);

// Margins:
  tdrStyle->SetPadTopMargin(0.15);
  tdrStyle->SetPadBottomMargin(0.13);
  tdrStyle->SetPadLeftMargin(0.13);
  tdrStyle->SetPadRightMargin(0.15);

// For the Global title:

//  tdrStyle->SetOptTitle(0);
  tdrStyle->SetTitleFont(42);
  tdrStyle->SetTitleColor(1);
  tdrStyle->SetTitleTextColor(1);
  tdrStyle->SetTitleFillColor(10);
  tdrStyle->SetTitleFontSize(0.05);
  // tdrStyle->SetTitleH(0); // Set the height of the title box
  // tdrStyle->SetTitleW(0); // Set the width of the title box
  // tdrStyle->SetTitleX(0); // Set the position of the title box
  // tdrStyle->SetTitleY(0.985); // Set the position of the title box
  // tdrStyle->SetTitleStyle(Style_t style = 1001);
  // tdrStyle->SetTitleBorderSize(2);

// For the axis titles:

  tdrStyle->SetTitleColor(1, "XYZ");
  tdrStyle->SetTitleFont(42, "XYZ");
  tdrStyle->SetTitleSize(0.06, "XYZ");
// The inconsistency is great!
  tdrStyle->SetTitleXOffset(1.0);
  tdrStyle->SetTitleOffset(1.5, "Y"); 

// For the axis labels:

  tdrStyle->SetLabelColor(1, "XYZ");
  tdrStyle->SetLabelFont(42, "XYZ");
  tdrStyle->SetLabelOffset(0.007, "XYZ");
  tdrStyle->SetLabelSize(0.05, "XYZ");

// For the axis:

  tdrStyle->SetAxisColor(1, "XYZ");
  tdrStyle->SetStripDecimals(kTRUE);
  tdrStyle->SetTickLength(0.03, "XYZ");
  tdrStyle->SetNdivisions(510, "XYZ");
  tdrStyle->SetPadTickX(1);  // To get tick marks on the opposite side of the frame
  tdrStyle->SetPadTickY(1);

// Change for log plots:
  tdrStyle->SetOptLogx(0);
  tdrStyle->SetOptLogy(0);
  tdrStyle->SetOptLogz(0);

  tdrStyle->SetPalette(1,0);
  tdrStyle->cd();
}

If you’re working with one of the major experiments, they’ll most likely have a style for you to use (it will invariably be 95% the same as above, but the 5% will make all the difference).
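
Whichever style you end up with, it has to be loaded before you draw anything. A minimal sketch, assuming the block above is saved as setTdrStyle.C: put the following in a rootlogon.C in your working directory (ROOT executes it automatically at startup):

{
  gROOT->Macro("setTdrStyle.C"); // run the style macro, which ends with tdrStyle->cd()
  gROOT->ForceStyle();           // also apply the style to objects read back from files
}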

Transparent Plots

Add this to your ~/.rootrc (or create it if it doesn’t exist):

# Flag to set CanvasPreferGL via gStyle
OpenGL.CanvasPreferGL:                      1

Now, in your plotting code:

TColor* color = gROOT->GetColor(TColor::GetColor(red,green,blue));//Use ints from 0 to 255 
color->SetAlpha(0.5);//0 is fully transparent, 1 fully opaque
hist->SetFillColor(color->GetNumber());

Is this a clean interface? No, but it can be just what your graphic needs to remain clear without cluttering the canvas with hatching.

Two warnings:

  1. As of this writing, this is only supported for “popular” output formats (pdf, svg, gif, jpg, and png), though notably not postscript (ie ps).
  2. It’s very easy to create a shade that cannot be properly rendered on a projector, making the transparent component of your plots invisible.

Extending ROOT with custom classes

Sometimes it is useful to add your own classes to ROOT (so they can be used in ROOT macros/PyROOT scripts, or so they can be stored in a ROOT file). For example, you may have a class of the following form (in a file called CustomEvent.h):

#ifndef CustomEvent_h
#define CustomEvent_h

#include "TObject.h"

class CustomEvent : public TObject {
private:
  int    _eventID;
  double _eventEnergy;
public:
  CustomEvent();            // defined in CustomEvent.cxx below
  virtual ~CustomEvent();   // (defining them here as well would not compile)

  void set_eventID(const int eid);
  void set_eventEnergy(const double e);

  int eventID() const { return _eventID; }
  double eventEnergy() const { return _eventEnergy; }

  ClassDef(CustomEvent,1);  // ROOT dictionary hook; conventionally the last entry in the class
};

#endif

It is a good idea to inherit from TObject if you need to read or write objects to disk. Official documentation of the ins and outs of adding classes can be found in Chapter 15 of the User’s Guide, as well as in the cint and ClassDef documentation. We also need CustomEvent.cxx:

#include "CustomEvent.h"

CustomEvent::CustomEvent() : _eventID(0), _eventEnergy(0.) {} // initialize members so a default-constructed event is well defined
CustomEvent::~CustomEvent() {}

void CustomEvent::set_eventID(const int eid) { _eventID = eid; }
void CustomEvent::set_eventEnergy(const double e) { _eventEnergy = e; }

To make this class accessible within (Py)ROOT you must create a LinkDef.h file of the form:

#ifdef __CINT__
#pragma link off all globals;
#pragma link off all classes;
#pragma link off all functions;

#pragma link C++ class CustomEvent+;

#endif

Now you must generate a dictionary for the class:

rootcint -f CustomEventDictionary.cxx -c CustomEvent.h LinkDef.h

This generates CustomEventDictionary.cxx and CustomEventDictionary.h. Now you can compile the source for your class and the new dictionary source. Then finally link them together in a library:

g++ -fPIC -c CustomEvent.cxx `root-config --cflags`
g++ -fPIC -c CustomEventDictionary.cxx `root-config --cflags`
g++ -shared -o libCustomEvent.so CustomEvent.o CustomEventDictionary.o `root-config --glibs`

Now by loading the library libCustomEvent.so you can use your class within ROOT. For example, the macro:

{
  gSystem->Load("libCustomEvent");
  CustomEvent a;
  a.set_eventID(42);
  std::cout << a.eventID() << std::endl;
}
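
Since the usual reason for all this ceremony is persisting your objects, here is a sketch (the file, tree, and branch names are made up) of writing CustomEvent objects into a TTree once the library is loaded:

{
  gSystem->Load("libCustomEvent");
  TFile out("custom_events.root","RECREATE");
  TTree tree("events","A tree of CustomEvents");
  CustomEvent* evt = new CustomEvent();
  tree.Branch("event",&evt);         // a branch of CustomEvent objects
  for (int i = 0; i < 10; ++i) {
    evt->set_eventID(i);
    evt->set_eventEnergy(125.0 + i); // made-up energies
    tree.Fill();
  }
  tree.Write();
  out.Close();
}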

Important Gotchas

At some point you’ll get stuck. Hopefully you’ll be stuck on a good problem, but more often than not you’ll be stuck on some quirk that ROOT has. Remember ROOT’s mantra: ”It’s not a bug, it’s a feature!”

ROOT’s object protocol is very strange. The naming scheme is based on an industry standard for C programs, where it’s not possible to use namespaces. The result is very confusing for new users (very good for HEP). Every object in ROOT that can be written to disk (ie saved in a ROOT file) derives from a TObject base class. This base class defines a protocol for objects (all objects can print themselves, have a name, have a title, have a class name, etc). This makes it possible to have a list of disparate objects (as long as it’s a list of TObject*). As you gain more experience with ROOT, this becomes a power tool. Like any power tool (as Tim Taylor can attest), it can be abused to no end.

TTrees

Drawing trees

When you call TTree::Draw to draw multi-dimensional histograms, the order of the axes is “z:y:x” rather than the expected “x:y:z”.
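
For example, here is a sketch (assuming a TTree* tree with branches x and y) of booking a 2D histogram with x on the horizontal axis and y on the vertical axis:

tree->Draw("y:x>>h2(100,0,100,100,0,100)"); // vertical axis first: “y:x”, not “x:y”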

Caching trees

Use a TTreeCache when looping over trees rather than the bare TTree::GetEntry(i) idiom. A TTreeCache learns which branches you access most often and caches them, speeding up your processing time significantly (a sketch of turning it on follows the excerpt below). Documentation is here. Since you won’t read to the end (no one does…), here are the docs on when not to use a TTreeCache:

  SPECIAL CASES WHERE TreeCache should not be activated


 When reading only a small fraction of all entries such that not all branch
 buffers are read, it might be faster to run without a cache.


 HOW TO VERIFY That the TreeCache has been used and check its performance


Once your analysis loop has terminated, you can access/print the number
of effective system reads for a given file with a code like
(where TFile* f is a pointer to your file)

 printf("Reading %lld bytes in %d transactions\n",f->GetBytesRead(),  f->GetReadCalls());
Splitting Trees

If you want to split a TTree into $n$ statistically independent parts use something like:

TTree* outTree=tree->CopyTree("Entry$%n==i");

Here, n is the number of parts requested and i selects the i’th part. If you’re just splitting it in half, a full-blown macro (from the trenches) would look like:

// A macro to split a tree
{
  TFile* file=TFile::Open("./merged_dijets.root");
  TTree* tree=(TTree*)file->Get("micro");
  TFile* fileA=new TFile("UnfoldingStudy.dijets-pt1.root","RECREATE");
  fileA->cd();
  TTree* treeA=tree->CopyTree("Entry$%2==0");
  treeA->Write();
  fileA->Write();
  fileA->Close();
  TFile* fileB=new TFile("UnfoldingStudy.dijets-pt2.root","RECREATE");
  fileB->cd();
  TTree* treeB=tree->CopyTree("Entry$%2==1");
  treeB->Write();
  fileB->Write();
  fileB->Close();
  
}

TH1

Despite the name, TH1 is the base class for all histograms. This can lead to much !FUN!. Be extra wary of null pointers when handling TH1’s of unknown origin.

TH2

Splitting a 2D

One would expect an interface method like TH2D::Split(), but instead you need to use the appropriate THStack constructor: THStack(TH1* hist, Option_t* axis = "x", ...)

Then call THStack::GetHists to get a TList of the histograms. Of course, you’ll have to use ROOT’s idioms for iterating over lists (a sketch follows).
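
A sketch of that route (hist2d stands in for your TH2):

THStack stack(hist2d,"x");     // one projection per bin along x
TIter next(stack.GetHists());
TH1* slice = NULL;
while((slice = (TH1*)next())) {
  // do something with each slice
}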

Another, slightly more direct option is TH2::ProjectionX() and TH2::ProjectionY(), used in the following fashion:

std::vector<TH1D*> split_list;
for(int i = 1; i <= Hist2D->GetNbinsX(); i++) {
  // Project the y distribution of the i'th x bin; "e" also computes the errors.
  split_list.push_back(Hist2D->ProjectionY("_py", i, i, "e"));
}

Depending on your ROOT version, you may have to change the "_py" string to be something unique. This can be done as follows:

std::vector<TH1D*> split_list;
char buff[256];
for(int i = 1; i <= Hist2D->GetNbinsX(); i++) {
  snprintf(buff, sizeof(buff), "%s_%d_py", Hist2D->GetName(), i);
  split_list.push_back(Hist2D->ProjectionY(buff, i, i, "e"));
}

User Beware: not using a unique name can cause unexpected behavior when drawing the list of slices.

THStack

THStack is not mostly harmless. In fact, if you want to do reasonable things with THStack, it probably won’t work, and if it does, it may work once and not the second time. If you can avoid THStack, then do it.

Get Sum of Stack

To get a histogram representing the sum of a stack use THStack::GetStack()->Last()
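
A sketch (stack stands in for your THStack*):

TH1* total = (TH1*)stack->GetStack()->Last(); // the cumulative sum of everything added
total->Draw("hist");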

A pointer here, a pointer there, who am I?

If you want to iterate over the stack, there are two ways to get the underlying objects. They are not equal. Option 1:

THStack* stack = new THStack("Stack","A stack of histograms");
// later on ...
TIter next(stack->GetHists());
TH1* hist = NULL;
while((hist=dynamic_cast<TH1*>(next()))){
  // Do something with *original* histograms added to stack
}

Option 2:

TH1* hist = NULL;
for(int i = 0; i < stack->GetStack()->GetEntries(); i++){
  hist = dynamic_cast<TH1*>(stack->GetStack()->At(i));
  // Do something with histograms to be painted on the canvas
}

These two methods appear equal but are not. The TList from THStack::GetHists gives you the original pointers (ie those that you added with THStack::Add). The TObjArray from THStack::GetStack gives you the internal, cumulatively summed histograms that will actually be painted on the canvas.

TFile

TFiles are greedy about object ownership. In fact, object ownership in ROOT is a very common bug feature. Many times you’ll find you own objects you thought you didn’t (memory leak), or you’ll delete objects that ROOT also plans to delete (double-free core dump).

The rule of thumb to keep in mind: TObjects created after a file is opened are owned by the most recently opened file (more precisely, by the current directory, gDirectory).

Contrast:

TH1F hist("hist","Higgs Discovery Plot", 50,0,200);
TFile output("discovery.root","RECREATE");
output.Close();

with:

TFile output("discovery.root","RECREATE");
TH1F hist("hist","Higgs Discovery Plot", 50,0,200);
output.Close();

In the former, the file discovery.root will be empty. In the latter, it will contain a copy of hist.

This can get really hairy when you’re dealing with pointers. Therefore (instead of being a responsible programmer), the best approach to managing memory in ROOT is to not manage it until you have to.
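
If you do need a histogram to stay out of a file’s clutches (or to outlive it), you can detach it explicitly. A sketch, where hist stands in for whatever histogram you are worried about:

hist->SetDirectory(0);     // detach this one histogram; deleting it is now your problem
TH1::AddDirectory(kFALSE); // or stop ROOT attaching *new* histograms to the current file at all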

Recreate, create, new, update

From the ROOT docs:

If option = NEW or CREATE   create a new file and open it for writing,
                             if the file already exists the file is
                             not opened.
           = RECREATE        create a new file, if the file already
                             exists it will be overwritten.
           = UPDATE          open an existing file for writing.
                             if no file exists, it is created.
           = READ            open an existing file for reading (default).

Important: Recreating will destroy the file if it exists. BE CAREFUL when you use this option!

Extracting ACLiCs compilation steps

Here’s a tip you hope you never need. The use case is a standalone program for which you know you need to invoke rootcling to generate dictionaries, but when you follow the correct prescription the produced dictionary fails even when linked against your code. What you can do is follow the “loader.C prescription” and then dump what ACLiC is doing under the hood with:

root [7] gDebug=7
(const int)7
root [8] .L libLoader.C++

This will dump all of the g++ and rootcling calls that are required to generate a working dictionary.

Debugging with ROOT

Eventually you’ll encounter a segmentation fault or segfault in ROOT. (They can also go under the name core dump). This happens when you try to read, write, or otherwise abuse a part of memory that doesn’t belong to you. The result is that the program (ROOT usually) crashes. ROOT has gotten pretty good at realizing that this has happened, and printing information about what was going on when the crash happened.

A “crash” course on reading a stack trace

Here’s a stack trace from a real, live analysis program (SFrame in this case):

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x0000003ba7e9a075 in waitpid () from /lib64/libc.so.6
#1  0x0000003ba7e3c741 in do_system () from /lib64/libc.so.6
#2  0x00002b4f86156256 in TUnixSystem::StackTrace() ()
   from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libCore.so
#3  0x00002b4f86155b2c in TUnixSystem::DispatchSignals(ESignals) ()
   from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libCore.so
#4  <signal handler called>
#5  0x00002b4f93f9a47d in UnfoldingStudy::ExecuteEvent(SInputData const&, double) () from /home/dmb60/bFrame/SFrame/lib/libMiniReaders.so
#6  0x00002b4f85c33616 in SCycleBaseExec::Process(long long) ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#7  0x00002b4f8a62e1e0 in TTreePlayer::Process(TSelector*, char const*, long long, long long) ()
   from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libTreePlayer.so
#8  0x00002b4f85c47ce8 in SCycleController::ExecuteNextCycle() ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#9  0x00002b4f85c43872 in SCycleController::ExecuteAllCycles() ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#10 0x000000000040226c in main ()
===========================================================


The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at
http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x00002b4f93f9a47d in UnfoldingStudy::ExecuteEvent(SInputData const&, double) () from /home/dmb60/bFrame/SFrame/lib/libMiniReaders.so
#6  0x00002b4f85c33616 in SCycleBaseExec::Process(long long) ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#7  0x00002b4f8a62e1e0 in TTreePlayer::Process(TSelector*, char const*, long long, long long) ()
   from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libTreePlayer.so
#8  0x00002b4f85c47ce8 in SCycleController::ExecuteNextCycle() ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#9  0x00002b4f85c43872 in SCycleController::ExecuteAllCycles() ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#10 0x000000000040226c in main ()
===========================================================



The numbered lines followed by the memory addresses (the 64-bit hex numbers) represent the order in which each function was called. The most recent call is at the top of the list. Since this code was running a single thread, there is only one stack trace. If there were multiple threads, there would be a trace for each thread. Typically the fastest route to a user-called function is to look at the portion:

The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at
http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x00002b4f93f9a47d in UnfoldingStudy::ExecuteEvent(SInputData const&, double) () from /home/dmb60/bFrame/SFrame/lib/libMiniReaders.so
#6  0x00002b4f85c33616 in SCycleBaseExec::Process(long long) ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#7  0x00002b4f8a62e1e0 in TTreePlayer::Process(TSelector*, char const*, long long, long long) ()
   from /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/root/5.34.07-x86_64-slc5-gcc4.3/lib/libTreePlayer.so
#8  0x00002b4f85c47ce8 in SCycleController::ExecuteNextCycle() ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#9  0x00002b4f85c43872 in SCycleController::ExecuteAllCycles() ()
   from /home/dmb60/bFrame/SFrame/lib/libSFrameCore.so
#10 0x000000000040226c in main ()
===========================================================

This strips out the system calls that clutter the full trace, and the top frame (#5 in this case) is the most recent user-defined function on the stack (UnfoldingStudy::ExecuteEvent). This means that somewhere in that function, someone tried to access memory they shouldn’t have.

You can trace the whole program from the main() invocation. To gain more insight into all the information contained in the stack trace, it is very useful to go through a gdb tutorial.

Also, see Getting Help for more problem solving strategies before filing a bug report. Remember: you probably just found a feature, not a bug.

Using gdb

Here are some good gdb (GNU Debugger) tutorials

Valgrind

If you have memory related problems, you should be aware of valgrind. The quick-start is here.

Physics

At some point (not necessarily right away) a hitchhiker will want to better understand the physics underlying the research he/she is doing. There are numerous textbooks on the subject, but a few stand out as particularly good.

Study Guide For Griffiths

If you’ve never touched HEP or heard of Particle Physics, read chapters 1 and 2. Otherwise, here’s a rough path through the book (with suggested exercises to work):

  • Chapter 3 (Problems 3.4, 3.14 (for fun), 3.15, 3.16, 3.25, 3.26 (last two cover Mandelstam variables, which are very useful tools))
  • Chapter 6 (Problems 6.8 or 6.9, 6.12, 6.13, 6.14, 6.15)
  • Chapter 7 (Problems 7.6, 7.8 (optional), 7.14, 7.23, 7.30, 7.36, 7.37, 7.39, 7.51)
  • Chapter 8 (Problems 8.14, 8.15, 8.16, 8.19, 8.23, 8.28)
  • Chapter 9 (Problems 9.2, 9.3, 9.6, 9.14, 9.17, 9.20, 9.23, 9.25, 9.31, 9.32)
  • Chapter 10 (Problems 10.4 10.5, 10.6, 10.15, 10.16, 10.21, 10.23)

Halzen and Martin have a better treatment of Quantum Chromodynamics, but that just adds another book to your library.

This assumes the reader has covered the material in chapter 4 in an undergraduate quantum course (at the level of Griffiths). Chapter 4 is too much of a review to be useful as a primary source. Griffiths’s quantum book covers it, but Townsend’s “A Modern Approach to Quantum Mechanics” is a better text to have on your bookshelf.

Coordinate Systems used in HEP

HEP uses cylindrical coordinates (well, a combination of cylindrical and spherical really) with the z axis oriented along the beam line, R measured radially outward from the beam, and φ curling right-handed around the beam axis.

In addition, HEP physicists think in terms of a variable called “pseudorapidity”, denoted η. It is defined as $$ η = -\ln\left(\tan\frac{θ}{2}\right) $$ where θ is the polar angle measured from the beam axis. Why use this crazy coordinate system? Well, it’s closely related to rapidity, the relativistic counterpart to speed. More importantly for HEP experiments, differences in rapidity are invariant under boosts along the beam axis, and pseudorapidity coincides with rapidity for massless (or highly relativistic) particles.
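
If you ever need to convert numerically, it is two lines of code (a standalone sketch, not tied to any framework):

#include <cmath>

double etaFromTheta(double theta) { return -std::log(std::tan(theta/2.0)); } // pseudorapidity from polar angle
double thetaFromEta(double eta)   { return 2.0*std::atan(std::exp(-eta)); }  // and back again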

If you want to keep a mental map of angles (from wikipedia):

η      Location
0      “Straight up” (θ = π/2)
0.88   “Forty Five Degrees” (θ = π/4)
4.5    “Along the Beam Pipe” (θ ≈ 1.3°)
∞      “Beam Axis” (θ = 0)

Monte Carlo Event Weights

This is taken from The Editor’s Lab Notebook (which is locked by a password, and hence will only be read by one person):

Someone generates gobs of MC for use in any analysis of some process ($W → μν + p$, where $p$ is a parton). I want to study what a variable will look like in data, so I run my analysis over the MC and get a plot out. Then I run the same code over data and plot the two on top of each other. The trouble is that the number of MC events I ran over is not the same as the amount of data I ran over. Now the problem becomes how to appropriately scale the Monte Carlo prediction to match the data (the result of the experiment).

As experimental high energy physicists, we define variables so that we can relate the cross sections predicted by theory to what we measure out of the beam. To this end, we can get the number of expected events from \begin{equation} N_{D} = σ \mathcal{L} \end{equation} Here $σ$ is the cross section in barns (typically pico-barns) and $\mathcal{L}$ is the integrated luminosity collected by the experiment. How do we compare this to the Monte Carlo prediction? There, the computer counts up the number of events that were generated for a specific process, so what we want is \begin{equation} N_{exp} = W N_{MC} \end{equation} Here $N_{exp}$ is the expected number of events we get from data. In order to compare data to MC we need to scale $N_{exp}$ to the same order as $N_{D}$. Remember, there is a cross section calculated from theory that the MC prediction used, therefore we can write \begin{equation} N_{exp} = σ_{MC} \mathcal{L} \end{equation} It is possible that $σ_{MC}$ has since been corrected to the next order. In order to avoid recalculating everything, a $k$ factor is reported as the ratio $k = σ_{NLO}/σ_{MC}$. Then all you have to do is multiply $σ_{MC}$ by the $k$ factor, and the calculation is updated to the newest NLO prediction.

We’re interested in the weight, so \begin{equation} W = \frac{σ_{MC}\, k\, \mathcal{L}}{N_{MC}} \end{equation} Now, when each bin in the Monte Carlo histogram is scaled by $W$, it will be equal to the theoretically expected yield calculated from $σ_{MC}\, k\, \mathcal{L}$.
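
In code, the whole business is one Scale call. A sketch with made-up numbers (histMC stands in for the Monte Carlo histogram you want to compare with data):

double sigmaMC = 10.4e3;  // generator (LO) cross section in pb (made up)
double k       = 1.17;    // NLO/LO k factor (made up)
double lumi    = 20.3e3;  // integrated luminosity in pb^-1 (made up)
double nMC     = 5.0e6;   // number of generated MC events (made up)

double W = sigmaMC * k * lumi / nMC;
histMC->Scale(W);         // each bin now shows the expected yield in data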

Drawing Feynman Diagrams Digitally

There are many possibilities. The most “user friendly” is probably JaxoDraw. It’s even got a name reminiscent of Douglas Adams!

JaxoDraw comes as a jar file; you may want to make it more command-line friendly by making a script to invoke it (put this in ${HOME}/local/bin/jaxodraw):

#!/bin/bash
java -jar ${HOME}/local/lib/java/jaxodraw-2.1-0.jar "$@"

Now make it executable:

chmod +x ${HOME}/local/bin/jaxodraw

And add it to your PATH. I have mine set up to search this directory from my .bashrc:

export PATH=${HOME}/local/bin:$PATH

Now test it out by typing jaxodraw at the command prompt; it should just launch. If it doesn’t, try again.

Someday you may be working on a presentation and want the figures exported by jaxodraw to have a transparent background. Some kind soul has hacked the conversion steps to do this for you. You can grab the scripts here: http://personal.psu.edu/jcc8//jd-conversions/

If you have jaxodraw setup as above you can just dump the scripts in $HOME/local/bin and run them on the xml file produced by jaxodraw.

What to do if you’ve lost a 2π

Calm down, take a deep breath, and read the first line of The Guide. Then come back here. Somewhere a fellow grad student has a copy of “Introduction to Elementary Particles (2nd Edition)” by David Griffiths. Read Chapter 6 in its entirety, paying special attention to the footnote on page 205.

Responsible Research

This is a topic better covered elsewhere. The part that overlaps with this guide is documenting the work you do. It is important that you keep a traceable paper trail of everything you do.

Lab Notebooks

When keeping a lab notebook, it is important to make the barrier for writing something down as small as possible. If it’s difficult or inconvenient, you will not be inclined to document as well as you should.

Pen/Paper Notebook

This is the standard in science, and it is perfectly applicable to HEP research. Keep it open and write in it as you work. Make sure there is a 1-to-1 mapping between what you write in your notebook and what can be found on a hard drive somewhere (plots, text, code, etc).

Org-mode Notebook

The Editor keeps an org-mode notebook. There is a thorough write-up here. (Editor’s note: I always wanted to flesh this out, but it became a joke to link to the repo and let it lie.)

Flat text file

One option is to keep a (well organized) text file for recording what you’re doing. Paste links and your thoughts here. Choose a markup language and write your entries in it. This way you are forced to keep more structure in the file, and you can export to whatever output formats your markup language supports.

Wiki/ELog

Some research groups reserve webspace for hosting private e-logs. These typically use wiki syntax, but they can also be bulletin-board style formatting. In any case, it’s possible to use a browser extension like “GhostText” to write in your favorite editor. One downside to this style of notebook is that it is typically awkward to post many plots at once.

Getting Help

For better or worse, eventually you will get stuck. This is Research! If you’re not stuck half the time you’re not doing it right!

Here’s a rough strategy for tackling problems:

  1. Google it. Don’t just Google specific errors; search related terms too. Eventually you will be able to make the search more and more general until you get the answer you want (try reading the Google guide).
  2. Read documentation. Many times the answer is buried deep inside the program. Don’t be afraid to crack open the source code and actually try to understand what’s going on (after exhausting any manuals or user guides available).
  3. Just take a deep breath, go get a cup of coffee and come back to the problem.
  4. If the coffee didn’t help, table the problem for the moment and work on something else. Most likely when you come back to it (today or tomorrow) the solution will be obvious.
    • Now may be an appropriate time for an email to the person you are working with directly. Chances are they’ve already encountered and solved your problem.
  5. If you’re still stuck, it may be time to post to a forum or mailing list. In general, when soliciting help from people you don’t know, it is polite to avoid contacting them directly.
    • Above all else, do not be a help vampire. (They do exist!)

FAQ

  • I tried: #include <TheAnswerIsFortyTwo.h> in C++ but it doesn’t work, what gives?

In short: #include <...> tells the compiler to search its include path (the system include directories plus any paths added with the -I flag), while #include "..." first searches the directory of the file doing the including and then falls back to that same search path. (Further information)

  • I was typing along and all of a sudden my terminal froze!? The only way to continue was to exit the program!

Most likely you accidentally typed “C-s” (holding the control key while pressing the s key). This sends the XOFF command to your terminal emulator. To fix it (and “unfreeze” your terminal), type “C-q”. More Info Here.

  • I have some information that would be useful for your guide, can you use it?

Yes! Please see Contributing.

  • I’m lost and confused, your guide is overwhelming and overbearing, but I really like HEP, what can I do?

Please don’t despair, with time things will come into focus. In the mean time, it will probably be useful to seek out other sources of information. If you’re having trouble finding material, you might try the Google guide.

  • I have a question not covered by this FAQ that doesn’t involve a research problem, what’s the best way to communicate it to you?

You can email me at hhg-svc@ccbar.us. If it is related to some technical issue and you already have a GitHub account, please open an issue on GitHub’s issue tracker for this text.