# MadMiner particle physics tutorial

# Appendix 1: Adding systematic uncertainties

Johann Brehmer, Felix Kling, Irina Espejo, and Kyle Cranmer 2018-2019

In this tutorial we'll explain how to add systematic uncertainties to the MadMiner workflow.

## (UNDER CONSTRUCTION)

## Preparations

Before you execute this notebook, make sure you have working installations of MadGraph, Pythia, and Delphes.

In [None]:
from __future__ import absolute_import, division, print_function, unicode_literals

import logging
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline

from madminer.core import MadMiner
from madminer.lhe import LHEReader
from madminer.sampling import combine_and_shuffle
from madminer.sampling import SampleAugmenter
from madminer import sampling


Please enter the path to your MG5 root directory here.

In [None]:
mg_dir = '/Users/johannbrehmer/work/projects/madminer/MG5_aMC_v2_6_4'

MadMiner uses the Python `logging` module to provide additional information and debugging output. You can choose how much of this output you want to see by switching the level in the following lines to `logging.DEBUG` or `logging.WARNING`.

In [None]:
# MadMiner output
logging.basicConfig(
    format='%(asctime)-5.5s %(name)-20.20s %(levelname)-7.7s %(message)s',
    datefmt='%H:%M',
    level=logging.INFO
)

# Output of all other modules (e.g. matplotlib)
for key in logging.Logger.manager.loggerDict:
    if "madminer" not in key:
        logging.getLogger(key).setLevel(logging.WARNING)
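
To make the effect of these two settings concrete, here is a minimal sketch that uses only the standard library. The logger names `madminer.demo` and `matplotlib.demo` and the `ListHandler` class are made up for illustration; nothing here calls MadMiner itself.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical helper: collects log records in a list instead of printing them."""

    def __init__(self):
        super(ListHandler, self).__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append("{} {}".format(record.levelname, record.getMessage()))


handler = ListHandler()

# Illustrative logger names, chosen to mimic a MadMiner logger and a third-party one
madminer_logger = logging.getLogger("madminer.demo")
other_logger = logging.getLogger("matplotlib.demo")

for demo_logger in (madminer_logger, other_logger):
    demo_logger.addHandler(handler)
    demo_logger.propagate = False  # keep the demo output away from the root logger

madminer_logger.setLevel(logging.INFO)   # like level=logging.INFO above
other_logger.setLevel(logging.WARNING)   # like the loop over all other modules

madminer_logger.debug("hidden: DEBUG is below INFO")
madminer_logger.info("shown: INFO passes")
other_logger.info("hidden: INFO is below WARNING")
other_logger.warning("shown: WARNING passes")

print(handler.messages)
```

Only the two "shown" messages end up in `handler.messages`. Switching the first level to `logging.DEBUG` would let the debug message through as well.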

## 1.-2. Parameters and benchmarks

We'll just load the MadMiner setup from the first part of this tutorial:

In [None]:
miner = MadMiner()
miner.load('data/madminer_example.h5')

## 3. Set up systematics, save settings, run MadGraph

This is where things become interesting: we want to model systematic uncertainties. Currently MadMiner supports two sources of such uncertainties, scale variations and PDF variations, which can be used separately or together. Here we only vary the scales, by the factors 0.5 and 2 relative to their nominal values:

In [None]:
miner.set_systematics(scale_variation=(0.5,2.), pdf_variation=None)
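
To build some intuition for why varying the scales changes event weights at all, here is a toy sketch. It is not what MadGraph's `systematics` tool computes: it assumes a weight proportional to $\alpha_s^2$, one-loop running of $\alpha_s$ with five flavours, and illustrative values for `ALPHA_S_MZ`, `M_Z`, and the nominal scale. PDF and factorization-scale effects are ignored entirely.

```python
import math

ALPHA_S_MZ = 0.118   # assumed reference value of the strong coupling at the Z mass
M_Z = 91.19          # Z boson mass in GeV
N_FLAVOURS = 5       # assumed number of active quark flavours


def alpha_s_one_loop(mu):
    """One-loop running strong coupling at scale mu (in GeV)."""
    b0 = (33.0 - 2.0 * N_FLAVOURS) / (12.0 * math.pi)
    return ALPHA_S_MZ / (1.0 + b0 * ALPHA_S_MZ * math.log(mu ** 2 / M_Z ** 2))


def toy_weight_ratio(scale_factor, nominal_scale=125.0, power=2):
    """Ratio of a toy weight ~ alpha_s**power at the varied scale to the nominal one."""
    varied = alpha_s_one_loop(scale_factor * nominal_scale)
    nominal = alpha_s_one_loop(nominal_scale)
    return (varied / nominal) ** power


for factor in (0.5, 1.0, 2.0):  # 0.5 and 2.0 mirror scale_variation=(0.5, 2.)
    print(factor, toy_weight_ratio(factor))
```

In this toy, a lower scale gives a larger coupling and therefore a larger weight, while a higher scale gives a smaller one. The spread between the two is what the scale variation treats as an uncertainty.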

Again, we save our setup:

In [None]:
miner.save('data/madminer_example_systematics.h5')

Now it's time to run MadGraph. MadMiner will instruct MadGraph to use its built-in `systematics` tool to calculate how the event weights change under the scale variation.

In [None]:
miner.run(
    sample_benchmark='sm',
    mg_directory=mg_dir,
    mg_process_directory='./mg_processes/signal_systematics',
    proc_card_file='cards/proc_card_signal.dat',
    param_card_template_file='cards/param_card_template.dat',
    run_card_file='cards/run_card_signal.dat',
    log_directory='logs/signal',
    python2_override=True,
)
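
Once the varied weights are available, they have to be turned into a smooth dependence on a nuisance parameter. The sketch below shows one simple way to do this: an exponential interpolation through the nominal weight at $\nu = 0$ and the two varied weights at $\nu = \pm 1$. Treat it as an illustration under that assumption rather than a description of MadMiner's internals; the function name and the example weights are made up.

```python
import math


def interpolate_weight(nu, w_nominal, w_up, w_down):
    """Weight at nuisance parameter nu, matching w_down, w_nominal, w_up at nu = -1, 0, +1.

    Uses w(nu) = w_nominal * exp(a * nu + b * nu**2), which keeps weights positive.
    """
    log_up = math.log(w_up / w_nominal)
    log_down = math.log(w_down / w_nominal)
    a = 0.5 * (log_up - log_down)   # odd (linear) part of the response
    b = 0.5 * (log_up + log_down)   # even (quadratic) part of the response
    return w_nominal * math.exp(a * nu + b * nu ** 2)


# Made-up weights for a single event: nominal and the two variations
w0, w_up, w_down = 1.00, 0.90, 1.15

for nu in (-1.0, 0.0, 0.5, 1.0):
    print(nu, interpolate_weight(nu, w0, w_up, w_down))
```

By construction the interpolation reproduces the three input weights exactly and gives intermediate values in between, so a single number $\nu$ per source of uncertainty is enough to describe the reweighting.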

## 4. Run smearing and extract observables

This works just as in the main tutorial, except that we now load the setup file that contains the systematics settings:

In [None]:
lhe = LHEReader('data/madminer_example_systematics.h5')

lhe.add_sample(
    lhe_filename='mg_processes/signal_systematics/Events/run_01/unweighted_events.lhe.gz',
    sampled_from_benchmark='sm',
    is_background=False,
    k_factor=1.1,
)

lhe.set_smearing(
    pdgids=[1,2,3,4,5,6,9,21,-1,-2,-3,-4,-5,-6],   # Partons giving rise to jets (21 is the gluon; 22 would be the photon)
    energy_resolution_abs=0.,
    energy_resolution_rel=0.1,
    pt_resolution_abs=None,
    pt_resolution_rel=None,
    eta_resolution_abs=0.1,
    eta_resolution_rel=0.,
    phi_resolution_abs=0.1,
    phi_resolution_rel=0.,
)

lhe.add_observable(
    'pt_j1',
    'j[0].pt',
    required=False,
    default=0.,
)
lhe.add_observable(
    'delta_phi_jj',
    'j[0].deltaphi(j[1]) * (-1. + 2.*float(j[0].eta > j[1].eta))',
    required=True,
)
lhe.add_observable(
    'met',
    'met.pt',
    required=True,
)

lhe.add_cut('(a[0] + a[1]).m > 124.')
lhe.add_cut('(a[0] + a[1]).m < 126.')
lhe.add_cut('pt_j1 > 30.')
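
As a rough picture of what the `*_resolution_abs` and `*_resolution_rel` arguments in `set_smearing()` mean, the sketch below smears a value with a Gaussian whose width is the sum of an absolute term and a term proportional to the true value. That width formula is an assumption based on the parameter names and should be checked against the MadMiner documentation; the helper `smear` is hypothetical.

```python
import random


def smear(true_value, resolution_abs, resolution_rel, rng):
    """Draw a smeared value from a Gaussian centred on true_value.

    Assumed width: resolution_abs + resolution_rel * |true_value|.
    """
    sigma = resolution_abs + resolution_rel * abs(true_value)
    if sigma == 0.0:
        return true_value  # zero resolution means no smearing
    return rng.gauss(true_value, sigma)


rng = random.Random(42)  # fixed seed so the example is reproducible

true_energy, true_eta = 80.0, 1.2
smeared_energy = smear(true_energy, 0.0, 0.1, rng)  # energy: 10% relative resolution
smeared_eta = smear(true_eta, 0.1, 0.0, rng)        # eta: absolute resolution of 0.1
print(smeared_energy, smeared_eta)
```

With the settings used in this notebook, an 80 GeV jet would then be smeared with a width of about 8 GeV, while pseudorapidity and azimuthal angle get a fixed width of 0.1.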

In [None]:
lhe.analyse_samples()
lhe.save('data/madminer_example_systematics_with_data.h5')
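
The `delta_phi_jj` expression defined above multiplies the azimuthal separation of the two leading jets by +1 or -1, depending on their ordering in pseudorapidity. The sketch below reproduces that arithmetic with plain floats. It assumes that `deltaphi` returns the difference wrapped into $[-\pi, \pi)$; the helper names and the jet values are made up.

```python
import math


def wrapped_delta_phi(phi_a, phi_b):
    """Difference phi_a - phi_b wrapped into the interval [-pi, pi)."""
    return (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi


def signed_delta_phi(eta_0, phi_0, eta_1, phi_1):
    """Same arithmetic as the observable string, for jets 0 and 1."""
    sign = -1.0 + 2.0 * float(eta_0 > eta_1)  # +1 if jet 0 has the larger eta, else -1
    return wrapped_delta_phi(phi_0, phi_1) * sign


# Made-up jets as (eta, phi)
jet_a = (2.1, 0.3)
jet_b = (-1.7, 2.9)

first = signed_delta_phi(jet_a[0], jet_a[1], jet_b[0], jet_b[1])
swapped = signed_delta_phi(jet_b[0], jet_b[1], jet_a[0], jet_a[1])
print(first, swapped)
```

Both calls give the same value: the sign factor always orients the difference from the jet with larger pseudorapidity to the one with smaller pseudorapidity, so away from the wrap-around point the observable does not depend on which jet happens to be the harder one.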

### A look at distributions

Let's see what our MC run produced:

In [None]:
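# plot_distributions is not part of the import cell at the top, so we import it here.
# Its keyword arguments match the ones used below.
from madminer.plotting import plot_distributions

_ = plot_distributions(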
    filename='data/madminer_example_systematics_with_data.h5',
    parameter_points=['sm', np.array([10.,0.])],
    line_labels=['SM', 'BSM'],
    uncertainties='none',
    n_bins=20,
    n_cols=3,
    normalize=True,
)

## 6. Make (unweighted) training and test samples with augmented data

In [None]:
sampler = SampleAugmenter('data/madminer_example_systematics_with_data.h5')
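
"Unweighted" in the heading means that the training samples are drawn from the weighted Monte Carlo events in such a way that each drawn sample counts equally. Here is a minimal sketch of that idea, assuming sampling with replacement with a probability proportional to the event weight. The function name and the toy events are made up, and MadMiner's own implementation may differ in detail.

```python
import random


def draw_unweighted(events, weights, n_samples, rng):
    """Draw n_samples events with replacement, with probability proportional to weight."""
    return rng.choices(events, weights=weights, k=n_samples)


rng = random.Random(0)  # fixed seed for reproducibility

# Made-up events (one observable value each) and their weights at some parameter point
events = [10.0, 20.0, 30.0]
weights = [0.1, 0.6, 0.3]

samples = draw_unweighted(events, weights, 10000, rng)
fractions = {x: samples.count(x) / float(len(samples)) for x in events}
print(fractions)
```

Because the weights depend on the physics parameters and, with systematics switched on, also on the nuisance parameters, changing either of them changes which events are likely to be drawn.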

When we generate training data, we now also have to specify the values of the nuisance parameters. In addition to the usual helper functions, `sampling.nominal_nuisance_parameters()` and `sampling.iid_nuisance_parameters()` can be used for this. The returned `theta0` and `theta1` now also include values for the nuisance parameters.

In [None]:
x, theta0, theta1, y, r_xz, t_xz, _ = sampler.sample_train_ratio(
    theta0=sampling.random_morphing_points(100, [('gaussian', 0., 15.), ('gaussian', 0., 15.)]),
    theta1=sampling.benchmark('sm'),
    nu0=sampling.iid_nuisance_parameters("gaussian", 0., 1.),
    nu1=sampling.nominal_nuisance_parameters(),
    n_samples=1000,
    folder='./data/samples',
    filename='train'
)
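
The difference between the two nuisance-parameter choices in this call can be sketched in a few lines. The functions below are hypothetical stand-ins, not MadMiner's API: they assume that "nominal" means the value 0 for every nuisance parameter, and that "iid" means an independent draw from the given Gaussian for every sample. A single nuisance parameter is assumed for the example.

```python
import random


def nominal_nuisance(n_samples, n_nuisance):
    """Every sample uses the assumed nominal value 0 for each nuisance parameter."""
    return [[0.0] * n_nuisance for _ in range(n_samples)]


def iid_gaussian_nuisance(n_samples, n_nuisance, mean, std, rng):
    """Every sample gets its own independent Gaussian draw per nuisance parameter."""
    return [[rng.gauss(mean, std) for _ in range(n_nuisance)] for _ in range(n_samples)]


rng = random.Random(7)  # fixed seed for reproducibility

nu1 = nominal_nuisance(n_samples=3, n_nuisance=1)
nu0 = iid_gaussian_nuisance(n_samples=3, n_nuisance=1, mean=0.0, std=1.0, rng=rng)
print(nu1)
print(nu0)
```

In the call above, the numerator hypothesis is therefore sampled over a spread of nuisance parameter values, while the reference hypothesis stays fixed at the nominal setup.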

To be continued...