# CMS Public tutorial example
[CMS](https://cms.cern/) is one of the Large Hadron Collider
([LHC](https://home.cern/science/accelerators/large-hadron-collider))
experiments at [CERN](https://home.cern/). It is a general-purpose detector, and
has been used to discover the Higgs boson, and to search for new physics beyond
the Standard Model.
The
[CMS Public Tutorial](https://ippog-static.web.cern.ch/ippog-static/resources/2012/cms-hep-tutorial.html)
can be performed in many ways, depending on the physics you are interested in.
For simplicity, we will focus on the
[Z boson](https://en.wikipedia.org/wiki/W_and_Z_bosons) in this example. We are
aiming to reproduce the outputs shown in the
[FAST-HEP example repository](https://github.com/FAST-HEP/examples/tree/main/cms/public_tutorial/example_outputs).
To perform this analysis, we will need to:
1. **Create variables** not present in the data: muon transverse momentum, muon
isolation, and number of isolated muons.
2. Select isolated muon pairs with opposite charge and **calculate the
invariant mass** of the pair.
3. **Create histograms** of the number of muons and number of isolated muons per
event.
4. **Select events** based on number of isolated muons, a trigger decision, and
the transverse momentum of the muons.
5. **Create histograms** of the invariant mass of the muon pair.
6. Present the results in a **publication-ready plot**.
In the following sections, we will go through each of these steps, and show how
to define them in a `fasthep-flow` workflow.
## Setup
## Preparing the data
The data for this tutorial are available on CERNBox:
```json
{
  "data.root": "https://cernbox.cern.ch/index.php/s/9QU53dsR2AQPxz8/download",
  "dy.root": "https://cernbox.cern.ch/index.php/s/x4cRGGXNvQ2ZnDy/download",
  "qcd.root": "https://cernbox.cern.ch/index.php/s/IYznddwu1oX9zpc/download",
  "single_top.root": "https://cernbox.cern.ch/index.php/s/8cwxZYSAz1QV83w/download",
  "ttbar.root": "https://cernbox.cern.ch/index.php/s/3s6Haj3SLGqPuAK/download",
  "wjets.root": "https://cernbox.cern.ch/index.php/s/tGNjygyJFvSs2Dc/download",
  "ww.root": "https://cernbox.cern.ch/index.php/s/dFaiOi8JJVzCN8L/download",
  "wz.root": "https://cernbox.cern.ch/index.php/s/W7hNNy47F7D8X80/download",
  "zz.root": "https://cernbox.cern.ch/index.php/s/CRlo8JP0Htvg4Dm/download"
}
```
You can download the files manually or with `fasthep-cli`:
```bash
fasthep download --json /path/to/json --destination /path/to/data
```
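If you prefer to script the manual download yourself, the following is a minimal sketch using only the Python standard library. The function name `download_all` and its `fetch` parameter are hypothetical and not part of `fasthep-cli`; the sketch only assumes the `{"filename": "url"}` layout of the JSON shown above.

```python
import json
import urllib.request
from pathlib import Path


def download_all(json_path, destination, fetch=urllib.request.urlretrieve):
    """Download every file listed in a {"filename": "url"} JSON mapping.

    `fetch` is any callable taking (url, target_path). It defaults to
    urllib's urlretrieve and can be swapped out, for example in tests.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with open(json_path, encoding="utf-8") as handle:
        urls = json.load(handle)

    downloaded = []
    for filename, url in urls.items():
        target = destination / filename
        if not target.exists():  # skip files that are already present
            fetch(url, str(target))
        downloaded.append(target)
    return downloaded
```

Calling `download_all("/path/to/json", "/path/to/data")` would then fetch each file once and skip anything already on disk.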
```{note}
While you can automate the data download and curator steps, we will do them manually in this example.
Both could be added as stages of the type `fasthep_flow.operators.bash.BashOperator`.
```
## Putting together the workflow
### Input data
The first step is to define the input data. In this case, we will use the output
from the fasthep-curator step and pass it to the first stage of the workflow.
```yaml
stages:
  - name: Input data
    type: fasthep_carpenter.operators.InputDataOperator
    kwargs:
      curator_config: "/path/to/curator.yaml"
      split_strategy: "file"
      split_kwargs:
        n: 1
      method: uproot5
```
Typically, only `name`, `type`, and `curator_config` are needed here, since the
other values are defaults. They are included for completeness.
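
To make the splitting options more concrete, here is a small sketch of one plausible reading of `split_strategy: "file"` with `n: 1`, namely that the input is processed in chunks of at most `n` files. This interpretation and the helper `split_by_file` are assumptions for illustration, not the `fasthep-carpenter` implementation.

```python
def split_by_file(files, n=1):
    """Group a list of input files into chunks of at most `n` files each."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [files[i:i + n] for i in range(0, len(files), n)]


# With n=1 every file becomes its own chunk.
chunks = split_by_file(["data.root", "dy.root", "ttbar.root"], n=1)
print(chunks)  # [['data.root'], ['dy.root'], ['ttbar.root']]
```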
### Creating variables
Adding onto the `stages`:
```yaml
  - name: Create variables
    type: fasthep_carpenter.operators.CreateVariablesOperator
    kwargs:
      variables:
        - name: Muon_Pt
          type: float32
          expr: "sqrt(Muon_Px ** 2 + Muon_Py ** 2)"
        - name: IsoMuon_Idx
          type: float32
          expr: "(Muon_Iso / Muon_Pt) < 0.10"
        - name: NIsoMuon
          type: int32
          expr: "count(IsoMuon_Idx)"
```
OK, there is a lot going on here. Let's break it down.
First, we use `fasthep_carpenter.operators.CreateVariablesOperator` for the
implementation and give it a few variables to create. Each variable is defined
with a `name`, a `type` (the data type of the result), and an `expr` (the
expression used to calculate it). The `expr` is a string that is evaluated using
[numexpr](https://github.com/pydata/numexpr) for simple expressions and
`fasthep_expr` for more complex expressions. It can use any of the variables in
the input data (and, as `IsoMuon_Idx` shows, variables created earlier in the
same list), and can use any of the functions in
[fasthep-carpenter](https://fast-hep.github.io/fasthep-carpenter/).
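
To see what the three expressions compute, here is a plain-Python sketch for a single event. It does not use `numexpr` or `fasthep_expr`; the helper `create_variables` and the toy numbers are invented for illustration, and a real workflow would operate on whole arrays of events rather than one dictionary.

```python
import math

# One toy event with three muons; the numbers are invented for illustration.
event = {
    "Muon_Px": [30.0, -3.0, 12.0],
    "Muon_Py": [40.0, 4.0, -5.0],
    "Muon_Iso": [2.0, 1.5, 0.65],
}


def create_variables(event, iso_cut=0.10):
    """Add Muon_Pt, IsoMuon_Idx and NIsoMuon to a single event."""
    # Transverse momentum: sqrt(px^2 + py^2) for each muon.
    pt = [math.hypot(px, py) for px, py in zip(event["Muon_Px"], event["Muon_Py"])]
    # Relative isolation below the cut marks a muon as isolated.
    iso_mask = [iso / p < iso_cut for iso, p in zip(event["Muon_Iso"], pt)]
    # Counting the True entries gives the number of isolated muons.
    return {**event, "Muon_Pt": pt, "IsoMuon_Idx": iso_mask, "NIsoMuon": sum(iso_mask)}


result = create_variables(event)
```

Here the first and third muons pass the isolation cut, so `result["NIsoMuon"]` is 2.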
### Selecting muon pairs
Next, we want to select muon pairs and calculate their invariant mass. We will
use the `fasthep_carpenter.operators.DiObjectMass` for this:
```yaml
  - name: Muon Invariant Mass
    type: fasthep_carpenter.operators.DiObjectMass
    kwargs:
      four_momenta: ["Muon_Px", "Muon_Py", "Muon_Pz", "Muon_E"]
      output: "DiMuonMass"
      when:
        all:
          - "NIsoMuon >= 2"
          - "Muon_Charge[0] == -Muon_Charge[1]"
```
```{note}
There is also a more general `fasthep_carpenter.operators.InvariantMassOperator` that can be used to calculate the invariant mass of more than two objects.
```
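For reference, the quantity computed in this stage is the standard two-body invariant mass, `m = sqrt((E1 + E2)^2 - |p1 + p2|^2)` in natural units. The sketch below shows that calculation in plain Python; `dimuon_mass` and `passes_when` are hypothetical helpers written for this example and say nothing about how `DiObjectMass` is implemented.

```python
import math


def dimuon_mass(muon1, muon2):
    """Invariant mass of two particles given as (px, py, pz, E) tuples."""
    px, py, pz, energy = (a + b for a, b in zip(muon1, muon2))
    mass_squared = energy**2 - (px**2 + py**2 + pz**2)
    # Rounding can push mass_squared slightly below zero for light-like sums.
    return math.sqrt(max(mass_squared, 0.0))


def passes_when(n_iso_muon, charges):
    """Mirror of the `when` block: two isolated muons with opposite charge."""
    return n_iso_muon >= 2 and charges[0] == -charges[1]


# Two back-to-back, effectively massless muons with 45.6 GeV each.
mass = dimuon_mass((0.0, 0.0, 45.6, 45.6), (0.0, 0.0, -45.6, 45.6))
```

With these toy inputs the momenta cancel and `mass` is the sum of the two energies, 91.2 GeV.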
### Creating histograms
There are two places in this analysis example where we want to create
histograms: before the selection and after. Let's start with the first
histogram: the number of muons and number of isolated muons per event. We
already have the definitions of these variables, so we can use them directly:
```yaml
  - name: Histograms before selection
    type: fasthep_carpenter.operators.HistogramOperator
    kwargs:
      histograms:
        - name: NMuon
          input: "NMuon"
          edges: [0, 1, 2, 3, 4, 5]
        - name: NIsoMuon
          input: "NIsoMuon"
          edges: [0, 1, 2, 3, 4, 5]
      weights: ["EventWeight"]
```
The `fasthep_carpenter.operators.HistogramOperator` takes a list of histograms
to create. Each histogram is defined with a `name`, an `input` (the variable to
histogram), and either `edges` (explicit bin edges) or `bins` (a `low`, `high`,
`nbins` specification, used later in this example). The `input` can be any
variable defined in the workflow, and the `edges` can be any list of numbers.
The `weights` are optional, and can be any list of variables defined in the
workflow.
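
As an illustration of what filling a weighted histogram with explicit `edges` involves, here is a standard-library sketch. The function `fill_histogram` is hypothetical, and its treatment of out-of-range values (they are dropped) is an assumption of this sketch rather than documented `HistogramOperator` behaviour.

```python
from bisect import bisect_right


def fill_histogram(values, edges, weights=None):
    """Return weighted bin contents for `values` using explicit bin `edges`.

    Bins are half-open, [low, high); values outside the edges are dropped.
    That overflow treatment is an assumption of this sketch.
    """
    if weights is None:
        weights = [1.0] * len(values)
    contents = [0.0] * (len(edges) - 1)
    for value, weight in zip(values, weights):
        if edges[0] <= value < edges[-1]:
            # bisect_right finds the first edge above the value.
            contents[bisect_right(edges, value) - 1] += weight
    return contents


n_iso_muon = [0, 1, 1, 2, 7]
event_weight = [1.0, 0.5, 0.5, 2.0, 1.0]
contents = fill_histogram(n_iso_muon, [0, 1, 2, 3, 4, 5], event_weight)
```

Each entry of `contents` is the sum of the weights of the events falling into that bin, so unweighted filling is simply the special case where every weight is 1.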
```{note}
All histograms will be created in a folder named after the stage, with spaces replaced by `_`. All histogram names will be prefixed with `hist_`. This behaviour can be changed by setting `histogram['folder_rule']` and `histogram['prefix']` in the `global` section of the `fasthep-flow` configuration file.
```
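The default naming rule from the note can be written out as a two-line sketch. The helper `histogram_location` is hypothetical and only restates the rule described above; how the folder and name are combined in the output file is not specified here.

```python
def histogram_location(stage_name, histogram_name, prefix="hist_"):
    """Return the (folder, name) pair described by the default naming rule."""
    folder = stage_name.replace(" ", "_")
    return folder, f"{prefix}{histogram_name}"


location = histogram_location("Histograms before selection", "NMuon")
print(location)  # ('Histograms_before_selection', 'hist_NMuon')
```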
### Selecting events
For this example the selection is very simple: make sure a High Level Trigger
(HLT) path was fired, that there are at least two isolated muons in the event,
and that the first muon in the event has a transverse momentum above 25 GeV. We
can use the `fasthep_carpenter.operators.SelectorOperator` for this:
```yaml
  - name: Select events
    type: fasthep_carpenter.operators.SelectorOperator
    kwargs:
      when:
        all:
          - "triggerIsoMu24 == 1"
          - "NIsoMuon >= 2"
          - "first(Muon_Pt) > 25"
```
The `fasthep_carpenter.operators.SelectorOperator` takes the conditions used to
select events under the `when` key. The conditions are grouped under `all`
(every condition must pass) or `any` (at least one must pass), and each
condition can use any variable defined in the workflow. The grouping is
optional, and defaults to `all`. Selection stages are special, since they also
keep track of the number of events before and after the selection.
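
The `all`/`any` logic and the event counting can be sketched in a few lines of plain Python. Everything here (`select_events`, the lambda conditions, the toy events) is made up for illustration; the real operator evaluates the string expressions from the YAML on arrays of events.

```python
def select_events(events, conditions, mode="all"):
    """Keep events passing `conditions`, combined with `all` or `any`.

    Returns the selected events and a small cutflow dictionary.
    """
    combine = {"all": all, "any": any}[mode]
    selected = [ev for ev in events if combine(cond(ev) for cond in conditions)]
    return selected, {"before": len(events), "after": len(selected)}


conditions = [
    lambda ev: ev["triggerIsoMu24"] == 1,
    lambda ev: ev["NIsoMuon"] >= 2,
    # Stands in for first(Muon_Pt) > 25; events without muons fail.
    lambda ev: len(ev["Muon_Pt"]) > 0 and ev["Muon_Pt"][0] > 25,
]

events = [
    {"triggerIsoMu24": 1, "NIsoMuon": 2, "Muon_Pt": [40.0, 30.0]},
    {"triggerIsoMu24": 0, "NIsoMuon": 2, "Muon_Pt": [40.0, 30.0]},
    {"triggerIsoMu24": 1, "NIsoMuon": 0, "Muon_Pt": []},
    {"triggerIsoMu24": 1, "NIsoMuon": 2, "Muon_Pt": [20.0, 10.0]},
]

selected, cutflow = select_events(events, conditions, mode="all")
```

Only the first toy event satisfies all three conditions, so `cutflow` records four events before the selection and one after.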
### Creating histograms after selection
It is now time to create histograms of the invariant mass of the muons after the
selection. We can use the `fasthep_carpenter.operators.HistogramOperator` again:
```yaml
  - name: Histograms after selection
    type: fasthep_carpenter.operators.HistogramOperator
    kwargs:
      histograms:
        - name: DiMuonMass
          input: "DiMuonMass"
          bins: { low: 60, high: 120, nbins: 60 }
      weights: ["EventWeight"]
```
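The `bins` form is a compact alternative to listing `edges`. Assuming it describes equal-width bins between `low` and `high` (an assumption of this sketch, with a hypothetical helper `edges_from_bins`), it expands to explicit edges as follows:

```python
def edges_from_bins(low, high, nbins):
    """Expand a {low, high, nbins} specification into nbins + 1 equal-width edges."""
    width = (high - low) / nbins
    return [low + i * width for i in range(nbins + 1)]


edges = edges_from_bins(low=60, high=120, nbins=60)
```

For the values used above this gives 61 edges from 60 to 120, that is, 60 bins of width 1 around the Z boson mass.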
### Output data
Finally, we want to save the output of the workflow. We can use the
`fasthep_carpenter.operators.OutputDataOperator` for this:
```yaml
  - name: Output data
    type: fasthep_carpenter.operators.OutputDataOperator
    kwargs:
      path: "/path/to/output"
      method: uproot4
```
The `fasthep_carpenter.operators.OutputDataOperator` takes a `path` and a
`method`. The `path` is the path to the output file(s), and the `method` is the
method to use to write the output file. The `method` can be any method supported
by [fasthep-carpenter](https://fast-hep.github.io/fasthep-carpenter/).
### Making paper-ready plots
The final step is to make a paper-ready plot. We will use the
`fasthep_flow.operators.bash.BashOperator` for this:
```yaml
  - name: Make paper-ready plot
    type: fasthep_flow.operators.bash.BashOperator
    kwargs:
      bash_command: |
        fasthep plotter \
          --input /path/to/output \
          --output /path/to/output/plots/
```
### Putting it all together