Experiments with end-to-end training #12

Merged on Jan 30, 2020

52 commits
97dc87d: start adding gnn (jmduarte, Nov 28, 2019)
747947b: Merge branch 'master' of github.com:jpata/particleflow (jmduarte, Nov 28, 2019)
fead6b5: add gnn to benchmarks (jmduarte, Nov 28, 2019)
9e7bf53: Update run_training.sh (jmduarte, Nov 28, 2019)
4247d5c: update graph_data and EdgeNet to include edge_attr and benchmarking (jmduarte, Dec 5, 2019)
8a7c654: add notebook for plotting (jmduarte, Dec 5, 2019)
bbca252: added first end-to-end training example (jpata, Jan 13, 2020)
380f3d0: up (jpata, Jan 13, 2020)
dd5df68: Merge branch 'gnn_jmd_v2' into endtoend_gnn (jpata, Jan 13, 2020)
0267ba6: up (jpata, Jan 13, 2020)
3f6bd14: Merge branch 'endtoend_gnn' of https://github.com/jpata/particleflow … (jpata, Jan 13, 2020)
af1046d: up (jpata, Jan 18, 2020)
3621e15: added end2end training examples (jpata, Jan 18, 2020)
d30b858: up (jpata, Jan 21, 2020)
6cc0a74: up (jpata, Jan 21, 2020)
ef87251: up (jpata, Jan 22, 2020)
37e1cfd: up (jpata, Jan 22, 2020)
d9b0299: up (jpata, Jan 22, 2020)
cdb38b6: up (jpata, Jan 22, 2020)
020ab79: cmdline args (jpata, Jan 22, 2020)
b2cddd4: added sequential conv (jpata, Jan 22, 2020)
092504a: added cls accuracy monitoring (jpata, Jan 22, 2020)
21abfe1: up (jpata, Jan 22, 2020)
68a57d7: remove additional edges (jpata, Jan 22, 2020)
628e5c0: add act (jpata, Jan 22, 2020)
6e19786: dataset location (jpata, Jan 22, 2020)
3bba476: elem id encoding, fix norm (jpata, Jan 23, 2020)
184b093: fix nans (jpata, Jan 23, 2020)
9fa2ab0: add ntest (jpata, Jan 23, 2020)
784ed1e: added num pred and true plotting: (jpata, Jan 23, 2020)
0b787c2: Merge branch 'endtoend_gnn' of https://github.com/jpata/particleflow … (jpata, Jan 23, 2020)
6d36494: added npy file saving (jpata, Jan 23, 2020)
4a1df05: switch to relu (jpata, Jan 23, 2020)
2fd750e: pfnet7 same setup as others (jpata, Jan 23, 2020)
c09a839: up (jpata, Jan 23, 2020)
ba999c9: loss coefs configurable (jpata, Jan 23, 2020)
9b6339b: added model to predict only id (jpata, Jan 23, 2020)
a9ba8ec: fix bugs with relabeling (jpata, Jan 23, 2020)
a706599: fix plot title (jpata, Jan 23, 2020)
23f708f: cosmetic (jpata, Jan 23, 2020)
b410b39: Merge branch 'endtoend_gnn' of https://github.com/jpata/particleflow … (jpata, Jan 23, 2020)
81d5d52: up (jpata, Jan 24, 2020)
fac6586: up (jpata, Jan 24, 2020)
fa53fcb: added sinkhorn loss (jpata, Jan 24, 2020)
e969a67: fixes (jpata, Jan 24, 2020)
53d7ff4: added reordering code (jpata, Jan 27, 2020)
926eef3: fix printout, reweighting (jpata, Jan 28, 2020)
283c31e: added class weighting (jpata, Jan 29, 2020)
5a9e791: update readme (jpata, Jan 29, 2020)
cd62a86: dropout configurable, simplify cross-check model (jpata, Jan 29, 2020)
49d7589: fix weight application (jpata, Jan 29, 2020)
31e8d07: update weights (jpata, Jan 30, 2020)
203 changes: 114 additions & 89 deletions README.md
@@ -1,96 +1,28 @@
Notes on modernizing CMS particle flow, in particular [PFBlockAlgo](https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/src/PFBlockAlgo.cc) and [PFAlgo](https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/src/PFAlgo.cc).

# Overview

- [x] set up datasets and ntuples for detailed PF analysis
- [ ] GPU code for existing PF algorithms
  - [x] test CLUE for element-to-block clustering
  - [ ] port CLUE to PFBlockAlgo in CMSSW
  - [ ] parallelize PFAlgo calls on blocks
  - [ ] GPU implementation of PFAlgo
- [ ] reproduce existing PF with machine learning
  - [x] test element-to-block clustering with ML (Edge classifier, GNN)
  - [x] test block-to-candidate regression
  - [ ] end-to-end training of elements to MLPF-candidates using GNNs
    - [x] first baseline training converges to multiclass accuracy > 0.96, momentum correlation > 0.9
    - [ ] improve training speed
    - [ ] detailed hyperparameter scan
    - [ ] further reduce bias in end-to-end training (muons, electrons, momentum tails)
- [ ] reconstruct genparticles directly from detector elements a la HGCAL, neutrino experiments etc.
  - [ ] set up datasets for regressing genparticles from elements
  - [ ] develop improved loss function for event-to-event comparison: EMD, GAN?

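
The two baseline figures quoted in the checklist (multiclass accuracy and momentum correlation) can be made concrete with a small sketch. It assumes, hypothetically, that every predicted candidate can be compared one-to-one with a target candidate and that the correlation is a Pearson coefficient; the function names and toy values are illustrative and are not taken from the repository.

```python
import math

def multiclass_accuracy(true_ids, pred_ids):
    """Fraction of candidates whose predicted class matches the target class."""
    if len(true_ids) != len(pred_ids):
        raise ValueError("inputs must have the same length")
    matches = sum(1 for t, p in zip(true_ids, pred_ids) if t == p)
    return matches / len(true_ids)

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy values (assumption): class labels and momenta for five candidates
true_ids = [211, 22, 130, 211, 13]
pred_ids = [211, 22, 130, 22, 13]
true_p = [1.0, 2.5, 4.0, 0.8, 10.0]
pred_p = [1.1, 2.4, 3.7, 0.9, 9.5]

accuracy = multiclass_accuracy(true_ids, pred_ids)
correlation = pearson_correlation(true_p, pred_p)
print(accuracy, correlation)
```

In this toy case four of the five class labels agree, and the predicted momenta track the targets closely.
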
## Presentations

- Caltech group meeting, 2020-01-28: https://indico.cern.ch/event/881683/contributions/3714961/attachments/1977131/3291096/2020_01_21.pdf
- CMS PF group, 2020-01-17: https://indico.cern.ch/event/862200/contributions/3706909/attachments/1971145/3279010/2020_01_16.pdf
- CMS PF group, 2019-11-22: https://indico.cern.ch/event/862195/contributions/3649510/attachments/1949957/3236487/2019_11_22.pdf
- CMS PF group, 2019-11-08: https://indico.cern.ch/event/861409/contributions/3632204/attachments/1941376/3219105/2019_11_08.pdf
- Caltech ML meeting, 2019-10-31: https://indico.cern.ch/event/858644/contributions/3623446/attachments/1936711/3209684/2019_10_07_pf.pdf
@@ -277,3 +209,96 @@ python3 test/graph.py step3_AOD_1.root
- candidates: [Ncand, Ncand_feat] for the output PFCandidate data
- candidate_block_id: [Ncand, ] for the PFAlgo-based block id
- step3_AOD_1_dist.npz: sparse [Nelem, Nelem] distance matrix from PFBlockAlgo between the elements
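
As a hedged illustration of how a per-candidate array such as `candidate_block_id` ties candidates back to blocks, the sketch below groups candidate indices by block id. The toy list stands in for the real `[Ncand, ]` array and the helper name `group_by_block` is hypothetical; how the repository itself consumes these arrays is not shown here.

```python
from collections import defaultdict

def group_by_block(candidate_block_id):
    """Map each block id to the list of candidate indices assigned to it."""
    groups = defaultdict(list)
    for icand, block_id in enumerate(candidate_block_id):
        groups[block_id].append(icand)
    return dict(groups)

# Toy stand-in (assumption): five candidates spread over three blocks
candidate_block_id = [0, 0, 1, 2, 1]
blocks = group_by_block(candidate_block_id)
print(blocks)
```
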

## Standard CMS offline PF
The following pseudocode illustrates how the standard offline PF works in CMS.

```python
import numpy as np
import networkx as nx  # assumption: stands in for the unspecified Graph/find_subgraphs helpers

# Inputs and outputs of Particle Flow
# elements: array of ECAL clusters, HCAL clusters, tracks etc., size Nelem
# candidates: array of produced particle flow candidates (pions, kaons, photons etc.)

# Intermediate data structures
# link_matrix: whether or not two elements are linked by having a finite distance (sparse, Nelem x Nelem)

def particle_flow(elements):

    #based on https://github.com/cms-sw/cmssw/tree/master/RecoParticleFlow/PFProducer/plugins/linkers
    link_matrix = compute_links(elements)

    #based on https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/src/PFBlockAlgo.cc
    blocks = create_blocks(elements, link_matrix)

    #based on https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/src/PFAlgo.cc
    candidates = []
    for block in blocks:
        #each block yields a list of candidates, so extend the flat list
        candidates.extend(create_candidates(block))

    return candidates

def compute_links(elements):
    Nelem = len(elements)

    #dense here for simplicity, the real link matrix is sparse
    link_matrix = np.zeros((Nelem, Nelem), dtype=np.int8)

    #test if two elements are close by based on neighborhood implemented with KD-trees
    for ielem in range(Nelem):
        for jelem in range(Nelem):
            if in_neighbourhood(elements, ielem, jelem):
                link_matrix[ielem, jelem] = 1

    return link_matrix

def in_neighbourhood(elements, ielem, jelem):
    #This element-to-element neighborhood checking is done based on detector geometry
    #e.g. here for TRK to ECAL: https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/plugins/linkers/TrackAndECALLinker.cc -> linkPrefilter
    return True

def distance(elements, ielem, jelem):
    #This element-to-element distance checking is done based on detector geometry
    #e.g. here for TRK to ECAL: https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/plugins/linkers/TrackAndECALLinker.cc -> testLink
    return 0.0

def create_blocks(elements, link_matrix):
    #Each block is a list of element indices, this is a list of blocks
    blocks = []

    Nelem = len(elements)

    #Elements and connections between the elements
    graph = nx.Graph()
    for ielem in range(Nelem):
        graph.add_node(ielem)

    #Check the distance between all relevant element pairs
    for ielem in range(Nelem):
        for jelem in range(Nelem):
            if link_matrix[ielem, jelem]:
                dist = distance(elements, ielem, jelem)
                #the linkers are assumed to return a negative distance for unlinked pairs
                if dist > -0.5:
                    graph.add_edge(ielem, jelem)

    #Find the sets of elements that are connected
    for subgraph in nx.connected_components(graph):
        blocks.append(sorted(subgraph))

    return blocks

def create_candidates(block):
    #find all HCAL-ECAL-TRK triplets, produce pions
    #find all HCAL-TRK pairs, produce kaons
    #find all ECAL-TRK pairs, produce pions
    #find all independent ECAL elements, produce photons
    #etc etc
    candidates = []
    return candidates
```
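
The block-building step in `create_blocks` amounts to finding the connected components of the link graph. The sketch below does this with a small union-find structure instead of a graph library. This is a swapped-in technique chosen to keep the example dependency-free, not the method used by PFBlockAlgo, and `find_blocks` and the toy link list are hypothetical.

```python
def find_blocks(num_elements, links):
    """Group element indices into blocks of mutually connected elements."""
    parent = list(range(num_elements))

    def find(i):
        # Walk up to the root, shortening the path along the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the two sets for every linked pair of elements
    for i, j in links:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    # Collect the elements that share a root into one block
    blocks = {}
    for i in range(num_elements):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())

# Toy input (assumption): six elements, three links
links = [(0, 1), (1, 2), (3, 4)]
blocks = find_blocks(6, links)
print(blocks)
```

Element 5 has no links, so it ends up in a block of its own.
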


## Acknowledgements

Part of this work was conducted at **iBanks**, the AI GPU cluster at Caltech. We acknowledge NVIDIA, SuperMicro and the Kavli Foundation for their support of **iBanks**.
Binary file added data/EdgeNet_14001_ca9bbfb3bb_jduarte.best.pth
Binary file not shown.
Binary file modified data/clustering.h5
Binary file not shown.
Binary file modified data/preprocessing.pkl
Binary file not shown.
Binary file modified data/regression.h5
Binary file not shown.
56 changes: 30 additions & 26 deletions notebooks/benchmarks.ipynb
@@ -47,11 +47,12 @@
"metadata": {},
"outputs": [],
"source": [
"def plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_glue, sample):\n",
"def plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, sample):\n",
" plt.figure(figsize=(5,5))\n",
" plt.scatter(df_blocks[\"num_blocks_true\"], df_blocks[\"num_blocks_pred\"], marker=\".\", label=\"Edge classifier\", alpha=0.5)\n",
" plt.scatter(df_blocks_dummy[\"num_blocks_true\"], df_blocks_dummy[\"num_blocks_pred\"], marker=\"x\", label=\"PFBlockAlgo\", alpha=0.5)\n",
" plt.scatter(df_blocks_glue[\"num_blocks_true\"], df_blocks_glue[\"num_blocks_pred\"], marker=\"^\", label=\"CLUE\", alpha=0.5)\n",
" plt.scatter(df_blocks_clue[\"num_blocks_true\"], df_blocks_clue[\"num_blocks_pred\"], marker=\"^\", label=\"CLUE\", alpha=0.5)\n",
" plt.scatter(df_blocks_gnn[\"num_blocks_true\"], df_blocks_gnn[\"num_blocks_pred\"], marker=\"^\", label=\"GNN\", alpha=0.5)\n",
" plt.xlim(0,5000)\n",
" plt.ylim(0,5000)\n",
" plt.plot([0,5000], [0,5000], color=\"black\", lw=1, ls=\"--\")\n",
@@ -68,12 +69,12 @@
"metadata": {},
"outputs": [],
"source": [
"def plot_block_size(df_blocks, df_blocks_dummy, df_blocks_glue, sample):\n",
"def plot_block_size(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, sample):\n",
" plt.figure(figsize=(5,5))\n",
" plt.scatter(df_blocks[\"max_block_size_true\"], df_blocks[\"max_block_size_pred\"], marker=\".\", label=\"Edge classifier\", alpha=0.3)\n",
" plt.scatter(df_blocks_dummy[\"max_block_size_true\"], df_blocks_dummy[\"max_block_size_pred\"], marker=\"x\", label=\"PFBlockAlgo\", alpha=0.3)\n",
" plt.scatter(df_blocks_glue[\"max_block_size_true\"], df_blocks_glue[\"max_block_size_pred\"], marker=\"^\", label=\"CLUE\", alpha=0.3)\n",
"\n",
" plt.scatter(df_blocks_clue[\"max_block_size_true\"], df_blocks_clue[\"max_block_size_pred\"], marker=\"^\", label=\"CLUE\", alpha=0.3)\n",
" plt.scatter(df_blocks_gnn[\"max_block_size_true\"], df_blocks_gnn[\"max_block_size_pred\"], marker=\"^\", label=\"GNN\", alpha=0.3)\n",
" plt.xlim(0,3000)\n",
" plt.ylim(0,3000)\n",
" plt.plot([0,3000], [0,3000], color=\"black\", lw=1, ls=\"--\")\n",
@@ -90,11 +91,12 @@
"metadata": {},
"outputs": [],
"source": [
"def plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_glue, sample):\n",
"def plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, sample):\n",
" plt.figure(figsize=(5,5))\n",
" plt.scatter(df_blocks[\"edge_precision\"], df_blocks[\"edge_recall\"], marker=\".\", alpha=0.5, label=\"Edge classifier\")\n",
" plt.scatter(df_blocks_dummy[\"edge_precision\"], df_blocks_dummy[\"edge_recall\"], marker=\"x\", alpha=0.5, label=\"PFBlockAlgo\")\n",
" plt.scatter(df_blocks_glue[\"edge_precision\"], df_blocks_glue[\"edge_recall\"], marker=\"^\", alpha=0.5, label=\"CLUE\")\n",
" plt.scatter(df_blocks_clue[\"edge_precision\"], df_blocks_clue[\"edge_recall\"], marker=\"^\", alpha=0.5, label=\"CLUE\")\n",
" plt.scatter(df_blocks_gnn[\"edge_precision\"], df_blocks_gnn[\"edge_recall\"], marker=\"^\", alpha=0.5, label=\"GNN\")\n",
"\n",
" plt.xlim(0,1.2)\n",
" plt.ylim(0,1.2)\n",
@@ -112,12 +114,13 @@
"metadata": {},
"outputs": [],
"source": [
"def plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_glue, sample):\n",
"def plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, sample):\n",
" plt.figure(figsize=(5,5))\n",
" b = np.logspace(0.1, 4, 40)\n",
" plt.hist(df_blocks[\"max_block_size_pred\"], bins=b, histtype=\"step\", lw=2, label=\"Edge classifier, m={0:.0f}\".format(np.mean(df_blocks[\"max_block_size_pred\"])));\n",
" plt.hist(df_blocks_dummy[\"max_block_size_pred\"], bins=b, histtype=\"step\", lw=2, label=\"PFBlockAlgo, m={0:.0f}\".format(np.mean(df_blocks_dummy[\"max_block_size_pred\"])));\n",
" plt.hist(df_blocks_glue[\"max_block_size_pred\"], bins=b, histtype=\"step\", lw=2, label=\"GLUE, m={0:.0f}\".format(np.mean(df_blocks_glue[\"max_block_size_pred\"])));\n",
" plt.hist(df_blocks_clue[\"max_block_size_pred\"], bins=b, histtype=\"step\", lw=2, label=\"CLUE, m={0:.0f}\".format(np.mean(df_blocks_clue[\"max_block_size_pred\"])));\n",
" plt.hist(df_blocks_gnn[\"max_block_size_pred\"], bins=b, histtype=\"step\", lw=2, label=\"GNN, m={0:.0f}\".format(np.mean(df_blocks_gnn[\"max_block_size_pred\"])));\n",
" plt.hist(df_blocks[\"max_block_size_true\"], bins=b, histtype=\"step\", lw=2, label=\"True blocks, m={0:.0f}\".format(np.mean(df_blocks[\"max_block_size_true\"])));\n",
" plt.xscale(\"log\")\n",
" plt.legend(frameon=False)\n",
@@ -134,12 +137,12 @@
"fl = glob.glob(\"../data/NuGun_run3/step3*.pkl\")\n",
"df_blocks = get_df(fl, \"blocks\")\n",
"df_blocks_dummy = get_df(fl, \"blocks_dummy\")\n",
"df_blocks_glue = get_df(fl, \"blocks_glue\")\n",
"df_blocks_clue = get_df(fl, \"blocks_clue\")\n",
"\n",
"plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_glue, \"NuGun-Run3\")\n",
"plot_block_size(df_blocks, df_blocks_dummy, df_blocks_glue, \"NuGun-Run3\")\n",
"plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_glue, \"NuGun-Run3\")\n",
"plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_glue, \"NuGun-Run3\")"
"plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"NuGun-Run3\")\n",
"plot_block_size(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"NuGun-Run3\")\n",
"plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"NuGun-Run3\")\n",
"plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"NuGun-Run3\")"
]
},
{
@@ -151,12 +154,12 @@
"fl = glob.glob(\"../data/QCD_run3/step3*.pkl\")\n",
"df_blocks = get_df(fl, \"blocks\")\n",
"df_blocks_dummy = get_df(fl, \"blocks_dummy\")\n",
"df_blocks_glue = get_df(fl, \"blocks_glue\")\n",
"df_blocks_clue = get_df(fl, \"blocks_clue\")\n",
"\n",
"plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_glue, \"QCD-Run3\")\n",
"plot_block_size(df_blocks, df_blocks_dummy, df_blocks_glue, \"QCD-Run3\")\n",
"plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_glue, \"QCD-Run3\")\n",
"plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_glue, \"QCD-Run3\")"
"plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"QCD-Run3\")\n",
"plot_block_size(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"QCD-Run3\")\n",
"plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"QCD-Run3\")\n",
"plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"QCD-Run3\")"
]
},
{
@@ -168,12 +171,13 @@
"fl = glob.glob(\"../data/TTbar_run3/step3*.pkl\")\n",
"df_blocks = get_df(fl, \"blocks\")\n",
"df_blocks_dummy = get_df(fl, \"blocks_dummy\")\n",
"df_blocks_glue = get_df(fl, \"blocks_glue\")\n",
"df_blocks_clue = get_df(fl, \"blocks_clue\")\n",
"df_blocks_gnn = get_df(fl, \"blocks_gnn\")\n",
"\n",
"plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_glue, \"TTbar-Run3\")\n",
"plot_block_size(df_blocks, df_blocks_dummy, df_blocks_glue, \"TTbar-Run3\")\n",
"plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_glue, \"TTbar-Run3\")\n",
"plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_glue, \"TTbar-Run3\")"
"plot_num_blocks(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"TTbar-Run3\")\n",
"plot_block_size(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"TTbar-Run3\")\n",
"plot_block_size_histo(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"TTbar-Run3\")\n",
"plot_precision_recall(df_blocks, df_blocks_dummy, df_blocks_clue, df_blocks_gnn, \"TTbar-Run3\")"
]
},
{
@@ -431,9 +435,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
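
The notebook plots `edge_precision` and `edge_recall` columns without showing how they are filled. One plausible definition, given here purely as an assumption, treats every pair of elements that share a block as an edge and compares the predicted edges with the true ones. The helper names and toy block ids below are hypothetical and are not taken from the repository.

```python
from itertools import combinations

def block_edges(block_ids):
    """Set of element index pairs (i < j) that share a block id."""
    return {
        (i, j)
        for i, j in combinations(range(len(block_ids)), 2)
        if block_ids[i] == block_ids[j]
    }

def edge_precision_recall(true_ids, pred_ids):
    """Precision and recall of predicted same-block pairs against true pairs."""
    true_edges = block_edges(true_ids)
    pred_edges = block_edges(pred_ids)
    correct = len(true_edges & pred_edges)
    precision = correct / len(pred_edges) if pred_edges else 0.0
    recall = correct / len(true_edges) if true_edges else 0.0
    return precision, recall

# Toy input (assumption): block id per element, true versus predicted
true_ids = [0, 0, 0, 1, 1]
pred_ids = [0, 0, 1, 1, 1]
precision, recall = edge_precision_recall(true_ids, pred_ids)
print(precision, recall)
```

Enumerating all pairs scales quadratically with the number of elements, so this form is only suitable for small illustrations.
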