Enhancements and modifications in the display app #44

Merged 3 commits on Oct 3, 2023
45 changes: 32 additions & 13 deletions README.md
@@ -10,9 +10,9 @@
4. [Event Visualization](#org44a4071)
1. [Setup](#orgc4a7ba6)
2. [Setup in local browser](#org288a700)
1. [1)](#orgb4acfa6)
2. [2)](#orgbafd8a5)
3. [Visualization in local browser](#org3f41a7d)
1. [2D display app](#orgb4acfa6)
2. [3D display app](#orgbafd8a5)
4. [Visualization with OpenShift OKD4](#org4164d71)
1. [Additional information](#orgab38b90)
5. [Cluster Radii Studies](#orga2b91d9)
@@ -40,6 +40,9 @@ This repository reproduces the CMS HGCAL L1 Stage2 reconstruction chain in Pytho
# enforce git hooks locally (required for development)
git config core.hooksPath .githooks

The user could also use [Mamba](https://mamba.readthedocs.io/en/latest/index.html), a fast and robust package manager. It is fully compatible with conda packages and supports most of conda’s commands.


<a id="dataprod"></a>
# Data production

@@ -163,24 +166,15 @@ Please install the following from within the `conda` environment you should have
python3 -m pip install --upgrade pip setuptools #to avoid annoying "Setuptools is replacing distutils." warning



<a id="org288a700"></a>

## Setup in local browser

Since browser usage directly in the server will necessarily be slow, we can:


<a id="orgb4acfa6"></a>

### 1)

Use LLR's intranet at `llruicms01.in2p3.fr:<port>/display`


<a id="orgbafd8a5"></a>

### 2)

Forward it to our local machines via `ssh`. To establish a connection between the local machine and the remote `llruicms01` server, passing by the gate, use:

ssh -L <port>:llruicms01.in2p3.fr:<port> -N <llr_username>@llrgate01.in2p3.fr
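The shape of the `-L` argument is `<local port>:<remote host>:<remote port>`. As a minimal sketch, the command can be assembled in Python to make the role of each port explicit. The helper name `tunnel_command` and the username `jdoe` are placeholders introduced here for illustration; only the host names come from the README.

```python
import shlex
from typing import List


def tunnel_command(local_port: int, remote_port: int, user: str,
                   server: str = "llruicms01.in2p3.fr",
                   gate: str = "llrgate01.in2p3.fr") -> List[str]:
    """Build the argument list for `ssh -L <local>:<server>:<remote> -N <user>@<gate>`."""
    # The local port is opened on your machine; the remote port is the one
    # the display app listens on at the server.
    forward = f"{local_port}:{server}:{remote_port}"
    return ["ssh", "-L", forward, "-N", f"{user}@{gate}"]


# "jdoe" is a hypothetical username; 5006 is an arbitrary example port.
cmd = tunnel_command(5006, 5006, "jdoe")
print(shlex.join(cmd))
```

Passing different values for `local_port` and `remote_port` shows that the two need not match, although keeping them equal is less confusing.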
@@ -193,14 +187,39 @@ The two ports do not have to be the same, but it avoids possible confusion. Leav

## Visualization in local browser

In a new terminal window go to the `llruicms01` mahcines and launch one of the apps, for instance:
<a id="orgb4acfa6"></a>

### 1) 2D display app

In a new terminal window go to the `llruicms01` machines and launch one of the apps, for instance:

bokeh serve bye_splits/plot/display/ --address llruicms01.in2p3.fr --port <port> --allow-websocket-origin=localhost:<port>
# if visualizing directly at LLR: --allow-websocket-origin=llruicms01.in2p3.fr:<port>

This uses the server-creation capabilities of `bokeh`, a `python` package for interactive visualization ([docs](https://docs.bokeh.org/en/latest/index.html)). Note the port number must match. For further customisation of `bokeh serve` see [the serve documentation](https://docs.bokeh.org/en/latest/docs/reference/command/subcommands/serve.html).
The above command should give access to the visualization under `http://localhost:8080/display`. For debugging, just run `python bye_splits/plot/display/main.py` and see that no errors are raised.
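If the page does not load, it can help to first check whether anything is listening on the forwarded port. The following is a small sketch using only the standard library; the function name `port_is_open` is an assumption made for this example and is not part of the repository.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For instance, `port_is_open("localhost", 8080)` should return `True` once the tunnel and `bokeh serve` are both running, assuming 8080 is the port you chose.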

<a id="orgbafd8a5"></a>

### 2) 3D display app

Make sure you have activated your `conda` environment.
conda activate <Env>

Run the following commands. They install the packages needed to run the web application (e.g. `dash`, `uproot`, `awkward`) in your `conda` environment:

conda install dash
python3 -m pip install dash-bootstrap-components
python3 -m pip install dash-bootstrap-templates
conda install pandas pyyaml numpy bokeh awkward uproot h5py pytables
conda install -c conda-forge pyarrow fsspec

Then go to the `llruicms01` machine (if you are inside the LLR intranet) or to your preferred machine and launch:

python bye_splits/plot/display_plotly/main.py --port 5004 --host localhost

In a browser, go to http://localhost:5004/.
Make sure you have access to the geometry and event files, to be configured in `config.yaml`.

<a id="org4164d71"></a>

3 changes: 2 additions & 1 deletion app.sh
@@ -1,2 +1,3 @@
python bye_splits/plot/join/app.py --flask_port 8010 --bokeh_port 8008
python bye_splits/plot/display_plotly/main.py --host 0.0.0.0 --port 8080
#python bye_splits/plot/join/a --flask_port 8010 --bokeh_port 8008
#bokeh serve bye_splits/plot/display/ --address 0.0.0.0 --port 8080 --allow-websocket-origin=viz2-hgcal-event-display.app.cern.ch
69 changes: 36 additions & 33 deletions bye_splits/data_handle/data_process.py
@@ -11,10 +11,10 @@
import numpy as np
import pandas as pd
import yaml
import h5py

import bye_splits
from bye_splits.utils import common

from utils import params
from data_handle.geometry import GeometryData
from data_handle.event import EventData
@@ -70,29 +70,34 @@ def baseline_selection(df_gen, df_cl, sel, **kw):
print("The baseline selection has a {}% efficiency: {}/{}".format(np.round(eff,2), nout, nin))
return data

def get_data_reco_chain_start(nevents=500, reprocess=False, tag='chain'):
def get_data_reco_chain_start(nevents=500, reprocess=False, tag='chain', particles='photons', pu=0, event=None):
"""Access event data."""
data_part_opt = dict(tag=tag, reprocess=reprocess, debug=True)
data_particle = EventDataParticle(**data_part_opt)
ds_all, events = data_particle.provide_random_events(n=nevents, seed=42)
# ds_all = data_particle.provide_events(events=[170004, 170015, 170017, 170014])
data_particle = EventDataParticle(particles, pu, tag, reprocess)
if event is None:
ds_all, events = data_particle.provide_random_events(n=nevents)
# ds_all = data_particle.provide_events(events=[170004, 170015, 170017, 170014])
else:
ds_all = data_particle.provide_event(event, merge=False)
events = event

if ds_all["gen"].empty:
raise RuntimeError("No events in the parquet file.")

tc_keep = {
"event": "event",
"good_tc_waferu": "tc_wu",
"good_tc_waferv": "tc_wv",
"good_tc_cellu": "tc_cu",
"good_tc_cellv": "tc_cv",
"good_tc_layer": "tc_layer",
"good_tc_pt": "tc_pt",
"good_tc_mipPt": "tc_mipPt",
"good_tc_cellu" : "tc_cu",
"good_tc_cellv" : "tc_cv",
"good_tc_layer" : "tc_layer",
"good_tc_pt" : "tc_pt",
"good_tc_mipPt" : "tc_mipPt",
"good_tc_energy": "tc_energy",
"good_tc_x": "tc_x",
"good_tc_y": "tc_y",
"good_tc_z": "tc_z",
"good_tc_eta": "tc_eta",
"good_tc_phi": "tc_phi",
"good_tc_multicluster_id": "tc_multicluster_id",
"good_tc_x" : "tc_x",
"good_tc_y" : "tc_y",
"good_tc_z" : "tc_z",
"good_tc_eta" : "tc_eta",
"good_tc_phi" : "tc_phi",
}

ds_tc = ds_all["tc"]
@@ -101,43 +106,41 @@ def get_data_reco_chain_start(nevents=500, reprocess=False, tag='chain'):

gen_keep = {
"event": "event",
"good_genpart_exeta": "gen_eta",
"good_genpart_exphi": "gen_phi",
"good_genpart_exeta" : "gen_eta",
"good_genpart_exphi" : "gen_phi",
"good_genpart_energy": "gen_en",
"good_genpart_pt": "gen_pt",
"good_genpart_pt" : "gen_pt",
}
ds_gen = ds_all["gen"]
ds_gen = ds_gen.rename(columns=gen_keep)

cl_keep = {
"event": "event",
"good_cl3d_eta": "cl3d_eta",
"good_cl3d_phi": "cl3d_phi",
"good_cl3d_id": "cl3d_id",
"good_cl3d_eta" : "cl3d_eta",
"good_cl3d_phi" : "cl3d_phi",
"good_cl3d_id" : "cl3d_id",
"good_cl3d_energy": "cl3d_en",
"good_cl3d_pt": "cl3d_pt",
"good_cl3d_pt" : "cl3d_pt",
}
ds_cl = ds_all["cl"]
ds_cl = ds_cl.rename(columns=cl_keep)
return ds_gen, ds_cl, ds_tc

def EventDataParticle(tag, reprocess, logger=None, debug=False, particles=None):
def EventDataParticle(particles, pu, tag, reprocess, logger=None):
"""Factory for EventData instances of different particle types"""
with open(params.CfgPath, "r") as afile:
cfg = yaml.safe_load(afile)
if particles is None:
particles = cfg["selection"]["particles"]
if particles not in ("photons", "electrons", "pions"):
raise ValueError("{} are not supported.".format(particles))
defevents = cfg["defaultEvents"][particles]
defevents = cfg["defaultEvents"][f"PU{pu}"][particles]

indata = InputData()
indata.path = cfg["io"]["file" + particles]
indata.adir = cfg["io"]["dir" + particles]
indata.tree = cfg["io"]["tree" + particles]

tag = particles + "_" + tag
tag += "_debug" * debug
indata.path = cfg["io"][f"PU{pu}"][particles]["file"]
indata.adir = cfg["io"][f"PU{pu}"][particles]["dir"]
indata.tree = cfg["io"][f"PU{pu}"][particles]["tree"]

tag = particles + "_" + f"PU{pu}" + "_" + tag

return EventData(indata, tag, defevents, reprocess, logger)
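The new `EventDataParticle` signature implies that `config.yaml` is now nested by pileup key (`PU0`, `PU200`, ...) and then by particle type. The sketch below mirrors that lookup on a plain dictionary so the expected layout is visible at a glance. The helper names `resolve_input` and `build_tag`, as well as every path and event number in `example_cfg`, are hypothetical and chosen only for illustration.

```python
def resolve_input(cfg: dict, particles: str, pu: int) -> dict:
    """Mirror the nested lookup cfg["io"][f"PU{pu}"][particles][...] from the diff."""
    if particles not in ("photons", "electrons", "pions"):
        raise ValueError("{} are not supported.".format(particles))
    pu_key = f"PU{pu}"
    io = cfg["io"][pu_key][particles]
    return {
        "path": io["file"],
        "adir": io["dir"],
        "tree": io["tree"],
        "defevents": cfg["defaultEvents"][pu_key][particles],
    }


def build_tag(particles: str, pu: int, tag: str) -> str:
    """Compose the tag the same way as the diff: <particles>_PU<pu>_<tag>."""
    return particles + "_" + f"PU{pu}" + "_" + tag


# Hypothetical configuration content; the real values live in config.yaml.
example_cfg = {
    "io": {
        "PU0": {
            "photons": {"file": "photons_PU0.root", "dir": "some_dir", "tree": "some_tree"},
        },
    },
    "defaultEvents": {"PU0": {"photons": [1, 2, 3]}},
}

print(resolve_input(example_cfg, "photons", 0))
print(build_tag("photons", 0, "chain"))
```

A configuration still using the old flat keys (`"file" + particles`, and so on) would raise a `KeyError` in this lookup, which is one way to spot a file that has not been migrated.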
23 changes: 20 additions & 3 deletions bye_splits/data_handle/event.py
@@ -110,7 +110,7 @@ def provide_event(self, event, merge=False):

ret = {}
for k in self.var.keys():
ret[k] = self._event_mask(self.cache[k], [event]).drop(["event"], axis=1)
ret[k] = self._event_mask(self.cache[k], [event])#.drop(["event"], axis=1)

if merge:
ret = functools.reduce(
@@ -155,10 +155,27 @@ def provide_random_events(self, n, seed=None):
return self.provide_events(events), events

def select(self):
with up.open(self.indata.path, array_cache="550 MB", num_workers=8) as f:
with up.open(self.indata.path, array_cache='550 MB', num_workers=8) as f:
tree = f[self.indata.tree_path]
total_events = tree.num_entries
allvars = set([y for x in self.var.values() for y in x.values()])
data = tree.arrays(filter_name="/" + "|".join(allvars) + "/", library="ak")

threshold_size_bytes = 1e+9 # 1 gigabyte
data = ak.Array([])
for array in tree.iterate(filter_name='/' + '|'.join(allvars) + '/', step_size='20 MB', library='ak'):
if (data.layout.nbytes + array.layout.nbytes) <= threshold_size_bytes:
data = ak.concatenate([data, array], axis=0)
else:
break

threshold = 0.1
try:
if len(data) / total_events < threshold:
print(f'[WARNING] Function select() in event.py\nThe number of events in the Parquet file is less than {threshold * 100}% compared to the events in the ROOT file.')
except ZeroDivisionError:
print("The input file is empty.")

#data = tree.arrays(filter_name='/' + '|'.join(allvars) + '/', entry_stop=5000, library='ak')
# data[self.var.v] = data.waferv
# data[self.newvar.vs] = -1 * data.waferv
# data[self.newvar.c] = "#8a2be2"
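The new loop in `select()` reads the tree in chunks and stops as soon as the next chunk would push the accumulated size over a byte budget. The same pattern can be sketched without `uproot` or `awkward`, using plain `bytes` objects as stand-ins for the arrays. The function names below are assumptions made for this example, and returning 0.0 for an empty input is a design choice of the sketch, whereas the diff catches `ZeroDivisionError` and prints a message.

```python
from typing import Iterable, List, Tuple


def accumulate_until(chunks: Iterable[bytes], max_bytes: int) -> Tuple[List[bytes], int]:
    """Collect chunks in order until adding the next one would exceed max_bytes."""
    kept: List[bytes] = []
    used = 0
    for chunk in chunks:
        if used + len(chunk) > max_bytes:
            # Stop at the first chunk that does not fit; later chunks are never read.
            break
        kept.append(chunk)
        used += len(chunk)
    return kept, used


def kept_fraction(n_kept: int, n_total: int) -> float:
    """Fraction of entries kept; 0.0 for an empty input instead of raising."""
    return n_kept / n_total if n_total else 0.0


# Three 4-byte chunks with a 10-byte budget: only the first two fit.
kept, used = accumulate_until([b"aaaa", b"bbbb", b"cccc"], max_bytes=10)
print(len(kept), used)
```

Because the loop breaks rather than skipping, the retained events are always a contiguous prefix of the input file, which is why the diff warns when that prefix is a small fraction of the total.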