Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancements and modifications in the display app #44

Merged
merged 3 commits into from
Oct 3, 2023

Conversation

mchiusi
Copy link
Collaborator

@mchiusi mchiusi commented Jun 19, 2023

Major changes in the webApp (deployed here):

  • display code:

    • scintillator tiles have been added in the 3D app. The trigger cells (TCs) geometry of the entire calorimeter is now complete: pions and taus can be studied.
    • a slider has been added in the display app in order to control live the clustering radius. Currently the app is using the reconstruction chain implemented in CMSSW (default reconstruction chain in the code) which in our framework has been translated in Python. In the future the new clustering algorithm, which includes the new coarse seeding and the seeding algorithms, can be easily implemented.
    • the new switch button layer selection allows to visualise specific portions of the calorimeter
    • the new coarse seeding algorithm can be run through the application and the coarse seeds visualised on screen. More details about it can be found here.
    • in addition to the energy scale set by default in the display app, it is possible to visualise with discrete colours the different seeds found by the reconstruction chain event by event. This functionality becomes particularly interesting when 200 pile-up photon or tau samples are used.
    • the web page layout has been updated
  • geometry.py:

    • a new plotly friendly format for the geometry can be chosen for the geometry dataframe, as alternative to bokeh.
    • added the function _display_sci_cartesian to calculate, as it is currently done for silicon, the position of the scintillator tiles in cartesian coordinates. This allows to use the same function in the display code to plot both silicon cells and tiles.
    • minor changes
  • cluster.py:

    • the clustering function has been improved, adding a second output: a collection - in natural coordinates - of the clustered trigger cells.
  • data handling:

    • in data_process.py, added a new method to save events after processing. Currently, to speed up the event processing through the reconstruction chain, each event from .parquet file is pre-processed and saved in a .hdf file. When a random event is requested from the display app the .hdf file is read. The display process lasts less than 1 second. If a particular event requested by the user is not yet pre-processed, the code finds the event in the parquet file and processes it live. The entire process lasts ~3 seconds.
    • The pre-processed events are saved in the .hdf file using their event number as primary key and the clustering radius as secondary key, i.e. the HDF keys are in the form of <event_number> / <cluster_radius> .
  • production:

    • added new genparticle_pt branch in the skimming file (produce.py).

The README.md has been updated especially in the section 3D display app

The new look of the web application. A 0 pile-up pion is shown. Scintillator tiles are clearly visible.
image
200 pile-up photon: a large cut (in cartesian coordinates) around the generated particle has been applied. Multiple seeds in different colours shown.
image
The same 200 pile-up photon analysed with different clustering radii.
image
image

@bfonta
Copy link
Owner

bfonta commented Jun 20, 2023

@mchiusi Thank you for the PR. Nice to see that the event displays we've been using for quite some time already will be part of master. Some general comments (more detailed ones will come after merging #43 to make sure both PRs are consistent):

  • Could you clean a bit your git history?
  • The slider that controls the clustering radius changes one parameter, while its visualization gives the idea that it changes an interval. Could you modify the way you control the clustering radius so that it is clear we are changing a single number? On the other hand, this slider type works very well for the layer selection.
  • I suppose you could update the "ROI" label to "CS" or similar?
  • The ROIs can be selected also for the pion dataset, but nothing happens. Could you add a check when no ROIs are available?
  • I see no modifications in tasks/roi.py. Are you using the same version as in the master branch?
  • You changed tasks/cluster.py: 1) It is tricky, as you have to make sure it does not break existing code, which will happen since you've changed what the function returns 2) The "new" information you are returning is available in the output files of the task 3) I agree that it would be nice to also return outputs directly from one function to another instead of relying always on the output files, but that will have to be done for all the tasks for consistency (to allow chained calls like cluster(seed(roi()))), not just for one
  • As a suggestion, you could add a quick_random() function that picks an event randomly from the pool of events that were already processed (in the .hdf5 file) when the user is impatient
  • Please clarify in the PR what you mean by "primary" and "secondary" keys

Copy link
Owner

@bfonta bfonta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second review

Regarding you PR text, please fix the following:

  • Please include a reply to my previous comments in this thread for future clarity
  • At least some of my comments seem not to be implemented in the web app that you link; can you please update the version shared via *.app.cern.ch (and the plots in the PR description)?
  • The scintillator tiles were actually added in Scintillator geometry #26, and the geometry was already complete; what you did was to include them in the 3D visualization, which is very different from a technical point of view
  • "the new coarse seeding algorithm has been implemented in the app": In reality no algorithm is implemented in the app; the app takes advantage of the existing algorithms, which are completely independent from the app. Could you please rephrase?
  • "the graphic has been updated": what exactly? Same for "minor issues have been fixed", if you do not list what you did is simpler just to remove them. All your modifications can anyways always be seen in the commit diffs.
  • "added the possibility to chose between bokeh and plotly" The mechanism to support more libraries was already present. You could say that you added support for plotly.
  • In the first bullet point of "data handling.py", it is not clear to me from the text what you added and what was already available

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
bye_splits/data_handle/data_process.py Outdated Show resolved Hide resolved
config.yaml Outdated Show resolved Hide resolved
config.yaml Outdated Show resolved Hide resolved
config.yaml Outdated Show resolved Hide resolved
config.yaml Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
Copy link
Owner

@bfonta bfonta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Third review

These are almost all very minor comments, and we can merge it as soon as you address them. Could you just provide an answer to previous review comments?

bye_splits/data_handle/data_process.py Outdated Show resolved Hide resolved
bye_splits/data_handle/data_process.py Outdated Show resolved Hide resolved
bye_splits/data_handle/data_process.py Outdated Show resolved Hide resolved
bye_splits/data_handle/data_process.py Outdated Show resolved Hide resolved
bye_splits/data_handle/data_process.py Show resolved Hide resolved
bye_splits/tasks/cluster.py Outdated Show resolved Hide resolved
bye_splits/tasks/roi.py Outdated Show resolved Hide resolved
bye_splits/tasks/roi.py Outdated Show resolved Hide resolved
config.yaml Outdated Show resolved Hide resolved
config.yaml Outdated Show resolved Hide resolved
@mchiusi
Copy link
Collaborator Author

mchiusi commented Oct 2, 2023

@mchiusi Thank you for the PR. Nice to see that the event displays we've been using for quite some time already will be part of master. Some general comments (more detailed ones will come after merging #43 to make sure both PRs are consistent):

  • Could you clean a bit your git history?

I have done it uploading the second set of comments. At the moment commits are organised in macro-arguments.

  • The slider that controls the clustering radius changes one parameter, while its visualization gives the idea that it changes an interval. Could you modify the way you control the clustering radius so that it is clear we are changing a single number? On the other hand, this slider type works very well for the layer selection.

Thanks for the suggestion, now the slider has been set so that continuous values are excluded (included = False)

  • I suppose you could update the "ROI" label to "CS" or similar?

In the display code I remove every "ROI" label. Should I also change the name to task/roi.py?

  • The ROIs can be selected also for the pion dataset, but nothing happens. Could you add a check when no ROIs are available?

Now when pion events are displayed the CS switch is removed.

  • I see no modifications in tasks/roi.py. Are you using the same version as in the master branch?

I was calculating the ROI in the display code. Now I moved everything in tasks/roi.py and I have updated the code.

  • You changed tasks/cluster.py: 1) It is tricky, as you have to make sure it does not break existing code, which will happen since you've changed what the function returns 2) The "new" information you are returning is available in the output files of the task 3) I agree that it would be nice to also return outputs directly from one function to another instead of relying always on the output files, but that will have to be done for all the tasks for consistency (to allow chained calls like cluster(seed(roi()))), not just for one

tasks/cluster.py now returns the same output as before, since the additional information I have added is saved in files by the same function.

  • As a suggestion, you could add a quick_random() function that picks an event randomly from the pool of events that were already processed (in the .hdf5 file) when the user is impatient

If the .hdf5 file is present in the local folder, automatically the coed picks an event randomly from it. Why should I add another function to provide an event?

  • Please clarify in the PR what you mean by "primary" and "secondary" keys

Now it should be clearer.

@mchiusi
Copy link
Collaborator Author

mchiusi commented Oct 2, 2023

Second review

Regarding you PR text, please fix the following:

  • Please include a reply to my previous comments in this thread for future clarity
  • At least some of my comments seem not to be implemented in the web app that you link; can you please update the version shared via *.app.cern.ch (and the plots in the PR description)?

Yes, now it has been implemented. The code has been slightly modified so that the input files are all stored in my eos repository. It has in the webApp branch here

  • The scintillator tiles were actually added in Scintillator geometry #26, and the geometry was already complete; what you did was to include them in the 3D visualization, which is very different from a technical point of view

I think it is clear from the PR what I did.

  • "the new coarse seeding algorithm has been implemented in the app": In reality no algorithm is implemented in the app; the app takes advantage of the existing algorithms, which are completely independent from the app. Could you please rephrase?

Answer in the previous set of comments. Now the code has been moved to tasks/roi.py and updated.

  • "the graphic has been updated": what exactly? Same for "minor issues have been fixed", if you do not list what you did is simpler just to remove them. All your modifications can anyways always be seen in the commit diffs.

Removed.

  • "added the possibility to chose between bokeh and plotly" The mechanism to support more libraries was already present. You could say that you added support for plotly.

Thanks.

  • In the first bullet point of "data handling.py", it is not clear to me from the text what you added and what was already available

Previously, there was no capability to save events after processing. I have now added this feature along with the mechanism for handling post-processed data. The existing code for searching events in the parquet file has been retained. Please let me know if I have to update the PR, it seems clear to me.

@bfonta
Copy link
Owner

bfonta commented Oct 2, 2023

Thanks for the answers above.

In the display code I remove every "ROI" label. Should I also change the name to task/roi.py?

I suppose you mean changing it to task/coarse_seeding.py (or similar), and I think that is a good thing to do. Just remember to update all the imports as well.

All the rest is good 👍

bye_splits/tasks/cluster.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@isehle isehle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks like this is going to bring some great new functionality, thanks for the work! Mostly just a couple of minor comments, though I think we should discuss the minor differences between run_radii_chain and run_cluster-size because they are essentially duplicate files with minor differences. We can likely find a way to extend the functionality of the latter to include the changes included in the former.

@mchiusi
Copy link
Collaborator Author

mchiusi commented Oct 2, 2023

Thanks for the answers above.

In the display code I remove every "ROI" label. Should I also change the name to task/roi.py?

I suppose you mean changing it to task/coarse_seeding.py (or similar), and I think that is a good thing to do. Just remember to update all the imports as well.

I've just realised that I could change the name task/roi.py to task/coarse_seeding.py, but then I should also change all the variables linked in some way to the roi, which are tons... It could be a bit messy (of course, I'd commit the code and create another branch to save the working code). Do you think it's worth it?

All the rest is good 👍

@mchiusi
Copy link
Collaborator Author

mchiusi commented Oct 3, 2023

Thanks for the answers above.

In the display code I remove every "ROI" label. Should I also change the name to task/roi.py?

I suppose you mean changing it to task/coarse_seeding.py (or similar), and I think that is a good thing to do. Just remember to update all the imports as well.

I've changed the names of some files and all the variables related to the ROI. Now the two chains are called:

  • default chain
  • new chain

I have tested both my display app and the run_meta.py and they works well. Let me know if there are inconsistencies in the code due to many changes I have done.

All the rest is good 👍

@mchiusi
Copy link
Collaborator Author

mchiusi commented Oct 3, 2023

Third review

These are almost all very minor comments, and we can merge it as soon as you address them. Could you just provide an answer to previous review comments?

All your comments are now included in the latest commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants