Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce general statistics and plotting classes + Addressing comments on the u/v seeding #43

Merged
merged 3 commits into from
Jun 26, 2023

Conversation

bfonta
Copy link
Owner

@bfonta bfonta commented May 29, 2023

This PR addresses all comments made to my first presentation on the new seeding on (U,V) coordinates, namely:

  • use Clopper-Pearson intervals for efficiency errors instead of Poisson
  • add TCs to efficiency plots but cutting the noise
    • I've applied a cylinder around the center of the shower
  • visualize resolution distributions also with violin plots
  • calculate the seeding efficiency having as reference the trigger cells around the generated particle, skipping the ROI step. This allows to factor the ROI out of seeding efficiency measurements.

Additionally, the PR includes a new Collector and ChainPlotter classes which will hopefully represent the first step towards unifying the efficiency/resolution statistics collection and plots which are common to all our efforts. If possible, I would very much like most plotting being done through these classes, which would slowly evolve depending on the need of everybody. For instance, I would be happy to add other libraries (besides bokeh). The rationale is to avoid duplicating code when we all do similar plots, such as efficiencies, binned distributions, ...

A new "meta" chain script was added to run individual chains and plots results of multiple chains together, for instance:

Screenshot from 2023-06-01 18-21-33

Finally, I've added the number of events as an additional command line argument; this ensures all intermediate files stored are different for different number of input events, which is both convenient and safe (no confusion can arise in case one decides to run only part of the chain for a specific study).

As an example, the meta chain can be run as follows:

python bye_splits/scripts/run_meta.py --nevents 4000

Many flags are there for fine-grained running. As an example, to do the data collection and plotting steps a second time without reprocessing the same data:

python bye_splits/scripts/run_meta.py --nevents 4000 --no_roi --no_seed --no_cluster --no_smooth --no_fill

@bfonta bfonta added the enhancement New feature or request label May 29, 2023
@bfonta bfonta marked this pull request as ready for review June 1, 2023 16:13
@bfonta bfonta changed the title Common plotter class + addressing comments on new seeding method Introduce general statistics and plotting classes + Addressing comments on the u/v seeding Jun 6, 2023
bye_splits/data_handle/data_process.py Show resolved Hide resolved
bye_splits/tasks/validation.py Show resolved Hide resolved
bye_splits/tasks/validation.py Show resolved Hide resolved
bye_splits/tasks/validation.py Show resolved Hide resolved
Copy link
Collaborator

@isehle isehle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work you put into this! Besides my one question in cluster.py, this should all be fine for me. Would you also mind putting an example of how we can use the new plotting class in the PR? I.e., a command to run and its output, for example.

bye_splits/tasks/cluster.py Show resolved Hide resolved
@bfonta
Copy link
Owner Author

bfonta commented Jun 20, 2023

@isehle See scripts/run_meta.py, which should exemplify how to use the two new classes.
Please note that I am aware there are bits in the new classes that are not general, and some adaptations will likely be needed as more people use them and as we understand our needs. If there is something you do not like or think could be improved, now is the moment, since all of us will hopefully be using these from now on!

@bfonta bfonta requested a review from mchiusi June 23, 2023 08:21
Copy link
Collaborator

@mchiusi mchiusi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really looks great! I did fetch your branch to clarify some critical points - especially in the validation - by running the code in local.
The display code in written using bokeh seems at least to me much more convoluted that the same code in other libraries. Anyway I tested and it works.

@bfonta
Copy link
Owner Author

bfonta commented Jun 26, 2023

@mchiusi I am not enforcing any library, you can use the one you prefer. I would just like that all the code right before plotting is the same for all of us. We can easily add some bokeh/plotly choice later on, as we are already doing in the geometry output.

@bfonta
Copy link
Owner Author

bfonta commented Jun 26, 2023

I am keeping the three commits for clarity, since they are completely independent.

@bfonta bfonta merged commit 7d4e13a into main Jun 26, 2023
@bfonta bfonta deleted the fix/comments12Apr2023 branch June 26, 2023 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants