This repository was archived by the owner on May 3, 2024. It is now read-only.

add a tries member and replace local variable with it #80

Merged
tomeichlersmith merged 7 commits into trunk from
69-log-different-event-counts-in-event-header
Oct 13, 2023


@tomeichlersmith tomeichlersmith commented Oct 9, 2023

Instead of using a local variable in the Process::run production mode code, we simply use the member variable of the event header. This naturally means the number of tries it took to generate any event will be stored alongside the weight in the header.

Still To Do

  • make sure we can still read files generated with EventHeader v2 with this Framework
  • see if old Framework can read files generated with EventHeader v3
  • document more of EventHeader set/get functions with where/when to use them

@tomeichlersmith tomeichlersmith linked an issue Oct 9, 2023 that may be closed by this pull request
@tomeichlersmith

This comment was marked as outdated.

add comments to some setters of importance about if downstream producers
can/should call them
@tomeichlersmith tomeichlersmith marked this pull request as ready for review October 13, 2023 15:04

@tomeichlersmith tomeichlersmith left a comment


Just have a few typos I noticed when re-reading in the browser.

@tomeichlersmith tomeichlersmith marked this pull request as draft October 13, 2023 15:33
@tomeichlersmith
Member Author

Converting to draft because I just realized that the tries are not being persisted properly 😬

this is more style-friendly and I think more understandable from a
header-reading point of view

This also contains a rewrite of the get*Parameter functions to manually
check for existence of the input parameter name before attempting to
access it - this means the exception thrown will be one of ours and
hopefully more understandable

We cannot follow the incrementation from within the EventHeader since it
is `Clear`ed on each event (successful or not) so we return to using the
numTries mechanic currently on trunk. Besides that, we also carry the
numTries counter across the num events completed boundary by using
modulo instead of greater-than when comparing to the maximum number of
tries. This ensures that the sum of the number of tries is always equal
to the configured maxEvents.
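
The existence check described for the get*Parameter functions can be sketched as follows. This is an illustrative Python model rather than the Framework's C++ code: the `ParameterNotFound` exception and the `get_int_parameter` helper are hypothetical names, and a plain dict stands in for the header's parameter map.

```python
class ParameterNotFound(Exception):
    """Hypothetical stand-in for a Framework-specific exception type."""


def get_int_parameter(int_parameters, name):
    """Return the named parameter, checking that it exists before access."""
    if name not in int_parameters:
        # Raise our own error with a readable message instead of letting
        # the container's lookup failure reach the user.
        raise ParameterNotFound(f"Int parameter '{name}' not found in header.")
    return int_parameters[name]
```

The point of the explicit check is only that the error message names the missing parameter; the lookup itself is unchanged.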
@tomeichlersmith
Member Author

Alright, I've actually tested this more thoroughly now. I will need to re-do the v2/v3 interop testing since so much has changed since then, but I can at least show that the storage of the number of tries is working.

In the context of LDMX-Software/framework-testbench@d7d0c59

We want to make sure that the sum of the number of tries in the output file is always equal to the number of events actually tried. With this branch of Framework, we can check this by comparing the number of tries stored in the output file to the Processing status lines that the Framework prints for each event attempt. Below, I use framework-testbench's configs, scripts, and processors.
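
A minimal sketch of that comparison, assuming the `fire` output has been captured as text and the tries have already been read from the output file. The helper names `count_processing_lines` and `tries_match_log` are hypothetical and are not part of framework-testbench. The totals can only agree when the final attempt wrote an event, since tries spent on an event that is never written are not stored anywhere.

```python
import re
from collections import Counter

# Matches status lines such as
#  [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 15:59:40...)
PROCESSING_LINE = re.compile(r"Processing\s+\d+\s+Run\s+\d+\s+Event\s+(\d+)")


def count_processing_lines(log_text):
    """Return a Counter mapping event number -> number of Processing lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PROCESSING_LINE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def tries_match_log(log_text, tries):
    """True when the stored tries add up to the number of Processing lines."""
    return sum(tries) == sum(count_processing_lines(log_text).values())
```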

Simple Case, no abortEvent() calls and therefore no extra tries

tom@framework-testbench:~$ fire config/produce.py 
---- LDMXSW: Loading configuration --------
---- LDMXSW: Configuration load complete  --------
---- LDMXSW: Starting event processing --------
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 15:58:07.163570000+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 15:58:07.179540000+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 15:58:07.179832000+0000)
 [ Process ] 1 : RunHeader { run: 1, detectorName: , description: 
  intParameters: 
    RandomNumberMasterSeed[prod] = 1
  floatParameters: 
  stringParameters: 
}
---- LDMXSW: Event processing complete  --------
tom@framework-testbench:~$ python3 ana/print-num-tries.py test_produce.root 
[1 1 1]

We see a single Processing printout for each event number and also 1 for each entry in the tries branch in the output file.

abortEvent() with extra tries enabled

tom@framework-testbench:~$ fire config/multitry.py 
---- LDMXSW: Loading configuration --------
---- LDMXSW: Configuration load complete  --------
---- LDMXSW: Starting event processing --------
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 15:59:40.205741000+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 15:59:40.219728000+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 15:59:40.219898001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 15:59:40.220023000+0000)
 [ Process ] 1 : RunHeader { run: 1, detectorName: , description: 
  intParameters: 
    RandomNumberMasterSeed[prod] = 1
  floatParameters: 
  stringParameters: 
}
---- LDMXSW: Event processing complete  --------
tom@framework-testbench:~$ python3 ana/print-num-tries.py test_multitry.root 
[1 2 1]

We see two Processing printouts for Event 2, which is reflected in the number of tries in the second entry of the output file.

abortEvent() without extra tries

tom@framework-testbench:~$ fire config/multitry.py 1
---- LDMXSW: Loading configuration --------
---- LDMXSW: Configuration load complete  --------
---- LDMXSW: Starting event processing --------
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:01:19.287894000+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:01:19.301534000+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:01:19.301717001+0000)
 [ Process ] 1 : RunHeader { run: 1, detectorName: , description: 
  intParameters: 
    RandomNumberMasterSeed[prod] = 1
  floatParameters: 
  stringParameters: 
}
---- LDMXSW: Event processing complete  --------
tom@framework-testbench:~$ python3 ana/print-num-tries.py test_multitry.root 
[1 2]

We see only the first two events from the prior run (as expected since we do not seed the random number generator with time).

Never Complete an Event

This is an extreme and unrealistic case but it helps make sure we do not get stuck in an infinite loop.

tom@framework-testbench:~$ fire config/never.py
---- LDMXSW: Loading configuration --------
---- LDMXSW: Configuration load complete  --------
---- LDMXSW: Starting event processing --------
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.452524000+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.465695000+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.465803001+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.465857001+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.465922001+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.465991001+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.466015001+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.466038001+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.466087001+0000)
 [ Process ] 1 : Processing 1 Run 1 Event 1  (2023-10-13 16:05:41.466124001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466154001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466178001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466203001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466228001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466253001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466278001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466302001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466327001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466350001+0000)
 [ Process ] 1 : Processing 2 Run 1 Event 2  (2023-10-13 16:05:41.466373001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466411001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466446001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466470001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466520001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466556001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466591001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466625001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466647001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466695001+0000)
 [ Process ] 1 : Processing 3 Run 1 Event 3  (2023-10-13 16:05:41.466731001+0000)
 [ Process ] 1 : RunHeader { run: 1, detectorName: , description: 
  intParameters:
    RandomNumberMasterSeed[prod] = 1
  floatParameters:
  stringParameters:
}
---- LDMXSW: Event processing complete  --------
tom@framework-testbench:~$ python3 ana/print-num-tries.py test_never.root 
[]

We see p.maxTriesPerEvent * p.maxEvents events processed but no events in the output file.
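
The four cases above can be reproduced with a small toy model of the production loop. This is a sketch under assumptions and not the Framework's C++ code: `simulate_production` and its arguments are hypothetical names, the modulo comparison follows the commit message earlier in this PR, and the value 10 for the maximum tries is taken from the ten printouts per event in the last case (the multitry config's actual value is not shown here).

```python
def simulate_production(max_events, max_tries_per_event, completes):
    """Model the try counting of the production loop.

    completes(attempt) says whether the attempt with that 0-based index
    finishes without an abort. Returns the event number printed for each
    attempt and the tries value stored with each written event.
    """
    processed = []      # one entry per "Processing" status line
    stored_tries = []   # one entry per event written to the output
    n_events_done = 0
    num_tries = 0
    attempt = 0
    while n_events_done < max_events:
        num_tries += 1
        processed.append(n_events_done + 1)
        if completes(attempt):
            stored_tries.append(num_tries)
            num_tries = 0           # reset only after a written event
            n_events_done += 1
        elif num_tries % max_tries_per_event == 0:
            n_events_done += 1      # give up; num_tries carries over
        attempt += 1
    return processed, stored_tries


# Event 2 aborts once (the second attempt overall), extra tries allowed.
print(simulate_production(3, 10, lambda attempt: attempt != 1))
# → ([1, 2, 2, 3], [1, 2, 1])

# Same abort, but only one try per event: the count carries into event 3.
print(simulate_production(3, 1, lambda attempt: attempt != 1))
# → ([1, 2, 3], [1, 2])
```

Because the counter is only reset when an event is written, the stored tries sum to the number of attempts whenever the last attempt succeeds, and the loop still terminates when no attempt ever succeeds.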

@tomeichlersmith
Member Author

(Editor's Note: The results here are the same as the above comment that I marked as "Outdated" but I manually copied them here to show the procedure and scripts I used from framework-testbench and to show that I did indeed double check the results were the same.)

In these notes, v2 and v3 refer to the version of EventHeader as defined in the ClassDef call in its header file. The actual code used for v2 is the current trunk and for v3 is the branch in this PR.

EventHeader Version Interop

Writing and reading the EventHeader within either of its versions works smoothly (as expected). The v2 files don't have a tries_ subbranch of the EventHeader while the v3 files do. It gets more interesting when a user attempts to read a file from one version with an EventHeader of a different version.

Reading v3 EventHeaders with v2 EventHeader

This could occur if a user is given a file generated by someone else. When running fire, we get the following warning message. Originally, it is printed all on one line, but I've manually separated the lines for readability.

Info in <TBranchElement::InitializeOffsets>: \
  TTree created with an older schema, \
  some data might not be copied in 'slow-cloning' mode; fast-cloning should have the correct result. \
  'tries_' is missing when constructing the branch 'EventHeader'.

ROOT correctly identifies that the older schema is there. Since our Framework operates in 'slow-cloning' (i.e. event-by-event cloning), we expect the data not present in the old version will not be copied. It then correctly points out the missing subbranch.

This is important since the file output from the Framework when reading a v3 EventHeader with a v2 EventHeader will have a tries_ branch listed, but that branch will have zero entries.

python3 ~/ana/print-num-tries.py test_recon.root 
Traceback (most recent call last):
  File "/home/tom/ldmx/framework-testbench/ana/print-num-tries.py", line 7, in <module>
    print(f['LDMX_Events/EventHeader/tries_'].array(library='np'))
  File "/usr/local/lib/python3.10/dist-packages/uproot/behaviors/TBranch.py", line 2208, in array
    _ranges_or_baskets_to_arrays(
  File "/usr/local/lib/python3.10/dist-packages/uproot/behaviors/TBranch.py", line 3493, in _ranges_or_baskets_to_arrays
    uproot.source.futures.delayed_raise(*obj)
  File "/usr/local/lib/python3.10/dist-packages/uproot/source/futures.py", line 36, in delayed_raise
    raise exception_value.with_traceback(traceback)
  File "/usr/local/lib/python3.10/dist-packages/uproot/behaviors/TBranch.py", line 3447, in basket_to_array
    raise ValueError(
ValueError: basket 0 in tree/branch /LDMX_Events;1:EventHeader/tries_ has the wrong number of entries (expected 3, obtained 0) when interpreted as AsDtype('>i4')
    in file test_recon.root

Now, as ROOT points out, if we were to clone the EventHeader from the input file to the output file in "fast-cloning" mode (i.e. where we just copy the compressed baskets directly), then the tries_ branch would be inherited. We cannot implement "fast-cloning" from the input file in general for two reasons:

  1. It prevents us from doing any event-wide vetoes during reconstruction since the baskets have pre-defined event boundaries.
  2. For the EventHeader, we actually want to be able to mutate it by adding more parameters during reconstruction if the user desires.

For these two reasons, I don't think the funky behavior of the tries_ subbranch of EventHeader when reading a v3 file with v2 will be resolved. It will just have to be noted.

Reading v2 EventHeaders with v3 EventHeader

When running with v3 EventHeader and reading a file written with v2 EventHeader, we do not see any messages printed. Looking into the resulting output file, the EventHeader is not altered and does not have a tries_ sub-branch.

tom@framework-testbench:~$ python3 ana/print-num-tries.py test_recon.root   
Traceback (most recent call last):
  File "/home/tom/ldmx/framework-testbench/ana/print-num-tries.py", line 7, in <module>
    print(f['LDMX_Events/EventHeader/tries_'].array(library='np'))
  File "/usr/local/lib/python3.10/dist-packages/uproot/reading.py", line 2093, in __getitem__
    return step["/".join(items[i:])]
  File "/usr/local/lib/python3.10/dist-packages/uproot/behaviors/TBranch.py", line 2027, in __getitem__
    raise uproot.KeyInFileError(
uproot.exceptions.KeyInFileError: not found: 'EventHeader/tries_'

    Available keys: 'EventHeader/run_', 'EventHeader/timestamp_', 'EventHeader', 'EventHeader/weight_', 'EventHeader/intParameters_', 'EventHeader/isRealData_', 'EventHeader/stringParameters_', 'EventHeader/eventNumber_', 'EventHeader/floatParameters_'...

in file test_recon.root
in object /LDMX_Events;1

which isn't exactly what I expected to happen, but I suppose ROOT is making the conservative decision to leave the older class version in an unaltered state schema-wise.
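
Both failure modes above surface as exceptions in the reading script, so an analysis that has to accept mixed-version files could guard the read. The following is a sketch, not the actual `print-num-tries.py`: `read_tries` and `print_num_tries` are hypothetical helpers, and the sketch relies on my understanding that uproot's `KeyInFileError` is a subclass of `KeyError`.

```python
def read_tries(tree):
    """Return the tries_ values from an events tree, or None if unusable."""
    try:
        return tree["EventHeader/tries_"].array(library="np")
    except KeyError:
        # v2 file read with v3 EventHeader: the sub-branch does not exist.
        return None
    except ValueError:
        # v3 file read with v2 EventHeader: the branch is listed but its
        # baskets hold zero entries.
        return None


def print_num_tries(path):
    # Imported here so read_tries can be exercised without uproot installed.
    import uproot

    with uproot.open(path) as f:
        tries = read_tries(f["LDMX_Events"])
    print("no usable tries_ branch" if tries is None else tries)
```

Returning `None` keeps "no information" distinct from an empty array, which is what a file with zero written events would give.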

@tomeichlersmith tomeichlersmith marked this pull request as ready for review October 13, 2023 16:18
@tomeichlersmith tomeichlersmith merged commit 9f94a01 into trunk Oct 13, 2023
@tomeichlersmith tomeichlersmith deleted the 69-log-different-event-counts-in-event-header branch October 13, 2023 16:21
Development

Successfully merging this pull request may close these issues.

log different event counts in event header
