Start EndPaths at the same time as we start Paths #21035
Conversation
The code-checks are being triggered in jenkins. |
+code-checks |
A new Pull Request was created by @Dr15Jones (Chris Jones) for master.
It involves the following packages: FWCore/Framework
@cmsbuild, @smuzaffar, @Dr15Jones can you please review it and eventually sign? Thanks.
cms-bot commands are listed here |
please test |
The tests are being triggered in jenkins. |
+1 The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic: |
Comparison job queued. |
Comparison is ready. Comparison Summary: |
The differences in the comparisons are strictly from the MessageLogger. This is to be expected: more modules can potentially run (since modules on an EndPath can now start) before the ErrorLogger module runs, and therefore more Errors/Warnings can be captured. |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar (and backports should be raised in the release meeting by the corresponding L2) |
Pull request #21035 was updated. @cmsbuild, @smuzaffar, @Dr15Jones can you please check and sign again. |
@davidlange6 wrote
I'd call them a side step? Because the LogErrorHarvester is reading from a global while multiple threads are writing to it, exactly what it records can differ for each run of the same job. The latest change to LogErrorHarvester (which was merged a while ago) makes sure that the harvester at least waits for the specified modules to finish before reading the log. However, if other modules not on the 'wait' list write to the logger, there is no guarantee that their messages will also get harvested. This is the main reason I've never been happy with the LogErrorHarvester as a concept. |
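A minimal, framework-free sketch of the race described above (plain Python, not CMSSW code; the module and message names are invented): a reader that snapshots a shared log while writer threads are still appending to it can see a different subset of messages on every run of the same job.

```python
# Sketch (not CMSSW): a "harvester" snapshots a shared message list while worker
# threads are still appending to it, so each run of the same job can record a
# different set of messages.
import random
import threading
import time

shared_log = []
log_lock = threading.Lock()

def worker(name, n_messages):
    """Stand-in for a module emitting warnings while an event is processed."""
    for i in range(n_messages):
        time.sleep(random.uniform(0, 0.001))  # scheduling jitter
        with log_lock:
            shared_log.append(f"{name}: warning {i}")

threads = [threading.Thread(target=worker, args=(f"module{k}", 5)) for k in range(4)]
for t in threads:
    t.start()

# The "harvester" reads whatever happens to be in the log right now,
# without waiting for all writers to finish.
time.sleep(0.001)
with log_lock:
    harvested = list(shared_log)

for t in threads:
    t.join()

print(f"harvested {len(harvested)} of {len(shared_log)} messages")  # varies run to run
```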
please test |
The tests are being triggered in jenkins. |
+1 The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic: |
Comparison job queued. |
Comparison is ready. Comparison Summary: |
@slava77 I looked at the reported RECO changes.

The following only had changes in the number of reports from MessageLogger:
workflows: 4.53, 20434.0, 25202.0, 1000.0, 4.22, 1306.0, 25.0, 1330.0, 20034.0
I suppose this is OK.

The following saw the number of entries in the following be different:
/DQMData/Run 1/PixelPhase1V/Run summary/TrackingParticle
workflows: 10024, 11624.0, 10824.0, 10224.0

The following saw 1-3 tau items shift in phi and/or eta:
/DQMData//HLT/Run summary/TAU/Inclusive/L1
/DQMData///L1T/Run summary/L1TStage2CaloLayer2/Isolated-Tau
workflows: 136.788, 10224.0, 136.731

Workflow 136.788, in addition to the above, also saw a change in
/DQMData/Run 297557/L1T/Run summary/L1TStage2CaloLayer2/NonIsolated-Tau

If this pull request had a bug, I would have expected catastrophic failures (e.g. whole modules not being run in the correct order), not a few bins changing in histograms. This pull request can change the order in which non-dependent modules are run, so it is possible that the changed histograms come from an unexpected side effect of module run order. One candidate is random numbers: if a module uses random numbers but does not re-seed the generator at the beginning of each event, it implicitly depends on some other module running ahead of it to set the seed. Another is that the memory layout shifts around, so an out-of-bounds memory write could hit a different address with this pull request. |
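A toy illustration of the seeding point above (plain Python, not CMSSW; the module names are invented): a module that draws from a shared generator without re-seeding per event gets values that depend on how much of the generator's state was consumed by whichever modules ran before it, i.e. on run order.

```python
# Toy illustration (not CMSSW) of why a module that does not re-seed per event
# becomes sensitive to module run order.
import random

def seeding_module(event_id, rng):
    rng.seed(event_id)      # re-seeds every event: reproducible regardless of order
    return rng.random()

def lazy_module(rng):
    return rng.random()     # no re-seed: sees whatever state rng is left in

# Order A: the seeding module runs first, then the lazy one.
rng_a = random.Random()
order_a = (seeding_module(1, rng_a), lazy_module(rng_a))

# Order B: the lazy module runs first, before anything has set the seed.
rng_b = random.Random()
order_b = (lazy_module(rng_b), seeding_module(1, rng_b))

print(order_a)
print(order_b)  # the lazy module's value differs between the two orders
```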
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar (and backports should be raised in the release meeting by the corresponding L2) |
On 10/30/17 5:42 PM, Chris Jones wrote:
> The following saw the number of entries in the following be different:
> /DQMData/Run 1/PixelPhase1V/Run summary/TrackingParticle
> The following saw 1-3 tau items shift in phi and/or eta:
> /DQMData//HLT/Run summary/TAU/Inclusive/L1
> /DQMData///L1T/Run summary/L1TStage2CaloLayer2/Isolated-Tau

IIRC, PixelPhase1V and the L1 tau plots have occasional random issues with reproducibility, so these could be ignored. |
@Slava I found a memory problem in SiPixelPhase1TrackingParticleV which could explain the randomness |
merge |
Starting EndPaths at the same time as Paths gives the framework more concurrent tasks to use.
If a module on an EndPath depends on the present process's TriggerResults, such as an OutputModule with a SelectEvents configuration, the regular data-dependency system will guarantee the proper run order of the modules.
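As a sketch of the kind of configuration that last sentence refers to (the module labels and the use of HLTBool here are illustrative, not taken from this PR): an OutputModule on an EndPath whose SelectEvents setting refers to a Path cannot run until that Path's decision is available in this process's TriggerResults, even though the EndPath itself now starts concurrently with the Paths.

```python
import FWCore.ParameterSet.Config as cms

process = cms.Process("DEMO")
process.source = cms.Source("EmptySource")
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(10))

# A filter on a regular Path; its decision becomes part of this process's TriggerResults.
# HLTBool is used here only as a trivially configurable stand-in filter.
process.demoFilter = cms.EDFilter("HLTBool", result = cms.bool(True))
process.p = cms.Path(process.demoFilter)

# An OutputModule on an EndPath. Because SelectEvents reads this process's
# TriggerResults, the data-dependency system delays the module until the Path
# decision exists, so starting the EndPath early cannot run it too soon.
process.out = cms.OutputModule("PoolOutputModule",
    fileName = cms.untracked.string("selected.root"),
    SelectEvents = cms.untracked.PSet(SelectEvents = cms.vstring("p"))
)
process.ep = cms.EndPath(process.out)
```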