Fix BranchChildren/Parentage problems that occur with SubProcess and EDAlias #17950
…EDAlias Also adds unit tests to cover these cases.
In the previous commit, I added pruning of BranchChildren. It was a nice little algorithm and I wanted to get it into the git history. But after I wrote it, I realized that this simple algorithm cannot be used for pruning when there is secondary file input.
A new Pull Request was created by @wddgit (W. David Dagenhart) for master. It involves the following packages: DataFormats/Provenance @cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here #13028
please test
The tests are being triggered in jenkins.
-1 Tested at: 6cb2b78 You can see the results of the tests here: I found the following errors while testing this PR. Failed tests: UnitTests
I found errors in the following unit tests: ---> test TestSubProcess had ERRORS
Comparison job queued.
Comparison is ready There are some workflows for which there are errors in the baseline:
Please test
The tests are being triggered in jenkins.
Done with my review.
@@ -45,6 +45,7 @@ namespace edm {

ProductProvenance const* productProvenance() const;
BranchID const& branchID() const {return stable().branchID();}
BranchID const& originalBranchID() const {return stable().originalBranchID();}
Please add a comment explaining the difference between branchID and originalBranchID.
@@ -36,6 +38,7 @@ namespace edm {
preg_(new SignallingProductRegistry(preg)),
branchIDListHelper_(new BranchIDListHelper),
thinnedAssociationsHelper_(new ThinnedAssociationsHelper),
subProcessParentageHelper_(new SubProcessParentageHelper),
How about switching to std::make_shared<>()? That is actually more memory efficient.
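The difference the reviewer is pointing at can be sketched as follows. This is a minimal illustration, not the code from the pull request: the `SubProcessParentageHelper` struct here is a hypothetical stand-in with a made-up member, and the two factory functions and the demo function are names invented for the example. It assumes the member is an ordinary `std::shared_ptr`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the helper type; the real class is not shown in this thread.
struct SubProcessParentageHelper {
  int nEntries = 0;  // made-up member, only here so the object has some state
};

// Form used in the diff: the object is allocated by `new`, and the shared_ptr
// constructor then allocates its control block separately (two allocations).
inline std::shared_ptr<SubProcessParentageHelper> makeWithNew() {
  return std::shared_ptr<SubProcessParentageHelper>(new SubProcessParentageHelper);
}

// Form suggested in the review: make_shared typically places the object and
// the control block in a single allocation.
inline std::shared_ptr<SubProcessParentageHelper> makeWithMakeShared() {
  return std::make_shared<SubProcessParentageHelper>();
}

// Usage: both forms give an owning pointer to a default-initialized helper.
inline void makeSharedDemo() {
  auto a = makeWithNew();
  auto b = makeWithMakeShared();
  assert(a != nullptr && b != nullptr);
  assert(a->nEntries == 0 && b->nEntries == 0);
}
```

Both forms behave the same from the caller's point of view; the saving is in the number of heap allocations and in keeping the reference counts next to the object.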
@@ -141,6 +144,7 @@ namespace edm {
*preg_,
*branchIDListHelper_,
*thinnedAssociationsHelper_,
subProcessParentageHelper_ ? &*subProcessParentageHelper_ : nullptr,
probably simpler to do subProcessParentageHelper_ ? subProcessParentageHelper_.get() : nullptr
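As a small sketch of what the suggested expression does, assuming the member is a plain `std::shared_ptr` (the actual member type is not shown in this thread, and `Helper`, `viaTernary`, `viaGet`, and `getDemo` are hypothetical names for illustration):

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in type; not the real helper class.
struct Helper {
  int value = 0;
};

// Form suggested in the review: explicit null check, then get().
inline Helper const* viaTernary(std::shared_ptr<Helper> const& p) {
  return p ? p.get() : nullptr;
}

// For a plain std::shared_ptr, get() alone already returns nullptr when the
// pointer is empty, so it yields the same result without the check.
inline Helper const* viaGet(std::shared_ptr<Helper> const& p) {
  return p.get();
}

// Usage: the two forms agree for both an empty and a non-empty pointer.
inline void getDemo() {
  std::shared_ptr<Helper> empty;
  auto full = std::make_shared<Helper>();
  assert(viaTernary(empty) == nullptr && viaGet(empty) == nullptr);
  assert(viaTernary(full) == full.get() && viaGet(full) == full.get());
}
```

Unlike `&*ptr`, neither form dereferences the pointer, so both are safe when it is empty.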
IOPool/Input/src/RootFile.h
Outdated
@@ -227,7 +227,7 @@ namespace edm {
std::shared_ptr<LuminosityBlockAuxiliary> fillLumiAuxiliary();
std::shared_ptr<RunAuxiliary> fillRunAuxiliary();
std::string const& newBranchToOldBranch(std::string const& newBranch) const;
void markBranchToBeDropped(bool dropDescendants, BranchID const& branchID, std::set<BranchID>& branchesToDrop) const;
void markBranchToBeDropped(bool dropDescendants, BranchDescription const& branchy, std::set<BranchID>& branchesToDrop, std::map<BranchID, BranchID> const& droppedToKeptAlias) const;
branchy seems like an odd name.
@davidlange6 as far as David Dagenhart and I can tell, the unit test failure is from the problem fixed by #17946 and not from this pull request.
please test
OK, let's test again
… On Mar 20, 2017, at 3:22 PM, Chris Jones ***@***.***> wrote:
@davidlange6 as far as David Dagenhart and I can tell, the unit test failure is from the problem fixed by #17946 and not from this pull request.
The tests are being triggered in jenkins.
Comparison job queued.
+1
The intent is that dropOnInput removes entries in the ProductRegistry in addition to dropping the products. The recent commit inadvertently disabled that for branches that were already dropped. This also exposed the fact that GetterOfProducts requires dictionaries for all products in the ProductRegistry, even the dropped ones. That should not be necessary and wastes a little CPU, so it is fixed as well.
Fix problem introduced by recent PR #17950 related to dropOnInput
Hi @wddgit, can this be backported to 90x? This problem causes a segfault in IOPool/Streamer modules (when using data from GEN-SIM); this PR fixes it.
@dmitrijus We can probably backport the change, but we are interested in understanding how and why the seg fault is occurring. Can you tell me how to reproduce it so I can investigate this? If you already know exactly how the seg fault is occurring, can you explain it? Thanks.
@wddgit The crash is a (double) null-pointer dereference in this line,
@Dr15Jones I reproduced the problem. If you write a file containing an EDAlias to a ROOT format output file and then, in a subsequent process, read that file and try to run the streamer output module, a seg fault occurs. The problem is indeed in the line he pointed to in StreamSerializer.cc. The code successfully gets the product using the EDAlias and then uses the EDAlias BranchID to get the ProductProvenance, but the provenance is stored keyed on the original BranchID, not the EDAlias BranchID, so the lookup fails and returns a nullptr. The streamer output module dereferences the pointer without checking it, and the seg fault follows. This was fixed by this PR in 9_1_X, but not backported to 9_0_X. I reproduced the error by running a Framework unit test to make the file with the EDAlias and the following to read it and seg fault. It has nothing to do with the DQM PR.
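The failure mode described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the framework code: `BranchID`, `ProductProvenance`, `BranchInfo`, and `ProvenanceStore` are hypothetical stand-ins that only borrow the names used in this thread, and `findProvenance`, `parentageOrEmpty`, and `aliasLookupDemo` are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-ins; NOT the real edm:: classes.
using BranchID = unsigned int;

struct ProductProvenance {
  std::string parentage;
};

struct BranchInfo {
  BranchID branchID;          // ID used for the lookup (the alias ID for an EDAlias)
  BranchID originalBranchID;  // ID of the branch the product was produced under
};

// Provenance is stored keyed on the original BranchID only.
using ProvenanceStore = std::map<BranchID, ProductProvenance>;

// Returns nullptr when nothing is stored under `id`.
inline ProductProvenance const* findProvenance(ProvenanceStore const& store, BranchID id) {
  auto it = store.find(id);
  return it == store.end() ? nullptr : &it->second;
}

// Guarded lookup: key on the original BranchID and tolerate a missing entry
// instead of dereferencing the result unconditionally.
inline std::string parentageOrEmpty(ProvenanceStore const& store, BranchInfo const& info) {
  ProductProvenance const* prov = findProvenance(store, info.originalBranchID);
  return prov ? prov->parentage : std::string();
}

// Usage: an alias with ID 20 that refers to the original branch 10.
inline void aliasLookupDemo() {
  ProvenanceStore store;
  store[10] = ProductProvenance{"made by producer A"};
  BranchInfo alias{20, 10};
  // Looking up by the alias ID finds nothing; dereferencing this would crash.
  assert(findProvenance(store, alias.branchID) == nullptr);
  // Looking up by the original ID succeeds.
  assert(parentageOrEmpty(store, alias) == "made by producer A");
}
```

The sketch shows two separate safeguards: using the original BranchID as the key, and checking the returned pointer before use. The second one also covers the case where no provenance is stored at all.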
Should I backport the entire PR (and the other related PR that followed) to 9_0_X? This has probably been there for a while. It is somewhat unusual to run streamer output on a file containing an EDAlias, and that is probably why no one noticed before.
I did not try it, but this might also fail if you dropped the parentage and tried to run streamer output. So it could also be viewed as a bug in streamer output.
Let's start with a fix to the Streamer. Both in 9_1 and 9_0.
I'm working on fixing the Streamer IO. It is not built to handle a nullptr for the ProductProvenance at all, so this is more than a one or two line fix. I'll need to trace this all the way through streamer input and output and then make several small fixes along the entire chain to make it work.
@wddgit Thanks!
@dmitrijus I submitted pull requests for the Streamer Output Module problem. PR #18126 was already merged into 9_1_X, and PR #18135 is the backport to 9_0_X, which was just submitted (not merged yet). These should be sufficient to avoid the seg fault you saw.