Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exponentially long exception messages in multi-threaded setup (ref #37392) #37579

Closed
slava77 opened this issue Apr 15, 2022 · 9 comments
Closed

Comments

@slava77
Copy link
Contributor

slava77 commented Apr 15, 2022

I observed a problem similar to #37392
with

Exception Message:
basic_string::_M_construct null not valid

in 12_4_0_pre2.
While the issue goes away in a recent IB (e.g. in CMSSW_12_4_X_2022-04-14-1100),

I have a tangential question: the exception messages in 12_4_0_pre2 are issued for every thread with increasingly long stack. In a 32-thread test the last one comes out with

   [10210] While reading from source edm::HepMCProduct generatorSmeared '' SIM
   [10211] Reading branch edmHepMCProduct_generatorSmeared__SIM.
Exception Message:
basic_string::_M_construct null not valid
----- End Fatal Exception -------------------------------------------------

Is the framework really creating such deep stacks or is this just a feature of the printout?
I have a reference setup on cmsdev34:/build/slava77/reltest/CMSSW_12_4_0_pre2-orig/test/gensim-1210p5/11634.0_TTbar_14TeV+2021/sort
there with 8-threads the last exception message is 1703-deep

@cmsbuild
Copy link
Contributor

A new Issue was created by @slava77 Slava Krutelyov.

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Copy link
Contributor

assign core

@cmsbuild
Copy link
Contributor

New categories assigned: core

@Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

@makortel
Copy link
Contributor

Looking at cmsdev34:/build/slava77/reltest/CMSSW_12_4_0_pre2-orig/test/gensim-1210p5/11634.0_TTbar_14TeV+2021/sort/step2.mt8.log I see

  • 1st trace has depth 4
  • 2nd trace has depth 264
  • 3rd trace has depth 282
  • 4th trace has depth 119
  • 5th trace has depth 638
  • 6th trace has depth 994
  • 7th trace has depth 1347
  • 8th trace has depth 1703

The traces are full of Rethrowing an exception that happened on a different read request.

I guess what happens is something along

  • the delayed read throws an exception, one of the streams reports that
  • the same exception propagates to all other delayed read requests, and those add additional context
  • the additional context gets duplicated as well

@makortel
Copy link
Contributor

The delayed reader throws its exception on "different read request" here

std::shared_ptr<WrapperBase> RootDelayedReader::getProduct_(BranchID const& k, EDProductGetter const* ep) {
if (lastException_) {
try {
std::rethrow_exception(lastException_);
} catch (cms::Exception& e) {
e.addContext("Rethrowing an exception that happened on a different read request.");
throw;
}

The message gets extended with While reading from in

try {
//another thread could have beaten us here
setProduct(reader->getProduct(branchDescription().branchID(), &principal, mcc));
} catch (cms::Exception& e) {
throw extendException(e, branchDescription(), mcc);
} catch (std::exception const& e) {
auto newExcept = edm::Exception(errors::StdException) << e.what();
throw extendException(newExcept, branchDescription(), mcc);
}

@slava77
Copy link
Contributor Author

slava77 commented Apr 15, 2022

Do I understand this correctly that this is just a length of a message, not an actual depth of the calls?

@makortel
Copy link
Contributor

Correct, it is how the exception messages are constructed (specifically how additional context information gets added), not the depth of the actual call stack.

@slava77
Copy link
Contributor Author

slava77 commented Apr 15, 2022

Correct, it is how the exception messages are constructed (specifically how additional context information gets added), not the depth of the actual call stack.

OK, I guess I can close this issue as not an issue.
Thanks for checking/clarifying.

@Dr15Jones
Copy link
Contributor

Should be fixed by #37591

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants