Have SIGTERM print a traceback #12856

Merged: 1 commit merged into cms-sw:CMSSW_8_0_X from the addSigTermHandling branch on Dec 30, 2015


Dr15Jones
Contributor

Added a signal handler for SIGTERM. The plan is to switch the integration builds to send SIGTERM when a job runs out of time, so that the job prints a traceback. This will let us tell a crash from a timeout using the log file and the exit code.

@cmsbuild
Contributor

A new Pull Request was created by @Dr15Jones (Chris Jones) for CMSSW_8_0_X.

It involves the following packages:

FWCore/Services

@cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @wddgit, @wmtan this is something you requested to watch as well.
@slava77, @Degano, @smuzaffar you are the release managers for this.

The following commands are recognized in the first line of a comment:

  • +1|approve[d]|sign[ed]: L1/L2's to approve it
  • -1|reject[ed]: L1/L2's to reject it
  • assign <category>[,<category>[,...]]: L1/L2's to request signatures from other categories
  • unassign <category>[,<category>[,...]]: L1/L2's to remove signatures from other categories
  • hold: L1/all L2's/release manager to mark it as on hold
  • unhold: L1/user who put this PR on hold
  • merge: L1/release managers to merge this request
  • [@cmsbuild,] please test: L1/L2 and selected users to start jenkins tests
  • [@cmsbuild,] please test with cms-sw/cmsdist#<PR>: L1/L2 and selected users to start jenkins tests using externals from cmsdist

@Dr15Jones
Contributor Author

please test

@Dr15Jones
Contributor Author

+1

@cmsbuild
Contributor

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/10409/console

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_8_0_X IBs after it passes the integration tests. This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @Degano, @smuzaffar

@Dr15Jones
Contributor Author

@smuzaffar Once this gets added to CMSSW_8_0_X we should change the IB RelVals to send a SIGTERM when the jobs reach their time limit. This will allow us to easily distinguish between a timeout and a segmentation fault.
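As an illustration of the idea, the following sketch shows how a job runner could send SIGTERM at the time limit and then classify the result from the wait status. It is an assumption-laden example, not the cms-bot implementation: `JobOutcome` and `runWithTimeLimit` are hypothetical names, and a real runner would also escalate to SIGKILL if the job ignored SIGTERM.

```cpp
#include <signal.h>     // kill, SIGTERM
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WIFSIGNALED, WTERMSIG, WEXITSTATUS
#include <unistd.h>     // fork, execvp, _exit, usleep
#include <chrono>

// Hypothetical classification of a finished job.
enum class JobOutcome { Succeeded, Failed, TimedOut, Crashed, LaunchError };

// Hypothetical helper: runs argv[0] with the given arguments and sends
// SIGTERM once `limit` has elapsed. This sketch never escalates to SIGKILL,
// so a job that ignores SIGTERM would keep the loop waiting.
JobOutcome runWithTimeLimit(char* const argv[], std::chrono::seconds limit) {
  pid_t child = fork();
  if (child < 0) return JobOutcome::LaunchError;
  if (child == 0) {
    execvp(argv[0], argv);
    _exit(127);  // reached only if execvp failed
  }

  const auto deadline = std::chrono::steady_clock::now() + limit;
  bool sentTerm = false;
  int status = 0;
  for (;;) {
    pid_t done = waitpid(child, &status, WNOHANG);
    if (done == child) break;
    if (done < 0) return JobOutcome::LaunchError;
    if (!sentTerm && std::chrono::steady_clock::now() >= deadline) {
      kill(child, SIGTERM);  // time limit reached: ask for a traceback
      sentTerm = true;
    }
    usleep(50 * 1000);  // poll every 50 ms
  }

  if (WIFSIGNALED(status)) {
    // Killed by a signal: SIGTERM that we sent means timeout,
    // anything else (for example SIGSEGV) is treated as a crash.
    return (sentTerm && WTERMSIG(status) == SIGTERM) ? JobOutcome::TimedOut
                                                     : JobOutcome::Crashed;
  }
  if (sentTerm) return JobOutcome::TimedOut;  // job handled SIGTERM and exited
  return WEXITSTATUS(status) == 0 ? JobOutcome::Succeeded : JobOutcome::Failed;
}
```

When the job is launched through a shell instead, the same distinction shows up in the exit code: shells conventionally report 128 plus the signal number, so 143 for SIGTERM and 139 for SIGSEGV.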

davidlange6 added a commit that referenced this pull request Dec 30, 2015
@davidlange6 merged commit be91645 into cms-sw:CMSSW_8_0_X on Dec 30, 2015
@Dr15Jones
Contributor Author

@smuzaffar How hard would it be to change the IB RelVals for just CMSSW_8_0 so that the timeout sends SIGTERM instead of SIGSEGV?

@smuzaffar
Contributor

@Dr15Jones, that should be trivial. I will update cms-bot to use SIGTERM for 80X and SIGSEGV for the rest.

@Dr15Jones deleted the addSigTermHandling branch on January 6, 2016 at 15:50