Disable Xrootd's read recovery mechanism. #18128
CMSSW and Xrootd both implement a retry mechanism, one on top of the other. The IBs are reporting that Xrootd's retry mechanism may be causing the file-close callbacks to be invoked twice; the second invocation runs against an already-deleted object and causes a SIGSEGV. This change tests whether the problem can be avoided by relying solely on CMSSW's recovery mechanisms.
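To illustrate what "one on top of the other" means in practice, here is a toy model of two stacked retry layers. It is only a sketch: `FlakySource`, `withRetry`, and `attemptsAgainstDeadSource` are hypothetical names invented for this example and exist in neither CMSSW nor Xrootd, and the retry counts are arbitrary. It shows only that stacked layers multiply the number of low-level attempts, and that disabling the inner layer leaves the outer layer as the single place where recovery decisions are made. It does not attempt to reproduce the double-callback crash.

```cpp
#include <cassert>
#include <functional>

// Toy model only: none of these types exist in CMSSW or Xrootd.
// A source whose first `failuresLeft` reads fail.
struct FlakySource {
    int attempts = 0;   // number of low-level reads tried so far
    int failuresLeft;   // reads that will still fail before one succeeds

    explicit FlakySource(int failures) : failuresLeft(failures) {}

    bool read() {
        ++attempts;
        if (failuresLeft > 0) {
            --failuresLeft;
            return false;
        }
        return true;
    }
};

// Generic retry layer: calls `op` until it succeeds, at most `maxTries` times.
bool withRetry(int maxTries, const std::function<bool()>& op) {
    assert(maxTries >= 1);
    for (int i = 0; i < maxTries; ++i) {
        if (op()) {
            return true;
        }
    }
    return false;
}

// Counts the low-level reads issued against a source that never recovers,
// with an outer retry layer (standing in for the application) wrapped around
// an inner one (standing in for the client library).
int attemptsAgainstDeadSource(int outerTries, int innerTries) {
    FlakySource src(1000000);  // effectively "always fails" for this sketch
    withRetry(outerTries, [&] {
        return withRetry(innerTries, [&] { return src.read(); });
    });
    return src.attempts;
}
```

Under these assumptions, `attemptsAgainstDeadSource(3, 3)` issues 9 low-level reads, while `attemptsAgainstDeadSource(3, 1)` (inner recovery effectively disabled) issues 3, all of them driven by the outer layer.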
A new Pull Request was created by @bbockelm (Brian Bockelman) for master. It involves the following packages: Utilities/XrdAdaptor. @cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here #13028
…102) Valgrind detected invalid reads/writes of 4 bytes related to XrdSysMutexHelper. Discussions: cms-sw/cmssw#18102 Fixed upstream by: ae42952eb87b175159fe442369b34fa96d432be5 Alternative would be to set ReadRecovery to false within XrdRequestManager. See: cms-sw/cmssw#18128 Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
please test
The tests are being triggered in jenkins.
What is the status of the various ways to correct the problem?
The xrootd patch was merged a few days ago, so technically this is no longer needed (unless you want it).
Comparison job queued.
@bbockelm is this needed anymore?
+1
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @Muzaffar, @davidlange6, @smuzaffar |