Disable Xrootd's read recovery mechanism. #18128

Merged
merged 1 commit into from Apr 19, 2017

bbockelm
Contributor

CMSSW and Xrootd both implement a retry mechanism, one on top of the other. We are getting reports from the IBs that Xrootd's retry mechanism is possibly causing the file-close callbacks to be invoked twice (and the second time it is done against a deleted object, causing a SIGSEGV). This work attempts to see if the problem can be avoided by relying solely on CMSSW's recovery mechanisms.
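To make the suspected failure mode concrete, here is a toy model of two stacked recovery layers acting on the same file. It is only a sketch: `FileHandle`, `InnerLayer`, `OuterLayer`, and `closeCallbacksAfterFailure` are hypothetical names invented for illustration and do not mirror the real Xrootd or CMSSW classes, and the assumption that the inner layer fires the close callback during its own recovery is exactly the unconfirmed hypothesis described above.

```cpp
#include <cassert>

// Toy model only: hypothetical types, not the real Xrootd or CMSSW classes.
struct FileHandle {
    int closeCallbacks = 0;              // how many times the close callback ran
    void onClose() { ++closeCallbacks; }
};

// Stands in for the lower (transport) layer and its built-in retry.
struct InnerLayer {
    bool readRecovery;                   // models the lower layer's own recovery switch

    // Simulates one failed read. Returns false because the error still
    // reaches the caller in this model.
    bool failedRead(FileHandle& f) const {
        if (readRecovery) {
            f.onClose();                 // assumed: inner recovery closes the file itself
        }
        return false;
    }
};

// Stands in for the upper (application) layer and its recovery logic.
struct OuterLayer {
    InnerLayer inner;

    // On error, the outer layer closes the file before switching sources.
    int recover(FileHandle& f) const {
        if (!inner.failedRead(f)) {
            f.onClose();                 // outer recovery closes the same file
        }
        return f.closeCallbacks;
    }
};

// Number of close callbacks seen for a single failed read.
int closeCallbacksAfterFailure(bool innerReadRecovery) {
    FileHandle f;
    OuterLayer outer{InnerLayer{innerReadRecovery}};
    int n = outer.recover(f);
    assert(n >= 1);                      // the outer layer always closes once
    return n;
}
```

In this model, `closeCallbacksAfterFailure(true)` yields two close callbacks for one file, while `closeCallbacksAfterFailure(false)` yields one. In real code the second callback would run against an object the first one had already released, which is why leaving recovery to a single layer is the approach tried here.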

@cmsbuild
Contributor

A new Pull Request was created by @bbockelm (Brian Bockelman) for master.

It involves the following packages:

Utilities/XrdAdaptor

@cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @wddgit this is something you requested to watch as well.
@Muzaffar, @davidlange6, @smuzaffar you are the release managers for this.

cms-bot commands are listed here #13028

@bbockelm
Contributor Author

This is provisional work; @davidlt's testing in #18102 will help determine whether it should be merged.

davidlt added a commit to cms-sw/cmsdist that referenced this pull request Mar 31, 2017
…102)

Valgrind detected invalid reads/writes of 4 bytes related to XrdSysMutexHelper.
Discussions: cms-sw/cmssw#18102

Fixed upstream by: ae42952eb87b175159fe442369b34fa96d432be5

An alternative would be to set ReadRecovery to false within XrdRequestManager.
See: cms-sw/cmssw#18128

Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
@Dr15Jones
Contributor

please test

@cmsbuild
Contributor

cmsbuild commented Apr 4, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/18908/console Started: 2017/04/04 14:03

@Dr15Jones
Contributor

What is the status of the various ways to correct the problem?

@davidlt
Contributor

davidlt commented Apr 4, 2017

The xrootd patch was merged a few days ago, so technically this is no longer needed (unless you want it).

@cmsbuild
Contributor

cmsbuild commented Apr 4, 2017

@cmsbuild
Contributor

cmsbuild commented Apr 4, 2017

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Apr 4, 2017

@Dr15Jones
Contributor

@bbockelm is this needed anymore?

@bbockelm
Contributor Author

bbockelm commented Apr 12, 2017 via email

@Dr15Jones
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @Muzaffar, @davidlange6, @smuzaffar
