Timeout Xrootd requests at the CMSSW-level #18440
Comments
A new Issue was created by @bbockelm Brian Bockelman. @davidlange6, @Dr15Jones, @smuzaffar can you please review it and eventually sign/assign? Thanks.
assign core
New categories assigned: core. @Dr15Jones, @smuzaffar you have been requested to review this Pull request/Issue and eventually sign. Thanks.
@bbockelm I'm confused, if the What am I missing?
@Dr15Jones - yup, that's almost exactly the second option outlined. Currently, there is no callback-specific object passed around (instead, the callback is given a pointer to the long-lived I think this is acceptable since there are other places that allocate per-request, so we're already paying the cost of per-IO heap allocations.
@bbockelm I'm still confused. If the callback is given a pointer to the 'long-lived'
Is this still an open issue?
We currently rely on Xrootd's timeout mechanism -- issues with the current release (xrootd 4.5.0) have shown that this isn't 100% reliable.
We don't have our own mechanism because the Xrootd callbacks aren't cancellable -- they might fire at any time after we've given up, even after we've destroyed the corresponding object.
There were two ideas bounced around:
1. On timeout, deliberately leak the `RequestManager` object. The (eventual) Xrootd callback will access the zombie object. Since the timeout/failure will generate a CMSSW exception for a read failure, the job is likely going to fail anyway -- no harm in leaking!
2. Instead of giving the callback a pointer to the `RequestManager` object, create a new callback-specific object on the heap. The callback-specific object would hold a `weak_ptr` to the original `RequestManager`. We proceed with the callback only if the `weak_ptr` can be materialized to a strong pointer.
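The second idea (a heap-allocated, callback-specific object holding a `weak_ptr` to the `RequestManager`) could look roughly like the sketch below. This is only an illustration under assumptions. The `RequestManager` here is a toy stand-in rather than the CMSSW class, and `ResponseCallback`, `FakeAsyncClient`, `handleResponse` and the two demo functions are hypothetical names invented for the example. No real Xrootd API is used; the fake client only mimics a callback that cannot be cancelled and may fire after the caller has given up.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the long-lived request manager (not the CMSSW class).
class RequestManager {
public:
  void handleResponse(const std::string& data) {
    lastData_ = data;
    ++handled_;
  }
  int handled() const { return handled_; }

private:
  std::string lastData_;
  int handled_ = 0;
};

// Callback-specific object, heap-allocated once per request.
// It never owns the manager; it only observes it through a weak_ptr.
class ResponseCallback {
public:
  explicit ResponseCallback(std::weak_ptr<RequestManager> manager)
      : manager_(std::move(manager)) {}

  // Returns true if the response was delivered, false if the manager is gone.
  bool operator()(const std::string& data) const {
    // lock() yields a strong pointer only while the manager is still alive,
    // and keeps it alive for the duration of this call.
    if (std::shared_ptr<RequestManager> manager = manager_.lock()) {
      manager->handleResponse(data);
      return true;
    }
    return false;  // caller already gave up; drop the late response
  }

private:
  std::weak_ptr<RequestManager> manager_;
};

// Hypothetical stand-in for an async client whose callbacks cannot be cancelled.
class FakeAsyncClient {
public:
  void read(std::unique_ptr<ResponseCallback> callback) {
    pending_ = std::move(callback);
  }

  // Simulates the response arriving; consumes the pending callback.
  bool complete(const std::string& data) {
    assert(pending_ && "no request in flight");
    std::unique_ptr<ResponseCallback> callback = std::move(pending_);
    return (*callback)(data);
  }

private:
  std::unique_ptr<ResponseCallback> pending_;
};

// Normal case: the manager is alive when the response arrives.
bool callbackDeliveredWhileManagerAlive() {
  auto manager = std::make_shared<RequestManager>();
  FakeAsyncClient client;
  client.read(std::make_unique<ResponseCallback>(manager));
  return client.complete("payload") && manager->handled() == 1;
}

// Timeout case: the manager is destroyed before the response arrives.
bool lateCallbackIgnoredAfterTimeout() {
  auto manager = std::make_shared<RequestManager>();
  FakeAsyncClient client;
  client.read(std::make_unique<ResponseCallback>(manager));
  manager.reset();  // the caller timed out and destroyed the manager
  return !client.complete("late payload");
}
```

In this sketch the per-request cost is one extra heap allocation for the callback object, and the late callback becomes a harmless no-op instead of touching a destroyed object. It does assume the manager is owned through a `shared_ptr`, which the real code may or may not do.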