
Investigate usage of XRD_PARALLELEVTLOOP #7709

Closed

eguiraud opened this issue Mar 26, 2021 · 6 comments
@eguiraud (Member) commented Mar 26, 2021

Explain what you would like to see improved

During PPP97 Josh brought up that the XRootD client checks the environment variable XRD_PARALLELEVTLOOP to decide how many event loops to spawn, and that his multi-threaded use case benefited from setting it to a value higher than one.

It was agreed that we should investigate whether this is something interesting for RDF/DistRDF or other multi-threaded use cases in ROOT.
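
For reference, the variable only needs to be present in the process environment before the XRootD client initializes; a minimal sketch of how one might try it with a multi-threaded ROOT analysis (the analysis.C macro name is a placeholder, not from this thread):

# Set the variable for a single run; analysis.C stands for any
# RDF-based macro reading remote files over root://
$ XRD_PARALLELEVTLOOP=4 root -l -b -q analysis.C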

Additional context

The variable is mentioned in the xrdcp man page.

@stwunsch (Contributor) commented:

This is also discussed in an XRootD issue: xrootd/xrootd#1425

@vepadulano (Member) commented:

First simple tests:

XRD_PARALLELEVTLOOP=4

In theory this should use 4 event-loop threads, but the process has 10 threads in total:

$ XRD_PARALLELEVTLOOP=4 xrdcp root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root .
[784MB/2.09GB][ 36%][==================>                               ][11.04MB/s]
$ ps aux | grep xrdcp
vpadulan    2875 14.5  0.4 698364 77920 pts/0    Sl+  12:15   0:03 xrdcp root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root .
$ ps hH p 2875 | wc -l
10

XRD_PARALLELEVTLOOP=1

This should use 1 event-loop thread, yet I see 7 threads in total:

$ XRD_PARALLELEVTLOOP=1 xrdcp root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root .
[184MB/2.09GB][  8%][====>                                             ][10.82MB/s]
$ ps aux | grep xrdcp
vpadulan    3000 20.0  0.2 608092 46488 pts/0    Sl+  12:18   0:00 xrdcp root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root .
$ ps hH p 3000 | wc -l
7

XRD_WORKERTHREADS=1 XRD_PARALLELEVTLOOP=1

I found another environment variable in the XRootD docs (https://xrootd.slac.stanford.edu/doc/xrdcl-docs/xrdcldocs.pdf), described as "Number of threads processing user callbacks", with default value 3. Setting both variables to 1 leads to 5 threads:

$ XRD_WORKERTHREADS=1 XRD_PARALLELEVTLOOP=1 xrdcp root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root .
[192MB/2.09GB][  8%][====>                                             ][10.67MB/s]
$ ps aux | grep xrdcp
vpadulan    3036 17.3  0.2 460628 48240 pts/0    Sl+  12:21   0:00 xrdcp root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root .
$ ps hH p 3036 | wc -l
5

So for now:

  1. Setting XRD_PARALLELEVTLOOP=1 makes the xrdcp process use 7 threads: 3 are explained by the default value of XRD_WORKERTHREADS and 1 is the event loop, but the remaining 3 are still unaccounted for (see the thread-listing sketch below).
  2. The two variables seem to add threads to the xrdcp process independently when they are increased.
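
A possible next step for attributing the remaining threads is to list thread names instead of just counting them; a sketch, reusing PID 3000 from the XRD_PARALLELEVTLOOP=1 run above:

$ ps -T -p 3000 -o spid,comm   # one line per thread, with its name
$ cat /proc/3000/task/*/comm   # equivalent, reading procfs directly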

@simonmichal commented:
Just a short summary of our discussion in xrootd/xrootd#1495

  1. Threads in the XRootD client:
  • XRD_PARALLELEVTLOOP is by default set to 1 and is the number of event-loop threads handling the async I/O; in some cases, e.g. if the XRootD client is interacting with many servers (as it does in the case of XCache), a single event loop can become CPU bound, and in those scenarios it makes sense to use multiple event loops
  • XRD_WORKERTHREADS is by default set to 3 and is the number of threads in the thread pool used to call completion handlers
  • there is also the TaskManager thread, which runs various timers and is, amongst others, responsible for the request timeouts
  • in the case of xrdcp there is also the main execution thread
  2. XRD_PARALLELEVTLOOP
     XRD_PARALLELEVTLOOP is the number of parallel event loops the client uses. With a single event loop, all socket I/O events are processed by a single thread; in general this is good because we avoid context switching (as opposed to synchronous I/O). However, in some cases this can make the client CPU bound. For example, imagine xrdcp copying data between two very fast servers (say 100GbE, with ramdisk or Optane). In such a setup the event loop will receive new I/O events faster than it can process them and as a result will limit the transfer rate. If we use 2 event loops, on the other hand, the source and the destination I/O events will be handled by separate threads/event loops, which can result in 2x faster I/O event processing (I measured 2.5 GB/s vs 4.5 GB/s). A similar effect can also be observed if your application uses the XRootD client to fetch data in parallel from multiple locations.
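
To illustrate the two-fast-servers scenario concretely (the hostnames below are hypothetical), a streaming remote-to-remote copy could be run with two event loops, so that the source and destination sockets are serviced by separate threads:

# Sketch: 2 event loops, one per remote endpoint; data streams
# through the client, which would otherwise be CPU bound on one loop
$ XRD_PARALLELEVTLOOP=2 xrdcp \
      root://source.example.org//data/big.root \
      root://destination.example.org//data/big.root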

@dpiparo (Member) commented Nov 30, 2023

Hello. This feature is now leveraged by the newest XRootD client versions, including the one that comes with ROOT; see for example this PPP meeting. The item was investigated and judged interesting, and the feature is now available by default to every XRootD user, including ROOT.

For this reason, @amadio @vepadulano, I propose to close the item: would you agree?

@amadio (Member) commented Nov 30, 2023

Yes, I think this can be closed. I'd just like to remark that after the XCache server was reconfigured internally as a cluster (my suggestion, but done by David), the performance with XCache is now also good enough with MT, as the bottleneck was on the server side due to it being a single machine and the way connections are handled. No fake-client trick should be required anymore!

@dpiparo (Member) commented Nov 30, 2023

Yes, @amadio, that was a nice intuition of yours. The comment also goes beyond this particular item and shows we really explored the topic in depth :) I am closing the issue.

@dpiparo closed this as completed Nov 30, 2023